Memory (preview) in Foundry Agent Service and the Memory Store API (preview) are licensed to you as part of your Azure subscription and are subject to terms applicable to “Previews” in the Microsoft Product Terms, the Microsoft Products and Services Data Protection Addendum, and the Supplemental Terms of Use for Microsoft Azure Previews.The latest preview offers new capabilities and enhancements, including:
- Memory item operations to create, read, update, list, and delete individual memory records.
- Store-level default retention controls, including default TTL for newly created memory stores.
- Direct remember-or-forget synchronized memory command behavior.
scope parameter, which segments memory across users to ensure secure and isolated experiences.
This article explains how to create, manage, and use memory stores. For conceptual information, see Memory in Foundry Agent Service.
Usage support
| Capability | Python SDK | C# SDK | JavaScript SDK | REST API |
|---|---|---|---|---|
| Create, update, list, and delete memory stores | ✔️ | ✔️ | ✔️ | ✔️ |
| Attach memory to a prompt agent | ✔️ | ✔️ | ✔️ | ✔️ |
| Update and search memories | ✔️ | ✔️ | ✔️ | ✔️ |
| Create, read, update, list, and delete memory items | ✔️ | ✔️ | ✔️ | ✔️ |
Prerequisites
- An Azure subscription. Create one for free.
- A Microsoft Foundry project with configured authorization and permissions.
- A chat model deployment, such as
gpt-5.2, in your project. - An embedding model deployment, such as
text-embedding-3-small, in your project. - A configured local environment with required packages and environment variables.
Authorization and permissions
We recommend role-based access control for production deployments. If roles aren’t feasible, skip this section and use key-based authentication instead. To configure role-based access:- Sign in to the Azure portal.
-
On your project:
- From the left pane, select Resource Management > Identity.
- Use the toggle to enable a system-assigned managed identity.
-
On the resource that contains your project:
- From the left pane, select Access control (IAM).
- Select Add > Add role assignment.
- Assign Foundry User to the managed identity of your project.
The Foundry RBAC roles were recently renamed. Foundry User, Foundry Owner, Foundry Account Owner, and Foundry Project Manager were previously named Azure AI User, Azure AI Owner, Azure AI Account Owner, and Azure AI Project Manager. You might still see the previous names in some places while the rename rolls out. The role IDs and core permissions are unchanged by the rename.
Set up your environment
Understand scope
Thescope parameter controls how memory is partitioned. Each scope in the memory store keeps an isolated collection of memory items. For example, if you create a customer support agent with memory, each customer should have their own individual memory.
As a developer, you choose the key used to store and retrieve memory items. The right approach depends on how you access memory.
Via the memory search tool
When you attach the memory search tool to an agent, setscope to {{$userId}} to enable per-user memory isolation without hard-coding identifiers. The system automatically resolves the end user’s identity on each response call from one of two sources:
-
x-memory-user-idrequest header: If present, the header value is used as the user ID. Use this in proxy or backend scenarios where your service calls the API on behalf of an end user. - Microsoft Entra authentication token: If the header isn’t set, the system falls back to the caller’s tenant ID (TID) and object ID (OID). This is the default in frontend scenarios where users authenticate directly with Microsoft Entra.
scope value instead.
Via low-level memory APIs
When you call memory APIs directly, specifyscope explicitly in each request. You can pass a static value, such as a universally unique identifier (UUID) or another stable identifier from your system. Automatic identity extraction isn’t supported for these operations.
Create a memory store
Create a dedicated memory store for each agent to establish clear boundaries for memory access and optimization. When you create a memory store, specify the chat model and embedding model deployments that process your memory content. Use memory store options to control extraction behavior and retention defaults. In the latest preview, you can enable procedural memory and set a default TTL (seconds) for newly created memory entries.Customize memory
Customize what information the agent stores to keep memory efficient, relevant, and privacy-respecting. Use theuser_profile_details parameter to specify the types of data that are critical to the agent’s function.
For example, set user_profile_details to prioritize “flight carrier preference and dietary restrictions” for a travel agent. This focused approach helps the memory system know which details to extract, summarize, and commit to long-term memory.
You can also use this parameter to exclude certain types of data, keeping memory lean and compliant with privacy requirements. For example, set user_profile_details to “avoid irrelevant or sensitive data, such as age, financials, precise location, and credentials.”
Configure TTL and retention policies
TTL applies to all memories, whether from direct memory commands, extraction and consolidation, or item-level CRUD operations. If a memory is updated and consolidated, the service resets its last-updated time. TTL applies only to memory stores created after TTL support was introduced. It doesn’t affect existing memory stores. Adefault_ttl_seconds value of 0 indicates no expiration. Choose a retention period that matches your compliance and user-data lifecycle requirements.
Update a memory store
Update memory store properties, such asdescription or metadata, to better manage memory stores.
List memory stores
Retrieve a list of memory stores in your project to manage and monitor your memory infrastructure.Use memories via an agent tool
After you create a memory store, you can attach the memory search tool to a prompt agent. This tool enables the agent to read from and write to your memory store during conversations. Configure the tool with the appropriatescope and update_delay to control how and when memories are updated.
Create a conversation
You can now create conversations and request agent responses. At the start of each conversation, static memories are injected so the agent has immediate, persistent context. Contextual memories are retrieved per turn based on the latest messages to inform each response. After each agent response, the service internally callsupdate_memories. However, actual writes to long‑term memory are debounced by the update_delay setting. The update is scheduled and only completes after the configured period of inactivity.
In the updated preview schema, the memory search tool output uses a
memories collection instead of the legacy results field. If you process raw output payloads, update your parsers accordingly.Apply direct remember-or-forget behavior
When a user explicitly asks the agent to remember or forget information, the memory search tool in thetools array applies the operation immediately and returns the result as memory command items in the response output. No additional tool configuration is required.
Direct memory commands don’t override memory TTL. If a memory store has TTL configured, memory items can still expire, even if they were added by a remember command.
Use memories via APIs
You can interact with a memory store directly using the memory store APIs. Start by adding memories from conversation content to the memory store, and then search for relevant memories to provide context for agent interactions.Add memories to a memory store
Add memories by providing conversation content to the memory store. The system preprocesses and postprocesses the data, including memory extraction and consolidation, to optimize the agent’s memory. This long-running operation might take about one minute. Decide how to segment memory across users by specifying thescope parameter. You can scope the memory to a specific end user, a team, or another identifier.
You can update a memory store with content from multiple conversation turns, or update after each turn and chain updates using the previous update operation ID.
Search for memories in a memory store
Search memories to retrieve relevant context for agent interactions. Specify the memory store name and scope to narrow the search.Retrieve static or contextual memories
Often, user profile memories can’t be retrieved based on semantic similarity to a user’s message. We recommend that you inject static memories into the beginning of each conversation and use contextual memories to generate each agent response.-
To retrieve static memories, call
search_memorieswith ascopebut withoutitemsorprevious_search_id. This returns user profile memories associated with the scope. -
To retrieve contextual memories, call
search_memorieswithitemsset to the latest messages. This can return both user profile and chat summary memories most relevant to the given items.
Manage memory items
Use item-level operations to directly create, inspect, update, and delete individual memory records. For scope-level or store-level deletion, see Delete memories.The latest preview uses
/memories as the item-level path segment. The previous preview used /items, with :list for listing. If you’re on the previous API version, update your routes accordingly.Create a memory item
Get a memory item
List memory items
Update a memory item
Delete a memory item
Delete memories
Memories are organized by scope within a memory store. You can delete memories for a specific scope to remove user-specific data, or you can delete the entire memory store to remove all memories across all scopes.Delete memories by scope
Remove all memories associated with a particular user or group scope while preserving the memory store structure. Use this operation to handle user data deletion requests or reset memory for specific users.Delete a memory store
Remove the entire memory store and all associated memories across all scopes. This operation is irreversible.Best practices
-
Implement per-user access controls: Avoid giving agents access to memories shared across all users. Use the
scopeproperty to partition the memory store by user. When you sharescopeacross users, useuser_profile_detailsto instruct the memory system not to store personal information. -
Map scope to the end user: When you use the memory search tool, set
scopeto{{$userId}}in the tool definition. The system resolves the user identity from thex-memory-user-idrequest header, if present. Otherwise, it falls back to the caller’s Microsoft Entra token ({tid}_{oid}). - Minimize and protect sensitive data: Store only what’s necessary for your use case. If you must store sensitive data, such as personal data, health data, or confidential business inputs, redact or remove other content that could be used to trace back to an individual.
- Support privacy and compliance: Provide users with transparency, including options to access and delete their data. Record all deletions in a tamper-evident audit trail. Ensure the system adheres to local compliance requirements and regulatory standards.
- Segment data and isolate memory: In multi-agent systems, segment memory logically and operationally. Allow customers to define, isolate, inspect, and delete their own memory footprint.
- Monitor memory usage: Track token usage and memory operations to understand costs and optimize performance.
- Expose user-facing memory controls: Provide item-level edit and delete actions to support trust and data rights workflows.
- Set explicit retention defaults: Use TTL settings that match policy requirements. Document retention behavior in your product UX.
Troubleshooting
| Issue | Cause | Resolution |
|---|---|---|
| Requests fail with an authentication or authorization error. | Your identity or the project managed identity doesn’t have the required roles. | Verify the roles in Authorization and permissions. For REST calls, generate a fresh access token and retry. |
| Memories don’t appear after a conversation. | Memory updates are debounced or still processing. | Increase the wait time or call the update API with update_delay set to 0 to trigger processing immediately. |
| Memory search returns no results. | The scope value doesn’t match the scope used when memories were stored. | Use the same scope for update and search. If you map scope to users, use a stable user identifier. |
| The agent response doesn’t use stored memory. | The agent isn’t configured with the memory search tool, or the memory store name is incorrect. | Confirm the agent definition includes the memory_search_preview tool and references the correct memory store name. |
| Procedural memory or default TTL setting didn’t take effect after an update. | In the latest preview, you can only set default options at memory store creation time. | Recreate the memory store with the desired defaults or check whether your API version supports post-create option updates. |
| An explicit remember-or-forget request didn’t return memory command items in the response. | Memory tooling isn’t configured correctly, or the input wasn’t recognized as a remember-or-forget command. | Confirm the memory tool configuration and test with direct remember-or-forget phrasing. |