Create and Use Memory - Microsoft Foundry Docs

Memory (preview) in Foundry Agent Service and the Memory Store API (preview) are licensed to you as part of your Azure subscription and are subject to terms applicable to “Previews” in the Microsoft Product Terms, the Microsoft Products and Services Data Protection Addendum, and the Supplemental Terms of Use for Microsoft Azure Previews.The latest preview offers new capabilities and enhancements, including:

Memory item operations to create, read, update, list, and delete individual memory records.
Store-level default retention controls, including default TTL for newly created memory stores.
Direct remember-or-forget synchronized memory command behavior.

Memory in Foundry Agent Service is a managed, long-term memory solution. It enables agent continuity across sessions, devices, and workflows. By creating and managing memory stores, you can build agents that retain user preferences, maintain conversation history, and deliver personalized experiences. Memory stores act as persistent storage, defining which types of information are relevant to each agent. You control access using the scope parameter, which segments memory across users to ensure secure and isolated experiences. This article explains how to create, manage, and use memory stores. For conceptual information, see Memory in Foundry Agent Service.

Usage support

Capability	Python SDK	C# SDK	JavaScript SDK	REST API
Create, update, list, and delete memory stores	✔️	✔️	✔️	✔️
Attach memory to a prompt agent	✔️	✔️	✔️	✔️
Update and search memories	✔️	✔️	✔️	✔️
Create, read, update, list, and delete memory items	✔️	✔️	✔️	✔️

Prerequisites

An Azure subscription. Create one for free.
A Microsoft Foundry project with configured authorization and permissions.
A chat model deployment, such as gpt-5.2, in your project.
An embedding model deployment, such as text-embedding-3-small, in your project.
A configured local environment with required packages and environment variables.

Authorization and permissions

We recommend role-based access control for production deployments. If roles aren’t feasible, skip this section and use key-based authentication instead. To configure role-based access:

Sign in to the Azure portal.
On your project:
1. From the left pane, select Resource Management > Identity.
2. Use the toggle to enable a system-assigned managed identity.
On the resource that contains your project:
1. From the left pane, select Access control (IAM).
2. Select Add > Add role assignment.
3. Assign Foundry User to the managed identity of your project.

The Foundry RBAC roles were recently renamed. Foundry User, Foundry Owner, Foundry Account Owner, and Foundry Project Manager were previously named Azure AI User, Azure AI Owner, Azure AI Account Owner, and Azure AI Project Manager. You might still see the previous names in some places while the rename rolls out. The role IDs and core permissions are unchanged by the rename.

Set up your environment

Understand scope

The scope parameter controls how memory is partitioned. Each scope in the memory store keeps an isolated collection of memory items. For example, if you create a customer support agent with memory, each customer should have their own individual memory. As a developer, you choose the key used to store and retrieve memory items. The right approach depends on how you access memory.

Via the memory search tool

When you attach the memory search tool to an agent, set scope to {{$userId}} to enable per-user memory isolation without hard-coding identifiers. The system automatically resolves the end user’s identity on each response call from one of two sources:

x-memory-user-id request header: If present, the header value is used as the user ID. Use this in proxy or backend scenarios where your service calls the API on behalf of an end user.
Microsoft Entra authentication token: If the header isn’t set, the system falls back to the caller’s tenant ID (TID) and object ID (OID). This is the default in frontend scenarios where users authenticate directly with Microsoft Entra.

If you don’t need per-user isolation, use a static scope value instead.

Via low-level memory APIs

When you call memory APIs directly, specify scope explicitly in each request. You can pass a static value, such as a universally unique identifier (UUID) or another stable identifier from your system. Automatic identity extraction isn’t supported for these operations.

Create a memory store

Create a dedicated memory store for each agent to establish clear boundaries for memory access and optimization. When you create a memory store, specify the chat model and embedding model deployments that process your memory content. Use memory store options to control extraction behavior and retention defaults. In the latest preview, you can enable procedural memory and set a default TTL (seconds) for newly created memory entries.

The remaining Python, C#, and TypeScript snippets build on the client and variables defined in Create a memory store. If you run those code snippets independently, include the import and client initialization code from this section.
The C# snippets in this article use synchronous methods. For asynchronous usage, see the memory search tool and memory store samples.

Customize memory

Customize what information the agent stores to keep memory efficient, relevant, and privacy-respecting. Use the user_profile_details parameter to specify the types of data that are critical to the agent’s function. For example, set user_profile_details to prioritize “flight carrier preference and dietary restrictions” for a travel agent. This focused approach helps the memory system know which details to extract, summarize, and commit to long-term memory. You can also use this parameter to exclude certain types of data, keeping memory lean and compliant with privacy requirements. For example, set user_profile_details to “avoid irrelevant or sensitive data, such as age, financials, precise location, and credentials.”

Configure TTL and retention policies

TTL applies to all memories, whether from direct memory commands, extraction and consolidation, or item-level CRUD operations. If a memory is updated and consolidated, the service resets its last-updated time. TTL applies only to memory stores created after TTL support was introduced. It doesn’t affect existing memory stores. A default_ttl_seconds value of 0 indicates no expiration. Choose a retention period that matches your compliance and user-data lifecycle requirements.

Update a memory store

Update memory store properties, such as description or metadata, to better manage memory stores.

List memory stores

Retrieve a list of memory stores in your project to manage and monitor your memory infrastructure.

Use memories via an agent tool

After you create a memory store, you can attach the memory search tool to a prompt agent. This tool enables the agent to read from and write to your memory store during conversations. Configure the tool with the appropriate scope and update_delay to control how and when memories are updated.

To scope memories to an individual end user, set scope to "{{$userId}}" in the tool definition and pass x-memory-user-id: <user-id> as a header on each response call. The system resolves the scope to that user’s identity. Without the header, the scope falls back to the caller’s Microsoft Entra identity (TID and OID). For more information, see Understand scope.

Create a conversation

You can now create conversations and request agent responses. At the start of each conversation, static memories are injected so the agent has immediate, persistent context. Contextual memories are retrieved per turn based on the latest messages to inform each response. After each agent response, the service internally calls update_memories. However, actual writes to long‑term memory are debounced by the update_delay setting. The update is scheduled and only completes after the configured period of inactivity.

In the updated preview schema, the memory search tool output uses a memories collection instead of the legacy results field. If you process raw output payloads, update your parsers accordingly.

Apply direct remember-or-forget behavior

When a user explicitly asks the agent to remember or forget information, the memory search tool in the tools array applies the operation immediately and returns the result as memory command items in the response output. No additional tool configuration is required.

Direct memory commands don’t override memory TTL. If a memory store has TTL configured, memory items can still expire, even if they were added by a remember command.

Use memories via APIs

You can interact with a memory store directly using the memory store APIs. Start by adding memories from conversation content to the memory store, and then search for relevant memories to provide context for agent interactions.

Add memories to a memory store

Add memories by providing conversation content to the memory store. The system preprocesses and postprocesses the data, including memory extraction and consolidation, to optimize the agent’s memory. This long-running operation might take about one minute. Decide how to segment memory across users by specifying the scope parameter. You can scope the memory to a specific end user, a team, or another identifier. You can update a memory store with content from multiple conversation turns, or update after each turn and chain updates using the previous update operation ID.

Search for memories in a memory store

Search memories to retrieve relevant context for agent interactions. Specify the memory store name and scope to narrow the search.

Retrieve static or contextual memories

Often, user profile memories can’t be retrieved based on semantic similarity to a user’s message. We recommend that you inject static memories into the beginning of each conversation and use contextual memories to generate each agent response.

To retrieve static memories, call search_memories with a scope but without items or previous_search_id. This returns user profile memories associated with the scope.
To retrieve contextual memories, call search_memories with items set to the latest messages. This can return both user profile and chat summary memories most relevant to the given items.

For more information about user profile and chat summary memories, see Memory types.

Manage memory items

Use item-level operations to directly create, inspect, update, and delete individual memory records. For scope-level or store-level deletion, see Delete memories.

The latest preview uses /memories as the item-level path segment. The previous preview used /items, with :list for listing. If you’re on the previous API version, update your routes accordingly.

Create a memory item

Get a memory item

List memory items

Update a memory item

Delete a memory item

Delete memories

Before you delete a memory store, consider the impact on dependent agents. Agents with attached memory stores might lose access to historical context.

Memories are organized by scope within a memory store. You can delete memories for a specific scope to remove user-specific data, or you can delete the entire memory store to remove all memories across all scopes.

Delete memories by scope

Remove all memories associated with a particular user or group scope while preserving the memory store structure. Use this operation to handle user data deletion requests or reset memory for specific users.

Delete a memory store

Remove the entire memory store and all associated memories across all scopes. This operation is irreversible.

Best practices

Implement per-user access controls: Avoid giving agents access to memories shared across all users. Use the scope property to partition the memory store by user. When you share scope across users, use user_profile_details to instruct the memory system not to store personal information.
Map scope to the end user: When you use the memory search tool, set scope to {{$userId}} in the tool definition. The system resolves the user identity from the x-memory-user-id request header, if present. Otherwise, it falls back to the caller’s Microsoft Entra token ({tid}_{oid}).
Minimize and protect sensitive data: Store only what’s necessary for your use case. If you must store sensitive data, such as personal data, health data, or confidential business inputs, redact or remove other content that could be used to trace back to an individual.
Support privacy and compliance: Provide users with transparency, including options to access and delete their data. Record all deletions in a tamper-evident audit trail. Ensure the system adheres to local compliance requirements and regulatory standards.
Segment data and isolate memory: In multi-agent systems, segment memory logically and operationally. Allow customers to define, isolate, inspect, and delete their own memory footprint.
Monitor memory usage: Track token usage and memory operations to understand costs and optimize performance.
Expose user-facing memory controls: Provide item-level edit and delete actions to support trust and data rights workflows.
Set explicit retention defaults: Use TTL settings that match policy requirements. Document retention behavior in your product UX.

Troubleshooting

Issue	Cause	Resolution
Requests fail with an authentication or authorization error.	Your identity or the project managed identity doesn’t have the required roles.	Verify the roles in Authorization and permissions. For REST calls, generate a fresh access token and retry.
Memories don’t appear after a conversation.	Memory updates are debounced or still processing.	Increase the wait time or call the update API with `update_delay` set to `0` to trigger processing immediately.
Memory search returns no results.	The `scope` value doesn’t match the scope used when memories were stored.	Use the same scope for update and search. If you map scope to users, use a stable user identifier.
The agent response doesn’t use stored memory.	The agent isn’t configured with the memory search tool, or the memory store name is incorrect.	Confirm the agent definition includes the `memory_search_preview` tool and references the correct memory store name.
Procedural memory or default TTL setting didn’t take effect after an update.	In the latest preview, you can only set default options at memory store creation time.	Recreate the memory store with the desired defaults or check whether your API version supports post-create option updates.
An explicit remember-or-forget request didn’t return memory command items in the response.	Memory tooling isn’t configured correctly, or the input wasn’t recognized as a remember-or-forget command.	Confirm the memory tool configuration and test with direct remember-or-forget phrasing.

​Usage support

​Prerequisites

​Authorization and permissions

​Set up your environment

​Understand scope

​Via the memory search tool

​Via low-level memory APIs

​Create a memory store

​Customize memory

​Configure TTL and retention policies

​Update a memory store

​List memory stores

​Use memories via an agent tool

​Create a conversation

​Apply direct remember-or-forget behavior

​Use memories via APIs

​Add memories to a memory store

​Search for memories in a memory store

​Retrieve static or contextual memories

​Manage memory items

​Create a memory item

​Get a memory item

​List memory items

​Update a memory item

​Delete a memory item

​Delete memories

​Delete memories by scope

​Delete a memory store

​Best practices

​Troubleshooting

​Related content

Usage support

Prerequisites

Authorization and permissions

Set up your environment

Understand scope

Via the memory search tool

Via low-level memory APIs

Create a memory store

Customize memory

Configure TTL and retention policies

Update a memory store

List memory stores

Use memories via an agent tool

Create a conversation

Apply direct remember-or-forget behavior

Use memories via APIs

Add memories to a memory store

Search for memories in a memory store

Retrieve static or contextual memories

Manage memory items

Create a memory item

Get a memory item

List memory items

Update a memory item

Delete a memory item

Delete memories

Delete memories by scope

Delete a memory store

Best practices

Troubleshooting

Related content