Vector stores for file search
Vector store objects give the file search tool the ability to search your files. When you add a file to a vector store, the service parses, chunks, embeds, and indexes it so the tool can run both keyword and semantic search.
Vector stores can be attached to both agents and conversations. Currently, you can attach at most one vector store to an agent and at most one vector store to a conversation. For a conceptual overview of conversations, see Agent runtime components.
In the current agents developer experience, response generation uses responses and conversations. Some SDKs and older samples use the term run. If you see both terms, treat run as response generation. For migration guidance, see How to migrate to the new agent service.
For a list of limits for vector search (such as maximum allowable file sizes), see the quotas and limits article.
Prerequisites
- A Microsoft Foundry project.
- An agent or conversation that uses the file search tool.
- If you use standard agent setup, connect Azure Blob Storage and Azure AI Search during setup so your files remain in your storage. See Agent environment setup.
- Roles and permissions vary by task (for example, creating projects, assigning roles for standard setup, or creating and editing agents). See the required permissions table in Agent environment setup.
- Feature availability can vary by region. For current coverage, see Microsoft Foundry feature availability across cloud regions.
Key limits and defaults
Vector stores are often the first place retrieval workflows fail in production, so it helps to know the defaults and hard limits.
- Files per vector store: Each vector store can hold up to 10,000 files.
- Attachments: You can attach at most one vector store to an agent and at most one vector store to a conversation.
- Default retrieval settings (file search):
  - Chunk size: 800 tokens
  - Chunk overlap: 400 tokens
  - Embedding model: text-embedding-3-large at 256 dimensions
  - Maximum number of chunks added to context: 20
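To make the default chunking settings concrete, the following minimal Python sketch splits a token sequence into 800-token chunks with a 400-token overlap. It only illustrates the arithmetic of the defaults. The function name `chunk_tokens` and the list-of-integers token representation are assumptions made for illustration, and the service's actual tokenizer and chunking logic might differ.

```python
from typing import List, Sequence


def chunk_tokens(
    tokens: Sequence[int],
    chunk_size: int = 800,  # default chunk size from the list above
    overlap: int = 400,     # default chunk overlap from the list above
) -> List[List[int]]:
    """Split tokens into fixed-size chunks that overlap their predecessor.

    Illustrative only: this is not the service's implementation.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    stride = chunk_size - overlap  # how far each new chunk advances
    chunks: List[List[int]] = []
    start = 0
    while start < len(tokens):
        chunks.append(list(tokens[start:start + chunk_size]))
        if start + chunk_size >= len(tokens):
            break  # this chunk already reaches the end of the input
        start += stride
    return chunks
```

With these defaults, each new chunk starts 400 tokens after the previous one, so a 2,000-token input produces four chunks in this sketch.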
Key concepts
| Term | Meaning |
|---|---|
| Vector store | A container for searchable file content (chunks and embeddings) used by the file search tool. |
| Ingestion | The asynchronous process that parses, chunks, embeds, and indexes a file for search. |
| Readiness | Whether ingestion has completed and the vector store is searchable. |
| Expiration policy | A lifecycle policy that expires a vector store after a period of inactivity. |
How vector stores work with file search
File search applies retrieval best practices to help your agent find the right content from your files. Depending on the query and your data, the tool can:
- Rewrite user queries to improve retrieval.
- Break down complex queries into multiple searches.
- Run both keyword and semantic searches across agent and conversation vector stores.
- Rerank results before adding them to the model context.
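The idea of combining keyword and semantic results and then keeping only the best chunks can be illustrated with a small Python sketch. This article doesn't describe how the service merges or reranks results, so the sketch substitutes reciprocal rank fusion, a common general-purpose technique, purely as an example. The function name `fuse_rankings`, the constant `k=60`, and the chunk IDs are all assumptions; only the limit of 20 chunks comes from Key limits and defaults.

```python
from collections import defaultdict
from typing import Dict, List

MAX_CHUNKS_IN_CONTEXT = 20  # default maximum from "Key limits and defaults"


def fuse_rankings(
    keyword_hits: List[str],
    semantic_hits: List[str],
    k: int = 60,  # assumed smoothing constant, not a service setting
    limit: int = MAX_CHUNKS_IN_CONTEXT,
) -> List[str]:
    """Merge two ranked lists of chunk IDs with reciprocal rank fusion.

    Illustrative only: the service's real merging and reranking
    behavior is not documented here.
    """
    scores: Dict[str, float] = defaultdict(float)
    for hits in (keyword_hits, semantic_hits):
        for rank, chunk_id in enumerate(hits, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by chunk ID for determinism.
    ranked = sorted(scores, key=lambda chunk_id: (-scores[chunk_id], chunk_id))
    return ranked[:limit]
```

A chunk that ranks well in both lists ends up ahead of a chunk that appears in only one, which is the intuition behind running both search types.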
Where your data lives (basic vs standard agent setup)
Where files and search resources live depends on your agent setup:
- Basic agent setup: File search uses Microsoft-managed storage and search resources.
- Standard agent setup: File search uses the Azure Blob Storage and Azure AI Search resources you connect during setup, so your files remain in your storage.
Ensure vector store readiness before creating responses
Ensure all files in a vector store are fully processed before you create a response. This step ensures that all the data in your vector store is searchable. To check readiness, use the SDK polling helpers (for example, create-and-poll and upload-and-poll) or poll the vector store object until its status is completed. For code examples, see File search tool for agents.
During ingestion, a vector store can be in in_progress status. When ingestion completes, the status changes to completed.
As a fallback, response generation includes a 60-second maximum wait when the conversation’s vector store contains files that are still being processed. This fallback wait doesn’t apply to the agent’s vector store.
End-to-end workflow checklist
Use this checklist to validate a working vector-store workflow from ingestion to lifecycle management.
- Decide whether you use basic agent setup or standard agent setup, based on where you want your files and search resources to live. See Where your data lives (basic vs standard agent setup).
- Upload your files and create a vector store. For a step-by-step example, see Upload files and add them to a vector store.
- Wait for ingestion to finish before you generate responses. Use SDK polling helpers or poll the vector store until its status is completed and no files remain in in_progress. See Ensure vector store readiness before creating responses.
- Attach the vector store to the agent or conversation that you use for file search. Keep the attachment limits in mind. See Vector stores.
- Create a response that uses file search and verify that the tool is retrieving from the expected sources. See Create response with file search and Verify results.
- Manage lifecycle: remove files you no longer need, and plan for expiration policies (especially for vector stores created by conversation helpers). See Vector stores and Conversation vector stores have default expiration policies.
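The wait step in the checklist above can be sketched as a simple polling loop for cases where you aren't using the SDK polling helpers. This is a minimal sketch, not SDK code: `get_status` is a hypothetical callable that you would implement to return the vector store's current status string, and the timeout and interval values are arbitrary assumptions. Only the `in_progress` and `completed` status values come from this article, so the sketch treats any other value as an error.

```python
import time
from typing import Callable


def wait_until_completed(
    get_status: Callable[[], str],  # hypothetical: returns the store's status
    timeout_s: float = 300.0,       # assumed overall time budget
    interval_s: float = 2.0,        # assumed delay between polls
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until the status is 'completed'. Return False on timeout."""
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status == "completed":
            return True
        if status != "in_progress":
            # Assumption: anything else means ingestion will not finish.
            raise RuntimeError(f"Unexpected vector store status: {status!r}")
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

Passing `sleep` and `clock` as parameters keeps the loop easy to test without real delays. Create the response only after the function returns `True`.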
Add files and manage vector stores
Adding files to vector stores is an asynchronous operation. To ensure ingestion completes, use the create-and-poll helpers in the official SDKs. If you aren’t using an SDK, poll the vector store until its status is completed and no files remain in in_progress.
Files can also be added to a vector store after it’s created by creating vector store files. Alternatively, you can add several files to a vector store by creating batches of up to 500 files.
When you upload a file to create a vector store, the system automatically:
- Chunks your content into manageable pieces.
- Converts each chunk into high-dimensional vectors using embedding models.
- Stores these vectors in an optimized search index.
- Creates associations between the vectors and your original content.
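If you have more files than fit in one batch, you can split the file IDs into groups that respect the 500-file batch limit mentioned earlier in this section before you submit them. The following minimal sketch shows only the grouping step. The helper name `batch_file_ids` and the file ID strings are assumptions for illustration, and the call that actually creates each batch depends on your SDK and is intentionally left out.

```python
from typing import List, Sequence

MAX_FILES_PER_BATCH = 500  # batch limit stated in this section


def batch_file_ids(
    file_ids: Sequence[str],
    batch_size: int = MAX_FILES_PER_BATCH,
) -> List[List[str]]:
    """Split file IDs into consecutive batches of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [
        list(file_ids[i:i + batch_size])
        for i in range(0, len(file_ids), batch_size)
    ]
```

For example, 1,200 file IDs are split into three batches of 500, 500, and 200. Because ingestion is asynchronous, wait for each batch to finish processing before you rely on its files in search.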
Remove files from vector stores
You can remove files from a vector store in two different ways:
- Delete the vector store file object.
- Delete the underlying file object. This removes the file from all vector store configurations across all agents and conversations in your organization.
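The difference in scope between the two options above is easy to overlook, so the following toy in-memory model illustrates it. It doesn't call any service or SDK: the `ToyFileRegistry` class, its method names, and the IDs are all hypothetical and exist only to show which vector stores are affected by each kind of deletion.

```python
from typing import Dict, Set


class ToyFileRegistry:
    """In-memory stand-in for files and vector stores (illustration only)."""

    def __init__(self) -> None:
        self.files: Set[str] = set()
        self.stores: Dict[str, Set[str]] = {}  # store ID -> file IDs

    def add(self, store_id: str, file_id: str) -> None:
        self.files.add(file_id)
        self.stores.setdefault(store_id, set()).add(file_id)

    def delete_vector_store_file(self, store_id: str, file_id: str) -> None:
        # Option 1: removes the file from one vector store only.
        self.stores[store_id].discard(file_id)

    def delete_file(self, file_id: str) -> None:
        # Option 2: removes the underlying file, and with it the file's
        # membership in every vector store.
        self.files.discard(file_id)
        for members in self.stores.values():
            members.discard(file_id)
```

In this model, deleting the vector store file leaves the file available to other vector stores, while deleting the underlying file removes it everywhere. Prefer the first option unless you intend the wider effect.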
Manage lifecycle with expiration policies
Expiration policies help you manage vector store lifecycle. You can set these policies when creating or updating the vector store object.
Conversation vector stores have default expiration policies
Vector stores created using conversation helpers have a default expiration policy of seven days after they were last active (defined as the last time the vector store was used during response generation).
When a vector store expires, response generation for that conversation fails. To fix the issue, create a new vector store with the same files and attach it to the conversation. For more detail, see Conversation vector stores have default expiration policies.
Supported file types and key limits
For the supported file types list and encoding requirements, see Supported file types.
Key limits to keep in mind:
- You can attach at most one vector store to an agent and at most one vector store to a conversation.
- File size and token limits vary by feature. See Quotas and limits.
Troubleshooting
- Your vector store isn’t searchable yet: Wait for ingestion to finish. Use SDK polling helpers or poll the vector store until its status is completed.
- Response generation fails after a few days: Your conversation vector store might have expired. Create a new vector store with the same files and attach it to the conversation.
- A file disappeared from multiple agents or conversations: You might have deleted the underlying file object, which removes the file from all vector store configurations across your organization.
- Uploads or ingestion fail: Check file size and token limits in Quotas and limits.
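To anticipate the expiration failure described in the list above, you can estimate when a conversation vector store expires from the time it was last used. The following sketch assumes that you track the last-active timestamp yourself. The function names are hypothetical, and treating the exact seven-day mark as already expired is an assumption. Only the seven-day inactivity window comes from this article.

```python
from datetime import datetime, timedelta, timezone

# Default inactivity window for vector stores created by conversation helpers.
DEFAULT_INACTIVITY_WINDOW = timedelta(days=7)


def expires_at(
    last_active_at: datetime,
    window: timedelta = DEFAULT_INACTIVITY_WINDOW,
) -> datetime:
    """Return the estimated expiration time for a given last-active time."""
    return last_active_at + window


def is_expired(
    last_active_at: datetime,
    now: datetime,
    window: timedelta = DEFAULT_INACTIVITY_WINDOW,
) -> bool:
    """Return True if the window has elapsed (boundary handling is assumed)."""
    return now >= expires_at(last_active_at, window)


# Example with fixed, timezone-aware timestamps.
last_used = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
check_time = datetime(2025, 1, 9, 12, 0, tzinfo=timezone.utc)
needs_new_store = is_expired(last_used, check_time)
```

If the check indicates expiration, create a new vector store with the same files and attach it before you generate the next response.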
Next steps
- Learn more about the file search tool
- Review tool best practices for guidance on reliability and security
- Learn about agent runtime components