Learn how to use OpenAI-compatible LangChain classes with chat and embedding models deployed in Microsoft Foundry, including prompt chains, async calls, and vector search.
Use langchain-azure-ai to build LangChain apps that call models deployed
in Microsoft Foundry. You can use any model that exposes an OpenAI-compatible
API directly. In this article, you create
chat and embeddings clients, run prompt chains, and combine generation with
verification workflows.
The Foundry RBAC roles were recently renamed. Foundry User, Foundry Owner, Foundry Account Owner, and Foundry Project Manager were previously named Azure AI User, Azure AI Owner, Azure AI Account Owner, and Azure AI Project Manager. You might still see the previous names in some places while the rename rolls out. The role IDs and core permissions are unchanged by the rename.
A deployed chat model that supports OpenAI-compatible APIs, such as
gpt-4.1 or Mistral-Large-3.
A deployed embeddings model, such as text-embedding-3-large.
langchain-azure-ai uses the new Microsoft Foundry SDK (v2). If you’re using Foundry classic, use langchain-azure-ai[v1],
which uses Azure AI Inference SDK (legacy). Learn more.
You can easily instantiate a model by using init_chat_model:
from langchain.chat_models import init_chat_model

model = init_chat_model("azure_ai:gpt-4.1")
Using init_chat_model requires langchain>=1.2.13. If you can’t update your version, configure clients directly.
All Foundry models supporting OpenAI-compatible APIs can be used with the client, but they need to be deployed to your Foundry resource first. Using project_endpoint (environment variable AZURE_AI_PROJECT_ENDPOINT) requires Microsoft Entra ID for authentication and the role Foundry User.
What this snippet does: Creates a chat model client by using the
init_chat_model convenience method. The client routes to the specified model
through the Foundry project endpoint or direct endpoint configured in the environment.
You can also create a runtime-configurable model by specifying configurable_fields. When you omit the model parameter, it becomes a configurable field by default.
from langchain.chat_models import init_chat_model
from azure.identity import DefaultAzureCredential

configurable_model = init_chat_model(
    model_provider="azure_ai",
    temperature=0,
    credential=DefaultAzureCredential(),
)

configurable_model.invoke(
    "what's your name",
    config={"configurable": {"model": "gpt-5-nano"}},  # Run with GPT-5-nano
).pretty_print()

configurable_model.invoke(
    "what's your name",
    config={"configurable": {"model": "Mistral-Large-3"}},  # Run with Mistral Large
).pretty_print()
================================== Ai Message ==================================
Hi! I'm ChatGPT, an AI assistant built by OpenAI. You can call me ChatGPT or just Assistant. How can I help you today?
================================== Ai Message ==================================
I don't have a name, but you can call me **Assistant** or anything you like! 😊 What can I help you with today?
What this snippet does: Creates a configurable model instance that allows you to switch
models easily at invocation time. Because the model parameter is missing in init_chat_model,
it’s by default a configurable field and can be passed with invoke(). You can add other
fields to be configurable by configuring configurable_fields.
Use asynchronous credentials if your app calls models with ainvoke. When using Microsoft Entra ID for authentication, use
the corresponding asynchronous implementation for credentials:
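A minimal sketch of that pattern, assuming azure-identity and langchain-azure-ai are installed and that the environment is configured as described earlier. The helper name ask_async and the deployment name gpt-4.1 are illustrative, and the imports sit inside the function so the module still loads where the packages aren't installed.

```python
async def ask_async(prompt: str, model_name: str = "azure_ai:gpt-4.1") -> str:
    """Call a Foundry chat model with ainvoke and an async Entra ID credential."""
    # Lazy imports: both packages are assumed to be installed at call time.
    from azure.identity.aio import DefaultAzureCredential
    from langchain.chat_models import init_chat_model

    # The async credential holds network resources; the context manager
    # closes it once the call completes.
    async with DefaultAzureCredential() as credential:
        model = init_chat_model(model_name, credential=credential)
        response = await model.ainvoke(prompt)
        return response.text
```

To try it against a real deployment, run the coroutine with asyncio.run(ask_async("What's your name?")).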
Many models can perform multi-step reasoning to arrive at a conclusion. This involves breaking down complex problems into smaller, more manageable steps.
from langchain.chat_models import init_chat_model

model = init_chat_model("azure_ai:DeepSeek-R1-0528")

for chunk in model.stream("Why do parrots have colorful feathers?"):
    reasoning_steps = [r for r in chunk.content_blocks if r["type"] == "reasoning"]
    print(reasoning_steps if reasoning_steps else chunk.text, end="")

print("\n")
Parrots have colorful feathers primarily due to a combination of evolutionary ...
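If you want the reasoning and the final answer as separate strings rather than one interleaved stream, you can split the content blocks by type. The following is a sketch only: it runs on hypothetical sample blocks, not a live model, and it assumes reasoning blocks carry their text under a "reasoning" key and text blocks under a "text" key. The helper name split_reasoning is illustrative.

```python
from typing import Any


def split_reasoning(blocks: list[dict[str, Any]]) -> tuple[str, str]:
    """Return (reasoning, answer) text gathered from content blocks."""
    reasoning_parts: list[str] = []
    answer_parts: list[str] = []
    for block in blocks:
        if block.get("type") == "reasoning":
            # Assumed shape: {"type": "reasoning", "reasoning": "..."}
            reasoning_parts.append(block.get("reasoning", ""))
        elif block.get("type") == "text":
            # Assumed shape: {"type": "text", "text": "..."}
            answer_parts.append(block.get("text", ""))
    return "".join(reasoning_parts), "".join(answer_parts)


# Hypothetical blocks, as they might accumulate across streamed chunks.
sample_blocks = [
    {"type": "reasoning", "reasoning": "Consider pigments, "},
    {"type": "reasoning", "reasoning": "then mate selection."},
    {"type": "text", "text": "Parrots have colorful feathers "},
    {"type": "text", "text": "primarily due to evolution."},
]

reasoning, answer = split_reasoning(sample_blocks)
print("Reasoning:", reasoning)
print("Answer:", answer)
```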
OpenAI models deployed in Foundry support server-side tool-calling loops: models can interact with web search, code interpreters, and other tools, and then analyze the results in a single conversational turn.
If a model invokes a tool server-side, the content of the response message will include content representing the invocation and result of the tool.
Tools in the namespace langchain_azure_ai.tools.builtin are only supported in OpenAI models.
These are tools provided by OpenAI that extend the model’s capabilities. To see the full list of supported tools, see built-in tools.
The following example shows how to use web search:
from langchain.chat_models import init_chat_model
from langchain_azure_ai.tools.builtin import WebSearchTool
from azure.identity import DefaultAzureCredential

model = init_chat_model("azure_ai:gpt-4.1", credential=DefaultAzureCredential())
model_with_web_search = model.bind_tools([WebSearchTool()])

result = model_with_web_search.invoke("What is the current price of gold? Give me the answer in one sentence.")
result.content[-1]["text"]
As of today, March 24, 2026, the spot price of gold is approximately $4,397.80 per ounce. ([tradingeconomics.com](https://tradingeconomics.com/commodity/gold))
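Because the response content mixes tool invocation blocks with the answer, indexing with [-1] works only when the last block is the text block. The sketch below is one possible defensive alternative: it counts the block types and returns the text of the last text block. The sample content is hypothetical, and the exact block shapes (for example, a web_search_call block) are an assumption; inspect result.content in your own environment to confirm them.

```python
from collections import Counter
from typing import Any, Union


def summarize_content(content: Union[str, list]) -> tuple[Counter, str]:
    """Count content block types and return the last text block's text."""
    if isinstance(content, str):
        # Plain-string content carries no tool blocks.
        return Counter({"text": 1}), content
    types: Counter = Counter()
    final_text = ""
    for block in content:
        if isinstance(block, str):
            types["text"] += 1
            final_text = block
            continue
        block_type: Any = block.get("type", "unknown")
        types[block_type] += 1
        if block_type == "text":
            final_text = block.get("text", "")
    return types, final_text


# Hypothetical content list for a response that used web search.
sample_content = [
    {"type": "web_search_call", "id": "ws_hypothetical", "status": "completed"},
    {"type": "text", "text": "Gold is trading near $4,397.80 per ounce.", "annotations": []},
]

block_types, text = summarize_content(sample_content)
print(dict(block_types))
print(text)
```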
Some tools might require configuration of other resources in your project. Use azure-ai-projects to configure those resources and then reference them from LangChain/LangGraph.
The following example shows how to configure a file store before using it in a tool:
import os
from azure.ai.projects import AIProjectClient
from azure.identity import DefaultAzureCredential

# Create clients to call Foundry API
project = AIProjectClient(
    endpoint=os.environ["AZURE_AI_PROJECT_ENDPOINT"],
    credential=DefaultAzureCredential(),
)
openai = project.get_openai_client()

# Create vector store and upload file
vector_store = openai.vector_stores.create(name="ProductInfoStore")
vector_store_id = vector_store.id

with open("product_info.md", "rb") as file_handle:
    vector_store_file = openai.vector_stores.files.upload_and_poll(
        vector_store_id=vector_store.id,
        file=file_handle,
    )
What this snippet does: Sets up a vector store with a file in Microsoft
Foundry so that a model can later search over that file’s content (used
with the FileSearchTool in the next code block).
from langchain_azure_ai.tools.builtin import FileSearchTool

model_with_tools = model.bind_tools([FileSearchTool(vector_store_ids=[vector_store.id])])

results = model_with_tools.invoke("Tell me about Contoso products")
print("Answer:", results.content[-1]["text"])
print("Annotations:", results.content[-1]["annotations"])
Answer: Contoso offers the following products:
1. **The widget**
   - Description: A high-quality widget that is perfect for all your widget needs.
   - Price: $19.99
2. **The gadget**
   - Description: An advanced gadget that offers exceptional performance and reliability.
   - Price: $49.99
These products are part of Contoso's main offerings as detailed in their product information documentation.
Annotations: [{'file_id': 'assistant-MvU5SEqUcUBumoLUV5BXxn', 'filename': 'product_info.md', 'type': 'file_citation', 'file_index': 395}]
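The annotations in the output above identify which uploaded files the answer cites. As a small sketch, the following helper collects the unique file names from file_citation annotations. It runs on sample data modeled on that output; the second annotation and the helper name cited_filenames are hypothetical additions for illustration.

```python
from typing import Any


def cited_filenames(content: list[dict[str, Any]]) -> list[str]:
    """Return unique filenames from file_citation annotations, in first-seen order."""
    seen: dict[str, None] = {}
    for block in content:
        for annotation in block.get("annotations", []):
            if annotation.get("type") == "file_citation":
                seen.setdefault(annotation.get("filename", ""), None)
    return list(seen)


# Sample content modeled on the output above; the second annotation is hypothetical.
sample_content = [
    {
        "type": "text",
        "text": "Contoso offers the following products: ...",
        "annotations": [
            {
                "file_id": "assistant-MvU5SEqUcUBumoLUV5BXxn",
                "filename": "product_info.md",
                "type": "file_citation",
                "file_index": 395,
            },
            {
                "file_id": "assistant-MvU5SEqUcUBumoLUV5BXxn",
                "filename": "product_info.md",
                "type": "file_citation",
                "file_index": 120,
            },
        ],
    }
]

print(cited_filenames(sample_content))
```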
Use create_agent with models connected to Foundry to create ReAct-style agent loops:
from langchain.agents import create_agent

agent = create_agent(
    model="azure_ai:gpt-5.2",
    system_prompt="You're an informational agent. Answer questions cheerfully.",
)

response = agent.invoke({"messages": "what's your name?"})
response["messages"][-1].pretty_print()
================================== Ai Message ==================================
I’m ChatGPT, your AI assistant.
Server-side tools can also be used, but they require calling bind_tools.
from langchain.chat_models import init_chat_model
from langchain.agents import create_agent
from langchain_azure_ai.tools.builtin import ImageGenerationTool

model = init_chat_model("azure_ai:gpt-5.2")
tools = [ImageGenerationTool(model="gpt-image-1.5", size="1024x1024")]
model_with_tools = model.bind_tools(tools)

agent = create_agent(
    model=model_with_tools,
    tools=tools,
    system_prompt="You're an informational agent. Answer questions with graphics.",
)
The image generation tool in Foundry requires passing the model deployment name for image generation
as part of a header, x-ms-oai-image-generation-deployment. When using langchain-azure-ai, this is handled
automatically. However, if you plan to use this tool with langchain-openai, you must pass the header
manually.
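As a rough sketch of the manual approach, the helper below builds that header for a given image generation deployment. The helper name is illustrative, and passing the result through the default_headers argument of ChatOpenAI is an assumption about how you would wire it into langchain-openai, not a documented Foundry recipe.

```python
IMAGE_DEPLOYMENT_HEADER = "x-ms-oai-image-generation-deployment"


def image_generation_headers(deployment_name: str) -> dict[str, str]:
    """Build the extra header that names the image generation deployment."""
    if not deployment_name.strip():
        raise ValueError("deployment_name must not be empty")
    return {IMAGE_DEPLOYMENT_HEADER: deployment_name}


headers = image_generation_headers("gpt-image-1.5")
print(headers)
# Assumption: with langchain-openai, these headers could be supplied when
# constructing the client, for example ChatOpenAI(..., default_headers=headers).
```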
You can easily instantiate a model by using init_embeddings:
from langchain.embeddings import init_embeddings

embed_model = init_embeddings("azure_ai:text-embedding-3-small")
What this snippet does: Creates an embeddings model client by using the
init_embeddings convenience method.
All Foundry models supporting OpenAI-compatible APIs can be used with the client, but they need to be deployed to your Foundry resource first. Using project_endpoint (environment variable AZURE_AI_PROJECT_ENDPOINT) requires Microsoft Entra ID for authentication and the role Foundry User.
Or create the embeddings client with AzureAIOpenAIApiEmbeddingsModel.
Description: Direct OpenAI-compatible endpoint used for model calls.
Example: https://contoso.services.ai.azure.com/openai/v1
Parameter: endpoint

Environment variable: OPENAI_API_KEY or AZURE_OPENAI_API_KEY
Description: API key used with OPENAI_BASE_URL or AZURE_OPENAI_ENDPOINT for key-based authentication.
Example: <your-api-key>
Parameter: credential

Environment variable: AZURE_OPENAI_DEPLOYMENT_NAME
Description: Model’s deployment name in the Foundry or OpenAI resource. Check the name in the Foundry portal, because deployment names can differ from the underlying model. Any model that supports OpenAI-compatible APIs can be used, but not all parameters might be supported.
Example: Mistral-Large-3
Parameter: model

Environment variable: AZURE_OPENAI_API_VERSION
Description: The API version to use. When api_version is set, the OpenAI clients are constructed with the api-version query parameter injected through default_query.
Example: v1 or preview
Parameter: api_version
Environment variables AZURE_AI_INFERENCE_ENDPOINT and AZURE_AI_CREDENTIALS used for AzureAIChatCompletionsModel or AzureAIEmbeddingsModel (legacy) are no longer used.
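To check which settings a client would pick up before you create it, you can resolve the environment variables yourself. The sketch below maps the variables named in this section to their parameters and flags the legacy variables that are no longer read. The helper names, and the choice to prefer the AZURE_-prefixed variable when both forms are set, are assumptions for illustration and might not match the library's own precedence.

```python
import os
from typing import Mapping, Optional

# Parameter name -> environment variables checked in order (order is an assumption).
SETTING_SOURCES = {
    "endpoint": ("AZURE_OPENAI_ENDPOINT", "OPENAI_BASE_URL"),
    "credential": ("AZURE_OPENAI_API_KEY", "OPENAI_API_KEY"),
    "model": ("AZURE_OPENAI_DEPLOYMENT_NAME",),
    "api_version": ("AZURE_OPENAI_API_VERSION",),
}
LEGACY_VARIABLES = ("AZURE_AI_INFERENCE_ENDPOINT", "AZURE_AI_CREDENTIALS")


def resolve_settings(env: Optional[Mapping[str, str]] = None) -> dict[str, str]:
    """Return the parameters that can be filled from the environment."""
    env = os.environ if env is None else env
    settings: dict[str, str] = {}
    for parameter, names in SETTING_SOURCES.items():
        for name in names:
            value = env.get(name)
            if value:
                settings[parameter] = value
                break  # first variable that is set wins
    return settings


def legacy_variables_set(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """List legacy variables that are set but no longer used."""
    env = os.environ if env is None else env
    return [name for name in LEGACY_VARIABLES if env.get(name)]


# Hypothetical environment built from the example values in this section.
sample_env = {
    "AZURE_OPENAI_ENDPOINT": "https://contoso.services.ai.azure.com/openai/v1",
    "AZURE_OPENAI_API_KEY": "<your-api-key>",
    "AZURE_OPENAI_DEPLOYMENT_NAME": "Mistral-Large-3",
    "AZURE_AI_INFERENCE_ENDPOINT": "https://legacy.example",
}

settings = resolve_settings(sample_env)
legacy = legacy_variables_set(sample_env)
print(settings)
print("Legacy variables still set:", legacy)
```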