Deploy and use Claude models in Microsoft Foundry (preview)

This article refers to the Microsoft Foundry (new) portal.
Anthropic’s Claude models bring advanced conversational AI capabilities to Microsoft Foundry, enabling you to build intelligent applications with state-of-the-art language understanding and generation. Claude models excel at complex reasoning, code generation, and multimodal tasks including image analysis. In this article, you learn how to:
  • Deploy Claude models in Microsoft Foundry
  • Authenticate by using Microsoft Entra ID or API keys
  • Call the Claude Messages API from Python, JavaScript, or REST
  • Choose the right Claude model for your use case
Claude models in Foundry include:
Model family  | Models
Claude Opus   | claude-opus-4-6 (preview), claude-opus-4-5 (preview), claude-opus-4-1 (preview)
Claude Sonnet | claude-sonnet-4-5 (preview)
Claude Haiku  | claude-haiku-4-5 (preview)
To learn more about the individual models, see Available Claude models.
To use Claude models in Microsoft Foundry, you need a paid Azure subscription with a billing account in a country or region where Anthropic offers the models for purchase. The following paid subscription types are currently restricted: Cloud Solution Providers (CSP), sponsored accounts with Azure credits, enterprise accounts in Singapore and South Korea, and Microsoft accounts. For a list of common subscription-related errors, see Common error messages and solutions.

Prerequisites

Deploy Claude models

Claude models in Foundry are available for global standard deployment. To deploy a Claude model, follow the instructions in Deploy Microsoft Foundry Models in the Foundry portal. After deployment, use the Foundry playground to interactively test the model.

Call the Claude Messages API

After you deploy a Claude model, interact with it to generate text responses:
  • Use the Anthropic SDKs and the following Claude APIs:
    • Messages API: Send a structured list of input messages with text or image content. The model generates the next message in the conversation.
    • Token Count API: Count the number of tokens in a message (a short sketch follows this list).
    • Files API: Upload and manage files for use with the Claude API without re-uploading content with each request.
    • Skills API: Create custom skills for Claude AI.
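For example, the Token Count API lets you estimate the size of a request before sending it. The following is a minimal sketch, assuming an API key and the same placeholder base URL and deployment name used in the authentication examples later in this article:

    from anthropic import AnthropicFoundry

    # Placeholder values; see the authentication sections below for details.
    client = AnthropicFoundry(
        api_key="YOUR_API_KEY",
        base_url="https://<resource-name>.services.ai.azure.com/anthropic",
    )

    # Count input tokens without generating a response.
    count = client.messages.count_tokens(
        model="claude-sonnet-4-5",  # your deployment name
        messages=[{"role": "user", "content": "What is the capital of France?"}],
    )
    print(count.input_tokens)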

Send messages with authentication

The following examples show how to send requests to Claude Sonnet 4.5 using Microsoft Entra ID or API key authentication. To work with your deployed model, you need:
  • Your base URL, which is of the form https://<resource name>.services.ai.azure.com/anthropic.
  • Your target URI from your deployment details, which is of the form https://<resource name>.services.ai.azure.com/anthropic/v1/messages.
  • Microsoft Entra ID for keyless authentication or your deployment’s API key for API authentication.
  • The deployment name you chose during deployment creation. This name can be different from the model ID.

Use Microsoft Entra ID authentication

For Messages API endpoints, use your base URL with Microsoft Entra ID authentication.
  1. Install the Azure Identity client library: This library provides DefaultAzureCredential, which simplifies authorization by finding the best available credential in its running environment.
    pip install azure-identity
    
    Set the values of the client ID, tenant ID, and client secret of the Microsoft Entra ID application as environment variables: AZURE_CLIENT_ID, AZURE_TENANT_ID, AZURE_CLIENT_SECRET.
    export AZURE_CLIENT_ID="<AZURE_CLIENT_ID>"
    export AZURE_TENANT_ID="<AZURE_TENANT_ID>"
    export AZURE_CLIENT_SECRET="<AZURE_CLIENT_SECRET>"
    
  2. Install dependencies: Install the Anthropic SDK by using pip (requires Python 3.8 or later).
    pip install -U "anthropic"
    
  3. Run a basic code sample to complete the following tasks:
    1. Create a client with the Anthropic SDK, using Microsoft Entra ID authentication.
    2. Make a basic call to the Messages API. The call is synchronous.
    from anthropic import AnthropicFoundry
    from azure.identity import DefaultAzureCredential, get_bearer_token_provider
    
    baseURL = "https://<resource-name>.services.ai.azure.com/anthropic" # Your base URL. Replace <resource-name> with your resource name
    deploymentName = "claude-sonnet-4-5" # Replace with your deployment name
    
    # Create token provider for Entra ID authentication
    tokenProvider = get_bearer_token_provider(
        DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
    )
    
    # Create client with Entra ID authentication
    client = AnthropicFoundry(
        azure_ad_token_provider=tokenProvider,
        base_url=baseURL
    )
    
    # Send request
    message = client.messages.create(
        model=deploymentName,
        messages=[
            {"role": "user", "content": "What is the capital/major city of France?"}
        ],
        max_tokens=1024,
    )
    
    print(message.content)
    
    Expected output: A JSON response with the model’s text completion in message.content, such as "The capital/major city of France is Paris." Reference: Anthropic Client SDK, DefaultAzureCredential

Use API key authentication

For Messages API endpoints, use your base URL and API key to authenticate against the service.
  1. Install dependencies: Install the Anthropic SDK by using pip (requires Python 3.8 or later):
    pip install -U "anthropic"
    
  2. Run a basic code sample to complete the following tasks:
    1. Create a client with the Anthropic SDK by passing your API key to the SDK’s configuration. This authentication method lets you interact seamlessly with the service.
    2. Make a basic call to the Messages API. The call is synchronous.
    from anthropic import AnthropicFoundry
    
    baseURL = "https://<resource-name>.services.ai.azure.com/anthropic" # Your base URL. Replace <resource-name> with your resource name
    deploymentName = "claude-sonnet-4-5" # Replace with your deployment name
    apiKey = "YOUR_API_KEY" # Replace YOUR_API_KEY with your API key
    
    # Create client with API key authentication
    client = AnthropicFoundry(
        api_key=apiKey,
        base_url=baseURL
    )
    
    # Send request
    message = client.messages.create(
        model=deploymentName,
        messages=[
            {"role": "user", "content": "What is the capital/major city of France?"}
        ],
        max_tokens=1024,
    )
    
    print(message.content)
    
    Expected output: A JSON response with the model’s text completion in message.content, such as "The capital/major city of France is Paris." Reference: Anthropic Client SDK
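You can also stream tokens as they're generated instead of waiting for the complete response. The following is a minimal sketch, assuming the AnthropicFoundry client exposes the same messages.stream helper as the standard Anthropic SDK; client and deploymentName are the variables defined in the sample above:

    # Streaming sketch: print text as it arrives.
    with client.messages.stream(
        model=deploymentName,
        messages=[{"role": "user", "content": "Write a haiku about Paris."}],
        max_tokens=1024,
    ) as stream:
        for text in stream.text_stream:
            print(text, end="", flush=True)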

Available Claude models

Foundry supports Claude Opus 4.6, Claude Opus 4.5, Claude Opus 4.1, Claude Sonnet 4.5, and Claude Haiku 4.5 models through global standard deployment. These models have key capabilities:
  • Extended thinking: Enhanced reasoning for complex tasks (a short sketch follows below).
  • Image and text input: Strong vision for analyzing charts, graphs, technical diagrams, reports, and other visual assets.
  • Code generation: Advanced code generation, analysis, and debugging.
For more details about the model capabilities, see capabilities of Claude models.
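For example, extended thinking is enabled per request by reserving a token budget for the model's internal reasoning. The following is a minimal sketch, assuming your deployment accepts the same thinking parameter as the Anthropic Messages API and reusing the client and deploymentName variables from the earlier samples; the budget value is illustrative only:

    # Extended thinking sketch: the response interleaves thinking blocks and text blocks.
    message = client.messages.create(
        model=deploymentName,
        max_tokens=16000,  # must be larger than the thinking budget
        thinking={"type": "enabled", "budget_tokens": 8000},  # illustrative budget
        messages=[{"role": "user", "content": "How many primes are there below 1,000?"}],
    )

    for block in message.content:
        if block.type == "thinking":
            print("[thinking]", block.thinking[:200])
        elif block.type == "text":
            print(block.text)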

Claude Opus 4.6 (preview)

Claude Opus 4.6 is the latest version of Anthropic’s most intelligent model, and the world’s best model for coding, enterprise agents, and professional work. With a 1M token context window (beta) and 128K max output, Opus 4.6 is ideal for production code, sophisticated agents, office tasks, financial analysis, cybersecurity, and computer use.

Claude Opus 4.5 (preview)

Claude Opus 4.5 is an industry leader in coding, agents, computer use, and enterprise workflows. With a 200K token context window and 64K max output, Opus 4.5 is ideal for production code, sophisticated agents, office tasks, financial analysis, cybersecurity, and computer use tasks.

Claude Opus 4.1 (preview)

Claude Opus 4.1 is an industry leader for coding. It delivers sustained performance on long-running tasks that require focused effort and thousands of steps, significantly expanding what AI agents can solve.

Claude Sonnet 4.5 (preview)

Claude Sonnet 4.5 is a highly capable model designed for building real-world agents and handling complex, long-horizon tasks. It offers a strong balance of speed and cost for high-volume use cases. Sonnet 4.5 also provides advanced accuracy for computer use, enabling developers to direct Claude to use computers the way people do.

Claude Haiku 4.5 (preview)

Claude Haiku 4.5 delivers near-frontier performance for a wide range of use cases. It stands out as one of the best coding and agent models, with the right speed and cost to power free products and scaled subagents.

Advanced features and capabilities of Claude models

Claude in Foundry Models supports advanced features and capabilities. Core capabilities enhance Claude’s fundamental abilities for processing, analyzing, and generating content across various formats and use cases. Tools enable Claude to interact with external systems, execute code, and perform automated tasks through various tool interfaces. Some of the Core capabilities that Foundry supports are:
  • Large context window: An extended context window that processes larger documents and longer conversations.
  • Agent skills: Extend Claude’s capabilities with skills.
  • Citations: Ground Claude’s responses in source documents.
  • Context editing: Automatically manage conversation context with configurable strategies.
  • Extended thinking: Enhanced reasoning capabilities for complex tasks.
  • PDF support: Process and analyze text and visual content from PDF documents.
  • Prompt caching: Provide Claude with more background knowledge and example outputs to reduce costs and latency (see the sketch after these lists).
Some of the Tools that Foundry supports are:
  • MCP connector: Connect to remote MCP servers directly from the Messages API without a separate MCP client.
  • Memory: Store and retrieve information across conversations. Build knowledge bases over time, maintain project context, and learn from past interactions.
  • Web fetch: Retrieve full content from specified web pages and PDF documents for in-depth analysis.
For a full list of supported capabilities and tools, see Claude’s features overview.
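As an illustration of prompt caching, you typically mark a large, reused block of context with a cache breakpoint so repeated requests don't pay the full input-token cost for that prefix. The following is a minimal sketch, assuming your deployment accepts the same cache_control field as the Anthropic Messages API; long_reference_document is a placeholder for your own reused context, and client and deploymentName come from the earlier samples:

    # Prompt caching sketch: cache a large system prompt shared across requests.
    long_reference_document = "..."  # placeholder: a large document reused across requests

    message = client.messages.create(
        model=deploymentName,
        max_tokens=1024,
        system=[
            {
                "type": "text",
                "text": long_reference_document,
                "cache_control": {"type": "ephemeral"},  # mark this prefix as cacheable
            }
        ],
        messages=[{"role": "user", "content": "Summarize the key points of the document."}],
    )
    print(message.content)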

Agent support

API quotas and limits

Currently, only Enterprise and MCA-E subscriptions are eligible for Claude model usage in Foundry.
Claude models in Foundry have the following rate limits, measured in Tokens Per Minute (TPM) and Requests Per Minute (RPM):
Model             | Deployment type | Default RPM | Default TPM | Enterprise and MCA-E RPM | Enterprise and MCA-E TPM
claude-opus-4-6   | Global Standard | 0           | 0           | 2,000                    | 2,000,000
claude-opus-4-5   | Global Standard | 0           | 0           | 2,000                    | 2,000,000
claude-opus-4-1   | Global Standard | 0           | 0           | 2,000                    | 2,000,000
claude-sonnet-4-5 | Global Standard | 0           | 0           | 4,000                    | 2,000,000
claude-haiku-4-5  | Global Standard | 0           | 0           | 4,000                    | 4,000,000
To increase your quota beyond the default limits, submit a request through the quota increase request form.

Rate-limit best practices

To optimize your usage and avoid rate limiting:
  • Implement retry logic: Handle 429 responses with exponential backoff (see the sketch after this list).
  • Batch requests: Combine multiple prompts when possible.
  • Monitor usage: Track your token consumption and request patterns.
  • Use appropriate models: Choose the right Claude model for your use case.
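For the retry guidance above, the following is a minimal sketch of exponential backoff with jitter around a Messages API call. It uses the RateLimitError exception from the Anthropic SDK and reuses the client and deploymentName variables from the earlier samples:

    import random
    import time

    import anthropic

    def create_with_retry(client, max_attempts=5, **kwargs):
        # Retry on 429 responses, doubling the delay each attempt and adding jitter.
        for attempt in range(max_attempts):
            try:
                return client.messages.create(**kwargs)
            except anthropic.RateLimitError:
                if attempt == max_attempts - 1:
                    raise
                time.sleep((2 ** attempt) + random.random())

    message = create_with_retry(
        client,
        model=deploymentName,
        messages=[{"role": "user", "content": "What is the capital of France?"}],
        max_tokens=1024,
    )

The SDK also performs a small number of automatic retries by default; you can tune this with the client's max_retries setting if needed.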

Responsible AI considerations

When using Claude models in Foundry, consider these responsible AI practices:

Best practices

Follow these best practices when working with Claude models in Foundry:

Model selection

Choose the appropriate Claude model based on your specific requirements:
  • Claude Opus 4.6: Most intelligent model for building agents, coding, and enterprise workflows.
  • Claude Opus 4.5: Best performance across coding, agents, computer use, and enterprise workflows.
  • Claude Opus 4.1: Complex reasoning and enterprise applications.
  • Claude Sonnet 4.5: Balanced performance and capabilities, production workflows.
  • Claude Haiku 4.5: Speed and cost optimization, high-volume processing.

Prompt engineering

  • Clear instructions: Provide specific and detailed prompts.
  • Context management: Use the available context window effectively.
  • Role definitions: Use the system prompt to define the assistant’s role and behavior (see the sketch after this list).
  • Structured prompts: Use consistent formatting for better results.
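For example, role definitions are passed through the top-level system parameter of the Messages API rather than as a message. A minimal sketch, reusing the client and deploymentName variables from the earlier samples:

    # System prompt sketch: define the assistant's role and output constraints once.
    message = client.messages.create(
        model=deploymentName,
        max_tokens=1024,
        system="You are a concise technical assistant. Answer in at most three sentences.",
        messages=[{"role": "user", "content": "Explain what a context window is."}],
    )
    print(message.content)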

Cost optimization

  • Token management: Monitor and optimize token usage (see the sketch after this list).
  • Model selection: Use the most cost-effective model for your use case.
  • Caching: Implement explicit prompt caching where appropriate.
  • Request batching: Combine multiple requests when possible.
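For token management, every Messages API response includes a usage object you can log and aggregate. A minimal sketch, reusing the client and deploymentName variables from the earlier samples:

    # Token monitoring sketch: read per-request usage from the response.
    message = client.messages.create(
        model=deploymentName,
        messages=[{"role": "user", "content": "Summarize the benefits of prompt caching."}],
        max_tokens=512,
    )
    print(f"input tokens: {message.usage.input_tokens}, output tokens: {message.usage.output_tokens}")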

Troubleshooting

The following table lists common errors when you work with Claude models in Foundry and their solutions:
Error                          | Cause                                                           | Solution
401 Unauthorized               | Invalid or expired API key, or incorrect Entra ID token scope. | Verify your API key is correct. For Entra ID, confirm you use scope https://cognitiveservices.azure.com/.default.
403 Forbidden                  | Insufficient permissions on the resource or subscription.      | Verify you have Contributor or Owner role on the resource group. For Entra ID, ensure the Cognitive Services User role is assigned.
404 Not Found                  | Incorrect endpoint URL or deployment name.                      | Confirm your base URL follows the pattern https://<resource-name>.services.ai.azure.com/anthropic and the deployment name matches your configuration.
429 Too Many Requests          | Rate limit exceeded for your subscription tier.                 | Implement exponential backoff with retry logic. Consider reducing request frequency or requesting a quota increase.
Subscription eligibility error | Non-Enterprise or non-MCA-E subscription.                       | Claude models require an Enterprise or MCA-E subscription. See API quotas and limits for details.
Region not available           | Deployment attempted in an unsupported region.                  | Deploy to East US2 or Sweden Central, the supported regions for Claude models.
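Many of these errors surface as typed exceptions in the Anthropic SDK, so you can branch on them directly. A minimal sketch that maps the common cases in the table above, reusing the client and deploymentName variables from the earlier samples:

    import anthropic

    try:
        message = client.messages.create(
            model=deploymentName,
            messages=[{"role": "user", "content": "What is the capital of France?"}],
            max_tokens=1024,
        )
    except anthropic.AuthenticationError:
        print("401: check your API key or Entra ID token scope.")
    except anthropic.PermissionDeniedError:
        print("403: check role assignments on the resource.")
    except anthropic.NotFoundError:
        print("404: check the base URL and deployment name.")
    except anthropic.RateLimitError:
        print("429: slow down or request a quota increase.")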