Foundry Models from partners and community
This article refers to the Microsoft Foundry (new) portal.
For a list of models sold directly by Azure, see Foundry Models sold directly by Azure.For a list of Azure OpenAI models that are supported by the Foundry Agent Service, see Models supported by Agent Service.
Anthropic
Anthropic’s flagship product is Claude, a frontier AI model trusted by leading enterprises and millions of users worldwide for complex tasks including coding, agents, financial analysis, research, and office tasks. Claude delivers exceptional performance while maintaining high safety standards. To work with Claude models in Foundry, see Deploy and use Claude models in Microsoft Foundry.To use Claude models in Microsoft Foundry, you need a paid Azure subscription with a billing account in a country or region where Anthropic offers the models for purchase. The following paid subscription types are currently restricted: Cloud Solution Providers (CSP), sponsored accounts with Azure credits, enterprise accounts in Singapore and South Korea, and Microsoft accounts.For a list of common subscription-related errors, see Common error messages and solutions.
| Model | Type | Capabilities |
|---|---|---|
claude-opus-4-6 (Preview) | Messages | - Input: text, image, and code - Output: text, image, and code (128,000 max tokens) - Context window: 1,000,000,000 (beta) - Languages: en, fr, ar, zh, ja, ko, es, hi - Tool calling: Yes (file search and code execution) - Response formats: Text in various formats (e.g., prose, lists, Markdown tables, JSON, HTML, code in various programming languages) |
claude-opus-4-5 (Preview) | Messages | - Input: text, image, and code - Output: text (64,000 max tokens) - Context window: 200,000 - Languages: en, fr, ar, zh, ja, ko, es, hi - Tool calling: Yes (file search and code execution) - Response formats: Text in various formats (e.g., prose, lists, Markdown tables, JSON, HTML, code in various programming languages) |
claude-opus-4-1 (Preview) | Messages | - Input: text, image, and code - Output: text (32,000 max tokens) - Context window: 200,000 - Languages: en, fr, ar, zh, ja, ko, es, hi - Tool calling: Yes (file search and code execution) - Response formats: Text in various formats (e.g., prose, lists, Markdown tables, JSON, HTML, code in various programming languages) |
claude-sonnet-4-5 (Preview) | Messages | - Input: text, image, and code - Output: text (64,000 max tokens) - Context window: 200,000 - Languages: en, fr, ar, zh, ja, ko, es, hi - Tool calling: Yes (file search and code execution) - Response formats: Text in various formats (e.g., prose, lists, Markdown tables, JSON, HTML, code in various programming languages) |
claude-haiku-4-5 (Preview) | Messages | - Input: text and image - Output: text (64,000 max tokens) - Context window: 200,000 - Languages: en, fr, ar, zh, ja, ko, es, hi - Tool calling: Yes (file search and code execution) - Response formats: Text in various formats (e.g., prose, lists, Markdown tables, JSON, HTML, code in various programming languages) |
Cohere
The Cohere family of models includes various models optimized for different use cases, including chat completions and embeddings. Cohere models are optimized for various use cases that include reasoning, summarization, and question answering. To deploy Cohere models in Foundry, see Deploy Microsoft Foundry Models in the Foundry portal.| Model | Type | Capabilities |
|---|---|---|
Cohere-command-r-plus-08-2024 | chat-completion | - Input: text (131,072 tokens) - Output: text (4,096 tokens) - Languages: en, fr, es, it, de, pt-br, ja, ko, zh-cn, and ar - Tool calling: Yes - Response formats: Text, JSON |
Cohere-command-r-08-2024 | chat-completion | - Input: text (131,072 tokens) - Output: text (4,096 tokens) - Languages: en, fr, es, it, de, pt-br, ja, ko, zh-cn, and ar - Tool calling: Yes - Response formats: Text, JSON |
Cohere-embed-v3-english | embeddings | - Input: text and images (512 tokens) - Output: Vector (1024 dim.) - Languages: en |
Cohere-embed-v3-multilingual | embeddings | - Input: text (512 tokens) - Output: Vector (1024 dim.) - Languages: en, fr, es, it, de, pt-br, ja, ko, zh-cn, and ar |
Meta
Meta Llama models and tools are a collection of pretrained and fine-tuned generative AI text and image reasoning models. Meta models range in scale to include:- Small language models (SLMs) like 1B and 3B Base and Instruct models for on-device and edge inferencing
- Mid-size large language models (LLMs) like 7B, 8B, and 70B Base and Instruct models
- High-performance models like Meta Llama 3.1-405B Instruct for synthetic data generation and distillation use cases.
| Model | Type | Capabilities |
|---|---|---|
Llama-3.2-11B-Vision-Instruct | chat-completion | - Input: text and image (128,000 tokens) - Output: text (8,192 tokens) - Languages: en - Tool calling: No - Response formats: Text |
Llama-3.2-90B-Vision-Instruct | chat-completion | - Input: text and image (128,000 tokens) - Output: text (8,192 tokens) - Languages: en - Tool calling: No - Response formats: Text |
Meta-Llama-3.1-405B-Instruct | chat-completion | - Input: text (131,072 tokens) - Output: text (8,192 tokens) - Languages: en, de, fr, it, pt, hi, es, and th - Tool calling: No - Response formats: Text |
Meta-Llama-3.1-8B-Instruct | chat-completion | - Input: text (131,072 tokens) - Output: text (8,192 tokens) - Languages: en, de, fr, it, pt, hi, es, and th - Tool calling: No - Response formats: Text |
Llama-4-Scout-17B-16E-Instruct | chat-completion | - Input: text and image (128,000 tokens) - Output: text (8,192 tokens) - Languages: en - Tool calling: No - Response formats: Text |
Microsoft
Microsoft models include various model groups such as MAI models, Phi models, healthcare AI models, and more. To deploy Microsoft models in Foundry, see Deploy Microsoft Foundry Models in the Foundry portal.| Model | Type | Capabilities |
|---|---|---|
Phi-4-mini-instruct | chat-completion | - Input: text (131,072 tokens) - Output: text (4,096 tokens) - Languages: ar, zh, cs, da, nl, en, fi, fr, de, he, hu, it, ja, ko, no, pl, pt, ru, es, sv, th, tr, and uk - Tool calling: No - Response formats: Text |
Phi-4-multimodal-instruct | chat-completion | - Input: text, images, and audio (131,072 tokens) - Output: text (4,096 tokens) - Languages: ar, zh, cs, da, nl, en, fi, fr, de, he, hu, it, ja, ko, no, pl, pt, ru, es, sv, th, tr, and uk - Tool calling: No - Response formats: Text |
Phi-4 | chat-completion | - Input: text (16,384 tokens) - Output: text (16,384 tokens) - Languages: en, ar, bn, cs, da, de, el, es, fa, fi, fr, gu, ha, he, hi, hu, id, it, ja, jv, kn, ko, ml, mr, nl, no, or, pa, pl, ps, pt, ro, ru, sv, sw, ta, te, th, tl, tr, uk, ur, vi, yo, and zh - Tool calling: No - Response formats: Text |
Phi-4-reasoning | chat-completion with reasoning content | - Input: text (32,768 tokens) - Output: text (32,768 tokens) - Languages: en - Tool calling: No - Response formats: Text |
Phi-4-mini-reasoning | chat-completion with reasoning content | - Input: text (128,000 tokens) - Output: text (128,000 tokens) - Languages: en - Tool calling: No - Response formats: Text |
Mistral AI
Mistral AI offers models for code generation, general-purpose chat, and multimodal tasks, including Codestral, Ministral, Mistral Small, and Mistral Medium. To deploy Mistral AI models in Foundry, see Deploy Microsoft Foundry Models in the Foundry portal.| Model | Type | Capabilities |
|---|---|---|
Codestral-2501 | chat-completion | - Input: text (262,144 tokens) - Output: text (4,096 tokens) - Languages: en - Tool calling: No - Response formats: Text |
Ministral-3B | chat-completion | - Input: text (131,072 tokens) - Output: text (4,096 tokens) - Languages: fr, de, es, it, and en - Tool calling: Yes - Response formats: Text, JSON |
Mistral-small-2503 | chat-completion | - Input: text (32,768 tokens) - Output: text (4,096 tokens) - Languages: fr, de, es, it, and en - Tool calling: Yes - Response formats: Text, JSON |
Mistral-medium-2505 | chat-completion | - Input: text (128,000 tokens), image - Output: text (128,000 tokens) - Tool calling: No - Response formats: Text, JSON |
Stability AI
The Stability AI collection of image generation models includes Stable Image Core, Stable Image Ultra, and Stable Diffusion 3.5 Large. Stable Diffusion 3.5 Large accepts both image and text input. To deploy Stability AI models in Foundry, see Deploy Microsoft Foundry Models in the Foundry portal.| Model | Type | Capabilities |
|---|---|---|
Stable Diffusion 3.5 Large | Image generation | - Input: text and image (1,000 tokens and 1 image) - Output: One Image - Tool calling: No - Response formats: Image (PNG and JPG) |
Stable Image Core | Image generation | - Input: text (1,000 tokens) - Output: One Image - Tool calling: No - Response formats: Image (PNG and JPG) |
Stable Image Ultra | Image generation | - Input: text (1,000 tokens) - Output: One Image - Tool calling: No - Response formats: Image (PNG and JPG) |