Azure AI Inference beta SDK is deprecated and will be retired on August 26, 2026. Switch to the generally available OpenAI/v1 API with a stable OpenAI SDK. Follow the migration guide to switch to OpenAI/v1, using the SDK for your preferred programming language.
In this article, you learn how to add a new model deployment to a Foundry Models endpoint. The deployment is available for inference in your Foundry resource when you specify the deployment name in your requests.

Prerequisites

To complete this article, you need the following:

Troubleshooting

| Error | Cause | Resolution |
| --- | --- | --- |
| Quota exceeded | Your subscription reached the deployment quota for the selected SKU or region. | Check your quota in the Foundry portal or request an increase through Azure support. |
| Authorization failed | The identity used doesn’t have the required RBAC role. | Assign the Cognitive Services Contributor role on the Foundry resource. |
| Model not available | The model isn’t available in your region or subscription. | Run az cognitiveservices account list-models to check available models and regions. |
| Extension not found | The cognitiveservices CLI extension isn’t installed. | Run az extension add -n cognitiveservices to install the extension. |
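
The two CLI resolutions above can be combined into one short check. The following is a minimal sketch, not an official script: the resource name and resource group are hypothetical placeholders, and it assumes you have already signed in with az login. The list-models command needs the resource name and resource group to identify which Foundry resource to query.

```shell
#!/bin/sh
# Hypothetical placeholders: replace with your own Foundry resource and resource group.
RESOURCE_NAME="${RESOURCE_NAME:-my-foundry-resource}"
RESOURCE_GROUP="${RESOURCE_GROUP:-my-resource-group}"

# Resolution for "Extension not found": install the cognitiveservices extension.
ensure_extension() {
  az extension add -n cognitiveservices
}

# Resolution for "Model not available": list the models the resource can deploy.
list_models() {
  az cognitiveservices account list-models \
    --name "$RESOURCE_NAME" \
    --resource-group "$RESOURCE_GROUP" \
    --output table
}

# Only call the Azure CLI if it is actually installed.
if command -v az >/dev/null 2>&1; then
  ensure_extension && list_models
else
  echo "Azure CLI (az) not found on PATH; install it first." >&2
fi
```

If the model you want doesn't appear in the output, it likely isn't offered for that resource's region or subscription, which matches the "Model not available" row.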