2023-05-01 for resource management related activities. This API version is only for managing your resources, and doesn’t impact the API version used for inferencing calls like completions, chat completions, embedding, image generation, etc.
Prerequisites
Before you create deployments programmatically, complete the following:
- An Azure subscription. Create one for free.
- An existing Azure OpenAI resource. To create one, see Create a resource and deploy a model with Azure OpenAI.
- Quota available in the target region for the model you want to deploy. To check or request quota, see Manage Azure OpenAI in Microsoft Foundry Models quota.
- Permissions to create deployments on the resource. The Cognitive Services Contributor role at the resource scope provides the required access. For details, see Role-based access control for Azure OpenAI.
- The model name and version that you want to deploy. For supported models, see Azure OpenAI models.
Create a deployment and query usage
Select the tab for the tool or template language you want to use. Each tab includes a deployment example that sets a TPM-based capacity, followed by a usage query that returns your remaining quota in the specified region.
- REST
- Azure CLI
- Azure PowerShell
- Azure Resource Manager
- Bicep
- Terraform
Deployment
| Parameter | Type | Required? | Description |
|---|---|---|---|
| accountName | string | Required | The name of your Azure OpenAI Resource. |
| deploymentName | string | Required | The deployment name you chose when you deployed an existing model or the name you would like a new model deployment to have. |
| resourceGroupName | string | Required | The name of the associated resource group for this model deployment. |
| subscriptionId | string | Required | Subscription ID for the associated subscription. |
| api-version | string | Required | The API version to use for this operation. This follows the YYYY-MM-DD format. |
Supported version: 2023-05-01 (Swagger spec)
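As a minimal sketch of how the path parameters above fit together, the following Python builds the deployment URL. It assumes the public Azure Resource Manager endpoint (`https://management.azure.com`) and the `Microsoft.CognitiveServices` resource provider path. The function name `deployment_url` and all argument values are illustrative placeholders, so check the result against the Swagger spec before relying on it.

```python
from urllib.parse import quote, urlencode

# Assumption: public Azure cloud. Sovereign clouds use a different ARM endpoint.
ARM_ENDPOINT = "https://management.azure.com"


def deployment_url(subscription_id, resource_group_name, account_name,
                   deployment_name, api_version="2023-05-01"):
    """Build the management URL for one deployment (hypothetical helper)."""
    path = "/".join([
        "subscriptions", quote(subscription_id, safe=""),
        "resourceGroups", quote(resource_group_name, safe=""),
        "providers", "Microsoft.CognitiveServices",
        "accounts", quote(account_name, safe=""),
        "deployments", quote(deployment_name, safe=""),
    ])
    query = urlencode({"api-version": api_version})
    return f"{ARM_ENDPOINT}/{path}?{query}"


# All four values below are placeholders, not real resources.
url = deployment_url(
    "00000000-0000-0000-0000-000000000000",
    "my-resource-group",
    "my-openai-account",
    "my-deployment",
)
print(url)
```

Percent-encoding each segment keeps an unexpected character in a name from changing the shape of the path.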
| Parameter | Type | Description |
|---|---|---|
| sku | Sku | The resource model definition representing SKU. |
| capacity | integer | This represents the amount of quota you’re assigning to this deployment. A value of 1 equals 1,000 Tokens per Minute (TPM). A value of 10 equals 10k Tokens per Minute (TPM). |
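To make the capacity arithmetic concrete, here is a hedged sketch that converts a capacity value to TPM and assembles a request body. The nesting of `capacity` under `sku`, the `properties.model` block, the `Standard` SKU name, and the model name and version are assumptions based on the commonly documented request shape. The helper names are hypothetical, and the Swagger spec remains the authority.

```python
import json

# From the table above: a capacity of 1 equals 1,000 tokens per minute.
TOKENS_PER_CAPACITY_UNIT = 1_000


def capacity_to_tpm(capacity):
    """Convert a deployment capacity value to tokens per minute."""
    return capacity * TOKENS_PER_CAPACITY_UNIT


def build_deployment_body(capacity, model_name, model_version, sku_name="Standard"):
    """Assemble a deployment request body (assumed shape, hypothetical helper)."""
    if capacity < 1:
        raise ValueError("capacity must be a positive integer")
    return {
        "sku": {"name": sku_name, "capacity": capacity},
        "properties": {
            "model": {
                "format": "OpenAI",
                "name": model_name,
                "version": model_version,
            }
        },
    }


# Placeholder model name and version; substitute the model you want to deploy.
body = build_deployment_body(10, "gpt-35-turbo", "0613")
print(capacity_to_tpm(body["sku"]["capacity"]), "TPM")
print(json.dumps(body, indent=2))
```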
Example request
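One way such a request could be assembled is sketched below using only the Python standard library. The URL, body values, and the `AZURE_ACCESS_TOKEN` environment variable are hypothetical placeholders, and the `PUT` method with a bearer token reflects the usual Azure Resource Manager convention rather than anything this page confirms. The sketch builds the request without sending it.

```python
import json
import os
import urllib.request


def build_put_request(url, body, token):
    """Build (but do not send) an authenticated PUT request."""
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder URL and body; replace every value with your own.
example_url = (
    "https://management.azure.com/subscriptions/00000000-0000-0000-0000-000000000000"
    "/resourceGroups/my-resource-group/providers/Microsoft.CognitiveServices"
    "/accounts/my-openai-account/deployments/my-deployment?api-version=2023-05-01"
)
example_body = {
    "sku": {"name": "Standard", "capacity": 10},
    "properties": {
        "model": {"format": "OpenAI", "name": "gpt-35-turbo", "version": "0613"}
    },
}

# Hypothetical environment variable holding a temporary access token.
token = os.environ.get("AZURE_ACCESS_TOKEN", "<your-token>")
request = build_put_request(example_url, example_body, token)
print(request.get_method(), request.full_url)

# To actually send the request, you could call:
#   with urllib.request.urlopen(request) as response:
#       print(response.status, response.read().decode("utf-8"))
```

Keeping the token out of the source file and reading it at run time avoids committing a credential by accident.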
There are multiple ways to generate an authorization token. The easiest method for initial testing is to launch the Cloud Shell from the Azure portal. Then run `az account get-access-token`. You can use this token as your temporary authorization token for API testing.
Usage
To query your quota usage in a given region for a specific subscription:
| Parameter | Type | Required? | Description |
|---|---|---|---|
| subscriptionId | string | Required | Subscription ID for the associated subscription. |
| location | string | Required | The location to view usage for, for example eastus. |
| api-version | string | Required | The API version to use for this operation. This follows the YYYY-MM-DD format. |
Supported version: 2023-05-01 (Swagger spec)
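The usage query can be sketched in the same way. The URL pattern below assumes the public Azure Resource Manager endpoint and the `Microsoft.CognitiveServices` `usages` path. The response field names (`value`, `name.value`, `currentValue`, `limit`) are assumptions to verify against the Swagger spec, and the sample payload is invented purely for illustration.

```python
from urllib.parse import quote, urlencode


def usage_url(subscription_id, location, api_version="2023-05-01"):
    """Build the regional usage query URL (hypothetical helper)."""
    return (
        "https://management.azure.com"
        f"/subscriptions/{quote(subscription_id, safe='')}"
        "/providers/Microsoft.CognitiveServices"
        f"/locations/{quote(location, safe='')}/usages?"
        + urlencode({"api-version": api_version})
    )


def remaining_quota(usage_response):
    """Map each quota name to limit minus current usage (assumed field names)."""
    remaining = {}
    for item in usage_response.get("value", []):
        remaining[item["name"]["value"]] = item["limit"] - item["currentValue"]
    return remaining


# Invented sample payload; a real response may carry more fields.
sample_response = {
    "value": [
        {
            "name": {"value": "OpenAI.Standard.gpt-35-turbo"},
            "currentValue": 120,
            "limit": 240,
            "unit": "Count",
        }
    ]
}

print(usage_url("00000000-0000-0000-0000-000000000000", "eastus"))
print(remaining_quota(sample_response))
```

If the quota values use the same units as deployment capacity, the difference computed here is the capacity still available to assign in that region.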