Understand deployment types in Microsoft Foundry Models in Azure Government

When you deploy a model in Microsoft Foundry in Azure Government, you choose a deployment type that determines:

Where your data is processed (data zone or single region)
How you pay (pay-per-token or reserved capacity)
Performance characteristics (latency variance, throughput limits)

The service offers two main categories: standard (pay-per-token) and provisioned (reserved capacity). Within each category, you can choose data zone or single-region processing based on your requirements.

Screenshot of the Foundry portal deployment dialog showing the deployment type selection box with Global Standard selected.

Data residency for all deployment types: Data stored at rest remains in the designated Azure region. However, inferencing data is processed as follows:

USGov DataZone types: Processed only within the Azure Government cloud USGov data zone
Standard/Regional types: Processed in the deployment region

Deployment type comparison

Deployment type	SKU code	Data processing	Billing	Best for
Data Zone Standard	`DataZoneStandard`	Within data zone	Pay-per-token	USGov data zone compliance
Data Zone Provisioned	`DataZoneProvisionedManaged`	Within data zone	Reserved PTU	USGov data zone + predictable throughput
Standard	`Standard`	Single region	Pay-per-token	Regional compliance, low volume
Regional Provisioned	`ProvisionedManaged`	Single region	Reserved PTU	Regional compliance + throughput
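
The SKU codes in the table are the values you pass as the deployment's SKU name when you create a deployment in code. The following is a minimal sketch that maps each deployment type to its SKU code and builds a request body. It assumes an ARM-style body shape (`sku` plus `properties.model`); the helper `build_deployment_body` and the model name, version, and capacity values are hypothetical placeholders, so verify the shape against the API version you use.

```python
import json

# SKU codes copied from the comparison table above.
SKU_CODES = {
    "Data Zone Standard": "DataZoneStandard",
    "Data Zone Provisioned": "DataZoneProvisionedManaged",
    "Standard": "Standard",
    "Regional Provisioned": "ProvisionedManaged",
}


def build_deployment_body(deployment_type, model_name, model_version, capacity):
    """Return a deployment request body as a plain dict.

    Assumption: the body follows an ARM-style shape with a "sku" object and
    a "properties.model" object. This helper is illustrative, not an SDK call.
    """
    if deployment_type not in SKU_CODES:
        raise ValueError(f"Unknown deployment type: {deployment_type!r}")
    return {
        "sku": {"name": SKU_CODES[deployment_type], "capacity": capacity},
        "properties": {
            "model": {
                "format": "OpenAI",
                "name": model_name,        # placeholder value
                "version": model_version,  # placeholder value
            }
        },
    }


body = build_deployment_body(
    "Data Zone Standard", "example-model", "example-version", 10
)
print(json.dumps(body, indent=2))
```

Keeping the mapping in one dictionary means a typo in a deployment type name fails fast with a `ValueError` instead of producing a request with an invalid SKU.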

Not all models support all deployment types. Check Foundry Models sold by Azure for model availability by deployment type and region.

SLA guarantees vary by deployment type. Provisioned types provide guaranteed throughput and lower latency variance. Standard types offer best-effort service. For details, see the Azure SLA for Azure OpenAI Service.

For detailed pricing, see Azure OpenAI Service pricing.

Choose the right deployment type

Use the following criteria to select a deployment type:

By data residency requirement

USGov data zone: Use Data Zone Standard or Data Zone Provisioned in an Azure Government region
Single region only: Use Standard or Regional Provisioned

By workload pattern

Variable, bursty traffic: Use Standard or Data Zone Standard (pay-per-token)
Consistent high volume: Use Provisioned types (reserved capacity)

By latency requirement

Low latency variance required: Use Provisioned types
Latency variance acceptable: Use Standard types
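
The three criteria above can be combined into a single decision. The following sketch is one possible way to encode them, not an official selection algorithm. The function `choose_sku` and its parameter names are hypothetical, and it assumes that either a consistent high-volume workload or a low latency variance requirement is enough to justify a provisioned type.

```python
def choose_sku(single_region_only: bool,
               consistent_high_volume: bool,
               low_latency_variance: bool) -> str:
    """Return the SKU code suggested by the selection criteria.

    Assumption: a provisioned type is chosen when the workload is
    consistently high volume OR needs low latency variance.
    """
    provisioned = consistent_high_volume or low_latency_variance
    if single_region_only:
        return "ProvisionedManaged" if provisioned else "Standard"
    return "DataZoneProvisionedManaged" if provisioned else "DataZoneStandard"


# Bursty traffic that may be processed anywhere in the USGov data zone.
print(choose_sku(single_region_only=False,
                 consistent_high_volume=False,
                 low_latency_variance=False))
```

Model availability still applies: confirm that the model you need supports the returned SKU in your region before you deploy.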

Data Zone deployments

For Data Zone deployment types, prompts and responses are processed only within the specified data zone:

USGov: Data is processed within the two Azure Government regions (USGovArizona and USGovVirginia)

Learn more in the “Model region availability by deployment type” section of Foundry Models sold by Azure.

With Data Zone Standard deployment types, if the primary region experiences an interruption in service, all traffic initially routed to this region is affected. To learn more, see the high availability and disaster recovery guide.

Data Zone Standard

SKU name in code: DataZoneStandard

Data Zone Standard deployments dynamically route traffic to datacenters within the Microsoft-defined data zone (USGov). This deployment type provides higher default quotas than geography-based deployment types while keeping data within the specified zone. Customers with high consistent volume might experience greater latency variability. The volume threshold at which this occurs is set per model. To learn more about Azure OpenAI quotas in Azure Government, see Quotas and limits in Azure OpenAI. For workloads that require low latency variance at large volume, consider provisioned deployment types.

Data Zone Provisioned

SKU name in code: DataZoneProvisionedManaged

Data Zone Provisioned deployments dynamically route traffic within the Microsoft-specified data zone (USGov) while providing reserved model processing capacity. This deployment type combines data zone compliance with high and predictable throughput.

Standard

SKU name in code: Standard

Standard deployments use pay-per-token billing, so you pay only for what you consume. Model availability varies by region, and throughput might be limited. Standard deployments are suited to low-to-medium volume workloads with high burstiness. Customers with high consistent volume might experience greater latency variability.

Regional Provisioned

SKU name in code: ProvisionedManaged

Regional Provisioned deployments let you specify the amount of throughput you require in a deployment. The service then allocates the necessary model processing capacity and reserves it for you. Throughput is defined in terms of provisioned throughput units (PTUs), which are a normalized way of representing the throughput for your deployment. Each model-version pair requires a different number of PTUs to deploy and provides a different amount of throughput per PTU. Minimum PTU requirements vary by model. For current minimums and available capacity, see Provisioned throughput concepts.

Troubleshooting deployment issues

Common issues when creating or using deployments:

Issue	Cause	Resolution
Deployment type unavailable	Model doesn’t support the selected type	Check model availability by deployment type
Quota exceeded	Subscription limit reached for tokens per minute	Request a quota increase at Azure Government AOAI Quota or use a different region
Region unavailable	Model not deployed in selected region	Select a region from the model’s availability list
Provisioned capacity unavailable	No PTU capacity in region	Try a different region or use Data Zone Provisioned for broader availability
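
The resolution for unavailable provisioned capacity can be expressed as an ordered list of alternatives to try. The following sketch is illustrative only: the function `provisioned_fallbacks` is hypothetical, the region strings are the names used in this article and might differ from the identifiers your tooling expects, and the ordering (other region first, then Data Zone Provisioned) is an assumption.

```python
# Region names as written in this article; your tooling might expect
# different identifiers (assumption).
USGOV_REGIONS = ["USGovArizona", "USGovVirginia"]


def provisioned_fallbacks(region, sku="ProvisionedManaged"):
    """Return (region, sku) alternatives to try when PTU capacity is unavailable.

    First suggests the same SKU in the other region, then a Data Zone
    Provisioned deployment in the original region.
    """
    options = [(other, sku) for other in USGOV_REGIONS if other != region]
    if sku == "ProvisionedManaged":
        options.append((region, "DataZoneProvisionedManaged"))
    return options


for candidate in provisioned_fallbacks("USGovArizona"):
    print(candidate)
```

Moving to a different region or to a data zone type changes where inferencing data is processed, so confirm that each alternative still meets your data residency requirement.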

For Azure OpenAI quota limits by deployment type in Azure Government, see Quotas and limits in Azure OpenAI.

Abuse Monitoring in Azure Government

Not all features of Abuse Monitoring are enabled for Azure OpenAI deployments in Azure Government. You are responsible for implementing reasonable technical and operational measures to detect and mitigate any use of the service in violation of the Product Terms. Automated Content Classification and Filtering remains enabled by default for Azure Government. If modified content filters are required, apply at Azure Government Modified Filter Application.

