How to use global batch processing with Azure OpenAI in Microsoft Foundry Models

The Azure OpenAI Batch API efficiently handles large-scale and high-volume processing tasks. It processes asynchronous groups of requests with separate quota and offers a 24-hour target turnaround at 50% less cost than global standard. With batch processing, you send a large number of requests in a single file instead of sending one request at a time. Global batch requests have a separate enqueued token quota, so your online workloads aren’t disrupted. Key use cases include:

Large-Scale Data Processing: Quickly analyze extensive datasets in parallel.
Content Generation: Create large volumes of text, such as product descriptions or articles.
Document Review and Summarization: Automate the review and summarization of lengthy documents.
Customer Support Automation: Handle numerous queries simultaneously for faster responses.
Data Extraction and Analysis: Extract and analyze information from vast amounts of unstructured data.
Natural Language Processing (NLP) Tasks: Perform tasks like sentiment analysis or translation on large datasets.
Marketing and Personalization: Generate personalized content and recommendations at scale.

If your batch jobs are so large that you hit the enqueued token limit even after maxing out the quota for your deployment, certain regions now support a new feature that allows you to queue multiple batch jobs with exponential backoff.Once your enqueued token quota is available, the next batch job can be created and kicked off automatically. To learn more, see automating retries of large batch jobs with exponential backoff.

The service aims to process batch requests within 24 hours, but it doesn’t expire jobs that take longer. You can cancel the job anytime. When you cancel the job, the service cancels any remaining work and returns any already completed work. You pay for any completed work.Data stored at rest remains in the designated Azure geography, while data might be processed for inferencing in any Azure OpenAI location. Learn more about data residency.

Batch support

Global Batch
Data Zone Batch

Global batch model availability

Region	gpt-5.1, 2025-11-13	gpt-5, 2025-08-07	o3, 2025-04-16	o4-mini, 2025-04-16	gpt-4.1, 2025-04-14	gpt-4.1-nano, 2025-04-14	gpt-4.1-mini, 2025-04-14	o3-mini, 2025-01-31	gpt-4o, 2024-05-13	gpt-4o, 2024-08-06	gpt-4o, 2024-11-20	gpt-4o-mini, 2024-07-18
australiaeast	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅
brazilsouth	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅
canadaeast	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅
centralus	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅
eastus	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅
eastus2	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅
francecentral	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅
germanywestcentral	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅
japaneast	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅
koreacentral	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅
northcentralus	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅
norwayeast	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅
polandcentral	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅
southafricanorth	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅
southcentralus	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅
southindia	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅
swedencentral	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅
switzerlandnorth	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅
uksouth	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅
westeurope	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅
westus	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅
westus3	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅

To access gpt-5 and o3, you need to register. For more information, see the reasoning models guide.

Data zone batch model availability

Region	gpt-5.1, 2025-11-13	gpt-5, 2025-08-07	o3, 2025-04-16	o4-mini, 2025-04-16	gpt-4.1, 2025-04-14	gpt-4.1-nano, 2025-04-14	gpt-4.1-mini, 2025-04-14	o3-mini, 2025-01-31	gpt-4o, 2024-08-06	gpt-4o, 2024-11-20	gpt-4o-mini, 2024-07-18
centralus	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅
eastus	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅
eastus2	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅
francecentral	-	-	✅	✅	✅	✅	✅	-	✅	✅	✅
germanywestcentral	-	-	✅	✅	✅	✅	✅	-	✅	✅	✅
northcentralus	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅
polandcentral	-	-	✅	✅	✅	✅	✅	-	✅	✅	✅
southcentralus	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅
swedencentral	-	-	✅	✅	✅	✅	✅	-	✅	✅	✅
westeurope	-	-	✅	✅	✅	✅	✅	-	✅	✅	✅
westus	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅
westus3	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅	✅

To access gpt-5 and o3, you need to register. For more information, see the reasoning models guide.

While Global Batch supports older API versions, some models require newer API versions. For example, o3-mini isn’t supported with 2024-10-21 since it was released after this date. To access newer models with Global Batch, use the v1 API.

Feature support

The following features aren’t currently supported:

Integration with the Assistants API.
Integration with Azure OpenAI On Your Data feature.

Batch deployment

In the Microsoft Foundry portal, the batch deployment types appear as Global-Batch and Data Zone Batch. To learn more about Azure OpenAI deployment types, see the deployment types guide.

We recommend enabling dynamic quota for all global batch model deployments to help avoid job failures due to insufficient enqueued token quota. Using dynamic quota allows your deployment to opportunistically take advantage of more quota when extra capacity is available. When dynamic quota is set to off, your deployment will only be able to process requests up to the enqueued token limit that was defined when you created the deployment.

Batch limits

Limit name	Limit value
Maximum Batch input files - (no expiration)	500
Maximum Batch input files - (expiration set)	10,000
Maximum input file size	200 MB
Maximum input file size - Bring your own storage (BYOS)	1 GB
Maximum requests per file	100,000

Batch file limits don’t apply to output files (for example, result.jsonl, and error.jsonl). To remove batch input file limits, use Batch with Azure Blob Storage.

Batch quota

The table shows the batch quota limit. Quota values for global batch are represented in terms of enqueued tokens. When you submit a file for batch processing, the number of tokens in the file is counted. Until the batch job reaches a terminal state, those tokens count against your total enqueued token limit.

Global batch

Model	Enterprise and MCA-E	Default	Monthly credit card-based subscriptions	MSDN subscriptions	Azure for Students, free trials
`gpt-4.1`	5B	200M	50M	90K	N/A
`gpt-4.1 mini`	15B	1B	50M	90K	N/A
`gpt-4.1-nano`	15B	1B	50M	90K	N/A
`gpt-4o`	5B	200M	50M	90K	N/A
`gpt-4o-mini`	15B	1B	50M	90K	N/A
`gpt-4-turbo`	300M	80M	40M	90K	N/A
`gpt-4`	150M	30M	5M	100K	N/A
`o3-mini`	15B	1B	50M	90K	N/A
`o4-mini`	15B	1B	50M	90K	N/A
`gpt-5`	5B	200M	50M	90K	N/A
`gpt-5.1`	5B	200M	50M	90K	N/A

B = billion | M = million | K = thousand

Data zone batch

Model	Enterprise and MCA-E	Default	Monthly credit card-based subscriptions	MSDN subscriptions	Azure for Students, free trials
`gpt-4.1`	500M	30M	30M	90K	N/A
`gpt-4.1-mini`	1.5B	100M	50M	90K	N/A
`gpt-4o`	500M	30M	30M	90K	N/A
`gpt-4o-mini`	1.5B	100M	50M	90K	N/A
`o3-mini`	1.5B	100M	50M	90K	N/A
`gpt-5`	5B	200M	50M	90K	N/A
`gpt-5.1`	5B	200M	50M	90K	N/A

Batch object

Property	Type	Definition
`id`	string	The identifier of the batch.
`object`	string	`batch`
`endpoint`	string	The API endpoint used by the batch.
`errors`	object	Error information for the batch, if any.
`input_file_id`	string	The ID of the input file for the batch.
`completion_window`	string	The time frame within which the batch should be processed.
`status`	string	The current status of the batch. Possible values: `validating`, `failed`, `in_progress`, `finalizing`, `completed`, `expired`, `cancelling`, `cancelled`.
`output_file_id`	string	The ID of the file containing the outputs of successfully executed requests.
`error_file_id`	string	The ID of the file containing the outputs of requests with errors.
`created_at`	integer	A timestamp when this batch was created (in Unix epoch seconds).
`in_progress_at`	integer	A timestamp when this batch started progressing (in Unix epoch seconds).
`expires_at`	integer	A timestamp when this batch will expire (in Unix epoch seconds).
`finalizing_at`	integer	A timestamp when this batch started finalizing (in Unix epoch seconds).
`completed_at`	integer	A timestamp when this batch completed (in Unix epoch seconds).
`failed_at`	integer	A timestamp when this batch failed (in Unix epoch seconds).
`expired_at`	integer	A timestamp when this batch expired (in Unix epoch seconds).
`cancelling_at`	integer	A timestamp when this batch started `cancelling` (in Unix epoch seconds).
`cancelled_at`	integer	A timestamp when this batch was `cancelled` (in Unix epoch seconds).
`request_counts`	object	Object structure: `total` integer The total number of requests in the batch. `completed` integer The number of requests in the batch that are completed successfully. `failed` integer The number of requests in the batch that failed.
`metadata`	map	A set of key-value pairs that you can attach to the batch. This property can be useful for storing additional information about the batch in a structured format.

Frequently asked questions (FAQ)

Can images be used with the batch API?

This capability is limited to certain multimodal models. You can provide images as input either through an image URL or a base64 encoded representation of the image.

Can I use the batch API with fine-tuned models?

The batch API doesn’t currently support fine-tuned models.

Can I use the batch API for embeddings models?

The batch API doesn’t currently support fine-tuned models.

Does content filtering work with Global Batch deployment?

Yes. Similar to other deployment types, you can create content filters and associate them with the Global Batch deployment type.

Can I request additional quota?

Yes, from the quota page in the Foundry portal. Default quota allocation can be found in the quota and limits article.

What happens if the API doesn’t complete my request within the 24 hour time frame?

We aim to process these requests within 24 hours; we don’t expire the jobs that take longer. You can cancel the job anytime. When you cancel the job, any remaining work is canceled and any already completed work is returned. You’ll be charged for any completed work.

How many requests can I queue using batch?

There’s no fixed limit on the number of requests you can batch, however, it will depend on your enqueued token quota. Your enqueued token quota includes the maximum number of input tokens you can enqueue at one time. Once your batch request is completed, your batch rate limit is reset, as your input tokens are cleared. The limit depends on the number of global requests in the queue. If the Batch API queue processes your batches quickly, your batch rate limit is reset more quickly.

Troubleshooting

A job is successful when status is completed. Successful jobs will still generate an error_file_id, but it will be associated with an empty file with zero bytes. When a job failure occurs, you’ll find details about the failure in the errors property:

{
  "value": [
    {
      "id": "batch_80f5ad38-e05b-49bf-b2d6-a799db8466da",
      "completion_window": "24h",
      "created_at": 1725419394,
      "endpoint": "/chat/completions",
      "input_file_id": "file-c2d9a7881c8a466285e6f76f6321a681",
      "object": "batch",
      "status": "failed",
      "cancelled_at": null,
      "cancelling_at": null,
      "completed_at": 1725419955,
      "error_file_id": "file-3b0f9beb-11ce-4796-bc31-d54e675f28fb",
      "errors": {
        "object": "list",
        "data": [
          {
            "code": "empty_file",
            "message": "The input file is empty. Please ensure that the batch contains at least one request."
          }
        ]
      },
      "expired_at": null,
      "expires_at": 1725505794,
      "failed_at": null,
      "finalizing_at": 1725419710,
      "in_progress_at": 1725419572,
      "metadata": null,
      "output_file_id": "file-ef12af98-dbbc-4d27-8309-2df57feed572",
      "request_counts": {
        "total": 10,
        "completed": null,
        "failed": null
      }
    }
  ]
}

Error codes

Error code	Definition
`invalid_json_line`	A line (or multiple) in your input file wasn’t able to be parsed as valid json. Please ensure no typos, proper opening and closing brackets, and quotes as per JSON standard, and resubmit the request.
`too_many_tasks`	The number of requests in the input file exceeds the maximum allowed value of 100,000. Please ensure your total requests are under 100,000 and resubmit the job.
`url_mismatch`	Either a row in your input file has a URL that doesn’t match the rest of the rows, or the URL specified in the input file doesn’t match the expected endpoint URL. Please ensure all request URLs are the same, and that they match the endpoint URL associated with your Azure OpenAI deployment.
`model_not_found`	The Azure OpenAI model deployment name that was specified in the `model` property of the input file wasn’t found. Please ensure this name points to a valid Azure OpenAI model deployment.
`duplicate_custom_id`	The custom ID for this request is a duplicate of the custom ID in another request.
`empty_file`	The input file is empty. Please ensure the batch contains at least one request.
`model_mismatch`	The Azure OpenAI model deployment name that was specified in the `model` property of this request in the input file doesn’t match the rest of the file. Please ensure that all requests in the batch point to the same Azure OpenAI in Foundry Models model deployment in the `model` property of the request.
`invalid_request`	The schema of the input line is invalid or the deployment SKU is invalid. Please ensure the properties of the request in your input file match the expected input properties, and that the Azure OpenAI deployment SKU is `globalbatch` for batch API requests.
`input_modified`	Blob input has been modified after the batch job has been submitted.
`input_no_permissions`	It’s not possible to access the input blob. Please check permissions and network access between the Azure OpenAI account and Azure Storage account.

Known issues

Resources deployed with Azure CLI won’t work out-of-box with Azure OpenAI global batch. This is due to an issue where resources deployed using this method have endpoint subdomains that don’t follow the https://your-resource-name.openai.azure.com pattern. A workaround for this issue is to deploy a new Azure OpenAI resource using one of the other common deployment methods which will properly handle the subdomain setup as part of the deployment process.
UTF-8-BOM encoded jsonl files aren’t supported. JSON lines files should be encoded using UTF-8. Use of Byte-Order-Mark (BOM) encoded files isn’t officially supported by the JSON RFC spec, and Azure OpenAI will currently treat BOM encoded files as invalid. A UTF-8-BOM encoded file will currently return the generic error message: “Validation failed: A valid model deployment name couldn’t be extracted from the input file. Please ensure that each row in the input file has a valid deployment name specified in the ‘model’ field, and that the deployment name is consistent across all rows.”
When using your own storage for batch input data, once the batch job is submitted, if the input blob is modified the scoring job will be failed by the service.

​Batch support

​Global batch model availability

​Data zone batch model availability

​Feature support

​Batch deployment

​Batch limits

​Batch quota

​Global batch

​Data zone batch

​Batch object

​Frequently asked questions (FAQ)

​Can images be used with the batch API?

​Can I use the batch API with fine-tuned models?

​Can I use the batch API for embeddings models?

​Does content filtering work with Global Batch deployment?

​Can I request additional quota?

​What happens if the API doesn’t complete my request within the 24 hour time frame?

​How many requests can I queue using batch?

​Troubleshooting

​Error codes

​Known issues

​See also

Batch support

Global batch model availability

Data zone batch model availability

Feature support

Batch deployment

Batch limits

Batch quota

Global batch

Data zone batch

Batch object

Frequently asked questions (FAQ)

Can images be used with the batch API?

Can I use the batch API with fine-tuned models?

Can I use the batch API for embeddings models?

Does content filtering work with Global Batch deployment?

Can I request additional quota?

What happens if the API doesn’t complete my request within the 24 hour time frame?

How many requests can I queue using batch?

Troubleshooting

Error codes

Known issues

See also