Microsoft Foundry lets you choose the hosting structure that fits your business and usage patterns. The service offers two main deployment categories, standard (pay-per-token) and provisioned (reserved capacity), along with other categories such as batch (for asynchronous requests). Within these categories, you can choose global, data zone, or regional processing based on your compliance requirements. For all deployment types, data stored at rest remains in the designated Azure geography (Americas, Europe, Asia Pacific, Middle East & Africa). Inferencing data, however, is processed as follows:
  • Global types: May be processed in any Azure region where the Foundry Model is deployed
  • DataZone types: Processed anywhere within the Microsoft-specified data zone (US or EU)
  • Standard/Regional types: Processed in the region associated with your deployment (not available for batch deployments)
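The processing rules above can be sketched as a small lookup. This is an illustrative sketch only, not part of any Foundry SDK: the `ProcessingScope` enum and the `processing_scope` helper are hypothetical, and the code assumes deployment type names carry a `Global` or `DataZone` prefix (for example `GlobalStandard` or `DataZoneBatch`), with anything else treated as regional.

```python
from enum import Enum


class ProcessingScope(Enum):
    """Where inferencing data may be processed (wording taken from the list above)."""
    GLOBAL = "any Azure region where the Foundry Model is deployed"
    DATA_ZONE = "anywhere within the Microsoft-specified data zone (US or EU)"
    REGIONAL = "the region associated with the deployment"


def processing_scope(deployment_type: str) -> ProcessingScope:
    """Classify a deployment type name by its prefix.

    Assumption: names look like 'GlobalStandard', 'DataZoneProvisioned',
    or 'Standard'. Adjust the matching if your naming scheme differs.
    """
    name = deployment_type.strip().lower()
    if name.startswith("global"):
        return ProcessingScope.GLOBAL
    if name.startswith("datazone"):
        return ProcessingScope.DATA_ZONE
    if "batch" in name:
        # Regional processing is not available for batch deployments.
        raise ValueError("batch deployments must be global or data zone")
    return ProcessingScope.REGIONAL


if __name__ == "__main__":
    for example in ("GlobalStandard", "DataZoneBatch", "Standard"):
        scope = processing_scope(example)
        print(f"{example}: processed in {scope.value}")
```

A helper like this could back a compliance check before a deployment is created, for instance rejecting any type whose scope is `GLOBAL` when data must stay inside a data zone.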
All deployments support the same inference operations, but billing, scale, and performance differ substantially between them. To learn more about Microsoft Foundry deployment types, including Batch deployment types, see Deployment types for Microsoft Foundry Models.
Use the tabs at the top of this page to switch deployment categories: Standard deployment options, Provisioned deployment options, and Batch deployment options.