Configure keyless authentication with Microsoft Entra ID
This article explains how to configure keyless authentication with Microsoft Entra ID for Microsoft Foundry Models. Keyless authentication enhances security by eliminating the need for API keys, simplifies the user experience with role-based access control (RBAC), and reduces operational complexity while providing robust compliance support.
Prerequisites
To complete this article, you need:
Required Azure roles and permissions
Microsoft Entra ID uses role-based access control (RBAC) to manage access to Azure resources. You need different roles, depending on whether you’re setting up authentication (administrator) or using it to make API calls (developer).
For setting up authentication
- Subscription owner or administrator: An account with Microsoft.Authorization/roleAssignments/write and Microsoft.Authorization/roleAssignments/delete permissions, such as the Owner or User Access Administrator role. These permissions are required to assign the Cognitive Services User role to developers.
For making authenticated API calls
- Cognitive Services User role: Required for developers to authenticate and make inference API calls using Microsoft Entra ID. This role must be assigned at the scope of your Foundry resource.
Role assignment requirements
When assigning roles, specify these three elements:
- Security principal: Your user account, service principal, or security group (recommended for managing multiple users)
- Role definition: The Cognitive Services User role
- Scope: Your specific Foundry resource
Azure role assignments can take up to 5 minutes to propagate. When using security groups, changes to group membership propagate immediately.
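Because of this propagation delay, a newly assigned user can still receive 401/403 errors for a few minutes after the role assignment succeeds. A minimal retry sketch (a hypothetical helper, not part of any Azure SDK) that waits for the assignment to take effect:

```python
import time

def call_with_rbac_retry(make_call, attempts=5, delay_seconds=30):
    """Retry a callable while a new role assignment propagates.

    make_call: a zero-argument function that performs the API call and
    raises PermissionError (standing in for an HTTP 401/403 here) until
    the role assignment becomes visible.
    """
    for attempt in range(attempts):
        try:
            return make_call()
        except PermissionError:
            if attempt == attempts - 1:
                raise
            # Role assignments can take up to five minutes to propagate.
            time.sleep(delay_seconds)
```

In practice you would wrap your first few inference calls with a helper like this, or simply wait a few minutes after assigning the role.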
Custom role (optional)
If you prefer a custom role instead of Cognitive Services User, make sure it includes these permissions:
{
  "permissions": [
    {
      "dataActions": [
        "Microsoft.CognitiveServices/accounts/MaaS/*"
      ]
    }
  ]
}
For more context on how roles work with Azure resources, see Understand roles in the context of resource in Azure.
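To sanity-check a custom role definition like the one above, you can verify that a required data action is covered by one of its dataActions patterns. A small illustrative sketch (pure Python, using shell-style glob matching as an approximation of Azure's `*` wildcard semantics):

```python
import fnmatch

ROLE_DEFINITION = {
    "permissions": [
        {
            "dataActions": [
                "Microsoft.CognitiveServices/accounts/MaaS/*"
            ]
        }
    ]
}

def covers(role_definition, required_action):
    """Return True if any dataActions pattern matches the required action."""
    for permission in role_definition["permissions"]:
        for pattern in permission.get("dataActions", []):
            # Azure action strings use '*' as a wildcard over the rest of the path.
            if fnmatch.fnmatch(required_action, pattern):
                return True
    return False
```

For example, an inference action under the MaaS path is covered, while an unrelated action under a different path is not.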
This section lists the steps to configure Microsoft Entra ID for inference from the Microsoft Foundry resource page in the Azure portal.
1. Sign in to Microsoft Foundry. Make sure the New Foundry toggle is on. These steps refer to Foundry (new).
2. On the landing page, select Management center.
3. Go to the Connected resources section and select the connection to the Foundry resource that you want to configure. If it isn't listed, select View all to see the full list.
4. In the Connection details section, under Resource, select the name of the Azure resource. This action opens the resource in the Azure portal.
Configure Microsoft Entra ID from the resource page
1. Select the resource name to open it.
2. In the left pane, select Access control (IAM), and then select Add > Add role assignment. Use the View my access option to verify which roles are already assigned to you.
3. In Job function roles, type Cognitive Services User.
4. Select the role, and then select Next.
5. On Members, select the user or group you want to grant access to. Use security groups whenever possible because they're easier to manage and maintain.
6. Select Next and finish the wizard.

The selected user can now use Microsoft Entra ID for inference. Azure role assignments can take up to five minutes to propagate. When working with security groups, adding or removing users from the security group propagates immediately.

To verify the role assignment:
1. In the left pane of the Azure portal, select Access control (IAM).
2. Select Check access.
3. Search for the user or security group you assigned the role to.
4. Verify that Cognitive Services User appears in their assigned roles.
Key-based access is still possible for users who already have keys available to them. To revoke the keys, in the Azure portal, on the left navigation, select Resource Management > Keys and Endpoints > Regenerate Key1 and Regenerate Key2.
Use Microsoft Entra ID in your code
After you configure Microsoft Entra ID in your resource, update your code to use it when you consume the inference endpoint. This example shows how to use a chat completions model:
The following examples cover Python, C#, JavaScript, Java, and REST.
Install the OpenAI SDK using a package manager like pip:

pip install openai

For Microsoft Entra ID authentication, also install:

pip install azure-identity
Use the package to consume the model. The following example shows how to create a client to consume chat completions with Microsoft Entra ID and make a test call to the chat completions endpoint with your model deployment. Replace <resource> with your Foundry resource name. Find it in the Azure portal or by running az cognitiveservices account list. Replace DeepSeek-V3.1 with your actual deployment name.

from openai import OpenAI
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
token_provider = get_bearer_token_provider(
DefaultAzureCredential(),
"https://cognitiveservices.azure.com/.default"
)
client = OpenAI(
base_url="https://<resource>.openai.azure.com/openai/v1/",
api_key=token_provider,
)
completion = client.chat.completions.create(
model="DeepSeek-V3.1", # Required: your deployment name
messages=[
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "What is Azure AI?"}
]
)
print(completion.choices[0].message.content)
Expected output:

Azure AI is a comprehensive suite of artificial intelligence services and tools from Microsoft that enables developers to build intelligent applications. It includes services for natural language processing, computer vision, speech recognition, and machine learning capabilities.
Reference: OpenAI Python SDK and DefaultAzureCredential class.

Install the OpenAI SDK:

dotnet add package OpenAI

For Microsoft Entra ID authentication, also install the Azure.Identity package:

dotnet add package Azure.Identity

Import the following namespaces:

using Azure.Identity;
using OpenAI;
using OpenAI.Chat;
using System.ClientModel.Primitives;
Then, use the package to consume the model. The following example shows how to create a client to consume chat completions with Microsoft Entra ID, and then make a test call to the chat completions endpoint with your model deployment. Replace <resource> with your Foundry resource name (find it in the Azure portal). Replace gpt-4o-mini with your actual deployment name.

#pragma warning disable OPENAI001
BearerTokenPolicy tokenPolicy = new(
new DefaultAzureCredential(),
"https://cognitiveservices.azure.com/.default"
);
ChatClient client = new(
model: "gpt-4o-mini", // Your deployment name
authenticationPolicy: tokenPolicy,
options: new OpenAIClientOptions() {
Endpoint = new Uri("https://<resource>.openai.azure.com/openai/v1/")
}
);
ChatCompletion completion = client.CompleteChat(
new SystemChatMessage("You are a helpful assistant."),
new UserChatMessage("What is Azure AI?")
);
Console.WriteLine(completion.Content[0].Text);
Expected output:

Azure AI is a comprehensive suite of artificial intelligence services and tools from Microsoft that enables developers to build intelligent applications. It includes services for natural language processing, computer vision, speech recognition, and machine learning capabilities.

Reference: OpenAI .NET SDK and DefaultAzureCredential class.

Install the OpenAI SDK with npm:

npm install openai

For Microsoft Entra ID authentication, also install:

npm install @azure/identity

Then, use the package to consume the model. The following example shows how to create a client to consume chat completions with Microsoft Entra ID, and then make a test call to the chat completions endpoint with your model deployment. Replace <resource> with your Foundry resource name (find it in the Azure portal or by running az cognitiveservices account list). Replace DeepSeek-V3.1 with your actual deployment name.

import { DefaultAzureCredential, getBearerTokenProvider } from "@azure/identity";
import { OpenAI } from "openai";
const tokenProvider = getBearerTokenProvider(
new DefaultAzureCredential(),
'https://cognitiveservices.azure.com/.default'
);
const client = new OpenAI({
baseURL: "https://<resource>.openai.azure.com/openai/v1/",
apiKey: tokenProvider
});
const completion = await client.chat.completions.create({
model: "DeepSeek-V3.1", // Required: your deployment name
messages: [
{ role: "system", content: "You are a helpful assistant." },
{ role: "user", content: "What is Azure AI?" }
]
});
console.log(completion.choices[0].message.content);
Expected output:

Azure AI is a comprehensive suite of artificial intelligence services and tools from Microsoft that enables developers to build intelligent applications. It includes services for natural language processing, computer vision, speech recognition, and machine learning capabilities.

Reference: OpenAI Node.js SDK and DefaultAzureCredential class.

Add the OpenAI SDK to your project. Check the OpenAI Java GitHub repository for the latest version and installation instructions.

For Microsoft Entra ID authentication, also add:

<dependency>
<groupId>com.azure</groupId>
<artifactId>azure-identity</artifactId>
<version>1.18.0</version>
</dependency>
Then, use the package to consume the model. The following example shows how to create a client to consume chat completions with Microsoft Entra ID, and then make a test call to the chat completions endpoint with your model deployment. Replace <resource> with your Foundry resource name (find it in the Azure portal). Replace DeepSeek-V3.1 with your actual deployment name.

import com.openai.client.OpenAIClient;
import com.openai.client.okhttp.OpenAIOkHttpClient;
import com.azure.identity.AuthenticationUtil;
import com.azure.identity.DefaultAzureCredential;
import com.azure.identity.DefaultAzureCredentialBuilder;
import com.openai.credential.BearerTokenCredential;
import com.openai.models.chat.completions.*;
DefaultAzureCredential tokenCredential = new DefaultAzureCredentialBuilder().build();
OpenAIClient client = OpenAIOkHttpClient.builder()
.baseUrl("https://<resource>.openai.azure.com/openai/v1/")
.credential(BearerTokenCredential.create(
AuthenticationUtil.getBearerTokenSupplier(
tokenCredential,
"https://cognitiveservices.azure.com/.default"
)
))
.build();
ChatCompletionCreateParams params = ChatCompletionCreateParams.builder()
.addSystemMessage("You are a helpful assistant.")
.addUserMessage("What is Azure AI?")
.model("DeepSeek-V3.1") // Required: your deployment name
.build();
ChatCompletion completion = client.chat().completions().create(params);
System.out.println(completion.choices().get(0).message().content());
Expected output:

Azure AI is a comprehensive suite of artificial intelligence services and tools from Microsoft that enables developers to build intelligent applications. It includes services for natural language processing, computer vision, speech recognition, and machine learning capabilities.

Reference: OpenAI Java SDK and DefaultAzureCredential class.

Explore the API design in the reference section to see which parameters are available. Pass the authentication token in the Authorization header. For example, the Chat completion reference section details how to use the /chat/completions route to generate predictions based on chat-formatted instructions. The /openai/v1/ path is included in the root of the URL.

Request

Replace <resource> with your Foundry resource name (find it in the Azure portal or by running az cognitiveservices account list). Replace MAI-DS-R1 with your actual deployment name. The base URL accepts both the https://<resource>.openai.azure.com/openai/v1/ and https://<resource>.services.ai.azure.com/openai/v1/ formats.

curl -X POST https://<resource>.openai.azure.com/openai/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $AZURE_OPENAI_AUTH_TOKEN" \
-d '{
"model": "MAI-DS-R1",
"messages": [
{
"role": "system",
"content": "You are a helpful assistant."
},
{
"role": "user",
"content": "Explain what the bitter lesson is?"
}
]
}'
Response

If authentication is successful, you receive a 200 OK response with chat completion results in the response body:

{
"id": "chatcmpl-...",
"object": "chat.completion",
"created": 1738368234,
"model": "MAI-DS-R1",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "The bitter lesson refers to a key insight in AI research that emphasizes the importance of general-purpose learning methods that leverage computation, rather than human-designed domain-specific approaches. It suggests that methods which scale with increased computation tend to be more effective in the long run."
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 28,
"completion_tokens": 52,
"total_tokens": 80
}
}
Tokens must be issued with the scope https://cognitiveservices.azure.com/.default.

For testing purposes, the easiest way to get a valid token for your user account is to use the Azure CLI. In a console, run the following command:

az account get-access-token --resource https://cognitiveservices.azure.com --query "accessToken" --output tsv

This command outputs an access token that you can store in the $AZURE_OPENAI_AUTH_TOKEN environment variable.

Reference: Chat Completions API
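The REST call above has only two moving parts: the endpoint URL and the bearer token header. A small illustrative helper (hypothetical, not part of any SDK) that assembles both:

```python
def chat_completions_url(resource, suffix="openai.azure.com"):
    """Build the chat completions endpoint for a Foundry resource.

    Both the openai.azure.com and services.ai.azure.com suffixes are
    accepted by the service.
    """
    return f"https://{resource}.{suffix}/openai/v1/chat/completions"

def auth_headers(access_token):
    """Headers for a keyless (Microsoft Entra ID) call: no api-key, just a bearer token."""
    return {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {access_token}",
    }
```

You would pass the result of `auth_headers` to any HTTP client, with the token obtained from the Azure CLI command above or from DefaultAzureCredential.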
Options for credential when using Microsoft Entra ID
DefaultAzureCredential is an opinionated, ordered sequence of mechanisms for authenticating to Microsoft Entra ID. Each authentication mechanism is a class that’s derived from the TokenCredential class and is known as a credential. At runtime, DefaultAzureCredential attempts to authenticate using the first credential. If that credential fails to acquire an access token, the next credential in the sequence is attempted, and so on, until an access token is obtained. In this way, your app can use different credentials in different environments without writing environment-specific code.
When the preceding code runs on your local development workstation, it looks in the environment variables for an application service principal or at locally installed developer tools, like Visual Studio, for a set of developer credentials. You can use either approach to authenticate the app to Azure resources during local development.
When deployed to Azure, this same code can also authenticate your app to other Azure resources. DefaultAzureCredential can retrieve environment settings and managed identity configurations to authenticate to other services automatically.
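The fallback behavior can be pictured with a simplified, library-free simulation. These classes are illustrative stand-ins, not the real azure.identity types: each credential either returns a token or raises, and the chain tries them in order until one succeeds.

```python
class CredentialUnavailableError(Exception):
    """Stand-in for the error a credential raises when it can't authenticate."""

class EnvironmentCredentialStub:
    """Simulates reading a service principal secret from environment variables."""
    def __init__(self, env):
        self.env = env
    def get_token(self):
        token = self.env.get("APP_SECRET_TOKEN")
        if token is None:
            raise CredentialUnavailableError("no service principal in environment")
        return token

class DeveloperToolCredentialStub:
    """Simulates a developer tool (like Visual Studio) with a cached sign-in."""
    def __init__(self, cached_token=None):
        self.cached_token = cached_token
    def get_token(self):
        if self.cached_token is None:
            raise CredentialUnavailableError("no developer tool sign-in found")
        return self.cached_token

class ChainedCredentialStub:
    """Tries each credential in order, as DefaultAzureCredential does."""
    def __init__(self, *credentials):
        self.credentials = credentials
    def get_token(self):
        errors = []
        for credential in self.credentials:
            try:
                return credential.get_token()
            except CredentialUnavailableError as exc:
                errors.append(str(exc))
        raise CredentialUnavailableError("; ".join(errors))
```

On a developer workstation with no service principal configured, the chain falls through to the developer tool credential; in a deployed environment with the secret present, the first credential wins without ever consulting the rest.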
Best practices
- Use deterministic credentials in production environments: Strongly consider moving from DefaultAzureCredential to one of the following deterministic solutions in production environments:
  - A specific TokenCredential implementation, like ManagedIdentityCredential. See the Derived list for options.
  - A pared-down ChainedTokenCredential implementation that's optimized for the Azure environment in which your app runs. ChainedTokenCredential essentially creates a specific allowlist of acceptable credential options, like ManagedIdentityCredential for production and VisualStudioCredential for development.
- Configure system-assigned or user-assigned managed identities for the Azure resources where your code runs, if possible, and configure Microsoft Entra ID access to those specific identities.
1. Sign in to Microsoft Foundry. Make sure the New Foundry toggle is on. These steps refer to Foundry (new).
2. Go to the projects or hubs that use the Foundry resource through a connection.
3. Select Management center.
4. Go to the Connected resources section and select the connection to the Foundry resource that you want to configure. If it's not listed, select View all to see the full list.
5. In the Connection details section, next to Access details, select the edit icon.
6. Under Authentication, change the value to Microsoft Entra ID.
7. Select Update.

Your connection is now configured to work with Microsoft Entra ID.
Disable key-based authentication in the resource
Disable key-based authentication when you implement Microsoft Entra ID and have fully addressed compatibility or fallback concerns in all applications that consume the service. You can disable key-based authentication by using the Azure CLI, or at deployment time with Bicep or ARM templates.
Key-based access is still possible for users that already have keys available to them. To revoke the keys, in the Azure portal, on the left navigation, select Resource Management > Keys and Endpoints > Regenerate Key1 and Regenerate Key2.
- Install the Azure CLI.
- Identify the following information:
  - Your Azure subscription ID
  - Your Microsoft Foundry resource name
  - The resource group where you deployed the Foundry resource

To configure Microsoft Entra ID for inference, follow these steps:
1. Sign in to your Azure subscription:

# Authenticate with Azure and sign in interactively
az login

2. If you have more than one subscription, select the subscription where your resource is located:

# Set the active subscription context
az account set --subscription "<subscription-id>"

3. Set the following environment variables with the name of the resource and resource group you plan to use:

# Store resource identifiers for reuse in subsequent commands
ACCOUNT_NAME="<ai-services-resource-name>"
RESOURCE_GROUP="<resource-group>"

4. Get the full resource ID of your resource:

# Retrieve the full Azure Resource Manager ID for role assignment scoping
RESOURCE_ID=$(az resource show -g $RESOURCE_GROUP -n $ACCOUNT_NAME --resource-type "Microsoft.CognitiveServices/accounts" --query id --output tsv)

5. Get the object ID of the security principal you want to assign permissions to. The following examples show how to get the object ID associated with:

Your own signed-in account:

# Get your user's Microsoft Entra ID object ID
OBJECT_ID=$(az ad signed-in-user show --query id --output tsv)

A security group:

# Get the object ID for a security group (recommended for production)
OBJECT_ID=$(az ad group show --group "<group-name>" --query id --output tsv)

A service principal:

# Get the object ID for a service principal (for app authentication)
OBJECT_ID=$(az ad sp show --id "<service-principal-guid>" --query id --output tsv)

6. Assign the Cognitive Services User role to the security principal, scoped to the resource. The role assignment grants the principal access to this resource:

# Grant inference access by assigning the Cognitive Services User role
az role assignment create --assignee-object-id $OBJECT_ID --role "Cognitive Services User" --scope $RESOURCE_ID

The selected principal can now use Microsoft Entra ID for inference. Keep in mind that Azure role assignments can take up to five minutes to propagate. Adding or removing users from a security group propagates immediately.

7. Verify the role assignment:

az role assignment list --scope $RESOURCE_ID --assignee $OBJECT_ID --query "[?roleDefinitionName=='Cognitive Services User'].{principalName:principalName, roleDefinitionName:roleDefinitionName}" --output table

The output should show the Cognitive Services User role assigned to your principal.
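If you script the verification, requesting JSON output instead of a table makes the check programmatic. A sketch (illustrative only) that inspects the output of az role assignment list with --output json:

```python
import json

def has_role(role_assignments_json, role_name="Cognitive Services User"):
    """Return True if the role appears among the listed assignments."""
    assignments = json.loads(role_assignments_json)
    return any(a.get("roleDefinitionName") == role_name for a in assignments)

# Example shape of the CLI output (trimmed to the relevant fields):
sample = json.dumps([
    {"principalName": "dev@contoso.com", "roleDefinitionName": "Cognitive Services User"}
])
```

In a CI pipeline, you would feed the captured CLI output to `has_role` and fail the job when it returns False.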
Disable key-based authentication in the resource
Disable key-based authentication when you implement Microsoft Entra ID and fully address compatibility or fallback concerns in all the applications that consume the service.
Use Azure PowerShell to disable key-based (local) authentication for an individual resource. First, sign in with the Connect-AzAccount cmdlet. Then use the Set-AzCognitiveServicesAccount cmdlet with the -DisableLocalAuth $true parameter, as in the following example:
Set-AzCognitiveServicesAccount -ResourceGroupName "my-resource-group" -Name "my-resource-name" -DisableLocalAuth $true
For more information about how to use the Azure CLI to disable or reenable local authentication and verify authentication status, see Disable local authentication in Foundry Tools.
- Install the Azure CLI.
- Identify the following information:
  - Your Azure subscription ID
About this tutorial
The example in this article is based on code samples in the Azure-Samples/azureai-model-inference-bicep repository. To run the commands locally without copying or pasting file content, clone the repository with these commands and go to the folder for your coding language:
git clone https://github.com/Azure-Samples/azureai-model-inference-bicep
The files for this example are in the following directory:
cd azureai-model-inference-bicep/infra
Understand the resources
In this tutorial, you create the following resources:
- A Microsoft Foundry resource with key access disabled. For simplicity, this template doesn’t deploy models.
- A role assignment for a given security principal with the role Cognitive Services User.
To create these resources, use the following assets:
-
Use the template
modules/ai-services-template.bicep to describe your Foundry resource.
modules/ai-services-template.bicep
// Source: ai-services-template.bicep (not available)
This template accepts the allowKeys parameter. Set it to false to disable key access in the resource.
-
Use the template
modules/role-assignment-template.bicep to describe a role assignment in Azure:
modules/role-assignment-template.bicep
// Source: role-assignment-template.bicep (not available)
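The role-assignment module isn't reproduced in this article. The following is a minimal sketch of what such a module might look like; the parameter names are assumptions based on common Bicep patterns, not the contents of the sample repository. The GUID a97b65f3-24c7-4388-baec-2e87135dc908 is the built-in role definition ID for Cognitive Services User.

```bicep
// Hypothetical sketch of modules/role-assignment-template.bicep (assumed names).
@description('Object ID of the security principal that receives the role.')
param securityPrincipalId string

@description('Name of the Foundry (Cognitive Services) account to scope the assignment to.')
param accountName string

// Built-in role definition ID for "Cognitive Services User".
var cognitiveServicesUserRoleId = subscriptionResourceId(
  'Microsoft.Authorization/roleDefinitions',
  'a97b65f3-24c7-4388-baec-2e87135dc908'
)

resource account 'Microsoft.CognitiveServices/accounts@2023-05-01' existing = {
  name: accountName
}

resource roleAssignment 'Microsoft.Authorization/roleAssignments@2022-04-01' = {
  // Deterministic name keeps the deployment idempotent.
  name: guid(account.id, securityPrincipalId, cognitiveServicesUserRoleId)
  scope: account
  properties: {
    roleDefinitionId: cognitiveServicesUserRoleId
    principalId: securityPrincipalId
  }
}
```

Scoping the assignment to the account resource (rather than the resource group) grants inference access to that resource only, which matches the role assignment requirements described earlier.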
Create the resources
In your console, follow these steps:
- Define the main deployment:
deploy-entra-id.bicep
// Source: deploy-entra-id.bicep (not available)
- Sign in to Azure:
az login
- Make sure you're in the right subscription:
az account set --subscription "<subscription-id>"
- Run the deployment:
RESOURCE_GROUP="<resource-group-name>"
SECURITY_PRINCIPAL_ID="<your-security-principal-id>"
az deployment group create \
--resource-group $RESOURCE_GROUP \
--parameters securityPrincipalId=$SECURITY_PRINCIPAL_ID \
--template-file deploy-entra-id.bicep
- The template outputs the Foundry Models endpoint that you can use to consume any of the model deployments you created.
- Verify the deployment and role assignment:
# Get the endpoint from deployment output
ENDPOINT=$(az deployment group show --resource-group $RESOURCE_GROUP --name deploy-entra-id --query properties.outputs.endpoint.value --output tsv)
# Verify role assignment
RESOURCE_ID=$(az deployment group show --resource-group $RESOURCE_GROUP --name deploy-entra-id --query properties.outputs.resourceId.value --output tsv)
az role assignment list --scope $RESOURCE_ID --assignee $SECURITY_PRINCIPAL_ID --query "[?roleDefinitionName=='Cognitive Services User'].roleDefinitionName" --output tsv
# Test authentication by getting an access token
az account get-access-token --resource https://cognitiveservices.azure.com --query "accessToken" --output tsv
If successful, you see Cognitive Services User from the role assignment check and an access token from the authentication test. You can now use this endpoint and Microsoft Entra ID authentication in your code.
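The main deployment file referenced in the steps above isn't reproduced in this article. Assuming the two modules expose the parameters and outputs described earlier (the module paths, parameter names, and output names here are all assumptions), a minimal orchestration might look like this sketch:

```bicep
// Hypothetical sketch of deploy-entra-id.bicep (assumed module and parameter names).
@description('Security principal that gets the Cognitive Services User role.')
param securityPrincipalId string

param accountName string = 'my-foundry-account'
param location string = resourceGroup().location

// Foundry resource with key-based access disabled.
module account 'modules/ai-services-template.bicep' = {
  name: 'account'
  params: {
    accountName: accountName
    location: location
    allowKeys: false // disables local (key) authentication
  }
}

// Grant the principal inference access on the resource.
module role 'modules/role-assignment-template.bicep' = {
  name: 'role-assignment'
  params: {
    securityPrincipalId: securityPrincipalId
    accountName: accountName
  }
}

output endpoint string = account.outputs.endpoint
output resourceId string = account.outputs.resourceId
```

The two outputs correspond to the values the verification step reads with az deployment group show.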
Use Microsoft Entra ID in your code
After you configure Microsoft Entra ID in your resource, update your code to use it when you consume the inference endpoint. The following example shows how to use a chat completions model.
Python
C#
JavaScript
Java
REST
Install the OpenAI SDK using a package manager like pip:
pip install openai
For Microsoft Entra ID authentication, also install:
pip install azure-identity
Use the package to consume the model. The following example shows how to create a client to consume chat completions with Microsoft Entra ID and make a test call to the chat completions endpoint with your model deployment. Replace <resource> with your Foundry resource name. Find it in the Azure portal or by running az cognitiveservices account list. Replace DeepSeek-V3.1 with your actual deployment name.
from openai import OpenAI
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
token_provider = get_bearer_token_provider(
DefaultAzureCredential(),
"https://cognitiveservices.azure.com/.default"
)
client = OpenAI(
base_url="https://<resource>.openai.azure.com/openai/v1/",
api_key=token_provider,
)
completion = client.chat.completions.create(
model="DeepSeek-V3.1", # Required: your deployment name
messages=[
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "What is Azure AI?"}
]
)
print(completion.choices[0].message.content)
Expected output:
Azure AI is a comprehensive suite of artificial intelligence services and tools from Microsoft that enables developers to build intelligent applications. It includes services for natural language processing, computer vision, speech recognition, and machine learning capabilities.
Reference: OpenAI Python SDK and DefaultAzureCredential class.
Install the OpenAI SDK:
dotnet add package OpenAI
For Microsoft Entra ID authentication, also install the Azure.Identity package:
dotnet add package Azure.Identity
Import the following namespaces:
using Azure.Identity;
using OpenAI;
using OpenAI.Chat;
using System.ClientModel.Primitives;
Then, use the package to consume the model. The following example shows how to create a client to consume chat completions with Microsoft Entra ID, and then make a test call to the chat completions endpoint with your model deployment. Replace <resource> with your Foundry resource name (find it in the Azure portal). Replace gpt-4o-mini with your actual deployment name.
#pragma warning disable OPENAI001
BearerTokenPolicy tokenPolicy = new(
new DefaultAzureCredential(),
"https://cognitiveservices.azure.com/.default"
);
ChatClient client = new(
model: "gpt-4o-mini", // Your deployment name
authenticationPolicy: tokenPolicy,
options: new OpenAIClientOptions() {
Endpoint = new Uri("https://<resource>.openai.azure.com/openai/v1/")
}
);
ChatCompletion completion = client.CompleteChat(
new SystemChatMessage("You are a helpful assistant."),
new UserChatMessage("What is Azure AI?")
);
Console.WriteLine(completion.Content[0].Text);
Expected output:
Azure AI is a comprehensive suite of artificial intelligence services and tools from Microsoft that enables developers to build intelligent applications. It includes services for natural language processing, computer vision, speech recognition, and machine learning capabilities.
Reference: OpenAI .NET SDK and DefaultAzureCredential class.
Install the OpenAI SDK with npm:
npm install openai
For Microsoft Entra ID authentication, also install:
npm install @azure/identity
Then, use the package to consume the model. The following example shows how to create a client to consume chat completions with Microsoft Entra ID, and then make a test call to the chat completions endpoint with your model deployment. Replace <resource> with your Foundry resource name (find it in the Azure portal or by running az cognitiveservices account list). Replace DeepSeek-V3.1 with your actual deployment name.
import { DefaultAzureCredential, getBearerTokenProvider } from "@azure/identity";
import { OpenAI } from "openai";
const tokenProvider = getBearerTokenProvider(
new DefaultAzureCredential(),
'https://cognitiveservices.azure.com/.default'
);
const client = new OpenAI({
baseURL: "https://<resource>.openai.azure.com/openai/v1/",
apiKey: tokenProvider
});
const completion = await client.chat.completions.create({
model: "DeepSeek-V3.1", // Required: your deployment name
messages: [
{ role: "system", content: "You are a helpful assistant." },
{ role: "user", content: "What is Azure AI?" }
]
});
console.log(completion.choices[0].message.content);
Expected output:
Azure AI is a comprehensive suite of artificial intelligence services and tools from Microsoft that enables developers to build intelligent applications. It includes services for natural language processing, computer vision, speech recognition, and machine learning capabilities.
Reference: OpenAI Node.js SDK and DefaultAzureCredential class.
Add the OpenAI SDK to your project. Check the OpenAI Java GitHub repository for the latest version and installation instructions.
For Microsoft Entra ID authentication, also add:
<dependency>
<groupId>com.azure</groupId>
<artifactId>azure-identity</artifactId>
<version>1.18.0</version>
</dependency>
Then, use the package to consume the model. The following example shows how to create a client to consume chat completions with Microsoft Entra ID, and then make a test call to the chat completions endpoint with your model deployment. Replace <resource> with your Foundry resource name (find it in the Azure portal). Replace DeepSeek-V3.1 with your actual deployment name.
import com.openai.client.OpenAIClient;
import com.openai.client.okhttp.OpenAIOkHttpClient;
import com.azure.identity.DefaultAzureCredential;
import com.azure.identity.DefaultAzureCredentialBuilder;
import com.azure.identity.AuthenticationUtil; // requires azure-identity 1.15+
import com.openai.credential.BearerTokenCredential;
import com.openai.models.chat.completions.*;
DefaultAzureCredential tokenCredential = new DefaultAzureCredentialBuilder().build();
OpenAIClient client = OpenAIOkHttpClient.builder()
.baseUrl("https://<resource>.openai.azure.com/openai/v1/")
.credential(BearerTokenCredential.create(
AuthenticationUtil.getBearerTokenSupplier(
tokenCredential,
"https://cognitiveservices.azure.com/.default"
)
))
.build();
ChatCompletionCreateParams params = ChatCompletionCreateParams.builder()
.addSystemMessage("You are a helpful assistant.")
.addUserMessage("What is Azure AI?")
.model("DeepSeek-V3.1") // Required: your deployment name
.build();
ChatCompletion completion = client.chat().completions().create(params);
System.out.println(completion.choices().get(0).message().content());
Expected output:
Azure AI is a comprehensive suite of artificial intelligence services and tools from Microsoft that enables developers to build intelligent applications. It includes services for natural language processing, computer vision, speech recognition, and machine learning capabilities.
Reference: OpenAI Java SDK and DefaultAzureCredential class.
Explore the API design in the reference section to see which parameters are available. Pass the authentication token in the Authorization header. For example, the Chat completions reference section details how to use the /chat/completions route to generate predictions based on chat-formatted instructions.
Request
Replace <resource> with your Foundry resource name (find it in the Azure portal or by running az cognitiveservices account list). Replace MAI-DS-R1 with your actual deployment name. The base URL accepts both the https://<resource>.openai.azure.com/openai/v1/ and https://<resource>.services.ai.azure.com/openai/v1/ formats.
curl -X POST https://<resource>.openai.azure.com/openai/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $AZURE_OPENAI_AUTH_TOKEN" \
-d '{
"model": "MAI-DS-R1",
"messages": [
{
"role": "system",
"content": "You are a helpful assistant."
},
{
"role": "user",
"content": "Explain what the bitter lesson is?"
}
]
}'
Response
If authentication is successful, you receive a 200 OK response with chat completion results in the response body:
{
"id": "chatcmpl-...",
"object": "chat.completion",
"created": 1738368234,
"model": "MAI-DS-R1",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "The bitter lesson refers to a key insight in AI research that emphasizes the importance of general-purpose learning methods that leverage computation, rather than human-designed domain-specific approaches. It suggests that methods which scale with increased computation tend to be more effective in the long run."
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 28,
"completion_tokens": 52,
"total_tokens": 80
}
}
Tokens must be issued with the scope https://cognitiveservices.azure.com/.default.
For testing purposes, the easiest way to get a valid token for your user account is to use the Azure CLI. In a console, run the following Azure CLI command:
az account get-access-token --resource https://cognitiveservices.azure.com --query "accessToken" --output tsv
This command outputs an access token that you can store in the $AZURE_OPENAI_AUTH_TOKEN environment variable.
Reference: Chat Completions API
Options for credential when using Microsoft Entra ID
DefaultAzureCredential is an opinionated, ordered sequence of mechanisms for authenticating to Microsoft Entra ID. Each authentication mechanism is a class that’s derived from the TokenCredential class and is known as a credential. At runtime, DefaultAzureCredential attempts to authenticate using the first credential. If that credential fails to acquire an access token, the next credential in the sequence is attempted, and so on, until an access token is obtained. In this way, your app can use different credentials in different environments without writing environment-specific code.
When the preceding code runs on your local development workstation, it looks in the environment variables for an application service principal or at locally installed developer tools, like Visual Studio, for a set of developer credentials. You can use either approach to authenticate the app to Azure resources during local development.
When deployed to Azure, this same code can also authenticate your app to other Azure resources. DefaultAzureCredential can retrieve environment settings and managed identity configurations to authenticate to other services automatically.
Best practices
- Use deterministic credentials in production environments: Strongly consider moving from DefaultAzureCredential to one of the following deterministic solutions in production environments:
  - A specific TokenCredential implementation, like ManagedIdentityCredential. See the Derived list for options.
  - A pared-down ChainedTokenCredential implementation that's optimized for the Azure environment in which your app runs. ChainedTokenCredential essentially creates a specific allowlist of acceptable credential options, like ManagedIdentity for production and VisualStudioCredential for development.
- Configure system-assigned or user-assigned managed identities for the Azure resources where your code runs, if possible, and configure Microsoft Entra ID access to those specific identities.
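The fallback behavior that DefaultAzureCredential and ChainedTokenCredential implement can be illustrated without depending on the azure-identity package. The following is a self-contained Python sketch of the chained-credential pattern; the class and method names are invented for illustration and don't match the real TokenCredential API.

```python
# Toy illustration of the chained-credential pattern (not the azure-identity API).
class CredentialUnavailableError(Exception):
    pass

class EnvironmentCredential:
    """Succeeds only if a token is present in a simulated environment."""
    def __init__(self, env):
        self.env = env
    def get_token(self):
        if "ACCESS_TOKEN" not in self.env:
            raise CredentialUnavailableError("no environment token")
        return self.env["ACCESS_TOKEN"]

class ManagedIdentityCredential:
    """Stands in for a credential that works only when deployed to Azure."""
    def __init__(self, available, token="mi-token"):
        self.available, self.token = available, token
    def get_token(self):
        if not self.available:
            raise CredentialUnavailableError("no managed identity endpoint")
        return self.token

class ChainedCredential:
    """Tries each credential in order; the first one to return a token wins."""
    def __init__(self, *credentials):
        self.credentials = credentials
    def get_token(self):
        for credential in self.credentials:
            try:
                return credential.get_token()
            except CredentialUnavailableError:
                continue
        raise CredentialUnavailableError("no credential in the chain succeeded")

# Locally: no managed identity endpoint, so the environment token is used.
local = ChainedCredential(EnvironmentCredential({"ACCESS_TOKEN": "env-token"}),
                          ManagedIdentityCredential(available=False))
print(local.get_token())  # env-token

# Deployed: empty environment, so managed identity takes over.
deployed = ChainedCredential(EnvironmentCredential({}),
                             ManagedIdentityCredential(available=True))
print(deployed.get_token())  # mi-token
```

This is why the same application code works on a developer workstation and in Azure without environment-specific branches: only the set of credentials that can succeed changes.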
Disable key-based authentication in the resource
Disable key-based authentication when you implement Microsoft Entra ID and fully address compatibility or fallback concerns in all applications that consume the service. Change the disableLocalAuth property to disable key-based authentication.
For more information about how to disable local authentication when you’re using a Bicep or ARM template, see How to disable local authentication.
modules/ai-services-template.bicep
// Source: ai-services-template.bicep (not available)
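The template source isn't included in this article. The relevant portion of such an account template might look like the following sketch; the parameter names and defaults are assumptions, while disableLocalAuth is the documented property on Microsoft.CognitiveServices accounts:

```bicep
// Hypothetical sketch of modules/ai-services-template.bicep (assumed names).
param accountName string
param location string = resourceGroup().location
param allowKeys bool = false

resource account 'Microsoft.CognitiveServices/accounts@2023-05-01' = {
  name: accountName
  location: location
  kind: 'AIServices'
  sku: { name: 'S0' }
  properties: {
    customSubDomainName: accountName
    // Key-based (local) authentication is disabled when allowKeys is false.
    disableLocalAuth: !allowKeys
  }
}
```

With disableLocalAuth set to true, requests that present an API key are rejected and only Microsoft Entra ID tokens are accepted.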
Understand roles in the context of resource in Azure
Microsoft Entra ID uses role-based access control (RBAC) for authorization, which controls what actions users can perform on Azure resources. Roles are central to managing access to cloud resources. A role is a collection of permissions that define what actions can be performed on specific Azure resources. By assigning roles to users, groups, service principals, or managed identities—collectively known as security principals—you control their access within your Azure environment to specific resources.
When you assign a role, you specify the security principal, role definition, and scope. This combination is known as a role assignment. Foundry Models is a capability of Foundry Tools resources; therefore, roles assigned to that particular resource control access for inference.
There are two types of access to the resources:
-
Administration access: Actions related to the administration of the resource. These actions usually change the resource state and its configuration. In Azure, these operations are control-plane operations that you can execute using the Azure portal, Azure CLI, or infrastructure as code. Examples include creating new model deployments, changing content filtering configurations, changing the version of the model served, or changing the SKU of a deployment.
-
Developer access: Actions related to consuming the resources, such as invoking the chat completions API. However, the user can’t change the resource state and its configuration.
In Azure, administration operations are always performed with Microsoft Entra ID. Roles like Cognitive Services Contributor allow you to perform those operations. Developer operations can be performed using either access keys or Microsoft Entra ID. Roles like Cognitive Services User allow you to perform them.
Having administration access to a resource doesn't grant developer access to it. You still need to grant explicit access by assigning roles. This is analogous to how database servers work: having administrator access to the database server doesn't mean you can read the data inside a database.
Troubleshooting
Before you troubleshoot, verify that you have the right permissions assigned:
- Go to the Azure portal and locate the Microsoft Foundry resource that you're using.
- On the left pane, select Access control (IAM), and then select Check access.
- Type the name of the user or identity you're using to connect to the service.
- Verify that the role Cognitive Services User is listed, or a role that contains the required permissions, as explained in the Prerequisites section. Roles like Owner or Contributor don't provide access via Microsoft Entra ID.
- If the role isn’t listed, follow the steps in this guide before you continue.
The following table contains multiple scenarios that can help you troubleshoot Microsoft Entra ID:
| Error / Scenario | Root cause | Solution |
|---|---|---|
| You're using an SDK | Known issues | Before you troubleshoot further, install the latest version of the software you're using to connect to the service. Authentication bugs might already be fixed in a newer version. |
| 401 Principal does not have access to API/Operation | The request authenticates correctly, but the user principal doesn't have the required permissions to use the inference endpoint. | Ensure that you: 1) Assigned the role Cognitive Services User to your principal on the Foundry resource. Notice that Cognitive Services OpenAI User grants access only to OpenAI models, and Owner or Contributor don't grant inference access either. 2) Waited at least 5 minutes before making the first call. |
| 401 HTTP/1.1 401 PermissionDenied | The request authenticates correctly, but the user principal doesn't have the required permissions to use the inference endpoint. | Assign the role Cognitive Services User to your principal on the Foundry resource. Roles like Administrator or Contributor don't grant inference access. Wait at least 5 minutes before making the first call. |
| You're using REST API calls and you get 401 Unauthorized. The access token is missing, invalid, has the wrong audience, or has expired. | The request fails to authenticate with Microsoft Entra ID. | Ensure the Authorization header contains a valid token with the scope https://cognitiveservices.azure.com/.default. |
Next step