How to configure guardrails and controls in Microsoft Foundry
Create, configure, and manage guardrails and controls for your model deployments and agents in Microsoft Foundry. This article covers creating guardrails through the Foundry portal and the REST API. For background on guardrails concepts, risks, and intervention points, see Guardrails and controls overview.Prerequisites
- An Azure subscription. Create one for free.
- A Microsoft Foundry project.
- At least one model deployment in your project.
- Azure AI Account Owner role or higher on the Azure AI resource.
Create a guardrail in Foundry
- Go to Foundry and navigate to your project.
- Select Build in the top right menu.
- Select the Guardrails page from the left navigation.
- Select Create Guardrail in the top right. The guardrail wizard opens with Step 1: Add Controls.
Add controls to a guardrail
Default controls are displayed in the right pane when you create a new guardrail.- Select a risk from the dropdown menu.
- Choose intervention points and actions: Recommended intervention points and actions for that risk are shown. Select one or many intervention points and one action to configure your control.
Some intervention points will not be available for a risk if that is inapplicable at that intervention point. For example, by definition, user input attacks are malicious content added to the user input. So, that risk can be scanned only at that intervention point.
- Select Add control. The control is added to the table on the right.
Delete controls from a guardrail
To delete a control:- Select the control you want to remove.
- Select Delete.
Some controls can only be deleted by Managed Customers who are approved for modified content filtering. Learn more about modified content filtering.
Edit controls in a guardrail
There are two ways to edit a control: deleting it and adding a new one, or overriding an existing control. The latter is the only way to edit a control that can’t be deleted, such as Violence, Hate, Sexual, and Self-harm controls on user inputs and outputs. To edit a control by overriding it:- Select the same risk of the control that needs to be edited.
- Change the configuration of the control or the intervention points and action as desired.
- Select Add control.
- A pop-up asks for confirmation to override the existing control. Select Confirm.
Assign a guardrail to agents and models
After adding, editing, and/or deleting controls as desired:- Select Next to proceed to Step 2: assigning a guardrail to agents and/or models.
- Select Add agents and/or Add models to view a list of agents and models in this project.
- Select models or agents. Previously assigned agents and models can also be deselected to remove this guardrail and re-assign the Microsoft Default.
- Select Save to confirm. A success notification appears.
Review and name guardrail
- Select Next to proceed to Step 3: Review.
- Review the controls added to this guardrail and the models and/or agent it’s assigned to.
- Name the guardrail, or leave the automatically assigned name.
- Select Create. The guardrail appears in the list on the Guardrails page and applies to the selected models and agents.
Edit an existing guardrail
- Select Build in the top right menu.
- Select the Guardrails page from the left navigation.
- Find the guardrail in the list of guardrails. Select its name directly, or select its row and then select Edit in the detail pane.
Microsoft Default guardrails, such as Default.V2, can’t be edited.
- Follow the same instructions as in Create a guardrail to edit, add, or remove controls; assign or re-assign agents and/or models; and rename the guardrail, as needed.
Assign a guardrail
There are two paths to assigning a guardrail to a model or agent:Option 1: Edit the guardrail
- Select Build in the top right menu.
- Select the Guardrails page from the left navigation.
- Find the guardrail in the list of guardrails. Select its name directly, or select its row and then select Edit in the detail pane.
- Select Next on Step 1: Add Controls to skip forward to the assignment step.
- Select Add agents or Add models and select and deselect models and/or agents as needed to update the guardrail’s assignment.
Option 2: Edit the model or agent
- Select Build in the top right menu.
- Select Agents or Models in the left navigation.
- Select the agent or model that you want to update.
- A section for Guardrails appears in the left panel of the Agent Playground or Chat Playground.
- Select Manage at the bottom of the Guardrails section.
- Select Assign a new guardrail.
- Browse guardrails available in this project.
- Select the desired guardrail from the list on the left of the pop-up.
- Select Assign to update the agent or model’s guardrail assignment. The new guardrail takes effect immediately.
Delete a guardrail
To delete a guardrail with no models or agents assigned
- Select Build in the top right menu.
- Select the Guardrails page from the left navigation.
- Find the guardrail in the list of guardrails and select its row.
- A panel appears on the right. Select Delete at the top of the panel.
To delete a guardrail with assigned models or agents
- Reassign the models and/or agents to a different guardrail, or remove them from this guardrail to reassign them to the Microsoft Default Guardrails.
- Follow the instructions to delete a guardrail with no models or agents assigned.
Test guardrails
To test the behavior of a particular guardrail:- Select Build in the top right menu.
- Select the Guardrails page from the left navigation.
- Find the guardrail in the list of guardrails and select its row.
- A panel appears on the right. Select Try in Playground at the top of the panel.
If that button doesn’t appear, assign this guardrail to a model or agent first. Assigning a guardrail immediately changes the safety and security behavior, so use a non-production model or agent for testing.
- In the playground, send queries to the model or agent.
- When a control that has “Annotate and block” as its action is triggered, a message appears in the chat with details on which risk was detected and at which intervention point.
Configure guardrails with the REST API
In the Azure AI Services REST API, a guardrail is represented as a RAI policy — a resource-level object in Azure Resource Manager.Create or update a guardrail
Use the RAI Policies - Create Or Update operation to create or update a guardrail. Specify the controls (content filter rules) in the request body, including the risk category, severity level, and whether to block or annotate.Assign a guardrail to a model deployment
Set theraiPolicyName property on a deployment to assign a guardrail. Use the Deployments - Create Or Update operation and include the guardrail name in the deployment properties.
Work with annotations
Foundry provides annotations to help you understand the guardrail results for your requests. Annotations can be enabled even for filters and severity levels that have been disabled from blocking content.Standard guardrail annotations
When annotations are enabled, the following information is returned via the API for the categories hate, sexual, violence, and self-harm:- Risk category (hate, sexual, violence, self_harm)
- Severity level (safe, low, medium, or high) within each content category
- Filtering status (true or false)
Optional model annotations
Optional model annotations can be set to annotate mode (returns information when content is flagged, but not filtered) or filter mode (returns information when content is flagged and filtered).| Model | Output |
|---|---|
| User prompt attack | - detected (true or false) - filtered (true or false) |
| Indirect attacks | - detected (true or false) - filtered (true or false) |
| Protected material text | - detected (true or false) - filtered (true or false) |
| Protected material code | - detected (true or false) - filtered (true or false) - Example citation of public GitHub repository where code snippet was found - The license of the repository |
| Personally identifiable information (PII) | - detected (true or false) - filtered (true or false) - redacted (true or false) |
| Groundedness | - detected (true or false) - filtered (true or false, with details) - (Annotate mode only) details:(completion_end_offset, completion_start_offset) |
When displaying code in your application, we strongly recommend that the application also displays the example citation from the annotations. Compliance with the cited license may also be required for Customer Copyright Commitment coverage.
API version compatibility
The following table shows annotation mode availability in each API version:| Filter category | 2024-10-01-preview | 2024-02-01 GA | 2024-04-01-preview | 2023-10-01-preview | 2023-06-01-preview | 2025-01-01-preview |
|---|---|---|---|---|---|---|
| Hate | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Violence | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Sexual | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Self-harm | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Prompt Shield for user prompt attacks | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Prompt Shield for indirect attacks | ❌ | ❌ | ✅ | ❌ | ❌ | ✅ |
| Protected material text | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Protected material code | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Personally identifiable information (PII) | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ |
| Profanity blocklist | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Custom blocklist | ✅ | ❌ | ✅ | ✅ | ✅ | ✅ |
| Groundedness¹ | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ |
Code examples
The following code snippets show how to view guardrail annotations in different programming languages.Example output
Document embedding in prompts
Guardrails perform better when they can differentiate between the various elements of your prompt, like system input, user input, and the AI assistant’s output. For enhanced detection capabilities, prompts should be formatted according to the following recommended methods.Default behavior in Chat Completions API
The Chat Completions API is structured by definition. Inputs consist of a list of messages, each with an assigned role. The safety system parses this structured format and applies the following behavior: On the latest “user” content, the following categories of risks are detected:- Hate
- Sexual
- Violence
- Self-Harm
- Prompt shields (optional)
Embedding documents in your prompt
In addition to detection on last user content, guardrails also support the detection of specific risks inside context documents via Prompt Shields – Indirect Prompt Attack Detection and Groundedness detection. You should identify the parts of the input that are a document (for example, retrieved website, email, etc.) with the following document delimiter:- Indirect attacks (optional)
- Groundedness detection
JSON escaping
When you tag unvetted documents for detection, the document content should be JSON-escaped to ensure successful parsing by the Azure AI safety system. For example, see the following email body:Specify guardrail configuration at request time
In addition to the model deployment-level guardrail, you can specify your custom guardrail at request time for each API call using a request header.Guardrails specification at request time is not available for image input (chat with images) scenarios. In those cases the default guardrail is used.
Best practices
Follow these practices when configuring guardrails:- Test before production: Use the playground to test guardrail behavior before applying changes to production deployments.
- Start restrictive, then relax: Begin with higher severity thresholds and adjust downward only after confirming acceptable behavior.
- Red-team your configuration: Run red-team testing, stress-testing, and analysis to identify potential harms specific to your model, application, and deployment scenario.
- Measure after changes: After implementing or updating guardrails, repeat your measurement process to verify effectiveness.
Troubleshooting
| Issue | Resolution |
|---|---|
| Can’t delete a guardrail | Reassign or remove all models and agents from the guardrail first. See Delete a guardrail. |
| Can’t edit or delete a control | Some controls (Violence, Hate, Sexual, Self-harm) on user inputs and outputs can only be overridden, not deleted. See Edit controls. |
| Guardrail changes not taking effect | Verify the guardrail is assigned to the correct model or agent. For agents, the agent’s guardrail overrides the model’s guardrail. |
| ”InvalidContentFilterPolicy” error | The guardrail name in the x-policy-id header doesn’t match an existing guardrail. Verify the name on the Guardrails page. |
| Can’t edit Default.V2 guardrail | Microsoft Default guardrails can’t be modified. Create a custom guardrail instead. |