
Monitor agents with the Agent Monitoring Dashboard (preview)

Items marked (preview) in this article are currently in public preview. This preview is provided without a service-level agreement, and we don’t recommend it for production workloads. Certain features might not be supported or might have constrained capabilities. For more information, see Supplemental Terms of Use for Microsoft Azure Previews.
Use the Agent Monitoring Dashboard in Microsoft Foundry to track operational metrics and evaluation results for your agents. This dashboard helps you understand token usage, latency, success rates, and evaluation outcomes for production traffic. This article covers two approaches: viewing metrics in the Foundry portal and setting up continuous evaluation programmatically with the Python SDK.

Prerequisites

  • A Foundry project with at least one agent.
  • An Application Insights resource connected to your project.
  • Azure role-based access control (RBAC) access to the Application Insights resource. For log-based views, you also need access to the associated Log Analytics workspace: assign the Log Analytics Reader role. To verify access, open the Application Insights resource in the Azure portal, select Access control (IAM), and confirm that your account has an appropriate role. You can also check or assign roles from the command line, as shown after this list.
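If you prefer the command line, you can check or assign these roles with the Azure CLI. The following is a minimal sketch; the assignee and resource IDs are placeholders that you replace with your own values.

# List existing role assignments for your account on the Application Insights resource.
az role assignment list --assignee "<your-user-principal-name>" --scope "<application-insights-resource-id>" --output table

# Grant log access on the associated Log Analytics workspace.
az role assignment create --assignee "<your-user-principal-name>" --role "Log Analytics Reader" --scope "<log-analytics-workspace-resource-id>"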

Connect Application Insights

The Agent Monitoring Dashboard reads telemetry from the Application Insights resource connected to your Foundry project. If you haven’t connected Application Insights yet, follow the tracing setup steps and then return to this article.
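As a quick reference, tracing setup typically exports OpenTelemetry data from your agent application to the connected Application Insights resource. The following is a minimal sketch, assuming you installed the azure-monitor-opentelemetry package and stored the Application Insights connection string in an APPLICATIONINSIGHTS_CONNECTION_STRING environment variable; follow the tracing article for the authoritative steps.

import os
from azure.monitor.opentelemetry import configure_azure_monitor

# Route OpenTelemetry traces, metrics, and logs from this application to the
# connected Application Insights resource. The environment variable name is an
# assumption; copy the connection string from your Application Insights resource.
configure_azure_monitor(connection_string=os.environ["APPLICATIONINSIGHTS_CONNECTION_STRING"])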

View agent metrics

To view metrics for an agent in the Foundry portal:
  1. Sign in to Microsoft Foundry. Make sure the New Foundry toggle is on. These steps refer to Foundry (new).
  2. Navigate to the Build page using the top navigation and select the agent you’d like to view data for.
  3. Select the Monitor tab to view operational, evaluation, and red-teaming data for your agent.
Screenshot of the Agent Monitoring Dashboard in Foundry showing summary cards at the top with high-level metrics and charts below displaying evaluation scores, agent run success rates, and token usage over time.
The dashboard is designed for quick insights and deep analysis of your agent’s performance. It consists of two main areas:
  • Summary cards at the top for high-level metrics.
  • Charts and graphs below for granular details. These visualizations reflect data for the selected time range.

Understand the dashboard metrics

Use these definitions to interpret the dashboard:
  • Token usage: Token counts for agent traffic in the selected time range. High token usage might indicate verbose prompts or responses that could benefit from optimization.
  • Latency: Response time for agent runs. Latency above 10 seconds might indicate model throttling, complex tool calls, or network issues.
  • Run success rate: The percentage of runs that complete successfully. A rate below 95% warrants investigation into failed runs.
  • Evaluation metrics: Scores produced by evaluators that run on sampled agent outputs. Scores vary by evaluator; review individual evaluator documentation for interpretation guidance.
  • Red teaming results: Outcomes from scheduled red team scans, if enabled. Failed scans indicate potential security risks that require remediation.
Monitoring data is stored in the connected Application Insights resource. Retention and billing follow your Application Insights configuration.
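Because the data lives in Application Insights, you can also query it directly, for example with the azure-monitor-query package. The following is a hypothetical sketch: it assumes your agent telemetry follows the OpenTelemetry generative AI semantic conventions and lands in the AppDependencies table of the Log Analytics workspace behind the connected Application Insights resource, and it uses a placeholder LOG_ANALYTICS_WORKSPACE_ID environment variable. Table and attribute names can differ in your environment.

import os
from datetime import timedelta
from azure.identity import DefaultAzureCredential
from azure.monitor.query import LogsQueryClient

# Placeholder: the workspace ID of the Log Analytics workspace backing
# your Application Insights resource.
workspace_id = os.environ["LOG_ANALYTICS_WORKSPACE_ID"]

# Assumed attribute names based on the OpenTelemetry generative AI semantic conventions.
query = """
AppDependencies
| where Properties has "gen_ai.usage.input_tokens"
| summarize total_input_tokens = sum(toint(Properties["gen_ai.usage.input_tokens"])),
            total_output_tokens = sum(toint(Properties["gen_ai.usage.output_tokens"]))
            by bin(TimeGenerated, 1h)
| order by TimeGenerated desc
"""

with DefaultAzureCredential() as credential:
    client = LogsQueryClient(credential)
    response = client.query_workspace(workspace_id, query, timespan=timedelta(days=1))
    for table in response.tables:
        for row in table.rows:
            print(row)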

Configure settings

Use the Monitor settings panel to configure telemetry, evaluations, and security checks for your agents. These settings control which charts the dashboard shows and which evaluations run.
Screenshot showing the Monitor Settings panel in Foundry with options for operational metrics, continuous evaluation, scheduled evaluations, red team scans, and alerts configuration.
To access Monitor settings, select the gear icon on the Monitor tab. The following table describes each monitoring feature:
Setting | Purpose | Configuration options
Continuous evaluation | Runs evaluations on sampled agent responses. | Enable or disable; add evaluators; set the sample rate.
Scheduled evaluations | Runs evaluations on a schedule to validate performance against benchmarks. | Enable or disable; select an evaluation template and run; set a schedule.
Red team scans | Runs adversarial tests to detect risks such as data leakage or prohibited actions. | Enable or disable; select an evaluation template and run; set a schedule.
Alerts | Detects performance anomalies, evaluation failures, and security risks. | Configure alerts for latency, token usage, evaluation scores, or red team findings.

Set up continuous evaluation (Python SDK)

Use the Python SDK to set up continuous evaluation rules for agent responses. This section requires Python 3.9 or later. Install the required packages (azure-identity is needed for DefaultAzureCredential in the samples that follow):
pip install "azure-ai-projects>=2.0.0b1" azure-identity python-dotenv
Set these environment variables with your own values:
  • AZURE_AI_PROJECT_ENDPOINT: The Foundry project endpoint, as found on the project overview page in the Foundry portal.
  • AZURE_AI_AGENT_NAME: The name of the agent to use for evaluation.
  • AZURE_AI_MODEL_DEPLOYMENT_NAME: The deployment name of the model.
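For example, a .env file next to your script could look like the following. The values shown are placeholders, and the endpoint format is illustrative; copy the exact endpoint from your project’s overview page. load_dotenv() in the sample code reads this file.

AZURE_AI_PROJECT_ENDPOINT="https://<your-resource-name>.services.ai.azure.com/api/projects/<your-project-name>"
AZURE_AI_AGENT_NAME="my-monitored-agent"
AZURE_AI_MODEL_DEPLOYMENT_NAME="<your-model-deployment-name>"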

Assign permissions for continuous evaluation

To enable continuous evaluation rules, assign the project managed identity the Azure AI User role.
  1. In the Azure portal, open the resource for your Foundry project.
  2. Select Access control (IAM), and then select Add.
  3. Create a role assignment for Azure AI User.
  4. For the member, select your Foundry project’s managed identity.
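If you prefer the command line, you can make the same assignment with the Azure CLI. This is a sketch with placeholders: use the object ID of your Foundry project’s managed identity as the assignee and the project resource ID as the scope.

az role assignment create --assignee-object-id "<project-managed-identity-object-id>" --assignee-principal-type ServicePrincipal --role "Azure AI User" --scope "<foundry-project-resource-id>"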

Create an agent

import os
from dotenv import load_dotenv
from azure.identity import DefaultAzureCredential
from azure.ai.projects import AIProjectClient
from azure.ai.projects.models import (
    PromptAgentDefinition,
)

load_dotenv()

endpoint = os.environ["AZURE_AI_PROJECT_ENDPOINT"]

with (
    DefaultAzureCredential() as credential,
    AIProjectClient(endpoint=endpoint, credential=credential) as project_client,
    project_client.get_openai_client() as openai_client,
):
    agent = project_client.agents.create_version(
        agent_name=os.environ["AZURE_AI_AGENT_NAME"],
        definition=PromptAgentDefinition(
            model=os.environ["AZURE_AI_MODEL_DEPLOYMENT_NAME"],
            instructions="You are a helpful assistant that answers general questions",
        ),
    )
    print(f"Agent created (id: {agent.id}, name: {agent.name}, version: {agent.version})")
References: AIProjectClient, DefaultAzureCredential

Create a continuous evaluation rule

Define the evaluation and the rule that runs when a response completes. To learn more about supported evaluators, see Built-in evaluators.
from azure.ai.projects.models import (
    EvaluationRule,
    ContinuousEvaluationRuleAction,
    EvaluationRuleFilter,
    EvaluationRuleEventType,
)

data_source_config = {"type": "azure_ai_source", "scenario": "responses"}
testing_criteria = [
    {"type": "azure_ai_evaluator", "name": "violence_detection", "evaluator_name": "builtin.violence"}
]
eval_object = openai_client.evals.create(
    name="Continuous Evaluation",
    data_source_config=data_source_config,  # type: ignore
    testing_criteria=testing_criteria,  # type: ignore
)
print(f"Evaluation created (id: {eval_object.id}, name: {eval_object.name})")

continuous_eval_rule = project_client.evaluation_rules.create_or_update(
    id="my-continuous-eval-rule",
    evaluation_rule=EvaluationRule(
        display_name="My Continuous Eval Rule",
        description="An eval rule that runs on agent response completions",
        action=ContinuousEvaluationRuleAction(eval_id=eval_object.id, max_hourly_runs=100),
        event_type=EvaluationRuleEventType.RESPONSE_COMPLETED,
        filter=EvaluationRuleFilter(agent_name=agent.name),
        enabled=True,
    ),
)
print(
    f"Continuous Evaluation Rule created (id: {continuous_eval_rule.id}, name: {continuous_eval_rule.display_name})"
)
References: EvaluationRuleEventType, EvaluationRule
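If you later want to pause continuous evaluation from code, one option is to call the same create_or_update operation again with enabled set to False. This is a sketch that reuses the call shown above; it assumes resubmitting the full rule definition is acceptable in your workflow.

project_client.evaluation_rules.create_or_update(
    id="my-continuous-eval-rule",
    evaluation_rule=EvaluationRule(
        display_name="My Continuous Eval Rule",
        description="An eval rule that runs on agent response completions",
        action=ContinuousEvaluationRuleAction(eval_id=eval_object.id, max_hourly_runs=100),
        event_type=EvaluationRuleEventType.RESPONSE_COMPLETED,
        filter=EvaluationRuleFilter(agent_name=agent.name),
        enabled=False,  # Pause the rule; set back to True to resume.
    ),
)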

Verify continuous evaluation results

  1. Generate agent traffic (for example, run your app or test the agent in the portal).
  2. In the Foundry portal, open the agent and select Monitor.
  3. Review evaluation-related charts for the selected time range.
If the setup is successful, the evaluation-related charts display scores for your selected time range, and the evaluation runs list shows entries with status Completed. You can also list recent evaluation runs and open the report URL:
eval_run_list = openai_client.evals.runs.list(
    eval_id=eval_object.id,
    order="desc",
    limit=10,
)

if len(eval_run_list.data) > 0 and eval_run_list.data[0].report_url:
    print(f"Report URL: {eval_run_list.data[0].report_url}")

Full sample code

To view the full sample code, see:

Troubleshooting

Issue | Cause | Resolution
Dashboard charts are empty | No recent traffic, a time range that excludes data, or ingestion delay | Generate new agent traffic, expand the time range, and refresh after a few minutes.
You see authorization errors | Missing RBAC permissions on Application Insights or Log Analytics | Confirm access in Access control (IAM) for the connected resources. For log access, assign the Log Analytics Reader role.
Continuous evaluation results don’t appear | Continuous evaluation isn’t enabled or rule creation failed | Confirm that your rule is enabled and that agent traffic is flowing. If you use the Python SDK setup, confirm that the project managed identity has the Azure AI User role.
Evaluation runs are skipped | Hourly run limit reached | Increase max_hourly_runs in the evaluation rule configuration or wait for the next hour. The default limit is 100 runs per hour.