Optimize model cost and performance
Prerequisites
Detect cost increases
Investigate high-cost agents
Switch to a cost-efficient model
Evaluate model differences
Update your agent
Track improvements
Related content

Optimize model cost and performance

When your model or agent costs start increasing, use the Ask AI agent to quickly diagnose issues, take action, and verify improvements. The Ask AI agent is a built-in chat assistant. You can access it from the toolbar in the Microsoft Foundry portal. This article walks you through a recommended workflow, from identifying cost spikes to switching models and validating performance improvements. All these activities happen within the Foundry portal.

Prerequisites

An Azure account with an active subscription. If you don’t have one, create a free Azure account, which includes a free trial subscription.
A Foundry project. If you don’t have one, create a project.
At least one deployed or published agent with cost data. For meaningful trend analysis, you need a minimum of seven days of usage data.
Access to the Ask AI agent.
An evaluation dataset configured for your project. To set one up, see Evaluate your generative AI application locally with the Azure AI Evaluation SDK.

Detect cost increases

Start by opening the Ask AI agent from the toolbar. Or, go to Operate > Overview to use one of the prebuilt prompts that are specific to agent optimization and performance. Ask the assistant to provide a summary of your metrics and cost data from the Foundry Control Plane dashboard. You can select a predefined prompt on the Overview pane or type your own question, such as:

“Summarize my recent cost trend.”
“Which agents contributed most to my cost increase?”

The Ask AI agent generates a summary that highlights key cost drivers, such as high token usage, longer completion length, or frequent evaluation runs. The summary includes annotated links to the dashboard charts for deeper inspection.

Investigate high-cost agents

After you review the summary, you can explore detailed insights for specific agents by asking:

“Show me cost and performance details for [agent name].”
“Break down cost by model or deployment for this agent.”

You can also select Assets on the left pane. Then, select View Agent details to view the Assets pane. There, you can compare your agents with cost and token usage, and see which agent costs the most.

Switch to a cost-efficient model

When you identify a model as a cost driver, ask the Ask AI agent:

“Recommend a cheaper model with similar performance.”
“Switch this agent’s deployment to a more cost-efficient model.”

The Ask AI agent:

Recommends alternative models available in the model catalog.
Provides performance and cost comparisons.
Upon confirmation, provides a link to the model deployment page.

Follow the instructions in the model deployment page, or continue to chat with the Ask AI agent to complete the model deployment step.

Evaluate model differences

After you switch models, you can ask the Ask AI agent to run an evaluation that compares the old and new models:

“Evaluate performance and cost difference between the old and new model.”

The Ask AI agent provides guidance on how to create an evaluation and gives you a link to the evaluation creation wizard. You can follow the instructions step by step to create two evaluation runs on the two models. View the results after both evaluation runs finish.

Update your agent

When you confirm that the new model performs better than the current model, go to Agent Playground to update the model and save a new version.

Track improvements

Later, return to the Ask AI agent and ask:

“Show me the summary on the latest data for cost.”

The Ask AI agent retrieves the latest metrics from your continuous evaluation and summarizes improvements in cost and performance trends. This feature helps you continuously monitor efficiency and return on investment.

Quickstart: Create a Guardrail Policy

Observability in Generative AI

What is Microsoft Foundry (new)?

Get started

Agent development

Agent tools & integration

Model capabilities

Fine-tuning

Manage agents, models, & tools

Observability, evaluation, & tracing

Developer experience

API & SDK

Responsible AI

Best practices

Setup & configure

Security & governance

Operate & support

Optimize Cost and Performance in Microsoft Foundry

Optimize model cost and performance

Prerequisites

Detect cost increases

Investigate high-cost agents

Switch to a cost-efficient model

Evaluate model differences

Update your agent

Track improvements

What is Microsoft Foundry (new)?

Get started

Agent development

Agent tools & integration

Model capabilities

Fine-tuning

Manage agents, models, & tools

Observability, evaluation, & tracing

Developer experience

API & SDK

Responsible AI

Best practices

Setup & configure

Security & governance

Operate & support

​Optimize model cost and performance

​Prerequisites

​Detect cost increases

​Investigate high-cost agents

​Switch to a cost-efficient model

​Evaluate model differences

​Update your agent

​Track improvements

​Related content

Optimize model cost and performance

Prerequisites

Detect cost increases

Investigate high-cost agents

Switch to a cost-efficient model

Evaluate model differences

Update your agent

Track improvements

Related content