In this quickstart, you call the Responses API on a Foundry project endpoint from your own code to build an ephemeral agent: an agent whose definition (instructions, tools, model) lives in your application code instead of as a persisted resource in Foundry Agent Service. Each call constructs the agent in your process and invokes the Responses API for model inference and tool orchestration.
This pattern fits developers, ISVs, and digital natives who want their agent definitions to ship and version with the rest of their application code, rather than living in an out-of-band resource that someone has to keep in sync with the app. Unlike with prompt agents, there’s no agent resource to create, update, or delete in Foundry; instead of managing an agent’s lifecycle, you call the Responses API directly.
The Responses API is the single entry point for models and tools in Foundry. You can call it on two different endpoints:
  • Foundry project endpoint (this quickstart, recommended) — full Foundry support. Exposes Foundry models from the catalog and platform tools (file search, code interpreter, memory, web search, MCP, SharePoint, WorkIQ, Fabric IQ, and more) through a single project-scoped API surface, reached at {project_endpoint}/openai/v1/responses.
  • Azure OpenAI endpoint — best latency and maximum compatibility with existing OpenAI clients. Use this when you only need OpenAI models and standard OpenAI tools and don’t need Foundry-specific capabilities.
The recommended path is the Agent Framework, which handles authentication, tool wiring, and message orchestration for you. In Python this is FoundryChatClient; in .NET it’s AIProjectClient.AsAIAgent(...). The OpenAI SDK also works against this endpoint and is covered as an alternative in the “Use the OpenAI SDK directly” section.
If you don’t have an Azure subscription, create a free account.

When to use the ephemeral agent pattern

Use this pattern when you’re hosting agent code outside of Foundry — potentially embedded in your own application — but want to access Foundry agent features like models and platform tools. The ephemeral pattern and hosted agents are additive, not alternatives. The same Agent Framework agent code can also be packaged as a hosted agent and exposed through the Foundry Agents API — useful when you want a Foundry-managed endpoint that other apps, services, or agents can call. You can do both from one codebase: run the agent in-process where it ships with your app, and publish the same definition as a hosted agent where other callers need it.

What the Foundry project endpoint adds on top of the OpenAI Responses API

The Responses API on a Foundry project endpoint is compatible with the OpenAI Responses API, so existing OpenAI clients work against it with minimal changes. The Foundry project endpoint adds the following on top:
  • Project-scoped data: Files, vector stores, and other data are stored at the project level instead of the resource level, which gives per-project data isolation and lets you use bring-your-own resources through standard agent setup.
  • Foundry Models in addition to OpenAI: Foundry Models sold directly by Azure (not just OpenAI models) are available through the same API.
  • Foundry-specific tools: Platform tools like SharePoint, WorkIQ, and Fabric IQ are available alongside the standard OpenAI tools.
  • On-behalf-of (OBO) authentication for tools: Tools can call downstream services as the signed-in user, not just as the application identity.
  • Project-level observability and governance: Calls made through the project endpoint flow through the project’s tracing, monitoring, content filters, and identity configuration without extra wiring (see Observability and enterprise capabilities).
Calling the project endpoint — not a resource-level OpenAI endpoint — is what unlocks these project-scoped capabilities.

Prerequisites

Set environment variables

Store your project endpoint and deployed model name as environment variables. The samples below read these values from the environment.
FOUNDRY_PROJECT_ENDPOINT=<endpoint copied from welcome screen>
FOUNDRY_MODEL=<your deployed model name>
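
As a minimal sketch, the values can be read and validated in Python like this. The names `FoundryConfig` and `load_config` are hypothetical helpers, not part of any SDK; only the two variable names and the `/openai/v1/responses` path come from this quickstart.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class FoundryConfig:
    """Hypothetical container for the two settings this quickstart uses."""
    project_endpoint: str
    model: str

    @property
    def responses_url(self) -> str:
        # The quickstart gives the path as {project_endpoint}/openai/v1/responses.
        return f"{self.project_endpoint}/openai/v1/responses"


def load_config(env=None) -> FoundryConfig:
    """Read both settings from a mapping (os.environ by default); fail early if one is missing."""
    env = os.environ if env is None else env
    required = ("FOUNDRY_PROJECT_ENDPOINT", "FOUNDRY_MODEL")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    # Strip a trailing slash so the joined URL has no double slash.
    endpoint = env["FOUNDRY_PROJECT_ENDPOINT"].rstrip("/")
    return FoundryConfig(project_endpoint=endpoint, model=env["FOUNDRY_MODEL"])
```

Calling `load_config()` with no arguments reads from the process environment. Passing a plain dictionary instead makes the helper easy to exercise in tests.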

Install packages

Install the Agent Framework package with the Foundry provider:

Create an agent

Create an ephemeral agent that runs locally in your process and calls the Responses API for model inference and tool orchestration.
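
To illustrate what "ephemeral" means in practice, here is a sketch of an agent definition held entirely in application code and turned into a Responses API request body on each call. This is not the Agent Framework API: `EphemeralAgent` and `build_request` are hypothetical names. The field names `model`, `instructions`, `input`, and `tools` follow the OpenAI Responses API request shape.

```python
from dataclasses import dataclass, field


@dataclass
class EphemeralAgent:
    """Hypothetical in-process agent definition; nothing here is stored in Foundry."""
    model: str
    instructions: str
    tools: list = field(default_factory=list)

    def build_request(self, user_input: str) -> dict:
        """Build the JSON body for one Responses API call from the in-code definition."""
        body = {
            "model": self.model,
            "instructions": self.instructions,
            "input": user_input,
        }
        if self.tools:
            # Copy the list so callers cannot mutate the agent definition by accident.
            body["tools"] = list(self.tools)
        return body
```

Because the definition is ordinary code, changing the instructions or tool list is a normal code change that ships and versions with the application. Sending the request also requires authentication, which the Agent Framework handles for you.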

Add function tools

Define local function tools and pass them to the agent. The agent automatically calls these tools when needed during a conversation.
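
The Agent Framework runs the tool-calling loop for you. The sketch below shows roughly what happens underneath, assuming the OpenAI Responses API item shapes (`function_call` items in the output, answered with `function_call_output` items). The tool `get_weather`, the registry `LOCAL_TOOLS`, and the helper `run_function_call` are illustrative names, and the tool returns canned data.

```python
import json


def get_weather(city: str) -> str:
    """Example local tool; returns canned data instead of calling a real service."""
    return f"Sunny in {city}"


# Function tool declaration in the Responses API shape, with a JSON Schema for arguments.
WEATHER_TOOL = {
    "type": "function",
    "name": "get_weather",
    "description": "Get the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
        "additionalProperties": False,
    },
}

# Maps tool names the model may request to the local Python callables.
LOCAL_TOOLS = {"get_weather": get_weather}


def run_function_call(item: dict) -> dict:
    """Execute one function_call item locally and build the item that returns its result."""
    func = LOCAL_TOOLS[item["name"]]
    args = json.loads(item["arguments"])  # arguments arrive as a JSON-encoded string
    return {
        "type": "function_call_output",
        "call_id": item["call_id"],  # ties the result back to the model's request
        "output": str(func(**args)),
    }
```

The result item is sent back as input on the next Responses API call so the model can use the tool output in its final answer.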

Use the web search tool

The Responses API on the Foundry project endpoint provides built-in hosted tools like web search. Give your agent access to web search without any local implementation.
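
At the request level, a hosted tool is only a declaration in the `tools` array; the service runs it, so there is no local function to register. The sketch below uses a hypothetical helper, `with_hosted_tools`. The type string `web_search` is an assumption taken from the OpenAI Responses API (some API versions use `web_search_preview`), so check the Foundry tool reference for the value your project expects.

```python
def with_hosted_tools(request: dict, *tool_types: str) -> dict:
    """Return a copy of a Responses API request with hosted tools appended by type name."""
    updated = dict(request)
    existing = list(request.get("tools", []))
    # Hosted tools need only a type; the service executes them, not your process.
    updated["tools"] = existing + [{"type": tool_type} for tool_type in tool_types]
    return updated
```

For example, `with_hosted_tools({"model": "my-model", "input": "What happened today?"}, "web_search")` returns a new request with the tool declared and leaves the original dictionary unchanged.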

Stream responses

Receive the response as it is generated instead of waiting for the full message. Output appears incrementally in the console as the model produces each token.
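
The Agent Framework exposes streaming through its own API. As a lower-level illustration, the sketch below extracts text fragments from a stream of server-sent-event lines, assuming the OpenAI Responses API event format in which each `response.output_text.delta` event carries a `delta` string. The function name `iter_text_deltas` is hypothetical.

```python
import json


def iter_text_deltas(lines):
    """Yield text fragments from an iterable of server-sent-event lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip event-name lines, comments, and blank separators
        payload = line[len("data:"):].strip()
        if not payload or payload == "[DONE]":
            continue
        event = json.loads(payload)
        # Only text delta events carry incremental output; ignore other event types.
        if event.get("type") == "response.output_text.delta":
            yield event["delta"]
```

A console client can print each fragment as it arrives, for example `for chunk in iter_text_deltas(stream): print(chunk, end="", flush=True)`, where `stream` is whatever line iterator your HTTP client provides.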

Observability and enterprise capabilities

Ephemeral doesn’t mean unmanaged. Because calls go through the project endpoint, they inherit the project’s enterprise configuration without extra wiring:
  • Tracing and monitoring: Requests, tool invocations, and token usage flow into Foundry observability for the project.
  • Content filters and governance: Project-level content filters and responsible AI policies apply to every call.
  • Identity and access: Calls authenticate against the project’s identity configuration; OBO-enabled tools can act as the signed-in user.
The ephemeral pattern isn’t a reduced-capability tier — you get the same Foundry models, tools, observability, and governance whether you run the agent in-process or package the same code as a hosted agent. The choice is about the deployment shape, not the feature set.

Use the OpenAI SDK directly

Because the Responses API on the Foundry project endpoint is OpenAI-compatible, you can also call it directly from the OpenAI SDK by pointing the client at the project endpoint ({project_endpoint}/openai/v1/responses). Use this path only if you already have OpenAI SDK code or need lower-level control over the request and response shapes. New code should prefer the Agent Framework, which handles authentication, tool wiring, and orchestration for you.
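
A minimal sketch of this path with the OpenAI Python SDK follows. The helpers `sdk_base_url` and `ask` are hypothetical. The sketch assumes the SDK appends `/responses` to its base URL, so the base URL stops at `/openai/v1`. It also assumes the endpoint accepts a bearer token passed in the SDK's `api_key` parameter. How you obtain that token is outside this sketch.

```python
def sdk_base_url(project_endpoint: str) -> str:
    """Base URL for the OpenAI SDK; the SDK adds the /responses segment itself."""
    return project_endpoint.rstrip("/") + "/openai/v1"


def ask(project_endpoint: str, model: str, token: str, prompt: str) -> str:
    """Send one prompt through the OpenAI SDK and return the text output."""
    from openai import OpenAI  # imported lazily; requires the `openai` package

    # Assumption: the bearer token is passed where the SDK expects an API key.
    client = OpenAI(base_url=sdk_base_url(project_endpoint), api_key=token)
    response = client.responses.create(model=model, input=prompt)
    return response.output_text
```

With this approach you own the tool-calling loop and authentication refresh yourself, which is the lower-level control this section describes.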

Clean up resources

Because Agent Framework agents created here are ephemeral, no service-side cleanup is needed. The agent exists only in your local process. If you created Foundry resources you no longer need, delete them in the Foundry portal.

Go deeper on this pattern

  • Package the same agent code as a hosted agent