Azure.AI.AgentServer.Responses is a .NET library for building ASP.NET Core servers that implement the Azure AI Responses API. Add the NuGet package, extend one abstract class (ResponseHandler), and the library handles routing, streaming (SSE), background execution, cancellation, caching, and response lifecycle management.
Source code | Package (NuGet) | REST API reference | Product documentation
Install the library for .NET with NuGet:
dotnet add package Azure.AI.AgentServer.Responses --prerelease
The recommended way to start a Responses server is with the built-in one-line API:
ResponsesServer.Run<EchoHandler>();
This starts a Kestrel server with OpenTelemetry, health checks, server version header, inbound request logging, and your handler mapped to the Responses API endpoints. The Azure.AI.AgentServer.Core package is included as a transitive dependency.
Alternatively, use AgentHost.CreateBuilder() for more control over service registration and middleware:
var builder = AgentHost.CreateBuilder();
builder.AddResponses<EchoHandler>();
builder.Build().Run();
ResponseHandler is the core abstraction you implement. The library calls CreateAsync for each incoming request and delivers the returned IAsyncEnumerable<ResponseStreamEvent> to clients via SSE.
TextResponse — recommended for text-only responses:
public class EchoHandler : ResponseHandler
{
    public override IAsyncEnumerable<ResponseStreamEvent> CreateAsync(
        CreateResponse request,
        ResponseContext context,
        CancellationToken cancellationToken)
    {
        return new TextResponse(context, request,
            createText: async ct =>
            {
                var input = await context.GetInputTextAsync(cancellationToken: ct);
                return $"Echo: {input}";
            });
    }
}
ResponseEventStream convenience generators — recommended for multi-output scenarios:
When TextResponse is too simple but the full builder API is more than you need, use the convenience generators on ResponseEventStream. They handle all inner events (output_item.added, content deltas, output_item.done) automatically:
using System.Runtime.CompilerServices; // required for [EnumeratorCancellation]

public class EchoHandlerConvenience : ResponseHandler
{
    public override async IAsyncEnumerable<ResponseStreamEvent> CreateAsync(
        CreateResponse request,
        ResponseContext context,
        [EnumeratorCancellation] CancellationToken cancellationToken)
    {
        var stream = new ResponseEventStream(context, request);
        yield return stream.EmitCreated();
        yield return stream.EmitInProgress();

        // One call emits all text output events automatically.
        var input = await context.GetInputTextAsync(cancellationToken: cancellationToken);
        foreach (var evt in stream.OutputItemMessage($"Echo: {input}"))
            yield return evt;

        yield return stream.EmitCompleted();
    }
}
Available convenience generators (commonly used):
| Method | Description |
|---|---|
| OutputItemMessage(string) | Emits a complete text message output item |
| OutputItemMessage(string, IEnumerable<Annotation>) | Emits a text message with file annotations |
| OutputItemMessage(IAsyncEnumerable<string>, CancellationToken) | Streams tokens as response.output_text.delta events |
| OutputItemFunctionCall(name, callId, arguments) | Emits a complete function call output item |
| OutputItemFunctionCallOutput(callId, output) | Emits a function call output (no deltas) |
| OutputItemReasoningItem(...) | Emits a reasoning output item |
| OutputItemImageGenCall(resultBase64) | Emits an image generation result with status transitions |
| OutputItemStructuredOutputs(output) | Emits an arbitrary structured JSON output item |
Additional convenience generators are available for computer calls, local shell calls, function shell calls, apply-patch calls, custom tool call outputs, MCP approval requests/responses, and compaction. Each follows the same pattern — accepts domain parameters and yields the complete output_item.added → output_item.done event pair.
See Sample 3 — Full control ResponseStream and Sample 4 — Function calling for more examples.
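The streaming overload in the table above takes an IAsyncEnumerable<string> of tokens. As a rough sketch of what such a token source could look like, the following uses only the .NET base class library; TokenSource and StreamWordsAsync are hypothetical names for illustration and are not part of this package, and the delay merely stands in for model latency.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not part of Azure.AI.AgentServer.Responses.
public static class TokenSource
{
    // Splits the text on spaces and yields one word per token.
    public static async IAsyncEnumerable<string> StreamWordsAsync(
        string text,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        var words = text.Split(' ', StringSplitOptions.RemoveEmptyEntries);
        for (var i = 0; i < words.Length; i++)
        {
            // Simulated latency; a real source would await a model or upstream API.
            await Task.Delay(10, cancellationToken);

            // Prefix every token after the first with a single space so that
            // concatenating the tokens yields the words joined by single spaces.
            yield return i == 0 ? words[i] : " " + words[i];
        }
    }
}
```

A handler could presumably pass the result of TokenSource.StreamWordsAsync(...) to stream.OutputItemMessage(tokens, cancellationToken). How the returned events are enumerated for that overload is not shown in this README, so check the samples before relying on it.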
ResponseEventStream — full builder control:
Use the full builder API only when you need fine-grained control over individual delta/done events within a content part, or to set custom properties on output items:
using System.Runtime.CompilerServices; // required for [EnumeratorCancellation]

public class EchoHandlerFullControl : ResponseHandler
{
    public override async IAsyncEnumerable<ResponseStreamEvent> CreateAsync(
        CreateResponse request,
        ResponseContext context,
        [EnumeratorCancellation] CancellationToken cancellationToken)
    {
        // This sample awaits no real work; the no-op await avoids the
        // compiler warning for an async method without await.
        await Task.CompletedTask;

        var stream = new ResponseEventStream(context, request);
        yield return stream.EmitCreated();
        yield return stream.EmitInProgress();

        var message = stream.AddOutputItemMessage();
        yield return message.EmitAdded();

        var text = message.AddTextContent();
        yield return text.EmitAdded();
        yield return text.EmitDelta("Hello, world!");
        yield return text.EmitTextDone("Hello, world!");
        yield return text.EmitDone();

        yield return message.EmitDone();
        yield return stream.EmitCompleted();
    }
}
Injected into every CreateAsync call, ResponseContext provides access to the client's input, conversation history, and request metadata:
- GetInputItemsAsync(resolveReferences, cancellationToken) — returns the resolved input items from the request. Item references are resolved to their content by default; pass resolveReferences: false to receive them as-is. Computed once and cached.
- GetInputTextAsync(resolveReferences, cancellationToken) — shorthand that resolves input items and concatenates all text content from ItemMessage entries.
- GetHistoryAsync(cancellationToken) — returns output items from previous responses in the conversation chain (oldest-first). Uses previous_response_id to walk the conversation and resolves items via the provider. Limit controlled by ResponsesServerOptions.DefaultFetchHistoryCount (default: 100).
- ResponseId — the unique ID for this response, used to construct child item IDs.
- ClientHeaders — forwarded HTTP headers from the original client request.
- QueryParameters — query parameters from the original request.
- RawBody — the raw request body as BinaryData for advanced scenarios.
- Isolation — isolation context (tenant/session) extracted from request headers.

For collections of Item objects, the GetInputText() extension method (on IEnumerable<Item>) extracts and joins text content without needing a ResponseContext.
See the handler implementation guide for the full ResponseContext API reference.
ResponseEventStream manages sequenceNumber, outputIndex, contentIndex, and itemId tracking internally. Each yield return maps 1:1 to an SSE event, so the handler does no bookkeeping of its own.
- stream parameter is true (defaults to false); SSE events are delivered in real-time to the connected client.
- GET /responses/{id}. Requires background=true and store=true.

The library orchestrates the complete response lifecycle: created → in_progress → completed (or failed / cancelled). Cancellation, error handling, and terminal event guarantees are all managed automatically.
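On the client side, a streamed response arrives as standard Server-Sent Events. The following is a minimal sketch of a parser for that wire format, using only the .NET base class library. SseEvent and SseParser are hypothetical names, and the parser handles only the event and data fields of the SSE format; it makes no assumptions about the JSON payloads this library sends.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical types for illustration; not part of Azure.AI.AgentServer.Responses.
public readonly record struct SseEvent(string EventName, string Data);

public static class SseParser
{
    // Reads "event:" / "data:" lines and yields one SseEvent per blank-line-terminated block.
    public static IEnumerable<SseEvent> Parse(TextReader reader)
    {
        var eventName = "message"; // SSE default when no event field is present
        var data = new StringBuilder();
        var hasData = false;
        string? line;

        while ((line = reader.ReadLine()) != null)
        {
            if (line.Length == 0)
            {
                // A blank line terminates the current event.
                if (hasData)
                    yield return new SseEvent(eventName, data.ToString());
                eventName = "message";
                data.Clear();
                hasData = false;
            }
            else if (line.StartsWith("event:", StringComparison.Ordinal))
            {
                eventName = FieldValue(line, "event:".Length);
            }
            else if (line.StartsWith("data:", StringComparison.Ordinal))
            {
                // Multiple data lines in one event are joined with newlines.
                if (hasData) data.Append('\n');
                data.Append(FieldValue(line, "data:".Length));
                hasData = true;
            }
            // Comment lines (leading ':') and other fields are ignored in this sketch.
        }
    }

    // Per the SSE format, a single leading space after the colon is dropped.
    private static string FieldValue(string line, int prefixLength)
    {
        var value = line.Substring(prefixLength);
        return value.StartsWith(' ') ? value.Substring(1) : value;
    }
}
```

A caller would typically wrap the HTTP response stream in a StreamReader, pass it to SseParser.Parse, and stop reading once a terminal event such as response.failed is seen.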
For detailed handler implementation guidance, see docs/handler-implementation-guide.md.
The library eagerly validates previous_response_id and conversation.id before starting handler execution:
- 400 Bad Request with param identifying the invalid field.
- previous_response_id values that pass format validation but reference a nonexistent response return 404 Not Found with a structured error containing code and param.

This means handlers can rely on ResponseContext.GetInputItemsAsync() and GetHistoryAsync() returning valid data — invalid references are caught before CreateAsync is called.
All error responses follow the same JSON structure:
{
  "error": {
    "code": "invalid_request_error",
    "message": "The response 'resp_abc123' was not found.",
    "param": "previous_response_id",
    "type": "invalid_request_error"
  }
}
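A client can read this structure with System.Text.Json. The sketch below assumes only the four fields shown above (code, message, param, type); ApiError and ApiErrorParser are hypothetical names for illustration and are not types from this package.

```csharp
using System.Text.Json;

// Hypothetical client-side model of the error body shown above.
public sealed record ApiError(string? Code, string? Message, string? Param, string? Type);

public static class ApiErrorParser
{
    // Returns null when the body has no top-level "error" object.
    public static ApiError? TryParse(string json)
    {
        using var doc = JsonDocument.Parse(json);
        var root = doc.RootElement;

        if (root.ValueKind != JsonValueKind.Object ||
            !root.TryGetProperty("error", out var error) ||
            error.ValueKind != JsonValueKind.Object)
        {
            return null;
        }

        return new ApiError(
            GetString(error, "code"),
            GetString(error, "message"),
            GetString(error, "param"),
            GetString(error, "type"));
    }

    // Missing or non-string fields map to null instead of throwing.
    private static string? GetString(JsonElement obj, string name) =>
        obj.TryGetProperty(name, out var value) && value.ValueKind == JsonValueKind.String
            ? value.GetString()
            : null;
}
```

For the sample body above, ApiErrorParser.TryParse returns an ApiError whose Param is "previous_response_id", which a caller can use to point at the offending request field.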
Exception types carry structured fields that map to the error body:
| Exception | HTTP status | error.code | error.param |
|---|---|---|---|
| PayloadValidationException | 400 | invalid_request_error | Per-field errors |
| BadRequestException | 400 | Caller-provided or invalid_request_error | Caller-provided |
| ResourceNotFoundException | 404 | Caller-provided or not_found | Caller-provided |
| ResponsesApiException | 500 | Upstream code or server_error | — |
Every response includes an x-request-id header (set by Core's RequestIdMiddleware). Error responses also embed the same value in error.additionalInfo.request_id, so clients can correlate errors to specific requests even when headers are stripped by intermediaries.
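The correlation described above can be sketched on the client as a header lookup with a body fallback. RequestIdReader is a hypothetical helper, not part of this package; it assumes only the x-request-id header name and the error.additionalInfo.request_id path mentioned in the paragraph.

```csharp
using System.Linq;
using System.Net.Http;
using System.Text.Json;

// Hypothetical client-side helper for correlating errors to requests.
public static class RequestIdReader
{
    // Prefers the x-request-id header; falls back to error.additionalInfo.request_id in the body.
    public static string? GetRequestId(HttpResponseMessage response, string body)
    {
        if (response.Headers.TryGetValues("x-request-id", out var values))
        {
            var header = values.FirstOrDefault();
            if (!string.IsNullOrEmpty(header))
                return header;
        }

        try
        {
            using var doc = JsonDocument.Parse(body);
            var root = doc.RootElement;
            if (root.ValueKind == JsonValueKind.Object &&
                root.TryGetProperty("error", out var error) &&
                error.ValueKind == JsonValueKind.Object &&
                error.TryGetProperty("additionalInfo", out var info) &&
                info.ValueKind == JsonValueKind.Object &&
                info.TryGetProperty("request_id", out var id) &&
                id.ValueKind == JsonValueKind.String)
            {
                return id.GetString();
            }
        }
        catch (JsonException)
        {
            // Body is empty or not JSON; there is nothing to fall back to.
        }

        return null;
    }
}
```

Logging the returned value alongside the HTTP status gives support staff a single identifier to search for, even when an intermediary has stripped the header.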
All error responses (4xx/5xx) include the x-platform-error-source header classifying the error origin as user, platform, or upstream. See the Core README for the full classification table.
When the platform injects x-agent-user-isolation-key and x-agent-chat-isolation-key request headers, the library forwards them to the storage provider so that responses are scoped to the correct tenant and conversation. The resolved session ID is returned on every response via the x-agent-session-id header.
Handlers can access the isolation context through ResponseContext.Isolation for custom partitioning logic.
When storage operations fail (e.g., Foundry storage is unreachable), responses complete gracefully instead of crashing:
- status: "failed" and error.code: "storage_error".
- response.failed carrying error_code="storage_error".

This ensures clients always receive a definitive terminal state rather than hanging indefinitely.
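A client that wants to treat storage failures differently from other failures (for example, by retrying later) could check for this combination. The sketch below assumes the response body exposes a top-level status string and an error object with a code string, as the fragments above suggest; StorageFailure is a hypothetical helper and not part of this package.

```csharp
using System.Text.Json;

// Hypothetical client-side check; the JSON shape is an assumption based on the text above.
public static class StorageFailure
{
    // True only when status is "failed" and error.code is "storage_error".
    public static bool IsStorageFailure(string responseJson)
    {
        using var doc = JsonDocument.Parse(responseJson);
        var root = doc.RootElement;
        if (root.ValueKind != JsonValueKind.Object)
            return false;

        var failed =
            root.TryGetProperty("status", out var status) &&
            status.ValueKind == JsonValueKind.String &&
            status.GetString() == "failed";

        var storageError =
            root.TryGetProperty("error", out var error) &&
            error.ValueKind == JsonValueKind.Object &&
            error.TryGetProperty("code", out var code) &&
            code.ValueKind == JsonValueKind.String &&
            code.GetString() == "storage_error";

        return failed && storageError;
    }
}
```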
All service instances registered via AddResponsesServer() are thread-safe. Handler instances are scoped per-request.
You can familiarize yourself with different APIs using Samples.
- model (when provided) are valid and that input items are well-formed.
- background=true, or it has already reached a terminal state.
- store=false, or a non-background response is still in-flight and not findable.

The library emits OpenTelemetry traces via Azure.AI.AgentServer.Responses activity source. Enable logging in your ASP.NET Core application to diagnose issues.
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.microsoft.com.
When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact with any additional questions or comments.
| Product | Versions Compatible and additional computed target framework versions. |
|---|---|
| .NET | net8.0 is compatible. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. net9.0 was computed. net9.0-android was computed. net9.0-browser was computed. net9.0-ios was computed. net9.0-maccatalyst was computed. net9.0-macos was computed. net9.0-tvos was computed. net9.0-windows was computed. net10.0 is compatible. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed. |
Showing the top 1 NuGet packages that depend on Azure.AI.AgentServer.Responses:
| Package | Downloads |
|---|---|
| Microsoft.Agents.AI.Foundry.Hosting: Microsoft Agent Framework is a comprehensive .NET library for building, orchestrating, and deploying AI agents and multi-agent workflows. The framework provides everything from simple chat agents to complex multi-agent systems with graph-based orchestration capabilities. | |
This package is not used by any popular GitHub repositories.
| Version | Downloads | Last Updated |
|---|---|---|
| 1.0.0-beta.5 | 12,268 | 5/21/2026 |
| 1.0.0-beta.4 | 37,212 | 4/23/2026 |
| 1.0.0-beta.3 | 5,248 | 4/20/2026 |
| 1.0.0-beta.2 | 538 | 4/19/2026 |
| 1.0.0-beta.1 | 650 | 4/15/2026 |