dotnet add package Azure.AI.VoiceLive --version 1.1.0
NuGet\Install-Package Azure.AI.VoiceLive -Version 1.1.0
<PackageReference Include="Azure.AI.VoiceLive" Version="1.1.0" />
<!-- Directory.Packages.props -->
<PackageVersion Include="Azure.AI.VoiceLive" Version="1.1.0" />
<!-- Project file -->
<PackageReference Include="Azure.AI.VoiceLive" />
paket add Azure.AI.VoiceLive --version 1.1.0
#r "nuget: Azure.AI.VoiceLive, 1.1.0"
#:package Azure.AI.VoiceLive@1.1.0
// Install as a Cake Addin
#addin nuget:?package=Azure.AI.VoiceLive&version=1.1.0
// Install as a Cake Tool
#tool nuget:?package=Azure.AI.VoiceLive&version=1.1.0
Azure VoiceLive is a managed service that enables low-latency, high-quality speech-to-speech interactions for voice agents. The API consolidates speech recognition, generative AI, and text-to-speech functionalities into a single, unified interface, providing an end-to-end solution for creating seamless voice-driven experiences.
Use the client library to:
Source code | Package (NuGet) | API reference documentation | Product documentation | Samples
This section includes everything a developer needs to install the package and create their first VoiceLive client connection.
Install the client library for .NET with NuGet:
dotnet add package Azure.AI.VoiceLive
You must have an Azure subscription and an Azure AI Foundry resource to use this service.
The client library targets .NET Standard 2.0 and .NET 8.0, providing compatibility with a wide range of .NET implementations. To use the async streaming features demonstrated in the examples, you'll need .NET 6.0 or later.
The Azure.AI.VoiceLive client supports two authentication methods:
// Option 1: Microsoft Entra ID (recommended, keyless)
Uri endpoint = new Uri("https://your-resource.cognitiveservices.azure.com");
DefaultAzureCredential credential = new DefaultAzureCredential();
VoiceLiveClient client = new VoiceLiveClient(endpoint, credential);

// Option 2: API key
Uri endpoint = new Uri("https://your-resource.cognitiveservices.azure.com");
AzureKeyCredential credential = new AzureKeyCredential("your-api-key");
VoiceLiveClient client = new VoiceLiveClient(endpoint, credential);
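Hardcoding the endpoint and key, as the snippets above do for brevity, is rarely what you want in a real application. The following is a minimal sketch of reading both from environment variables instead. The type `VoiceLiveSettings` and the variable names `AZURE_VOICELIVE_ENDPOINT` and `AZURE_VOICELIVE_API_KEY` are assumptions made for illustration; the SDK does not read them itself. The sketch uses a C# record, so it assumes a .NET 5 or later project.

```csharp
using System;

// Hypothetical helper: resolves connection settings from environment variables.
public sealed record VoiceLiveSettings(Uri Endpoint, string? ApiKey)
{
    // True when no key was supplied, in which case keyless (Entra ID) auth applies.
    public bool UseKeylessAuth => string.IsNullOrEmpty(ApiKey);

    public static VoiceLiveSettings FromEnvironment()
    {
        // Variable names are illustrative assumptions, not SDK conventions.
        string? endpoint = Environment.GetEnvironmentVariable("AZURE_VOICELIVE_ENDPOINT");
        if (string.IsNullOrWhiteSpace(endpoint))
        {
            throw new InvalidOperationException("AZURE_VOICELIVE_ENDPOINT is not set.");
        }

        return new VoiceLiveSettings(
            new Uri(endpoint),
            Environment.GetEnvironmentVariable("AZURE_VOICELIVE_API_KEY"));
    }
}
```

With the settings resolved, pass `settings.Endpoint` to the `VoiceLiveClient` constructor together with `new DefaultAzureCredential()` when `UseKeylessAuth` is true, or `new AzureKeyCredential(settings.ApiKey)` otherwise.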
For the recommended keyless authentication with Microsoft Entra ID, you need to:
- Assign the Cognitive Services User role to your user account or managed identity in the Azure portal under Access control (IAM) > Add role assignment.
- Use a TokenCredential implementation. The SDK automatically handles token acquisition and refresh with the appropriate scope.

The client library targets the latest service API version by default. To select a supported service API version explicitly, configure the client options when creating the client:
Uri endpoint = new Uri("https://your-resource.cognitiveservices.azure.com");
DefaultAzureCredential credential = new DefaultAzureCredential();
VoiceLiveClientOptions options = new VoiceLiveClientOptions(VoiceLiveClientOptions.ServiceVersion.V2025_10_01);
VoiceLiveClient client = new VoiceLiveClient(endpoint, credential, options);
The Azure.AI.VoiceLive client library provides several key classes for real-time voice interactions:
VoiceLiveClient: The primary entry point for the Azure.AI.VoiceLive service. Use this client to establish sessions and configure authentication.
VoiceLiveSession: Represents an active WebSocket connection to the VoiceLive service. This class handles bidirectional communication, allowing you to send audio input and receive audio output, text transcriptions, and other events in real time.
The service uses session configuration to control various aspects of the voice interaction:
The VoiceLive API supports multiple AI models with different capabilities:
| Model | Description | Use Case |
|---|---|---|
| gpt-4o-realtime-preview | GPT-4o with real-time audio processing | High-quality conversational AI |
| gpt-4o-mini-realtime-preview | Lightweight GPT-4o variant | Fast, efficient interactions |
| phi4-mm-realtime | Phi model with multimodal support | Cost-effective voice applications |
The VoiceLive API provides Azure-specific enhancements:
We guarantee that all client instance methods are thread-safe and independent of each other (guideline). This ensures that the recommendation of reusing client instances is always safe, even across threads.
Client options | Accessing the response | Long-running operations | Handling failures | Diagnostics | Mocking | Client lifetime
You can familiarize yourself with different APIs using Samples.
// Create the VoiceLive client
Uri endpoint = new Uri("https://your-resource.cognitiveservices.azure.com");
DefaultAzureCredential credential = new DefaultAzureCredential();
VoiceLiveClient client = new VoiceLiveClient(endpoint, credential);
var model = "gpt-realtime"; // Specify the model to use
// Start a new session
VoiceLiveSession session = await client.StartSessionAsync(model).ConfigureAwait(false);
// Configure session for voice conversation
VoiceLiveSessionOptions sessionOptions = new()
{
    Model = model,
    Instructions = "You are a helpful AI assistant. Respond naturally and conversationally.",
    Voice = new AzureStandardVoice("en-US-AvaNeural"),
    TurnDetection = new AzureSemanticVadTurnDetection()
    {
        Threshold = 0.5f,
        PrefixPadding = TimeSpan.FromMilliseconds(300),
        SilenceDuration = TimeSpan.FromMilliseconds(500)
    },
    InputAudioFormat = InputAudioFormat.Pcm16,
    OutputAudioFormat = OutputAudioFormat.Pcm16
};
// Ensure modalities include audio
sessionOptions.Modalities.Clear();
sessionOptions.Modalities.Add(InteractionModality.Text);
sessionOptions.Modalities.Add(InteractionModality.Audio);
await session.ConfigureSessionAsync(sessionOptions).ConfigureAwait(false);
// Process events from the session
await foreach (SessionUpdate serverEvent in session.GetUpdatesAsync().ConfigureAwait(false))
{
    if (serverEvent is SessionUpdateResponseAudioDelta audioDelta)
    {
        // Play audio response
        byte[] audioData = audioDelta.Delta.ToArray();
        // ... audio playback logic
    }
    else if (serverEvent is SessionUpdateResponseTextDelta textDelta)
    {
        // Display text response
        Console.Write(textDelta.Delta);
    }
}
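The loop above leaves audio playback open. Many playback APIs expect samples rather than raw bytes, so the following is a minimal sketch of decoding a PCM16 buffer into 16-bit samples. The `Pcm16` helper is hypothetical and not part of the SDK. It assumes little-endian, mono data, and the sample rate passed to `Duration` is an assumption that must match the session's actual output format.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical helper for working with raw PCM16 buffers.
public static class Pcm16
{
    // Decodes 16-bit little-endian PCM bytes into signed samples.
    public static short[] ToSamples(ReadOnlySpan<byte> pcm)
    {
        if (pcm.Length % 2 != 0)
        {
            throw new ArgumentException("PCM16 data must contain an even number of bytes.", nameof(pcm));
        }

        short[] samples = new short[pcm.Length / 2];
        for (int i = 0; i < samples.Length; i++)
        {
            samples[i] = BinaryPrimitives.ReadInt16LittleEndian(pcm.Slice(i * 2, 2));
        }

        return samples;
    }

    // Playback duration of a mono buffer at the given sample rate.
    public static TimeSpan Duration(int byteCount, int sampleRateHz) =>
        TimeSpan.FromSeconds(byteCount / 2 / (double)sampleRateHz);
}
```

Inside the audio branch of the loop, `short[] samples = Pcm16.ToSamples(audioData);` would produce samples ready to hand to an audio output library of your choice.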
// Configure a custom voice and filler-word removal
VoiceLiveSessionOptions sessionOptions = new()
{
    Model = model,
    Instructions = "You are a customer service representative. Be helpful and professional.",
    Voice = new AzureCustomVoice("your-custom-voice-name", "your-custom-voice-endpoint-id")
    {
        Temperature = 0.8f
    },
    TurnDetection = new AzureSemanticVadTurnDetection()
    {
        RemoveFillerWords = true
    },
    InputAudioFormat = InputAudioFormat.Pcm16,
    OutputAudioFormat = OutputAudioFormat.Pcm16
};
// Ensure modalities include audio
sessionOptions.Modalities.Clear();
sessionOptions.Modalities.Add(InteractionModality.Text);
sessionOptions.Modalities.Add(InteractionModality.Audio);
await session.ConfigureSessionAsync(sessionOptions).ConfigureAwait(false);
// Define a function for the assistant to call
var getCurrentWeatherFunction = new VoiceLiveFunctionDefinition("get_current_weather")
{
    Description = "Get the current weather for a given location",
    Parameters = BinaryData.FromString("""
        {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city and state or country"
                }
            },
            "required": ["location"]
        }
        """)
};
VoiceLiveSessionOptions sessionOptions = new()
{
    Model = model,
    Instructions = "You are a weather assistant. Use the get_current_weather function to help users with weather information.",
    Voice = new AzureStandardVoice("en-US-AvaNeural"),
    InputAudioFormat = InputAudioFormat.Pcm16,
    OutputAudioFormat = OutputAudioFormat.Pcm16
};
// Add the function tool
sessionOptions.Tools.Add(getCurrentWeatherFunction);
// Ensure modalities include audio
sessionOptions.Modalities.Clear();
sessionOptions.Modalities.Add(InteractionModality.Text);
sessionOptions.Modalities.Add(InteractionModality.Audio);
await session.ConfigureSessionAsync(sessionOptions).ConfigureAwait(false);
// Process events from the session
await foreach (SessionUpdate serverEvent in session.GetUpdatesAsync().ConfigureAwait(false))
{
    if (serverEvent is SessionUpdateResponseFunctionCallArgumentsDone functionCall)
    {
        if (functionCall.Name == "get_current_weather")
        {
            // Extract parameters from the function call
            var parametersString = functionCall.Arguments;
            var parameters = System.Text.Json.JsonSerializer.Deserialize<Dictionary<string, string>>(parametersString);
            // Fall back to an empty location if the argument is missing
            string location = parameters != null && parameters.TryGetValue("location", out var value) ? value : string.Empty;
            // Call your external weather service here and get the result
            string weatherInfo = $"The current weather in {location} is sunny with a temperature of 75°F.";
            // Send the function response back to the session
            await session.AddItemAsync(new FunctionCallOutputItem(functionCall.CallId, weatherInfo)).ConfigureAwait(false);
            // Start the next response.
            await session.StartResponseAsync().ConfigureAwait(false);
        }
    }
}
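Once a session exposes more than one tool, a chain of name comparisons inside the event loop gets unwieldy. The following is a minimal sketch of a lookup-table dispatcher that parses the JSON arguments defensively. `ToolDispatcher` and its handler are hypothetical and use only `System.Text.Json`; the weather handler returns placeholder text where a real one would call a weather service.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical dispatcher: maps tool names to handlers that produce the tool output text.
public static class ToolDispatcher
{
    private static readonly Dictionary<string, Func<JsonElement, string>> Handlers = new()
    {
        ["get_current_weather"] = args =>
        {
            // Read "location" only if the arguments are an object containing a string value.
            string location = args.ValueKind == JsonValueKind.Object
                && args.TryGetProperty("location", out JsonElement value)
                && value.ValueKind == JsonValueKind.String
                ? value.GetString()!
                : "an unknown location";

            // Placeholder result; a real handler would call a weather service.
            return $"Weather lookup requested for {location}.";
        }
    };

    // Returns the handler output, or an error message the model can react to.
    public static string Invoke(string name, string argumentsJson)
    {
        if (!Handlers.TryGetValue(name, out var handler))
        {
            return $"Unknown function: {name}";
        }

        try
        {
            using JsonDocument doc = JsonDocument.Parse(argumentsJson);
            return handler(doc.RootElement);
        }
        catch (JsonException)
        {
            return "Arguments were not valid JSON.";
        }
    }
}
```

In the event loop, the string returned by `ToolDispatcher.Invoke(functionCall.Name, functionCall.Arguments)` could be passed as the output of the `FunctionCallOutputItem`, so a malformed or unknown call yields a message instead of an exception.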
// Add a user message to the session
await session.AddItemAsync(new UserMessageItem("Hello, can you help me with my account?")).ConfigureAwait(false);
// Start the response from the assistant
await session.StartResponseAsync().ConfigureAwait(false);
Authentication Errors: If you receive authentication errors, verify that:
WebSocket Connection Issues: VoiceLive uses WebSocket connections. Ensure that:
*.cognitiveservices.azure.com
Audio Processing Errors: For audio-related issues:
Enable logging to help diagnose issues:
using Azure.Core.Diagnostics;
// Enable logging for Azure SDK
using AzureEventSourceListener listener = AzureEventSourceListener.CreateConsoleLogger();
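The console logger above prints every event at the default level. The following is a small sketch of narrowing the output to warnings and errors and routing it through your own callback, using the `AzureEventSourceListener` constructor from Azure.Core that takes a callback and an `EventLevel`. The message format is an arbitrary choice for illustration.

```csharp
using System;
using System.Diagnostics.Tracing;
using Azure.Core.Diagnostics;

// Capture only warnings and errors, and write them to standard error.
using AzureEventSourceListener listener = new AzureEventSourceListener(
    (eventArgs, message) =>
        Console.Error.WriteLine($"[{eventArgs.Level}] {eventArgs.EventSource.Name}: {message}"),
    EventLevel.Warning);
```

Keep the listener alive for as long as you want logs collected; disposing it stops the capture.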
The VoiceLive service implements rate limiting based on:
Implement appropriate retry logic and connection management to handle throttling gracefully.
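One common way to handle throttling is exponential backoff. The following is a minimal sketch, not something the SDK provides: the `Retry` helper, the 500 ms base delay, the 30 second cap, and the default of five attempts are all illustrative assumptions. Which exceptions indicate throttling is service-specific, so that decision is left to the caller-supplied `isTransient` predicate.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical retry helper using exponential backoff.
public static class Retry
{
    // Delay before retry number `attempt` (0-based): baseDelay * 2^attempt, capped at maxDelay.
    public static TimeSpan BackoffDelay(int attempt, TimeSpan baseDelay, TimeSpan maxDelay)
    {
        double ms = baseDelay.TotalMilliseconds * Math.Pow(2, attempt);
        return TimeSpan.FromMilliseconds(Math.Min(ms, maxDelay.TotalMilliseconds));
    }

    // Runs `operation`, retrying while `isTransient` accepts the exception
    // and attempts remain; the final failure propagates to the caller.
    public static async Task<T> WithBackoffAsync<T>(
        Func<Task<T>> operation,
        Func<Exception, bool> isTransient,
        int maxAttempts = 5,
        CancellationToken cancellationToken = default)
    {
        for (int attempt = 0; ; attempt++)
        {
            try
            {
                return await operation().ConfigureAwait(false);
            }
            catch (Exception ex) when (attempt < maxAttempts - 1 && isTransient(ex))
            {
                TimeSpan delay = BackoffDelay(attempt, TimeSpan.FromMilliseconds(500), TimeSpan.FromSeconds(30));
                await Task.Delay(delay, cancellationToken).ConfigureAwait(false);
            }
        }
    }
}
```

For example, a call such as `client.StartSessionAsync(model)` could be wrapped as `Retry.WithBackoffAsync(() => client.StartSessionAsync(model), ex => /* your throttling check */ false)`, with the predicate replaced by whatever condition identifies throttling in your environment.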
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.
When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the Microsoft Open Source Code of Conduct. For more information, see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.
| Product | Compatible | Computed (additional target framework versions) |
|---|---|---|
| .NET | net8.0, net10.0 | net5.0, net5.0-windows, net6.0, net6.0-android, net6.0-ios, net6.0-maccatalyst, net6.0-macos, net6.0-tvos, net6.0-windows, net7.0, net7.0-android, net7.0-ios, net7.0-maccatalyst, net7.0-macos, net7.0-tvos, net7.0-windows, net8.0-android, net8.0-browser, net8.0-ios, net8.0-maccatalyst, net8.0-macos, net8.0-tvos, net8.0-windows, net9.0, net9.0-android, net9.0-browser, net9.0-ios, net9.0-maccatalyst, net9.0-macos, net9.0-tvos, net9.0-windows, net10.0-android, net10.0-browser, net10.0-ios, net10.0-maccatalyst, net10.0-macos, net10.0-tvos, net10.0-windows |
| .NET Core | - | netcoreapp2.0, netcoreapp2.1, netcoreapp2.2, netcoreapp3.0, netcoreapp3.1 |
| .NET Standard | netstandard2.0 | netstandard2.1 |
| .NET Framework | - | net461, net462, net463, net47, net471, net472, net48, net481 |
| MonoAndroid | - | monoandroid |
| MonoMac | - | monomac |
| MonoTouch | - | monotouch |
| Tizen | - | tizen40, tizen60 |
| Xamarin.iOS | - | xamarinios |
| Xamarin.Mac | - | xamarinmac |
| Xamarin.TVOS | - | xamarintvos |
| Xamarin.WatchOS | - | xamarinwatchos |
| Version | Downloads | Last Updated |
|---|---|---|
| 1.2.0-beta.1 | 283 | 6/9/2026 |
| 1.1.0 | 985 | 6/4/2026 |
| 1.1.0-beta.4 | 1,808 | 5/13/2026 |
| 1.1.0-beta.3 | 5,397 | 3/3/2026 |
| 1.1.0-beta.2 | 532 | 2/20/2026 |
| 1.1.0-beta.1 | 3,315 | 11/18/2025 |
| 1.0.0 | 27,822 | 10/2/2025 |
| 1.0.0-beta.4 | 266 | 9/30/2025 |
| 1.0.0-beta.3 | 233 | 9/27/2025 |
| 1.0.0-beta.2 | 508 | 9/22/2025 |
| 1.0.0-beta.1 | 731 | 9/17/2025 |