Azure OpenAI Realtime client_secrets returns 500 when input_audio_transcription is included (Sweden Central)


James Morgan 0 Reputation points

We are seeing a consistent server-side failure in Azure OpenAI Realtime when requesting client secrets with input_audio_transcription enabled.

Environment

Region: Sweden Central
Resource: LBBD-OpenAI-Sweden-Dev
Subscription: 52a9fd6d-324a-4cc7-861d-b17e4cf9c219
API path: /openai/v1/realtime/client_secrets
Auth: Managed Identity (DefaultAzureCredential)
Deployment tested: gpt-4o-mini-transcribe-sweden-dev-v2 (fresh deployment name)

Observed behavior

Request WITH input_audio_transcription in session payload -> HTTP 500
Same request WITHOUT input_audio_transcription -> HTTP 200

This is reproducible both directly against the endpoint and through our app route that mints realtime tokens.
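A minimal sketch of how the two request bodies for this comparison could be built so that they differ only in the transcription block. The helper name `build_repro_payloads`, the `session.type` value, and the deployment name `my-realtime-deployment` are illustrative assumptions, not values taken from our actual requests or confirmed against the service schema.

```python
import copy
import json


def build_repro_payloads(base_session: dict, transcription: dict):
    """Return (with_transcription, without_transcription) request bodies.

    Both bodies are deep copies of the same base session, so the only
    difference between them is the input_audio_transcription block.
    """
    without = {"session": copy.deepcopy(base_session)}
    with_block = copy.deepcopy(without)
    with_block["session"]["input_audio_transcription"] = copy.deepcopy(transcription)
    return with_block, without


# Hypothetical values for illustration only (assumptions, not from the thread).
base = {"type": "realtime", "model": "my-realtime-deployment"}
with_tx, without_tx = build_repro_payloads(base, {"model": "gpt-4o-mini-transcribe"})

print(json.dumps(with_tx, indent=2))
print(json.dumps(without_tx, indent=2))
```

Sending each body to the same endpoint with the same credentials keeps every other variable fixed, which is what makes the 500 versus 200 comparison meaningful.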

What we already checked

payload structure
deployment recreation with new name
same auth and api-version across both requests
retries and fallback path

Question

Is there a known region-specific issue or feature-gating requirement for input_audio_transcription in Realtime session creation on Azure OpenAI? If not, what exact prerequisites are required for this field to work?

James Morgan 0 Reputation points
Adding concrete repro IDs from a fresh test (UTC 2026-06-02 11:24):

Failing call WITH input_audio_transcription -> HTTP 500

apim-request-id / activity_id: c0d6e5c5-6e40-4039-a4e1-cd21f985cae1

Fallback call WITHOUT input_audio_transcription -> HTTP 200

apim-request-id / activity_id: bbae1531-28f3-4243-b5ee-c780a9681157

Both calls used the same endpoint, auth path, and model family; only the transcription block differs.

1 answer

AI answer
There is no region-specific or feature-gating prerequisite for input_audio_transcription documented for Azure OpenAI Realtime beyond using the correct models and schema. A 500 response when adding input_audio_transcription indicates a service-side error rather than a validation or configuration issue on the client.

From the Realtime REST reference:

input_audio_transcription is an optional configuration object on the Realtime session. It “defaults to off and can be set to null to turn off once on.”

It is not tied to a special feature flag in the public documentation.

The supported transcription models for this property are:

whisper-1

gpt-4o-mini-transcribe

gpt-4o-mini-transcribe-2025-12-15

gpt-4o-transcribe

gpt-4o-transcribe-diarize

The only documented prerequisites for using input_audio_transcription are therefore:

Use the GA Realtime API endpoints and protocol

Path: POST {endpoint}/openai/v1/realtime/client_secrets

Optional api-version query: v1 or preview (defaults to v1 if omitted).

Provide a valid Realtime session payload

session.type must be set appropriately for a Realtime session.

If input_audio_transcription is included, its model value must be one of the supported transcription models listed above.

Optional fields like language and prompt follow the documented schema.

Use a supported Realtime model deployment

For WebRTC/Realtime, use one of the Realtime models listed for the region (for example gpt-4o-mini-realtime-preview, gpt-4o-realtime-preview, gpt-realtime, or gpt-realtime-mini as documented for East US 2 and Sweden Central).

Ensure the deployment name in the session configuration matches the actual deployment.
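The model check described above can be sketched as a small client-side validation that runs before the request is sent. The function name `check_transcription_model` is hypothetical, and the list is copied literally from this answer; it follows the answer's wording and is not a confirmed description of what the service itself validates.

```python
from typing import Optional

# Transcription models listed in this answer for input_audio_transcription.
SUPPORTED_TRANSCRIPTION_MODELS = frozenset({
    "whisper-1",
    "gpt-4o-mini-transcribe",
    "gpt-4o-mini-transcribe-2025-12-15",
    "gpt-4o-transcribe",
    "gpt-4o-transcribe-diarize",
})


def check_transcription_model(session: dict) -> Optional[str]:
    """Return None if the session passes the check, else a short message."""
    block = session.get("input_audio_transcription")
    if block is None:
        # Transcription is off (field omitted or explicitly null).
        return None
    model = block.get("model")
    if model not in SUPPORTED_TRANSCRIPTION_MODELS:
        return f"input_audio_transcription.model {model!r} is not in the listed models"
    return None


# Hypothetical sessions for illustration only.
ok_session = {"type": "realtime", "input_audio_transcription": {"model": "whisper-1"}}
bad_session = {"type": "realtime", "input_audio_transcription": {"model": "my-custom-name"}}

print(check_transcription_model(ok_session))
print(check_transcription_model(bad_session))
```

A message from this check points to a payload problem on the client; a 500 that persists after the check passes is more likely to be on the service side.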

Given that:

The same request succeeds (200) when input_audio_transcription is omitted, and

The documentation does not list any additional region-specific requirements or feature flags for input_audio_transcription,

this pattern is consistent with a backend issue in the Sweden Central Realtime service rather than a missing prerequisite in the request. The next steps that align with the documentation are:

Verify that the input_audio_transcription.model is one of the supported transcription models listed above.

Confirm that the Realtime model deployment being used is supported in Sweden Central and that the deployment name is correct.

If the 500 persists with a valid payload and supported model, treat it as a service-side incident and raise it via Azure support, including:

Region (Sweden Central), resource name, and subscription ID.

The exact POST /openai/v1/realtime/client_secrets payload with and without input_audio_transcription.

Timestamps and apim-request-id values from the failing 500 responses.
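The details listed above can be gathered in one place with a small helper, sketched here. The function name `summarize_attempt` and the record layout are assumptions for illustration; the only header it reads is `apim-request-id`, and the example values are the IDs and timestamp reported in the question.

```python
def summarize_attempt(label: str, status: int, headers: dict, timestamp_utc: str) -> dict:
    """Collect the fields a support ticket needs from one HTTP response."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        "label": label,
        "status": status,
        "apim_request_id": lowered.get("apim-request-id"),
        "timestamp_utc": timestamp_utc,
    }


# Values taken from the repro reported in the question.
failing = summarize_attempt(
    "with input_audio_transcription",
    500,
    {"apim-request-id": "c0d6e5c5-6e40-4039-a4e1-cd21f985cae1"},
    "2026-06-02 11:24",
)
passing = summarize_attempt(
    "without input_audio_transcription",
    200,
    {"Apim-Request-Id": "bbae1531-28f3-4243-b5ee-c780a9681157"},
    "2026-06-02 11:24",
)

for record in (failing, passing):
    print(record)
```

Attaching both records, together with the two payloads, lets support correlate the failing and succeeding calls directly.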

There is no documented feature-gating or extra configuration required beyond the above; a persistent 500 in this scenario requires service investigation.

References:

Use the GPT Realtime API via WebRTC

Azure OpenAI realtime

Azure OpenAI in Azure AI Foundry Models REST API preview reference

Azure OpenAI in Microsoft Foundry Models v1 REST API reference
AI-generated content may be incorrect. Read our transparency notes for more information.

URL: https://learn.microsoft.com/en-us/answers/questions/5908916/azure-openai-realtime-client-secrets-returns-500-w
