Azure OpenAI Realtime client_secrets returns 500 when input_audio_transcription is included (Sweden Central)
We are seeing a consistent server-side failure in Azure OpenAI Realtime when requesting client secrets with input_audio_transcription enabled.
Environment
- Region: Sweden Central
- Resource: LBBD-OpenAI-Sweden-Dev
- Subscription: 52a9fd6d-324a-4cc7-861d-b17e4cf9c219
- API path: /openai/v1/realtime/client_secrets
- Auth: Managed Identity (DefaultAzureCredential)
- Deployment tested: gpt-4o-mini-transcribe-sweden-dev-v2 (fresh deployment name)
Observed behavior
- Request WITH input_audio_transcription in session payload -> HTTP 500
- Same request WITHOUT input_audio_transcription -> HTTP 200
This is reproducible both directly against the endpoint and through our app route that mints realtime tokens.
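For anyone trying to reproduce this, here is a minimal sketch of the two requests being compared. The placement of input_audio_transcription directly under session, the session.type value, the transcription model name, and the placeholder endpoint, deployment, and token are assumptions for illustration, not confirmed request shapes. The requests are only constructed here; nothing is sent.

```python
import json
import urllib.request

CLIENT_SECRETS_PATH = "/openai/v1/realtime/client_secrets"  # path from the post


def build_session_payload(deployment: str, include_transcription: bool) -> dict:
    """Build the request body, optionally with the transcription block."""
    session = {
        "type": "realtime",  # assumption: session type for a Realtime session
        "model": deployment,
    }
    if include_transcription:
        # Assumption: field placement and model name are illustrative only.
        session["input_audio_transcription"] = {"model": "gpt-4o-mini-transcribe"}
    return {"session": session}


def build_request(endpoint: str, token: str, payload: dict) -> urllib.request.Request:
    """Construct (but do not send) the POST to the client_secrets route."""
    return urllib.request.Request(
        url=endpoint.rstrip("/") + CLIENT_SECRETS_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical placeholders; substitute real values before sending.
ENDPOINT = "https://example-resource.openai.azure.com"
with_tx = build_request(ENDPOINT, "<token>", build_session_payload("<deployment>", True))
without_tx = build_request(ENDPOINT, "<token>", build_session_payload("<deployment>", False))
```

Both request objects share the same URL and headers, so any difference in the response should come from the body alone.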
What we already checked
- payload structure
- deployment recreation with new name
- same auth and api-version across both requests
- retries and fallback path
Question
Is there a known region-specific issue or feature-gating requirement for input_audio_transcription in Realtime session creation on Azure OpenAI? If not, what exact prerequisites are required for this field to work?
James Morgan
Adding concrete repro IDs from a fresh test (2026-06-02 11:24 UTC):
- Failing call WITH input_audio_transcription -> HTTP 500
  - apim-request-id / activity_id: c0d6e5c5-6e40-4039-a4e1-cd21f985cae1
- Fallback call WITHOUT input_audio_transcription -> HTTP 200
  - apim-request-id / activity_id: bbae1531-28f3-4243-b5ee-c780a9681157
- Both calls used the same endpoint, auth path, and model family; only the transcription block differs.
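One way to back up the "only the transcription block differs" claim is to diff the two request bodies programmatically. This is a rough sketch; the diff_keys helper and the two example payloads are hypothetical, and the payload shape is an assumption rather than the exact bodies that were sent.

```python
def diff_keys(a: dict, b: dict, prefix: str = "") -> list[str]:
    """Return dotted paths whose values differ between two nested dicts."""
    paths = []
    for key in sorted(set(a) | set(b)):
        path = f"{prefix}{key}"
        if key not in a or key not in b:
            # Key present on one side only.
            paths.append(path)
        elif isinstance(a[key], dict) and isinstance(b[key], dict):
            # Recurse into nested objects.
            paths.extend(diff_keys(a[key], b[key], prefix=path + "."))
        elif a[key] != b[key]:
            paths.append(path)
    return paths


# Hypothetical bodies for the failing and passing calls.
failing = {
    "session": {
        "type": "realtime",
        "model": "<deployment>",
        "input_audio_transcription": {"model": "gpt-4o-mini-transcribe"},
    }
}
passing = {"session": {"type": "realtime", "model": "<deployment>"}}

print(diff_keys(failing, passing))  # ['session.input_audio_transcription']
```

If the list contains anything other than the transcription path, the two calls are not a clean A/B comparison.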
1 answer
AI answer
There is no region-specific or feature-gating prerequisite for input_audio_transcription documented for Azure OpenAI Realtime beyond using the correct models and schema. A 500 response when adding input_audio_transcription indicates a service-side error rather than a validation or configuration issue on the client.
From the Realtime REST reference:
- input_audio_transcription is an optional configuration object on the Realtime session. It “defaults to off and can be set to null to turn off once on.”
- It is not tied to a special feature flag in the public documentation.
- The supported transcription models for this property are:
  - whisper-1
  - gpt-4o-mini-transcribe
  - gpt-4o-mini-transcribe-2025-12-15
  - gpt-4o-transcribe
  - gpt-4o-transcribe-diarize
The only documented prerequisites for using input_audio_transcription are therefore:
- Use the GA Realtime API endpoints and protocol
  - Path: POST {endpoint}/openai/v1/realtime/client_secrets
  - Optional api-version query: v1 or preview (defaults to v1 if omitted).
- Provide a valid Realtime session payload
  - session.type must be set appropriately for a Realtime session.
  - If input_audio_transcription is included, its model value must be one of the supported transcription models listed above.
  - Optional fields like language and prompt follow the documented schema.
- Use a supported Realtime model deployment
  - For WebRTC/Realtime, use one of the Realtime models listed for the region (for example gpt-4o-mini-realtime-preview, gpt-4o-realtime-preview, gpt-realtime, or gpt-realtime-mini as documented for East US 2 and Sweden Central).
  - Ensure the deployment name in the session configuration matches the actual deployment.
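The payload-level prerequisites can be checked locally before calling the service. The sketch below assumes the transcription block sits at session.input_audio_transcription and compares its model value literally against the names listed earlier in this answer; check_transcription_block is a hypothetical helper, and whether the service expects a base model name or a deployment name in this field is not settled here.

```python
# Transcription model names as listed in this answer.
SUPPORTED_TRANSCRIPTION_MODELS = {
    "whisper-1",
    "gpt-4o-mini-transcribe",
    "gpt-4o-mini-transcribe-2025-12-15",
    "gpt-4o-transcribe",
    "gpt-4o-transcribe-diarize",
}


def check_transcription_block(payload: dict) -> list[str]:
    """Return a list of problems found in the session payload (empty if none)."""
    problems = []
    session = payload.get("session", {})
    if not session.get("type"):
        problems.append("session.type is missing")
    tx = session.get("input_audio_transcription")
    if tx is None:
        # Omitted or null: transcription stays off, nothing more to check.
        return problems
    if not isinstance(tx, dict):
        problems.append("input_audio_transcription must be an object or null")
        return problems
    model = tx.get("model")
    if model not in SUPPORTED_TRANSCRIPTION_MODELS:
        problems.append(f"unsupported transcription model: {model!r}")
    return problems


example = {
    "session": {
        "type": "realtime",
        "input_audio_transcription": {"model": "whisper-1"},
    }
}
print(check_transcription_block(example))  # []
```

An empty list means the payload passes these local checks, which leaves the service itself as the remaining suspect.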
Given that:
- The same request succeeds (200) when input_audio_transcription is omitted, and
- The documentation does not list any additional region-specific requirements or feature flags for input_audio_transcription,
this pattern is consistent with a backend issue in the Sweden Central Realtime service rather than a missing prerequisite in the request. The next steps that align with the documentation are:
- Verify that the input_audio_transcription.model is one of the supported transcription models listed above.
- Confirm that the Realtime model deployment being used is supported in Sweden Central and that the deployment name is correct.
- If the 500 persists with a valid payload and supported model, treat it as a service-side incident and raise it via Azure support, including:
  - Region (Sweden Central), resource name, and subscription ID.
  - The exact POST /openai/v1/realtime/client_secrets payload with and without input_audio_transcription.
  - Timestamps and apim-request-id values from the failing 500 responses.
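The details for the support request can be captured in a consistent shape as each call is made. This is only a sketch: support_record is a hypothetical helper, the record layout is an assumption rather than a format Azure support requires, and the example values are the request ID and timestamp posted in the comment above.

```python
import json
from datetime import datetime, timezone


def support_record(label, status, headers, payload, when=None):
    """Collect the fields worth attaching to a support ticket for one call."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    when = when or datetime.now(timezone.utc)
    return {
        "label": label,
        "status": status,
        "apim_request_id": lowered.get("apim-request-id"),
        "timestamp_utc": when.astimezone(timezone.utc).isoformat(),
        "payload": payload,
    }


# Example using the failing call reported in the comment above.
record = support_record(
    label="with input_audio_transcription",
    status=500,
    headers={"Apim-Request-Id": "c0d6e5c5-6e40-4039-a4e1-cd21f985cae1"},
    payload={"session": {"input_audio_transcription": {"model": "<model>"}}},
    when=datetime(2026, 6, 2, 11, 24, tzinfo=timezone.utc),
)
print(json.dumps(record, indent=2))
```

Recording the passing call the same way gives support a matched pair to compare.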
There is no documented feature-gating or extra configuration required beyond the above; a persistent 500 in this scenario requires service investigation.
