
URL: https://apify.com/vivid_astronaut/speech-to-text

Speech to Text API - Audio Transcription with AI · Apify


Pricing

from $10.00 / 1,000 results


Convert speech to text with high accuracy using Azure AI. Supports 100+ languages, speaker detection, and timestamps. Perfect for transcription, subtitles, and voice-to-text applications.


Rating

0.0 (0)

Developer

๐Ÿ‘ Fabio Suizu

Fabio Suizu

Maintained by Community

Actor stats

Bookmarked: 0

Total users: 24

Monthly active users: 0

Last modified: 5 months ago


Speech to Text - Audio Transcription

Convert audio files to text using AI-powered speech recognition. Supports multiple languages and engines.

Features

  • Fast Processing: Fast audio transcription powered by Azure
  • Reliable: 99.9% uptime with automatic failover
  • Scalable: Handle single requests or bulk operations
  • Secure: Enterprise-grade security with API key authentication
  • Well Documented: Comprehensive API documentation and examples

Use Cases

  • Content Generation: Automate content creation workflows
  • Data Analysis: Extract insights from unstructured data
  • Automation: Integrate AI capabilities into your apps

Input Parameters

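| Parameter          | Type    | Required | Description                                                              |
|--------------------|---------|----------|--------------------------------------------------------------------------|
| audioUrl           | string  | No       | URL of the audio file to transcribe                                      |
| audioBase64        | string  | No       | Base64-encoded audio data (alternative to URL)                           |
| language           | string  | No       | Language code (e.g., 'en', 'es', 'fr'). Leave empty for auto-detection   |
| includeSegments    | boolean | No       | Include time-stamped segments in the response                            |
| engine             | string  | No       | Speech recognition engine to use                                         |
| detectLanguageOnly | boolean | No       | Only detect the language without full transcription                      |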
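The table lists audioUrl and audioBase64 as alternatives, so a caller has to pick one source per request. The following is a minimal sketch of how such an input could be assembled in Python. The helper name `build_input` is hypothetical, and two things are assumptions rather than documented behavior: that exactly one source should be sent, and that audioBase64 takes plain standard Base64 without a data-URI prefix.

```python
import base64
from typing import Optional


def build_input(
    audio_url: Optional[str] = None,
    audio_bytes: Optional[bytes] = None,
    language: str = "",
    include_segments: bool = False,
    detect_language_only: bool = False,
) -> dict:
    """Build an Actor input dict from exactly one audio source (hypothetical helper)."""
    # Assumption: the Actor expects either a URL or inline data, not both.
    if (audio_url is None) == (audio_bytes is None):
        raise ValueError("Provide exactly one of audio_url or audio_bytes")

    run_input = {
        "includeSegments": include_segments,
        "detectLanguageOnly": detect_language_only,
    }
    if audio_url is not None:
        run_input["audioUrl"] = audio_url
    else:
        # Assumption: plain standard Base64, no "data:" prefix.
        run_input["audioBase64"] = base64.b64encode(audio_bytes).decode("ascii")

    # An empty language is omitted so the Actor can auto-detect it.
    if language:
        run_input["language"] = language
    return run_input
```

The returned dict can be passed as the run input to the Actor. To send a local file, read it in binary mode and pass the bytes as `audio_bytes`.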

Output Format

{
  "success": true,
  "result": { ... },
  "timestamp": "2026-01-07T00:00:00Z"
}
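Each dataset item follows the envelope shown above, so a caller will usually check `success` before reading `result`. The sketch below shows one way to do that in Python. The function names are hypothetical, and because the contents of `result` are elided in the listing, the code treats it as an opaque dict and does not assume any field names inside it.

```python
from datetime import datetime


def unwrap_result(item: dict) -> dict:
    """Return the `result` payload, or raise if the item reports a failure."""
    if not item.get("success"):
        raise RuntimeError(f"Transcription failed: {item!r}")
    return item["result"]


def parse_timestamp(item: dict) -> datetime:
    """Parse the ISO 8601 `timestamp` field into a timezone-aware datetime."""
    # Replace the trailing "Z" so older Python versions can parse the value.
    return datetime.fromisoformat(item["timestamp"].replace("Z", "+00:00"))
```

Raising on `success: false` keeps failed items from being silently mixed into downstream processing. How failures are actually reported by this Actor is not documented here, so treat the check as a defensive default.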

Code Examples

JavaScript (Node.js)

import { ApifyClient } from 'apify-client';

// Authenticate with your Apify API token.
const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

// Supply one audio source: audioUrl here, or audioBase64 to send inline data.
const input = {
    audioUrl: 'https://example.com/audio.mp3',
    language: 'en',
    includeSegments: true,
    engine: 'azure',
    detectLanguageOnly: false,
};

// Start the Actor and wait for the run to finish.
const run = await client.actor('vivid_astronaut/speech-to-text').call(input);

// Read the results from the run's default dataset.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items);

Python

from apify_client import ApifyClient

# Authenticate with your Apify API token.
client = ApifyClient("YOUR_API_TOKEN")

# Supply one audio source: audioUrl here, or audioBase64 to send inline data.
run_input = {
    "audioUrl": "https://example.com/audio.mp3",
    "language": "en",
    "includeSegments": True,
    "engine": "azure",
    "detectLanguageOnly": False,
}

# Start the Actor and wait for the run to finish.
run = client.actor("vivid_astronaut/speech-to-text").call(run_input=run_input)

# Read the results from the run's default dataset.
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

cURL

# Supply one audio source: audioUrl here, or audioBase64 to send inline data.
curl -X POST "https://api.apify.com/v2/acts/vivid_astronaut~speech-to-text/runs?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "audioUrl": "https://example.com/audio.mp3",
    "language": "en",
    "includeSegments": true,
    "engine": "azure",
    "detectLanguageOnly": false
  }'

Pricing

Model: Pay per result
Price: $0.020 per result

You only pay for successful results. Platform usage costs are included.
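With pay-per-result pricing, the expected bill is the number of successful results multiplied by the per-result price. The sketch below does that arithmetic in Python. The function name `estimate_cost` is hypothetical, and the default price is the $0.020 figure quoted in this section; pass a different `price_per_result` if your plan is billed at another rate.

```python
from decimal import Decimal

# Per-result price quoted in the Pricing section above.
PRICE_PER_RESULT = Decimal("0.020")


def estimate_cost(result_count: int,
                  price_per_result: Decimal = PRICE_PER_RESULT) -> Decimal:
    """Estimate the charge in USD for a number of successful results."""
    if result_count < 0:
        raise ValueError("result_count must be non-negative")
    # Decimal avoids binary floating-point rounding; round to whole cents.
    return (price_per_result * result_count).quantize(Decimal("0.01"))


print(estimate_cost(250))  # prints 5.00
```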

API Documentation

Full API documentation is available at:

Support

Version History

See ./CHANGELOG.md for version history.


Powered by Azure Cloud Infrastructure
