URL: https://apify.com/moving_beacon-owner1/my-actor-72

Speech-to-Text Converter · Apify


Pricing

from $100.00 / 1,000 results

Speech-to-Text Converter

Introducing the Speech-to-Text Converter Apify Actor: a serverless, multi-engine transcription tool that converts audio and video files into text on Apify.

Rating

0.0

(0)

Developer

๐Ÿ‘ Jamshaid Arif

Jamshaid Arif

Maintained by Community

Actor stats

0

Bookmarked

18

Total users

8

Monthly active users

a month ago

Last modified

๐ŸŽ™๏ธ Speech-to-Text Converter โ€” Apify Actor

Serverless multi-engine speech-to-text transcription on Apify.

๐Ÿ› ๏ธ Engines

EngineCostInternetBest For
Whisper LocalFreeNoLong files, best accuracy, 45+ languages
Google SpeechFreeYesQuick short clips, real-time results
Whisper APIPaidYesFast cloud processing, large files

๐Ÿ“ Supported Formats

Audio: WAV, MP3, FLAC, OGG, M4A, AAC, WMA, OPUS
Video: MP4, MKV, AVI, MOV, WEBM, FLV (audio auto-extracted)
Output: TXT, SRT subtitles, WebVTT, JSON
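
The format lists above can be turned into a quick pre-flight check before a run is started. The following is a minimal sketch, not the Actor's own logic: the function name `classify_media_url` is hypothetical, and it assumes the media type can be inferred from the file extension in the URL path.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extensions copied from the Supported Formats lists above.
AUDIO_EXTENSIONS = {"wav", "mp3", "flac", "ogg", "m4a", "aac", "wma", "opus"}
VIDEO_EXTENSIONS = {"mp4", "mkv", "avi", "mov", "webm", "flv"}


def classify_media_url(url: str) -> str:
    """Return "audio", "video", or "unsupported" based on the URL's file extension.

    Hypothetical helper: the Actor may detect formats differently.
    """
    path = urlparse(url).path  # drops the query string and fragment
    extension = PurePosixPath(path).suffix.lower().lstrip(".")
    if extension in AUDIO_EXTENSIONS:
        return "audio"
    if extension in VIDEO_EXTENSIONS:
        return "video"
    return "unsupported"


print(classify_media_url("https://example.com/audio.mp3"))
print(classify_media_url("https://example.com/talk.MP4?download=1"))
```

A URL without a recognizable extension is reported as "unsupported" here, even though the Actor itself might still accept it.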

⚡ Quick Start

Via Apify Console

  1. Select an Engine (Whisper Local recommended)
  2. Paste the Input File URL (direct download link)
  3. Choose Language and Output Format
  4. Click Start

Via API

curl -X POST "https://api.apify.com/v2/acts/<ACTOR_ID>/runs?token=<TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{
    "engine": "whisper_local",
    "input_file_url": "https://example.com/audio.mp3",
    "language": "en",
    "whisper_model": "small",
    "output_format": "srt"
  }'
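
The same request can be prepared from Python using only the standard library. This is a sketch under assumptions: `build_run_request` is a hypothetical helper name, and `ACTOR_ID` and `TOKEN` are placeholders to be replaced with real values. The request is only constructed here; the commented lines show how it could be sent.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"

# Placeholders: substitute your own Actor ID and API token.
ACTOR_ID = "ACTOR_ID"
TOKEN = "TOKEN"


def build_run_request(actor_id: str, token: str, run_input: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request that starts an Actor run."""
    url = f"{API_BASE}/acts/{actor_id}/runs?token={token}"
    body = json.dumps(run_input).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


run_input = {
    "engine": "whisper_local",
    "input_file_url": "https://example.com/audio.mp3",
    "language": "en",
    "whisper_model": "small",
    "output_format": "srt",
}

request = build_run_request(ACTOR_ID, TOKEN, run_input)

# To actually start the run, send the request:
#   with urllib.request.urlopen(request) as response:
#       run = json.load(response)
```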

📥 Input

| Parameter | Type | Default | Description |
|---|---|---|---|
| engine | string | whisper_local | whisper_local, google, or whisper_api |
| input_file_url | string | - | Direct URL to media file |
| language | string | en | Language code or empty for auto-detect |
| whisper_model | string | small | tiny, base, small, medium, large |
| openai_api_key | string | - | OpenAI key (whisper_api only) |
| output_format | string | txt | txt, srt, vtt, json |
| max_file_size_mb | int | 500 | Max file size limit |
| google_chunk_seconds | int | 55 | Chunk size for Google engine |
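
The defaults and allowed values in the table can be applied client-side before a run is submitted. The sketch below is illustrative only: `prepare_input` is a hypothetical helper, and it assumes that `input_file_url` is required and that `openai_api_key` must be present when the engine is `whisper_api`. The table implies both but does not state them outright.

```python
# Defaults and allowed values copied from the Input table above.
DEFAULTS = {
    "engine": "whisper_local",
    "language": "en",
    "whisper_model": "small",
    "output_format": "txt",
    "max_file_size_mb": 500,
    "google_chunk_seconds": 55,
}

ALLOWED_VALUES = {
    "engine": {"whisper_local", "google", "whisper_api"},
    "whisper_model": {"tiny", "base", "small", "medium", "large"},
    "output_format": {"txt", "srt", "vtt", "json"},
}


def prepare_input(user_input: dict) -> dict:
    """Merge user values over the documented defaults and validate them."""
    merged = {**DEFAULTS, **user_input}

    # Assumption: a run cannot proceed without a media URL.
    if not merged.get("input_file_url"):
        raise ValueError("input_file_url is required")

    for key, allowed in ALLOWED_VALUES.items():
        if merged[key] not in allowed:
            raise ValueError(f"{key} must be one of {sorted(allowed)}, got {merged[key]!r}")

    # Assumption: the paid Whisper API engine needs an OpenAI key.
    if merged["engine"] == "whisper_api" and not merged.get("openai_api_key"):
        raise ValueError("openai_api_key is required for the whisper_api engine")

    return merged


print(prepare_input({"input_file_url": "https://example.com/audio.mp3"}))
```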

📤 Output

Dataset contains metadata:

{
  "status": "success",
  "engine": "whisper_local (small)",
  "language_detected": "en",
  "input_file": "interview.mp3",
  "output_url": "https://api.apify.com/.../interview_transcription.srt",
  "duration_audio_sec": 342.5,
  "processing_time_sec": 28.3,
  "character_count": 4521,
  "segment_count": 87,
  "text_preview": "First 500 characters of the transcription...",
  "message": "Transcribed with Whisper 'small' on CPU."
}
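
The timing fields in this record make it easy to judge how fast a run was relative to the audio length. The sketch below uses the sample values shown above. The function name `realtime_factor` is hypothetical, and the calculation is simply audio duration divided by processing time.

```python
# Subset of the sample dataset item shown above.
item = {
    "status": "success",
    "duration_audio_sec": 342.5,
    "processing_time_sec": 28.3,
    "character_count": 4521,
    "segment_count": 87,
}


def realtime_factor(dataset_item: dict) -> float:
    """Return how many seconds of audio were transcribed per second of processing."""
    return dataset_item["duration_audio_sec"] / dataset_item["processing_time_sec"]


if item["status"] == "success":
    print(f"Speed: {realtime_factor(item):.1f}x real time")
```

For the sample record this works out to roughly 12 times faster than real time.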

Key-Value Store contains:

  • interview_transcription.txt/srt/vtt/json: the output file (downloadable via output_url)
  • FULL_RESULT: complete JSON with full text + all segments
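
Records such as FULL_RESULT can be fetched through the Apify API's key-value store record endpoint. The sketch below only builds the URL. `record_url` is a hypothetical helper, and the store ID `abc123` is a placeholder for the run's default key-value store ID (`defaultKeyValueStoreId` in the run object).

```python
from urllib.parse import quote

API_BASE = "https://api.apify.com/v2"


def record_url(store_id: str, record_key: str) -> str:
    """Build the URL of a single record in an Apify key-value store."""
    # Percent-encode both parts so unusual characters cannot break the path.
    return (
        f"{API_BASE}/key-value-stores/{quote(store_id, safe='')}"
        f"/records/{quote(record_key, safe='')}"
    )


# "abc123" is a placeholder store ID, not a real one.
print(record_url("abc123", "FULL_RESULT"))
```

Depending on the store's access settings, an API token may need to be added to the request.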

๐ŸŒ Languages (45+)

English, Spanish, French, German, Italian, Portuguese, Chinese, Japanese, Korean, Arabic, Hindi, Russian, Turkish, Urdu, Dutch, Polish, Swedish, Danish, Finnish, Norwegian, Greek, Czech, Romanian, Hungarian, Thai, Vietnamese, Indonesian, Malay, Ukrainian, Bulgarian, Croatian, Slovak, Slovenian, Serbian, Hebrew, Bengali, Tamil, Telugu, Malayalam, Kannada, Marathi, Gujarati, Punjabi, Swahili, Afrikaans, Filipino.
