URL: https://apify.com/donjuan_mime/audio-video-to-text

Audio & Video to Text · Apify


Pricing

$5.00/month + usage

Audio & Video to Text

Transcribes video and audio files into plain text and subtitle formats (TXT, SRT, VTT, TSV, JSON) using OpenAI's Whisper model. Supports preloaded tiny, base, and small models.

Rating

0.0

(0)

Developer

πŸ‘ Donjuan

Donjuan

Maintained by Community

Actor stats

  • Bookmarked: 5
  • Total users: 95
  • Monthly active users: 0
  • Last modified: 10 months ago


🎬 Video and Audio to Text Transcription

🧠 Overview

This script runs on the Apify platform and uses OpenAI Whisper to transcribe audio or video files (for example, an MP4 file previously downloaded from YouTube) into plain text and subtitle formats (SRT, VTT, TSV, JSON).


📥 Input

Parameters

  • model (string): Whisper model to use. Available options:
    • tiny ✅ (pre-installed)
    • base ✅ (pre-installed)
    • small ✅ (pre-installed)
    • medium (requires download)
    • large (requires download)
    • turbo (requires download)

✅ Note: The tiny, base, and small models are already downloaded in the Docker image, so they start faster and work offline.

  • source_url (string): Direct URL to the video/audio file (e.g., an MP4 file hosted online).
    ⚠️ YouTube links are not supported directly. You must download the video first.

Example Input

{
  "model": "tiny",
  "source_url": "https://raw.githubusercontent.com/donjuanMime/audio_to_text/main/video.mp4"
}
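
The input rules above can be checked before a run is started. The following is a minimal sketch, not part of the actor itself: the helper name `build_run_input` and the hostname check for YouTube are assumptions made for illustration, while the model names and the two input fields come from the parameter list above.

```python
from urllib.parse import urlparse

# Model names as listed in the Parameters section.
PRELOADED_MODELS = {"tiny", "base", "small"}
DOWNLOADED_MODELS = {"medium", "large", "turbo"}


def build_run_input(model: str, source_url: str) -> dict:
    """Hypothetical helper: validate the two fields and return the input JSON."""
    if model not in PRELOADED_MODELS | DOWNLOADED_MODELS:
        raise ValueError(f"unknown Whisper model: {model!r}")

    parsed = urlparse(source_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("source_url must be a direct http(s) URL")

    # Assumption: treat the usual YouTube hostnames as unsupported page links.
    host = (parsed.hostname or "").lower()
    if host == "youtu.be" or host == "youtube.com" or host.endswith(".youtube.com"):
        raise ValueError("YouTube links are not supported directly; download the video first")

    return {"model": model, "source_url": source_url}
```

Choosing one of the pre-installed models (tiny, base, small) avoids the model download at run time.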

📤 Output

The output is a JSON array with one object, which includes multiple transcription formats:

  • json: Full Whisper output with segments, tokens, and metadata.
  • srt: SubRip subtitle format.
  • tsv: Tab-separated values (start, end, text), with start and end times in milliseconds.
  • txt: Plain text transcription.
  • vtt: WebVTT subtitle format.

Example Output (excerpt)

[
  {
    "json": "{ ... Whisper segment data ... }",
    "srt": "1\n00:00:00,000 --> 00:00:01,120\nWhat's your favorite drink?\n...",
    "tsv": "start\tend\ttext\n0\t1120\tWhat's your favorite drink?\n...",
    "txt": "What's your favorite drink?\nMy favorite drink is apple juice...\n",
    "vtt": "WEBVTT\n\n00:00.000 --> 00:01.120\nWhat's your favorite drink?\n..."
  }
]
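
Once the output object is available, each format can be written to its own file and the TSV can be turned into timed segments. This is a minimal sketch under two assumptions: every format arrives as a string (as in the excerpt above), and the TSV times are milliseconds (as the `0` / `1120` values suggest). The function names `save_formats` and `parse_tsv` are illustrative, not part of the actor.

```python
import csv
import io
from pathlib import Path

# Keys of the output object, as listed in the Output section.
FORMATS = ("json", "srt", "tsv", "txt", "vtt")


def save_formats(record: dict, out_dir: str, stem: str = "transcript") -> list:
    """Write each format present in the record to <out_dir>/<stem>.<format>."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    written = []
    for ext in FORMATS:
        if ext in record:
            path = target / f"{stem}.{ext}"
            # Assumption: each value is a string, as in the example output.
            path.write_text(record[ext], encoding="utf-8")
            written.append(path)
    return written


def parse_tsv(tsv_text: str) -> list:
    """Return (start_seconds, end_seconds, text) tuples from the tsv field."""
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    return [
        (int(row["start"]) / 1000, int(row["end"]) / 1000, row["text"])
        for row in reader
    ]
```

Disabling CSV quoting keeps apostrophes and quotation marks in the transcript text from being interpreted as field delimiters.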

πŸ› οΈ How to Use

  1. Go to your Apify dashboard and create a new actor or task.
  2. Paste this script into the actor's source.
  3. Provide the input in the required JSON format (see above).
  4. Run the actor. It will download the media file, process it using Whisper, and return transcription in multiple formats.
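
Steps 3 and 4 can also be driven from code instead of the dashboard. The sketch below assumes the `apify-client` Python package, the published actor ID `donjuan_mime/audio-video-to-text` (taken from this page's URL), and that the actor stores its result in the run's default dataset; none of these is confirmed by the README, so adjust them to your setup.

```python
def run_transcription(client, run_input: dict,
                      actor_id: str = "donjuan_mime/audio-video-to-text") -> list:
    """Start the actor, wait for it to finish, and return the output items.

    `client` is expected to behave like apify_client.ApifyClient.
    Assumption: the transcription ends up in the run's default dataset.
    """
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


# Example usage (requires `pip install apify-client` and an API token):
#
#   from apify_client import ApifyClient
#   client = ApifyClient("<YOUR_APIFY_TOKEN>")
#   items = run_transcription(client, {
#       "model": "tiny",
#       "source_url": "https://raw.githubusercontent.com/donjuanMime/audio_to_text/main/video.mp4",
#   })
#   print(items[0]["txt"])
```

Passing the client in as an argument keeps the function easy to exercise with a stub, without network access or a token.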

⚠️ Disclaimer

This script is provided "as is", without warranties of any kind. Use it at your own risk. Ensure compliance with:

  • YouTube's Terms of Service (if downloading/transcribing from YouTube).
  • Local and international copyright laws.

Let me know if you'd like the actual Apify actor code or instructions on downloading YouTube videos as .mp4 files to use with this.
