YouTube Transcript Extractor
Extract transcripts from YouTube videos in bulk. Supports channels, playlists, multiple languages. AI/RAG optimized.
Extracts transcripts and captions from YouTube videos in bulk. Outputs clean, structured text optimized for AI/RAG pipelines, content analysis, translation, and research.
What it does
Takes YouTube video URLs (or channel/playlist URLs), extracts available transcripts and captions, and outputs structured text with timestamps. Supports multiple languages and automatic/manual caption types.
Key Features
- Bulk extraction - Process hundreds of videos in one run
- Multi-language support - Extract transcripts in any available language
- Channel/Playlist support - Automatically discover videos from channels and playlists
- AI-ready output - Clean text format optimized for RAG, embeddings, and LLMs
- Timestamp preservation - Keep or remove timestamps based on your needs
- Chunking options - Split transcripts into configurable chunks for embedding pipelines
Input
| Field | Type | Default | Description |
|---|---|---|---|
| urls | string[] | required | YouTube video, channel, or playlist URLs |
| languages | string[] | ["en"] | Preferred transcript languages (ISO 639-1 codes) |
| includeAutoGenerated | boolean | true | Include auto-generated captions |
| includeTimestamps | boolean | true | Include start/duration timestamps |
| chunkSize | integer | 0 | Split transcript into chunks of N characters (0 = no chunking) |
| chunkOverlap | integer | 200 | Overlap between chunks in characters |
| outputFormat | string | "structured" | Output format: "structured", "plain_text", "srt", "vtt" |
| maxVideosPerChannel | integer | 50 | Max videos to process per channel/playlist |
| proxyConfiguration | object | {} | Apify proxy settings |
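As a rough illustration of how the fields above fit together, the sketch below merges caller overrides onto the documented defaults and runs a few client-side sanity checks. The helper name `build_input` and the checks themselves are assumptions made for this example; the Actor may validate its input differently.

```python
import copy
from typing import Any

# Defaults copied from the input table above.
DEFAULTS: dict[str, Any] = {
    "languages": ["en"],
    "includeAutoGenerated": True,
    "includeTimestamps": True,
    "chunkSize": 0,
    "chunkOverlap": 200,
    "outputFormat": "structured",
    "maxVideosPerChannel": 50,
    "proxyConfiguration": {},
}
OUTPUT_FORMATS = {"structured", "plain_text", "srt", "vtt"}


def build_input(urls: list[str], **overrides: Any) -> dict[str, Any]:
    """Hypothetical helper: build an input object from the documented fields."""
    if not urls:
        raise ValueError("urls is required and must not be empty")
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown input fields: {sorted(unknown)}")
    # Deep-copy the defaults so callers never mutate the shared lists/dicts.
    run_input = {"urls": list(urls), **copy.deepcopy(DEFAULTS), **overrides}
    if run_input["outputFormat"] not in OUTPUT_FORMATS:
        raise ValueError(f"unsupported outputFormat: {run_input['outputFormat']!r}")
    # Local sanity check only (an assumption, not a documented Actor rule).
    if run_input["chunkSize"] < 0 or run_input["chunkOverlap"] < 0:
        raise ValueError("chunkSize and chunkOverlap must be non-negative")
    return run_input
```

For instance, `build_input(["https://www.youtube.com/watch?v=VIDEO_ID_1"], chunkSize=1000)` yields an object with `chunkSize` set to 1000 and every other field at its table default.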
Output
Each video produces a dataset item:
{"videoId":"dQw4w9WgXcQ","videoUrl":"https://www.youtube.com/watch?v=dQw4w9WgXcQ","title":"Video Title","channelName":"Channel Name","language":"en","isAutoGenerated":false,"availableLanguages":["en","es","fr"],"transcript":[{"text":"Hello and welcome","start":0.0,"duration":2.5}],"fullText":"Hello and welcome to this video...","wordCount":1523,"duration":612.5,"chunks":[{"index":0,"text":"Hello and welcome to this video...","startTime":0.0,"endTime":45.2}],"extractedAt":"2026-02-23T07:00:00Z"}
Use Cases
- RAG Knowledge Bases - Build searchable knowledge bases from educational YouTube channels
- Content Research - Analyze competitor content, extract key topics
- Translation - Extract source text for translation workflows
- SEO Analysis - Analyze video content for keyword research
- Podcast Transcription - Many podcasts are uploaded to YouTube with captions
- Training Data - Collect text data for fine-tuning language models
- Accessibility - Generate text versions of video content
Pricing
Pay per result:
- $0.15 per 1,000 videos processed
- Free tier: 100 videos/month
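For budgeting, the rates above can be turned into a quick estimate. The sketch assumes the free 100 videos are deducted from the monthly total before billing and that charges are prorated per video; neither detail is stated here, so treat the result as approximate.

```python
PRICE_PER_1000_VIDEOS_USD = 0.15  # from the pricing list above
FREE_VIDEOS_PER_MONTH = 100       # from the pricing list above


def estimate_monthly_cost(videos: int) -> float:
    """Rough monthly cost in USD (assumes free videos are deducted first)."""
    if videos < 0:
        raise ValueError("videos must be non-negative")
    billable = max(0, videos - FREE_VIDEOS_PER_MONTH)
    return billable * PRICE_PER_1000_VIDEOS_USD / 1000


for count in (100, 1_100, 10_100):
    print(f"{count} videos: ${estimate_monthly_cost(count):.2f}")
```

Under these assumptions, 10,100 videos in a month leave 10,000 billable videos, or about $1.50.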
Example Usage
Extract transcripts from specific videos
{"urls":["https://www.youtube.com/watch?v=VIDEO_ID_1","https://www.youtube.com/watch?v=VIDEO_ID_2"],"languages":["en"],"outputFormat":"structured"}
Extract from a channel with chunking for RAG
{"urls":["https://www.youtube.com/@ChannelName"],"languages":["en"],"chunkSize":1000,"chunkOverlap":200,"maxVideosPerChannel":100,"outputFormat":"structured"}
Get plain text transcripts in multiple languages
{"urls":["https://www.youtube.com/playlist?list=PLAYLIST_ID"],"languages":["en","es","fr"],"includeTimestamps":false,"outputFormat":"plain_text"}
