YouTube Transcript Extractor - AI-Ready Subtitles
Status: under maintenance
Pricing: from $8.00 / 1,000 results
Extract clean subtitle/transcript text from any YouTube video with subtitles. Designed for AI training data pipelines, content analysis, and LLM training.
Features
- Input a YouTube URL or bare video ID
- Supports manual and auto-generated captions
- Multi-language: specify any ISO 639-1 language code (default: `en`)
- Optional `[MM:SS]` timestamps in output
- Clean, joined transcript format
- Rich metadata: video_id, duration, word count, language
- Robust error handling with descriptive error messages
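To make the timestamp and joined-transcript options concrete, here is a minimal sketch of how caption segments could be rendered. It is not the actor's actual code: `format_timestamp` and `render_transcript` are hypothetical helper names, and the segment shape (dicts with `text`, `start`, and `duration` keys) is an assumption modeled on the raw data that youtube_transcript_api returns.

```python
from typing import Iterable, Mapping


def format_timestamp(seconds: float) -> str:
    """Render a start offset in seconds as [MM:SS] (minutes may exceed 59)."""
    total = int(seconds)
    return f"[{total // 60:02d}:{total % 60:02d}]"


def render_transcript(segments: Iterable[Mapping], include_timestamps: bool = False) -> str:
    """Join caption segments into clean text.

    Assumption: each segment is a mapping with "text" and "start" keys.
    """
    lines = []
    for seg in segments:
        # Collapse newlines and repeated whitespace inside a single caption.
        text = " ".join(str(seg["text"]).split())
        if not text:
            continue
        if include_timestamps:
            lines.append(f"{format_timestamp(seg['start'])} {text}")
        else:
            lines.append(text)
    # With timestamps: one caption per line. Without: one flowing string.
    return "\n".join(lines) if include_timestamps else " ".join(lines)


segments = [
    {"text": "Hello\nworld", "start": 0.0, "duration": 1.5},
    {"text": "second  line", "start": 65.2, "duration": 2.0},
]
print(render_transcript(segments))
print(render_transcript(segments, include_timestamps=True))
```

Whether the actor puts each timestamped caption on its own line is a guess; the README only states that `[MM:SS]` is added before each subtitle line.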
Input
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| video_url | string | yes | - | YouTube URL (any format) or bare video ID |
| language | string | no | en | ISO 639-1 language code |
| include_timestamps | bool | no | false | Add [MM:SS] before each subtitle line |
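The input fields and their defaults can be illustrated with a small validation sketch. `normalize_input` is a hypothetical helper, not part of the actor, and it assumes that `video_url` is the only required field while the other two fall back to the defaults listed in the table.

```python
def normalize_input(raw: dict) -> dict:
    """Apply the documented defaults and basic type checks (hypothetical helper)."""
    video_url = raw.get("video_url")
    if not isinstance(video_url, str) or not video_url.strip():
        raise ValueError("video_url is required and must be a non-empty string")

    # Default taken from the input table: "en".
    language = raw.get("language", "en")
    if not isinstance(language, str) or not language.strip():
        raise ValueError("language must be a non-empty string such as 'en'")

    # Default taken from the input table: false.
    include_timestamps = raw.get("include_timestamps", False)
    if not isinstance(include_timestamps, bool):
        raise ValueError("include_timestamps must be a boolean")

    return {
        "video_url": video_url.strip(),
        "language": language.strip().lower(),
        "include_timestamps": include_timestamps,
    }


example = normalize_input({"video_url": "https://youtu.be/VIDEO_ID"})
print(example)
```

Lowercasing the language code is a design choice of this sketch; the actor may treat codes differently.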
Output (one item per run)
| Field | Type | Description |
|---|---|---|
| video_id | string | 11-char YouTube video ID |
| title | string | Video title (if retrievable) |
| duration | int | Approximate duration in seconds |
| language | string | Language code of the transcript |
| transcript_type | string | "manual" or "auto-generated" |
| transcript | string | Full clean text of the subtitles |
| word_count | int | Word count of the transcript |
| url | string | Full YouTube URL |
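As an illustration of how an item with these fields might be assembled, the sketch below builds one from caption segments. It rests on several assumptions: `build_output_item` is a hypothetical name, segments are dicts with `text`, `start`, and `duration` keys, the approximate duration is taken as the end time of the last caption, and words are counted by whitespace splitting. The actor may compute these values differently.

```python
from typing import Iterable, Mapping, Optional


def build_output_item(
    video_id: str,
    segments: Iterable[Mapping],
    language: str,
    is_generated: bool,
    title: Optional[str] = None,
) -> dict:
    """Assemble one output item (illustrative only, not the actor's code)."""
    segments = list(segments)
    # Normalize whitespace per caption and drop empty captions before joining.
    texts = [" ".join(str(s["text"]).split()) for s in segments]
    transcript = " ".join(t for t in texts if t)
    # Assumption: duration is approximated by the end of the last caption.
    duration = int(max((s["start"] + s["duration"] for s in segments), default=0))
    return {
        "video_id": video_id,
        "title": title,
        "duration": duration,
        "language": language,
        "transcript_type": "auto-generated" if is_generated else "manual",
        "transcript": transcript,
        "word_count": len(transcript.split()),
        "url": f"https://www.youtube.com/watch?v={video_id}",
    }


item = build_output_item(
    "abcDEF12345",  # placeholder ID, 11 characters
    [
        {"text": "Hello world", "start": 0.0, "duration": 1.5},
        {"text": "second line", "start": 65.0, "duration": 2.0},
    ],
    language="en",
    is_generated=True,
)
print(item["word_count"], item["duration"], item["transcript_type"])
```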
Supported URL formats
- https://www.youtube.com/watch?v=VIDEO_ID
- https://youtu.be/VIDEO_ID
- https://www.youtube.com/embed/VIDEO_ID
- https://www.youtube.com/shorts/VIDEO_ID
- Bare `VIDEO_ID` (11 characters)
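The formats above could be reduced to a video ID along the following lines. This is a sketch using only the Python standard library, not the actor's parser. `extract_video_id` is a hypothetical name, and the ID pattern (11 characters from letters, digits, `_`, and `-`) is an assumption based on how YouTube IDs commonly look.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Assumption: video IDs are exactly 11 characters from this alphabet.
_ID_RE = re.compile(r"[A-Za-z0-9_-]{11}")


def extract_video_id(value: str) -> Optional[str]:
    """Return the video ID for the formats listed above, or None if unrecognized."""
    value = value.strip()
    if _ID_RE.fullmatch(value):
        return value  # bare video ID

    parsed = urlparse(value if "://" in value else f"https://{value}")
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]

    candidate = None
    if host == "youtu.be":
        # https://youtu.be/VIDEO_ID
        candidate = parsed.path.lstrip("/").split("/")[0]
    elif host == "youtube.com":
        if parsed.path == "/watch":
            # https://www.youtube.com/watch?v=VIDEO_ID
            candidate = parse_qs(parsed.query).get("v", [None])[0]
        else:
            # https://www.youtube.com/embed/VIDEO_ID or /shorts/VIDEO_ID
            parts = parsed.path.strip("/").split("/")
            if len(parts) >= 2 and parts[0] in {"embed", "shorts"}:
                candidate = parts[1]

    if candidate and _ID_RE.fullmatch(candidate):
        return candidate
    return None


for sample in (
    "https://www.youtube.com/watch?v=abcDEF12345&t=10s",
    "https://youtu.be/abcDEF12345",
    "https://www.youtube.com/embed/abcDEF12345",
    "https://www.youtube.com/shorts/abcDEF12345",
    "abcDEF12345",
):
    print(extract_video_id(sample))
```

Returning `None` for unrecognized input keeps the decision about error messages with the caller, which fits the descriptive error handling mentioned under Features.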
Use Cases
- AI/LLM Training Data: collect natural language text from millions of YouTube videos
- Content Analysis: analyze video content at scale for SEO, research, or moderation
- Accessibility: extract captions for further processing or translation
- Dataset Building: build large text corpora from video subtitles
Built with youtube_transcript_api ❤️
