Pricing
from $0.08 / 1,000 transcripts extracted
YouTube Transcript Scraper
Extract clean transcript text, timestamps, captions, and public video metadata from YouTube URLs or video IDs for AI, SEO, and research workflows.
Extract transcripts and timestamped caption segments from public YouTube videos. Export text for summaries, RAG, content research, and AI-agent workflows.
Choose the right YouTube actor
| Need | Use |
|---|---|
| Viewer feedback and replies | YouTube Comments Scraper |
| Recent uploads from a channel | YouTube Channel Videos Scraper |
| Caption text for summaries, RAG, or research | YouTube Transcript Scraper |
What does YouTube Transcript Scraper do?
YouTube Transcript Scraper turns public YouTube captions into structured data.
It accepts YouTube watch URLs, Shorts URLs, youtu.be links, embed URLs, live URLs, or raw video IDs.
For every video, it returns a dataset row with the video ID, URL, title, channel name, caption language, transcript text, and optional timestamped segments.
Videos without public captions are handled gracefully with a clear error message instead of failing the whole run.
Who is it for?
SEO and content teams
Use transcripts to repurpose videos into briefs, blog drafts, quote libraries, keyword research, and content audits.
Researchers and analysts
Collect spoken content from public videos for media monitoring, qualitative research, public-interest analysis, or education datasets.
LLM and RAG builders
Create clean text chunks from public video captions for search, summarization, classification, embeddings, and retrieval workflows.
Sales and marketing teams
Extract talks, interviews, demos, webinars, and competitor videos into searchable text for faster review.
Journalists and fact checkers
Create searchable transcript records for public videos, speeches, interviews, and announcements.
Why use this actor?
- Structured output with one row per video
- Transcript text plus timestamped segments
- Public caption language selection
- Graceful handling for videos with no captions
- Works with video URLs and video IDs
- Low-cost runs for captioned public videos
- Output ready for spreadsheets, APIs, and AI workflows
What data can I extract?
| Field | Description |
|---|---|
| videoId | YouTube video ID |
| videoUrl | Canonical watch URL |
| title | Public video title when available |
| channelName | Public channel name when available |
| language | Caption language used |
| isAutoGenerated | Whether the selected captions appear auto-generated |
| transcriptText | Full transcript as one clean text string |
| segments | Timestamped transcript segments |
| duration | Video duration in seconds when available |
| thumbnailUrl | Public thumbnail URL when available |
| captionsAvailable | Whether public caption tracks were found |
| error | Explanation for unavailable captions or failed videos |
How much does it cost to extract YouTube transcripts?
The actor uses pay-per-event pricing.
There is a small start event for each run and a per-transcript event for every successful transcript extracted.
A first test with one or two videos is inexpensive.
For high-volume work, run batches of known captioned public videos to keep cost predictable.
Final tiered pricing is set on Apify before publication and is visible on the actor page.
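As a rough illustration of the pay-per-event arithmetic, the sketch here multiplies the number of successful transcripts by a per-1,000 rate and adds a start fee. The $0.08 rate is the listed "from" price. The start-event price is not stated on this page, so `start_fee` is a hypothetical placeholder that defaults to zero, and `estimate_cost` is an illustrative name rather than part of the actor.

```python
from decimal import Decimal


def estimate_cost(successful_transcripts: int,
                  price_per_1000: Decimal = Decimal("0.08"),
                  start_fee: Decimal = Decimal("0")) -> Decimal:
    """Estimate the cost of one run in USD.

    price_per_1000 is the listed "from" price. start_fee is a placeholder
    (assumption), because the start-event price is not given on this page.
    """
    if successful_transcripts < 0:
        raise ValueError("successful_transcripts must be non-negative")
    per_transcript = price_per_1000 / Decimal(1000)
    return start_fee + per_transcript * successful_transcripts


# Example: 2,500 successful transcripts at the listed "from" rate.
print(estimate_cost(2500))
```

Only successful transcripts are counted, which matches the per-transcript event described above. Check the actor page for the actual tiered prices before budgeting.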
How to use YouTube Transcript Scraper
- Open the actor on Apify.
- Paste one or more public YouTube video URLs.
- Optionally set a preferred caption language such as en, es, or de.
- Choose whether to include timestamped segments.
- Click Start.
- Download the dataset as JSON, CSV, Excel, XML, or HTML.
Quick start example
{"videoUrls":[{"url":"https://www.youtube.com/watch?v=dQw4w9WgXcQ"}],"language":"en","includeTimestamps":true,"includeMetadata":true,"maxVideos":1}
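When the URL list comes from a spreadsheet or script, the same input object can be assembled in code. This is a minimal sketch that mirrors the field names in the quick start example. `build_input` is a hypothetical helper, and defaulting `maxVideos` to the number of URLs is an assumption, not documented actor behavior.

```python
import json


def build_input(urls, language="en", include_timestamps=True,
                include_metadata=True, max_videos=None):
    """Build an input object shaped like the quick start example."""
    # Drop empty entries and surrounding whitespace from pasted lists.
    cleaned = [u.strip() for u in urls if u and u.strip()]
    if not cleaned:
        raise ValueError("at least one URL or video ID is required")
    return {
        "videoUrls": [{"url": u} for u in cleaned],
        "language": language,
        "includeTimestamps": include_timestamps,
        "includeMetadata": include_metadata,
        # Assumption: cap the run at the number of inputs unless told otherwise.
        "maxVideos": max_videos if max_videos is not None else len(cleaned),
    }


print(json.dumps(build_input(["https://www.youtube.com/watch?v=dQw4w9WgXcQ"])))
```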
Example output
{"videoId":"dQw4w9WgXcQ","videoUrl":"https://www.youtube.com/watch?v=dQw4w9WgXcQ","title":"Example video title","channelName":"Example channel","language":"en","isAutoGenerated":false,"transcriptText":"We're no strangers to love...","segments":[{"start":0.0,"duration":2.1,"text":"We're no strangers to love"}],"duration":213,"thumbnailUrl":"https://i.ytimg.com/...","captionsAvailable":true}
Supported YouTube URL formats
- https://www.youtube.com/watch?v=VIDEO_ID
- https://youtu.be/VIDEO_ID
- https://www.youtube.com/shorts/VIDEO_ID
- https://www.youtube.com/embed/VIDEO_ID
- https://www.youtube.com/live/VIDEO_ID
- Raw VIDEO_ID values
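To pre-validate inputs before a run, the listed formats can be normalized to a video ID on the client side. The sketch here is not the actor's own parser. It assumes video IDs are 11 characters drawn from letters, digits, `-`, and `_`, and `extract_video_id` is a hypothetical helper name.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Assumption: YouTube video IDs are 11 characters from [A-Za-z0-9_-].
_ID_RE = re.compile(r"[A-Za-z0-9_-]{11}")


def extract_video_id(value: str) -> Optional[str]:
    """Return the video ID for a supported URL or raw ID, else None."""
    value = value.strip()
    if _ID_RE.fullmatch(value):
        return value  # already a raw video ID

    parsed = urlparse(value)  # URLs without a scheme have no hostname here
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]

    parts = [p for p in parsed.path.split("/") if p]
    candidate = None
    if host == "youtu.be" and parts:
        candidate = parts[0]
    elif host == "youtube.com":
        if parts == ["watch"]:
            candidate = parse_qs(parsed.query).get("v", [None])[0]
        elif len(parts) >= 2 and parts[0] in ("shorts", "embed", "live"):
            candidate = parts[1]

    if candidate and _ID_RE.fullmatch(candidate):
        return candidate
    return None


for sample in ("https://youtu.be/dQw4w9WgXcQ", "dQw4w9WgXcQ", "not a video"):
    print(sample, "->", extract_video_id(sample))
```

Values that return `None` can be dropped or reviewed before they are sent to the actor.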
Caption language selection
Set language to your preferred caption language code.
If that exact language is not available, the actor falls back to a related language variant or the first public caption track.
For example, en may match English captions when present.
The selected language is returned in the language field.
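The fallback order described here (exact match, then a related variant, then the first public track) can be pictured with a short sketch. It is an illustration only: the actor's real matching rules are not documented on this page, "related variant" is assumed to mean a code sharing the same primary subtag (for example `en` and `en-GB`), and `pick_caption_language` is a hypothetical name.

```python
from typing import Optional, Sequence


def pick_caption_language(requested: str,
                          available: Sequence[str]) -> Optional[str]:
    """Choose a caption track code using a three-step fallback."""
    if not available:
        return None  # no public caption tracks at all

    wanted = requested.strip().lower()

    # 1. Exact match on the requested code.
    for code in available:
        if code.lower() == wanted:
            return code

    # 2. Related variant: same primary subtag (assumed interpretation).
    primary = wanted.split("-")[0]
    for code in available:
        if code.lower().split("-")[0] == primary:
            return code

    # 3. Fall back to the first public caption track.
    return available[0]


print(pick_caption_language("en", ["de", "en-GB"]))
```

Comparing the requested code with the returned `language` field shows whether a fallback happened for a given row.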
Timestamped transcript segments
Enable includeTimestamps to receive segment-level timing.
Each segment can include:
- start: segment start in seconds
- duration: segment length in seconds
- text: spoken caption text
Disable timestamps when you only need the combined transcript text.
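For quote tables or review documents, segments are often easier to read as timestamped lines. A minimal sketch, assuming each segment carries the `start` and `text` keys shown in the example output; `format_timestamp` and `segments_to_lines` are hypothetical helper names.

```python
def format_timestamp(seconds: float) -> str:
    """Format seconds as MM:SS, or H:MM:SS for an hour or more."""
    total = int(seconds)  # drop the fractional part
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes:02d}:{secs:02d}"


def segments_to_lines(segments):
    """Turn segment dicts into '[MM:SS] text' lines."""
    return [f"[{format_timestamp(s['start'])}] {s['text']}" for s in segments]


segments = [{"start": 0.0, "duration": 2.1, "text": "We're no strangers to love"}]
for line in segments_to_lines(segments):
    print(line)
```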
Metadata fields
Enable includeMetadata to include public video details when available.
Metadata can include title, channel name, duration, and thumbnail URL.
Some unavailable or restricted videos may return less metadata.
Handling videos with no captions
Not every public YouTube video has public captions.
When captions are unavailable, the actor still saves a row with:
- captionsAvailable: false
- the video ID and URL
- an error message explaining what happened
This makes batch runs easier to audit because one bad video does not stop the rest of the run.
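When auditing a batch, it helps to separate usable rows from rows that came back without captions. The sketch here relies only on the `captionsAvailable`, `transcriptText`, and `error` fields described on this page. `split_rows` is a hypothetical helper, and the sample rows are made up for illustration.

```python
def split_rows(rows):
    """Split dataset rows into (usable, failed) lists."""
    usable, failed = [], []
    for row in rows:
        # A row is usable only if captions were found and text is non-empty.
        if row.get("captionsAvailable") and row.get("transcriptText"):
            usable.append(row)
        else:
            failed.append(row)
    return usable, failed


# Illustrative rows, not real actor output.
rows = [
    {"videoId": "dQw4w9WgXcQ", "captionsAvailable": True,
     "transcriptText": "We're no strangers to love..."},
    {"videoId": "VIDEO_ID", "captionsAvailable": False,
     "error": "No public captions found"},
]
usable, failed = split_rows(rows)
for row in failed:
    print(row["videoId"], "->", row.get("error", "unknown error"))
```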
Tips for best results
- Use public videos with captions enabled.
- Start with a small batch to confirm your input format.
- Use language when you need a specific caption language.
- Keep maxVideos low for testing and increase it after validation.
- Check captionsAvailable before using transcript text in automated workflows.
AI-agent recipe
Goal: turn public webinar or video captions into a summary and quote table.
Input:
{"videoUrls":[{"url":"https://www.youtube.com/watch?v=dQw4w9WgXcQ"}],"language":"en","includeTimestamps":true,"includeMetadata":true,"maxVideos":1}
Prompt to use with Claude / ChatGPT / MCP: "Extract transcripts from these public YouTube URLs. Create a table with video URL, key claims, notable quotes, and timestamps."
Follow-up automation:
- Schedule: Run after each new public webinar, podcast, or product video is published.
- Export: Docs, Sheets, a vector DB, or Slack summaries for content review.
- Guardrail: Use only public caption tracks returned by the actor; respect copyright and platform terms.
Integrations
You can connect the dataset to:
- Google Sheets for editorial review
- Zapier or Make for automations
- Vector databases for embeddings and retrieval
- BI tools for media analysis
- Internal dashboards for monitoring public video content
- LLM workflows for summarization, tagging, and question answering
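For the vector database and LLM integrations, transcripts usually need to be split into chunks first. One possible approach, sketched under assumptions: group consecutive segments until a character budget is reached and keep the first segment's start time with each chunk. The 1,000-character default and the name `chunk_segments` are illustrative choices, not part of the actor.

```python
def chunk_segments(segments, max_chars=1000):
    """Group consecutive segments into chunks of at most max_chars.

    Each chunk keeps the start time of its first segment. A single
    segment longer than max_chars is kept whole rather than split.
    """
    chunks = []
    texts, start, length = [], None, 0
    for seg in segments:
        text = seg["text"].strip()
        if not text:
            continue
        added = len(text) + (1 if texts else 0)  # +1 for the joining space
        if texts and length + added > max_chars:
            chunks.append({"start": start, "text": " ".join(texts)})
            texts, start, length = [], None, 0
            added = len(text)
        if start is None:
            start = seg["start"]
        texts.append(text)
        length += added
    if texts:
        chunks.append({"start": start, "text": " ".join(texts)})
    return chunks


demo = [
    {"start": 0.0, "text": "aaaa"},
    {"start": 1.0, "text": "bbbb"},
    {"start": 2.0, "text": "cccc"},
]
print(chunk_segments(demo, max_chars=9))
```

Keeping the start time per chunk lets a retrieval result link back to the matching moment in the video.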
API usage with Node.js
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: process.env.APIFY_TOKEN });

const run = await client.actor('fetch_cat/youtube-transcript-scraper').call({
    videoUrls: [{ url: 'https://www.youtube.com/watch?v=dQw4w9WgXcQ' }],
    language: 'en',
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items[0].transcriptText);
API usage with Python
import os

from apify_client import ApifyClient

client = ApifyClient(os.environ['APIFY_TOKEN'])

run = client.actor('fetch_cat/youtube-transcript-scraper').call(run_input={
    'videoUrls': [{'url': 'https://www.youtube.com/watch?v=dQw4w9WgXcQ'}],
    'language': 'en',
})

items = client.dataset(run['defaultDatasetId']).list_items().items
print(items[0].get('transcriptText'))
API usage with cURL
curl -X POST "https://api.apify.com/v2/acts/fetch_cat~youtube-transcript-scraper/runs?token=$APIFY_TOKEN" \
  -H 'Content-Type: application/json' \
  -d '{"videoUrls":[{"url":"https://www.youtube.com/watch?v=dQw4w9WgXcQ"}],"language":"en"}'
MCP and AI agent usage
Use this actor through Apify MCP when you want an AI assistant to fetch public video transcripts.
MCP server URL pattern:
https://mcp.apify.com/?tools=fetch_cat/youtube-transcript-scraper
Claude Code setup:
$ claude mcp add apify-youtube-transcripts --transport http --url "https://mcp.apify.com/?tools=fetch_cat/youtube-transcript-scraper"
Claude Desktop JSON config:
{"mcpServers":{"apify-youtube-transcripts":{"url":"https://mcp.apify.com/?tools=fetch_cat/youtube-transcript-scraper"}}}
Example prompts:
- "Extract the transcript from this public YouTube video and summarize the key claims."
- "Get transcripts for these five public webinar URLs and make a topic table."
- "Find quotes in this public interview transcript about pricing."
Common use cases
- Video-to-blog repurposing
- Public webinar transcript extraction
- Research corpus creation
- Podcast-style YouTube episode analysis
- Competitive content monitoring
- Training data preparation from public captions
- Subtitle QA and language availability checks
Limitations
This actor extracts public captions only.
It cannot access private videos, members-only videos, deleted videos, region-blocked content unavailable to the runner, or videos without public caption tracks.
Transcript quality depends on the caption track provided for the public video.
Auto-generated captions may contain recognition errors.
Legality and responsible use
Use this actor only for content you are allowed to access and process.
YouTube videos and captions may be protected by copyright or platform terms.
You are responsible for ensuring that your use case, storage, redistribution, and analysis comply with applicable laws, platform rules, and rights-holder requirements.
FAQ
Does it work without a YouTube account?
Yes, the actor is designed for public videos and public caption tracks.
Can it extract transcripts from private videos?
No. Private, members-only, deleted, or otherwise inaccessible videos are outside scope.
Why did a video return captionsAvailable=false?
The video may not have public captions, the video may be unavailable, or YouTube may not expose captions for that video.
Can I choose a language?
Yes. Use the language input with a language code such as en, es, fr, or de.
Are timestamps included?
Yes, when includeTimestamps is enabled.
Why are captions imperfect?
Some videos use auto-generated captions. These can include speech-recognition mistakes.
Troubleshooting
My run succeeded but transcript text is empty
Check the error and captionsAvailable fields. The video probably has no public caption track.
My preferred language was not returned
The requested language may not be available. The actor falls back to another public caption track when needed.
Related scrapers
- Reddit Scraper: add public discussion context around video topics.
- Apple App Store Reviews Scraper: compare video claims with public product feedback.
- Y Combinator Companies Scraper: research companies mentioned in public startup videos.
Changelog
0.1
Initial version with public YouTube transcript extraction, timestamped segments, language selection, metadata fields, and graceful no-caption handling.
Support
If a public captioned video fails unexpectedly, provide the video URL, input JSON, and run ID so the issue can be reproduced.
