YouTube Transcript Scraper-m2
Deprecated. Extracts full transcripts from YouTube videos using Crawlee. Provide a video URL or ID.
Pricing
Pay per usage
YouTube Transcript Scraper
Extracts transcripts from YouTube videos using Crawlee + Playwright.
Works locally and deploys directly to Apify with zero changes.
How it works
Three-strategy waterfall that stops at the first success:
- ytInitialPlayerResponse: parses the JS blob YouTube embeds in every page, extracts the caption track URL, and fetches it directly (fastest, no UI interaction needed)
- Network intercept: opens the transcript panel and captures the /api/timedtext response as it fires
- DOM scraping: reads the rendered transcript panel segments as a last resort
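The waterfall itself can be sketched as a small loop. This is a minimal illustration, not the Actor's actual source: the `runWaterfall` name, the `{ name, run }` strategy shape, and the returned `errors` list are all assumptions made for the example.

```javascript
// Try each strategy in order and stop at the first one that yields segments.
// The { name, run } shape is hypothetical; each run() stands in for one of the
// three strategies above and resolves to an array of transcript segments.
async function runWaterfall(strategies, context) {
  const errors = [];
  for (const { name, run } of strategies) {
    try {
      const segments = await run(context);
      if (Array.isArray(segments) && segments.length > 0) {
        return { source: name, segments, errors };
      }
      errors.push({ name, reason: 'no segments returned' });
    } catch (err) {
      // A failing strategy is recorded and the next one is tried.
      errors.push({ name, reason: err.message });
    }
  }
  return { source: null, segments: [], errors };
}
```

Collecting the failure reasons instead of discarding them makes it easier to tell "captions disabled" apart from "page layout changed" when all three strategies come back empty.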
No third-party transcript libraries, so no proxy issues and no IP blocks from PyPI-style packages.
Local usage
1. Install
    npm install
    npx playwright install chromium   # one-time browser download
2. Run
    # Full URL
    YT_VIDEO_URL="https://www.youtube.com/watch?v=dQw4w9WgXcQ" npm start
    # Short URL
    YT_VIDEO_URL="https://youtu.be/dQw4w9WgXcQ" npm start
    # Bare video ID
    YT_VIDEO_ID="dQw4w9WgXcQ" npm start
Output is printed to the console and saved to ./storage/datasets/default/.
Apify deployment
Option A: Apify CLI
    npm install -g apify-cli
    apify login
    apify push
Then run the Actor from the Apify Console with input:
{"videoUrl":"https://www.youtube.com/watch?v=dQw4w9WgXcQ"}
Option B: Apify Console UI
- Create a new Actor and choose "Empty project"
- Upload this folder (or connect your GitHub repo)
- Set the Dockerfile path to Dockerfile
- Build and run
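Because the same code runs locally and on Apify, the video reference can arrive either as Actor input (`videoUrl`) or through the `YT_VIDEO_URL` / `YT_VIDEO_ID` environment variables. One way to reconcile the two is sketched below. The `resolveVideoReference` helper and the precedence order (Actor input first, then the URL variable, then the ID variable) are assumptions for illustration, not documented behavior.

```javascript
// Resolve the video reference from Actor input or environment variables.
// The precedence order is an assumption made for this sketch.
function resolveVideoReference(actorInput, env) {
  const candidates = [
    actorInput && actorInput.videoUrl,
    env.YT_VIDEO_URL,
    env.YT_VIDEO_ID,
  ];
  const found = candidates.find((v) => typeof v === 'string' && v.trim() !== '');
  if (!found) {
    throw new Error('Provide videoUrl as Actor input, or set YT_VIDEO_URL or YT_VIDEO_ID');
  }
  return found.trim();
}
```

Locally this would be called as `resolveVideoReference(null, process.env)`; on the platform the first argument would be whatever input object the Actor received.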
Output schema
Each run saves one JSON object to the dataset:
{
  "videoId": "dQw4w9WgXcQ",
  "url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
  "title": "Rick Astley - Never Gonna Give You Up",
  "language": "en",
  "source": "json3",
  "segmentCount": 142,
  "segments": [
    {
      "start": "0:00",
      "startMs": 0,
      "duration": "0:03",
      "durationMs": 3000,
      "text": "We're no strangers to love"
    },
    ...
  ],
  "fullText": "We're no strangers to love ...",
  "fetchedAt": "2024-01-01T00:00:00.000Z"
}
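To make the relationship between the fields concrete, here is a rough sketch of how raw caption events could be turned into the `segments`, `segmentCount`, and `fullText` fields. It assumes the json3 caption format delivers events shaped like `{ tStartMs, dDurationMs, segs: [{ utf8 }] }`. The helper names and the `h:mm:ss` formatting for videos longer than an hour are illustrative assumptions, not taken from the Actor's source.

```javascript
// Format milliseconds as m:ss (e.g. 3000 -> "0:03"); h:mm:ss past one hour (assumed).
function formatTimestamp(ms) {
  const totalSeconds = Math.floor(ms / 1000);
  const hours = Math.floor(totalSeconds / 3600);
  const minutes = Math.floor((totalSeconds % 3600) / 60);
  const ss = String(totalSeconds % 60).padStart(2, '0');
  if (hours > 0) {
    return `${hours}:${String(minutes).padStart(2, '0')}:${ss}`;
  }
  return `${minutes}:${ss}`;
}

// Convert json3-style events (assumed shape) into the segment objects shown above.
function toSegments(events) {
  return events
    .filter((e) => Array.isArray(e.segs))
    .map((e) => ({
      startMs: e.tStartMs ?? 0,
      durationMs: e.dDurationMs ?? 0,
      // Join the text fragments and collapse whitespace and newlines.
      text: e.segs.map((s) => s.utf8 ?? '').join('').replace(/\s+/g, ' ').trim(),
    }))
    .filter((s) => s.text !== '') // drop events that only carry line breaks
    .map((s) => ({
      start: formatTimestamp(s.startMs),
      startMs: s.startMs,
      duration: formatTimestamp(s.durationMs),
      durationMs: s.durationMs,
      text: s.text,
    }));
}

// Assemble one dataset record; meta carries videoId, url, title, language, source.
function buildRecord(meta, events) {
  const segments = toSegments(events);
  return {
    ...meta,
    segmentCount: segments.length,
    segments,
    fullText: segments.map((s) => s.text).join(' '),
    fetchedAt: new Date().toISOString(),
  };
}
```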
Supported input formats
| Format | Example |
|---|---|
| Full watch URL | https://www.youtube.com/watch?v=dQw4w9WgXcQ |
| Short URL | https://youtu.be/dQw4w9WgXcQ |
| Shorts URL | https://www.youtube.com/shorts/dQw4w9WgXcQ |
| Embed URL | https://www.youtube.com/embed/dQw4w9WgXcQ |
| Bare ID | dQw4w9WgXcQ |
| With timestamp | https://youtu.be/dQw4w9WgXcQ?t=42 |
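All of the formats in the table reduce to the same video ID. A minimal sketch of that normalization is shown below, using only the built-in `URL` class. The `extractVideoId` name and the 11-character ID pattern are assumptions for the example; the Actor's own parsing may differ.

```javascript
// YouTube video IDs are assumed here to be 11 characters from [A-Za-z0-9_-].
const VIDEO_ID_PATTERN = /^[A-Za-z0-9_-]{11}$/;

// Return the video ID for any of the supported input formats, or null.
function extractVideoId(input) {
  const value = String(input).trim();
  if (VIDEO_ID_PATTERN.test(value)) return value; // bare ID

  let url;
  try {
    url = new URL(value);
  } catch {
    return null; // neither a bare ID nor a parseable URL
  }

  const host = url.hostname.replace(/^(www\.|m\.)/, '');
  let candidate = null;
  if (host === 'youtu.be') {
    candidate = url.pathname.split('/')[1]; // https://youtu.be/<id>
  } else if (host === 'youtube.com') {
    if (url.pathname === '/watch') {
      candidate = url.searchParams.get('v'); // /watch?v=<id>
    } else {
      const [, kind, id] = url.pathname.split('/');
      if (kind === 'shorts' || kind === 'embed') candidate = id; // /shorts/<id>, /embed/<id>
    }
  }
  return candidate && VIDEO_ID_PATTERN.test(candidate) ? candidate : null;
}
```

Query parameters such as `?t=42` are ignored because only the path segment or the `v` parameter is read.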
Troubleshooting
No transcript found: The video may have captions disabled or only auto-generated captions in a non-English language. The scraper prefers English tracks but will fall back to the first available track.
Playwright browser not found: Run npx playwright install chromium.
Rate limiting: Add a delay between runs or use Apify's built-in proxy pool when deploying.
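When debugging a "No transcript found" result, it can help to see how the track preference might work. The sketch below assumes the caption tracks sit at `captions.playerCaptionsTracklistRenderer.captionTracks` inside ytInitialPlayerResponse, and that each track looks like `{ baseUrl, languageCode, kind }` with `kind === 'asr'` marking auto-generated captions. Both the property path and the choice to rank manual English above auto-generated English are assumptions, not confirmed details of this Actor.

```javascript
// Read the caption track list from a parsed ytInitialPlayerResponse object.
// The property path is an assumption about YouTube's page data.
function getCaptionTracks(playerResponse) {
  return playerResponse?.captions?.playerCaptionsTracklistRenderer?.captionTracks ?? [];
}

// Prefer manual English, then auto-generated English, then the first track.
function pickCaptionTrack(tracks) {
  if (!Array.isArray(tracks) || tracks.length === 0) return null;
  const isEnglish = (t) =>
    typeof t.languageCode === 'string' && t.languageCode.toLowerCase().startsWith('en');
  return (
    tracks.find((t) => isEnglish(t) && t.kind !== 'asr') ||
    tracks.find(isEnglish) ||
    tracks[0]
  );
}
```

A `null` result here corresponds to the "captions disabled" case: the page loaded, but no track was listed at all.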
