Pricing
$3.00 / 1,000 results
YouTube Transcript Scraper
Extract YouTube transcript data: caption tracks, languages, and more. Scrape by video URL or ID. Export to JSON, CSV & Excel, use the API, schedule runs and integrate. No code required.
YouTube Transcript & Subtitle Scraper
youtube-transcript-scraper
Scrape YouTube transcripts and subtitles for any public video. Give one or more video IDs or URLs and get every available caption track (language, whether it is auto-generated (ASR) or human-authored, whether it is translatable, and the downloadable timedtext URL) together with the video's title, channel, length and view count.
Unofficial. This Actor is not affiliated with, authorized, or endorsed by YouTube or Google LLC. It is an independent tool that retrieves publicly available data via a third-party API. Use it in compliance with YouTube's Terms of Service and all applicable laws; you are responsible for how you use the retrieved data.
What it does
- Caption discovery: for each video, lists all caption tracks YouTube exposes (e.g. English, Spanish, auto-generated English), with the language code, a human-readable name, the `kind` (`asr` = auto-generated), `isTranslatable`, and the `transcriptUrl` (a YouTube `timedtext` URL).
- Video metadata: every item also carries the parent video's `videoTitle`, `channel`, `channelId`, `lengthSeconds`, `viewCount` and `shortDescription`.
- Filtering: keep only certain languages, or only auto-generated tracks.
- Transcript text (best effort): optionally tries to download and flatten the caption file into plain text. See the note below.
Input
| Field | Type | Default | Description |
|---|---|---|---|
| `videoIds` | string[] | `["dQw4w9WgXcQ"]` | Video IDs or full watch / youtu.be / shorts URLs. |
| `languageCodes` | string[] | `[]` | Keep only tracks whose language code matches (e.g. `en`, `es`). Empty = all. |
| `autoGeneratedOnly` | boolean | `false` | Keep only ASR (auto-generated) tracks. |
| `fetchTranscriptText` | boolean | `false` | Attempt to download the transcript text (best effort, see note). |
| `maxItems` | integer | `50` | Max total caption tracks across all videos. |
Example input
{"videoIds":["dQw4w9WgXcQ","https://www.youtube.com/watch?v=jNQXAC9IVRw"],"languageCodes":["en"],"autoGeneratedOnly":false,"fetchTranscriptText":true,"maxItems":100}
Output
One dataset item per caption track:
{"videoId":"dQw4w9WgXcQ","url":"https://www.youtube.com/watch?v=dQw4w9WgXcQ","videoTitle":"Rick Astley - Never Gonna Give You Up (Official Video) (4K Remaster)","channel":"Rick Astley","channelId":"UCuAXFkgsw1L7xaCfnd5JJOw","lengthSeconds":213,"viewCount":1779355962,"languageCode":"en","language":"English","kind":"asr","isAutoGenerated":true,"isTranslatable":true,"vssId":".en","transcriptUrl":"https://www.youtube.com/api/timedtext?v=dQw4w9WgXcQ&...","source":"video:dQw4w9WgXcQ"}
Notes
- Transcript text is best-effort. YouTube signs each `timedtext` URL against the IP that requested it, so a server-side download frequently returns an error. When `fetchTranscriptText` is enabled the Actor still tries, but `transcriptText` may come back empty. The `transcriptUrl` is always provided so you can fetch the caption file yourself (append `&fmt=json3`, `&fmt=srv3`, or `&fmt=vtt`) from the appropriate IP.
- Data is sourced live; YouTube / the upstream edge occasionally rate-limits, so the Actor retries transient blocks with exponential backoff.
- Video IDs are de-duplicated within a run.
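When `transcriptText` comes back empty, the fallback described in the first note can be sketched roughly as follows. The first helper appends the format parameter without re-encoding the signed query string. The second flattens a `json3` payload into plain text; the `events` / `segs` / `utf8` layout it reads is an assumption about that format, not something this README specifies, so verify it against a real response. Both function names are hypothetical.

```python
ALLOWED_FORMATS = {"json3", "srv3", "vtt"}


def with_format(transcript_url: str, fmt: str = "json3") -> str:
    """Append fmt=<fmt> to a timedtext URL, leaving the signed query untouched.

    Assumes the URL does not already carry an fmt parameter.
    """
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    separator = "&" if "?" in transcript_url else "?"
    return f"{transcript_url}{separator}fmt={fmt}"


def flatten_json3(payload: dict) -> str:
    """Join caption text from an assumed json3 structure into one string.

    Assumed layout: {"events": [{"tStartMs": 0, "segs": [{"utf8": "..."}]}]}
    """
    pieces = []
    for event in payload.get("events", []):
        segments = event.get("segs") or []
        text = "".join(seg.get("utf8", "") for seg in segments)
        text = text.replace("\n", " ").strip()
        if text:  # skip events that only carry line breaks or no text
            pieces.append(text)
    return " ".join(pieces)
```

The download itself (for instance with `urllib.request.urlopen(with_format(url))`) is left out on purpose, because it only succeeds from an IP the URL was signed for.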
Quick start
- Open the Actor and press Run; the default input works out of the box.
- Adjust the input fields to your target (video IDs or URLs) and set `maxItems` to cap spend.
- Grab results from the Dataset tab as JSON / CSV / Excel, or pull them via the Apify API and MCP from your own code.
No proxies to configure, no cookies to paste, no login: the Actor handles everything server-side.
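For pulling results from your own code, one option is Apify's synchronous run endpoint, which starts a run and returns the dataset items in a single call. The sketch below uses only the standard library. The Actor ID is an assumption inferred from the store username and the slug shown on this page, and the token is a placeholder; confirm both in the API tab before relying on them.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"
# Assumption: "<username>~<actor-name>", inferred from this page, not confirmed.
ACTOR_ID = "ethereal_wool~youtube-transcript-scraper"


def build_run_request(token: str, run_input: dict, actor_id: str = ACTOR_ID) -> urllib.request.Request:
    """Build (but do not send) a POST that runs the Actor and returns its items."""
    url = f"{API_BASE}/acts/{actor_id}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def fetch_items(request: urllib.request.Request, timeout: float = 300.0) -> list:
    """Send the request and decode the JSON array of dataset items."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Usage (requires a real token and network access):
# request = build_run_request("<YOUR_APIFY_TOKEN>", {"videoIds": ["dQw4w9WgXcQ"], "maxItems": 10})
# items = fetch_items(request)
```

The synchronous endpoint suits short runs; for large batches, starting the run asynchronously and reading the dataset afterwards is likely the safer pattern.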
Why developers pick this transcript scraper
Transcript actors are the picks-and-shovels of the AI boom, and most charge $10 per 1,000 videos or quietly fail on half their runs. This Actor fetches YouTube transcripts/captions via a direct HTTP API at $3 per 1,000 results, returned as timestamped segments plus a ready-to-use plain-text field. It's built for piping into LLMs: no HTML to clean, no SRT parsing, no browser.
What people build with it
- RAG knowledge bases: index transcripts of conference talks, tutorials and reviews so your assistant can cite video content like documents.
- Content repurposing: turn long-form videos into newsletters, blog posts and social threads with one LLM step on top of the transcript.
- Competitor channel analysis: what topics, hooks and phrases do the top channels in your niche actually use? Transcripts answer at scale.
- Compliance & moderation: audit what's being said in sponsored or branded videos without watching hours of footage.
- Subtitle workflows: timestamped segments drop straight into translation and dubbing pipelines.
- Research corpora: build searchable text datasets from playlists or whole channels.
Tips for better results
- Works with standard video URLs, Shorts URLs, or bare video IDs.
- Combine with YouTube Search or YouTube Channel Videos to discover videos first, then transcript them in bulk: a two-actor pipeline that turns any topic into a text corpus.
- Each segment carries `start` and `duration`, so you can deep-link to the exact second a phrase is spoken (`youtu.be/ID?t=123`).
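The deep-linking tip can be sketched as a small helper. It assumes `start` is expressed in seconds (possibly fractional) and that each segment also has a `text` field; the `text` key and both function names are hypothetical, since the Output example above does not show the segment structure.

```python
from typing import Iterable, List


def deep_link(video_id: str, start_seconds: float) -> str:
    """Build a youtu.be link that opens the video at the given second."""
    second = max(0, int(start_seconds))  # truncate to a whole, non-negative second
    return f"https://youtu.be/{video_id}?t={second}"


def links_for_phrase(video_id: str, segments: Iterable[dict], phrase: str) -> List[str]:
    """Return a deep link for every segment whose text contains the phrase."""
    needle = phrase.lower()
    return [
        deep_link(video_id, segment.get("start", 0))
        for segment in segments
        if needle in (segment.get("text") or "").lower()  # "text" key is assumed
    ]
```

Truncating rather than rounding the start time keeps the link from landing just after the phrase begins.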
Why this Actor
- Direct API, no headless browser: fast, stable runs with nothing to babysit.
- No login, no cookies: we never touch your accounts, so there's no ban risk.
- Fresh, real-time data: every run reads the source live, not a stale cache.
- Pay per result: you're billed only for the rows actually delivered.
- Structured JSON: export to CSV, Excel, or JSON, or pull straight from the API / MCP.
Use cases
- Build clean text corpora for LLM fine-tuning and RAG.
- Repurpose long video into blogs, summaries, and clips.
- Make video searchable and translatable at scale.
- Feed transcripts into topic modeling and keyword research.
FAQ
Do I need an account, cookies, or to log in anywhere?
No. The Actor talks to a fast, direct HTTP API server-side; you just provide inputs and run it.
How am I billed?
Pay-per-result: a fixed price per row returned, with no separate platform/compute charge. Caps like maxItems keep spend predictable.
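As a rough illustration of how `maxItems` bounds spend, the arithmetic can be written out. This sketch assumes the listed price of $3.00 per 1,000 results applies linearly to every row returned; the helper name is hypothetical, and actual billing is whatever the platform reports.

```python
from decimal import Decimal

# Listed price on this page: $3.00 per 1,000 results.
PRICE_PER_1000_RESULTS = Decimal("3.00")


def estimated_cost(results: int) -> Decimal:
    """Estimate the charge in USD for a number of returned rows."""
    if results < 0:
        raise ValueError("results must be non-negative")
    cost = PRICE_PER_1000_RESULTS * results / 1000
    return cost.quantize(Decimal("0.01"))  # round to whole cents
```

With the default `maxItems` of 50, the upper bound for one run under this assumption is `estimated_cost(50)`, i.e. 0.15 USD.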
Can I run it on a schedule or call it from my app?
Yes. Use Apify Schedules, the REST API, the JavaScript / Python clients, or the MCP server. See the API tab.
Is this affiliated with YouTube?
No. It's an independent tool that collects publicly available data. Use it in line with the platform's terms and applicable law.
More YouTube scrapers by us
- YouTube Search: Keyword video search · stats · channels
- YouTube Channel Videos: All videos for a channel · stats
- YouTube Channel Info: Channel profile · subs · about
- YouTube Comments: Video comments + replies
Browse the full fleet: https://apify.com/ethereal_wool
