URL: https://apify.com/parseforge/youtube-transcript-scraper

YouTube Transcript Scraper - Subtitles at Scale · Apify


Pricing

Pay per event

Youtube Transcript Scraper

Pull transcripts from any YouTube video at scale! Extract full subtitles with timestamps in SRT and plain text, plus titles, channels, descriptions, view counts, upload dates, tags, and thumbnails. Perfect for content research, SEO, summarization, and video analytics. Start extracting today!

Rating: 0.0 (0)

Developer: ParseForge

Maintained by Community

Actor stats: 0 bookmarked · 9 total users · 1 monthly active user · last modified 24 days ago

🎬 YouTube Transcript Scraper

🚀 Extract full transcripts from any YouTube video in seconds. Timestamped segments, SRT export, and 17 metadata fields (title, channel, views, likes, upload date, tags, thumbnail) per video. No API key, no registration, no YouTube Data API quota.

🕒 Last updated: 2026-04-24 · 📊 17 fields per video · 🌐 100+ languages · ⚡ 30 videos in parallel · 📜 SRT + plain text output

The YouTube Transcript Scraper turns any YouTube URL into a structured record with the full transcript, segment timestamps, and 17 metadata fields. It handles human-authored captions and auto-generated ones across 100+ languages. Each record ships with the plain-text transcript, an SRT file for subtitle overlay, and a segment array for timestamp-precise search.

Metadata covers title, channel ID, channel name, channel URL, description, duration, view count, like count, comment count, upload date, tags, categories, thumbnail URL, and the list of all available subtitle languages. Concurrent extraction keeps 30 videos processing in parallel, so a queue of 100 clips finishes in a couple of minutes. Residential proxy is required because YouTube has cracked down hard on datacenter IPs.

| 🎯 Target Audience | 💡 Primary Use Cases |
| --- | --- |
| AI app developers, researchers, content creators, language learners, accessibility engineers, journalists | RAG video indexing, LLM summarization, captions datasets, language learning, accessibility tools |

📋 What the YouTube Transcript Scraper does

Six transcript workflows in a single run:

  • 📝 Full transcript. Timestamped segments with start, duration, and text per line.
  • 💬 Plain text transcript. Flat string ready for LLM ingestion.
  • 🎞️ SRT export. Standards-compliant subtitle file for video apps.
  • 🌐 Language picker. Choose your preferred caption language with fallback to defaults.
  • 🎬 Video metadata. Title, channel info, views, likes, comments, upload date, tags, categories.
  • 🌍 Available languages. Full list of manual and auto-generated caption languages per video.

Each record also includes the thumbnail URL and an isAutoGenerated flag so you can filter out auto captions when you need human-quality transcripts.
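A minimal sketch of that filter, assuming the dataset has already been downloaded as a list of dictionaries that follow the schema in the Output section. The helper name `manual_captions_only` is hypothetical and is not part of the Actor.

```python
from typing import Any, Dict, Iterable, List


def manual_captions_only(records: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep records whose transcript came from human-authored captions.

    A record is kept only when isAutoGenerated is explicitly False and a
    non-empty plain-text transcript is present.
    """
    return [
        record
        for record in records
        if record.get("isAutoGenerated") is False
        and record.get("transcriptPlainText")
    ]


# Illustrative records only; real records carry the full set of fields.
sample = [
    {"videoId": "UyyjU8fzEYU", "isAutoGenerated": False,
     "transcriptPlainText": "I grew up to study the brain..."},
    {"videoId": "dQw4w9WgXcQ", "isAutoGenerated": True,
     "transcriptPlainText": "..."},
]
kept = manual_captions_only(sample)
```

Checking `is False` rather than plain falsiness means a record with a missing flag is dropped instead of being treated as human-authored.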

💡 Why it matters: video is the largest untapped dataset in the world. Transcripts make it searchable, summarizable, and indexable. DIY transcript fetchers break every time YouTube changes its API. This Actor uses yt-dlp under the hood, which is actively maintained.


🎬 Full Demo

🚧 Coming soon: a 3-minute walkthrough of transcript-powered video search.


⚙️ Input

| Input | Type | Default | Behavior |
| --- | --- | --- | --- |
| startUrls | array of URLs | required if no videoIds | YouTube video URLs (youtube.com/watch?v=, youtu.be/, shorts). |
| videoIds | array of strings | required if no startUrls | Raw YouTube video IDs (11 chars). |
| language | string | "" | Preferred ISO language code (en, es, fr). |
| includeAutoGenerated | boolean | true | Fall back to auto-generated captions when no manual ones exist. |
| maxItems | integer | 10 | Videos processed. Free plan caps at 10, paid plan at 1,000,000. |
| proxyConfiguration | object | RESIDENTIAL | Residential proxy required. |
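If your source list mixes the URL shapes above with raw IDs, it can help to normalize everything to 11-character IDs before building the input. The sketch below is one way to do that; `extract_video_id` is a hypothetical helper, and the allowed character set (letters, digits, `_`, `-`) is an assumption about YouTube IDs rather than something this page states.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Assumption: video IDs are 11 characters drawn from A-Z, a-z, 0-9, "_" and "-".
_ID_RE = re.compile(r"^[A-Za-z0-9_-]{11}$")


def extract_video_id(value: str) -> Optional[str]:
    """Return the video ID from a watch, youtu.be, or shorts URL, or a raw ID."""
    value = value.strip()
    if _ID_RE.match(value):
        return value  # already a raw ID

    parsed = urlparse(value)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]

    candidate = None
    if host == "youtu.be":
        candidate = parsed.path.lstrip("/").split("/")[0]
    elif host in ("youtube.com", "m.youtube.com"):
        if parsed.path == "/watch":
            candidate = parse_qs(parsed.query).get("v", [None])[0]
        elif parsed.path.startswith("/shorts/"):
            candidate = parsed.path.split("/")[2]

    return candidate if candidate and _ID_RE.match(candidate) else None
```

URLs without a scheme (for example `youtu.be/...` with no `https://`) are not recognized by `urlparse` as having a host, so this sketch returns `None` for them.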

Example: transcribe a TED talk.

{
  "startUrls": [
    {"url": "https://www.youtube.com/watch?v=UyyjU8fzEYU"}
  ],
  "language": "en",
  "includeAutoGenerated": true,
  "maxItems": 1,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}

Example: batch transcribe several videos by ID.

{
  "videoIds": [
    "dQw4w9WgXcQ",
    "kJQP7kiw5Fk",
    "9bZkp7q19f0"
  ],
  "language": "en",
  "maxItems": 100
}

⚠️ Good to Know: YouTube now blocks datacenter IPs for transcript fetching. Apify residential proxy is included on paid plans and is strongly recommended. Videos without captions return a record with an error field explaining "No subtitles available."


📊 Output

Each record contains 17 fields. Download the dataset as CSV, Excel, JSON, or XML.

🧾 Schema

| Field | Type | Example |
| --- | --- | --- |
| 🆔 videoId | string | "UyyjU8fzEYU" |
| 🔗 url | string | "https://www.youtube.com/watch?v=UyyjU8fzEYU" |
| 🏷️ title | string \| null | "My stroke of insight..." |
| 🆔 channelId | string \| null | "UCAuUUnT6oDeKwE6v1NGQxug" |
| 🔗 channelUrl | string \| null | "https://www.youtube.com/channel/..." |
| 🧑 channelName | string \| null | "TED" |
| 📝 description | string \| null | "Neuroanatomist Jill Bolte Taylor..." |
| ⏱️ durationSeconds | number \| null | 1141 |
| 👁️ viewCount | number \| null | 8688914 |
| 👍 likeCount | number \| null | 122000 |
| 💬 commentCount | number \| null | 4800 |
| 📅 uploadDate | string \| null | "2008-03-13" |
| 🏷️ tags | string[] | ["TED Talk", "brain", "science"] |
| 🗂️ categories | string[] | ["Science & Technology"] |
| 🌐 language | string \| null | "en" |
| 🌍 availableSubtitleLanguages | string[] | ["en", "es", "fr"] |
| 🤖 availableAutoCaptionLanguages | string[] | ["en"] |
| 🤖 isAutoGenerated | boolean | false |
| 📜 transcript | array | [{"start": 12.3, "duration": 4.2, "text": "..."}] |
| 💬 transcriptPlainText | string | "I grew up to study the brain..." |
| 🎞️ transcriptSrt | string | "1\n00:00:12,300 --> ...\n..." |
| 🔢 wordCount | number | 2703 |
| 🖼️ thumbnailUrl | string \| null | "https://i.ytimg.com/vi/.../maxresdefault.jpg" |
| 🕒 scrapedAt | ISO 8601 | "2026-04-21T12:00:00.000Z" |
| ❗ error | string \| null | "No subtitles available" on failure |


✨ Why choose this Actor

  • 📜 Full transcript + SRT. Three output formats: segments, plain text, subtitle file.
  • 🌐 100+ languages. Manual captions and auto-generated captions supported.
  • 📊 17 metadata fields. Title, channel, views, likes, comments, tags, upload date.
  • ⚡ Concurrent. 30 videos processing in parallel on a single run.
  • 🔁 Actively maintained. Uses yt-dlp under the hood, which tracks YouTube's changes.
  • 🚫 No YouTube Data API quota. Unlimited captions without a Google Cloud project.
  • 🔌 Integrations. Drops into RAG pipelines, language-learning apps, and subtitle tools.

📊 Every transcript is a searchable index point. Indexing video at scale unlocks insights, summaries, and accessibility features that would be impossible to build manually.


📈 How it compares to alternatives

| Approach | Cost | Coverage | Refresh | Filters | Setup |
| --- | --- | --- | --- | --- | --- |
| ⭐ YouTube Transcript Scraper (this Actor) | $5 free credit, then pay-per-use | Any public video | Live per run | language, auto/manual, list | ⚡ 2 min |
| YouTube Data API | Free (quota) | Metadata only | Real-time | Strict quota | ⏳ Variable |
| DIY yt-dlp scripts | Free | Whatever you code | Your schedule | Whatever you build | 🐢 Days |
| Paid transcription APIs | $0.04+/min | Any audio | Real-time | Custom filters | ⏳ Hours |

Pick this Actor when you want reliable YouTube transcripts without quota limits or custom infrastructure.


🚀 How to use

  1. 📝 Sign up. Create a free account with $5 credit (takes 2 minutes).
  2. 🌐 Open the Actor. Go to the YouTube Transcript Scraper page on the Apify Store.
  3. 🎯 Add video URLs. Paste URLs or video IDs and pick a preferred language.
  4. 🚀 Run it. Click Start and let the Actor transcribe.
  5. 📥 Download. Grab your dataset as CSV, Excel, JSON, or XML.

⏱️ Total time from signup to downloaded transcripts: 3-5 minutes. No coding required.


💼 Business use cases

🧠 AI & RAG

  • Index videos in a searchable knowledge base
  • Feed transcripts to GPT, Claude, or Gemini summaries
  • Build video-aware chatbots
  • Generate research datasets for LLMs

🎓 Education & Learning

  • Side-by-side bilingual transcripts
  • Study notes from lecture videos
  • Language-learning flashcards from music
  • Accessibility captions for students

📰 Media & Journalism

  • Extract quotes from interview videos
  • Fact-check statements at scale
  • Build transcripts for podcast archives
  • Monitor public-figure statements

🛠️ Developer Tooling

  • SRT files for video players
  • Transcripts for video search engines
  • Subtitle generation for app content
  • Dataset assembly for speech models

🔌 Automating YouTube Transcript Scraper

Control the scraper programmatically for scheduled runs and pipeline integrations:

  • 🟢 Node.js. Install the apify-client NPM package.
  • 🐍 Python. Use the apify-client PyPI package.
  • 📚 See the Apify API documentation for full details.

The Apify Schedules feature lets you trigger this Actor on any cron interval. Daily transcription of a channel's latest uploads keeps a RAG index current.
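As a starting point for the Python route, the sketch below builds an input object from the fields in the Input table and starts a run with the `apify-client` package. The Actor ID is taken from the Store URL at the top of this page; the helper names and the `APIFY_TOKEN` environment variable are assumptions for illustration, and the client calls follow the package's documented quick-start pattern, so check them against the version you install.

```python
from typing import Any, Dict, List, Optional

ACTOR_ID = "parseforge/youtube-transcript-scraper"  # from the Store URL


def build_run_input(
    video_ids: Optional[List[str]] = None,
    start_urls: Optional[List[str]] = None,
    language: str = "",
    include_auto_generated: bool = True,
    max_items: int = 10,
) -> Dict[str, Any]:
    """Assemble the Actor input; at least one of videoIds/startUrls is required."""
    if not video_ids and not start_urls:
        raise ValueError("Provide startUrls or videoIds")
    run_input: Dict[str, Any] = {
        "language": language,
        "includeAutoGenerated": include_auto_generated,
        "maxItems": max_items,
        "proxyConfiguration": {
            "useApifyProxy": True,
            "apifyProxyGroups": ["RESIDENTIAL"],
        },
    }
    if start_urls:
        run_input["startUrls"] = [{"url": url} for url in start_urls]
    if video_ids:
        run_input["videoIds"] = list(video_ids)
    return run_input


def run_actor(run_input: Dict[str, Any], token: str) -> List[Dict[str, Any]]:
    """Start a run, wait for it to finish, and return the dataset items."""
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(token)
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

A typical call would be `run_actor(build_run_input(video_ids=["dQw4w9WgXcQ"], language="en"), os.environ["APIFY_TOKEN"])`. The import sits inside `run_actor` so the input builder can be used and tested without the client installed.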

🌟 Beyond business use cases

Data like this powers more than commercial workflows. The same structured records support research, education, civic projects, and personal initiatives.

🎓 Research and academia

  • Empirical datasets for papers, thesis work, and coursework
  • Longitudinal studies tracking changes across snapshots
  • Reproducible research with cited, versioned data pulls
  • Classroom exercises on data analysis and ethical scraping

🎨 Personal and creative

  • Side projects, portfolio demos, and indie app launches
  • Data visualizations, dashboards, and infographics
  • Content research for bloggers, YouTubers, and podcasters
  • Hobbyist collections and personal trackers

🤝 Non-profit and civic

  • Transparency reporting and accountability projects
  • Advocacy campaigns backed by public-interest data
  • Community-run databases for local issues
  • Investigative journalism on public records

🧪 Experimentation

  • Prototype AI and machine-learning pipelines with real data
  • Validate product-market hypotheses before engineering spend
  • Train small domain-specific models on niche corpora
  • Test dashboard concepts with live input

❓ Frequently Asked Questions

🧩 How does it work?

The Actor wraps yt-dlp, which fetches metadata and subtitle files from YouTube. Transcripts are parsed into structured segments, then flattened into plain text and SRT formats. Each run processes up to 30 videos in parallel.

📝 How accurate are the transcripts?

Human-authored captions are highly accurate. Auto-generated captions depend on the audio quality and language; English auto captions are typically 85-95% accurate.

🌐 Which languages are supported?

Every language for which YouTube publishes captions or auto-captions (100+ languages). Pass any ISO code to language or leave empty for the video's default.

🔐 Why do I need residential proxy?

YouTube now challenges datacenter IPs with "Sign in to confirm you're not a bot" when fetching metadata or subtitles. Residential proxy is included on paid Apify plans and bypasses this cleanly.

⏰ Can I schedule regular runs?

Yes. Use Apify Schedules to transcribe new uploads on any cron interval.

⚖️ Is it legal?

Transcript extraction from publicly available videos is generally fine for research, indexing, and AI use. Commercial redistribution of transcripts may require rights clearance from the video owner.

💼 Can I use this commercially?

Yes for internal search, RAG, and summarization. Redistribution of full transcripts requires respecting copyright and YouTube's terms of service.

💳 Do I need a paid Apify plan to use this Actor?

The free plan covers testing (10 videos per run). A paid plan lifts the limit AND gives you residential proxy access, which is required for reliable YouTube transcript fetching.

🔁 What happens if a run fails?

Apify retries transient errors. Per-video failures (no captions, geo-blocked, private) are logged in the error field. Partial datasets are preserved.
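Because failed videos still produce a record, a downstream script can separate usable transcripts from failures by checking the `error` field. A minimal sketch, assuming records shaped like the Output schema; `split_by_error` is a hypothetical helper.

```python
from typing import Any, Dict, Iterable, List, Tuple


def split_by_error(
    records: Iterable[Dict[str, Any]],
) -> Tuple[List[Dict[str, Any]], Dict[str, str]]:
    """Return (successful records, {videoId: error message} for failures)."""
    succeeded: List[Dict[str, Any]] = []
    failed: Dict[str, str] = {}
    for record in records:
        if record.get("error"):
            failed[record["videoId"]] = record["error"]
        else:
            succeeded.append(record)
    return succeeded, failed


# Illustrative records only.
ok, failures = split_by_error([
    {"videoId": "UyyjU8fzEYU", "error": None, "wordCount": 2703},
    {"videoId": "dQw4w9WgXcQ", "error": "No subtitles available"},
])
```

The `failures` mapping can then drive a follow-up run, for example retrying those IDs with a different `language` value.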

🎞️ Can I download the video file?

This Actor focuses on transcripts and metadata. For video files, use a dedicated YouTube Video Downloader actor.

📺 Does it work on shorts, live streams, and age-restricted videos?

YouTube Shorts work. Live streams and age-restricted videos are not supported (age-restricted requires sign-in; live streams have no final transcript until the stream ends).

🆘 What if I need help?

Our team is available through the Apify platform and the Tally form below.


🔌 Integrate with any app

YouTube Transcript Scraper connects to any cloud service via Apify integrations:

  • Make - Auto-transcribe new uploads
  • Zapier - Push transcripts to Notion or Airtable
  • Slack - Share TL;DRs in team channels
  • Airbyte - Pipe transcripts into your warehouse
  • GitHub - Trigger runs from commits
  • Google Drive - Save transcripts to Docs or Sheets

You can also use webhooks to push transcripts into vector databases, RAG stacks, or subtitle tools.


🔗 Recommended Actors

💡 Pro Tip: browse the complete ParseForge collection for more video and audio tools.


🆘 Need Help? Open our contact form to request a new scraper, propose a custom data project, or report an issue.


⚠️ Disclaimer: this Actor is an independent tool and is not affiliated with Google, YouTube, or Alphabet. It accesses only publicly available video metadata and caption tracks. Respect YouTube's terms of service and copyright when using transcripts commercially.
