NPR Transcript Scraper — Fresh Air, Morning Edition & ATC

Pricing

Pay per event

Scrape full NPR transcripts from Fresh Air, Morning Edition, All Things Considered & Weekend Edition. Speaker-labeled paragraphs, full text, date, author & audio URL per story, plus a new-transcript monitor with alerts. No login or API key. About $8 per 1,000 transcripts.

Developer

Scrapers Delight

Maintained by Community

Last modified

8 days ago

📻 NPR Transcript Scraper — Fresh Air, Morning Edition & All Things Considered

Get the full, speaker-labeled transcript of NPR's broadcast stories — no login, no AI transcription. NPR publishes a complete transcript page for nearly every segment of Fresh Air, Morning Edition, All Things Considered and Weekend Edition, and this actor reads it: clean paragraphs, full text, speakers, date, author, section and the MP3 audio URL. Scrape one story, the latest broadcast days of any program, or page the archive back to 1991.

Because the transcript is already published, there's no speech-to-text compute — it's fast and cheap.

What does it do?

For each story (by program archive crawl or direct URL) it returns:

📝 Full transcript (plain text + paragraphs[]) — straight from npr.org/transcripts
🗣️ Speakers — the inline labels NPR prints (LEILA FADEL, HOST, JON HAMILTON, BYLINE, guests)
📅 Broadcast date, published date, author & section
🎧 Audio URL + duration (the segment MP3)
🚩 Honest flagging — stories whose transcript isn't published yet come back hasTranscript: false; nothing is synthesized

No ASR, no API key, no timestamps invented (NPR publishes none — output is paragraphs, never SRT).

What data does it extract?

For every story: storyId, title, storyUrl, transcriptUrl, program, episodeDate, publishedAt, author, section, speakers[], paragraphs[], paragraph_count, text, audioUrl, audioDuration, hasTranscript, is_new (monitor), scraped_at.

Who is it for?

✍️ Journalists, researchers & students quoting and searching radio coverage.
🤖 AI / RAG builders — dense, professionally produced news + interview transcripts, ideal retrieval/training data.
📰 Newsletter writers & media monitors tracking what NPR said about a topic, daily.
🎙️ Podcast/radio analysts studying program rundowns and guests.

How to use it (step by step)

1. Click Try for free.
2. Pick programs (fresh-air, morning-edition, all-things-considered, weekend-edition-saturday, weekend-edition-sunday) or paste direct story/transcript URLs.
3. Set Max broadcast days (maxEpisodes) and Max transcripts (maxStories) to size the run. Each archive page covers 5 broadcast days, and the archive reaches back to 1991.
4. Click Start, then open the Dataset tab to view or export the results.
5. (Optional) Enable monitorMode and add a Schedule to get only new transcripts on each run, with Slack, webhook, or email alerts.
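
As a rough sizing aid, the "5 broadcast days per archive page" figure from the steps above can be turned into a page estimate. This is only a sketch: the function name is hypothetical, and it assumes the actor fetches whole archive pages per program, which the listing does not confirm.

```python
import math

DAYS_PER_ARCHIVE_PAGE = 5  # figure stated in the steps above


def archive_pages_needed(max_episodes: int) -> int:
    """Estimate archive pages read per program for a given broadcast-day cap."""
    if max_episodes <= 0:
        # 0 means "no day cap"; the page count then depends on oldestDate instead.
        raise ValueError("max_episodes must be positive to estimate pages")
    return math.ceil(max_episodes / DAYS_PER_ARCHIVE_PAGE)


print(archive_pages_needed(5))   # 1
print(archive_pages_needed(12))  # 3
```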

Quick start

{"programs":["fresh-air"],"maxEpisodes":5,"maxStories":10}
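
The same input can also be sent from a script instead of the console. The sketch below uses Apify's synchronous run endpoint with only the standard library. The actor id is an assumption derived from the listing URL (user name and actor name joined with `~`), the helper names are hypothetical, and the token is a placeholder you must supply.

```python
import json
import urllib.parse
import urllib.request

# Assumed from the listing URL: apify.com/scrapersdelight/npr-transcript-scraper
ACTOR_ID = "scrapersdelight~npr-transcript-scraper"


def run_sync_url(token: str) -> str:
    """Build the URL that runs the actor and returns its dataset items."""
    query = urllib.parse.urlencode({"token": token})
    return f"https://api.apify.com/v2/acts/{ACTOR_ID}/run-sync-get-dataset-items?{query}"


def run_actor(actor_input: dict, token: str) -> list:
    """POST the input as JSON and return the list of scraped stories."""
    request = urllib.request.Request(
        run_sync_url(token),
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=300) as response:
        return json.load(response)


quick_start = {"programs": ["fresh-air"], "maxEpisodes": 5, "maxStories": 10}
# items = run_actor(quick_start, token="YOUR_APIFY_TOKEN")  # needs a real token
```

The synchronous endpoint suits small runs like this one. For deep archive backfills, starting the run asynchronously and reading the dataset afterwards is likely the safer choice.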

Direct story lookup

{"storyUrls":["https://www.npr.org/transcripts/nx-s1-5849937","963319470"]}

Input

Field	What it does
`programs`	NPR program slugs to crawl (`fresh-air`, `morning-edition`, …)
`storyUrls`	direct transcript/story URLs or bare story ids
`maxEpisodes`	recent broadcast days per program (0 = no day cap)
`maxStories`	hard cap on transcripts fetched per run (0 = unlimited)
`oldestDate`	optional `YYYY-MM-DD` floor for archive pagination
`includeMissingTranscripts`	also output stories with no transcript yet, flagged
`monitorMode`, `alertOnNewTranscript`	recurring new-transcript watcher + alerts
`webhookUrl`, `slackWebhookUrl`, `emailRecipients`	alert channels
`proxyConfiguration`, `requestConcurrency`	proxy + parallelism
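
If you build inputs in code, a small helper can catch obvious mistakes before a run is billed. The sketch below uses only field names from the table above. The helper itself, its defaults (copied from the quick start), and the checks are assumptions, not the actor's own validation.

```python
from datetime import datetime


def build_input(programs=(), story_urls=(), max_episodes=5, max_stories=10,
                oldest_date=None, monitor_mode=False) -> dict:
    """Assemble an input object from the documented fields."""
    if not programs and not story_urls:
        raise ValueError("give at least one program slug or story URL/id")
    if max_episodes < 0 or max_stories < 0:
        raise ValueError("caps must be 0 (no cap) or positive")
    payload = {"maxEpisodes": max_episodes, "maxStories": max_stories}
    if programs:
        payload["programs"] = list(programs)
    if story_urls:
        payload["storyUrls"] = list(story_urls)
    if oldest_date is not None:
        # Raises ValueError if the value does not parse as a year-month-day date.
        datetime.strptime(oldest_date, "%Y-%m-%d")
        payload["oldestDate"] = oldest_date
    if monitor_mode:
        payload["monitorMode"] = True
    return payload


# Deep backfill: no day cap, a date floor, and a generous transcript cap.
# The cap of 5000 and the floor date are illustrative values only.
backfill = build_input(programs=["fresh-air"], max_episodes=0,
                       max_stories=5000, oldest_date="1991-01-01")
print(backfill)
```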

Output example

{
"storyId":"nx-s1-5849937",
"title":"Socioeconomic factors are becoming 'biologically embedded' in children's brains",
"storyUrl":"https://www.npr.org/2026/06/11/nx-s1-5849937/child-brain-development-stress-sleep-neighborhood-economics",
"transcriptUrl":"https://www.npr.org/transcripts/nx-s1-5849937",
"program":"morning-edition",
"episodeDate":"2026-06-11",
"publishedAt":"2026-06-11",
"author":"Jon Hamilton",
"section":"Science",
"speakers":["LEILA FADEL","JON HAMILTON"],
"paragraphs":["LEILA FADEL, HOST:","New research suggests that the neighborhood a child lives in leaves a lasting imprint on their brain…","…"],
"paragraph_count":18,
"text":"LEILA FADEL, HOST:\n\nNew research suggests…",
"audioUrl":"https://ondemand.npr.org/anon.npr-mp3/npr/me/2026/06/…mp3",
"audioDuration":228,
"hasTranscript":true
}

Export to JSON, CSV, Excel, HTML, or RSS, or fetch via the Apify API.
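
For scripted exports, the Apify API serves dataset items in several formats through one endpoint. The sketch below only builds the download URL. The helper name is hypothetical, the dataset id is a placeholder, and a private dataset would also need a token.

```python
import urllib.parse

# Format values accepted by the Apify dataset items endpoint (Excel is "xlsx").
EXPORT_FORMATS = {"json", "jsonl", "csv", "xlsx", "html", "xml", "rss"}


def dataset_items_url(dataset_id: str, fmt: str = "json", fields=None) -> str:
    """Build a download URL for a run's dataset, optionally limited to some fields."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    params = {"format": fmt}
    if fields:
        params["fields"] = ",".join(fields)
    query = urllib.parse.urlencode(params)
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{query}"


print(dataset_items_url("YOUR_DATASET_ID", "csv", ["storyId", "title", "episodeDate"]))
```

Restricting `fields` keeps CSV exports small when you only need metadata and not the full `text` column.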

Monitor mode — new-transcript alerts per program

Run on a Schedule (e.g. every 6 hours) with monitorMode: true: the actor remembers every transcript it has seen (in a named, persistent store) and outputs/alerts only the new ones. NPR posts a story's transcript a few hours after broadcast — unseen stories are simply picked up on a later run, never faked.

{"programs":["morning-edition","all-things-considered"],"maxEpisodes":3,"monitorMode":true,"slackWebhookUrl":"https://hooks.slack.com/…"}
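
The "remember what was seen, output only what is new" behavior can be pictured with a short sketch. This is not the actor's implementation: a local JSON file stands in for its persistent store, and `filter_new` is a hypothetical name. Stories without a transcript are deliberately not recorded, so a later run picks them up once the transcript appears.

```python
import json
import tempfile
from pathlib import Path


def filter_new(stories, state_path):
    """Return stories not seen on earlier runs and persist the updated seen-set."""
    path = Path(state_path)
    seen = set(json.loads(path.read_text())) if path.exists() else set()
    fresh = []
    for story in stories:
        if not story.get("hasTranscript"):
            continue  # not recorded, so it is re-checked on the next run
        if story["storyId"] in seen:
            continue
        seen.add(story["storyId"])
        fresh.append(story)
    path.write_text(json.dumps(sorted(seen)))
    return fresh


with tempfile.TemporaryDirectory() as tmp:
    state = Path(tmp) / "seen.json"
    run_1 = [{"storyId": "nx-s1-5849937", "hasTranscript": True},
             {"storyId": "963319470", "hasTranscript": False}]
    first = filter_new(run_1, state)
    run_2 = [{"storyId": "nx-s1-5849937", "hasTranscript": True},
             {"storyId": "963319470", "hasTranscript": True}]
    second = filter_new(run_2, state)

print([s["storyId"] for s in first])   # ['nx-s1-5849937']
print([s["storyId"] for s in second])  # ['963319470']
```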

How much does it cost?

Pay-per-event — and with no transcription compute, it's cheap:

Event	What it covers	Price
`lot-scraped`	each story returned	$0.004 / story
`lot-detail-enriched`	each transcript page fetched	$0.004 / story
`monitor-run-completed`	each scheduled watch run	$0.05 / run
`new-lot-detected`	each new transcript found	$0.02 / transcript
`alert-delivered`	each Slack/email/webhook push	$0.005 / alert

That's about $8 per 1,000 full transcripts ($0.004 for `lot-scraped` plus $0.004 for `lot-detail-enriched` per story).
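
To budget a run, the price table can be folded into a small estimator. The sketch assumes every returned story triggers both `lot-scraped` and `lot-detail-enriched`, which matches the $8 figure. How the monitor events combine in practice is an assumption, and the function name is hypothetical.

```python
from decimal import Decimal

# Prices copied from the table above, in USD per event.
PRICES = {
    "lot-scraped": Decimal("0.004"),
    "lot-detail-enriched": Decimal("0.004"),
    "monitor-run-completed": Decimal("0.05"),
    "new-lot-detected": Decimal("0.02"),
    "alert-delivered": Decimal("0.005"),
}


def estimate_cost(stories, monitor_runs=0, new_transcripts=0, alerts=0) -> Decimal:
    """Estimate run cost, assuming each story is both scraped and enriched."""
    per_story = PRICES["lot-scraped"] + PRICES["lot-detail-enriched"]
    return (stories * per_story
            + monitor_runs * PRICES["monitor-run-completed"]
            + new_transcripts * PRICES["new-lot-detected"]
            + alerts * PRICES["alert-delivered"])


print(estimate_cost(1000))  # 8.000
# A month of 6-hourly monitoring (120 runs) that finds and alerts on 100 stories:
print(estimate_cost(100, monitor_runs=120, new_transcripts=100, alerts=100))  # 9.300
```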

Is it legal to scrape these transcripts?

This actor reads publicly published transcript pages on npr.org (NPR even runs a text-only site, text.npr.org). The content is NPR's and is copyrighted. Scraping public pages is generally legal, but you are responsible for how you use the data: review NPR's terms of use and permissions policy, and don't republish transcripts without a license to do so.

FAQ

Is there a Whisper/ASR step? No — NPR publishes the transcript; this actor reads it. Fast and cheap.

Which programs work? Any show with an npr.org program archive: fresh-air, morning-edition, all-things-considered, weekend-edition-saturday, weekend-edition-sunday are verified; any npr.org/programs/{slug} is accepted.

Do I get timestamps? No — NPR's published transcripts contain none, so the actor outputs paragraphs + full text (never fabricated SRT). You DO get the segment MP3 URL and its duration.

Do I get speaker labels? Yes — NPR prints inline labels (TERRY GROSS, HOST:); they're kept in the paragraphs and collected into speakers[].
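
Because the labels stay inside `paragraphs[]`, turning a transcript into speaker turns takes only a little post-processing. The sketch below assumes each label sits in its own paragraph, as in the output example. The regular expression is a guess at the label shape, and the last two sample paragraphs are hypothetical placeholders.

```python
import re

# Assumed label shape: "NAME:" or "NAME, ROLE:" in capitals, alone in a paragraph.
LABEL = re.compile(r"(?P<name>[A-Z][A-Z .'-]*?)(?:,\s*(?P<role>[A-Z][A-Z ]*))?:")


def group_by_speaker(paragraphs):
    """Group consecutive paragraphs under the most recent speaker label."""
    turns = []
    current = None
    for paragraph in paragraphs:
        match = LABEL.fullmatch(paragraph.strip())
        if match:
            current = {"speaker": match.group("name"),
                       "role": match.group("role"), "text": []}
            turns.append(current)
            continue
        if current is None:
            # Text before any label is kept with an unknown speaker.
            current = {"speaker": None, "role": None, "text": []}
            turns.append(current)
        current["text"].append(paragraph)
    return turns


sample = [
    "LEILA FADEL, HOST:",
    "New research suggests that the neighborhood a child lives in leaves a lasting imprint on their brain…",
    "JON HAMILTON, BYLINE:",           # hypothetical continuation
    "(placeholder for the reply)",     # hypothetical continuation
]
for turn in group_by_speaker(sample):
    print(turn["speaker"], turn["role"], len(turn["text"]))
```

If NPR prints a label inline at the start of a paragraph instead, the pattern would need `match` in place of `fullmatch` and the remainder of the line kept as text.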

A story came back hasTranscript: false — why? Same-day stories gain their transcript a few hours after broadcast, and a few segments (music interludes etc.) never get one. The actor flags these honestly instead of guessing. In monitor mode they're re-checked next run.

How far back can I go? The program archives paginate to 1991. Use maxEpisodes: 0 + oldestDate (and a generous maxStories) for deep backfills.

Both story-id formats? Yes — new nx-s1-… ids and legacy numeric ids (e.g. 963319470) both resolve.
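
If you collect links from elsewhere and want to pass bare ids in `storyUrls`, both formats can be pulled out with a regular expression. This is a sketch under assumptions: the patterns are inferred from the examples in this README, the six-digit minimum for legacy ids is a guess meant to skip the date parts of a URL, and the helper names are hypothetical.

```python
import re
from typing import Optional

NEW_ID = re.compile(r"nx-s1-\d+")
LEGACY_ID = re.compile(r"(?<!\d)\d{6,}(?!\d)")  # assumed: legacy ids have 6+ digits


def story_id(value: str) -> Optional[str]:
    """Extract a story id from a URL or bare id, or return None if absent."""
    match = NEW_ID.search(value)  # checked first: new ids also contain digits
    if match:
        return match.group(0)
    match = LEGACY_ID.search(value)
    return match.group(0) if match else None


def transcript_url(value: str) -> Optional[str]:
    """Build the npr.org/transcripts URL for a story, following the output example."""
    sid = story_id(value)
    return f"https://www.npr.org/transcripts/{sid}" if sid else None


print(story_id("https://www.npr.org/transcripts/nx-s1-5849937"))  # nx-s1-5849937
print(story_id("963319470"))                                      # 963319470
print(transcript_url("963319470"))
```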

How do I monitor several programs? List them all in programs — state is tracked per story id, so there's no cross-program double counting.

How do I export? JSON, CSV, Excel, HTML, or RSS from the Dataset tab, or via the Apify API.

Does it need a proxy or login? No login, no API key. Apify's datacenter proxy (default) is plenty; no anti-bot measures were observed.

Feedback

Want episode-rundown mode (every segment of a broadcast day), topic filtering, or another NPR show verified? Open an issue on the actor.

URL: https://apify.com/scrapersdelight/npr-transcript-scraper