CBS 60 Minutes Transcripts Scraper

Pricing

Pay per event

Collects full interview transcripts from CBS 60 Minutes. It discovers transcript pages via the CBS News monthly article sitemaps and extracts the Q&A body, correspondent name, broadcast date, speaker labels, and topic tags. Video-only segments without a published transcript are skipped.

Developer

BowTiedRaccoon

Maintained by Community

Actor stats

Last modified: 12 days ago

What It Scrapes

Targets two URL patterns on cbsnews.com:

/news/<slug>-60-minutes-transcript/ — primary transcript pattern
/news/read-the-full-transcript-of-<slug>/ — extended interview variant

Discovery walks the CBS News monthly article sitemaps, filters by these patterns, and scrapes each matching page. Video-only stories (e.g. /news/<slug>-60-minutes/) are explicitly excluded.
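The filtering step can be sketched as a small URL classifier. This is a minimal illustration, not the actor's own code: the function name `classify_url` and the assumption that slugs contain only lowercase letters, digits, and hyphens are hypothetical.

```python
import re
from urllib.parse import urlparse

# Patterns mirror the two transcript URL shapes listed above.
# Assumption: slugs are lowercase letters, digits, and hyphens only.
TRANSCRIPT_PATTERNS = [
    re.compile(r"^/news/[a-z0-9-]+-60-minutes-transcript/?$"),
    re.compile(r"^/news/read-the-full-transcript-of-[a-z0-9-]+/?$"),
]
# Video-only stories end in "-60-minutes" with no "-transcript" suffix.
VIDEO_ONLY_PATTERN = re.compile(r"^/news/[a-z0-9-]+-60-minutes/?$")


def classify_url(url: str) -> str:
    """Return 'transcript', 'video_only', or 'other' for a CBS News URL."""
    path = urlparse(url).path
    if any(p.match(path) for p in TRANSCRIPT_PATTERNS):
        return "transcript"
    if VIDEO_ONLY_PATTERN.match(path):
        return "video_only"
    return "other"
```

Only URLs classified as `transcript` would be fetched; the other two classes are dropped before any page request is made.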

Output Schema

Field	Type	Description
`story_slug`	string	URL slug of the transcript page
`story_title`	string	Article headline
`story_url`	string	Canonical CBS News URL
`aired_date`	string	Broadcast date (YYYY-MM-DD)
`published_date`	string	CBS News publish timestamp (ISO 8601)
`segment_type`	string	Inferred type: `interview`, `investigation`, or `profile`
`correspondent`	string	CBS News correspondent (e.g. Major Garrett, Lesley Stahl)
`subjects`	string	Interviewed subjects extracted from speaker labels (comma-separated)
`synopsis`	string	Article dek / meta description
`body_html`	string	Full transcript HTML preserving Q&A paragraph structure
`body_text`	string	Plain-text version of the transcript
`speakers`	string	All speaker labels found in the transcript (comma-separated)
`is_transcript`	boolean	Always `true` — non-transcripts are skipped
`has_video_only_variant`	boolean	True when a paired video-only story exists
`related_story_urls`	string	Related CBS News links on the page (comma-separated)
`topics`	string	CBS News topic tags (comma-separated)
`canonical_url`	string	Canonical URL from page head
`source`	string	Fixed: `cbsnews.com/60-minutes`
`scraped_at`	datetime	ISO 8601 scrape timestamp
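Several fields in the schema are comma-separated strings, so downstream code will usually want them as lists. A minimal sketch of that conversion follows; the helper name `split_list_fields` is hypothetical, and the naive comma split assumes individual values (including URLs) do not themselves contain commas.

```python
from typing import Any

# Fields the schema describes as comma-separated strings.
LIST_FIELDS = ("subjects", "speakers", "related_story_urls", "topics")


def split_list_fields(record: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the record with comma-separated fields as lists."""
    out = dict(record)
    for field in LIST_FIELDS:
        raw = record.get(field) or ""
        # Drop empty pieces so an empty string becomes an empty list.
        out[field] = [part.strip() for part in raw.split(",") if part.strip()]
    return out
```

Missing or empty fields come back as empty lists, and all other fields are passed through unchanged.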

Speaker labels follow two CBS conventions: `Major Garrett:` (Title Case) and `MAJOR GARRETT:` (ALL-CAPS, used in the extended-interview variant). Both formats are normalized and extracted.
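One way to handle both label conventions is a single paragraph-leading regex plus a case normalizer. The sketch below is an assumption about how this could work, not the actor's actual logic; the regex shape (one to four capitalized words followed by a colon) and the function names are hypothetical.

```python
import re

# A label is 1-4 capitalized or ALL-CAPS words at the start of a paragraph,
# followed by a colon. This shape is an assumption, not the actor's own regex.
SPEAKER_RE = re.compile(r"^([A-Z][A-Za-z.'-]*(?: [A-Z][A-Za-z.'-]*){0,3}):\s")


def normalize_label(label: str) -> str:
    """Convert ALL-CAPS labels to Title Case; leave mixed-case labels alone."""
    return label.title() if label.isupper() else label


def extract_speakers(paragraphs: list[str]) -> list[str]:
    """Return unique normalized speaker labels in order of first appearance."""
    seen: list[str] = []
    for para in paragraphs:
        match = SPEAKER_RE.match(para.strip())
        if match:
            name = normalize_label(match.group(1))
            if name not in seen:
                seen.append(name)
    return seen
```

A heuristic like this has known limits: `str.title()` mangles names such as "MCDONALD", and a paragraph that opens with a capitalized word and a colon (for example "Note: ...") would be picked up as a speaker.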

Input Options

maxItems (integer, required) — Maximum number of transcript records to scrape. Set a higher value for bulk runs.

startDate (string, optional) — Limit sitemap discovery to a given month onwards (YYYY-MM format, e.g. "2024-01"). Defaults to all available months when omitted.

startUrls (array, optional) — One or more direct CBS News transcript URLs. When provided, sitemap discovery is skipped and only the supplied URLs are scraped. Useful for targeted re-runs of specific episodes.

Example: Specific episode

{
  "maxItems": 1,
  "startUrls": [
    { "url": "https://www.cbsnews.com/news/netanyahu-us-israel-iran-60-minutes-transcript/" }
  ]
}

Example: Transcripts from January 2025 onwards

{
  "maxItems": 200,
  "startDate": "2025-01"
}

Example: Full archive (all available transcripts)

{
  "maxItems": 1000
}
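The input objects above can also be built and submitted programmatically. The following is a sketch under stated assumptions: `build_run_input` and `run_actor` are hypothetical helper names, the validation rules are inferred from the option descriptions, and the run step assumes the third-party `apify-client` package is installed and that you have an Apify API token. The actor ID is taken from this page's URL.

```python
import re
from typing import Any, Optional


def build_run_input(
    max_items: int,
    start_date: Optional[str] = None,
    start_urls: Optional[list[str]] = None,
) -> dict[str, Any]:
    """Assemble an input object using the three documented option names."""
    if max_items < 1:
        raise ValueError("maxItems must be a positive integer")
    run_input: dict[str, Any] = {"maxItems": max_items}
    if start_date is not None:
        if not re.fullmatch(r"\d{4}-(0[1-9]|1[0-2])", start_date):
            raise ValueError("startDate must use YYYY-MM format")
        run_input["startDate"] = start_date
    if start_urls:
        run_input["startUrls"] = [{"url": u} for u in start_urls]
    return run_input


def run_actor(token: str, run_input: dict[str, Any]) -> list[dict[str, Any]]:
    """Run the actor and return its dataset items (needs network + a token)."""
    from apify_client import ApifyClient  # third-party: pip install apify-client

    client = ApifyClient(token)
    run = client.actor("jungle_synthesizer/cbs-60-minutes-transcripts-scraper").call(
        run_input=run_input
    )
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

For example, `build_run_input(200, start_date="2025-01")` reproduces the second example input, and the result can be passed to `run_actor` together with a token.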

How It Works

Discovery uses the CBS News sitemap index at cbsnews.com/xml-sitemap/index.xml. Monthly article sitemaps (article-YYYY-MM.xml) are walked in order, newest first. Each sitemap lists 3,000+ news articles; only URLs matching the transcript patterns are fetched.
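The sitemap walk described above can be sketched with the standard library alone. This is an illustration, not the actor's code: the function name `monthly_sitemaps` is hypothetical, and it assumes the index follows the standard sitemaps.org `sitemapindex` format with monthly files named `article-YYYY-MM.xml`.

```python
import re
import xml.etree.ElementTree as ET
from typing import Optional

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
MONTHLY_RE = re.compile(r"article-(\d{4}-\d{2})\.xml")


def monthly_sitemaps(index_xml: str, start_date: Optional[str] = None) -> list[str]:
    """Return monthly article sitemap URLs, newest first, from start_date on."""
    root = ET.fromstring(index_xml)
    found = []
    for loc in root.findall("sm:sitemap/sm:loc", SITEMAP_NS):
        url = (loc.text or "").strip()
        match = MONTHLY_RE.search(url)
        # Keep only monthly article sitemaps at or after the start month.
        if match and (start_date is None or match.group(1) >= start_date):
            found.append((match.group(1), url))
    # YYYY-MM strings sort chronologically, so a reverse sort is newest first.
    return [url for _, url in sorted(found, reverse=True)]
```

Because `YYYY-MM` strings compare correctly as plain text, the `startDate` filter needs no date parsing. Each returned sitemap would then be fetched and its URLs filtered by the transcript patterns.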

Metadata is parsed from JSON-LD NewsArticle blocks present on every CBS article page — giving reliable correspondent name, publish date, and keywords. The transcript body lives in <section class="content__body"> as a sequence of <p> tags. Speaker labels are extracted from paragraph-leading Name: patterns. Ad wrappers are stripped before body extraction.
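A rough sketch of this parsing step, using only the standard library, is shown below. It is an assumption about one workable approach rather than the actor's implementation: `ArticleParser` and `parse_article` are hypothetical names, and ad-wrapper stripping is not handled here.

```python
import json
from html.parser import HTMLParser
from typing import Any, Optional


class ArticleParser(HTMLParser):
    """Collect JSON-LD blocks and <p> text inside section.content__body."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.json_ld: list[str] = []
        self.paragraphs: list[str] = []
        self._in_ld = False
        self._section_depth = 0  # > 0 while inside the body section
        self._in_p = False
        self._buf: list[str] = []

    def handle_starttag(self, tag: str, attrs: list) -> None:
        attr = dict(attrs)
        if tag == "script" and attr.get("type") == "application/ld+json":
            self._in_ld, self._buf = True, []
        elif tag == "section":
            classes = (attr.get("class") or "").split()
            if self._section_depth or "content__body" in classes:
                self._section_depth += 1
        elif tag == "p" and self._section_depth:
            self._in_p, self._buf = True, []

    def handle_endtag(self, tag: str) -> None:
        if tag == "script" and self._in_ld:
            self.json_ld.append("".join(self._buf))
            self._in_ld = False
        elif tag == "section" and self._section_depth:
            self._section_depth -= 1
        elif tag == "p" and self._in_p:
            # Collapse runs of whitespace inside the paragraph.
            self.paragraphs.append(" ".join("".join(self._buf).split()))
            self._in_p = False

    def handle_data(self, data: str) -> None:
        if self._in_ld or self._in_p:
            self._buf.append(data)


def parse_article(html: str) -> tuple[Optional[dict[str, Any]], list[str]]:
    """Return (first NewsArticle JSON-LD object or None, body paragraphs)."""
    parser = ArticleParser()
    parser.feed(html)
    parser.close()
    article = None
    for raw in parser.json_ld:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue
        for item in data if isinstance(data, list) else [data]:
            if isinstance(item, dict) and item.get("@type") == "NewsArticle":
                article = article or item
    return article, parser.paragraphs
```

The returned paragraphs can be fed to a speaker-label matcher, and the JSON-LD object supplies fields such as `headline` and `datePublished` when the page provides them.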

CBS News is server-rendered (behind a Varnish edge cache), and no bot protection was observed. Neither a proxy nor a headless browser is required.

Coverage Notes

60 Minutes airs approximately 45 episodes per US broadcast season, with 3-4 segments per episode. Roughly 50-70% of segments receive a published transcript; the remainder are video-only. This scraper covers transcript-bearing segments only and makes that boundary explicit: every record carries is_transcript: true, and video-only pages are skipped. The active transcript archive goes back approximately 5 years, with sparser coverage for earlier seasons.
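These figures imply a rough yield of about 68 to 126 transcripts per season, which can help when choosing a maxItems value. The short calculation below only multiplies the approximate numbers quoted above; the function name is hypothetical and the result is an estimate, not a measured count.

```python
def transcripts_per_season(episodes: int = 45) -> tuple[float, float]:
    """Low and high estimates from the approximate figures quoted above."""
    low = episodes * 3 * 0.50   # 3 segments per episode, 50% transcribed
    high = episodes * 4 * 0.70  # 4 segments per episode, 70% transcribed
    return low, high


low, high = transcripts_per_season()
print(f"per season: {low:.0f}-{high:.0f}")
print(f"five seasons: {5 * low:.0f}-{5 * high:.0f}")
```

Over the roughly five seasons of active archive, the upper estimate stays below 1,000 records.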

Pricing

Charged per transcript record scraped. Long-form interviews (5,000-30,000 words each) are priced at a modest premium reflecting their per-record research value versus wire-copy or short-form corpora.

URL: https://apify.com/jungle_synthesizer/cbs-60-minutes-transcripts-scraper