Substack Scraper - posts, comments & authors

Pricing

from $4.00 / 1,000 posts

Scrape Substack newsletters at scale: full post archives with article text, comments, author profiles, and publication stats such as subscriber counts. Works with any Substack URL or custom domain. Scraping is API-based, with no browser, and you pay per result. Export to CSV, JSON, Excel, or API.

Rating

5.0

(2)

Developer

Doggo

Maintained by Community

Actor stats

Last modified: 7 days ago

Substack Scraper

Scrape any Substack newsletter, post, author, or comment — fast, cheap, and at scale.

This Apify actor extracts structured data from Substack publications via their public JSON API. No browser, no JavaScript rendering, no login required. Built for newsletter research, content monitoring, author discovery, competitive intelligence, and LLM training datasets.

What you can scrape

Substack posts — title, subtitle, full HTML and plain-text body, word count, publish date, tags, cover image, paywall status, reactions, comment count, restacks
Substack publications — name, subdomain, custom domain, description, logo, category, language, subscriber count (when public), founding plan
Substack authors — profile, handle, bio, photo, the publications they write for, the publications they subscribe to
Substack comments — full nested comment threads, author handles, publish dates, reactions, reply depth

Works with any Substack URL: https://*.substack.com, custom domains (https://stratechery.com), individual post URLs, https://substack.com/@handle author profiles, and https://open.substack.com/pub/... share links.
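The URL shapes listed above can be told apart with a few host and path checks. The sketch below is only an illustration of that idea, not the actor's own routing logic: the function name `classify_substack_url` and the four labels are hypothetical, and the heuristic treats any host with no `/p/` path as a publication.

```python
from urllib.parse import urlparse


def classify_substack_url(url: str) -> str:
    """Return a rough label for a start URL (illustrative heuristic only)."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    path = parsed.path.rstrip("/")

    # Share links live on open.substack.com under /pub/...
    if host == "open.substack.com" and path.startswith("/pub/"):
        return "share_link"
    # Author profiles are substack.com/@handle
    if host in ("substack.com", "www.substack.com") and path.startswith("/@"):
        return "author"
    # Individual posts use a /p/<slug> path on the publication's host
    if path.startswith("/p/"):
        return "post"
    # Anything else (subdomain or custom domain root) is assumed to be a publication
    return "publication"


if __name__ == "__main__":
    for u in (
        "https://stratechery.com",
        "https://example.substack.com/p/some-post",
        "https://substack.com/@handle",
        "https://open.substack.com/pub/astralcodexten/p/some-post",
    ):
        print(u, "->", classify_substack_url(u))
```

A pre-check like this can be handy for sanity-checking a long list of start URLs before spending a run on them.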

Why use this Substack scraper

Pay only for data, not for browser time — no Playwright, no rendering overhead, no per-minute compute billing. You pay per result, and failed requests are never charged.
Full archives, not just the front page — paginates through the entire publication archive until the very first post.
Clean, typed output — one dataset with a type field (post / publication / author / comment) and per-type table views, so you can export straight to BI tools, CSV, JSON, Excel, or Google Sheets.
No duplicates, no surprises — every post is delivered exactly once, limits are enforced even across platform restarts, and proxy rotation is handled for you.

Common use cases

Newsletter research — download the full archive of a competitor's Substack for content analysis, topic clustering, or SEO research
Content monitoring — schedule a daily run with maxPostsPerPublication: 5 to capture new posts from a tracked list of newsletters and pipe to Slack or email
Author discovery and lead generation — crawl author profiles to map who writes for which publications, then export handles for outreach
LLM training data — bulk-extract long-form Substack content (with word counts and metadata) for fine-tuning datasets
Competitive intelligence — track subscriber counts, post frequency, paywall strategy, and engagement metrics (reactions, comments, restacks) across a competitor set
Academic and journalism research — gather statements, essays, and commentary from Substack writers with citable timestamps
Archiving and backup — export your own Substack publication before a migration

Input

Field	Type	Default	Description
`startUrls`	array of URLs	—	Substack publication, post, or author URLs. Leave empty only when using Discovery mode
`mode`	`posts` / `publication`	`posts`	What to pull for each publication URL
`maxPostsPerPublication`	integer	50	Cap per publication. `0` = entire archive. Lower = cheaper
`includeContent`	boolean	true	Fetch each post's full HTML body
`includeComments`	boolean	false	Fetch comments for each post (each comment is a separate result)
`onlyFreePosts`	boolean	false	Skip paid / subscriber-only posts in archives
`searchQuery`	string	—	Filter the publication archive by keyword
`discoveryMode`	`none` / `leaderboard` / `search`	`none`	Auto-discover many publications without providing URLs
`discoveryQuery`	string	—	Keyword for `search` discovery
`maxPublicationsToDiscover`	integer	25	Cap on discovered publications. Lower = cheaper
`maxConcurrency`	integer	5	Parallel requests

Discovery mode — scrape many publications without a list

If you don't have a list of specific newsletters, turn on Discovery mode and the actor will find publications for you:

Top publications (leaderboard) — seeds from 5 curated top Substacks and expands through each publication's recommendations until the limit is hit
Search (search) — same expansion, plus your discoveryQuery keyword filters every discovered publication's archive

Each discovered publication is then scraped using the same mode / maxPostsPerPublication settings as startUrls, so you can go from zero URLs to a full corpus in one run. Discovery is off by default — a discovery run scrapes many publications and produces a correspondingly large dataset.

{
  "discoveryMode": "search",
  "discoveryQuery": "AI",
  "maxPublicationsToDiscover": 50,
  "mode": "posts",
  "maxPostsPerPublication": 20,
  "includeContent": true
}
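The "seed, then expand through recommendations until the limit is hit" behaviour can be pictured as a graph traversal. The sketch below assumes a breadth-first order and a hypothetical in-memory `recommendations` mapping; the page does not document the actor's real traversal order or data source, so treat this purely as a mental model.

```python
from collections import deque


def discover(seeds, recommendations, limit):
    """Expand from seed publications through recommendations, up to `limit`.

    `recommendations` is a hypothetical dict: publication -> list of
    publications it recommends. Breadth-first order is an assumption.
    """
    seen = set()
    order = []
    queue = deque(seeds)
    while queue and len(order) < limit:
        pub = queue.popleft()
        if pub in seen:
            continue  # each publication is visited at most once
        seen.add(pub)
        order.append(pub)
        queue.extend(recommendations.get(pub, []))
    return order


if __name__ == "__main__":
    graph = {"a": ["b", "c"], "b": ["c", "d"], "c": ["a"], "d": []}
    print(discover(["a"], graph, limit=3))
```

The `limit` argument plays the role of `maxPublicationsToDiscover`: the traversal stops as soon as that many distinct publications have been collected.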

Example input

{
  "startUrls": [
    {"url": "https://www.thefitzwilliam.com"},
    {"url": "https://noahpinion.substack.com"},
    {"url": "https://substack.com/@mattyglesias"}
  ],
  "mode": "posts",
  "maxPostsPerPublication": 100,
  "includeContent": true,
  "includeComments": false
}
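The same input can also be submitted programmatically. The following is a minimal sketch using the `apify-client` Python package; the actor ID is taken from this page's URL, while the `APIFY_TOKEN` environment variable name and the helper names `build_input` and `run_actor` are assumptions made for illustration.

```python
import os

# Actor ID as it appears in this page's URL.
ACTOR_ID = "doggo/substack-scraper-posts-comments-authors"


def build_input(urls, max_posts=10, include_comments=False):
    """Build a run input dict using the field names from the Input table."""
    return {
        "startUrls": [{"url": u} for u in urls],
        "mode": "posts",
        "maxPostsPerPublication": max_posts,
        "includeContent": True,
        "includeComments": include_comments,
    }


def run_actor(run_input):
    """Start the actor, wait for it to finish, and return all dataset items."""
    # Imported lazily so build_input works without the package installed.
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(os.environ["APIFY_TOKEN"])  # assumed env var name
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


if __name__ == "__main__":
    items = run_actor(build_input(["https://noahpinion.substack.com"]))
    print(len(items), "records")
```

Starting with a small `max_posts` value keeps a first test run cheap before scaling up.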

Output

All records land in the run's dataset with a type discriminator (post, publication, author, comment). The Output tab offers per-type table views (Posts, Publications, Authors, Comments); for exports, filter on the type field to split record types into separate files.
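Splitting a mixed export on the `type` field might look like the sketch below. It assumes the dataset has already been loaded as a list of dicts; the helper names `split_by_type` and `to_csv` are hypothetical, and nested values (such as an `authors` list) are serialized as JSON strings so they fit into a CSV cell.

```python
import csv
import io
import json
from collections import defaultdict


def split_by_type(records):
    """Group dataset records by their `type` discriminator."""
    groups = defaultdict(list)
    for record in records:
        groups[record.get("type", "unknown")].append(record)
    return dict(groups)


def to_csv(rows):
    """Render rows of one record type as CSV text."""
    fields = sorted({key for row in rows for key in row})
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fields)
    writer.writeheader()
    for row in rows:
        # Lists and dicts are JSON-encoded so each fits in a single cell.
        writer.writerow({
            k: json.dumps(v) if isinstance(v, (list, dict)) else v
            for k, v in row.items()
        })
    return buf.getvalue()


if __name__ == "__main__":
    records = [
        {"type": "post", "id": 1, "title": "Why newsletters won"},
        {"type": "comment", "id": 2, "postId": 1, "body": "Great piece."},
    ]
    for record_type, rows in split_by_type(records).items():
        with open(f"{record_type}.csv", "w", newline="") as fh:
            fh.write(to_csv(rows))
```

Each record type ends up in its own file with only the columns that type actually uses.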

Post record

{
  "type": "post",
  "id": 123456,
  "title": "Why newsletters won",
  "slug": "why-newsletters-won",
  "url": "https://example.substack.com/p/why-newsletters-won",
  "publication": "example",
  "publicationName": "The Example",
  "publishedAt": "2026-02-01T14:00:00Z",
  "audience": "everyone",
  "isPaid": false,
  "author": "Jane Author",
  "authors": [{"id": 99, "name": "Jane Author", "handle": "janeauthor"}],
  "bodyHtml": "<p>...</p>",
  "bodyText": "...",
  "wordcount": 1842,
  "reactionCount": 213,
  "commentCount": 42,
  "restacks": 18,
  "postTags": ["media", "business"]
}

Publication record

{
  "type": "publication",
  "id": 42,
  "name": "The Example",
  "subdomain": "example",
  "customDomain": null,
  "url": "https://example.substack.com",
  "description": "A newsletter about newsletters.",
  "categoryName": "Business",
  "totalSubscribers": 48211,
  "paidSubscribers": 1203,
  "createdAt": "2022-06-14T09:12:00Z"
}

Author record

{
  "type": "author",
  "id": 99,
  "name": "Jane Author",
  "handle": "janeauthor",
  "profileUrl": "https://substack.com/@janeauthor",
  "bio": "Writing about media.",
  "photoUrl": "https://.../photo.jpg",
  "publications": [{"publicationName": "The Example", "subdomain": "example", "role": "admin"}],
  "subscriptions": [{"publicationName": "Noahpinion", "subdomain": "noahpinion"}]
}

Comment record

{
  "type": "comment",
  "id": 55512,
  "postId": 123456,
  "postSlug": "why-newsletters-won",
  "postTitle": "Why newsletters won",
  "publication": "example",
  "parentId": null,
  "depth": 0,
  "body": "Great piece.",
  "authorName": "A Reader",
  "authorHandle": "areader",
  "publishedAt": "2026-02-01T16:30:00Z",
  "reactionCount": 4
}
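Because each comment arrives as a flat record with `id` and `parentId`, a nested thread can be rebuilt on the consumer side. The sketch below assumes all comments for one post are already in memory; the `replies` key and the `build_threads` helper are hypothetical additions, and a comment whose parent is missing from the input is treated as a top-level comment.

```python
def build_threads(comments):
    """Rebuild nested threads from flat comment records.

    Returns the top-level comments; each node gains a `replies` list
    (a key added here for illustration, not part of the actor's output).
    """
    # Copy each record so the caller's dicts are not mutated.
    nodes = {c["id"]: {**c, "replies": []} for c in comments}
    roots = []
    for node in nodes.values():
        parent = nodes.get(node["parentId"])
        if parent is None:
            roots.append(node)  # top-level, or parent not in this batch
        else:
            parent["replies"].append(node)
    return roots


if __name__ == "__main__":
    flat = [
        {"id": 1, "parentId": None, "body": "Great piece."},
        {"id": 2, "parentId": 1, "body": "Agreed."},
        {"id": 3, "parentId": 2, "body": "Same here."},
    ]
    for root in build_threads(flat):
        print(root["body"], "->", [r["body"] for r in root["replies"]])
```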

How to scrape Substack (step-by-step)

1. Click "Try for free" at the top of this page to open the Apify Console.
2. Paste your target URLs into the Start URLs field. Examples:
   - A publication: https://stratechery.com or https://noahpinion.substack.com
   - A single post: https://example.substack.com/p/some-post
   - An author profile: https://substack.com/@handle
   - A share link: https://open.substack.com/pub/astralcodexten/p/some-post
3. Set maxPostsPerPublication: start with 10 for a test, then raise it (or set 0 for the whole archive).
4. Click "Start". When the run completes, open the Output tab to browse results, or click Export for CSV / JSON / Excel.

FAQ

How am I charged? Per record in your results — each post, publication, author, and comment counts as one result. Failed or retried requests are never charged, and you'll never receive the same post twice. Control your bill with maxPostsPerPublication, includeComments, and maxPublicationsToDiscover; you can also set a maximum budget for any run in the Apify Console.
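A rough way to size a run before starting it is to count the results it could produce. The sketch below is back-of-the-envelope arithmetic only: it assumes every result type is billed at the listed "from $4.00 / 1,000" rate, which this page states only for posts, and it ignores any publication or author records a run may also emit. The function names are hypothetical.

```python
def estimate_results(publications, posts_per_publication, avg_comments_per_post=0):
    """Upper-bound count of post and comment results for a run."""
    posts = publications * posts_per_publication
    comments = posts * avg_comments_per_post
    return posts + comments


def estimate_cost(results, usd_per_thousand=4.00):
    """Cost in USD, assuming a flat per-result rate (an assumption)."""
    return results / 1000 * usd_per_thousand


if __name__ == "__main__":
    # 25 discovered publications, 20 posts each, comments switched off.
    n = estimate_results(25, 20)
    print(n, "results, about $%.2f" % estimate_cost(n))
```

Turning on `includeComments` multiplies the count quickly: with an average of 5 comments per post, 2 publications of 10 posts each already yield 120 results instead of 20.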

Does it scrape paywalled posts? Paid posts are listed with metadata and the free preview text; full paid bodies require a subscriber login, which this scraper does not use. Enable onlyFreePosts to skip them entirely.

How many comments will a post produce? Whatever the thread holds — popular posts can carry hundreds of comments, each delivered (and charged) as its own result. Leave includeComments off unless you need them.

Will it get blocked? Blocking is handled for you: proxy rotation, retries, and rate-limit handling are built in, so no setup is needed on your side.

Can I schedule it? Yes — use Apify Schedules for daily/weekly monitoring runs, and connect the dataset to Google Sheets, webhooks, or the API for delivery.

URL: https://apify.com/doggo/substack-scraper-posts-comments-authors
