URL: https://apify.com/cryptosignals/substack-scraper

Substack Newsletter Scraper - Posts, Authors & Stats · Apify


Substack Scraper - Posts, Authors & Newsletters

Pricing

$5.00 / 1,000 results scraped

Extract Substack newsletter content. Get post titles, authors, publish dates, paywall status, subscriber counts, and full article text. Ideal for newsletter research and content monitoring. PPE (pay-per-event) pricing: pay only for results.

Rating: 0.0 (0)

Developer: Web Data Labs (Maintained by Community)

Actor stats

Bookmarked: 0
Total users: 27
Monthly active users: 7
Last modified: a month ago

Substack Scraper - Posts, Comments & Publication Data

Extract structured data from any Substack newsletter at scale. Scrape posts with full article text, reader comments, and publication metadata, with no login required. Export to JSON, CSV, or Excel with a single click.

Why Use This Scraper?

Substack has grown into one of the most important platforms for independent journalism, thought leadership, and niche expertise. With over 35 million active subscriptions and 17,000+ paid writers, it's a goldmine for researchers, marketers, and analysts, but Substack offers no bulk export or public API.

This actor solves that. It programmatically extracts posts, comments, and publication info from any Substack newsletter, giving you clean, structured data ready for analysis.

Key Features

  • Three scrape modes: Posts, comments, and publication info
  • Search across Substack: Find posts by keyword across the entire platform
  • Publication-specific scraping: Target one or more newsletters by subdomain
  • Full article text: Optionally include the complete body text of each post
  • Flexible sorting: Sort by newest or top-performing posts
  • Scale control: Scrape from 1 to 500 items per run
  • No authentication needed: Works without any Substack account
  • Multiple export formats: JSON, CSV, Excel, XML, HTML

Use Cases

1. Content Research & Competitive Analysis

Track what topics are trending across newsletters in your industry. Monitor competitors' publishing frequency, engagement, and content strategy.

2. Media Monitoring & PR Intelligence

Set up regular scrapes to track mentions of your brand, product, or industry across Substack newsletters. Stay ahead of narratives before they hit mainstream media.

3. Academic & Market Research

Collect large datasets of expert opinion pieces, industry analysis, and commentary for qualitative research. Study how narratives form and spread through independent media.

4. Newsletter Discovery & Curation

Search for newsletters covering specific topics, then scrape their publication info to evaluate subscriber counts, posting cadence, and content quality.

5. Sentiment & Trend Analysis

Extract posts about specific topics or companies, then run NLP or sentiment analysis on the text. Detect shifts in expert opinion over time.

6. Lead Generation for B2B

Find Substack authors writing about your domain and extract their publication details. These are high-value contacts who are actively engaged in your space.

7. Content Repurposing & Summarization

Pull posts from newsletters you subscribe to and feed them into LLMs for summarization, translation, or content repurposing workflows.

Input Parameters

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| publications | Array of strings | No | (none) | Substack subdomains to scrape (e.g., platformer for platformer.substack.com) |
| searchQuery | String | No | (none) | Search keyword to find posts across all of Substack |
| scrapeType | String | No | posts | What to scrape: posts, comments, or info |
| maxItems | Integer | No | 50 | Maximum items to return (1–500) |
| sortBy | String | No | new | Sort order: new (newest first) or top (most popular) |
| includeBodyText | Boolean | No | false | Include the full body text of each post |

Tip: Use publications to target specific newsletters, or searchQuery to search across the entire platform. You can combine both.
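The defaults and limits in the table can be checked locally before a run is started. The following is a minimal sketch, not part of the actor: `build_run_input` is a hypothetical helper name, and the rule that at least one of `publications` or `searchQuery` must be present is an assumption, since the table marks both as optional.

```python
from typing import Any, Dict, List, Optional

ALLOWED_SCRAPE_TYPES = {"posts", "comments", "info"}
ALLOWED_SORT_ORDERS = {"new", "top"}


def build_run_input(
    publications: Optional[List[str]] = None,
    search_query: Optional[str] = None,
    scrape_type: str = "posts",
    max_items: int = 50,
    sort_by: str = "new",
    include_body_text: bool = False,
) -> Dict[str, Any]:
    """Build an actor input dict using the defaults and limits from the table."""
    if scrape_type not in ALLOWED_SCRAPE_TYPES:
        raise ValueError(f"scrapeType must be one of {sorted(ALLOWED_SCRAPE_TYPES)}")
    if sort_by not in ALLOWED_SORT_ORDERS:
        raise ValueError(f"sortBy must be one of {sorted(ALLOWED_SORT_ORDERS)}")
    if not 1 <= max_items <= 500:
        raise ValueError("maxItems must be between 1 and 500")
    if not publications and not search_query:
        # Assumption: the listing does not say what happens when both are omitted.
        raise ValueError("provide publications, searchQuery, or both")

    run_input: Dict[str, Any] = {
        "scrapeType": scrape_type,
        "maxItems": max_items,
        "sortBy": sort_by,
        "includeBodyText": include_body_text,
    }
    if publications:
        run_input["publications"] = list(publications)
    if search_query:
        run_input["searchQuery"] = search_query
    return run_input


print(build_run_input(publications=["platformer"], max_items=20))
```

The returned dict has the same shape as the `run_input` passed to the client in the integration examples, so invalid values fail before any run is started.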

Sample Output

Posts Output

{
  "title": "The AI Trust Crisis",
  "subtitle": "Why users are losing faith in AI-generated content",
  "slug": "the-ai-trust-crisis",
  "publishedAt": "2026-03-01T10:30:00.000Z",
  "canonicalUrl": "https://platformer.substack.com/p/the-ai-trust-crisis",
  "author": "Casey Newton",
  "publicationName": "Platformer",
  "publicationSubdomain": "platformer",
  "likes": 847,
  "comments": 132,
  "wordCount": 2450,
  "isPaywalled": false,
  "previewText": "The past month has brought a reckoning for AI companies...",
  "coverImage": "https://substackcdn.com/image/fetch/...",
  "tags": ["AI", "trust", "technology"]
}
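Records shaped like the sample above can be ranked and dated with the standard library alone. This is a sketch under the assumption that every post carries `title`, `publishedAt`, and `likes` as shown; the helper names are illustrative, and the second sample record is invented for the demonstration.

```python
from datetime import datetime


def parse_published_at(value: str) -> datetime:
    """Parse a timestamp shaped like '2026-03-01T10:30:00.000Z'."""
    # Older Python versions do not accept a trailing 'Z' in fromisoformat(),
    # so it is rewritten as an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def top_posts(posts, limit=3):
    """Return (title, likes, published date) tuples for the most-liked posts."""
    ranked = sorted(posts, key=lambda post: post.get("likes", 0), reverse=True)
    return [
        (post["title"], post.get("likes", 0), parse_published_at(post["publishedAt"]).date())
        for post in ranked[:limit]
    ]


# The first record is trimmed from the sample output; the second is made up.
sample_posts = [
    {"title": "The AI Trust Crisis", "publishedAt": "2026-03-01T10:30:00.000Z", "likes": 847},
    {"title": "Hypothetical Follow-Up", "publishedAt": "2026-03-08T09:00:00.000Z", "likes": 1200},
]

for title, likes, published in top_posts(sample_posts):
    print(f"{published} {title}: {likes} likes")
```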

Comments Output

{
  "body": "This is exactly what I've been seeing in my industry...",
  "author": "John Reader",
  "date": "2026-03-01T14:22:00.000Z",
  "likes": 23,
  "postTitle": "The AI Trust Crisis",
  "publicationSubdomain": "platformer"
}
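Because each comment record names its post in `postTitle`, comments can be rolled up per post after a run. The sketch below assumes the field names from the sample above; `summarize_comments` is a hypothetical helper, and the second comment is invented for the demonstration.

```python
from collections import defaultdict


def summarize_comments(comments):
    """Count comments and total comment likes per post title."""
    summary = defaultdict(lambda: {"comments": 0, "likes": 0})
    for comment in comments:
        entry = summary[comment["postTitle"]]
        entry["comments"] += 1
        entry["likes"] += comment.get("likes", 0)
    return dict(summary)


# The first record follows the sample output; the second is made up.
sample_comments = [
    {"author": "John Reader", "likes": 23, "postTitle": "The AI Trust Crisis"},
    {"author": "Hypothetical Reader", "likes": 5, "postTitle": "The AI Trust Crisis"},
]

print(summarize_comments(sample_comments))
```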

Publication Info Output

{
  "name": "Platformer",
  "subdomain": "platformer",
  "description": "Tech and democracy coverage",
  "authorName": "Casey Newton",
  "heroImage": "https://substackcdn.com/image/fetch/...",
  "logoUrl": "https://substackcdn.com/image/fetch/...",
  "themeColor": "#FF6719",
  "subscriberCount": 250000,
  "postCount": 1200
}

Integration Examples

Python

from apify_client import ApifyClient

# Authenticate with your Apify API token.
client = ApifyClient("YOUR_API_TOKEN")

run_input = {
    "publications": ["platformer", "thebrowser"],
    "scrapeType": "posts",
    "maxItems": 50,
    "sortBy": "new",
    "includeBodyText": True,
}

# Start the actor and wait for the run to finish.
run = client.actor("cryptosignals/substack-scraper").call(run_input=run_input)

# Iterate over the items stored in the run's default dataset.
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"{item['title']} - {item.get('likes', 0)} likes")

Node.js

import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

const input = {
    publications: ["platformer", "thebrowser"],
    scrapeType: "posts",
    maxItems: 50,
    sortBy: "new",
    includeBodyText: true,
};

// Start the actor and wait for the run to finish.
const run = await client.actor("cryptosignals/substack-scraper").call(input);

// Read the items from the run's default dataset.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.log(`${item.title} - ${item.likes || 0} likes`);
});

Using the Apify API Directly

curl -X POST "https://api.apify.com/v2/acts/cryptosignals~substack-scraper/runs?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "publications": ["platformer"],
    "scrapeType": "posts",
    "maxItems": 20
  }'

Pricing & Costs

This actor runs on the Apify platform using your account's compute units (CUs).

| Scenario | Estimated Cost |
|---|---|
| 50 posts from one publication | ~$0.01–$0.02 |
| 200 posts from multiple publications | ~$0.05–$0.10 |
| 500 posts with full body text | ~$0.10–$0.25 |

Costs depend on the number of items, whether body text is included (larger payloads), and the Apify plan you're on. Free plan users get $5/month in platform credits, enough for hundreds of scrapes.
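The listing header quotes a per-result price of $5.00 per 1,000 results, while the estimates in this section are described in terms of compute units. This sketch does not settle which of the two applies to a given run; it only shows the arithmetic under the assumption that the per-result price is what is billed. `estimate_cost` is a hypothetical helper.

```python
from decimal import Decimal

# Price taken from the listing header; assumed to apply per returned result.
PRICE_PER_1000_RESULTS = Decimal("5.00")


def estimate_cost(result_count: int) -> Decimal:
    """Estimate the per-result charge in USD, rounded to cents."""
    if result_count < 0:
        raise ValueError("result_count must not be negative")
    return (PRICE_PER_1000_RESULTS * result_count / 1000).quantize(Decimal("0.01"))


for count in (50, 200, 500):
    print(f"{count} results: ${estimate_cost(count)}")
```

Under that assumption the figures come out higher than the compute-unit estimates in the table, so it is worth checking the billing details of a small first run before scaling up.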

Tips for Best Results

  1. Start small: Set maxItems to 5–10 for your first run to verify the output format meets your needs.
  2. Use publication subdomains: For platformer.substack.com, enter just platformer in the publications list.
  3. Enable body text selectively: Full article text significantly increases output size. Only enable it when you need the content for analysis.
  4. Combine with Apify integrations: Send results directly to Google Sheets, Slack, Zapier, Make, or webhooks for automated workflows.
  5. Schedule regular runs: Set up recurring scrapes to build longitudinal datasets or monitor newsletters over time.

Frequently Asked Questions

Can I scrape paywalled/subscriber-only posts?

The scraper extracts publicly available data. For paywalled posts, you'll get the title, preview text, metadata, and publication info, but not the full subscriber-only content.
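Since each post record carries an `isPaywalled` flag, the results can be split before any text analysis so that preview-only posts are handled separately. A minimal sketch, assuming the field name from the sample posts output; `split_by_paywall` is a hypothetical helper and the second record is invented.

```python
def split_by_paywall(posts):
    """Partition post records into (free, paywalled) lists using isPaywalled."""
    free, paywalled = [], []
    for post in posts:
        # Assumption: a record without the flag is treated as free.
        (paywalled if post.get("isPaywalled", False) else free).append(post)
    return free, paywalled


sample_posts = [
    {"title": "The AI Trust Crisis", "isPaywalled": False},
    {"title": "Hypothetical Subscriber Post", "isPaywalled": True},
]

free, paywalled = split_by_paywall(sample_posts)
print(len(free), "free,", len(paywalled), "paywalled")
```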

How do I find a publication's subdomain?

Look at the newsletter URL. For https://platformer.substack.com, the subdomain is platformer. For custom domains, check the Substack about page.
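For URLs on substack.com, the subdomain can be pulled out programmatically. This is an illustrative sketch only: `substack_subdomain` is a hypothetical helper, and it deliberately rejects custom domains because the original subdomain cannot be derived from them.

```python
from urllib.parse import urlparse

SUBSTACK_SUFFIX = ".substack.com"


def substack_subdomain(url_or_name: str) -> str:
    """Return the subdomain for a *.substack.com URL or a bare subdomain."""
    text = url_or_name.strip()
    if "//" not in text and "." not in text:
        return text.lower()  # already a bare subdomain such as "platformer"
    # urlparse only fills in the hostname when a scheme is present.
    parsed = urlparse(text if "//" in text else f"https://{text}")
    host = (parsed.hostname or "").lower()
    if not host.endswith(SUBSTACK_SUFFIX):
        raise ValueError(f"{host or text!r} is not a substack.com address")
    return host[: -len(SUBSTACK_SUFFIX)]


print(substack_subdomain("https://platformer.substack.com"))
```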

Can I scrape custom domain Substack newsletters?

Yes. Use the publication's original Substack subdomain (before they switched to a custom domain). You can usually find it referenced on their about page or through a web search.

How often is the data updated?

Every run fetches live data directly from Substack. You always get the latest posts, comments, and metrics.

Is there a rate limit?

The scraper handles rate limiting automatically with built-in delays and retries. You don't need to configure anything.

Can I search for posts about a specific topic?

Yes! Use the searchQuery parameter to search across all of Substack, or combine it with publications to search within specific newsletters.

What export formats are available?

Apify supports JSON, CSV, Excel (XLSX), XML, HTML, and RSS. You can download in any format from the dataset tab after a run completes.
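The same formats can be requested over HTTP instead of through the dataset tab. The sketch below only builds the download URL and assumes the Apify API v2 dataset items endpoint with its `format` query parameter; verify the details against the Apify API reference. `dataset_items_url` is a hypothetical helper, and the dataset ID shown is a placeholder.

```python
from typing import Optional
from urllib.parse import urlencode

# Format names as listed in the answer above, written as query values.
DATASET_FORMATS = {"json", "csv", "xlsx", "xml", "html", "rss"}


def dataset_items_url(dataset_id: str, fmt: str = "json", token: Optional[str] = None) -> str:
    """Build a download URL for a dataset's items in the requested format."""
    if fmt not in DATASET_FORMATS:
        raise ValueError(f"format must be one of {sorted(DATASET_FORMATS)}")
    params = {"format": fmt}
    if token:
        params["token"] = token  # same token query parameter as the curl example
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{urlencode(params)}"


# "YOUR_DATASET_ID" stands in for the defaultDatasetId of a finished run.
print(dataset_items_url("YOUR_DATASET_ID", "csv"))
```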

How do I integrate this with my existing workflow?

Use Apify's built-in integrations (Zapier, Make, Google Sheets, webhooks) or call the API directly from any programming language. See the code examples above.

Can I run this on a schedule?

Yes. Apify supports cron-like scheduling. Set up daily, weekly, or custom schedules from the actor's Schedules tab. Each run stores results in a new dataset.

What happens if a publication doesn't exist?

The scraper will log a warning for invalid subdomains and continue processing the remaining publications. Your run won't fail because of one bad input.
