Cheap Reddit Scraper - Posts, Comments, Scores & Subreddits
Pricing
Pay per usage
Cheap Reddit Scraper - Posts, Comments, Scores & Subreddits
Cheapest Reddit scraper: posts, comments, scores & subreddits. 100% free, no subscription, no API key. Works in Claude, ChatGPT & any MCP-compatible AI agent.
Pricing
Pay per usage
Rating
0.0
(0)
Developer
Actor stats
0
Bookmarked
3
Total users
2
Monthly active users
3 days ago
Last modified
Categories
Share
Reddit Scraper β Pay Per Result, Deep History, Full Comment Trees
Scrape Reddit posts, comments, subreddits, search results, and user histories β without a monthly rental fee. This actor charges only for posts actually scraped, handles deep historical backfill through Reddit's full pagination API, and delivers nested comment trees that most alternatives silently truncate.
Keywords: reddit scraper, reddit api, scrape reddit posts, reddit comments scraper, subreddit scraper, reddit data, reddit post scraper
Why This Beats the Alternatives
The most-used Reddit scrapers on the Apify Store have a consistent set of problems:
| Problem with incumbents | This actor |
|---|---|
| Monthly rental ($45+/mo) regardless of usage | Pay only for what you scrape β $2.50 per 1,000 posts |
| Comment trees silently truncated or broken | Full recursive comment trees up to configurable depth |
| Hard ceiling of ~1,000 posts per run | Paginate as deep as Reddit allows via after cursor |
| Charged even when a run returns zero results | Zero charge on empty runs β actor fails with explanation |
| No way to resume or do historical backfill | Expose the after cursor so you can resume exactly where you left off |
| Opaque errors | Every error pushed as a typed item in the dataset; summary always present |
The two-year track record of 2.7-star reviews on competing actors comes down to these four issues. This actor was built specifically to fix all of them.
Features
- 4 scraping modes: subreddit listings, Reddit-wide search, user post history, and specific post URLs
- Deep historical backfill: paginate through Reddit's full history using the
aftercursor β not capped at 100 or 1,000 posts - Cursor resume: the summary item always contains
last_after_cursorso you can continue a run exactly where it left off by passing it as input - Full nested comment trees: recursive extraction up to configurable depth (default 3), with
morestubs preserved so you know where truncation occurred - Honest billing: PPE (pay-per-event) β charged exactly once per successfully scraped post, never on fetch errors or empty runs
- Zero charge on empty runs: if no posts are scraped the actor calls
Actor.fail()with a plain-English reason β your balance is untouched - Graceful error handling: HTTP 429 with exponential backoff (5s / 10s / 20s), 404 and 403 handled per-resource, never crashes the whole run on a single bad post
- Responsible rate limiting: 1-second delay between page requests, proper
User-Agentheader on every request per Reddit's API requirements
Output Schema
Each post is one dataset item:
{"id":"abc123","name":"t3_abc123","subreddit":"MachineLearning","subreddit_id":"t5_2fwo","title":"GPT-4 vs Claude 3 β which wins on reasoning?","author":"username123","score":1842,"upvote_ratio":0.97,"url":"https://i.imgur.com/example.jpg","permalink":"https://www.reddit.com/r/MachineLearning/comments/abc123/...","selftext":"I ran 200 reasoning tasks on both...","is_self":true,"domain":"self.MachineLearning","flair":"Discussion","num_comments":247,"created_utc":1717200000,"is_pinned":false,"is_locked":false,"awards_count":5,"scraped_at":"2026-06-05T10:00:00.000Z","comments":[{"id":"xyz789","author":"commenter_a","body":"Claude wins on multi-step reasoning, GPT-4 on coding.","score":312,"created_utc":1717201000,"is_submitter":false,"depth":0,"replies":[{"id":"pqr456","author":"commenter_b","body":"Agree, tested this last week.","score":88,"created_utc":1717201500,"is_submitter":false,"depth":1,"replies":[]}]}]}
comments is only present when includeComments=true. Deleted posts have author: "[deleted]" and selftext: "[removed]" β exactly as Reddit returns them.
At the end of every run, a summary item is pushed:
{"_type":"summary","posts_requested":200,"posts_scraped":197,"posts_failed":3,"subreddits_processed":2,"charged_for":197,"last_after_cursor":"t3_xyz999","scraped_at":"2026-06-05T10:04:22.000Z"}
Pass last_after_cursor back in as the after input field on your next run to continue from exactly that point β this is the deep historical backfill feature.
Pricing
$2.50 per 1,000 posts scraped.
- No monthly fee, no minimum, no rental
- Charged per-event (PPE) β only on successful scrapes
- Zero charge if the run returns no results
- A run scraping 250 posts costs $0.625
This is the standard Apify pay-per-result model. You pay for outcomes, not for time on the platform.
Use Cases
- Market research β scrape subreddit discussions to understand what customers say about your product or competitors, unsolicited and unfiltered
- Sentiment analysis β collect scored posts and comments to build training datasets or feed NLP pipelines
- AI training data β gather large volumes of human-written text across diverse topics for fine-tuning or RLHF
- Brand monitoring β track mentions of your brand or keywords across all of Reddit with scheduled runs
- Academic research β download historical threads with full comment trees for discourse analysis, misinformation studies, community dynamics
- Competitive intelligence β monitor competitor subreddits, feature request threads, and complaint patterns
Rate Limits & Responsible Use
Reddit's public JSON API does not require OAuth for read-only access but does require:
- A descriptive
User-Agentheader (included in every request) - No more than ~1 request per second (enforced by the 1s inter-page delay)
This actor targets Reddit's public JSON endpoints (*.reddit.com/*.json) rather than scraping rendered HTML. This is the same approach used by countless Reddit apps and is stable. If Reddit returns HTTP 429, the actor backs off automatically (5s β 10s β 20s) before retrying.
Do not use this actor to:
- Harvest personal data for advertising or resale
- Bypass Reddit's rate limits by running many parallel actors simultaneously
- Violate Reddit's User Agreement or Privacy Policy
Private subreddits, banned subreddits, and suspended users will return 403/404 β those runs still push error items explaining why, and you are not charged for them.
Input Reference
| Field | Type | Default | Description |
|---|---|---|---|
mode | enum | β | subreddit, search, user, or post |
subreddits | string[] | [] | Subreddit names (mode=subreddit) |
searchQuery | string | β | Search query (mode=search) |
username | string | β | Reddit username (mode=user) |
postUrls | string[] | [] | Full post URLs (mode=post) |
sortBy | enum | hot | hot, new, top, rising |
timeframe | enum | all | hour, day, week, month, year, all |
maxPosts | integer | 25 | Max posts to scrape (1β1000) |
includeComments | boolean | false | Fetch comment trees |
maxCommentsPerPost | integer | 100 | Max comments per post (1β500) |
maxDepth | integer | 3 | Max comment reply depth |
after | string | β | Pagination cursor for historical backfill resume |
skipPinnedPosts | boolean | false | Skip stickied/pinned posts |
