URL: https://apify.com/alex_claw/reddit-scraper

Reddit Scraper · Apify


Pricing: Pay per usage
Rating: 0.0 (0)
Developer: Alex Claw (Maintained by Community)
Bookmarked: 0
Total users: 30
Monthly active users: 6
Last modified: 4 months ago

Scrape Reddit subreddits for posts, comments, scores, awards, and engagement metrics. No API key or Reddit account required.

Features

  • No API key needed -- uses Reddit's public JSON API
  • Multiple sort options -- hot, new, top (with time filters), rising
  • Post metadata -- title, author, score, upvote ratio, flair, awards, NSFW/spoiler flags
  • Comments -- optionally fetch comment trees with depth, scores, and reply counts
  • Pagination -- automatically pages through results up to your specified limit
  • Multi-subreddit -- scrape multiple subreddits in a single run
  • Rate-limit handling -- built-in delays and exponential backoff on 429 responses
  • Proxy support -- optional proxy configuration for large-scale scraping
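
The rate-limit handling listed above can be illustrated with a minimal sketch of exponential backoff on 429 responses. This is not the actor's actual code: the function name `fetch_with_backoff`, the `send` callable, and the delay values are assumptions made for the example.

```python
import random
import time


def fetch_with_backoff(send, url, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call send(url) and retry with exponential backoff while it returns HTTP 429.

    `send` is a hypothetical callable returning (status_code, body); it stands in
    for whatever HTTP client is actually used.
    """
    for attempt in range(max_retries + 1):
        status, body = send(url)
        if status != 429:
            return status, body
        if attempt == max_retries:
            break
        # The delay doubles on each retry (base, 2*base, 4*base, ...) plus a little jitter.
        delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
        sleep(delay)
    raise RuntimeError(f"still rate-limited after {max_retries} retries: {url}")
```

Passing `sleep` as a parameter keeps the sketch easy to exercise without real waiting; a production version would likely also honor a `Retry-After` header when one is present.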

Input

Parameter | Type | Default | Description
subreddits | array | required | Subreddit names (e.g., ["python", "programming"]). No r/ prefix needed.
maxPostsPerSubreddit | integer | 100 | Max posts to scrape per subreddit (1-5000)
sort | string | "hot" | Sort order: hot, new, top, rising
topTimeFilter | string | "day" | Time filter for top sort: hour, day, week, month, year, all
includeComments | boolean | false | Fetch comments for each post (increases run time)
maxCommentsPerPost | integer | 50 | Max comments per post (1-500, only used when includeComments is true)
proxyConfiguration | object | none | Proxy settings for requests
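
The defaults and ranges in the table can be checked before starting a run. The sketch below is a hypothetical client-side helper (`normalize_input` is not part of the actor); stripping an accidental "r/" prefix is an assumption of this example, not documented actor behavior.

```python
SORTS = {"hot", "new", "top", "rising"}
TIME_FILTERS = {"hour", "day", "week", "month", "year", "all"}


def normalize_input(raw):
    """Apply the documented defaults and check the documented ranges."""
    subreddits = raw.get("subreddits")
    if not subreddits:
        raise ValueError("subreddits is required")
    cfg = {
        # Assumption: tolerate an accidental "r/" prefix by stripping it.
        "subreddits": [s[2:] if s.startswith("r/") else s for s in subreddits],
        "maxPostsPerSubreddit": raw.get("maxPostsPerSubreddit", 100),
        "sort": raw.get("sort", "hot"),
        "topTimeFilter": raw.get("topTimeFilter", "day"),
        "includeComments": raw.get("includeComments", False),
        "maxCommentsPerPost": raw.get("maxCommentsPerPost", 50),
    }
    if not 1 <= cfg["maxPostsPerSubreddit"] <= 5000:
        raise ValueError("maxPostsPerSubreddit must be between 1 and 5000")
    if cfg["sort"] not in SORTS:
        raise ValueError(f"sort must be one of {sorted(SORTS)}")
    if cfg["topTimeFilter"] not in TIME_FILTERS:
        raise ValueError(f"topTimeFilter must be one of {sorted(TIME_FILTERS)}")
    # maxCommentsPerPost only matters when comments are requested.
    if cfg["includeComments"] and not 1 <= cfg["maxCommentsPerPost"] <= 500:
        raise ValueError("maxCommentsPerPost must be between 1 and 500")
    if "proxyConfiguration" in raw:
        cfg["proxyConfiguration"] = raw["proxyConfiguration"]
    return cfg
```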

Example Input

{
  "subreddits": ["python", "programming", "learnpython"],
  "maxPostsPerSubreddit": 50,
  "sort": "top",
  "topTimeFilter": "week",
  "includeComments": true,
  "maxCommentsPerPost": 20
}
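
One way to submit an input like this programmatically is through the `apify-client` Python package. The sketch below assumes that package is installed and that the actor ID matches the store URL path (`alex_claw/reddit-scraper`); `run_scraper` is a hypothetical helper name.

```python
ACTOR_ID = "alex_claw/reddit-scraper"  # assumption: taken from the store URL


def run_scraper(client, run_input):
    """Start the actor, wait for it to finish, and return its dataset items as a list.

    `client` is expected to be an apify_client.ApifyClient instance.
    """
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


# Usage (requires `pip install apify-client` and an Apify API token):
#   from apify_client import ApifyClient
#   items = run_scraper(ApifyClient("<YOUR_APIFY_TOKEN>"), {"subreddits": ["python"]})
```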

Output

Each post is saved as a dataset item:

{
  "subreddit": "python",
  "postId": "abc123",
  "title": "What's the best Python web framework in 2026?",
  "author": "pythonista42",
  "score": 1234,
  "upvoteRatio": 0.95,
  "numComments": 45,
  "createdUtc": "2026-02-24T10:00:00+00:00",
  "selfText": "I've been comparing Django, FastAPI, and...",
  "url": "https://www.reddit.com/r/python/comments/abc123/...",
  "permalink": "https://www.reddit.com/r/python/comments/abc123/...",
  "isVideo": false,
  "thumbnail": "self",
  "flair": "Discussion",
  "awards": 3,
  "postUrl": "https://www.reddit.com/r/python/comments/abc123/...",
  "domain": "self.python",
  "isNsfw": false,
  "isSpoiler": false,
  "isStickied": false,
  "comments": [
    {
      "commentId": "xyz789",
      "author": "webdev99",
      "body": "FastAPI for APIs, Django for full-stack...",
      "score": 567,
      "createdUtc": "2026-02-24T10:30:00+00:00",
      "depth": 0,
      "repliesCount": 12,
      "isStickied": false,
      "awards": 1
    }
  ]
}

When includeComments is false, the comments field is omitted.
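
Downstream code therefore has to tolerate a missing comments key. As a minimal sketch (the helper `summarize_post` and its output keys are illustrative, not part of the actor), a dataset item can be reduced like this:

```python
from datetime import datetime


def summarize_post(item):
    """Reduce one dataset item to a few fields, tolerating a missing comments key."""
    comments = item.get("comments", [])  # the key is omitted when includeComments is false
    top = max(comments, key=lambda c: c["score"], default=None)
    return {
        "postId": item["postId"],
        # createdUtc is an ISO 8601 string with a UTC offset, so this is timezone-aware.
        "created": datetime.fromisoformat(item["createdUtc"]),
        "score": item["score"],
        "commentsFetched": len(comments),
        "topCommentId": top["commentId"] if top else None,
    }
```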

How It Works

This actor uses Reddit's public JSON API, which is available by appending .json to any Reddit URL:

  • Subreddit listings: https://www.reddit.com/r/{subreddit}/{sort}.json
  • Post comments: https://www.reddit.com/r/{subreddit}/comments/{post_id}.json

No authentication is required. The actor uses a descriptive User-Agent header as recommended by Reddit's API guidelines.
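
The listing endpoint above can be paged roughly as follows. This is a sketch under assumptions, not the actor's source: `limit`, `after`, and `t` are Reddit's listing query parameters, the response shape (`data.children[].data`, `data.after`) is Reddit's standard listing format, and `get_json` is a hypothetical callable that fetches a URL and returns the parsed JSON.

```python
from urllib.parse import urlencode


def listing_url(subreddit, sort="hot", after=None, limit=100, time_filter=None):
    """Build a subreddit listing URL in the pattern shown above."""
    params = {"limit": limit}
    if after:
        params["after"] = after
    if sort == "top" and time_filter:
        params["t"] = time_filter
    return f"https://www.reddit.com/r/{subreddit}/{sort}.json?{urlencode(params)}"


def iter_posts(get_json, subreddit, sort="hot", max_posts=100, time_filter=None):
    """Yield post dicts, following the listing's `after` cursor until max_posts."""
    after, seen = None, 0
    while seen < max_posts:
        url = listing_url(subreddit, sort, after, min(100, max_posts - seen), time_filter)
        page = get_json(url)
        children = page["data"]["children"]
        for child in children:
            yield child["data"]
            seen += 1
            if seen >= max_posts:
                return
        after = page["data"].get("after")
        if not after or not children:
            return  # no further pages
```

In a real run, `get_json` would send the descriptive User-Agent header mentioned above and would be wrapped in rate-limit handling.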

Use Cases

  • Market research -- monitor discussions about your product, competitors, or industry
  • Content analysis -- find trending topics, popular content formats, engagement patterns
  • Sentiment analysis -- collect posts and comments for NLP/sentiment pipelines
  • Lead generation -- find users asking questions your product solves
  • Academic research -- collect public discourse data for analysis
  • SEO research -- discover what topics generate high engagement in your niche

Pricing

Pay per result: $2.00 per 1,000 posts scraped (comments included at no extra cost).
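
As a quick worked example of that rate, the Example Input earlier (3 subreddits, 50 posts each) yields at most 150 posts, or about $0.30. The helper below is a hypothetical estimate based only on the per-result price stated above; it ignores any other platform charges.

```python
PRICE_PER_1000_POSTS = 2.00  # USD, from the pricing line above


def estimate_cost(num_subreddits, posts_per_subreddit):
    """Upper-bound cost in USD, assuming every subreddit returns the full number of posts."""
    return num_subreddits * posts_per_subreddit * PRICE_PER_1000_POSTS / 1000
```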

Important: Proxy Recommended

Reddit aggressively blocks datacenter IPs. A residential proxy is recommended for reliable scraping. Configure the proxy in the input:

{
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}

Limitations

  • Proxy recommended -- Reddit blocks most datacenter IPs with 403 errors
  • Only works with public subreddits (private/quarantined subreddits are not accessible)
  • Reddit's pagination caps at approximately 1,000 posts per listing, so a run may return fewer posts than a larger maxPostsPerSubreddit requests
  • Rate limiting: the actor respects Reddit's rate limits with built-in delays
  • Some posts/comments from deleted or suspended users may show [deleted]
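
If [deleted] placeholders are unwanted in downstream analysis, they can be filtered out after the run. The sketch below assumes the placeholder appears in the author field of posts and in the author or body field of comments; `drop_deleted` is a hypothetical helper, not part of the actor.

```python
DELETED = "[deleted]"


def drop_deleted(items):
    """Return copies of items without [deleted] posts or [deleted] comments."""
    kept = []
    for item in items:
        if item.get("author") == DELETED:
            continue
        cleaned = dict(item)  # shallow copy so the input list is left untouched
        if "comments" in item:
            cleaned["comments"] = [
                c for c in item["comments"]
                if c.get("author") != DELETED and c.get("body") != DELETED
            ]
        kept.append(cleaned)
    return kept
```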
