URL: https://apify.com/scraply/reddit-posts-scraper

Pricing

$19.99/month + usage

Reddit Posts Scraper

🔎 Reddit Posts Scraper pulls posts and comments from subreddits, Reddit URLs, or keyword searches: titles, bodies, scores, authors, timestamps, links, and media. 📊 Useful for research, social listening, SEO, sentiment, and trend analysis. ⚙️ Sort and time filters. 💾 Export to CSV or JSON. 🚀

Rating: 0.0 (0)

Developer: Scraply (maintained by Community)

Actor stats: 0 bookmarked, 2 total users, 0 monthly active users, last modified 2 months ago

Reddit Posts Scraper

Reddit Posts Scraper is a fast, reliable Reddit scraper that collects public posts (and optional comments) from subreddits, full Reddit URLs, or search keywords. It solves the pain of manual copying and API limits by returning clean, structured JSON ready for analysis. Built for marketers, developers, data analysts, and researchers, this subreddit scraper helps you scrape Reddit posts at scale for trend tracking, SEO research, social listening, and NLP pipelines. With parallel processing, smart retries, and proxy fallback, it enables high-volume Reddit thread scraping with production reliability. 🚀

What data / output can you get?

Below are the exact fields this Reddit web scraper pushes to the Apify dataset (one row per post):

Field | Description | Example value
subreddit | Community name the post belongs to | "technology"
title | Post title text | "Open-source LLM hits new benchmark"
author | Reddit username of the poster | "u_datawizard"
score | Post score (upvotes) | 842
num_comments | Number of comments on the post | 126
created_utc | Unix timestamp (UTC) when the post was created | 1703123456
permalink | Full permalink to the Reddit thread | "https://www.reddit.com/r/technology/comments/abc123/example_post/"
body | Selftext/body content for text posts | "Here’s a quick summary of the paper..."
thumbnail_url | Thumbnail image URL (if any) | "https://preview.redd.it/..."
image_url | Main media URL (if provided) | "https://i.redd.it/xyz.png"
comments | Array of nested comments (author, body, score, created_utc, replies) | [ { "author": "u_commenter", ... } ]
post_id | Unique Reddit post ID | "abc123"
success | Whether this post was processed successfully | true
error_message | Error details if processing failed | null

Notes:

  • Nested comments include replies with the same structure (author, body, score, created_utc, replies).
  • You can export results as JSON, CSV, or Excel from the Apify dataset UI or via the API.
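
Because every reply repeats the shape of its parent, a short recursive walk is enough to flatten a comment tree for analysis. The following is a minimal sketch, not part of the actor; the helper name walk_comments and the sample data are illustrative assumptions built from the fields listed above.

```python
from typing import Any, Iterator


def walk_comments(comments: list[dict[str, Any]], depth: int = 0) -> Iterator[tuple[int, dict[str, Any]]]:
    """Yield (depth, comment) pairs in depth-first order."""
    for comment in comments:
        yield depth, comment
        # Each reply has the same shape as its parent, so recurse into it.
        yield from walk_comments(comment.get("replies") or [], depth + 1)


# Hypothetical sample shaped like the documented comment objects.
sample = [
    {
        "author": "u_commenter1", "body": "Top-level comment", "score": 23,
        "created_utc": 1703123499,
        "replies": [
            {"author": "u_replier", "body": "Nested reply", "score": 7,
             "created_utc": 1703123600, "replies": []},
        ],
    },
]

flat = [(depth, c["author"], c["score"]) for depth, c in walk_comments(sample)]
print(flat)
```

The depth value makes it easy to keep only top-level comments (depth 0) or to indent replies when printing a thread.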

Key features

  • ⚡ Large-scale extraction: Parallel comment fetching and batched processing scrape posts efficiently across multiple sources.
  • 🧩 Flexible targeting: Scrape posts by subreddit name, full Reddit URL, or keyword in a single run.
  • 🔄 Sort & time filtering: Choose hot, new, top, or rising, with a time range for top/rising, to capture trend snapshots.
  • 🛡️ Resilient proxy fallback: Automatic fallback from no proxy → datacenter → residential if blocked, plus retries on 403/429/5xx responses and timeouts.
  • 💬 Optional comments: Set how many comments to fetch per post, or set 0 to skip comments entirely.
  • 💾 Structured outputs: Consistent JSON fields that export cleanly to CSV and are easy to join in dashboards and analytics.
  • 🧑‍💻 Developer-friendly: Built in Python with the Apify SDK; trigger it via the API and integrate with Make, Zapier, or n8n.
  • 🏗️ Production-ready reliability: Request pacing, parallelism controls, and detailed logging keep large runs stable.

How to use Reddit Posts Scraper - step by step

  1. Create or log in to your Apify account.
  2. Open the Reddit Posts Scraper in Apify Console.
  3. Add your sources in “Reddit URLs / Subreddits / Keywords”:
    • Enter subreddit names (e.g., “news” or “r/technology”), full Reddit URLs, or search keywords (e.g., “artificial intelligence”). One per line.
  4. Set optional controls:
    • Sort order (hot, new, top, rising) and time filter (hour, day, week, month, year, all) for top/rising.
    • Limits for maximum posts and maximum comments per post.
    • Proxy configuration (recommended for larger volumes).
  5. Start the run and monitor logs as posts are collected and comments are processed in parallel.
  6. Download results:
    • Go to the Dataset in the Output tab to preview results and export to JSON, CSV, or Excel.
  7. Pro tip: Trigger runs via the Apify API and pass the resulting dataset URLs to your data stack or automation tool (n8n, Make, Zapier) for scheduled scraping. No Reddit API credentials are needed, only your Apify API token.
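
As a rough illustration of the API route in step 7, the sketch below prepares a request to the Apify "run actor" endpoint using only the standard library. It builds the request without sending it. The actor ID "scraply~reddit-posts-scraper" is an assumption derived from the store URL, and YOUR_APIFY_TOKEN is a placeholder.

```python
import json
import urllib.request

# Assumption: actor ID in "username~actor-name" form, inferred from the store URL.
ACTOR_ID = "scraply~reddit-posts-scraper"


def build_run_request(token: str, actor_input: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST request that starts an actor run."""
    url = f"https://api.apify.com/v2/acts/{ACTOR_ID}/runs"
    return urllib.request.Request(
        url,
        data=json.dumps(actor_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


request = build_run_request(
    "YOUR_APIFY_TOKEN",  # placeholder, replace with a real token
    {"startUrls": ["news"], "maxPosts": 50, "maxComments": 0},
)
print(request.get_method(), request.full_url)
# To actually start the run: urllib.request.urlopen(request)
```

Sending the request returns a run object; its default dataset can then be read through the dataset items endpoint once the run has finished.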

Use cases

Use case | Description
Market & trend research | Track trending topics by keyword or subreddit to quantify engagement and sentiment over time.
Content & SEO research | Discover high-performing topics and questions to inform content calendars and SERP targeting.
Brand & competitor monitoring | Monitor mentions across relevant communities and compare share of voice across subreddits.
NLP / ML datasets | Collect titles, bodies, and structured comment trees for training or evaluation datasets.
Academic & journalism research | Compile public quotes and discussions from Reddit threads for analysis and reporting.
Data pipelines & automation | Schedule a Reddit scraping script via API, then export Reddit posts to CSV for ETL or BI dashboards.

Why choose Reddit Posts Scraper?

This Reddit scraping tool combines precision, automation, and reliability for large-scale, repeatable data collection.

  • ✅ Accurate, structured fields ready for analysis and modeling
  • 🌍 Keyword and subreddit targeting for broad or niche coverage
  • ⚙️ Scales from small tests to bulk runs with parallel processing
  • 🧑‍💻 API- and Python-friendly for developer workflows
  • 🛡️ Safer than brittle extensions—handles blocks with proxy fallback and retries
  • 💰 Cost-effective automation via Apify infrastructure and dataset exports
  • 🔌 Integrations-ready (n8n, Make, Zapier) for end-to-end pipelines

In short, a production-ready Reddit web scraper versus unstable alternatives—built for consistent data extraction at scale.

Is it legal / ethical to use Reddit Posts Scraper?

Yes—when done responsibly. This actor targets publicly available Reddit content and does not access private subreddits or authenticated data.

Guidelines for compliant use:

  • Scrape only public data and respect Reddit’s platform policies.
  • Do not misuse personal information found in public posts or comments.
  • Observe applicable data protection laws (e.g., GDPR, CCPA) in your jurisdiction.
  • Use proxy and rate controls to minimize load and reduce the likelihood of blocks.
  • Consult your legal team for edge cases or regulated workflows.

Input parameters & output format

Example input (JSON)

{
  "startUrls": [
    "https://www.reddit.com/r/news/",
    "news",
    "artificial intelligence"
  ],
  "maxPosts": 50,
  "maxComments": 100,
  "sortOrder": "top",
  "timeFilter": "week",
  "proxyConfiguration": { "useApifyProxy": false }
}

Input fields

  • startUrls (array of strings, required)
    • Description: One per line — mix full Reddit URLs, subreddit names (e.g., news or r/news), or search keywords.
    • Default: None
  • maxPosts (integer)
    • Description: Max posts to scrape per subreddit or keyword (1–1000).
    • Default: 50
  • maxComments (integer)
    • Description: Max comments to fetch per post (0–1000). Set 0 to skip comments.
    • Default: 100
  • sortOrder (string; one of: hot, new, top, rising)
    • Description: How posts are ordered.
    • Default: top
  • timeFilter (string; one of: hour, day, week, month, year, all)
    • Description: Only applies when sortOrder is top or rising.
    • Default: week
  • proxyConfiguration (object)
    • Description: Choose proxies. If blocked, the actor falls back: no proxy → datacenter → residential.
    • Default: { "useApifyProxy": false }
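
When building inputs programmatically, it can help to check them against the documented ranges before starting a run. The helper below is a sketch under that assumption; build_input is a hypothetical name, and the limits and defaults are copied from the field list above rather than from the actor's own validation code.

```python
SORT_ORDERS = ("hot", "new", "top", "rising")
TIME_FILTERS = ("hour", "day", "week", "month", "year", "all")


def build_input(start_urls, max_posts=50, max_comments=100,
                sort_order="top", time_filter="week", use_apify_proxy=False):
    """Return an input dict, raising ValueError when a documented limit is violated."""
    if not start_urls:
        raise ValueError("startUrls is required and must not be empty")
    if not 1 <= max_posts <= 1000:
        raise ValueError("maxPosts must be between 1 and 1000")
    if not 0 <= max_comments <= 1000:
        raise ValueError("maxComments must be between 0 and 1000")
    if sort_order not in SORT_ORDERS:
        raise ValueError(f"sortOrder must be one of {SORT_ORDERS}")
    if time_filter not in TIME_FILTERS:
        raise ValueError(f"timeFilter must be one of {TIME_FILTERS}")
    return {
        "startUrls": list(start_urls),
        "maxPosts": max_posts,
        "maxComments": max_comments,
        "sortOrder": sort_order,
        "timeFilter": time_filter,
        "proxyConfiguration": {"useApifyProxy": use_apify_proxy},
    }


# Skip comments for a faster run; everything else uses the documented defaults.
actor_input = build_input(["r/technology", "artificial intelligence"], max_comments=0)
print(actor_input)
```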

Note:

  • The actor can also accept “startUrls” as a newline-separated string or as an array with URL objects via API; however, the Console form uses a string list.
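
The three accepted shapes can be reduced to one list of strings before further processing. This is a sketch of one way to do that, not the actor's code; the function name is hypothetical, and the {"url": ...} object shape is an assumption based on the common Apify request-list convention.

```python
def normalize_start_urls(value) -> list[str]:
    """Accept a newline-separated string, a list of strings, or a list of
    URL objects, and return a clean list of non-empty strings."""
    if isinstance(value, str):
        entries = value.splitlines()
    else:
        # Assumption: URL objects look like {"url": "..."}.
        entries = [item["url"] if isinstance(item, dict) else item for item in value]
    return [entry.strip() for entry in entries if entry and entry.strip()]


print(normalize_start_urls("news\n\nartificial intelligence\n"))
print(normalize_start_urls([{"url": "https://www.reddit.com/r/news/"}, " r/technology "]))
```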

Example output item (JSON)

{
  "post_id": "abc123",
  "title": "Example post title",
  "author": "u_example",
  "created_utc": 1703123456,
  "num_comments": 42,
  "score": 156,
  "permalink": "https://www.reddit.com/r/news/comments/abc123/example_post/",
  "image_url": "https://i.redd.it/xyz.png",
  "thumbnail_url": "https://preview.redd.it/xyz-thumb.jpg",
  "body": "Post content...",
  "comments": [
    {
      "author": "u_commenter1",
      "body": "Top-level comment",
      "score": 23,
      "created_utc": 1703123499,
      "replies": [
        {
          "author": "u_replier",
          "body": "Nested reply",
          "score": 7,
          "created_utc": 1703123600,
          "replies": []
        }
      ]
    }
  ],
  "subreddit": "news",
  "success": true,
  "error_message": null
}

Fields like image_url, thumbnail_url, and body may be empty when not present on the original post.
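
For downstream analysis, output items like the one above can be turned into a flat CSV with a readable timestamp. The sketch below uses only the standard library; the column selection and the derived created_at column are illustrative choices, not something the actor produces.

```python
import csv
import io
from datetime import datetime, timezone

# Illustrative column choice; "created_at" is derived from created_utc.
COLUMNS = ["post_id", "subreddit", "title", "author", "score",
           "num_comments", "created_at", "permalink"]


def posts_to_csv(items: list[dict]) -> str:
    """Render post items as CSV text, dropping nested fields such as comments."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    for item in items:
        row = dict(item)
        # Convert the Unix timestamp (UTC) into an ISO 8601 string.
        row["created_at"] = datetime.fromtimestamp(
            item["created_utc"], tz=timezone.utc
        ).isoformat()
        writer.writerow(row)
    return buffer.getvalue()


item = {
    "post_id": "abc123", "title": "Example post title", "author": "u_example",
    "created_utc": 1703123456, "num_comments": 42, "score": 156,
    "permalink": "https://www.reddit.com/r/news/comments/abc123/example_post/",
    "body": "Post content...", "comments": [], "subreddit": "news",
    "success": True, "error_message": None,
}
csv_text = posts_to_csv([item])
print(csv_text)
```

Fields missing from an item (for example an absent permalink) are written as empty cells, which matches the note that some fields may be empty.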

FAQ

Do I need a Reddit account or login to use this?

No. The actor collects publicly available Reddit data without requiring login or cookies. It fetches structured JSON from Reddit endpoints and processes it automatically.

Can it scrape comments as well as posts?

Yes. Set “Maximum comments per post” to a value greater than 0 to fetch comment threads; set it to 0 to skip comments for faster runs.

How many posts can I scrape per source?

You can set “Maximum posts per source” up to 1000. The total output depends on how many sources (subreddits, URLs, or keywords) you provide.

Does it work with proxies and handle blocks?

Yes. It automatically falls back through no proxy → datacenter → residential if Reddit blocks requests, and retries on 403/429, 5xx, timeouts, and connection/SSL issues.
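
The fallback order described above can be pictured as a loop over proxy tiers with a few retries per tier. The sketch below illustrates that idea only; it is not the actor's implementation. The function names, the retry count, and the fetch callable (which takes a URL and a tier name and returns a status code and body) are all assumptions.

```python
import time
from typing import Callable

PROXY_TIERS = ["none", "datacenter", "residential"]  # order taken from the text


def is_retryable(status: int) -> bool:
    """403, 429, and any 5xx response count as retryable."""
    return status in (403, 429) or 500 <= status < 600


def fetch_with_fallback(url: str, fetch: Callable[[str, str], tuple[int, str]],
                        retries_per_tier: int = 2, backoff_seconds: float = 0.0):
    """Try each proxy tier in order, retrying retryable failures within a tier."""
    for tier in PROXY_TIERS:
        for attempt in range(retries_per_tier):
            try:
                status, body = fetch(url, tier)
            except (TimeoutError, ConnectionError):
                status, body = None, ""
            if status is not None and not is_retryable(status):
                return tier, status, body
            time.sleep(backoff_seconds * (2 ** attempt))  # exponential backoff
    raise RuntimeError(f"all proxy tiers exhausted for {url}")


calls = []


def fake_fetch(url: str, tier: str) -> tuple[int, str]:
    """Stand-in for a real HTTP call: only the residential tier succeeds."""
    calls.append(tier)
    return {"none": (403, ""), "datacenter": (429, ""), "residential": (200, "ok")}[tier]


result = fetch_with_fallback("https://www.reddit.com/r/news/top.json", fake_fetch)
print(result, calls)
```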

Can I export results to CSV?

Yes. After the run, open the dataset in Apify and export to JSON, CSV, or Excel. You can also access results programmatically via the Apify API.

Is this a Python Reddit scraper I can integrate with my pipeline?

Yes. The actor is implemented in Python and is API-accessible, making it easy to integrate into ETL, analytics, or automation workflows (e.g., n8n, Make, Zapier).

Does it support sorting and time filters like “top this week”?

Yes. Choose hot, new, top, or rising. For top and rising, you can apply a time filter (hour, day, week, month, year, all).
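
To make the sort and time filter rule concrete, the sketch below maps the two settings onto a Reddit public JSON listing URL. Which endpoints the actor actually calls is not documented here, so treat the URL shape, the limit parameter, and the function name as assumptions; the rule that the time filter is only sent for top and rising follows the description above.

```python
from urllib.parse import urlencode


def listing_url(subreddit: str, sort_order: str = "top",
                time_filter: str = "week", limit: int = 100) -> str:
    """Build a public JSON listing URL for a subreddit (assumed URL shape)."""
    params = {"limit": limit}
    # Per the text above, the time filter only applies to top and rising.
    if sort_order in ("top", "rising"):
        params["t"] = time_filter
    name = subreddit.removeprefix("r/")  # accept both "news" and "r/news"
    return f"https://www.reddit.com/r/{name}/{sort_order}.json?{urlencode(params)}"


print(listing_url("r/technology"))
print(listing_url("news", sort_order="hot"))
```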

What happens if some posts fail to process?

Each post reports a success flag and error_message if processing fails. The actor saves successful items as they’re scraped so you can still export partial results.
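
After a run, the success flag makes it straightforward to separate usable items from failures that may need a rerun. A minimal sketch, with a hypothetical helper name and made-up sample items:

```python
def split_results(items: list[dict]):
    """Separate successful items from (post_id, error_message) failure pairs."""
    succeeded = [item for item in items if item.get("success")]
    failed = [(item.get("post_id"), item.get("error_message"))
              for item in items if not item.get("success")]
    return succeeded, failed


# Hypothetical items using the documented success and error_message fields.
items = [
    {"post_id": "abc123", "success": True, "error_message": None},
    {"post_id": "def456", "success": False, "error_message": "HTTP 429"},
]
succeeded, failed = split_results(items)
print(len(succeeded), failed)
```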

Final thoughts

Reddit Posts Scraper collects Reddit posts (and optional comments) from subreddits, URLs, or keywords and returns structured, export-ready output. With sort and time controls, configurable limits, and proxy fallback, it suits marketers, researchers, analysts, and developers. Trigger it via the Apify API, connect it to automation tools, or export posts to CSV for downstream analytics.
