URL: https://apify.com/alwaysprimedev/reddit-scraper

Reddit Scraper · Apify


Pricing

from $2.50 / 1,000 posts

Scrape Reddit posts, threads, and comments from any subreddit, search, or user: clean structured JSON, fast.

Rating: 0.0 (0)

Developer: Always Prime (Maintained by Community)

Actor stats: 1 bookmarked · 20 total users · 8 monthly active users · last modified a month ago

🚀 Reddit Scraper: every post, comment & thread, as clean JSON

๐Ÿ‘ Apify
๐Ÿ‘ Python
๐Ÿ‘ Output: JSON ยท CSV ยท Excel

Pull structured Reddit data at speed: posts, comments, scores, flairs, awards, media, timestamps. No login. No code. No babysitting.

🏠 Subreddits · 🔍 Keyword search · 👤 User submissions/comments · 🔗 Custom URLs: all four sources, one input form.


โšก๏ธ Why this scraper

  • ๐ŸŽฏ 50+ fields per post โ€” full title and body, score breakdown, upvote ratio, flair, awards, removal status, media URLs, edit timestamps. Nothing dropped on the floor.
  • ๐Ÿ’ฌ Comment threads on demand โ€” flip one switch and get the full comment tree per post, threaded via parent_id and depth.
  • ๐Ÿš„ Fast โ€” ~3 posts/second steady-state on default settings; ~250ms median per detail fetch.
  • ๐Ÿง  Smart pagination โ€” stops the moment your Max items budget is reached. Never over-fetches, never wastes Apify Compute Units.
  • ๐Ÿ” Incremental mode โ€” pass a since timestamp and only get posts newer than your last run. Perfect for daily monitoring jobs.
  • ๐Ÿ›ก๏ธ Built-in failure budget โ€” if Reddit starts pushing back (challenges, hard 4xx), the actor aborts cleanly instead of burning through your CU on a broken extractor.
  • ๐Ÿ“Š Three export formats out of the box โ€” JSON, CSV, Excel. Direct download links from the run page.
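For the incremental mode mentioned above, one way to keep track of the cutoff between runs is a small state file. The sketch below is only an illustration: the helper names and the state-file layout are hypothetical, and only the since field name comes from this page.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def load_since(state_file):
    """Return the stored ISO 8601 cutoff, or None before the first run."""
    path = Path(state_file)
    if not path.exists():
        return None
    return json.loads(path.read_text())["since"]


def save_since(state_file, started_at):
    """Store the start time of a successful run as a UTC ISO 8601 string."""
    iso = started_at.astimezone(timezone.utc).isoformat(timespec="seconds")
    Path(state_file).write_text(json.dumps({"since": iso}))


def incremental_input(base_input, state_file):
    """Copy base_input and add `since` when a previous run is on record."""
    run_input = dict(base_input)
    since = load_since(state_file)
    if since is not None:
        run_input["since"] = since
    return run_input
```

A typical loop would note the current UTC time, build the input with incremental_input, start the run, and call save_since only after the run succeeds. Saving the start time rather than the end time means posts created while the run was in progress are picked up next time.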

🚀 Quick start

  1. Click Try for free (top-right). No code, no API key.
  2. Pick a search type: Subreddit, Search, User, or paste your own URLs.
  3. Hit Start and let it run.
  4. Download as JSON / CSV / Excel from the run page.

📥 Input

| Field | Type | Description |
| --- | --- | --- |
| What to scrape (searchType) | enum | subreddit · search · user · urls |
| Subreddits (subreddits) | string list | e.g. python, programming (no r/ prefix) |
| Search query (query) | string | Keywords. Reddit operators work: author:, subreddit:, self:yes, flair:. |
| Users (users) | string list | Usernames to scrape (no u/ prefix) |
| User content type (userContent) | enum | submitted (posts) or comments |
| Sort by (sortBy) | enum | hot · new · top · rising · controversial · relevance · comments |
| Time window (time) | enum | hour · day · week · month · year · all (only matters for top/controversial) |
| Max items (maxItems) | int | Stop after N posts. 0 = unlimited. Default 50. |
| Scrape comments (scrapeComments) | bool | Fetch the comment tree for every post. Default off (cheaper for indexing). |
| Max comments per post (commentDepth) | int | Cap on the number of comments per post, collected breadth-first (BFS). Default 200, max 5000. |
| Only posts newer than (since) | datetime | ISO 8601 cutoff for incremental runs. |
| Concurrency (concurrency) | int | Parallel fetches. Default 5, max 25. |
| Start URLs (startUrls) | string list | Advanced override: paste any Reddit URLs and ignore the search-type builder. |
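For anyone who prefers to start runs from code rather than the input form, the fields above map onto a JSON run input. The sketch below builds such an input and, optionally, starts a run through the apify-client Python package. The helper names, the validation rules (including the lower concurrency bound of 1), and the APIFY_TOKEN environment variable are assumptions for illustration; the field names, defaults, and actor ID are taken from this page.

```python
import os

ACTOR_ID = "alwaysprimedev/reddit-scraper"  # from the store URL
SEARCH_TYPES = {"subreddit", "search", "user", "urls"}


def build_run_input(search_type, *, subreddits=None, users=None, query=None,
                    start_urls=None, sort_by=None, max_items=50,
                    scrape_comments=False, concurrency=5):
    """Build a run-input dict using the field names from the input table."""
    if search_type not in SEARCH_TYPES:
        raise ValueError(f"searchType must be one of {sorted(SEARCH_TYPES)}")
    # The table documents a maximum of 25; the minimum of 1 is an assumption.
    if not 1 <= concurrency <= 25:
        raise ValueError("concurrency must be between 1 and 25")
    run_input = {
        "searchType": search_type,
        "maxItems": max_items,
        "scrapeComments": scrape_comments,
        "concurrency": concurrency,
    }
    # The table asks for names without the r/ or u/ prefix, so strip them.
    if subreddits:
        run_input["subreddits"] = [s.removeprefix("r/") for s in subreddits]
    if users:
        run_input["users"] = [u.removeprefix("u/") for u in users]
    if query:
        run_input["query"] = query
    if start_urls:
        run_input["startUrls"] = list(start_urls)
    if sort_by:
        run_input["sortBy"] = sort_by
    return run_input


def run_actor(run_input):
    """Start a run and return its dataset items (needs network and a token)."""
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(os.environ["APIFY_TOKEN"])
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

For example, build_run_input("search", query="pycon self:yes", subreddits=["r/Python"]) yields a subreddit-scoped search input with the default cap of 50 posts.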

📦 Sample output

One record per post: flat, JSON-friendly, ready to load into BigQuery / Postgres / pandas.

{
"id":"1t3x7ba",
"fullname":"t3_1t3x7ba",
"url":"https://www.reddit.com/r/Python/comments/1t3x7ba/whos_going_to_pycon_us_next_week/",
"subreddit":"Python",
"subreddit_prefixed":"r/Python",
"subreddit_id":"t5_2qh0y",
"title":"Who's going to PyCon US next week?",
"selftext":"Me ✋ I hope to see a good number of you all in Long Beach, too! ...",
"is_self":true,
"domain":"self.Python",
"post_hint":"self",
"link_url":null,
"author":"Loren-PSF",
"author_fullname":"t2_so0s40st",
"author_flair_text":":pythonLogo: Python Software Foundation Staff",
"distinguished":null,
"score":46,
"ups":46,
"upvote_ratio":0.91,
"num_comments":35,
"num_crossposts":0,
"total_awards_received":0,
"gilded":0,
"over_18":false,
"spoiler":false,
"locked":false,
"stickied":true,
"archived":false,
"is_video":false,
"is_original_content":false,
"link_flair_text":"Discussion",
"link_flair_css_class":"discussion",
"link_flair_background_color":"#f50057",
"thumbnail":null,
"preview_image_url":"https://external-preview.redd.it/FBtD3iI-OdRHdmfJbVushiwzLeMcmgTx-Ff3FnwUUg0.jpeg",
"video_url":null,
"removed_by_category":null,
"removal_reason":null,
"created_at":"2026-05-04T22:40:29+00:00",
"edited_at":null,
"scraped_at":"2026-05-09T13:43:47+00:00",
"comments":[
{
"id":"myz2pn1",
"parent_id":"t3_1t3x7ba",
"depth":0,
"author":"vintagegeek",
"body":"I'll be there with bells on. Looking forward to meeting people!",
"score":19,
"is_submitter":false,
"stickied":false,
"permalink":"https://www.reddit.com/r/Python/comments/1t3x7ba/.../myz2pn1/",
"created_at":"2026-05-04T23:01:14+00:00",
"edited_at":null
}
],
"comments_count_scraped":35
}
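Because each comment carries parent_id and depth, the flat comments list can be nested back into a reply tree downstream. The sketch below is one way to do that. It assumes that a reply's parent_id has the form "t1_<comment id>", following Reddit's fullname convention; the sample above only shows a top-level comment whose parent is the post ("t3_..."). The function name and the replies key are hypothetical.

```python
def build_comment_tree(post):
    """Nest a post's flat `comments` list into a reply tree via parent_id.

    Assumes reply parent_ids look like "t1_<comment id>"; comments whose
    parent is the post ("t3_...") or was not scraped become roots.
    """
    nodes = {c["id"]: {**c, "replies": []} for c in post.get("comments", [])}
    roots = []
    for node in nodes.values():
        parent_id = node["parent_id"]
        parent = nodes.get(parent_id[3:]) if parent_id.startswith("t1_") else None
        if parent is not None:
            parent["replies"].append(node)
        else:
            # Top-level comment, or a reply whose parent is missing
            # (for example because the comment cap cut it off).
            roots.append(node)
    return roots
```

Treating orphaned replies as roots keeps every scraped comment reachable even when the cap truncated part of the thread.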

💡 Use cases

| Who | What for |
| --- | --- |
| 📈 Market researchers | Track sentiment, competitor mentions and product feedback across niche subreddits. |
| 🤖 AI / ML teams | Build training corpora from focused subreddits: clean text, threading preserved. |
| 📰 Journalists & analysts | Monitor breaking-story subreddits and surface trending discussions for coverage. |
| 💼 Brand / community managers | Find unanswered support questions about your product across Reddit, on a daily cron. |
| 🏷️ Recruiters & talent intel | Pull discussions in tech-job subreddits to track skill demand and salary chatter. |
| 🧑‍🔬 Academic researchers | Public-discourse datasets for sociolinguistics, network analysis, opinion mining. |

🧰 Tips & tricks

  • 🪶 Index first, hydrate later. Run with scrapeComments: false and maxItems: 0 to cheaply enumerate everything. Then start a second run with startUrls and scrapeComments: true, limited to the posts you care about.
  • ⏱️ Daily diffs. Save the timestamp of your last successful run, then pass it as since next time. The actor short-circuits old posts before fetching them.
  • 🎛️ Subreddit-scoped search. Set searchType: search, fill in query, and list the target subreddits in subreddits. The actor automatically scopes the search to those subreddits.
  • 🔗 Mix custom URLs. Drop any reddit.com/... URL into startUrls (a thread, a multireddit, a sort variant). The actor strips or appends .json itself.
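The index-first, hydrate-later tip can be sketched as two run inputs: one that enumerates posts without comments, and one that fetches comment trees only for a filtered subset. The function names, the min_comments filter, and the choice to set maxItems to the number of URLs (on the assumption that each URL yields one post) are illustrative, not part of the actor.

```python
def index_input(subreddits):
    """Phase 1: enumerate every post cheaply, without comment trees."""
    return {
        "searchType": "subreddit",
        "subreddits": list(subreddits),
        "scrapeComments": False,
        "maxItems": 0,  # 0 = unlimited, per the input table
    }


def hydrate_input(indexed_posts, min_comments=20, comment_cap=200):
    """Phase 2: fetch comments only for posts that pass a filter.

    Returns None when no post qualifies, so the caller can skip the run.
    """
    urls = [p["url"] for p in indexed_posts
            if p.get("num_comments", 0) >= min_comments]
    if not urls:
        return None
    return {
        "searchType": "urls",
        "startUrls": urls,
        "scrapeComments": True,
        "commentDepth": comment_cap,
        # Raise the cap above the default of 50 when many URLs qualify.
        "maxItems": len(urls),
    }
```

The records from the first run feed straight into hydrate_input, since each one carries url and num_comments.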

โ“ FAQ

Does it need a Reddit account? No.

What about the new Reddit API limits? This actor doesn't use Reddit's Data API, so the post-2023 commercial pricing tiers don't apply.

Can I scrape NSFW subreddits? Yes. NSFW posts are returned with over_18: true so you can filter downstream.

Will it get all comments on a huge thread? Up to your commentDepth cap (default 200, max 5000), breadth-first across the tree. For Reddit's truly massive megathreads (>10K comments), Reddit itself paginates and not every comment is reachable in one fetch; that's a Reddit limitation, not the scraper's.
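To make "breadth-first up to a cap" concrete, the sketch below walks a nested reply tree level by level and stops once the cap is reached. It illustrates the traversal order only and is not the actor's code; the replies key and the function name are assumptions.

```python
from collections import deque


def bfs_capped(top_level, cap):
    """Return up to `cap` comments in breadth-first order, tagged with depth."""
    out = []
    queue = deque((comment, 0) for comment in top_level)
    while queue and len(out) < cap:
        comment, depth = queue.popleft()
        out.append({"id": comment["id"], "depth": depth})
        # Replies are queued behind everything already waiting, so a whole
        # level is visited before the next one starts.
        for reply in comment.get("replies", []):
            queue.append((reply, depth + 1))
    return out
```

With this ordering, a low cap keeps all top-level comments first and cuts the deepest replies first.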

What if a post is deleted while scraping? Deleted posts come through with author: "[deleted]", selftext: "[deleted]", and removed_by_category: "deleted". They're not skipped: you get the metadata Reddit still surfaces.
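Since NSFW and deleted posts are flagged rather than skipped, filtering happens downstream. A minimal sketch, assuming records shaped like the sample output; the function names are hypothetical:

```python
def is_deleted(post):
    """True for posts carrying the deleted markers described above."""
    return (post.get("removed_by_category") == "deleted"
            or post.get("author") == "[deleted]")


def split_posts(posts, include_nsfw=False):
    """Return (kept, dropped): drops deleted posts and, by default, NSFW ones."""
    kept, dropped = [], []
    for post in posts:
        if is_deleted(post) or (post.get("over_18") and not include_nsfw):
            dropped.append(post)
        else:
            kept.append(post)
    return kept, dropped
```

Keeping the dropped list around, instead of discarding it, makes it easy to audit how much of a run was filtered out.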

How fresh is the data? Real-time. Each record carries a scraped_at UTC timestamp.


📅 Changelog

0.1 (initial release)

  • Subreddit, search, user, and start-URL modes
  • Configurable comment-tree scraping with a per-post comment cap (commentDepth)
  • Incremental since filter, maxItems cap, dedup, failure budget
  • JSON / CSV / Excel exports

โš–๏ธ Legal

This scraper accesses Reddit through public, non-authenticated requests. Reddit's robots.txt disallows automated crawling, and Reddit's User Agreement and Public Content Policy restrict automated/commercial use of Reddit content. By using this scraper you take on responsibility for the legality of your specific use case in your jurisdiction (including GDPR / CCPA where applicable). The scraper does not bypass authentication, paywalls, or technical access controls. Use it for research, journalism, internal analytics, ML/AI training datasets, or other lawful purposes โ€” and confirm that those purposes are compatible with Reddit's policies and any applicable law before running large-scale jobs. Personal data scraped from Reddit (usernames, comment bodies, flair) may constitute PII under GDPR even though usernames are pseudonymous; treat the output dataset accordingly.
