Reddit Search Scraper: Posts, Comments & Users
Pricing
from $2.00 / 1,000 results
Scrape Reddit subreddit search with no API key or login. Export posts and comments to CSV/JSON. A Reddit API alternative for keyword monitoring.
Reddit Search Scraper
Search within any subreddit by keyword + sort + time window, with no login or API key. Returns each matching post (or comment) with title, author, subreddit, body text, permalink and timestamps.
How it works (important)
As of mid-2026 Reddit hard-blocks the legacy search.json API (both www and old.reddit return 403, even over residential proxies with a browser fingerprint). The only logged-out search endpoint still served is the subreddit-scoped Atom feed (/r/{sub}/search.rss).
This actor uses that feed. Consequences you should know before buying:
- A `subreddit` is required on every search. All-of-Reddit (global) search has no working logged-out endpoint anymore and is skipped with a warning.
- ~25 results per search, no pagination. The feed returns at most ~25 of the most relevant items per query and exposes no cursor. To get more coverage, run more searches (different keywords / subreddits / sort+time combos).
- No numeric signals. The RSS feed does not carry `score`, `upvoteRatio`, `numComments`, or `awards`; those lived only on the now-dead `.json` API. They are not returned. If you need upvotes/comment counts, this is not the right tool.
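For orientation, here is a minimal sketch of what a request against the subreddit-scoped feed could look like, and how its Atom entries can be read with the Python standard library. It is not the actor's confirmed implementation: the query parameter names (`q`, `restrict_sr`, `sort`, `t`, `type`) are assumed from Reddit's public search URLs, the mapping of `published`/`updated` to `createdAt`/`editedAt` is a guess, and `build_search_url` and `parse_feed` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Atom namespace, written in ElementTree's "{uri}tag" form.
ATOM = "{http://www.w3.org/2005/Atom}"


def build_search_url(subreddit, query, sort, time, kind="link"):
    """Build a subreddit-scoped search feed URL.

    The parameter names are an assumption based on Reddit's public
    search URLs, not something this README documents.
    """
    params = {"q": query, "restrict_sr": "1", "sort": sort, "t": time, "type": kind}
    return f"https://www.reddit.com/r/{subreddit}/search.rss?{urlencode(params)}"


def parse_feed(xml_text):
    """Turn an Atom feed document into a list of plain dicts."""
    root = ET.fromstring(xml_text)
    rows = []
    for entry in root.iter(f"{ATOM}entry"):
        link = entry.find(f"{ATOM}link")
        rows.append({
            "id": entry.findtext(f"{ATOM}id"),
            "title": entry.findtext(f"{ATOM}title"),
            "author": entry.findtext(f"{ATOM}author/{ATOM}name"),
            "permalink": link.get("href") if link is not None else None,
            # Assumed mapping of Atom timestamps to the output fields.
            "createdAt": entry.findtext(f"{ATOM}published"),
            "editedAt": entry.findtext(f"{ATOM}updated"),
        })
    return rows
```

Note that nothing in an Atom entry carries a score or comment count, which is why those fields cannot be filled in this mode.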
Features
- Bulk multi-search: pass many `{query, subreddit, sort, time, type}` objects; each runs independently
- All sort modes the feed honors: `relevance`, `hot`, `top`, `new`, `comments`
- All time windows: `hour`, `day`, `week`, `month`, `year`, `all`
- `type`: `link` (posts, default) or `comment`
- Residential-proxy session rotation + UA rotation + 429/403 backoff for reliable feed fetches
- Clean, deduplicated rows with decoded text (handles Reddit's double-encoded entities)
Input
{"searches":[{"query":"ai agent","subreddit":"MachineLearning","sort":"new","time":"month","type":"link"},{"query":"side hustle","subreddit":"Entrepreneur","sort":"top","time":"all","type":"link"}],"maxResultsPerSearch":25}
| Field | Type | Default | Notes |
|---|---|---|---|
| `searches` | array | (required) | Each `{query, subreddit (required), sort?, time?, type?}` |
| `maxResultsPerSearch` | int | 25 | Cap per search. Feed returns ~25 max, so higher values have no effect. |
| `proxyConfig` | object | residential | Reddit blocks datacenter IPs; residential proxy is used by default. |
Sort + time + type
| Field | Options |
|---|---|
| `sort` | `relevance`, `hot`, `top`, `new`, `comments` |
| `time` | `hour`, `day`, `week`, `month`, `year`, `all` |
| `type` | `link` (posts, default), `comment` |
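Because a search without a subreddit is skipped rather than run, it can help to check each `searches` entry before starting a run. The following is a small client-side sketch built only from the option lists above; `validate_search` is a hypothetical helper, not part of the actor, and the actor's own validation may behave differently.

```python
SORTS = {"relevance", "hot", "top", "new", "comments"}
TIMES = {"hour", "day", "week", "month", "year", "all"}
TYPES = {"link", "comment"}


def validate_search(search):
    """Return a list of problems found in one `searches` entry (empty if valid)."""
    problems = []
    if not search.get("query"):
        problems.append("query is required")
    if not search.get("subreddit"):
        problems.append("subreddit is required (global search is skipped)")
    # sort, time and type are optional, so only check them when present.
    for field, allowed in (("sort", SORTS), ("time", TIMES), ("type", TYPES)):
        value = search.get(field)
        if value is not None and value not in allowed:
            problems.append(f"{field}={value!r} is not one of {sorted(allowed)}")
    return problems
```

For example, `validate_search({"query": "ai agent"})` reports the missing subreddit, while the two entries in the sample input pass with an empty list.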
Output (one row per result)
{"resultType":"link","id":"1twtdob","fullname":"t3_1twtdob","subreddit":"MachineLearning","author":"Intellerce","title":"We built a source-available LLM reliability library","text":"TL;DR: Reliability techniques that boost an LLM's correctness...","url":"https://www.reddit.com/r/MachineLearning/comments/1twtdob/...","permalink":"https://www.reddit.com/r/MachineLearning/comments/1twtdob/...","createdAt":"2026-06-04T16:51:29+00:00","editedAt":"2026-06-04T16:51:29+00:00","searchQuery":{"query":"ai agent","subreddit":"MachineLearning","sort":"new","time":"month","type":"link"},"scrapedAt":"2026-06-06T17:59:00.000Z"}
The output object also contains `score`, `upvoteRatio`, `numComments`, `awards`, `flair`, `thumbnail`, `isNsfw` and similar fields for schema stability, but they are always `null` in RSS mode (the feed does not carry them). Do not rely on them.
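If the always-null fields get in the way downstream, one option is to drop them after download. The sketch below assumes the dataset rows have already been loaded as a list of dicts (for example from the JSON export); `populated_columns` and `to_csv` are hypothetical helper names, and nested values such as `searchQuery` would be written as their string form.

```python
import csv
import io


def populated_columns(rows):
    """Column names holding a non-null value in at least one row, in first-seen order."""
    columns = []
    for row in rows:
        for key, value in row.items():
            if value is not None and key not in columns:
                columns.append(key)
    return columns


def to_csv(rows):
    """Serialize rows to CSV text, keeping only columns that carry data."""
    columns = populated_columns(rows)
    buffer = io.StringIO()
    # extrasaction="ignore" silently skips the always-null columns.
    writer = csv.DictWriter(buffer, fieldnames=columns, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```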
Use cases
- Brand / keyword monitoring inside specific communities (run on a schedule)
- Competitor & topic intel in niche subreddits
- Trend research / content discovery: pull `sort=top, time=week` per subreddit
- Sentiment & NLP pipelines: bulk-ingest post/comment text
Notes
- Residential proxy recommended/used: Reddit blocks Apify datacenter IPs. The actor defaults to the residential pool and rotates sessions on 403/429.
- Want more than ~25 per topic? Add more `searches` entries (vary keyword, sort, and time window); that is the only way to widen coverage given the feed's cap.
- Need scores/comment counts or full comment threads? Reddit no longer exposes these to logged-out clients; use a dedicated OAuth-based Reddit tool instead.
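Widening coverage by varying keyword, sort, and time window can be scripted rather than typed by hand. This is a minimal sketch: `expand_searches` and `dedupe_by_id` are hypothetical helpers, and the second one assumes that overlapping searches may return the same post more than once, since whether the actor deduplicates across separate searches is not stated here.

```python
from itertools import product


def expand_searches(subreddit, keywords, sorts, times, kind="link"):
    """Build one `searches` entry per keyword x sort x time combination."""
    return [
        {"query": q, "subreddit": subreddit, "sort": s, "time": t, "type": kind}
        for q, s, t in product(keywords, sorts, times)
    ]


def dedupe_by_id(rows):
    """Keep the first row seen for each result `id`."""
    seen = set()
    unique = []
    for row in rows:
        if row["id"] not in seen:
            seen.add(row["id"])
            unique.append(row)
    return unique
```

Two keywords, two sort modes, and two time windows yield eight searches, so at most roughly 200 rows before deduplication given the ~25-result cap per search.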
FAQ
Is this a Reddit API alternative for searching subreddits?
Yes. Reddit's logged-out search.json API is hard-blocked as of mid-2026, so this acts as a no-API-key way to search any subreddit by keyword. It returns posts and comments from the subreddit-scoped feed, with a ~25-result cap per search.
How do I export Reddit posts and comments to CSV or JSON?
Run a search and Apify stores the matching rows in a dataset you can download as CSV or JSON (or pull via API). Each row carries title, author, subreddit, text, permalink and timestamps, ready for a spreadsheet or NLP pipeline.
Can I scrape Reddit without an API or login?
Yes: no login, OAuth, or developer app is required. The actor reads Reddit's public subreddit search feed over a residential proxy, so it works without a Reddit account or API credentials.
Changelog
2026-06-15
- Reliability pass: re-verified end-to-end on live data with real-world inputs. Routine maintenance build.
2026-06-07
- Docs: added coverage for using the actor as a Reddit API alternative, exporting Reddit posts/comments to CSV/JSON, and scraping Reddit without an API key or login.
2026-06-06
- Docs & schema accuracy pass: README now reflects the RSS-only reality (subreddit required, ~25/search cap, no score/comments). Removed always-null `score`/`numComments` columns from the dataset table; added the populated `text` column.
2026-06-05
- Reliability fix: results no longer dropped by strict output validation; runs complete cleanly.
2026-06-04
- Verified live & refreshed build โ reliability/maintenance pass.
