URL: https://apify.com/devcake/reddit-search-scraper

Reddit Search Scraper 2026 · Apify


Pricing

from $1.00 / 1,000 results

Reddit Search Scraper 2026

Find viral content ideas and research niche communities. Extract Reddit posts, comments, and discussion data from search queries.

Rating

0.0

(0)

Developer

devcake

Maintained by Community

Actor stats

Bookmarked: 1
Total users: 3
Monthly active users: 3
Last modified: 8 days ago

Reddit Search Scraper

Scrapes Reddit search results, post details, and comments with one fast hybrid implementation.

Current Approach

  • Playwright opens Reddit first and handles the JavaScript challenge.
  • Search results are parsed from the warmed browser page.
  • httpcloak reuses the warmed browser cookies for fast HTML post-detail and comment requests.
  • Browser detail fallback is used only when HTTP detail/comment HTML is blocked or empty.
  • A single browser/context is reused across all queries in one run.
  • HTTP detail/comment fetches run with bounded concurrency via detail_workers.
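
The HTTP-first, browser-fallback flow in the list above can be sketched roughly as follows. This is a minimal illustration, not the actor's actual code: fetch_details, http_fetch, browser_fetch, and the "browser_fallback" label are hypothetical names, and the two fetchers are passed in as plain callables so the sketch runs without httpcloak or Playwright.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Optional


def fetch_details(
    permalinks: list[str],
    http_fetch: Callable[[str], Optional[str]],   # hypothetical HTTP fetcher
    browser_fetch: Callable[[str], str],          # hypothetical browser fetcher
    detail_workers: int = 4,
) -> dict[str, dict]:
    """Fetch post HTML over HTTP with bounded concurrency, then use the
    browser only for URLs whose HTTP response was blocked or empty."""

    def try_http(url: str) -> tuple[str, Optional[str]]:
        try:
            return url, http_fetch(url)
        except Exception:
            # Any HTTP failure is treated as "blocked" in this sketch.
            return url, None

    # The pool size caps how many HTTP requests are in flight at once.
    with ThreadPoolExecutor(max_workers=detail_workers) as pool:
        http_results = list(pool.map(try_http, permalinks))

    results: dict[str, dict] = {}
    for url, html in http_results:
        if html and html.strip():
            results[url] = {"html": html, "source": "http_html"}
        else:
            # Fallbacks run one at a time, assuming a single shared browser page.
            # "browser_fallback" is an illustrative label, not a documented value.
            results[url] = {"html": browser_fetch(url), "source": "browser_fallback"}
    return results
```

Keeping the fallback sequential is one possible design choice: it matches the single reused browser/context described above, while the cheaper HTTP requests are the part that runs in parallel.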

The old implementation based on Reddit's .json endpoints was removed: in local tests through a residential proxy, Reddit now blocks those paths.

Input

{
  "queries": ["python", "openai"],
  "max_posts_per_query": 10,
  "include_comments": true,
  "max_comments_per_post": 50,
  "detail_workers": 4,
  "sort": "relevance",
  "time": "all"
}

Field                   Description
queries                 Search terms. A single browser session is reused across the batch.
max_posts_per_query     Number of posts to collect per query.
include_comments        Include parsed comment bodies for each post.
max_comments_per_post   Maximum comments to save per post.
detail_workers          Concurrent HTTP detail/comment workers before browser fallback.
sort                    Reddit search sort: relevance, hot, new, top, or comments.
time                    Reddit time filter: all, hour, day, week, month, or year.
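
Before starting a run, it can help to check an input object against the field descriptions above. The sketch below is an assumption-laden illustration: validate_input is a hypothetical helper, not part of the actor, and the rule that the numeric fields must be positive integers is a guess rather than something the actor documents. Only the allowed sort and time values come from the descriptions above.

```python
ALLOWED_SORT = {"relevance", "hot", "new", "top", "comments"}
ALLOWED_TIME = {"all", "hour", "day", "week", "month", "year"}


def validate_input(run_input: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    errors: list[str] = []

    queries = run_input.get("queries")
    if (not isinstance(queries, list) or not queries
            or not all(isinstance(q, str) and q.strip() for q in queries)):
        errors.append("queries must be a non-empty list of non-empty strings")

    # Assumption: these fields are positive integers when present.
    for key in ("max_posts_per_query", "max_comments_per_post", "detail_workers"):
        if key in run_input:
            value = run_input[key]
            # bool is a subclass of int, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, int) or value < 1:
                errors.append(f"{key} must be a positive integer")

    if "include_comments" in run_input and not isinstance(run_input["include_comments"], bool):
        errors.append("include_comments must be true or false")

    sort = run_input.get("sort")
    if "sort" in run_input and (not isinstance(sort, str) or sort not in ALLOWED_SORT):
        errors.append(f"sort must be one of {sorted(ALLOWED_SORT)}")

    time_filter = run_input.get("time")
    if "time" in run_input and (not isinstance(time_filter, str) or time_filter not in ALLOWED_TIME):
        errors.append(f"time must be one of {sorted(ALLOWED_TIME)}")

    return errors
```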

Output

Each dataset item is a post row with metadata and optional comments:

{
  "dataType": "post",
  "id": "abc123",
  "subreddit": "python",
  "author": "example_user",
  "title": "Example Reddit post",
  "selftext": "Post body",
  "url": "https://www.reddit.com/r/python/comments/abc123/example/",
  "permalink": "https://www.reddit.com/r/python/comments/abc123/example/",
  "score": 42,
  "num_comments": 18,
  "created_iso": "2026-06-23T12:00:00+00:00",
  "search_query": "python",
  "search_rank": 1,
  "source": {
    "search": "browser_dom",
    "detail": "http_html",
    "comments": "http_html"
  },
  "comments": [
    {
      "dataType": "comment",
      "id": "t1_example",
      "author": "commenter",
      "body": "Comment body",
      "score": 5,
      "created_iso": "2026-06-23T12:30:00+00:00",
      "depth": 0,
      "permalink": "https://www.reddit.com/r/python/comments/abc123/comment/example/"
    }
  ]
}
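
Because comments are nested inside each post row, downstream analysis often wants one row per comment instead. A minimal sketch of that flattening, assuming items have the shape shown above; comment_rows is a hypothetical helper and the output column names are illustrative only.

```python
def comment_rows(post: dict) -> list[dict]:
    """Flatten one post item into one row per comment, carrying the
    parent post's id, subreddit, and search query along."""
    rows = []
    # "comments" may be missing or empty when include_comments is false.
    for comment in post.get("comments") or []:
        rows.append({
            "post_id": post.get("id"),
            "subreddit": post.get("subreddit"),
            "search_query": post.get("search_query"),
            "comment_id": comment.get("id"),
            "author": comment.get("author"),
            "depth": comment.get("depth"),
            "score": comment.get("score"),
            "body": comment.get("body"),
        })
    return rows
```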

Run stats are written to the default key-value store as RUN_STATS.

Local Smoke Test

Set REDDIT_PROXY_URL to a residential proxy URL, then run:

python3 reddit_fast_hybrid.py -q python,openai \
  -n 2 -p 1 -c --max-comments 20 --detail-workers 4 \
  -o /tmp/reddit_fast_hybrid.jsonl
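
To sanity-check the smoke-test output, a short script along these lines may be useful. It assumes, based only on the .jsonl extension, that the output file holds one JSON post item per line in the same shape as the dataset items; summarize_jsonl is a hypothetical helper, not part of the repository.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_jsonl(path: str) -> dict:
    """Count posts, comments, and which detail source each post used."""
    posts = 0
    comments = 0
    detail_sources: Counter = Counter()
    with Path(path).open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            item = json.loads(line)
            posts += 1
            comments += len(item.get("comments") or [])
            # "source.detail" records how the post detail was fetched.
            detail = (item.get("source") or {}).get("detail", "unknown")
            detail_sources[detail] += 1
    return {
        "posts": posts,
        "comments": comments,
        "detail_sources": dict(detail_sources),
    }
```

For example, summarize_jsonl("/tmp/reddit_fast_hybrid.jsonl") would report how many posts were fetched over HTTP versus any other source value recorded in the file.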

Actor entrypoint local test:

APIFY_INPUT_FILE=INPUT.json python3 main.py
