URL: https://apify.com/qaseemiqbal/reddit-intelligence-scraper

Reddit Intelligence Scraper

Under maintenance

Pricing

from $2.00 / 1,000 results

Collect public Reddit posts, comments, communities, and user profile data from searches, subreddit pages, Reddit URLs, and usernames. Export clean datasets for monitoring, research, and AI workflows.

Rating

0.0

(0)

Developer

Muhammad Qaseem Iqbal

Maintained by Community

Actor stats

Bookmarked: 0
Total users: 2
Monthly active users: 1
Last modified: 18 hours ago

Reddit Intelligence Scraper

Collect public Reddit posts, comments, communities, and user profile data from searches, subreddit pages, Reddit URLs, and usernames. Use it to monitor conversations, research customer opinions, follow trends, and export clean Reddit data into spreadsheets, dashboards, databases, AI workflows, or automation tools.

This Actor is designed to be practical for both non-technical users and data teams. You can start with a keyword or Reddit URL, choose how many results you want, and download the results from the Apify dataset when the run finishes.

What does this Actor do?

Reddit Intelligence Scraper turns public Reddit pages into structured data. Instead of manually copying posts and comments from Reddit, you can run the Actor and get organized records with useful details such as:

  • post title, body, author, subreddit, score, comment count, and URL
  • comment text, author, parent post, score, depth, and timestamp
  • subreddit/community name, description, subscriber count, and metadata
  • public user profile information, including karma and profile URL
  • optional sentiment labels, content categories, engagement metrics, media links, and raw payloads

No Reddit API key, OAuth setup, or Reddit login is required for supported public pages.

🎯 Common use cases

  • πŸ“£ Track brand, product, or competitor mentions on Reddit
  • πŸ“… Monitor subreddit discussions on a schedule
  • πŸ’‘ Find customer pain points, feature requests, complaints, and praise
  • 🧡 Collect comments from a specific Reddit thread
  • πŸ”¬ Research topics, communities, trends, and market language
  • πŸ€– Build datasets for AI search, RAG, clustering, dashboards, or reports
  • πŸ“€ Export Reddit data to CSV, Excel, Google Sheets, Make, Zapier, n8n, webhooks, or your own API workflow

What Reddit data can it collect?

| Data type | What you can collect |
| --- | --- |
| Posts | Search results, subreddit listings, direct post URLs, user submitted posts, r/all, and r/popular |
| Comments | Comment search results and comment threads under posts when comment collection is enabled |
| Communities | Subreddit metadata and community search results |
| Users | Public Reddit user profile records and optional user activity inputs |

The Actor works with several input styles, so you can start broad with keywords or stay precise with direct Reddit URLs.

How to scrape Reddit on Apify

  1. Open the Actor in Apify Console.
  2. Add at least one source:
    • keywords in Search terms
    • Reddit links in Direct Reddit URLs
    • subreddit names or URLs in Full subreddit scrape inputs
    • Reddit usernames or profile URLs in User profile inputs
  3. Set a result limit, such as maxItems.
  4. Choose whether to include comments, media links, sentiment, or other optional data.
  5. Click Start.
  6. Download the results from the Dataset tab as JSON, CSV, Excel, XML, or RSS.

For a quick test, use a small limit such as maxItems: 10. For scheduled monitoring, keep the limit modest and run the Actor repeatedly.
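
The same quick test can also be started from code instead of the Console. The sketch below only prepares the HTTP request and does not send it. It assumes the Apify API's run-sync-get-dataset-items endpoint, an Actor ID derived from the Store URL (with "~" in place of "/"), and a placeholder token; none of these are confirmed by this page, so check them against your own account before relying on them.

```python
import json
import urllib.request

# Assumption: Actor ID derived from the Store URL; the Apify API writes it with "~".
ACTOR_ID = "qaseemiqbal~reddit-intelligence-scraper"
API_BASE = "https://api.apify.com/v2"


def build_run_request(token: str, run_input: dict) -> urllib.request.Request:
    """Prepare a POST that runs the Actor and returns its dataset items."""
    url = f"{API_BASE}/acts/{ACTOR_ID}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# Small test input, matching the "maxItems: 10" advice above.
request = build_run_request(
    "YOUR_APIFY_TOKEN",  # placeholder, replace with your own token
    {"searchTerms": ["AI video generator"], "maxItems": 10},
)

# Sending the request starts a billable run, so it is left commented out:
# with urllib.request.urlopen(request) as response:
#     items = json.load(response)
```

The synchronous endpoint suits short test runs. For larger or scheduled runs, starting the run and reading the dataset afterwards is usually the safer pattern.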

Input options

You only need one valid source to start. The most important fields are below.

| Field | Plain-English meaning | Typical use |
| --- | --- | --- |
| searchTerms | Keywords or phrases to search across Reddit | Brand monitoring, topic research, competitor tracking |
| startUrls | Direct Reddit URLs | Scrape a specific post, subreddit, user page, or Reddit search URL |
| subredditUrls | Subreddit names or URLs | Collect posts from communities such as r/startups |
| userUrls | Reddit usernames or profile URLs | Collect public user profile information |
| maxItems | Maximum total records to save | Keep tests and production runs under control |
| crawlCommentsPerPost | Also collect comments under each collected post | Thread research, sentiment, FAQ mining |
| maxCommentsPerPost | Comment limit for each post | Prevent very large threads from growing too much |
| sort and time | Reddit search ranking and time window | Newest posts, top posts this week, most commented posts, etc. |
| withinCommunity | Search only inside one subreddit | Search for a topic within a specific community |
| includeMediaLinks | Save image, video, gallery, and outbound link details | Media analysis or content discovery |
| sentimentAnalysis | Add simple sentiment labels to posts and comments | Positive, negative, neutral, mixed, or uncertain |
| contentAnalysis | Add topic/category labels to post records | Routing, grouping, research, and AI workflows |
| proxyConfiguration | Optional Apify Proxy settings | Use Residential proxy when Reddit blocks cloud traffic |

Advanced settings are available for date filters, comment depth, strict keyword matching, output style, raw data storage, and run reports.
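
If you generate inputs in code, it can help to check the "at least one source" rule before starting a run. The helper below is a hypothetical client-side sketch: build_input is not part of the Actor, and the Actor's own validation may be stricter or differ in detail. Only the field names come from the table above.

```python
# The four source fields listed in the input table above.
SOURCE_FIELDS = ("searchTerms", "startUrls", "subredditUrls", "userUrls")


def build_input(max_items: int = 10, **fields) -> dict:
    """Return an input dict, rejecting inputs with no source or a bad limit."""
    # Drop empty values so an empty list does not count as a source.
    run_input = {
        name: value
        for name, value in fields.items()
        if value not in (None, [], "")
    }
    if not any(name in run_input for name in SOURCE_FIELDS):
        raise ValueError(f"Provide at least one of: {', '.join(SOURCE_FIELDS)}")
    if max_items < 1:
        raise ValueError("maxItems must be a positive integer")
    run_input["maxItems"] = max_items
    return run_input


quick_test = build_input(searchTerms=["customer support software"], max_items=10)
```

Failing early on an empty source list avoids paying the start event for a run that cannot return anything.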

Example inputs

1. Quick keyword search

Use this when you want a small sample of recent posts for a topic.

{
  "searchTerms": ["AI video generator"],
  "sort": "new",
  "time": "week",
  "maxItems": 25,
  "maxPostsPerSearch": 25
}

2. Brand and competitor monitoring

Use this to track mentions and include comments found through Reddit comment search.

{
  "searchTerms": ["Acme AI", "Acme pricing", "Acme alternative"],
  "searchPosts": true,
  "searchComments": true,
  "sort": "new",
  "time": "week",
  "maxItems": 150,
  "maxPostsPerSearch": 50,
  "maxCommentsCount": 50,
  "sentimentAnalysis": true
}

🏘️ 3. Scrape a subreddit

Use this to collect posts from one or more communities. 🧭

{
"subredditUrls":["r/startups"],
"subredditSort":"new",
"subredditTime":"month",
"maxItems":100,
"maxPostsPerSubreddit":100
}

🧡 4. Collect a full post thread

Use this when you already know the Reddit post URL and want the discussion under it. πŸ’¬

{
"startUrls":[
{
"url":"https://www.reddit.com/r/Baking/comments/1hvoazn/my_best_cheesecake_so_far/"
}
],
"crawlCommentsPerPost":true,
"maxCommentsPerPost":500,
"commentDepthLimit":0
}

5. Low-cost test run

Use this before a larger run to confirm your input works.

{
  "searchTerms": ["customer support software"],
  "maxItems": 10,
  "maxPostsPerSearch": 10,
  "crawlCommentsPerPost": false,
  "includeMediaLinks": false,
  "saveRawData": false,
  "writeHtmlReport": false
}

Output

Results are saved to the default Apify dataset. Each dataset item is one record.

Possible record types:

  • post
  • comment
  • community
  • user

Every record includes basic tracking fields such as:

| Field | Meaning |
| --- | --- |
| kind | Type of record: post, comment, community, or user |
| id | Reddit item ID |
| url | Main Reddit URL for the item |
| canonicalUrl | Normalized Reddit URL where available |
| scrapedAt | When the Actor collected the record |
| source | Which input produced the record |
| sources | Other inputs that found the same record, when duplicates are merged |

Example post output

{
  "kind": "post",
  "id": "1hvoazn",
  "url": "https://www.reddit.com/r/Baking/comments/1hvoazn/my_best_cheesecake_so_far/",
  "title": "My best cheesecake so far",
  "author": "example_user",
  "subreddit": "Baking",
  "createdAt": "2025-01-07T10:09:56.000Z",
  "score": 3489,
  "numComments": 43,
  "mediaType": "gallery",
  "hasMedia": true,
  "sentimentLabel": "positive",
  "contentCategoryLabel": "Food & Drink"
}

The exact fields depend on the record type and the options you enable.
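
Because one dataset mixes record types, downstream code usually branches on kind first. The sketch below assumes items shaped like the example above. The two comment records are made up for illustration, and optional fields are read with .get() since they are only present when the matching option is enabled.

```python
from collections import Counter

# One post taken from the example above, plus two hypothetical comment records.
records = [
    {"kind": "post", "id": "1hvoazn", "subreddit": "Baking", "score": 3489,
     "sentimentLabel": "positive"},
    {"kind": "comment", "id": "c1", "score": 12},  # hypothetical
    {"kind": "comment", "id": "c2", "score": 3},   # hypothetical
]


def count_by_kind(items):
    """Count dataset items per record type (post, comment, community, user)."""
    return Counter(item["kind"] for item in items)


def posts_with_sentiment(items, label):
    """Return posts whose optional sentimentLabel equals the given label."""
    return [
        item for item in items
        if item["kind"] == "post" and item.get("sentimentLabel") == label
    ]


counts = count_by_kind(records)
positive_posts = posts_with_sentiment(records, "positive")
```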

Run summary

At the end of a run, the Actor writes RUN-SUMMARY.json to the key-value store. This file is useful when you want a quick overview without opening the full dataset.

The summary includes:

  • total records saved
  • records by type
  • query and subreddit breakdowns
  • skipped items and why they were skipped
  • request statistics
  • warnings and errors
  • IDs of the output dataset and key-value store

If you enable writeHtmlReport, the Actor can also create a simple HTML report called RUN-MAP.html.
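
To fetch the summary from a script, you can address it as a key-value store record. This is a sketch under assumptions: it uses the Apify API's key-value store record endpoint, treats the record key as exactly RUN-SUMMARY.json as named above, and uses a made-up store ID. The JSON field names inside the summary are not documented here, so the example stops at building the URL.

```python
from urllib.parse import quote

API_BASE = "https://api.apify.com/v2"


def summary_record_url(store_id: str, key: str = "RUN-SUMMARY.json") -> str:
    """Build the API URL of a record in a run's key-value store."""
    # quote() keeps the key safe if it ever contains spaces or slashes.
    return f"{API_BASE}/key-value-stores/{store_id}/records/{quote(key, safe='')}"


# "myStoreId123" is a placeholder; use the key-value store ID of your run.
url = summary_record_url("myStoreId123")
```

Reading the record normally requires your API token, for example in an Authorization header.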

Cost and performance tips

This Actor is configured to keep costs low by default.

  • Residential proxy is enabled by default because Reddit currently blocks direct Apify cloud traffic.
  • For the cheapest successful tests, keep runs small and use direct Reddit URLs first.
  • Result limits are conservative by default.
  • Request retries are disabled by default to avoid paying for repeated failed requests.
  • Raw data, media details, awards, and HTML reports are off by default.
  • Comments are only collected when you enable comment collection.

To keep runs cheap:

  • start with maxItems between 10 and 100
  • keep crawlCommentsPerPost off unless you need thread-level discussion
  • keep saveRawData off unless you are debugging
  • keep writeHtmlReport off unless you need a visual report
  • avoid maximizeCoverage unless recall matters more than speed and cost
  • disable proxy only if direct access works for your run environment

Store pricing

This Actor is designed for simple pay-per-result pricing on Apify Store.

Recommended paid events:

| Event | What it means |
| --- | --- |
| apify-actor-start | A very small startup event charged automatically by Apify |
| apify-default-dataset-item | One saved dataset record, such as a post, comment, community, or user |

This keeps pricing easy to predict: the more records you save, the more you pay. Apify shows the run cost before and during execution, and you can control spend by setting maxItems, comment limits, and other result caps.
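
As a rough worked example, the per-result part of a run's cost can be estimated from the listed rate of "from $2.00 / 1,000 results". The sketch assumes that rate applies to your plan and leaves out the start event and any proxy or platform usage, because their prices are not stated on this page.

```python
from decimal import Decimal

# Listed Store rate: "from $2.00 / 1,000 results". Your actual rate may differ.
PRICE_PER_1000_RESULTS = Decimal("2.00")


def estimate_result_cost(records: int,
                         price_per_1000: Decimal = PRICE_PER_1000_RESULTS) -> Decimal:
    """Estimate the per-result charge in USD, rounded to cents."""
    if records < 0:
        raise ValueError("records must not be negative")
    return (price_per_1000 * records / 1000).quantize(Decimal("0.01"))


# maxItems from the example inputs above: 25, 100, and 150 records.
estimates = {n: estimate_result_cost(n) for n in (25, 100, 150)}
```

Since maxItems caps the number of saved records, the estimate for maxItems is an upper bound on the per-result charge for that run.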

Scheduling and integrations

You can schedule this Actor in Apify Console to monitor Reddit regularly. For example:

  • every hour for fast-moving brand monitoring
  • once per day for subreddit tracking
  • once per week for market research exports
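
Scheduled runs over the same keywords will often return posts that an earlier run already saved. The sketch below shows one possible way to keep only new records on your side, keyed on the kind and id fields from the output section. It is a downstream helper with hypothetical names, not a description of how the Actor merges duplicates internally.

```python
def new_records(items, seen_keys):
    """Return items not seen in earlier runs and remember their keys."""
    fresh = []
    for item in items:
        # kind + id, in case a post and a comment ever share an ID string.
        key = (item["kind"], item["id"])
        if key not in seen_keys:
            seen_keys.add(key)
            fresh.append(item)
    return fresh


# Hypothetical results of two consecutive scheduled runs.
seen = set()
first_run = [{"kind": "post", "id": "a"}, {"kind": "post", "id": "b"}]
second_run = [{"kind": "post", "id": "b"}, {"kind": "post", "id": "c"}]

fresh_first = new_records(first_run, seen)
fresh_second = new_records(second_run, seen)
```

In practice the seen set would be persisted between runs, for example in a database table or a file.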

After each run, you can send the dataset to:

  • Google Sheets
  • Make
  • Zapier
  • n8n
  • webhooks
  • cloud storage
  • databases and warehouses
  • custom applications through the Apify API

Important notes and limitations

Reddit controls how much public data is available through its pages and listings. This affects all Reddit scrapers, not only this Actor.

  • Some private, restricted, quarantined, deleted, removed, or login-gated content cannot be collected.
  • Reddit search and subreddit listings may expose only a limited window of results.
  • Very old posts may require narrower keywords, different sort options, or direct URLs.
  • Reddit may rate limit or block traffic from cloud networks or proxies.
  • If every Reddit request is blocked, the Actor fails the run instead of silently returning an empty successful dataset.
  • This version is HTTP-first and does not use a browser fallback.

If a run is blocked by Reddit, try a smaller run first, reduce concurrency and request rate, try a direct post URL, use different inputs, or run again later. Residential proxy settings are often the most reliable cloud option for Reddit, but they can increase cost and are not guaranteed to bypass every Reddit-side block.

FAQ

Is Reddit scraping legal?

Scraping public Reddit data can be allowed in many cases, but you are responsible for how you collect, store, and use the data. Always follow Reddit's terms, applicable laws, privacy rules, and the rules of any downstream platform where you use the data.

Do I need a Reddit account or API key?

No. This Actor is built for supported public Reddit pages and does not require a Reddit login or Reddit API key.

Can it scrape comments?

Yes. Enable crawlCommentsPerPost to collect comments under posts. You can control the amount with maxCommentsPerPost and commentDepthLimit.

Can I scrape a specific Reddit post?

Yes. Add the post URL to startUrls. If you also want the comments, enable crawlCommentsPerPost.

Can I scrape a whole subreddit?

Yes. Add a subreddit name such as r/startups or a full subreddit URL to subredditUrls. You can choose sorting options such as new, hot, top, rising, or most commented.

Why did I get fewer results than expected?

Common reasons include Reddit result limits, strict filters, date filters, duplicate removal, deleted or unavailable items, or Reddit blocking the request. Check RUN-SUMMARY.json for warnings, errors, and skip counts.

Why can't I always get more than about 1,000 posts from a subreddit or search?

Reddit lists are not unlimited. Search pages and subreddit feeds often stop after a practical result window. To find more unique posts, try narrower keywords, different time windows, different sort options, or direct Reddit URLs.
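
One way to apply this advice is to generate several small inputs that vary sort and time for the same keyword, then run them separately. The sketch below is illustrative only: "new" and "week" appear in the example inputs above, while "top" as a sort value and "month" as a search time value are assumptions to check against the Actor's input schema.

```python
from itertools import product


def coverage_variants(term, sorts=("new", "top"), times=("week", "month"),
                      max_items=100):
    """Build one input per sort/time combination for a single search term."""
    return [
        {"searchTerms": [term], "sort": sort, "time": time, "maxItems": max_items}
        for sort, time in product(sorts, times)
    ]


variants = coverage_variants("Acme AI")
```

Each variant is a separate run with its own result cap, so total cost grows with the number of combinations. Overlapping results across variants are expected.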

Do I need proxies?

On Apify cloud, usually yes. Reddit is currently blocking direct cloud requests in our tests, while the RESIDENTIAL proxy group succeeded. Residential proxy traffic can increase cost, so keep test runs small and lower maxItems while testing.

Can I export the results?

Yes. Apify datasets can be exported as JSON, CSV, Excel, XML, RSS, or accessed through the Apify API.
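
For API access, a dataset can be requested in a chosen format with only the columns you need. The sketch below builds such a URL using the Apify API's dataset items endpoint and its format, clean, and fields query parameters. The dataset ID is a placeholder, and the field names are taken from the output section above.

```python
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2"


def dataset_export_url(dataset_id: str, fmt: str = "csv", fields=None) -> str:
    """Build a dataset items URL for the given export format."""
    params = {"format": fmt, "clean": "true"}
    if fields:
        # Limit the export to selected fields, sent as a comma-separated list.
        params["fields"] = ",".join(fields)
    return f"{API_BASE}/datasets/{dataset_id}/items?{urlencode(params)}"


# "abc123" is a placeholder dataset ID.
csv_url = dataset_export_url("abc123", "csv", ["kind", "id", "url"])
```

Unless the dataset is shared publicly, the request also needs your API token.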

Can I use the data with AI tools?

Yes. The output is structured JSON, which makes it suitable for AI search, summarization, clustering, dashboards, and RAG workflows. Make sure your use of the data follows applicable privacy and platform rules.

Responsible use

Use this Actor only for public Reddit data that you are allowed to collect and process. Do not use it to collect private, login-gated, sensitive, or harmful personal data. Avoid publishing datasets in a way that exposes individuals unfairly or outside the purpose for which the data was collected.

Support

If something does not work as expected, include:

  • the Apify run ID
  • your input JSON
  • the RUN-SUMMARY.json file
  • a short description of what you expected and what happened

This makes it much easier to diagnose blocked requests, empty datasets, input mistakes, and result-limit questions.
