Reddit Scraper — Detect pain points, leads, emerging trends

Pricing

$29.00/month + usage

Scrape Reddit posts, comments, communities, and user profiles via URLs or keyword searches. Supports proxy rotation, flexible filters, custom field names, and automatic retries. Ideal for monitoring discussions, trend analysis, research, and large-scale data collection.

Rating

5.0

(1)

Developer

scraping automation

Maintained by Community

Last modified: 5 months ago

Reddit Scraper

Reddit Scraper is a comprehensive Apify Actor that collects posts, comments, communities, user profiles, and leaderboards from Reddit. Each entity saved in the dataset uses differentiated output fields (entityType, headline, mediaBundle, communityTag, subscriberTotal, karmaPost, etc.) to facilitate integration and avoid conflicts with other data sources.

Main Features

🎯 Complete Coverage

Posts: Title, content, media, votes, comments, complete metadata
Comments: Complete tree structure, votes, depth, replies
Communities/Subreddits: Metadata, members, descriptions, icons
User Profiles: Karma, post/comment history, metadata
Leaderboards: Rankings of popular subreddits by category

🔧 Flexibility and Control

URL-based Scraping: Supports all Reddit formats (posts, users, communities, leaderboards, searches, multireddits)
Keyword-based Scraping: Automatic search with configurable scope (Posts or Communities & users)
Advanced Sorting: 5 options (relevance, hot, top, new, comments)
Temporal Filters: By hour, day, week, month, year
Granular Limits: 6 independent cap types (total, post, comment, community, profile, leaderboard)
Score Filters: Automatically exclude posts/comments with low scores
NSFW Filters: Option to exclude NSFW content
Absolute Date Filters: Filter by precise date range (dateFrom, dateTo)
Multireddits: Support for URLs combining multiple subreddits (e.g., /r/pics+funny)
Automatic Pagination: Collect more than 100 items by automatically paginating
Deduplication: Avoid duplicates within the same run
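The score, date-range, and deduplication options listed above can be pictured as a filter pass over collected items. The following is a minimal sketch of that idea, not the Actor's actual implementation: the function names are hypothetical, the field names (`voteScore`, `createdAt`, `entityId`) come from the example output, and treating the date bounds as inclusive and a missing score as 0 are assumptions.

```python
from datetime import datetime


def parse_iso(ts):
    """Parse an ISO 8601 timestamp such as 2024-01-01T00:00:00Z."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def apply_filters(items, min_score=None, date_from=None, date_to=None, dedupe=False):
    """Yield items that pass the score, date-range, and duplicate checks."""
    lower = parse_iso(date_from) if date_from else None
    upper = parse_iso(date_to) if date_to else None
    seen = set()  # entityId values already yielded in this pass
    for item in items:
        # Assumption: an item without a score counts as 0.
        if min_score is not None and item.get("voteScore", 0) < min_score:
            continue
        created = parse_iso(item["createdAt"])
        # Assumption: both date bounds are inclusive.
        if lower and created < lower:
            continue
        if upper and created > upper:
            continue
        if dedupe:
            if item["entityId"] in seen:
                continue
            seen.add(item["entityId"])
        yield item
```

The same pass can also be applied to an already downloaded dataset, for example to tighten `minScore` without starting a new run.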

🛡️ Robustness and Reliability

Automatic Retry: Intelligent handling of 403/429 errors with proxy rotation
Flexible Proxy Configuration: Apify Proxy (residential/datacenter) or custom proxies
Automatic Fallback: Default URL if no input is provided (ideal for automated tests)
Debug Mode: Detailed logging for quick diagnostics
Configurable Concurrency: Adjust the number of parallel requests
Performance Metrics: Detailed statistics at end of run (items/sec, duration, applied filters, duplicates)
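As an illustration of the retry behavior described above, here is a minimal sketch of retrying on 403/429 responses while rotating through a proxy list. It is an assumption about the general technique, not the Actor's code: `fetch_with_retry`, `BlockedError`, and `max_attempts` are hypothetical names, and the HTTP call is injected as a plain callable so the sketch does not depend on any particular library.

```python
import time


class BlockedError(Exception):
    """Raised when every attempt returned 403 or 429."""


def fetch_with_retry(fetch, url, proxies, wait_seconds=30, max_attempts=3, sleep=time.sleep):
    """Call fetch(url, proxy) until it returns a status other than 403/429.

    fetch is any callable returning (status_code, body). max_attempts is a
    hypothetical knob; the document does not state a retry count.
    """
    for attempt in range(max_attempts):
        proxy = proxies[attempt % len(proxies)]  # rotate to the next proxy
        status, body = fetch(url, proxy)
        if status not in (403, 429):
            return status, body
        if attempt < max_attempts - 1:
            sleep(wait_seconds)  # fixed delay, in the spirit of scrollWaitSeconds
    raise BlockedError(f"{url} still blocked after {max_attempts} attempts")
```

Passing `sleep` as a parameter keeps the function easy to exercise in tests without real waiting.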

🎨 Customization

Light Extract Mode: Extract only essential fields (permalink, headline, textBody) - perfect for AI processing and minimal data needs
Differentiated Output Fields: Unique names to facilitate integration
Extend Result Function: Custom enrichment of each item
Output Format: JSON, CSV, XML, HTML, Excel via Apify interface

Main Input Parameters

Field	Type	Default	Description
`startLinks`	array	`[]`	Reddit URLs to crawl (posts, communities, users, leaderboards, searches). If both this and `searchQueries` are empty, the Actor falls back to `/r/popular/`.
`searchQueries`	array<string>	`[]`	Keywords to run a Reddit search.
`searchScope`	enum	`posts`	Search tab to target: `posts`, or `communities` (communities and users).
`sortOrder`	enum	`relevance`	`relevance`, `hot`, `top`, `new`, `comments` (5 available options).
`timeWindow`	enum	`all`	`all`, `hour`, `day`, `week`, `month`, `year` (for posts).
`totalItemCap`	integer	`100`	Global limit of items in the dataset.
`postCap`	integer	`50`	Maximum posts per subreddit/feed/user.
`commentCap`	integer	`25`	Maximum comments per post.
`communityCap`	integer	`25`	Maximum communities from leaderboards/searches.
`profileCap`	integer	`25`	Maximum user profiles from searches.
`leaderboardCap`	integer	`25`	Number of entries from `/subreddits/leaderboard`.
`scrollWaitSeconds`	integer	`30`	Delay, in seconds, between retries on 403/429 errors.
`maxConcurrency`	integer	`10`	Maximum number of parallel HTTP requests.
`useApifyProxy`	boolean	`true`	Enable Apify Proxy (recommended to avoid 403 errors).
`proxyConfiguration`	object	`{}`	Detailed proxy configuration (Apify or custom).
`extendResultFunction`	string	-	JavaScript function to enrich each item.
`debugLog`	boolean	`false`	Enable detailed logging for diagnostics.
`minScore`	integer	`null`	Minimum score to filter posts and comments. Items with lower scores will be excluded.
`includeNSFW`	boolean	`true`	When `false`, excludes NSFW posts and communities from results.
`logMetrics`	boolean	`true`	Displays performance statistics at end of run (items/sec, duration, errors, filters).
`enablePagination`	boolean	`false`	Enables automatic pagination to collect more than 100 items per listing.
`dateFrom`	string	`null`	Start date to filter items (ISO 8601 format, e.g., `2024-01-01T00:00:00Z`).
`dateTo`	string	`null`	End date to filter items (ISO 8601 format, e.g., `2024-12-31T23:59:59Z`).
`enableDeduplication`	boolean	`false`	Enables deduplication to avoid duplicates (by `entityId`) within the same run.
`lightExtract`	boolean	`false`	When `true`, only extracts minimal fields: `permalink`, `headline`, and `textBody`. Applies to posts and comments only. Perfect for AI processing.
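Before starting a run, it can help to check an input object against the enum values and cap fields in the table. The sketch below is one hedged way to do that locally; `check_input` is a hypothetical helper, and requiring caps to be non-negative integers is an assumption, since the table gives only types and defaults. The `lightExtract` / `extendResultFunction` rule reflects the Technical Notes.

```python
SORT_ORDERS = {"relevance", "hot", "top", "new", "comments"}
TIME_WINDOWS = {"all", "hour", "day", "week", "month", "year"}
SEARCH_SCOPES = {"posts", "communities"}
CAP_FIELDS = (
    "totalItemCap", "postCap", "commentCap",
    "communityCap", "profileCap", "leaderboardCap",
)


def check_input(run_input):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    enums = {
        "sortOrder": SORT_ORDERS,
        "timeWindow": TIME_WINDOWS,
        "searchScope": SEARCH_SCOPES,
    }
    for field, allowed in enums.items():
        value = run_input.get(field)
        if value is not None and value not in allowed:
            problems.append(f"{field}: {value!r} is not one of {sorted(allowed)}")
    for field in CAP_FIELDS:
        value = run_input.get(field)
        if value is None:
            continue
        # Assumption: caps must be non-negative integers (bool is rejected).
        if isinstance(value, bool) or not isinstance(value, int) or value < 0:
            problems.append(f"{field}: expected a non-negative integer, got {value!r}")
    if run_input.get("lightExtract") and run_input.get("extendResultFunction"):
        problems.append("extendResultFunction is not supported when lightExtract is true")
    return problems
```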

Input Example

{
  "startLinks": [
    {"url": "https://www.reddit.com/r/worldnews/"},
    {"url": "https://www.reddit.com/r/learnprogramming/comments/lp1hi4/is_webscraping_a_good_skill_to_learn_as_a_beginner/"},
    {"url": "https://www.reddit.com/subreddits/leaderboard/"},
    {"url": "https://www.reddit.com/r/pics+funny/"}
  ],
  "searchQueries": ["parrots"],
  "searchScope": "communities",
  "sortOrder": "new",
  "timeWindow": "all",
  "totalItemCap": 20,
  "postCap": 10,
  "commentCap": 5,
  "communityCap": 15,
  "leaderboardCap": 25,
  "maxConcurrency": 10,
  "scrollWaitSeconds": 30,
  "useApifyProxy": true,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  },
  "debugLog": false,
  "minScore": 10,
  "includeNSFW": false,
  "logMetrics": true,
  "enablePagination": true,
  "dateFrom": "2024-01-01T00:00:00Z",
  "dateTo": "2024-12-31T23:59:59Z",
  "enableDeduplication": true,
  "lightExtract": false
}
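An input like the one above can also be assembled and submitted from code. The sketch below uses the `apify-client` Python package; the Actor ID `runtime/reddit-scraper` is inferred from this page's URL and the `APIFY_TOKEN` environment variable name is an assumption, so adjust both to your setup. `multireddit_url`, `build_run_input`, and `run_scraper` are hypothetical helper names.

```python
import os


def multireddit_url(subreddits):
    """Build a multireddit URL such as https://www.reddit.com/r/pics+funny/."""
    return "https://www.reddit.com/r/" + "+".join(subreddits) + "/"


def build_run_input(urls=(), queries=(), **options):
    """Assemble an input object using the field names from the table above."""
    run_input = {
        "startLinks": [{"url": url} for url in urls],
        "searchQueries": list(queries),
    }
    run_input.update(options)  # e.g. totalItemCap=20, enableDeduplication=True
    return run_input


def run_scraper(run_input, actor_id="runtime/reddit-scraper"):
    """Start a run and return its dataset items (needs network and a token).

    The default actor_id is an assumption taken from the page URL.
    """
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(os.environ["APIFY_TOKEN"])
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


# Example call (not executed here because it starts a billable run):
# items = run_scraper(build_run_input(
#     urls=[multireddit_url(["pics", "funny"])],
#     totalItemCap=20,
#     enableDeduplication=True,
# ))
```

The client import sits inside `run_scraper` so the input helpers can be used and tested without the package installed.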

Use Cases

AI Processing: Use lightExtract: true to get clean, minimal data perfect for AI models, sentiment analysis, and NLP tasks
Brand Monitoring: Track discussions about your product or service
Trend Research: Identify popular topics by community
Sentiment Analysis: Collect comments for NLP analysis
Community Discovery: Explore leaderboards by category
Competitive Intelligence: Monitor competitor mentions
Academic Research: Collect data for social studies
Content Curation: Find relevant content by keywords

Example Output

Full Extract (default)

{
  "entityType": "post",
  "entityId": "t3_144w7sn",
  "redditId": "144w7sn",
  "permalink": "https://www.reddit.com/r/HonkaiStarRail/comments/144w7sn/my_luckiest_10x_pull_yet/",
  "headline": "My Luckiest 10x Pull Yet",
  "textBody": "URL: https://i.redd.it/yod3okjkgx4b1.jpg",
  "mediaBundle": {
    "primaryUrl": "https://i.redd.it/yod3okjkgx4b1.jpg",
    "thumbnailUrl": "https://b.thumbs.redditmedia.com/lm9KxS4laQWgx4uOoioM3N7-tBK3GLPrxb9da2hGtjs.jpg",
    "isVideo": false
  },
  "authorHandle": "YourKingLives",
  "communityTag": "r/HonkaiStarRail",
  "voteScore": 1,
  "commentTotal": 0,
  "createdAt": "2023-06-09T05:23:15.000Z",
  "collectedAt": "2025-11-20T10:00:00.000Z"
}
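When consuming full-extract items downstream, it can be convenient to load them into a typed record with parsed timestamps. The following is a small sketch under that assumption; the `Post` class and its attribute names are hypothetical, while the dictionary keys are the ones shown in the example item.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


def parse_iso(ts: str) -> datetime:
    """Parse timestamps like 2023-06-09T05:23:15.000Z into aware datetimes."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


@dataclass(frozen=True)
class Post:
    entity_id: str
    headline: str
    community: str
    score: int
    comment_total: int
    created_at: datetime
    collected_at: datetime

    @classmethod
    def from_item(cls, item: dict) -> "Post":
        """Build a Post from one dataset item, rejecting other entity types."""
        if item.get("entityType") != "post":
            raise ValueError(f"expected a post, got {item.get('entityType')!r}")
        return cls(
            entity_id=item["entityId"],
            headline=item["headline"],
            community=item["communityTag"],
            score=item["voteScore"],
            comment_total=item["commentTotal"],
            created_at=parse_iso(item["createdAt"]),
            collected_at=parse_iso(item["collectedAt"]),
        )

    def age_at_collection(self) -> timedelta:
        """Time elapsed between posting and scraping."""
        return self.collected_at - self.created_at
```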

Light Extract (`lightExtract: true`)

Perfect for AI processing, data analysis, or when you only need essential content:

{
  "permalink": "https://www.reddit.com/r/HonkaiStarRail/comments/144w7sn/my_luckiest_10x_pull_yet/",
  "headline": "My Luckiest 10x Pull Yet",
  "textBody": "URL: https://i.redd.it/yod3okjkgx4b1.jpg"
}

Note: Light extract mode only applies to posts and comments. Communities and profiles are not included when lightExtract: true.
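If a full-extract dataset already exists, the same minimal shape can be derived locally instead of rerunning with `lightExtract: true`. This is a hedged sketch of that projection: `to_light` is a hypothetical helper, and `"comment"` as an `entityType` value is an assumption, since the example output shows only `"post"`.

```python
LIGHT_FIELDS = ("permalink", "headline", "textBody")
# Assumption: comments are tagged "comment"; only "post" appears in the example.
LIGHT_ENTITY_TYPES = {"post", "comment"}


def to_light(item):
    """Return the minimal view of a post or comment, or None for other entities."""
    if item.get("entityType") not in LIGHT_ENTITY_TYPES:
        return None  # communities and profiles are skipped in light mode
    return {field: item.get(field) for field in LIGHT_FIELDS}


def light_dataset(items):
    """Project a list of full-extract items, dropping skipped entity types."""
    return [light for light in map(to_light, items) if light is not None]
```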

Quick Start

Open the Actor in Apify Console
Configure input parameters (or use default values)
Click Start and wait for the run to complete
Download results from the Dataset tab (JSON, CSV, XML, HTML, Excel)

Note: If you provide neither startLinks nor searchQueries, the Actor automatically uses /r/popular/ as a starting point, so a run with empty input (for example, an automated test) still produces results.

Key Advantages

Differentiated Output Fields

Data is structured with unique field names (entityType, headline, mediaBundle, communityTag, subscriberTotal, karmaPost, etc.) to facilitate integration and avoid conflicts with other data sources.

Automatic Robustness

Automatic retry with proxy rotation on 403/429 errors
Intelligent rate limit handling
Automatic fallback if no input is provided
Debug mode for quick diagnostics

Advanced Configuration

Granular control with 6 independent cap types
Adjustable concurrency and delays
Complete support for Reddit leaderboards
5 sorting options (including "comments")
Automatic pagination to collect large volumes
Absolute date filters for precise historical analysis
Automatic deduplication to avoid duplicates

Technical Notes

extendResultFunction receives { data, page }; page is null because we use Reddit's JSON API.
extendResultFunction is not supported in light extract mode (data structure is intentionally minimal).
Always respect Reddit's usage rules and avoid unreasonable volumes.
Using Apify Proxy (residential recommended) is strongly advised to avoid 403 blocks.
When lightExtract: true, only posts and comments are extracted with minimal fields (permalink, headline, textBody). Communities and profiles are skipped.

Legal Disclaimer

Important: This Actor scrapes publicly available data from Reddit. By using this Actor, you acknowledge and agree to the following:

Reddit Terms of Service: You are responsible for complying with Reddit's Terms of Service and User Agreement. Reddit's ToS can be found at https://www.reddit.com/help/useragreement.
Rate Limiting: This Actor includes automatic retry logic and proxy rotation to handle rate limits. However, you must use reasonable request rates and avoid excessive scraping that could impact Reddit's servers.
Data Usage: The scraped data is for your personal or business use only. You must respect copyright, privacy rights, and any applicable data protection laws (such as GDPR, CCPA) when using the collected data.
No Warranty: This Actor is provided "as is" without any warranties. The developers are not responsible for any consequences arising from the use of this Actor, including but not limited to account bans, legal issues, or data inaccuracies.
User Responsibility: You are solely responsible for ensuring that your use of this Actor complies with all applicable laws and regulations in your jurisdiction. This includes respecting intellectual property rights, privacy laws, and terms of service of third-party platforms.
Prohibited Uses: Do not use this Actor to:
- Scrape private or restricted content
- Violate Reddit's API usage policies
- Collect personal information without consent
- Engage in any illegal activities

Recommendation: For production use, consider using Reddit's official API when possible, as it provides a more reliable and compliant way to access Reddit data.

URL: https://apify.com/runtime/reddit-scraper