Reddit Comment Scraper

Pricing

from $5.00 / 1,000 results

Scrape comments from Reddit posts. Returns each comment's text, its parent comment, score, and timestamps.

Rating

5.0

(3)

Developer

Crawler Bros

Maintained by Community

Actor stats

Total users

566

Monthly active users

114

Last modified

8 days ago

Features

💬 Scrape comments from multiple Reddit posts
📊 Extract comprehensive comment data (text, author, score, timestamps, etc.)
🔄 Automatically expand collapsed threads and "load more" sections
🌳 Capture nested comment structure with depth levels
📦 No authentication required for public posts
💾 Data saved in structured JSON format
🌐 Browser automation bypasses API restrictions

Input Parameters

The actor accepts the following input parameters:

Parameter	Type	Required	Default	Description
`postUrls`	array	Yes	-	List of Reddit post URLs to scrape comments from
`maxComments`	integer	No	`100`	Maximum number of comments to scrape from each post (1-10000)
`expandThreads`	boolean	No	`true`	Automatically expand collapsed threads and "load more" sections

Example Input

{
  "postUrls": [
    "https://www.reddit.com/r/programming/comments/1abc123/interesting_discussion/",
    "https://old.reddit.com/r/python/comments/1def456/another_post/"
  ],
  "maxComments": 200,
  "expandThreads": true
}
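
The parameter table above can be checked before a run. The following is a minimal sketch, not the actor's own validation code: `validate_input` and `is_post_url` are hypothetical helpers, and the URL heuristic (a reddit.com host with `/comments/` in the path) is an assumption based on the example URLs.

```python
from urllib.parse import urlparse

DEFAULT_MAX_COMMENTS = 100    # documented default
MAX_COMMENTS_LIMIT = 10_000   # documented upper bound


def is_post_url(url) -> bool:
    """Heuristic (an assumption): reddit.com host and '/comments/' in the path."""
    if not isinstance(url, str):
        return False
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    on_reddit = host == "reddit.com" or host.endswith(".reddit.com")
    return parsed.scheme in ("http", "https") and on_reddit and "/comments/" in parsed.path


def validate_input(raw: dict) -> dict:
    """Return the input with defaults filled in, or raise ValueError."""
    post_urls = raw.get("postUrls")
    if not isinstance(post_urls, list) or not post_urls:
        raise ValueError("postUrls must be a non-empty list")
    bad = [u for u in post_urls if not is_post_url(u)]
    if bad:
        raise ValueError(f"not Reddit post URLs: {bad!r}")

    max_comments = raw.get("maxComments", DEFAULT_MAX_COMMENTS)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_comments, bool) or not isinstance(max_comments, int):
        raise ValueError("maxComments must be an integer")
    if not 1 <= max_comments <= MAX_COMMENTS_LIMIT:
        raise ValueError(f"maxComments must be between 1 and {MAX_COMMENTS_LIMIT}")

    expand_threads = raw.get("expandThreads", True)
    if not isinstance(expand_threads, bool):
        raise ValueError("expandThreads must be a boolean")

    return {
        "postUrls": post_urls,
        "maxComments": max_comments,
        "expandThreads": expand_threads,
    }
```

Running a check like this locally catches a malformed URL or an out-of-range `maxComments` before any browser time is spent.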

Output Fields

The actor extracts the following data for each comment:

Comment Information

comment_id - Unique comment ID (e.g., "abc123xyz")
comment_name - Full comment name in Reddit format (e.g., "t1_abc123xyz")
author - Username of the comment author (or "[deleted]")
text - Full comment text/content

Engagement Metrics

score - Comment score/karma (upvotes minus downvotes)
awards_count - Number of awards/gildings the comment received

Metadata

depth - Nesting level/depth in the comment thread (0 = top-level)
parent_comment_id - ID of the parent comment (null for top-level comments)
is_op - Boolean indicating if the author is the Original Poster
is_edited - Boolean indicating if the comment was edited
is_stickied - Boolean indicating if the comment is stickied/pinned
permalink - Direct URL of the comment
post_url - URL of the post the comment belongs to

Timestamps

created_utc - Unix timestamp when the comment was created
created_at - ISO 8601 formatted datetime (e.g., "2025-10-14T12:30:45")
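
The two timestamp fields carry the same instant in different forms. As a rough sketch, assuming `created_at` is the UTC rendering of `created_utc` without a timezone suffix (the page does not state the timezone), the conversion could look like this; `to_created_at` is a hypothetical helper name.

```python
from datetime import datetime, timezone


def to_created_at(created_utc: int) -> str:
    """Format a Unix timestamp as an ISO 8601 string in UTC, without an offset suffix."""
    moment = datetime.fromtimestamp(created_utc, tz=timezone.utc)
    return moment.strftime("%Y-%m-%dT%H:%M:%S")


print(to_created_at(1760445045))  # 2025-10-14T12:30:45
```

When both fields are present, `created_utc` is the safer one to sort and filter on, since it is unambiguous about the timezone.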

Example Output

{
  "comment_id": "abc123xyz",
  "comment_name": "t1_abc123xyz",
  "author": "example_user",
  "text": "This is a great discussion! I totally agree with your points about...",
  "score": 42,
  "awards_count": 2,
  "permalink": "https://old.reddit.com/r/programming/comments/1abc123/_/abc123xyz/",
  "post_url": "https://old.reddit.com/r/programming/comments/1abc123/interesting_discussion/",
  "depth": 0,
  "parent_comment_id": null,
  "is_op": false,
  "is_edited": true,
  "is_stickied": false,
  "created_utc": 1760445045,
  "created_at": "2025-10-14T12:30:45"
}
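
Because each record carries `comment_id` and `parent_comment_id`, the flat dataset can be folded back into a thread tree. The sketch below assumes `parent_comment_id` holds the parent's bare `comment_id` (not the `t1_` name), which the page does not confirm; `build_tree` and the `replies` key are illustrative names, not part of the actor's output.

```python
from collections import defaultdict


def build_tree(comments: list[dict]) -> list[dict]:
    """Nest flat comment records under a 'replies' key using parent_comment_id."""
    known_ids = {c["comment_id"] for c in comments}
    children = defaultdict(list)
    for comment in comments:
        parent = comment.get("parent_comment_id")
        # A parent that was not scraped (for example, cut off by maxComments)
        # would orphan the reply, so such comments are treated as roots.
        key = parent if parent in known_ids else None
        children[key].append(comment)

    def attach(parent_id):
        return [
            {**c, "replies": attach(c["comment_id"])}
            for c in children.get(parent_id, [])
        ]

    return attach(None)
```

The `depth` field is not needed for the reconstruction, but it is a useful cross-check: a root in the rebuilt tree with `depth` greater than 0 indicates a missing parent.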

Usage

Local Development

Install dependencies:

pip install -r requirements.txt
playwright install chromium

Set up input in storage/key_value_stores/default/INPUT.json:

{
  "postUrls": ["https://www.reddit.com/r/programming/comments/1example/"],
  "maxComments": 100,
  "expandThreads": true
}

Run the actor:
```
python -m src
```
Check results in storage/datasets/default/
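
To inspect a local run, the dataset directory can be read back into Python. This is a sketch under the assumption that local storage writes one JSON file per item and may add a metadata file whose name starts with `__`; `load_local_dataset` is a hypothetical helper.

```python
import json
from pathlib import Path


def load_local_dataset(dataset_dir: str = "storage/datasets/default") -> list[dict]:
    """Load every item file from a local dataset directory, in filename order."""
    items = []
    for path in sorted(Path(dataset_dir).glob("*.json")):
        # Assumption: files starting with "__" hold metadata, not comments.
        if path.name.startswith("__"):
            continue
        with path.open(encoding="utf-8") as handle:
            items.append(json.load(handle))
    return items
```

For example, `len(load_local_dataset())` gives a quick count to compare against `maxComments`.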

On Apify Platform

Push to Apify:
- Login to Apify CLI: apify login
- Initialize: apify init (if not already done)
- Push to Apify: apify push
Or manually upload:
- Create a new actor on Apify platform
- Upload all files including Dockerfile, requirements.txt, and .actor/ directory
Configure and run:
- Set input parameters in the Apify console
- Paste Reddit post URLs
- Click "Start" to run the actor
- Download results from the dataset tab
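
Instead of the console, a deployed actor can also be started from Python with the `apify-client` package. The sketch below is illustrative: the API token and the actor ID are placeholders that must be supplied by the caller, and `build_run_input` and `run_actor` are hypothetical helper names.

```python
def build_run_input(post_urls, max_comments=100, expand_threads=True) -> dict:
    """Assemble the actor input using the documented parameter names."""
    return {
        "postUrls": list(post_urls),
        "maxComments": max_comments,
        "expandThreads": expand_threads,
    }


def run_actor(token: str, actor_id: str, run_input: dict) -> list[dict]:
    """Start the actor, wait for it to finish, and return the dataset items."""
    # Imported here so the module loads without the package installed
    # (pip install apify-client).
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


if __name__ == "__main__":
    import os

    # Both environment variable names are placeholders chosen for this sketch.
    comments = run_actor(
        os.environ["APIFY_TOKEN"],
        os.environ["ACTOR_ID"],
        build_run_input(["https://www.reddit.com/r/programming/comments/1example/"]),
    )
    print(f"fetched {len(comments)} comments")
```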

Technical Details

Browser Automation

Uses Playwright with Chromium browser
Scrapes old.reddit.com for better compatibility and simpler HTML structure
Implements anti-detection measures:
- Custom User-Agent headers
- Disabled automation flags
- Browser fingerprint masking
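
Since scraping happens on old.reddit.com while the input accepts other Reddit hostnames, a URL presumably gets rewritten before loading. How the actor does this is not shown on the page; the following is one possible sketch, with `to_old_reddit` as a hypothetical helper.

```python
from urllib.parse import urlsplit, urlunsplit


def to_old_reddit(url: str) -> str:
    """Point any reddit.com URL at old.reddit.com; leave other URLs untouched."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host == "reddit.com" or host.endswith(".reddit.com"):
        parts = parts._replace(netloc="old.reddit.com")
    return urlunsplit(parts)
```

This is consistent with the example output, where `post_url` and `permalink` use the old.reddit.com host.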

Features

Automatic thread expansion: Clicks "load more" and "continue this thread" buttons
Smart extraction: Handles nested comments and preserves thread structure
Depth tracking: Captures comment nesting levels
Parent-child relationships: Links comments to their parents
Error handling: Gracefully handles deleted comments and missing data

Comment Expansion

The scraper automatically:

Clicks "load more comments" buttons (up to 10 per attempt)
Clicks "continue this thread" links (up to 5 per attempt)
Makes up to 3 expansion attempts to maximize comment coverage
Waits for new comments to load after each expansion
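
The limits above can be read as a bounded loop. The sketch below models that loop without a browser so it can be run anywhere: `click_load_more` and `click_continue_thread` are hypothetical callables that click one matching element and return `False` when none is left, and the one-second wait is an assumed value. It is not the actor's actual code.

```python
import time
from typing import Callable


def expand_comments(
    click_load_more: Callable[[], bool],
    click_continue_thread: Callable[[], bool],
    *,
    max_attempts: int = 3,    # documented: up to 3 expansion attempts
    max_load_more: int = 10,  # documented: up to 10 "load more" clicks per attempt
    max_continue: int = 5,    # documented: up to 5 "continue this thread" clicks per attempt
    wait_seconds: float = 1.0,  # assumption: the real wait time is not documented
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run the bounded expansion loop and return the total number of clicks."""
    total = 0
    for _ in range(max_attempts):
        clicked = 0
        for click, limit in ((click_load_more, max_load_more),
                             (click_continue_thread, max_continue)):
            for _ in range(limit):
                if not click():
                    break
                clicked += 1
                sleep(wait_seconds)  # give newly loaded comments time to render
        if clicked == 0:
            break  # nothing left to expand, so further attempts are pointless
        total += clicked
    return total
```

With these limits, a single post triggers at most 3 × (10 + 5) = 45 expansion clicks, which is why very deep threads may still come back incomplete.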

Performance

Headless browser mode for efficiency
Optimized page load strategy (domcontentloaded)
Configurable wait times and timeouts
Multiple posts are processed sequentially, with delays between them

Limitations

Only works with public Reddit posts
Cannot scrape private or restricted posts
Browser automation is slower than direct API calls but more reliable
Hidden scores show as 0 (when "[score hidden]" is displayed)
Up to 10,000 comments per post (set with maxComments)
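
The hidden-score rule can be illustrated with a small parser. This is a sketch that assumes the score appears on the page as text such as "42 points" or "[score hidden]"; `parse_score` is a hypothetical helper, not the actor's code.

```python
import re


def parse_score(raw) -> int:
    """Turn a displayed score label into an integer; hidden or missing scores become 0."""
    if raw is None:
        return 0
    text = raw.strip().lower()
    if not text or "score hidden" in text:
        return 0
    match = re.search(r"-?\d[\d,]*", text)
    return int(match.group().replace(",", "")) if match else 0
```

One consequence for analysis: a `score` of 0 in the output may mean either a real zero or a hidden score, so very recent comments are best excluded from score-based statistics.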

Dependencies

apify>=2.1.0 - Apify SDK for Python
playwright~=1.40.0 - Browser automation framework
beautifulsoup4~=4.12.0 - HTML parsing library

Troubleshooting

Timeout Issues

If you encounter timeout errors:

Check if the post URL is valid and accessible
Increase timeout values in the code if needed
Verify the post has comments

Missing Comments

If some comments are missing:

Enable expandThreads to load collapsed comments
Increase maxComments limit
Some comments may be deleted or removed by moderators

"[deleted]" Authors

Comments from deleted accounts show "[deleted]" as author
This is normal Reddit behavior
The comment text may still be available or show as "[removed]"
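
For downstream analysis it often helps to separate comments that still have text from placeholder records. A minimal sketch, assuming removed or deleted bodies appear literally as "[removed]" or "[deleted]" in the `text` field; both helper names are hypothetical.

```python
PLACEHOLDERS = {"[deleted]", "[removed]"}  # assumption: literal placeholder strings


def has_usable_text(comment: dict) -> bool:
    """True when the comment body is present and not a placeholder."""
    text = (comment.get("text") or "").strip()
    return bool(text) and text not in PLACEHOLDERS


def split_comments(comments: list[dict]) -> tuple[list[dict], list[dict]]:
    """Return (comments with usable text, the subset of those by deleted accounts)."""
    usable = [c for c in comments if has_usable_text(c)]
    from_deleted_accounts = [c for c in usable if c.get("author") == "[deleted]"]
    return usable, from_deleted_accounts
```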

Use Cases

Sentiment Analysis: Analyze community opinions on topics
Market Research: Gather user feedback and discussions
Content Moderation: Monitor discussions for moderation
Academic Research: Study online community interactions
Data Analysis: Build datasets for machine learning

License

This actor is provided as-is for scraping public Reddit data in accordance with Reddit's terms of service.

Notes

This scraper uses browser automation to access Reddit's public web interface
Always respect Reddit's robots.txt and terms of service
Use responsibly and avoid overwhelming Reddit's servers
Consider implementing additional rate limiting for large-scale scraping
The actor works best with the Apify platform's infrastructure
Posts with thousands of comments may take longer to scrape
