
URL: https://apify.com/clearpath/reddit-to-llm-api


Reddit Scraper - Markdown for AI & n8n

Pricing: Pay per event


Extract Reddit posts and comments as LLM-ready Markdown. No API key needed. Direct n8n/Make integration: connect output to AI nodes instantly. 20x faster than browser scrapers. Perfect for lead gen, product validation, and market research workflows.


Rating: 0.0 (0)

Developer: ClearPath (Maintained by Community)

Actor stats: 1 bookmarked, 23 total users, 2 monthly active users, last modified 2 days ago

🔥 Reddit Scraper for LLM & RAG | Posts, Comments & Images (2026)

The fastest, most cost-effective way to extract Reddit data for AI workflows. No browser overhead, no rate limit headaches: just clean JSON + LLM-ready Markdown output that plugs directly into n8n, Make, or any automation platform.

  • ⚡ Blazing fast - Pure HTTP requests, no browser simulation
  • 💰 Incredibly cheap - $0.30 for 100 posts with comments (flat rate, any comment count)
  • 🤖 LLM-optimized - Markdown output ready for GPT, Claude, Gemini
  • 🔄 n8n native - Designed for workflow automation


Copy to your AI assistant

Copy this block into ChatGPT, Claude, Cursor, or any LLM to start using this actor.

clearpath/reddit-to-llm-api on Apify. Call: ApifyClient("TOKEN").actor("clearpath/reddit-to-llm-api").call(run_input={...}), then client.dataset(run["defaultDatasetId"]).list_items().items for results. Key inputs: mode (string: "subreddit" | "search" | "post"), subreddits (array), searchKeywords (array). Full actor spec (input schema with all params/enums/defaults, output dataset fields, README): GET https://api.apify.com/v2/acts/clearpath~reddit-to-llm-api/builds/default (Bearer TOKEN) → inputSchema, actorDefinition.storages.dataset, readme. Pricing: pay-per-event, $0.001/post, $0.003/post with comments, $0.0005/image. Get token: https://console.apify.com/account/integrations
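The call sequence in the block above can be sketched in Python roughly as follows. This is a minimal sketch, not the Actor's official client code: `build_search_input` and `run_actor` are hypothetical helper names, the input keys follow the Example Inputs in this README (which do not set `mode`), and it assumes the `apify-client` package is installed and a valid token is supplied.

```python
from typing import Any, Dict, List, Optional

ACTOR_ID = "clearpath/reddit-to-llm-api"


def build_search_input(
    keywords: List[str],
    subreddits: Optional[List[str]] = None,
    limit: int = 25,
    include_comments: bool = True,
) -> Dict[str, Any]:
    """Assemble a search-mode run_input from the documented parameters."""
    run_input: Dict[str, Any] = {
        "searchKeywords": list(keywords),
        "searchSort": "relevance",
        "searchLimit": limit,
        "includeComments": include_comments,
    }
    if subreddits:
        # Optional: restrict the search to specific subreddits.
        run_input["searchInSubreddits"] = list(subreddits)
    return run_input


def run_actor(token: str, run_input: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Run the Actor and return its dataset items (one item per post)."""
    # Imported lazily so the input builder works without the package installed.
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(token)
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    if run is None:
        raise RuntimeError("Actor run did not return run details")
    return client.dataset(run["defaultDatasetId"]).list_items().items


# Example (needs a real token and network access):
# posts = run_actor("TOKEN", build_search_input(["looking for", "need help with"]))
# print(posts[0]["markdown"])
```

Keeping the input builder separate from the network call makes it easy to inspect or log the exact `run_input` before spending credits on a run.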

Why This Actor?

Most Reddit scrapers give you raw JSON that needs heavy transformation before LLMs can use it. This Actor outputs pre-formatted Markdown alongside structured JSON, so you can feed it directly to an AI node without writing a single line of code.

Perfect for n8n Workflows

Lead Generation from Pain Points

reddit-to-llm (search: "looking for", "need help with")
→ LLM qualify leads
→ CRM/Email sequence

Find people actively seeking solutions you provide.

Product Validation Pipeline

Webhook (new idea)
→ reddit-to-llm (search related subreddits)
→ LLM analyze: demand signals, objections, existing solutions
→ Structured report

Before building, validate if people actually want it. The markdown format lets LLMs deeply analyze threaded discussions.
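Outside n8n, the "LLM analyze" step of this pipeline might look like the sketch below, which packs the `markdown` field of each dataset item into a single prompt. The function name `build_validation_prompt`, the prompt wording, and the `max_chars` budget are illustrative assumptions, not part of the Actor.

```python
from typing import Any, Dict, List, Optional


def build_validation_prompt(
    idea: str,
    posts: List[Optional[Dict[str, Any]]],
    max_chars: int = 12000,
) -> str:
    """Concatenate post Markdown into one analysis prompt, within a size budget."""
    sections: List[str] = []
    used = 0
    for post in posts:
        # Skip null items (for example deleted posts) and posts without Markdown.
        markdown = (post or {}).get("markdown")
        if not markdown:
            continue
        # Stop before exceeding the character budget (a rough proxy for tokens).
        if used + len(markdown) > max_chars:
            break
        sections.append(markdown)
        used += len(markdown)
    threads = "\n\n---\n\n".join(sections)
    return (
        f"Product idea: {idea}\n\n"
        "From the Reddit threads below, list demand signals, objections, "
        "and existing solutions.\n\n"
        f"{threads}"
    )
```

The returned string can be sent to whichever LLM API you use; the character budget is only a crude stand-in for a real token count.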


⚡ Key Features

Lightning Fast Extraction

  • No browser overhead - Direct data extraction, not Puppeteer/Playwright
  • 20 concurrent requests - Process multiple posts simultaneously
  • Automatic deduplication - No duplicate posts across modes

🎯 Three Collection Modes

  • Search - Global or subreddit-restricted keyword search
  • Subreddit Feeds - Hot, new, top, rising posts
  • Direct URLs - Scrape specific posts by URL

🤖 LLM-Ready Output

  • Markdown field - Formatted for direct AI consumption
  • Flat comments with depth - Easy to process, depth signals conviction
  • OP markers - Know when the author replies

📸 Optional Image Extraction

  • Preview images, galleries, direct links (i.redd.it, imgur)
  • Stored to Apify Key-Value Store with public URLs
  • Ready for vision models (GPT-4V, Claude)

💰 Pricing (Pay Per Event)

Transparent, predictable pricing. Only pay for what you extract.

| Event | Price |
|---|---|
| Post scraped (without comments) | $0.001 |
| Post scraped (with comments) | $0.003 |
| Image scraped | $0.0005 |

Flat rate per post - Whether a post has 1 comment or 500 comments, the price is the same ($0.003).

Cost Examples

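| Scenario | Posts | Comments? | Images | Total Cost |
|---|---|---|---|---|
| Posts only | 100 | No | 0 | $0.10 |
| Posts + comments | 100 | Yes (any count) | 0 | $0.30 |
| Deep dive | 500 | Yes (any count) | 0 | $1.50 |
| With images | 100 | Yes | 500 | $0.55 |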
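The totals above follow directly from the per-event prices: posts multiplied by the per-post rate, plus images multiplied by the image rate. A small estimator, with `estimate_cost` as a hypothetical helper name and `Decimal` used to avoid floating-point rounding on fractions of a cent:

```python
from decimal import Decimal

# Per-event prices as listed in the pricing table.
PRICE_POST = Decimal("0.001")
PRICE_POST_WITH_COMMENTS = Decimal("0.003")
PRICE_IMAGE = Decimal("0.0005")


def estimate_cost(posts: int, include_comments: bool = True, images: int = 0) -> Decimal:
    """Estimate the run cost in USD; comment count does not affect the price."""
    per_post = PRICE_POST_WITH_COMMENTS if include_comments else PRICE_POST
    return posts * per_post + images * PRICE_IMAGE


# "With images" scenario: 100 posts with comments plus 500 images.
print(f"${estimate_cost(100, include_comments=True, images=500):.2f}")  # prints $0.55
```

This only covers the Actor's pay-per-event charges listed here; any other platform costs are outside the sketch.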

Cost optimization tips:

  • Set includeComments: false if you only need post titles/content (3x cheaper)
  • Comment count doesn't affect price - get as many as you need!
  • Filter by subreddit to avoid irrelevant posts

Input Configuration

Search Mode

| Parameter | Type | Default | Description |
|---|---|---|---|
| searchKeywords | string[] | [] | Keywords to search (joined with spaces) |
| searchInSubreddits | string[] | [] | Limit search to specific subreddits |
| searchSort | enum | relevance | relevance, new, top, comments |
| searchLimit | integer | 25 | Max posts (1-1000) |

Subreddit Feed Mode

| Parameter | Type | Default | Description |
|---|---|---|---|
| subreddits | string[] | ["indiehackers"] | Subreddits to scrape |
| subredditSort | enum | hot | hot, new, top, rising |
| subredditTimeFilter | enum | - | For top: hour, day, week, month, year, all |
| subredditLimit | integer | 25 | Max posts per subreddit (1-1000) |

Direct URLs Mode

| Parameter | Type | Default | Description |
|---|---|---|---|
| postUrls | string[] | [] | Reddit post URLs or redd.it short links |

Output Settings

| Parameter | Type | Default | Description |
|---|---|---|---|
| includeComments | boolean | true | Fetch comments for each post |
| commentsLimit | integer | 100 | Max comments per post (0 = all, max 1000) |
| scrapeImages | boolean | false | Extract and store images |
| proxyConfiguration | object | Residential | Apify Proxy settings |
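Since a run is billed per event, it can be worth checking an input against the documented ranges and enum values before starting it. The following is a client-side sketch only: `validate_input` is a hypothetical helper, and the Actor's own schema validation (not shown in this README) remains authoritative.

```python
from typing import Any, Dict, List

# Allowed values and ranges copied from the parameter tables above.
ENUMS = {
    "searchSort": {"relevance", "new", "top", "comments"},
    "subredditSort": {"hot", "new", "top", "rising"},
    "subredditTimeFilter": {"hour", "day", "week", "month", "year", "all"},
}
RANGES = {
    "searchLimit": (1, 1000),
    "subredditLimit": (1, 1000),
    "commentsLimit": (0, 1000),  # 0 means "all comments"
}


def validate_input(run_input: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means no issues found."""
    errors: List[str] = []
    for key, (low, high) in RANGES.items():
        value = run_input.get(key)
        if value is not None and not (low <= value <= high):
            errors.append(f"{key} must be between {low} and {high}, got {value}")
    for key, allowed in ENUMS.items():
        value = run_input.get(key)
        if value is not None and value not in allowed:
            errors.append(f"{key} must be one of {sorted(allowed)}, got {value!r}")
    return errors
```

Parameters that are absent are left alone, so the Actor's defaults still apply.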

Output Schema

Each dataset item is one Reddit post and carries every field Reddit exposes for that post (media, image previews, galleries, awards, full flair, crosspost data, content and moderation flags) plus a full subreddit profile, the comment thread, and the LLM-ready Markdown. The example below is trimmed for readability; large nested objects (preview, media, all_awardings) are shown abbreviated.

{
  "id": "1prkwnx",
  "title": "Product Developer (15y SaaS/Apps) seeking Marketing/Sales co-builder",
  "author": "ManuelWenner",
  "author_fullname": "t2_8xk2p",
  "created_utc": "2025-12-20T18:18:21+00:00",
  "created_utc_epoch": 1766254701,
  "permalink": "/r/indiehackers/comments/1prkwnx/product_developer_15y_saasapps_seeking/",
  "url": "https://www.reddit.com/r/indiehackers/comments/1prkwnx/...",
  "domain": "self.indiehackers",
  "selftext": "Hey folks,\n\nI've been building digital products for ~15 years...",
  "score": 2,
  "ups": 2,
  "upvote_ratio": 0.75,
  "num_comments": 8,
  "num_crossposts": 0,
  "view_count": null,
  "subreddit": "indiehackers",
  "subreddit_id": "t5_3i9si",
  "subreddit_subscribers": 140617,
  "is_self": true,
  "is_video": false,
  "over_18": false,
  "spoiler": false,
  "stickied": false,
  "locked": false,
  "archived": false,
  "edited": false,
  "gilded": 0,
  "total_awards_received": 0,
  "all_awardings": [],
  "distinguished": null,
  "post_hint": null,
  "thumbnail": "self",
  "link_flair_text": "General Question",
  "link_flair_background_color": "#0079d3",
  "is_nsfw": false,
  "is_spoiler": false,
  "media": null,
  "preview": {"images": [{"source": {"url": "https://preview.redd.it/...", "width": 1200, "height": 630}, "resolutions": ["..."]}]},
  "subreddit_details": {
    "id": "3i9si",
    "fullname": "t5_3i9si",
    "name": "indiehackers",
    "display_name_prefixed": "r/indiehackers",
    "title": "Independent developers building their own way",
    "description": "IndieHackers is a subreddit focused on people who bootstrap their way to success.",
    "sidebar_description": "# Welcome\n\nFull community rules and resources...",
    "subscribers": 140617,
    "active_users": null,
    "created_utc": "2016-09-26T12:05:56+00:00",
    "over_18": false,
    "subreddit_type": "public",
    "lang": "en",
    "icon_img": "https://b.thumbs.redditmedia.com/....png",
    "community_icon": "https://styles.redditmedia.com/....png",
    "banner_background_image": "https://styles.redditmedia.com/....jpg",
    "primary_color": "#0079d3",
    "key_color": "#222222",
    "submission_type": "any",
    "allow_images": true,
    "allow_videos": true,
    "allow_galleries": true,
    "spoilers_enabled": true,
    "wiki_enabled": true,
    "url": "https://www.reddit.com/r/indiehackers/"
  },
  "comments": [
    {
      "id": "nv30qrn",
      "body": "How do people find Matchplan?",
      "author": "scarfwizard",
      "author_id": "t2_5h2k9",
      "author_flair": "Bootstrapper",
      "author_type": "USER",
      "score": 1,
      "depth": 0,
      "child_count": 1,
      "created_utc": "2025-12-20T20:02:18+00:00",
      "edited_at": null,
      "parent_id": "t3_1prkwnx",
      "permalink": "/r/indiehackers/comments/1prkwnx/.../nv30qrn/",
      "is_submitter": false,
      "distinguished": null,
      "is_stickied": false,
      "language_code": "en"
    },
    {
      "id": "nv33n4g",
      "body": "Currently I'm in such an early stage...",
      "author": "ManuelWenner",
      "score": 1,
      "depth": 1,
      "parent_id": "t1_nv30qrn",
      "is_submitter": true
    }
  ],
  "images": [],
  "markdown": "# Product Developer seeking Marketing/Sales co-builder\n\n**2 upvotes** | 8 comments | u/ManuelWenner | 2025-12-20\n\nHey folks...\n\n---\n\n## Comments\n\n**[1] u/scarfwizard** How do people find Matchplan?\n> **[1] u/ManuelWenner (OP)** Currently I'm in such an early stage...\n"
}

Output Fields

Post data: every field the post exposes, including identity (id, title, author, author_fullname, permalink, url, domain), content (selftext, is_self, is_video, media, preview, gallery data, thumbnail, post_hint), engagement (score, ups, upvote_ratio, num_comments, num_crossposts, view_count, gilded, total_awards_received, all_awardings), state (edited, stickied, pinned, locked, archived, distinguished, over_18, spoiler, removed_by_category), flair (link_flair_text + color/css/type), and subreddit context (subreddit, subreddit_id, subreddit_subscribers). Timestamps come as both ISO (created_utc) and epoch (created_utc_epoch).
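Because each post carries both timestamp forms, either one can be turned into a timezone-aware `datetime`. A minimal sketch using the values from the example item; `parse_created` is a hypothetical helper name, and the fallback to the epoch field is an assumption about how you might handle a missing ISO string.

```python
from datetime import datetime, timezone
from typing import Any, Dict


def parse_created(post: Dict[str, Any]) -> datetime:
    """Return the post creation time as a timezone-aware datetime."""
    iso = post.get("created_utc")
    if iso:
        # ISO strings in the example carry an explicit +00:00 offset.
        return datetime.fromisoformat(iso)
    # Fall back to the epoch field, interpreted as UTC seconds.
    return datetime.fromtimestamp(post["created_utc_epoch"], tz=timezone.utc)


post = {"created_utc": "2025-12-20T18:18:21+00:00", "created_utc_epoch": 1766254701}
created = parse_created(post)
# Both representations describe the same instant.
assert int(created.timestamp()) == post["created_utc_epoch"]
```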

subreddit_details: full subreddit profile, including descriptions, subscriber count, icons and banners, theme colors, content policy (submission_type, allow_images/allow_videos/allow_galleries/allow_polls, spoilers_enabled, wiki_enabled), and timestamps.

Comments: flat list with text, author (author, author_id, author_flair, author_type), score, depth (0 = top-level), child_count, timestamps (created_utc, edited_at), parent_id, permalink, is_submitter (marks OP replies), plus distinguished, is_stickied/is_locked/is_removed, removed_by_category, and language_code.

Markdown: Pre-formatted for LLM consumption with nested blockquotes for replies.
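If you need a nested structure rather than the flat comment list, the tree can be rebuilt from `id` and `parent_id`. The sketch below assumes `parent_id` follows the pattern in the example item (`t3_` prefix for the post, `t1_` prefix for a parent comment); `build_comment_tree` and the `replies` key are hypothetical names, not Actor output.

```python
from typing import Any, Dict, List


def build_comment_tree(comments: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Nest a flat comment list into a tree, adding a 'replies' list to each node."""
    # Copy each comment so the original dataset items are not mutated.
    nodes = {c["id"]: {**c, "replies": []} for c in comments}
    roots: List[Dict[str, Any]] = []
    for comment in comments:
        node = nodes[comment["id"]]
        parent_id = comment.get("parent_id") or ""
        # "t1_<id>" points at a parent comment; "t3_<id>" points at the post itself.
        parent_key = parent_id[3:] if parent_id.startswith("t1_") else None
        parent = nodes.get(parent_key)
        if parent is not None:
            parent["replies"].append(node)
        else:
            # Top-level comments, and replies whose parent was not fetched, become roots.
            roots.append(node)
    return roots


flat = [
    {"id": "nv30qrn", "depth": 0, "parent_id": "t3_1prkwnx", "is_submitter": False},
    {"id": "nv33n4g", "depth": 1, "parent_id": "t1_nv30qrn", "is_submitter": True},
]
tree = build_comment_tree(flat)
```

Treating orphaned replies as roots keeps every fetched comment in the result even when `commentsLimit` cuts a thread short.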


Example Inputs

Search Mode (Global)

{
  "searchKeywords": ["indiehacker", "pain points"],
  "searchSort": "relevance",
  "searchLimit": 50,
  "includeComments": true,
  "commentsLimit": 100
}

Subreddit Feed

{
  "subreddits": ["indiehackers", "SaaS", "startups"],
  "subredditSort": "top",
  "subredditTimeFilter": "week",
  "subredditLimit": 100
}

Direct URLs

{
  "postUrls": [
    "https://www.reddit.com/r/indiehackers/comments/abc123/...",
    "https://redd.it/xyz789"
  ],
  "includeComments": true,
  "commentsLimit": 500
}

Combined (All Modes)

{
  "searchKeywords": ["product feedback"],
  "subreddits": ["SaaS"],
  "postUrls": ["https://redd.it/abc123"],
  "includeComments": true,
  "commentsLimit": 100
}

Use Cases

For Product Teams

  • Voice of Customer - Extract feature requests and complaints from product subreddits
  • Competitor Intelligence - Monitor what users say about alternatives
  • Product Validation - Search for demand signals before building

For Marketers

  • Content Research - Find top-performing topics in your niche
  • Lead Generation - Identify users seeking solutions you provide
  • Brand Monitoring - Track mentions and sentiment

For Researchers

  • Qualitative Analysis - Reddit comments as interview transcripts
  • Trend Detection - Early signals from rising posts
  • Sentiment Analysis - Community reactions with depth context

Limitations

  • Pagination cap: Max 1,000 posts per mode, 1,000 comments per post
  • Comment structure: Flat list with depth field (not nested JSON tree)
  • Images: Common sources supported (preview, i.redd.it, imgur, galleries). Videos not downloaded.
  • Rate limits: Handled automatically with exponential backoff

FAQ

Q: Do I need a Reddit account or API key? A: No. The Actor extracts publicly available data without authentication.

Q: How fast is it? A: Very fast. No browser overhead means 100 posts with comments typically complete in under 60 seconds.

Q: Why is the markdown field useful? A: LLMs process it directly without transformation. Perfect for n8n/Make workflows where you connect Actor output → AI node.

Q: Can I scrape private subreddits? A: No. Only public subreddits and posts are accessible.

Q: What if a post is deleted? A: The Actor returns null and continues with other posts.

Q: How do I reduce costs? A: Set includeComments: false if you only need posts (3x cheaper: $0.001 vs $0.003). Comment count doesn't affect price, so there is no need to lower commentsLimit for cost reasons; use it only to reduce processing time.



Support

  • 📧 Email: max@mapa.slmail.me
  • 🐛 Bugs: Use the Issues tab
  • 💡 Feature requests: Email or Issues

Legal

This Actor extracts publicly available Reddit data. Users are responsible for compliance with Reddit's Terms of Service and applicable data protection regulations (GDPR, CCPA).


🚀 Start Extracting Reddit Data Now

Turn Reddit discussions into AI-ready insights in minutes, not hours.
