Pricing
Pay per usage
Lemmy Scraper
Scrape posts, comments, communities and search results from any Lemmy instance via the official API. Clean structured data (JSON/CSV), no login required.
Scrape posts, comments, communities, and search results from any Lemmy instance: fast, structured, and auth-free. Built on the official public Lemmy API v3 (ActivityPub / fediverse), so it works on lemmy.world or any other decentralized server.
What it does
Lemmy Scraper turns the public Lemmy REST API into clean, flat dataset rows you can export to JSON, CSV, or Excel. Point it at any instance, pick a mode, and collect data from the fediverse without logging in.
Four modes:
- Community posts: every post in a community, sorted by Hot / Active / New / Top.
- Search: keyword search across the instance.
- Community info: metadata for a single community (subscribers, post count, description).
- Post comments: all comments under a specific post.
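As a rough illustration of how the four modes could map onto Lemmy API v3 routes, the sketch below builds the GET URL for each mode. The endpoint paths and snake_case query parameters follow the public Lemmy v3 API as commonly documented; the mapping itself, the `ENDPOINTS` table, and the `build_url` helper are assumptions made for illustration, not the actor's confirmed internals.

```python
from urllib.parse import urlencode

# Assumed mapping from actor modes to Lemmy API v3 paths (illustrative only).
ENDPOINTS = {
    "community_posts": "/api/v3/post/list",
    "search": "/api/v3/search",
    "community_info": "/api/v3/community",
    "post_comments": "/api/v3/comment/list",
}


def build_url(mode, instance="lemmy.world", **params):
    """Return the GET URL for a mode; parameters set to None are dropped."""
    if mode not in ENDPOINTS:
        raise ValueError(f"unknown mode: {mode}")
    query = urlencode({k: v for k, v in params.items() if v is not None})
    url = f"https://{instance}{ENDPOINTS[mode]}"
    return f"{url}?{query}" if query else url


# Hypothetical usage: first page of Hot posts in the "technology" community.
print(build_url("community_posts", community_name="technology",
                sort="Hot", limit=50, page=1))
```

Because every mode is a plain unauthenticated GET, switching instances only changes the host part of the URL.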
Features
- Works on any Lemmy instance: just change the `instance` input.
- No authentication, no cookies, no tokens: pure public API reads.
- Flattened output: clean columns, not raw nested JSON blobs.
- Automatic pagination with a `maxItems` cap.
- Polite request pacing plus automatic retries on transient HTTP errors.
- Pay-per-result pricing friendly (PPE `item-scraped` events).
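The pagination, pacing, and retry behaviour listed above can be sketched roughly as follows. This is a minimal sketch, not the actor's code: the set of retryable status codes, the exponential backoff, the pause length, and the names `HttpError`, `with_retries`, and `collect` are all assumptions. `fetch_page` stands in for whatever function performs the actual HTTP request.

```python
import time

TRANSIENT = {429, 500, 502, 503, 504}  # assumed set of retryable statuses


class HttpError(Exception):
    """Hypothetical error type carrying the HTTP status of a failed request."""

    def __init__(self, status):
        super().__init__(f"HTTP {status}")
        self.status = status


def with_retries(call, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Run call(); retry transient failures with exponential backoff."""
    for attempt in range(attempts):
        try:
            return call()
        except HttpError as err:
            # Give up on non-transient errors or when attempts are exhausted.
            if err.status not in TRANSIENT or attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)


def collect(fetch_page, max_items=1000, page_size=50, pause=0.5, sleep=time.sleep):
    """Page through results until max_items is reached or a page comes back empty."""
    items, page = [], 1
    while len(items) < max_items:
        batch = with_retries(lambda: fetch_page(page, page_size), sleep=sleep)
        if not batch:
            break
        items.extend(batch[: max_items - len(items)])  # never exceed the cap
        page += 1
        sleep(pause)  # polite pacing between requests
    return items
```

Passing `sleep` as a parameter keeps the sketch easy to exercise without real delays; in normal use the defaults apply.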
Input
| Field | Type | Description |
|---|---|---|
| `mode` | enum | `community_posts`, `search`, `community_info`, or `post_comments`. |
| `instance` | string | Lemmy host, e.g. `lemmy.world` (default). |
| `communityName` | string | Community to scrape: `name` or `name@instance.tld`. Required for community modes. |
| `query` | string | Search keywords. Required for `search` mode. |
| `postId` | integer | Post ID. Required for `post_comments` mode. |
| `sort` | enum | `Hot`, `Active`, `New`, `TopDay`, `TopWeek` (community posts). |
| `maxItems` | integer | Max rows to collect (default 1000). |
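The per-mode requirements in the table can be checked before a run with a small validator. The sketch below is illustrative only: `normalize_input`, `REQUIRED`, and `DEFAULTS` are hypothetical names, and reading "community modes" as both `community_posts` and `community_info` is an assumption.

```python
# Required fields per mode, as read from the input table (assumed interpretation).
REQUIRED = {
    "community_posts": ["communityName"],
    "community_info": ["communityName"],
    "search": ["query"],
    "post_comments": ["postId"],
}

# Defaults stated in the input table.
DEFAULTS = {"instance": "lemmy.world", "maxItems": 1000}


def normalize_input(raw):
    """Apply defaults and verify that the fields the chosen mode needs are present."""
    cfg = {**DEFAULTS, **raw}
    mode = cfg.get("mode")
    if mode not in REQUIRED:
        raise ValueError(f"mode must be one of {sorted(REQUIRED)}")
    missing = [f for f in REQUIRED[mode] if cfg.get(f) in (None, "")]
    if missing:
        raise ValueError(f"{mode} mode requires: {', '.join(missing)}")
    return cfg


# Hypothetical usage: a search run that relies on the default instance and cap.
print(normalize_input({"mode": "search", "query": "open source"}))
```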
Example input
{"mode":"community_posts","instance":"lemmy.world","communityName":"technology","sort":"Hot","maxItems":500}
Output example
Each post becomes one flat row:
{"id":12345678,"title":"Open-source project hits 1.0","body":"Release notes inside...","url":"https://example.com/release","creatorName":"dev_user","creatorActorId":"https://lemmy.world/u/dev_user","communityName":"technology","score":842,"upvotes":870,"downvotes":28,"commentsCount":134,"published":"2026-06-20T14:03:11.000Z","postUrl":"https://lemmy.world/post/12345678"}
Comments and community-info modes produce their own flat schemas (content, creator, score, subscribers, etc.).
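To illustrate what "flat" means here, the sketch below converts one nested post record into a row with the column names from the example above. The nested input shape (`post`, `creator`, `community`, `counts`) is an assumption based on typical Lemmy v3 responses and may differ between Lemmy versions; `flatten_post` is a hypothetical helper, not the actor's actual mapping.

```python
def flatten_post(view, instance="lemmy.world"):
    """Flatten one nested post record (assumed shape) into a single-level row."""
    post, creator = view["post"], view["creator"]
    community, counts = view["community"], view["counts"]
    return {
        "id": post["id"],
        "title": post["name"],          # assumed: the API calls the title "name"
        "body": post.get("body"),       # optional: link posts may have no body
        "url": post.get("url"),         # optional: text posts may have no link
        "creatorName": creator["name"],
        "creatorActorId": creator["actor_id"],
        "communityName": community["name"],
        "score": counts["score"],
        "upvotes": counts["upvotes"],
        "downvotes": counts["downvotes"],
        "commentsCount": counts["comments"],
        "published": post["published"],
        "postUrl": f"https://{instance}/post/{post['id']}",
    }
```

A row in this form has no nested objects, so it can be written directly with `csv.DictWriter` or loaded into a spreadsheet.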
Use cases
- OSINT & research: monitor communities and discussions across the fediverse.
- Journalism: track emerging stories and public sentiment on decentralized platforms.
- Brand monitoring: find mentions of your product or company via search mode.
- AI / ML training data: collect open social text and threaded discussions at scale.
Why this actor
Lemmy exposes a stable, official public REST API backed by ActivityPub. This scraper talks to that API directly instead of relying on fragile HTML parsing, so it keeps working through UI changes and runs against any Lemmy server in the fediverse. No login, no rate-limit gymnastics, no brittle selectors: just structured decentralized social data.
Keywords: Lemmy scraper, Lemmy API, scrape Lemmy, ActivityPub, fediverse data, decentralized social, Lemmy posts, Lemmy comments, federated Reddit alternative.
