
URL: https://apify.com/verifiable_clamp/apify-bluesky-scraper

Bluesky Posts & Profiles Scraper with Claude Enrichment · Apify



Bluesky Posts & Profiles Scraper

Under maintenance

Pricing

from $0.00005 / actor start


Scrape Bluesky posts via the AT Protocol public API. Search by query or fetch posts from a list of user handles. Optional Claude-powered sentiment/topic/entity enrichment.


Rating: 0.0 (0)

Developer: Rara21 (maintained by Community)

Actor stats: 0 bookmarked, 2 total users, 1 monthly active user, last modified 21 days ago

Apify Actor that scrapes Bluesky via the public AT Protocol API. Search posts by query, fetch posts from specific authors, and optionally enrich each post with Claude-powered sentiment / topic / entity / summary fields.

No Bluesky account required. The AT Protocol exposes public read endpoints at https://public.api.bsky.app, and this Actor uses only those, so there's no auth setup beyond Apify itself.

What you get per scraped post

Every output item is a flat object with these fields (see src/types.ts for the full Zod schema):

{
"uri":"at://did:plc:abc.../app.bsky.feed.post/3kxyz",
"cid":"bafyrei...",
"url":"https://bsky.app/profile/alice.bsky.social/post/3kxyz",
"text":"Hello Bluesky! …",
"language":["en"],
"author_did":"did:plc:abc...",
"author_handle":"alice.bsky.social",
"author_display_name":"Alice",
"like_count":42,
"repost_count":7,
"reply_count":3,
"quote_count":1,
"created_at":"2026-05-10T12:00:00.000Z",
"indexed_at":"2026-05-10T12:00:01.000Z",
"is_reply":false,
"reply_root_uri":null,
"reply_parent_uri":null,
"has_media":true,
"has_external_link":false,
"has_video":false,
"embed_images":[{"url":"https://...","alt":"An orange sky"}],
"embed_external_url":null,
"embed_external_title":null,
"mentions":["did:plc:..."],
"links":["https://..."],
"hashtags":["atproto"],
"labels":[],
"semantic":{
"sentiment":"positive",
"topics":["climate","policy"],
"entities":[{"name":"COP30","kind":"event"}],
"summary":"Short auto-generated summary."
},
"source_mode":"search",
"source_query":"climate change",
"scraped_at":"2026-05-11T05:14:00.000Z"
}

semantic only appears when enrich_with_claude is on.
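
In the sample above, the url field shares its final path segment with the uri field, which suggests the permalink can be derived from the AT URI plus the author handle. The sketch below shows one way to do that. The function name postUrlFromAtUri is hypothetical, and whether src/transform.ts builds the URL this way is an assumption.

```typescript
// Hypothetical helper (not taken from the Actor's source): build a bsky.app
// permalink from a post's AT URI and the author's handle.
function postUrlFromAtUri(atUri: string, handle: string): string | null {
  // Expected shape: at://<did>/app.bsky.feed.post/<record key>
  const match = /^at:\/\/([^/]+)\/app\.bsky\.feed\.post\/([^/]+)$/.exec(atUri);
  if (!match) {
    // Not a feed post URI (for example a profile or list record).
    return null;
  }
  const recordKey = match[2];
  return `https://bsky.app/profile/${handle}/post/${recordKey}`;
}
```

Returning null for anything that is not a feed post keeps the caller from emitting a broken link.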

Modes

Mode search: by query

{
"mode":"search",
"search_query":"climate change OR climatechange",
"sort":"latest",
"language":"en",
"max_items":500
}

Uses app.bsky.feed.searchPosts under the hood. Supports OR, quoted phrases, and hashtag queries.
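
As a rough illustration of what a search-mode request might look like, the sketch below builds the XRPC URL for app.bsky.feed.searchPosts. The function name buildSearchPostsUrl and the mapping from the Actor's input fields (search_query to q, language to lang) are assumptions, not the Actor's actual code. The q, sort, lang, limit, and cursor parameters come from the endpoint's published lexicon.

```typescript
// Query parameters accepted by app.bsky.feed.searchPosts (subset).
interface SearchParams {
  q: string;
  sort?: "top" | "latest";
  lang?: string;
  limit?: number; // the endpoint accepts 1 to 100 results per page
  cursor?: string;
}

const PUBLIC_API = "https://public.api.bsky.app";

// Hypothetical helper: assemble the request URL for one page of search results.
function buildSearchPostsUrl(params: SearchParams): string {
  const pairs: Array<[string, string]> = [["q", params.q]];
  if (params.sort) pairs.push(["sort", params.sort]);
  if (params.lang) pairs.push(["lang", params.lang]);
  if (params.limit !== undefined) {
    // Clamp to the per-page range the endpoint allows.
    const clamped = Math.min(100, Math.max(1, Math.floor(params.limit)));
    pairs.push(["limit", String(clamped)]);
  }
  if (params.cursor) pairs.push(["cursor", params.cursor]);
  const query = pairs
    .map(([key, value]) => `${key}=${encodeURIComponent(value)}`)
    .join("&");
  return `${PUBLIC_API}/xrpc/app.bsky.feed.searchPosts?${query}`;
}
```

Because the endpoint caps each page at 100 posts, a max_items of 500 would take at least five requests, with the cursor from each response passed into the next call.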

Mode author_feed: by user

{
"mode":"author_feed",
"actors":["bsky.app","atproto.com","alice.bsky.social"],
"author_filter":"posts_no_replies",
"max_items_per_actor":200,
"max_items":1000
}

Calls app.bsky.feed.getAuthorFeed once per actor in the list, with cursor-based pagination.
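
Cursor-based pagination of this kind can be sketched as an async generator. This is a minimal illustration, not the Actor's client code: the names paginateAuthorFeed, FeedPage, and FetchPage are hypothetical, and the page fetcher is injected so that the loop can be exercised without network access. The feed and cursor response fields match the getAuthorFeed response shape.

```typescript
// Shape of one getAuthorFeed response page (only the fields used here).
interface FeedPage<T> {
  feed: T[];
  cursor?: string;
}

// Hypothetical fetcher signature: returns one page for an actor and cursor.
type FetchPage<T> = (actor: string, cursor?: string) => Promise<FeedPage<T>>;

// Yield feed items for one actor until maxItems is reached or pages run out.
async function* paginateAuthorFeed<T>(
  actor: string,
  fetchPage: FetchPage<T>,
  maxItems: number,
): AsyncGenerator<T> {
  let cursor: string | undefined;
  let yielded = 0;
  while (yielded < maxItems) {
    const page = await fetchPage(actor, cursor);
    for (const item of page.feed) {
      if (yielded >= maxItems) return;
      yield item;
      yielded++;
    }
    // A missing cursor or an empty page means there is nothing more to fetch.
    if (!page.cursor || page.feed.length === 0) return;
    cursor = page.cursor;
  }
}
```

Under this sketch, max_items_per_actor would map to maxItems, and the outer loop over the actors list would enforce the global max_items.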

Optional: Claude enrichment

Toggle enrich_with_claude: true and provide an Anthropic API key. Each post then gets a semantic field added before being pushed to the dataset.

You choose which fields to compute (cheaper subsets cost less):

{
"enrich_with_claude":true,
"claude_api_key":"sk-ant-…",
"claude_model":"claude-haiku-4-5",
"enrichment_fields":{
"sentiment":true,
"topics":true,
"entities":false,
"summary":false
}
}

Posts are batched (10 per call) so you pay roughly $0.002 per 10 posts at Haiku 4.5 rates with sentiment + topics on.
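
The batching arithmetic can be sketched as follows. The helper names chunk and estimateEnrichmentCostUsd are hypothetical, and the $0.002 per batch figure is simply the estimate quoted above, not a measured rate.

```typescript
// Split a list into consecutive batches of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (size <= 0) throw new Error("size must be positive");
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Rough per-batch cost quoted in the text (sentiment + topics enabled).
const COST_PER_BATCH_USD = 0.002;

// Estimate enrichment cost: one Claude call per started batch.
function estimateEnrichmentCostUsd(postCount: number, batchSize = 10): number {
  return Math.ceil(postCount / batchSize) * COST_PER_BATCH_USD;
}
```

Under these assumptions, 1,000 posts would mean 100 calls, or about $0.20 in Claude usage on top of the Apify cost.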

If enrichment fails for any reason (rate limit, malformed model response, network), the batch falls through unchanged, so the run never fails because of optional enrichment.
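
This fall-through behaviour could be implemented with a small wrapper like the one below. It is a sketch under assumptions: enrichOrPassThrough and EnrichBatch are hypothetical names, and treating a length mismatch as a malformed response is one possible policy, not necessarily what src/enrichment/claude.ts does.

```typescript
// Hypothetical signature for a function that enriches one batch of posts.
type EnrichBatch<P> = (batch: P[]) => Promise<P[]>;

// Try to enrich a batch; on any failure, return the original batch untouched.
async function enrichOrPassThrough<P>(
  batch: P[],
  enrich: EnrichBatch<P>,
): Promise<P[]> {
  try {
    const enriched = await enrich(batch);
    // Assumption: a result of the wrong length counts as a malformed response.
    return enriched.length === batch.length ? enriched : batch;
  } catch {
    // Rate limits, network errors, and parse errors all land here.
    return batch;
  }
}
```

Since the wrapper never rethrows, the scraping loop can push whatever it returns to the dataset without extra error handling.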

Local development

git clone https://github.com/<your-username>/apify-bluesky-scraper
cd apify-bluesky-scraper
npm install
npm run build
npm test # 27 unit tests, ~2s

Pushing to Apify Store

npm install -g apify-cli
apify login # browser auth
apify push # uploads source + builds the Actor on Apify Cloud

After the build succeeds, open the Actor in Apify Console:

  1. Fill in seoTitle and seoDescription (this is the main discoverability lever; see Apify Store guidance)
  2. Set pricing model: PAY_PER_EVENT recommended at $0.003/post (matches the leading competitor's tier)
  3. Publish under the Publication tab

Cost model (per Apify run)

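| Volume       | Bluesky API calls | Apify compute    | Claude calls (optional) | Total Apify cost |
|--------------|-------------------|------------------|-------------------------|------------------|
| 100 posts    | ~1-2              | 256 MB · ~10s    | 0-10                    | ~$0.001          |
| 1,000 posts  | ~10               | 256 MB · ~60s    | 0-100                   | ~$0.005          |
| 10,000 posts | ~100              | 512 MB · ~10min  | 0-1,000                 | ~$0.05           |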

The Bluesky public API has no documented hard rate limit, but in practice it has tolerated roughly 100 requests/min from a single IP. The Actor's built-in retry+backoff handles 429 responses automatically.
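
Retry with exponential backoff on 429 responses can be sketched as below. This is an illustration only: requestWithBackoff, RetryOptions, and the doubling delay schedule are assumptions, and the real logic in src/bluesky/client.ts may differ (for instance by adding jitter or honouring a Retry-After header).

```typescript
interface RetryOptions {
  maxAttempts: number;
  baseDelayMs: number;
  // Injectable for tests; defaults to a timer-based sleep.
  sleep?: (ms: number) => Promise<void>;
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Call `send` until it returns a non-429 response or attempts run out.
// The delay doubles after each rate-limited attempt: base, 2x base, 4x base...
async function requestWithBackoff<R extends { status: number }>(
  send: () => Promise<R>,
  { maxAttempts, baseDelayMs, sleep = defaultSleep }: RetryOptions,
): Promise<R> {
  let last: R | undefined;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    last = await send();
    if (last.status !== 429) return last;
    // Only wait if another attempt will follow.
    if (attempt < maxAttempts - 1) {
      await sleep(baseDelayMs * 2 ** attempt);
    }
  }
  if (last === undefined) throw new Error("maxAttempts must be at least 1");
  return last; // still rate limited after the final attempt
}
```

Returning the last 429 response instead of throwing leaves the decision to the caller, which might skip the page or end the run gracefully.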

Why this Actor

Bluesky has 30M+ users and the AT Protocol is open, but tooling lags: the leading scraper on Apify Store has fewer than 500 installs. This one is:

  • Fully open: MIT licensed, every transform in src/transform.ts is auditable
  • Test-covered: 27 unit tests with mocked Bluesky responses, no flaky integration suite
  • LLM-ready: optional Claude enrichment makes posts useful for brand monitoring, sentiment dashboards, and RAG ingestion without an additional pipeline
  • Cheap by default: pay-per-event pricing means small runs cost cents, not dollars

Project structure

.actor/
├── actor.json          # Apify Actor metadata (categories, dataset views, memory limits)
├── input_schema.json   # Console UI input form definition
└── Dockerfile          # Apify Cloud build
src/
├── main.ts             # Actor entry: orchestrates search/feed → transform → push
├── input.ts            # Zod-validated Input schema mirroring input_schema.json
├── types.ts            # ScrapedPost output schema
├── transform.ts        # BskyPostView → ScrapedPost mapper (handles embeds, facets, reposts)
├── bluesky/
│   ├── client.ts       # XRPC fetch client with retry+backoff and paginated iterators
│   └── types.ts        # Bluesky response shapes
└── enrichment/
    └── claude.ts       # Optional batched Claude enrichment
test/
├── fixtures.ts         # Sample Bluesky responses (plain post, reply, image, link, mention, repost)
├── transform.test.ts   # 12 tests
├── client.test.ts      # 9 tests
└── input.test.ts       # 6 tests

License

MIT. See LICENSE.
