URL: https://apify.com/junipr/bluesky-scraper

Bluesky Scraper - Posts, Profiles & Search via AT Protocol

Pricing

from $1.60 / 1,000 items scraped

Scrape Bluesky posts, profiles, threads, and search results via AT Protocol. Extract text, engagement metrics, media, and author data. 4 modes: feed, profile, search, thread. Export JSON/CSV. No auth required.

Rating: 0.0 (0)

Developer: junipr (Maintained by Community)

Actor stats: 0 bookmarked, 2 total users, 0 monthly active users, last modified a month ago

Bluesky Scraper

What does Bluesky Scraper do?

Bluesky Scraper extracts posts, profiles, search results, and full threads from the Bluesky social network using the public AT Protocol API. It collects structured data including post text, engagement metrics (likes, reposts, replies, quotes), embedded media, author information, and direct web URLs. No login or authentication is required since the actor uses Bluesky's public API endpoints.

Whether you need to monitor brand mentions, analyze trending topics, research competitor activity, or build a dataset of public Bluesky content, this actor handles pagination automatically and delivers clean, structured JSON output ready for analysis or integration.

Features

  • Four scrape modes: Fetch user posts, profiles, search results, or entire post threads
  • Batch handle support: Scrape multiple Bluesky accounts in a single run
  • Full engagement metrics: Likes, reposts, replies, and quote counts for every post
  • Embedded content detection: Identifies images, links, videos, and quoted records with URLs
  • Profile data: Follower/following counts, post counts, bios, avatar and banner URLs
  • Content labels: Captures moderation labels applied to posts
  • Language detection: Returns the primary language tag for each post
  • Direct web URLs: Every post includes a clickable bsky.app link
  • Configurable pagination: Set max results from 1 to 10,000 per handle or query
  • Rate limit protection: Adjustable request delay to avoid API throttling
  • Pay-per-event pricing: Only pay for the items you actually scrape

Input Configuration

{
  "scrapeType": "posts",
  "handles": ["bsky.app", "jay.bsky.team"],
  "searchQuery": "",
  "threadUri": "",
  "maxResults": 100,
  "includeReplies": true,
  "includeReposts": true,
  "maxConcurrency": 5,
  "requestDelay": 500
}

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| scrapeType | string | "posts" | What to scrape: posts, profile, search, or thread |
| handles | string[] | ["bsky.app"] | Bluesky handles to scrape (for posts and profile modes) |
| searchQuery | string | "" | Search query (for search mode only) |
| threadUri | string | "" | AT Protocol URI of the thread (for thread mode only) |
| maxResults | integer | 100 | Maximum items to scrape per handle/query (1-10,000) |
| includeReplies | boolean | true | Include replies in the user feed (posts mode) |
| includeReposts | boolean | true | Include reposts in the user feed (posts mode) |
| maxConcurrency | integer | 5 | Maximum concurrent API requests (1-10) |
| requestDelay | integer | 500 | Delay between requests in milliseconds (minimum 100) |
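
The ranges and defaults in the parameter list can be checked locally before a run is started. The following is a minimal sketch, not part of the actor: `build_input` is a hypothetical helper, and the mode-specific checks (a non-empty query for search mode, an `at://` prefix for thread mode) are assumptions inferred from the descriptions rather than documented validation rules.

```python
import copy
from typing import Any

# Defaults and limits copied from the parameter list in this README.
DEFAULTS: dict[str, Any] = {
    "scrapeType": "posts",
    "handles": ["bsky.app"],
    "searchQuery": "",
    "threadUri": "",
    "maxResults": 100,
    "includeReplies": True,
    "includeReposts": True,
    "maxConcurrency": 5,
    "requestDelay": 500,
}
SCRAPE_TYPES = {"posts", "profile", "search", "thread"}


def build_input(**overrides: Any) -> dict[str, Any]:
    """Merge overrides into the defaults and check the documented ranges."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    cfg = {**copy.deepcopy(DEFAULTS), **overrides}

    if cfg["scrapeType"] not in SCRAPE_TYPES:
        raise ValueError(f"scrapeType must be one of {sorted(SCRAPE_TYPES)}")
    if not 1 <= cfg["maxResults"] <= 10_000:
        raise ValueError("maxResults must be between 1 and 10,000")
    if not 1 <= cfg["maxConcurrency"] <= 10:
        raise ValueError("maxConcurrency must be between 1 and 10")
    if cfg["requestDelay"] < 100:
        raise ValueError("requestDelay must be at least 100 ms")

    # Assumed mode-specific requirements (not stated as rules in the README).
    mode = cfg["scrapeType"]
    if mode in {"posts", "profile"} and not cfg["handles"]:
        raise ValueError(f"{mode} mode needs at least one handle")
    if mode == "search" and not cfg["searchQuery"]:
        raise ValueError("search mode needs a searchQuery")
    if mode == "thread" and not cfg["threadUri"].startswith("at://"):
        raise ValueError("thread mode needs an at:// threadUri")
    return cfg
```

The returned dictionary has the same shape as the JSON input shown above, so it can be serialized with `json.dumps` and pasted into the actor's input editor.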

Output Format

Each post item in the dataset looks like this:

{
  "type": "post",
  "uri": "at://did:plc:abc123/app.bsky.feed.post/xyz789",
  "cid": "bafyreiabc123",
  "author": {
    "handle": "bsky.app",
    "displayName": "Bluesky",
    "avatar": "https://cdn.bsky.app/img/avatar/plain/..."
  },
  "text": "Hello, Bluesky!",
  "createdAt": "2025-01-15T12:00:00.000Z",
  "likeCount": 42,
  "repostCount": 10,
  "replyCount": 5,
  "quoteCount": 3,
  "embedType": "image",
  "embedUrl": "https://cdn.bsky.app/img/feed_fullsize/...",
  "labels": [],
  "language": "en",
  "url": "https://bsky.app/profile/bsky.app/post/xyz789"
}
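
The `url` field in the sample appears to be built from the author handle and the last segment of `uri` (the record key). The sketch below reproduces that relationship and sums the four engagement counters. Both function names are hypothetical, and the URL rule is inferred from the sample item, not taken from the actor's source.

```python
def post_web_url(handle: str, at_uri: str) -> str:
    """Build a bsky.app link from an author handle and a post's AT URI."""
    prefix = "at://"
    if not at_uri.startswith(prefix):
        raise ValueError("expected an at:// URI")
    # Expected layout: at://<did>/app.bsky.feed.post/<record key>
    parts = at_uri[len(prefix):].split("/")
    if len(parts) != 3 or parts[1] != "app.bsky.feed.post":
        raise ValueError("expected an app.bsky.feed.post record URI")
    return f"https://bsky.app/profile/{handle}/post/{parts[2]}"


def total_engagement(post: dict) -> int:
    """Sum likes, reposts, replies, and quotes; missing counters count as 0."""
    counters = ("likeCount", "repostCount", "replyCount", "quoteCount")
    return sum(int(post.get(name, 0)) for name in counters)


sample = {
    "uri": "at://did:plc:abc123/app.bsky.feed.post/xyz789",
    "author": {"handle": "bsky.app"},
    "likeCount": 42,
    "repostCount": 10,
    "replyCount": 5,
    "quoteCount": 3,
}
link = post_web_url(sample["author"]["handle"], sample["uri"])
score = total_engagement(sample)
```

For the sample item, `link` matches the `url` field shown above and `score` is 60.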

Profile items return:

{
  "type": "profile",
  "handle": "bsky.app",
  "did": "did:plc:abc123",
  "displayName": "Bluesky",
  "description": "The official Bluesky account.",
  "avatar": "https://cdn.bsky.app/img/avatar/plain/...",
  "banner": "https://cdn.bsky.app/img/banner/plain/...",
  "followersCount": 100000,
  "followsCount": 500,
  "postsCount": 1200,
  "createdAt": "2023-04-01T00:00:00.000Z"
}
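
Post items contain a nested `author` object and a `labels` list, so they need flattening before they fit a spreadsheet. The following sketch shows one way to do that locally with the standard library. The dotted column names (`author.handle`) and the `;` separator for lists are choices made here for illustration and may differ from what Apify's own CSV export produces.

```python
import csv
import io


def flatten(item: dict, parent: str = "") -> dict:
    """Flatten nested dicts into dotted keys; join lists with ';'."""
    flat = {}
    for key, value in item.items():
        name = f"{parent}.{key}" if parent else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        elif isinstance(value, list):
            flat[name] = ";".join(str(v) for v in value)
        else:
            flat[name] = value
    return flat


def to_csv(items: list[dict]) -> str:
    """Render dataset items as CSV text with one column per flattened key."""
    rows = [flatten(item) for item in items]
    # Union of all columns, in first-seen order, so posts and profiles can mix.
    columns = list(dict.fromkeys(col for row in rows for col in row))
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Passing a list that mixes post and profile items yields one wide table in which fields that do not apply to a row are left empty.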

Usage Examples / Use Cases

  • Brand monitoring: Track mentions of your brand or product by searching for keywords across all public Bluesky posts
  • Influencer research: Pull profile data and post history for multiple accounts to compare follower counts, engagement rates, and posting frequency
  • Content analysis: Collect posts about a trending topic to analyze sentiment, language distribution, and media usage patterns
  • Thread archiving: Save complete threads with all replies for research, compliance, or record-keeping
  • Competitive intelligence: Monitor competitor accounts to track their messaging, engagement, and posting cadence
  • Academic research: Build structured datasets of public social media content for linguistic, sociological, or network analysis

FAQ

Does this actor require a Bluesky account or login?

No. Bluesky Scraper uses the public AT Protocol API (public.api.bsky.app), which does not require authentication. All data accessed is publicly available content.

What is the thread URI format?

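Thread URIs follow the AT Protocol format: at://did:plc:XXXXXXX/app.bsky.feed.post/XXXXXXX, where the first part is the author's DID and the last part is the post's record key. To build one from a bsky.app post URL (https://bsky.app/profile/<handle>/post/<record key>), keep the record key and replace the handle with the account's DID. Alternatively, use the posts scrape mode to collect URIs first.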
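
The conversion from a bsky.app URL to an AT URI can be sketched as follows. The function names are hypothetical. `resolve_handle` assumes the `com.atproto.identity.resolveHandle` XRPC method is reachable on public.api.bsky.app without authentication, and it makes a network request, so the resolver is injectable for offline use.

```python
import json
from typing import Callable
from urllib.parse import urlencode, urlparse
from urllib.request import urlopen


def parse_post_url(url: str) -> tuple[str, str]:
    """Split https://bsky.app/profile/<actor>/post/<rkey> into (actor, rkey)."""
    parsed = urlparse(url)
    parts = parsed.path.strip("/").split("/")
    if (parsed.netloc != "bsky.app" or len(parts) != 4
            or parts[0] != "profile" or parts[2] != "post"):
        raise ValueError("not a bsky.app post URL")
    return parts[1], parts[3]


def resolve_handle(handle: str) -> str:
    """Look up the DID for a handle (network call; endpoint is an assumption)."""
    query = urlencode({"handle": handle})
    endpoint = "https://public.api.bsky.app/xrpc/com.atproto.identity.resolveHandle"
    with urlopen(f"{endpoint}?{query}", timeout=10) as response:
        return json.load(response)["did"]


def to_at_uri(url: str, resolve: Callable[[str], str] = resolve_handle) -> str:
    """Convert a bsky.app post URL into an at:// URI for thread mode."""
    actor, rkey = parse_post_url(url)
    # Profile URLs may already contain a DID instead of a handle.
    did = actor if actor.startswith("did:") else resolve(actor)
    return f"at://{did}/app.bsky.feed.post/{rkey}"
```

The result can be used as the `threadUri` input when `scrapeType` is set to `thread`.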

How many posts can I scrape?

You can scrape up to 10,000 items per handle or search query in a single run. For larger datasets, run the actor multiple times with different handles or queries. The actor handles pagination automatically.
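
Automatic pagination over a cursor-based API generally follows the pattern below. This is a generic sketch, not the actor's code: `paginate` and `fetch_page` are hypothetical names, and `fetch_page` stands in for whatever function requests one page and returns its items together with the next cursor (or `None` when no pages remain).

```python
from typing import Callable, Iterator, Optional

# One page of results: the items plus the cursor for the next page, if any.
Page = tuple[list[dict], Optional[str]]


def paginate(fetch_page: Callable[[Optional[str]], Page],
             max_results: int) -> Iterator[dict]:
    """Yield items page by page until max_results or the last page is reached."""
    cursor: Optional[str] = None
    yielded = 0
    while yielded < max_results:
        items, cursor = fetch_page(cursor)
        for item in items:
            yield item
            yielded += 1
            if yielded >= max_results:
                return
        # Stop when the API reports no further pages or returns an empty page.
        if not cursor or not items:
            return


# Usage with an in-memory stand-in for the API.
fake_pages = {
    None: ([{"n": 1}, {"n": 2}], "a"),
    "a": ([{"n": 3}, {"n": 4}], "b"),
    "b": ([{"n": 5}], None),
}
first_three = list(paginate(lambda c: fake_pages[c], max_results=3))
everything = list(paginate(lambda c: fake_pages[c], max_results=10_000))
```

With the stand-in data, `first_three` stops partway through the second page, while `everything` ends after five items because the last page carries no cursor.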

Will I get rate limited?

The actor includes a configurable requestDelay (default 500ms) to stay within Bluesky's API rate limits. If you experience rate limiting, increase the delay to 1000ms or higher. The public API is generous but does enforce limits on rapid requests.
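
A fixed delay combined with backoff on rate-limit errors might look like the sketch below. It illustrates the general idea only. `RateLimited`, `call_with_delay`, and the doubling strategy are assumptions made here, and how the actor itself reacts to throttling is not documented.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical error raised by `request` when the API answers HTTP 429."""


def call_with_delay(request: Callable[[], T], delay_ms: int = 500,
                    max_retries: int = 3,
                    sleep: Callable[[float], None] = time.sleep) -> T:
    """Wait before each attempt and double the wait after a rate-limit error."""
    delay = delay_ms / 1000
    for attempt in range(max_retries + 1):
        sleep(delay)
        try:
            return request()
        except RateLimited:
            if attempt == max_retries:
                raise  # Retries exhausted; let the caller handle it.
            delay *= 2
    raise AssertionError("unreachable")
```

Passing a custom `sleep` makes the waiting behavior easy to test without actually pausing.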

Can I filter out replies and reposts?

Yes. Set includeReplies to false to exclude replies and includeReposts to false to exclude reposts. This only applies to posts mode when scraping a user's feed.
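
For readers working with raw API responses instead of the actor's output, the same filtering can be approximated as below. The sketch assumes the feed item shape returned by `app.bsky.feed.getAuthorFeed`, where a reply carries a `reply` reference in `post.record` and a repost carries a `reason` of type `app.bsky.feed.defs#reasonRepost`. Whether the actor filters this way internally is not stated in this README.

```python
REPOST_REASON = "app.bsky.feed.defs#reasonRepost"


def is_reply(feed_item: dict) -> bool:
    """A post record that references a parent post is a reply."""
    return "reply" in feed_item.get("post", {}).get("record", {})


def is_repost(feed_item: dict) -> bool:
    """A feed item with a repost reason was reposted, not authored, by the user."""
    return feed_item.get("reason", {}).get("$type") == REPOST_REASON


def filter_feed(feed: list[dict], include_replies: bool = True,
                include_reposts: bool = True) -> list[dict]:
    """Drop replies and/or reposts, mirroring includeReplies and includeReposts."""
    return [
        item for item in feed
        if (include_replies or not is_reply(item))
        and (include_reposts or not is_repost(item))
    ]
```

With both flags left at `True` the feed is returned unchanged, which matches the defaults in the input configuration.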
