Pricing
Pay per usage
Farcaster Hub Scraper
Protocol-native Farcaster data ingestion for research, analytics, and social graph analysis. Collect casts, reactions, follows, user profiles, and real-time events directly from Farcaster Hubs via HTTP API.
Features
- Protocol-First Design - Direct Hub HTTP API integration (no third-party dependencies)
- Three Ingestion Modes - Deterministic backfill by FIDs, time-bounded studies, or incremental event tailing
- Comprehensive Data - Casts, reactions (likes/recasts), follows, user profiles, and events
- Optional Enrichment - Parse Frames/Mini-Apps metadata from embedded URLs
- State Checkpointing - Migration-safe, resumable runs with automatic state persistence
- Rate Limiting & Retries - Production-grade reliability with exponential backoff
- Neynar v2 Support - Optional integration with Neynar hosted hubs
- Multiple Views - Pre-configured dataset views for easy data exploration
Who Uses This Actor?
Target Users
Web3 Data Analysts & Researchers (Dune, Flipside)
- Export Farcaster data to SQL databases for analytics dashboards
- Track protocol growth, user engagement trends, and network effects
- Cross-reference social data with onchain transactions
Farcaster Frame/Mini-App Developers
- Monitor Frame engagement and interaction patterns
- Track which users interact with your Mini-Apps
- Analyze viral content and user acquisition funnels
Web3 Marketing Agencies & Brands
- Track influencer campaigns and brand mentions
- Measure content reach and engagement rates
- Identify key opinion leaders in the Farcaster ecosystem
Academic Researchers
- Study decentralized social network dynamics
- Analyze information diffusion and community formation
- Research Web3 social graph topology
Use Cases by Persona
For Data Analysts
Influencer Ranking Dashboard
{"mode":"byFids","fids":[2,3,6833,5650,7890],"include":{"casts":true,"reactions":true,"userData":true},"maxRecords":50000}
→ Export to Dune to calculate engagement rates, follower growth, content velocity
Protocol Growth Metrics
{"mode":"tailEvents","maxRecords":100000}
→ Stream all events to track daily active users, network growth, retention
For Frame Developers
Frame Interaction Analysis
{"mode":"byFids","fids":[list of users who interacted],"include":{"casts":true,"reactions":true},"fetchEmbeds":true}
→ Identify which casts contain your Frame, track engagement patterns
Real-Time Frame Monitoring
{"mode":"tailEvents","tail":{"fromEventId":"latest"},"maxRecords":10000}
→ Get notified when users interact with your Frames in real-time
For Marketing Agencies
Campaign Performance Tracking
{"mode":"byTime","fids":[brand_account, influencer1, influencer2],"startTimestamp":130000000,"stopTimestamp":130100000,"include":{"casts":true,"reactions":true}}
→ Measure campaign reach during specific time window
Influencer Discovery
{"mode":"byFids","fids":[competitor_followers],"include":{"links":true,"userData":true,"reactions":true}}
→ Find high-engagement users in target communities
For Researchers
Social Network Topology Study
{"mode":"byFids","discoverFids":true,"shardIds":[0,1,2],"include":{"links":true,"userData":true},"maxRecords":500000}
→ Build complete follow graph for network analysis
Information Diffusion Analysis
{"mode":"byTime","fids":[seed_users],"startTimestamp":100000000,"stopTimestamp":100500000,"include":{"casts":true,"reactions":true}}
→ Track how content spreads through the network over time
Quick Start
Basic Example: Backfill by FIDs
{"hubBaseUrl":"https://hub.pinata.cloud","mode":"byFids","fids":[2,3,6833],"include":{"casts":true,"reactions":true,"links":true,"userData":true},"pageSize":1000,"maxRecords":10000}
Time-Bounded Study
{"hubBaseUrl":"https://hub.pinata.cloud","mode":"byTime","fids":[2,3],"startTimestamp":100000000,"stopTimestamp":100050000,"include":{"casts":true,"reactions":true}}
Real-Time Event Tail
{"hubBaseUrl":"https://hub.pinata.cloud","mode":"tailEvents","tail":{"fromEventId":"0","shardIndex":0},"maxRecords":1000}
Auto-Discover FIDs via Shard Scan
{"hubBaseUrl":"https://hub.pinata.cloud","mode":"byFids","discoverFids":true,"shardIds":[0,1],"include":{"casts":true,"userData":true},"maxRecords":5000}
With Frame/Mini-App Metadata Parsing
{"hubBaseUrl":"https://hub.pinata.cloud","mode":"byFids","fids":[2],"fetchEmbeds":true,"maxEmbedsPerRun":100,"proxy":"RESIDENTIAL","include":{"casts":true}}
Input Configuration
Required Fields
| Field | Type | Description | Default |
|---|---|---|---|
| `hubBaseUrl` | string | HTTP endpoint of Farcaster Hub | https://hub.pinata.cloud |
| `mode` | enum | Ingestion mode: byFids, byTime, tailEvents | byFids |
Mode-Specific Fields
By FIDs Mode
| Field | Type | Description | Default |
|---|---|---|---|
| `fids` | array<integer> | List of Farcaster IDs to scrape | [] |
| `discoverFids` | boolean | Auto-discover FIDs via shard scan | false |
| `shardIds` | array<integer> | Shard IDs to scan when discovering | [] |
By Time Mode
| Field | Type | Description | Default |
|---|---|---|---|
| `fids` | array<integer> | FIDs to scrape (required) | [] |
| `startTimestamp` | integer | Start time (Farcaster epoch seconds) | - |
| `stopTimestamp` | integer | Stop time (Farcaster epoch seconds) | - |
Tail Events Mode
| Field | Type | Description | Default |
|---|---|---|---|
| `tail.fromEventId` | string | Start from event ID (empty = start from 0) | "0" |
| `tail.shardIndex` | integer | Shard index to tail (optional) | - |
Entity Filters
| Field | Type | Description | Default |
|---|---|---|---|
| `include.casts` | boolean | Include cast messages | true |
| `include.reactions` | boolean | Include reactions (likes/recasts) | true |
| `include.links` | boolean | Include follows | true |
| `include.userData` | boolean | Include user profiles | true |
Optional Features
| Field | Type | Description | Default |
|---|---|---|---|
| `fetchEmbeds` | boolean | Parse embedded URLs for Frames/Mini-Apps | false |
| `maxEmbedsPerRun` | integer | Max embeds to fetch per run | 500 |
| `neynarApiKey` | string | Neynar v2 API key (optional) | - |
| `clientApi` | boolean | Enable Farcaster Client API (experimental) | false |
| `proxy` | string | Apify Proxy groups or custom URL | - |
Performance & Limits
| Field | Type | Description | Default |
|---|---|---|---|
| `pageSize` | integer | Records per page (max 1000) | 1000 |
| `maxRecords` | integer | Stop after N records (safety limit) | - |
| `requestPerMinute` | integer | Rate limit for Hub API calls | 600 |
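It can help to check an input object against the tables above before starting a run. The following is a minimal Python sketch, not the actor's own validation: `validate_input` is a hypothetical helper, and the rule that `byFids` needs either `fids` or `discoverFids` is inferred from the field descriptions rather than stated by the actor.

```python
from typing import List

# Modes listed in the Required Fields table.
VALID_MODES = {"byFids", "byTime", "tailEvents"}


def validate_input(cfg: dict) -> List[str]:
    """Return a list of problems; an empty list means the input looks valid."""
    errors: List[str] = []

    mode = cfg.get("mode", "byFids")
    if mode not in VALID_MODES:
        errors.append(f"mode must be one of {sorted(VALID_MODES)}, got {mode!r}")

    # pageSize is documented as "max 1000".
    page_size = cfg.get("pageSize", 1000)
    if not isinstance(page_size, int) or not 1 <= page_size <= 1000:
        errors.append("pageSize must be an integer between 1 and 1000")

    fids = cfg.get("fids", [])
    if mode == "byTime" and not fids:
        errors.append("byTime mode requires a non-empty fids list")
    # Assumption: with no fids and no discovery there is nothing to scrape.
    if mode == "byFids" and not fids and not cfg.get("discoverFids", False):
        errors.append("byFids mode needs fids or discoverFids=true")

    start, stop = cfg.get("startTimestamp"), cfg.get("stopTimestamp")
    if start is not None and stop is not None and start > stop:
        errors.append("startTimestamp must not be after stopTimestamp")
    return errors
```

Calling `validate_input({"mode": "byFids", "fids": [2, 3]})` returns an empty list, while a `byTime` input without `fids` returns one error message.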
Output Schema
The actor produces normalized entities with the following types:
Cast Entity
{"entity_type":"cast","fid":2,"hash":"0x1234567890abcdef","ts":127477800,"ts_iso":"2025-01-15T10:30:00.000Z","text":"Hello Farcaster!","mentions":[3,6833],"parent":{"castId":{"fid":2,"hash":"0xabc..."}},"embeds":{"urls":["https://example.com"],"castIds":[]},"derived":{"urls":["https://example.com"],"frame_meta":{"name":"My App","url":"https://app.example.com"}},"ingest_source":"hub_http","ingest_ts":"2025-01-15T10:31:00.000Z","raw":{/* original Hub message */}}
Reaction Entity
{"entity_type":"reaction","fid":3,"type":"like","target":{"castId":{"fid":2,"hash":"0x1234..."}},"ts":127477860,"ts_iso":"2025-01-15T10:31:00.000Z","hash":"0xabcd...","ingest_source":"hub_http","ingest_ts":"2025-01-15T10:32:00.000Z","raw":{/* original Hub message */}}
Link Entity (Follow)
{"entity_type":"link","fid":3,"targetFid":2,"type":"follow","ts":127477920,"ts_iso":"2025-01-15T10:32:00.000Z","hash":"0xdef...","ingest_source":"hub_http","ingest_ts":"2025-01-15T10:33:00.000Z","raw":{/* original Hub message */}}
User Data Entity
{"entity_type":"user_data","fid":2,"username":"vitalik.eth","display":"Vitalik","pfp":"https://example.com/pfp.png","bio":"Ethereum co-founder","url":"https://vitalik.ca","location":"Singapore","github":"vbuterin","twitter":"VitalikButerin","ts":127477980,"ts_iso":"2025-01-15T10:33:00.000Z","ingest_source":"hub_http","ingest_ts":"2025-01-15T10:34:00.000Z","raw":[/* original Hub messages */]}
Event Entity (Tail Mode)
{"entity_type":"event","event_id":"12345","event_type":"MERGE_MESSAGE","ts":127478040,"ts_iso":"2025-01-15T10:34:00.000Z","shard_index":0,"message":{/* hydrated message if MERGE_MESSAGE */},"ingest_source":"hub_http","ingest_ts":"2025-01-15T10:35:00.000Z","raw":{/* original Hub event */}}
Farcaster Timestamps
Important: Farcaster uses a custom epoch starting at 2021-01-01T00:00:00.000Z.
- All entities include both `ts` (Farcaster epoch seconds) and `ts_iso` (ISO 8601) fields
- Use `ts_iso` for human-readable timestamps and data analysis
- Use `ts` for filtering Hub API requests
Example conversion:
- Farcaster epoch `100000000` = `2024-03-03T09:46:40.000Z`
- Current time: `isoToFarcasterEpoch(new Date().toISOString())`
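For downstream analysis outside the actor, the conversion is a fixed offset from 2021-01-01T00:00:00Z. Here is a small Python sketch; the function names `farcaster_to_iso` and `iso_to_farcaster` are illustrative and not part of the actor, and the ISO input is assumed to be UTC with a trailing `Z`.

```python
from datetime import datetime, timedelta, timezone

# Farcaster epoch as stated above: 2021-01-01T00:00:00.000Z.
FARCASTER_EPOCH = datetime(2021, 1, 1, tzinfo=timezone.utc)


def farcaster_to_iso(ts: int) -> str:
    """Convert Farcaster epoch seconds to an ISO 8601 UTC string."""
    dt = FARCASTER_EPOCH + timedelta(seconds=ts)
    return dt.strftime("%Y-%m-%dT%H:%M:%S.000Z")


def iso_to_farcaster(iso: str) -> int:
    """Convert an ISO 8601 UTC string ending in 'Z' to Farcaster epoch seconds."""
    dt = datetime.fromisoformat(iso.replace("Z", "+00:00"))
    return int((dt - FARCASTER_EPOCH).total_seconds())


print(farcaster_to_iso(100000000))  # 2024-03-03T09:46:40.000Z
```

To build a `startTimestamp` for a study window, pass the window start through `iso_to_farcaster`, for example `iso_to_farcaster("2024-03-03T09:46:40.000Z")`, which gives `100000000`.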
Ingestion Modes Explained
Mode 1: By FIDs (Deterministic Backfill)
Use Case: Research specific users, backfill known accounts
How it works:
- For each FID in the input list (or discovered via shard scan):
  - Fetch all casts with pagination
  - Fetch all reactions (likes/recasts)
  - Fetch all follows
  - Fetch user profile data
- Maintains checkpoint per FID (`lastTs`, `lastPageToken`) for resumable runs
- Optionally discover FIDs by scanning specified shards
Best for: User-centric analysis, follower studies, content backfills
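The per-FID pagination with a resumable checkpoint can be sketched roughly as follows. This is an illustration in Python, not the actor's source: `backfill_fid` and the `done` flag are hypothetical, the page fetcher is injected so the loop can be exercised without a Hub, and the response shape (`messages`, `nextPageToken`, `data.timestamp`) is assumed to follow the Hub HTTP API.

```python
from typing import Callable, Dict, Iterator, Optional

# A page fetcher takes (fid, page_token) and returns a dict shaped like
# {"messages": [...], "nextPageToken": "..."} (assumed Hub response shape).
FetchPage = Callable[[int, Optional[str]], dict]


def backfill_fid(fid: int, fetch_page: FetchPage, checkpoint: Dict[int, dict]) -> Iterator[dict]:
    """Yield every message for one FID, updating its checkpoint page by page."""
    state = checkpoint.setdefault(fid, {"lastTs": None, "lastPageToken": None})
    if state.get("done"):  # hypothetical flag: this FID was fully fetched earlier
        return
    token = state["lastPageToken"]  # resume mid-pagination if a token was saved
    while True:
        page = fetch_page(fid, token)
        for msg in page.get("messages", []):
            state["lastTs"] = msg["data"]["timestamp"]
            yield msg
        # An empty or missing token means there are no further pages.
        token = page.get("nextPageToken") or None
        state["lastPageToken"] = token
        if token is None:
            state["done"] = True
            break
```

Because the checkpoint dictionary is mutated in place, persisting it between pages is enough to continue from the saved `lastPageToken` after an interruption.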
Mode 2: By Time Window (Targeted Study)
Use Case: Time-bounded analysis (e.g., "all activity during an event")
How it works:
- For each FID, fetch only messages within `startTimestamp` to `stopTimestamp`
- Applies time filters to casts (Hub native support)
- Filters reactions and links manually (Hub doesn't support time filters)
- Faster than full backfill when studying specific time periods
Best for: Event analysis, temporal studies, A/B testing
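The manual filtering step for reactions and links amounts to comparing each message timestamp against the window. A minimal Python sketch, assuming messages carry `data.timestamp` in Farcaster epoch seconds and that both bounds are inclusive (the actor's exact boundary handling is not documented here); `filter_by_window` is a hypothetical name.

```python
from typing import Iterable, List


def filter_by_window(messages: Iterable[dict], start_ts: int, stop_ts: int) -> List[dict]:
    """Keep messages whose data.timestamp lies in [start_ts, stop_ts]."""
    return [m for m in messages if start_ts <= m["data"]["timestamp"] <= stop_ts]
```

Since every reaction and link for a FID still has to be downloaded before filtering, the saving in `byTime` mode comes mainly from casts.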
Mode 3: Tail Events (Near-Real-Time)
Use Case: Live monitoring, incremental ingestion
How it works:
- Poll `/v1/events` starting from `fromEventId` (or last checkpoint)
- For `MERGE_MESSAGE` events, hydrate and push the message entity
- Update `lastEventId` checkpoint per shard
- Sleeps 5s between polls (configurable)
Important: Hubs prune events older than ~3 days. Run frequently (every 1-2 days) to avoid data loss.
Best for: Real-time dashboards, notifications, streaming pipelines
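One polling pass of the tail loop could look like the Python sketch below. It is illustrative only: `tail_once` is a hypothetical helper, the event fetcher is injected, and the raw event fields (`id`, `type`, `mergeMessageBody.message`) are assumptions about the Hub event payload. Events at or below the saved `lastEventId` are skipped, so the sketch behaves the same whether the Hub treats the starting ID as inclusive or exclusive.

```python
from typing import Callable, Dict, Iterator, List

# (shard_index, from_event_id) -> list of raw events, oldest first.
FetchEvents = Callable[[int, int], List[dict]]


def tail_once(shard_index: int, fetch_events: FetchEvents, checkpoint: Dict[int, dict]) -> Iterator[dict]:
    """Run one poll for a shard and yield the message of each MERGE_MESSAGE event."""
    last_id = checkpoint.get(shard_index, {}).get("lastEventId")
    from_id = 0 if last_id is None else last_id
    for event in fetch_events(shard_index, from_id):
        event_id = int(event["id"])
        if last_id is not None and event_id <= last_id:
            continue  # already processed in an earlier poll
        if str(event.get("type", "")).endswith("MERGE_MESSAGE"):
            yield event["mergeMessageBody"]["message"]
        last_id = event_id
        checkpoint[shard_index] = {"lastEventId": last_id}
```

A caller would invoke `tail_once` in a loop, sleeping about 5 seconds between passes and stopping once `maxRecords` is reached.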
Optional Features
Frame/Mini-App Metadata Parsing
When fetchEmbeds: true, the actor will:
- Extract all unique URLs from cast embeds
- Fetch each URL (up to `maxEmbedsPerRun` limit)
- Parse `fc:miniapp:*` and `fc:frame:*` meta tags
- Enrich cast entities with `derived.frame_meta` object
Use Proxy: Set proxy field to avoid rate limits (e.g., "RESIDENTIAL" for Apify Proxy)
Performance: Adds ~2-5s per URL. Use maxEmbedsPerRun to cap crawling time.
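The meta-tag parsing step can be approximated with the Python standard library. This sketch is not the actor's parser: `FrameMetaParser` and `parse_frame_meta` are hypothetical names, and it assumes the tags appear as `<meta name="...">` or `<meta property="...">` elements with a `content` attribute.

```python
from html.parser import HTMLParser
from typing import Dict


class FrameMetaParser(HTMLParser):
    """Collect fc:frame* and fc:miniapp* meta tags from an HTML document."""

    def __init__(self) -> None:
        super().__init__()
        self.meta: Dict[str, str] = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Pages use either name= or property= for these tags.
        key = attr.get("name") or attr.get("property") or ""
        if key.startswith(("fc:frame", "fc:miniapp")) and attr.get("content") is not None:
            self.meta[key] = attr["content"]


def parse_frame_meta(html: str) -> Dict[str, str]:
    """Return a mapping of matching meta tag names to their content values."""
    parser = FrameMetaParser()
    parser.feed(html)
    return parser.meta
```

Fetching the page itself (with a timeout and the configured proxy) is left out; the function only needs the HTML text.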
Neynar v2 Integration
Provide neynarApiKey to use Neynar's hosted Hub endpoints instead of direct Hub HTTP.
Benefits:
- Faster, managed infrastructure
- No self-hosted Hub required
- Additional features (v2 only; v1 EOL March 31, 2025)
Records flagged: All entities get ingest_source: "neynar_v2"
Client API (Experimental)
Set clientApi: true to enable Warpcast-specific endpoints (e.g., trending, channels).
Warning: Non-protocol data. Records flagged as ingest_source: "client_api" to avoid confusion.
State Checkpointing & Resumability
The actor automatically persists state every 30 seconds and on Apify migration events:
- Per-FID checkpoints: `{ lastTs, lastPageToken }` for resuming mid-pagination
- Per-Shard checkpoints: `{ lastEventId }` for event tail mode
- Migration-safe: Survives container restarts and platform migrations
To resume a run:
- Start the actor with the same input
- State is automatically restored
- Scraping continues from last checkpoint
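The persist-and-restore cycle can be pictured with a local JSON file standing in for the platform's storage. This is a Python sketch under that assumption; the actor itself presumably uses Apify storage, and `CheckpointStore` is a hypothetical class, not part of the actor.

```python
import json
import os
import time


class CheckpointStore:
    """Keep per-FID and per-shard checkpoints in a JSON file, saved at most every interval_s."""

    def __init__(self, path: str, interval_s: float = 30.0, clock=time.monotonic) -> None:
        self.path = path
        self.interval_s = interval_s
        self.clock = clock
        self.state = self._load()  # restore earlier state if the file exists
        self._last_save = clock()

    def _load(self) -> dict:
        try:
            with open(self.path, encoding="utf-8") as f:
                return json.load(f)
        except FileNotFoundError:
            return {"fids": {}, "shards": {}}

    def save(self) -> None:
        # Write to a temp file and rename, so a crash never leaves a partial file.
        tmp = self.path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as f:
            json.dump(self.state, f)
        os.replace(tmp, self.path)
        self._last_save = self.clock()

    def maybe_save(self) -> bool:
        """Save only if interval_s has elapsed since the last save."""
        if self.clock() - self._last_save >= self.interval_s:
            self.save()
            return True
        return False
```

Calling `maybe_save()` after each page keeps write volume low, while an explicit `save()` on a migration or shutdown signal covers the gap since the last periodic write. Note that JSON object keys are strings, so FIDs are stored as `"2"` rather than `2`.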
Performance Tips
- Use time filters: Narrow `startTimestamp`/`stopTimestamp` for faster runs
- Batch FIDs: Process related users together to share dedup cache
- Tune `pageSize`: Larger pages (1000) = fewer requests, but slower per-request
- Set `maxRecords`: Safety limit prevents runaway costs
- Monitor rate limits: Default 600 req/min is conservative; increase if Hub allows
- Schedule tail runs: Run every 1-2 days to avoid event pruning
Limitations & Best Practices
Hub Event Pruning
- Limitation: Hubs prune events older than ~3 days
- Best Practice: Schedule tail runs every 1-2 days for continuous ingestion
Reaction/Link Time Filters
- Limitation: Hub API doesn't support time filters for reactions/links
- Workaround: Actor fetches all and filters manually in `byTime` mode (slower)
Embed Fetching
- Limitation: Some URLs may be slow, dead, or behind auth
- Best Practice: Use `maxEmbedsPerRun` cap and Apify Proxy to avoid timeouts
Rate Limiting
- Default: 600 req/min (conservative)
- Tuning: Increase `requestPerMinute` if your Hub supports higher rates
- Public Hubs: May have stricter limits; monitor 429 responses
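For clients that call a Hub directly, pacing and retrying can be sketched as below. This is a generic Python illustration of exponential backoff, not the actor's retry code: the helper names, the default of 5 retries, the 1-second base delay, the jitter, and the choice to retry 5xx responses as well as 429 are all assumptions.

```python
import random
import time
from typing import Callable, Tuple


def min_interval_s(requests_per_minute: int) -> float:
    """Minimum spacing between requests for a given per-minute budget."""
    return 60 / requests_per_minute


def call_with_backoff(
    request: Callable[[], Tuple[int, object]],
    max_retries: int = 5,
    base_delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, object]:
    """Call request() until it returns a non-retryable status, doubling the wait each time."""
    status = None
    for attempt in range(max_retries + 1):
        status, body = request()
        if status != 429 and status < 500:
            return status, body
        if attempt < max_retries:
            delay = base_delay_s * (2 ** attempt)  # 1s, 2s, 4s, ...
            sleep(delay + random.uniform(0, 0.1 * delay))  # small jitter
    raise RuntimeError(f"giving up after {max_retries} retries (last status {status})")
```

With the default `requestPerMinute` of 600, `min_interval_s(600)` gives 0.1 seconds between requests; the `sleep` parameter is injectable so the retry schedule can be tested without waiting.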
Pricing & Compute
Approximate compute units (based on default settings):
| Run Type | Records | Compute Units | Notes |
|---|---|---|---|
| Small backfill | <10k | ~0.01 | 2-3 FIDs, no embeds |
| Medium backfill | 100k | ~0.5 | 10-20 FIDs, all entities |
| Large backfill | 1M | ~5 | 100+ FIDs or full shard scan |
| Tail (1 hour) | 1k events | ~0.005 | Near-real-time streaming |
| With embeds | +100 URLs | +0.02 per 100 | Crawlee overhead |
Formula: ~0.5 CU per 100k records (without embeds)
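The formula and the embed overhead from the table can be combined into a quick estimate. A Python sketch, where `estimate_compute_units` is a hypothetical helper and the result is only as accurate as the approximate figures above:

```python
def estimate_compute_units(records: int, embed_urls: int = 0) -> float:
    """Rough CU estimate: ~0.5 CU per 100k records plus ~0.02 CU per 100 embed URLs."""
    return records / 100_000 * 0.5 + embed_urls / 100 * 0.02
```

For example, `estimate_compute_units(1_000_000)` gives about 5 CU, in line with the large backfill row.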
Example Use Cases
Social Graph Analysis
{"mode":"byFids","fids":[2,3,6833,5650],"include":{"links":true,"userData":true}}
Output: Follow relationships + user profiles for network analysis
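Once the dataset is exported, link entities can be turned into a follow graph with a few lines of Python. This sketch assumes records shaped like the Link Entity in the Output Schema (`entity_type`, `type`, `fid`, `targetFid`); `build_follow_graph` and `follower_counts` are illustrative names.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Set


def build_follow_graph(entities: Iterable[dict]) -> Dict[int, Set[int]]:
    """Map each FID to the set of FIDs it follows, using only follow links."""
    following: Dict[int, Set[int]] = defaultdict(set)
    for e in entities:
        if e.get("entity_type") == "link" and e.get("type") == "follow":
            following[e["fid"]].add(e["targetFid"])
    return dict(following)


def follower_counts(graph: Dict[int, Set[int]]) -> Counter:
    """Count followers per FID by inverting the follow edges."""
    counts: Counter = Counter()
    for targets in graph.values():
        counts.update(targets)
    return counts
```

The counts only cover followers that are among the scraped FIDs, so they are a lower bound unless the whole network was backfilled.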
Content Research
{"mode":"byTime","fids":[2],"startTimestamp":100000000,"stopTimestamp":100050000,"include":{"casts":true,"reactions":true}}
Output: All casts + reactions during a specific event
Real-Time Dashboard
{"mode":"tailEvents","tail":{"fromEventId":"0"},"maxRecords":10000}
Output: Live stream of all protocol events (schedule every hour)
Frame/Mini-App Catalog
{"mode":"byFids","fids":[2,3],"fetchEmbeds":true,"maxEmbedsPerRun":200,"include":{"casts":true}}
Output: Casts with Frame/Mini-App metadata extracted
Troubleshooting
"Failed to connect to Hub"
- Verify `hubBaseUrl` is correct and accessible
- Check Hub is running and serving HTTP API on port 3381
- Try public Hub: `https://hub.pinata.cloud`
"No data returned"
- Verify FIDs exist and have activity
- Check time window isn't too narrow (`byTime` mode)
- Ensure `include.*` filters aren't excluding all data
"Max records limit reached"
- Increase `maxRecords` or remove limit for full backfill
- Use checkpointing to resume in multiple runs
"Rate limit errors (429)"
- Decrease `requestPerMinute`
- Add delays between runs
- Use Neynar hosted Hub (better rate limits)
"Event tail missing data"
- Events pruned >3 days ago
- Schedule runs more frequently (every 1-2 days)
- Use `byFids` mode for historical backfill
Data Views
The actor provides pre-configured dataset views:
- Overview: All entities with key identifiers
- Casts: Cast content, timestamps, and URLs
- Reactions: Likes and recasts by FID
- Follows: Follow relationships (social graph edges)
- Users: User profiles and metadata
Access views in Apify Console → Dataset → Views tab
Support
- Email: via Apify
- Documentation: Farcaster Hub API Docs
- Issues: Report bugs or request features via email
Version History
- 1.0.0 (2025-01) - Initial release
  - Three ingestion modes (byFids, byTime, tailEvents)
  - Hub HTTP API integration
  - State checkpointing
  - Optional Frame/Mini-App parsing
  - Neynar v2 support
Explore More of Our Actors
Content & Publishing
| Actor | Description |
|---|---|
| Notion Marketplace Scraper | Scrape Notion templates and marketplace listings |
| Ghost Newsletter Scraper | Extract Ghost newsletter content and subscriber data |
| Google Play Reviews Scraper | Extract app reviews from Google Play Store |
Social Media & Community
| Actor | Description |
|---|---|
| Reddit Scraper Pro | Monitor subreddits and track keywords with sentiment analysis |
| Discord Scraper Pro | Extract Discord messages and chat history for community insights |
| YouTube Comments Harvester | Comprehensive YouTube comments scraper with channel-wide enumeration |
| YouTube Contact Scraper | Extract YouTube channel contact information for outreach |
| YouTube Shorts Scraper | Scrape YouTube Shorts for viral content research |
License
MIT License - Free for commercial and non-commercial use
Custom Solutions & Enterprise
Need a custom data feed, modified output format, or enterprise integration?
Contact: Furkanc58@gmail.com
I offer:
- Daily/weekly data feeds (Snowflake, S3, BigQuery, Google Sheets)
- Custom scrapers for platforms not yet covered
- White-label solutions for agencies
- Priority support and SLAs
Response within 24-48 hours.
Legal Disclaimer
This actor is a general-purpose tool for analyzing publicly accessible web data. The user bears sole responsibility for ensuring their specific use complies with:
- Applicable laws (GDPR/DSGVO, copyright law)
- The target website's Terms of Service
- Apify's Terms of Service
The provider (webdatalabs) expressly disclaims liability for any unauthorized or unlawful use. By using this actor, the user agrees to indemnify the provider against any third-party claims arising from their use of the data.
This tool is not affiliated with Farcaster. All trademarks belong to their respective owners.
