URL: https://apify.com/dubz/x-bulk-vision

⇱ X (Twitter) Bulk Scraper/Monitor/Alerts + Vision [DEPRECATED] Β· Apify


Deprecated

Pricing

Pay per usage

Monitor X (formerly Twitter) for specific content. Extract data, monitor, and optionally run image-based alerts using cloud vision APIs. Perfect for brand reputation management, tracking tweets, hashtags, specific images, and user activity.

Rating

0.0

(0)

Developer

Advanced Automation

Maintained by Community

Actor stats

0

Bookmarked

9

Total users

2

Monthly active users

5 months ago

Last modified

X (Twitter) Bulk Scrape/Monitor + Vision AI

Monitor X/Twitter accounts, extract tweets, filter by keywords/hashtags, and run AI vision analysis on images using 6 different AI providers.

🎯 Overview

This Apify Actor scrapes X (formerly Twitter) posts from multiple accounts, filters by keywords/hashtags, and optionally runs AI vision analysis on images to detect objects, brands, or custom content patterns. Perfect for social media monitoring, brand tracking, and competitive intelligence.

✨ Key Features

Core Scraping

  • πŸš€ Hyperdrive Mode: Lightning-fast RSS-based scraping with automatic fallback to web scraping
  • πŸ‘₯ Bulk Processing: Monitor up to 100 Twitter accounts simultaneously
  • πŸ” Smart Filtering: Filter by keywords, hashtags, or require images
  • πŸ“Š Dual Datasets: Separate outputs for tweets and vision alerts
  • πŸ”„ Automatic Retry: Robust error handling with multiple Nitter instance fallbacks

AI Vision Analysis (Optional)

Analyze tweet images using 6 industry-leading AI providers:

  • πŸ€– Google Gemini 2.0 Flash - Latest multimodal AI with base64 encoding
  • 🎨 OpenAI GPT-4o Vision - Advanced image understanding and analysis
  • πŸ‘οΈ Google Cloud Vision - Label detection, OCR, safe search, object localization
  • ☁️ Azure Computer Vision - Tags, objects, brands, faces, adult content detection
  • πŸ“Έ AWS Rekognition - Label detection and content moderation
  • πŸ”— Custom Webhooks - Integrate your own vision API

Alert System

  • πŸ”” Webhook Notifications: Get instant alerts when vision pipelines trigger
  • 🎯 Flexible Configuration: Per-pipeline or global webhook URLs
  • πŸ“ˆ Confidence Scoring: Filter alerts by AI confidence thresholds
  • 🏷️ Label Matching: Trigger on specific detected objects or keywords

Input Configuration

Basic Example

{
  "usernames": ["apify", "openai"],
  "maxItems": 100,
  "preferRss": true
}

Complete Example with Vision Analysis

{
  "usernames": ["apify", "elonmusk", "openai"],
  "searchTerms": ["AI", "automation", "web scraping"],
  "hashtags": ["webscraping", "machinelearning"],
  "maxItems": 500,
  "preferRss": true,
  "requireImages": false,
  "rssTimeoutSecs": 10,
  "visionPipelines": [
    {
      "name": "Product Launch Detector",
      "provider": "gemini_vision",
      "enabled": true,
      "configJson": "{\"prompt\":\"Is this a product launch or announcement?\",\"triggerKeywords\":[\"launch\",\"new\",\"announcement\"],\"model\":\"gemini-2.0-flash-exp\"}",
      "alertWebhookUrl": "https://your-webhook.com/product-alerts"
    },
    {
      "name": "Brand Monitor",
      "provider": "openai_vision",
      "enabled": true,
      "configJson": "{\"prompt\":\"Identify brands and logos\",\"triggerKeywords\":[\"Tesla\",\"Apple\",\"Nike\"],\"model\":\"gpt-4o\"}",
      "alertWebhookUrl": ""
    },
    {
      "name": "Object Detector",
      "provider": "google_vision",
      "enabled": true,
      "configJson": "{\"threshold\":0.8,\"triggerLabels\":[\"car\",\"vehicle\"],\"maxLabels\":10}"
    }
  ]
}

Input Fields

| Field | Type | Required | Description |
|---|---|---|---|
| usernames | array | Yes | X/Twitter usernames to monitor (without the @ symbol) |
| searchTerms | array | No | Filter tweets containing these keywords |
| hashtags | array | No | Filter tweets containing these hashtags |
| maxItems | integer | No | Maximum tweets to collect (default: 100) |
| preferRss | boolean | No | Use RSS scraping first (default: true) |
| requireImages | boolean | No | Only collect tweets with images (default: false) |
| rssTimeoutSecs | integer | No | RSS fetch timeout in seconds (default: 10) |
| visionPipelines | array | No | AI vision analysis configuration |
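
As an illustration of the table above, the following sketch applies the documented defaults and the 100-account limit to a raw input object. The function name `normalizeInput` and the error messages are hypothetical; the Actor's real validation code is not shown in this README.

```javascript
// Hypothetical helper: apply the documented defaults and basic checks to Actor input.
function normalizeInput(input) {
  const usernames = (input.usernames ?? [])
    .map((name) => String(name).trim().replace(/^@/, '')) // usernames are given without "@"
    .filter((name) => name.length > 0);

  if (usernames.length === 0) {
    throw new Error('"usernames" is required and must contain at least one name');
  }
  if (usernames.length > 100) {
    // The README states a limit of 100 accounts.
    throw new Error('At most 100 usernames are supported');
  }

  return {
    usernames,
    searchTerms: input.searchTerms ?? [],
    hashtags: input.hashtags ?? [],
    maxItems: input.maxItems ?? 100,
    preferRss: input.preferRss ?? true,
    requireImages: input.requireImages ?? false,
    rssTimeoutSecs: input.rssTimeoutSecs ?? 10,
    visionPipelines: input.visionPipelines ?? [],
  };
}

console.log(normalizeInput({ usernames: ['@apify', 'openai'] }));
```

Stripping a leading @ is a convenience added here for the sketch; the README only says usernames should be supplied without it.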

Vision Pipeline Configuration

Each pipeline in the visionPipelines array accepts these fields:

| Field | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | Descriptive name for the pipeline |
| provider | string | Yes | AI provider: gemini_vision, openai_vision, google_vision, azure_cv, aws_rekognition, custom_webhook |
| enabled | boolean | No | Enable/disable this pipeline (default: true) |
| configJson | string | No | Provider-specific configuration as a JSON string |
| alertWebhookUrl | string | No | Webhook URL for alerts (overrides the environment variable) |
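
Because configJson is a string rather than a nested object, hand-escaping the quotes is error-prone. A minimal sketch, assuming you assemble the input in Node.js before starting a run, is to let `JSON.stringify` produce the string. The helper name `makePipeline` is hypothetical.

```javascript
// Build one visionPipelines entry; configJson must be a string, not a nested object.
function makePipeline(name, provider, config, alertWebhookUrl) {
  const pipeline = { name, provider, enabled: true, configJson: JSON.stringify(config) };
  if (alertWebhookUrl) pipeline.alertWebhookUrl = alertWebhookUrl;
  return pipeline;
}

const pipeline = makePipeline('Object Detector', 'google_vision', {
  threshold: 0.8,
  triggerLabels: ['car', 'vehicle'],
  maxLabels: 10,
});

// Prints the entry with configJson already escaped as a string.
console.log(JSON.stringify(pipeline, null, 2));
```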

Provider-Specific Configuration

Gemini Vision

{
  "prompt": "Describe what you see in detail",
  "triggerKeywords": ["product", "launch"],
  "model": "gemini-2.0-flash-exp"
}

OpenAI Vision

{
  "prompt": "Identify brands and logos",
  "triggerKeywords": ["Nike", "Apple"],
  "model": "gpt-4o",
  "maxTokens": 500
}
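
The README does not say how triggerKeywords are compared against the model's answer. One plausible reading, sketched below as an assumption, is a case-insensitive substring match on the returned text. The function name `matchKeywords` is hypothetical.

```javascript
// Hypothetical check: which trigger keywords appear in the model's free-text answer?
function matchKeywords(analysisText, triggerKeywords) {
  const text = analysisText.toLowerCase();
  return triggerKeywords.filter((keyword) => text.includes(keyword.toLowerCase()));
}

const hits = matchKeywords(
  'This image shows a new product launch announcement.',
  ['launch', 'app'],
);
console.log(hits); // [ 'launch' ]
```

Substring matching is deliberately simple: a short keyword such as "new" would also match "renewal", so longer or more specific keywords give fewer false alerts under this assumption.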

Google Cloud Vision

{
  "threshold": 0.8,
  "triggerLabels": ["car", "vehicle"],
  "maxLabels": 10
}
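
To make the three options concrete, here is a sketch of how threshold, triggerLabels, and maxLabels could interact. The exact semantics are assumed, not taken from the Actor's source: only the first maxLabels labels are considered, and a label triggers when its score reaches the threshold and its name is in triggerLabels. Labels use the `{ name, score }` shape shown in the alerts dataset.

```javascript
// Hypothetical trigger check for label-style providers.
function matchLabels(labels, config) {
  const { threshold = 0, triggerLabels = [], maxLabels = labels.length } = config;
  const wanted = new Set(triggerLabels.map((label) => label.toLowerCase()));
  return labels
    .slice(0, maxLabels)
    .filter((label) => label.score >= threshold && wanted.has(label.name.toLowerCase()));
}

const detected = [
  { name: 'Car', score: 0.93 },
  { name: 'vehicle', score: 0.6 }, // below the 0.8 threshold
  { name: 'tree', score: 0.99 },   // not a trigger label
];
console.log(matchLabels(detected, { threshold: 0.8, triggerLabels: ['car', 'vehicle'], maxLabels: 10 }));
```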

Azure Computer Vision

{
  "minConfidence": 0.7,
  "targetTags": ["car", "person"],
  "blockAdult": false
}

AWS Rekognition

{
  "minConfidence": 0.7,
  "targetLabels": ["Car", "Person"],
  "blockUnsafe": true
}

Custom Webhook

{
  "webhookUrl": "https://your-api.com/analyze",
  "timeout": 20000,
  "headers": {
    "Authorization": "Bearer YOUR_TOKEN"
  }
}
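
The request the Actor sends to a custom webhook is not documented here, so the following is only a sketch of how the three options might be used. It assumes a JSON POST body of the form `{ imageUrl }`, a timeout in milliseconds, and a JSON response; `callCustomWebhook` and the `fetchImpl` parameter are hypothetical names. It relies on the global `fetch` and `AbortSignal.timeout` available in Node.js 18.

```javascript
// Hypothetical sender; the request body shape ({ imageUrl }) is an assumption.
async function callCustomWebhook(config, imageUrl, fetchImpl = fetch) {
  const response = await fetchImpl(config.webhookUrl, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json', ...(config.headers ?? {}) },
    body: JSON.stringify({ imageUrl }),
    // Abort the request once the configured timeout (assumed milliseconds) elapses.
    signal: AbortSignal.timeout(config.timeout ?? 20000),
  });
  if (!response.ok) {
    throw new Error(`Vision webhook responded with HTTP ${response.status}`);
  }
  return response.json();
}
```

Passing the fetch function as a parameter keeps the sketch testable without network access; in normal use the third argument can be omitted.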

Output

Main Dataset (Tweets)

Each scraped tweet contains:

{
  "title": "Check out our new Actor for web scraping!",
  "link": "https://x.com/apify/status/1234567890",
  "author": "apify",
  "published": "2026-01-15T10:30:00Z",
  "description": "Check out our new Actor...",
  "tags": ["#webscraping", "#automation"],
  "imageUrl": "https://pbs.twimg.com/media/abc123.jpg",
  "visionAlertsCount": 2,
  "scrapedUsername": "apify",
  "collectedAt": "2026-01-15T10:35:00Z",
  "sourceType": "rss",
  "instance": "nitter.net"
}

Alerts Dataset (Vision Triggers)

Each triggered alert contains:

{
  "pipelineName": "Product Launch Detector",
  "provider": "gemini_vision",
  "itemLink": "https://x.com/apify/status/1234567890",
  "imageUrl": "https://pbs.twimg.com/media/abc123.jpg",
  "labels": [
    {"name": "product", "score": 0.95},
    {"name": "announcement", "score": 0.88}
  ],
  "score": 0.95,
  "analysis": "This image shows a new product launch announcement...",
  "triggeredAt": "2026-01-15T10:35:00Z"
}

Output Views

The Actor provides multiple pre-configured output views:

  • tweets - Full dataset JSON
  • tweetsTable - Simplified table view
  • tweetsCSV - CSV export
  • tweetsWithImages - Only tweets that include images
  • visionAlerts - All vision alerts
  • visionAlertsTable - Simplified alerts view
  • visionAlertsCSV - Alerts CSV export
  • highConfidenceAlerts - Alerts with 90%+ confidence only
  • runStats - Actor run statistics
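
If you export the full visionAlerts data and want the equivalent of the highConfidenceAlerts view locally, a filter on the top-level score field is enough. This sketch assumes the view uses `score >= 0.9`; the function name `highConfidence` is hypothetical.

```javascript
// Keep alerts whose top-level score is at least 0.9 (the assumed "90%+" cut-off).
function highConfidence(alerts, minScore = 0.9) {
  return alerts.filter((alert) => typeof alert.score === 'number' && alert.score >= minScore);
}

const alerts = [
  { pipelineName: 'Product Launch Detector', score: 0.95 },
  { pipelineName: 'Brand Monitor', score: 0.72 },
];
console.log(highConfidence(alerts).map((alert) => alert.pipelineName)); // [ 'Product Launch Detector' ]
```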

Environment Variables

Configure AI providers via environment variables in the Actor settings:

Required (if using vision analysis)

| Variable | Description | Example |
|---|---|---|
| OPENAI_API_KEY | OpenAI API key for GPT-4o Vision | sk-... |
| GEMINI_API_KEY | Google Gemini API key | AIza... |
| GOOGLE_APPLICATION_CREDENTIALS | Google Cloud credentials JSON | {"type":"service_account",...} |
| AZURE_CV_ENDPOINT | Azure Computer Vision endpoint | https://your-resource.cognitiveservices.azure.com/ |
| AZURE_CV_KEY | Azure Computer Vision API key | abc123... |
| AWS_ACCESS_KEY_ID | AWS access key for Rekognition | AKIA... |
| AWS_SECRET_ACCESS_KEY | AWS secret key | abc123... |
| AWS_REGION | AWS region (optional) | us-east-1 (default) |
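
A quick pre-flight check can catch missing credentials before a run. The sketch below assumes each provider needs only the variables listed for it in the table above (AWS_REGION is treated as optional, and custom_webhook as needing none); the mapping and the function name `missingCredentials` are illustrative, not the Actor's own code.

```javascript
// Assumed provider-to-variable mapping, derived from the table above.
const REQUIRED_ENV = {
  openai_vision: ['OPENAI_API_KEY'],
  gemini_vision: ['GEMINI_API_KEY'],
  google_vision: ['GOOGLE_APPLICATION_CREDENTIALS'],
  azure_cv: ['AZURE_CV_ENDPOINT', 'AZURE_CV_KEY'],
  aws_rekognition: ['AWS_ACCESS_KEY_ID', 'AWS_SECRET_ACCESS_KEY'],
  custom_webhook: [],
};

// Return { pipelineName: [missing variable names] } for every enabled pipeline.
function missingCredentials(pipelines, env = process.env) {
  const missing = {};
  for (const pipeline of pipelines) {
    if (pipeline.enabled === false) continue; // disabled pipelines need no credentials
    const absent = (REQUIRED_ENV[pipeline.provider] ?? []).filter((name) => !env[name]);
    if (absent.length > 0) missing[pipeline.name] = absent;
  }
  return missing;
}

console.log(missingCredentials(
  [{ name: 'Brand Monitor', provider: 'azure_cv' }],
  { AZURE_CV_KEY: 'abc123' },
)); // { 'Brand Monitor': [ 'AZURE_CV_ENDPOINT' ] }
```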

Optional

| Variable | Description |
|---|---|
| ALERT_WEBHOOK_URL | Global webhook URL for all alerts |
| WEBHOOK_<PIPELINE_NAME> | Pipeline-specific webhook (e.g., WEBHOOK_PRODUCT_DETECTOR) |
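
The README gives one example of the per-pipeline variable name (WEBHOOK_PRODUCT_DETECTOR) and says alertWebhookUrl overrides the environment variable. The sketch below fills in the rest with assumptions: the name is derived by upper-casing the pipeline name and replacing runs of other characters with underscores, and the pipeline-specific variable is tried before the global one. Both function names are hypothetical.

```javascript
// Assumed naming rule: "Product Detector" -> "WEBHOOK_PRODUCT_DETECTOR".
function webhookEnvName(pipelineName) {
  return 'WEBHOOK_' + pipelineName.trim().toUpperCase().replace(/[^A-Z0-9]+/g, '_');
}

// Assumed precedence: pipeline field, then pipeline-specific variable, then global variable.
function resolveWebhookUrl(pipeline, env = process.env) {
  return pipeline.alertWebhookUrl
    || env[webhookEnvName(pipeline.name)]
    || env.ALERT_WEBHOOK_URL
    || null;
}

console.log(webhookEnvName('Product Detector')); // WEBHOOK_PRODUCT_DETECTOR
```

An empty alertWebhookUrl, as in the Brand Monitor pipeline of the complete example, falls through to the environment variables under this rule.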

Setting Environment Variables

Via Apify Console:

  1. Go to your Actor → Settings → Environment variables
  2. Click "Add variable"
  3. Enter name and value
  4. Check "Secret" for sensitive data

Via .actor/actor.json:

{
  "environmentVariables": {
    "OPENAI_API_KEY": "@openai-key",
    "GEMINI_API_KEY": "@gemini-key"
  }
}

Note: Use @secret-name syntax to reference Apify secrets.

🎯 Use Cases

1. Brand Monitoring

Monitor brand mentions and visual content across competitor accounts:

  • Track logo appearances in images
  • Detect product placements
  • Monitor sentiment around brand discussions

2. Product Launch Detection

Get instant alerts when competitors announce new products:

  • Analyze images for product unveils
  • Detect "new" or "launching" keywords
  • Track announcement patterns

3. Content Moderation

Filter and flag inappropriate content:

  • Adult content detection (Azure/AWS)
  • Unsafe content filtering
  • Brand safety monitoring

4. Competitor Analysis

Track competitor social media activity:

  • Monitor posting frequency
  • Analyze content themes
  • Track image-based campaigns

5. Social Media Intelligence

Aggregate insights from multiple accounts:

  • Trending topics detection
  • Hashtag performance tracking
  • Engagement pattern analysis

6. Market Research

Gather visual data for market analysis:

  • Product feature comparisons
  • Packaging design trends
  • Campaign creative analysis

Quick Start

1. Basic Tweet Scraping (No Vision)

{
  "usernames": ["apify"],
  "maxItems": 50
}

2. Keyword Filtering

{
  "usernames": ["techcrunch", "theverge"],
  "searchTerms": ["AI", "ChatGPT"],
  "maxItems": 100
}

3. Image-Only Collection

{
  "usernames": ["nasa", "spacex"],
  "requireImages": true,
  "maxItems": 50
}

4. With Gemini Vision

{
  "usernames": ["producthunt"],
  "requireImages": true,
  "visionPipelines": [{
    "name": "Product Detector",
    "provider": "gemini_vision",
    "enabled": true,
    "configJson": "{\"prompt\":\"Describe this product\",\"triggerKeywords\":[\"app\",\"software\"]}"
  }]
}
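
How the filters in examples 2 and 3 combine is not spelled out in this README. The sketch below shows one plausible interpretation and is labeled as an assumption: a tweet passes when it matches at least one search term (if any are given) and at least one hashtag (if any are given), and has an image when requireImages is set. It works on the tweet shape shown in the Output section; `passesFilters` is a hypothetical name.

```javascript
// Hypothetical filter: any-of within each list, all lists must be satisfied.
function passesFilters(tweet, { searchTerms = [], hashtags = [], requireImages = false }) {
  if (requireImages && !tweet.imageUrl) return false;

  const text = `${tweet.title ?? ''} ${tweet.description ?? ''}`.toLowerCase();
  const tags = (tweet.tags ?? []).map((tag) => tag.replace(/^#/, '').toLowerCase());

  const termOk = searchTerms.length === 0
    || searchTerms.some((term) => text.includes(term.toLowerCase()));
  const tagOk = hashtags.length === 0
    || hashtags.some((tag) => tags.includes(tag.replace(/^#/, '').toLowerCase()));
  return termOk && tagOk;
}

const tweet = {
  title: 'Check out our new Actor for web scraping!',
  tags: ['#webscraping', '#automation'],
  imageUrl: 'https://pbs.twimg.com/media/abc123.jpg',
};
console.log(passesFilters(tweet, { searchTerms: ['web scraping'], hashtags: ['webscraping'] })); // true
```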

Performance & Limits

  • Speed: 50-100 tweets per minute (RSS mode)
  • Concurrent Accounts: Up to 100 usernames
  • Vision Processing: ~2-5 seconds per image per provider
  • Memory: 512 MB recommended (1 GB for heavy vision usage)
  • Timeout: 300 seconds default (adjust in Actor settings)
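
These figures can be combined into a rough run-time estimate when choosing a timeout. For instance, 500 tweets at 50-100 tweets per minute take 5 to 10 minutes, which can already exceed the 300-second default. The sketch below assumes scraping and vision calls run one after another, which the README does not confirm; `estimateRunSeconds` is a hypothetical helper.

```javascript
// Rough estimate from the published figures; assumes strictly sequential processing.
function estimateRunSeconds({ tweets, images = 0, providers = 0 }) {
  const scrape = { min: (tweets / 100) * 60, max: (tweets / 50) * 60 }; // 50-100 tweets/min
  const vision = { min: images * providers * 2, max: images * providers * 5 }; // 2-5 s each
  return { min: scrape.min + vision.min, max: scrape.max + vision.max };
}

console.log(estimateRunSeconds({ tweets: 500 })); // { min: 300, max: 600 }
console.log(estimateRunSeconds({ tweets: 100, images: 40, providers: 2 })); // { min: 220, max: 520 }
```

If the upper bound is above the configured timeout, raise the timeout in the Actor settings or lower maxItems.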

Troubleshooting

No Items Collected

Possible causes:

  • Bot protection blocking Nitter instances
  • Invalid usernames
  • User accounts have no recent posts
  • Filters are too restrictive

Solutions:

  • Verify usernames are correct (without the @ symbol)
  • Try again at a different time of day
  • Loosen or remove filters
  • Check Actor logs for specific errors

Vision Analysis Not Working

Possible causes:

  • Missing API credentials in environment variables
  • Invalid API keys
  • API rate limits exceeded
  • Image URLs inaccessible

Solutions:

  • Verify all required environment variables are set
  • Check API key validity in the provider dashboard
  • Review Actor logs for specific API errors
  • Ensure images are publicly accessible

Webhook Alerts Not Received

Possible causes:

  • Invalid webhook URL
  • Webhook endpoint timeout
  • Firewall blocking Apify IPs

Solutions:

  • Test the webhook URL with curl/Postman
  • Increase the webhook timeout in the config
  • Verify the webhook endpoint accepts POST requests
  • Check webhook logs for incoming requests

Architecture

Data Flow

  1. Input Validation - Verify usernames and configuration
  2. Instance Discovery - Fetch working Nitter instances from status page
  3. RSS Scraping - Try RSS feeds from multiple instances
  4. Web Scraping Fallback - Parse HTML if RSS fails
  5. Content Filtering - Apply keyword/hashtag filters
  6. Vision Processing - Run enabled AI pipelines on images
  7. Alert Triggering - Send webhooks for matched patterns
  8. Data Storage - Save to Apify datasets
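
Steps 3 and 4 above describe an RSS-first strategy with a web-scraping fallback across several instances. The following is a minimal sketch of that control flow, not the Actor's actual code. The fetchers are injected as parameters (`fetchRss`, `fetchHtml`, both hypothetical), the "web" value for sourceType is a guess (only "rss" appears in the sample output), and skipping RSS entirely when preferRss is false is an assumption.

```javascript
// Try RSS on every instance first, then HTML scraping, and tag items with their origin.
async function fetchTimeline(username, instances, { fetchRss, fetchHtml, preferRss = true }) {
  const strategies = preferRss
    ? [['rss', fetchRss], ['web', fetchHtml]]
    : [['web', fetchHtml]];
  const errors = [];

  for (const [sourceType, fetcher] of strategies) {
    for (const instance of instances) {
      try {
        const items = await fetcher(instance, username);
        return items.map((item) => ({ ...item, scrapedUsername: username, sourceType, instance }));
      } catch (err) {
        // Remember the failure and move on to the next instance or strategy.
        errors.push(`${sourceType}@${instance}: ${err.message}`);
      }
    }
  }
  throw new Error(`All sources failed for ${username}: ${errors.join('; ')}`);
}
```

Collecting the per-instance errors makes the final failure message useful when checking the Actor logs.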

Technical Stack

  • Runtime: Node.js 18 (Apify SDK 3.x)
  • HTTP Client: Axios
  • HTML Parsing: Cheerio
  • RSS Parsing: rss-parser
  • AI Providers: Native REST APIs
  • Image Processing: Base64 encoding for Gemini/OpenAI

Changelog

Version 1.0.0 (2026-02-01)

  • Initial release
  • RSS-first scraping with web fallback
  • 6 AI vision providers
  • Webhook alert system
  • Dual dataset output

License

Apache-2.0

Credits

Made by dubz
