Pricing
from $28.00 / 1,000 enriched leads
Website Lead Intelligence
Crawl any website and turn it into a sales-ready lead profile, no APIs needed. AI identifies industries, detects 50+ technologies, extracts emails and phones, estimates company size, and scores leads 0-100 against your ICP. Built for B2B sales teams and marketers qualifying leads at scale.
Turn any company website into actionable lead intelligence, with AI-powered industry classification, technology detection, contact extraction, company size estimation, and ICP-based lead scoring. No external API keys required.
What It Does
Give this Actor a list of company domains or URLs. It crawls each website (homepage + key subpages), then returns a rich, structured dataset for every lead:
| Feature | Description |
|---|---|
| Industry Classification | Auto-classifies into 20+ industries using AI (zero-shot) or keyword analysis |
| Technology Detection | Identifies 50+ technologies across 8 categories (frameworks, CMS, analytics, marketing, etc.) |
| Contact Extraction | Finds emails, phone numbers, and social media profiles (LinkedIn, Twitter, Facebook, etc.) |
| Company Size Estimation | Estimates employee count from text patterns, careers pages, team pages, and tech sophistication |
| Lead Scoring | Scores each lead 0-100 against your Ideal Customer Profile (ICP) |
How It Works
Input: ["stripe.com", "hubspot.com", "shopify.com"]
-> Crawl each domain in parallel (homepage + /about + /contact + /careers)
-> Extract Contacts + Detect Tech Stack
-> Classify Industry + Estimate Size
-> Score vs ICP
-> Structured Output (sorted by score)
- Crawl: Fetches the homepage plus up to 10 subpages (/about, /contact, /careers, /pricing, /team, etc.)
- Extract: Pulls emails (from text + mailto: links), phones (from tel: links), and social profiles from anchor tags
- Detect: Matches 50+ technology signatures in HTML and HTTP headers (React, WordPress, HubSpot, Stripe, AWS, etc.)
- Classify: Determines the company's industry using a hybrid approach:
  - AI mode (default): Runs a zero-shot classification model (distilbert-base-uncased-mnli via transformers.js) locally, so no API keys are needed
  - Keyword mode: Fast regex-based matching against 20+ industry keyword dictionaries
  - Results from both methods are cross-validated to improve accuracy
- Estimate: Determines company size from employee count patterns, job listings, team profiles, and tech stack sophistication
- Score: Calculates a 0-100 lead score based on how well the company matches your ICP criteria
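The Extract step can be illustrated with a minimal sketch. This is not the Actor's actual code: the regex, the `ContactParser` class, and the `extract_contacts` helper are assumptions made for illustration. It follows the rules described above: emails come from visible text and mailto: links, phones come only from tel: links.

```python
import re
from html.parser import HTMLParser
from urllib.parse import unquote

# Deliberately simple email pattern (an assumption, not the Actor's validator).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class ContactParser(HTMLParser):
    """Collects emails (text + mailto: links) and phones (tel: links only)."""

    def __init__(self):
        super().__init__()
        self.emails = set()
        self.phones = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        if href.lower().startswith("mailto:"):
            # Drop any ?subject=... query and decode percent-escapes.
            address = unquote(href[7:].split("?", 1)[0]).strip().lower()
            if EMAIL_RE.fullmatch(address):
                self.emails.add(address)
        elif href.lower().startswith("tel:"):
            self.phones.add(unquote(href[4:]).strip())

    def handle_data(self, data):
        # Emails that appear in visible page text.
        for match in EMAIL_RE.findall(data):
            self.emails.add(match.lower())


def extract_contacts(html: str) -> dict:
    parser = ContactParser()
    parser.feed(html)
    parser.close()
    # Sets deduplicate; phones are capped at 10, as in the output reference.
    return {"emails": sorted(parser.emails), "phones": sorted(parser.phones)[:10]}
```

Restricting phones to tel: links trades recall for precision: free-text phone patterns tend to match dates, prices, and IDs.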
Input
| Field | Type | Default | Description |
|---|---|---|---|
| urls | string[] | (required) | Website URLs or domains to analyze (e.g., stripe.com, https://hubspot.com) |
| maxPagesPerDomain | integer | 5 | Max subpages to crawl per domain (1-10) |
| concurrency | integer | 3 | Number of domains to process in parallel (1-10) |
| enableAiClassification | boolean | true | Use AI model for industry classification. Disable for faster keyword-only mode |
| targetIndustries | string[] | [] | ICP: Target industries for lead scoring (e.g., ["SaaS", "E-commerce"]) |
| targetSizeMin | integer | 10 | ICP: Minimum preferred employee count |
| targetSizeMax | integer | 500 | ICP: Maximum preferred employee count |
| targetTechnologies | string[] | [] | ICP: Target technologies (e.g., ["React", "HubSpot"]) |
| requiredContactTypes | string[] | ["email"] | ICP: Required contact types: email, phone, linkedin, twitter |
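If you build the input programmatically, a small helper can apply the defaults and catch out-of-range values before a run starts. The sketch below is an assumption-laden illustration: `build_input` is a hypothetical helper, and the check that targetSizeMin does not exceed targetSizeMax is our own sanity rule, not documented Actor behavior.

```python
import copy
from typing import Any

# Defaults copied from the input table above.
DEFAULTS: dict[str, Any] = {
    "maxPagesPerDomain": 5,
    "concurrency": 3,
    "enableAiClassification": True,
    "targetIndustries": [],
    "targetSizeMin": 10,
    "targetSizeMax": 500,
    "targetTechnologies": [],
    "requiredContactTypes": ["email"],
}


def build_input(urls: list[str], **overrides: Any) -> dict[str, Any]:
    """Merge overrides into the documented defaults and validate ranges."""
    if not urls:
        raise ValueError("urls is required and must not be empty")
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown input fields: {sorted(unknown)}")
    # Deep copy so callers never share the mutable default lists.
    run_input = {"urls": list(urls), **copy.deepcopy(DEFAULTS), **overrides}
    for field in ("maxPagesPerDomain", "concurrency"):
        if not 1 <= run_input[field] <= 10:
            raise ValueError(f"{field} must be between 1 and 10")
    # Assumption: a min above the max is treated as a caller error.
    if run_input["targetSizeMin"] > run_input["targetSizeMax"]:
        raise ValueError("targetSizeMin must not exceed targetSizeMax")
    return run_input
```

For example, `build_input(["stripe.com"], targetSizeMin=50, targetSizeMax=1000)` returns a complete input object with every other field left at its default.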
Example Input
{
  "urls": ["stripe.com", "hubspot.com", "shopify.com", "coursera.org"],
  "enableAiClassification": true,
  "maxPagesPerDomain": 5,
  "concurrency": 3,
  "targetIndustries": ["SaaS and Software"],
  "targetSizeMin": 50,
  "targetSizeMax": 1000,
  "targetTechnologies": ["React", "Node.js"],
  "requiredContactTypes": ["email"]
}
Output
Each lead is returned as a structured JSON object with the following fields:
{
  "domain": "stripe.com",
  "url": "https://stripe.com",
  "title": "Stripe | Financial Infrastructure for the Internet",
  "description": "Stripe powers online and in-person payment processing...",
  "emails": ["sales@stripe.com"],
  "phones": ["+1-888-926-2289"],
  "socialLinks": {
    "linkedin": "https://www.linkedin.com/company/stripe",
    "twitter": "https://twitter.com/stripe",
    "facebook": "https://www.facebook.com/StripeHQ",
    "instagram": null,
    "github": "https://github.com/stripe",
    "youtube": null
  },
  "industry": {
    "primary": "SaaS and Software",
    "secondary": "E-commerce and Retail",
    "confidence": 0.485,
    "allScores": [
      {"label": "SaaS and Software", "score": 0.485},
      {"label": "E-commerce and Retail", "score": 0.389}
    ],
    "method": "keyword"
  },
  "companySize": {
    "estimate": "enterprise",
    "employeeRange": "1000+",
    "confidence": 0.25,
    "signals": ["Enterprise-level language detected"]
  },
  "technologies": ["React", "Next.js", "AWS", "Vercel", "Fastly", "Bootstrap", "Stripe"],
  "techCategories": {
    "frameworks": ["React", "Next.js"],
    "cms": [],
    "analytics": [],
    "marketing": [],
    "infrastructure": ["AWS", "Vercel", "Fastly"],
    "libraries": ["Bootstrap"],
    "ecommerce": ["Stripe"],
    "payments": []
  },
  "leadScore": 34,
  "scoreBreakdown": {
    "industryMatch": 15,
    "sizeMatch": 0,
    "techMatch": 10,
    "contactQuality": 0,
    "webPresence": 9
  },
  "pagesAnalyzed": 3,
  "analyzedAt": "2026-03-08T08:03:53.138Z"
}
Output Field Reference
| Field | Type | Description |
|---|---|---|
| domain | string | Normalized domain name |
| url | string | Full URL of the homepage |
| title | string | Page title from <title> tag |
| description | string | Meta description |
| emails | string[] | Extracted email addresses (deduplicated, validated) |
| phones | string[] | Phone numbers from tel: links (max 10, deduplicated) |
| socialLinks | object | Social media profile URLs (LinkedIn, Twitter/X, Facebook, Instagram, GitHub, YouTube) |
| industry.primary | string | Top industry classification |
| industry.secondary | string or null | Second-best industry (if confidence > threshold) |
| industry.confidence | number | Classification confidence (0-1) |
| industry.method | string | "ai" or "keyword": which method produced the final result |
| companySize.estimate | string | Size category: micro, small, medium, large, enterprise, or unknown |
| companySize.employeeRange | string | Human-readable range (e.g., "51-200") |
| companySize.signals | string[] | Evidence used for estimation |
| technologies | string[] | All detected technologies |
| techCategories | object | Technologies grouped by category |
| leadScore | integer | ICP match score (0-100), higher = better fit |
| scoreBreakdown | object | Score components: industryMatch, sizeMatch, techMatch, contactQuality, webPresence |
| pagesAnalyzed | integer | Number of pages successfully crawled |
| analyzedAt | string | ISO 8601 timestamp |
Supported Industries
The classifier recognizes 20+ industries:
| Industry | Example Companies |
|---|---|
| SaaS and Software | Stripe, HubSpot, GitHub, Salesforce |
| E-commerce and Retail | Shopify, Amazon, Etsy |
| Marketing and Advertising | Creative agencies, ad networks |
| Finance and Payments | Banks, investment platforms, fintech |
| Healthcare and Biotech | Hospitals, pharma, telehealth |
| Education and Training | Coursera, universities, edtech |
| Manufacturing and Industrial | Factories, industrial equipment |
| Consulting and Professional Services | McKinsey, Deloitte |
| Media and Entertainment | Streaming, publishing, podcasts |
| Travel & Hospitality | Airbnb, hotels, tourism |
| Real Estate | Property management, mortgage |
| Food & Beverage | Restaurants, food delivery |
| Telecommunications | Telecom, ISPs |
| Energy | Solar, oil & gas, utilities |
| Non-Profit | Charities, foundations, NGOs |
| Legal | Law firms, legal services |
| Construction | Contractors, architecture firms |
| Transportation & Logistics | Shipping, freight, fleet management |
| Cybersecurity | InfoSec, threat detection |
| AI & Machine Learning | AI research, ML platforms |
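To give a feel for keyword mode, here is a rough sketch of regex-based matching against per-industry keyword dictionaries. Everything in it is illustrative: the three tiny dictionaries, the `classify_by_keywords` function, and the choice to normalize hit counts into scores are assumptions, since the Actor's real dictionaries and scoring are not documented here.

```python
import re

# Hypothetical, heavily shortened dictionaries (the Actor covers 20+ industries).
INDUSTRY_KEYWORDS = {
    "SaaS and Software": ["saas", "software", "api", "platform"],
    "E-commerce and Retail": ["shop", "cart", "checkout", "store"],
    "Legal": ["law firm", "attorney", "litigation"],
}


def classify_by_keywords(text: str) -> dict:
    """Count whole-word keyword hits per industry and rank the industries."""
    lowered = text.lower()
    counts = {}
    for label, keywords in INDUSTRY_KEYWORDS.items():
        hits = sum(
            len(re.findall(r"\b" + re.escape(keyword) + r"\b", lowered))
            for keyword in keywords
        )
        if hits:
            counts[label] = hits
    total = sum(counts.values())
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    # Assumption: a score is the industry's share of all keyword hits.
    all_scores = [
        {"label": label, "score": round(hits / total, 3)} for label, hits in ranked
    ]
    return {
        "primary": all_scores[0]["label"] if all_scores else None,
        "secondary": all_scores[1]["label"] if len(all_scores) > 1 else None,
        "confidence": all_scores[0]["score"] if all_scores else 0.0,
        "allScores": all_scores,
        "method": "keyword",
    }
```

The returned shape mirrors the industry object in the sample output, which makes it easy to compare a keyword result with an AI result for the same page.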
Detected Technologies (50+)
| Category | Technologies |
|---|---|
| Frameworks | React, Next.js, Vue.js, Nuxt.js, Angular, Svelte, Gatsby, Remix |
| CMS | WordPress, Shopify, Squarespace, Wix, Webflow, Drupal, Ghost, Contentful |
| Analytics | Google Analytics, Google Tag Manager, Facebook Pixel, Hotjar, Mixpanel, Segment, Amplitude, Plausible |
| Marketing | HubSpot, Mailchimp, Intercom, Drift, Zendesk, Crisp, Salesforce, Marketo, ActiveCampaign |
| Infrastructure | Cloudflare, AWS, Vercel, Netlify, Heroku, Google Cloud, Azure, Fastly |
| Libraries | jQuery, Bootstrap, Tailwind CSS, Material UI, Font Awesome, GSAP, Three.js, D3.js |
| E-commerce | Stripe, PayPal, WooCommerce, BigCommerce, Magento |
| Payments | Stripe Payments, PayPal Checkout, Square, Braintree |
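Signature matching of this kind can be sketched in a few lines. The four signatures below are commonly seen fingerprints chosen for illustration, not the Actor's actual rule set, and `detect_technologies` is a hypothetical helper name.

```python
import re

# Illustrative signatures only: each entry has a category plus an HTML
# pattern and/or a (header name, pattern) pair.
SIGNATURES = {
    "WordPress": {"category": "cms", "html": r"/wp-content/|/wp-includes/"},
    "Next.js": {"category": "frameworks", "html": r"/_next/static/|__NEXT_DATA__"},
    "Google Tag Manager": {
        "category": "analytics",
        "html": r"googletagmanager\.com/gtm\.js",
    },
    "Cloudflare": {"category": "infrastructure", "header": ("server", r"cloudflare")},
}


def detect_technologies(html: str, headers: dict) -> dict:
    """Match signatures against page HTML and HTTP response headers."""
    # Header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}
    technologies = []
    categories = {}
    for name, signature in SIGNATURES.items():
        matched = False
        if "html" in signature:
            matched = bool(re.search(signature["html"], html, re.IGNORECASE))
        if not matched and "header" in signature:
            header_name, pattern = signature["header"]
            value = normalized.get(header_name, "")
            matched = bool(re.search(pattern, value, re.IGNORECASE))
        if matched:
            technologies.append(name)
            categories.setdefault(signature["category"], []).append(name)
    return {"technologies": technologies, "techCategories": categories}
```

Unlike the Actor's output, this sketch only lists categories that have at least one match; the sample output above includes empty categories as empty arrays.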
Lead Score Breakdown
The lead score (0-100) is composed of five weighted dimensions:
| Dimension | Max Points | What It Measures |
|---|---|---|
| Industry Match | 30 | How well the company's industry matches your target industries |
| Size Match | 20 | How close the company size is to your target range |
| Tech Match | 20 | How many of your target technologies the company uses |
| Contact Quality | 15 | Availability of emails, phones, and social profiles |
| Web Presence | 15 | Tech sophistication, analytics usage, marketing tools, active website |
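As a quick consistency check, the five components in scoreBreakdown should each stay within the caps above and add up to leadScore. The sketch below assumes the breakdown keys map to the table rows in the obvious way; `total_score` is a hypothetical helper, and the sample values are taken from the Stripe example output.

```python
# Maximum points per dimension, from the table above.
MAX_POINTS = {
    "industryMatch": 30,
    "sizeMatch": 20,
    "techMatch": 20,
    "contactQuality": 15,
    "webPresence": 15,
}


def total_score(breakdown: dict) -> int:
    """Sum the components after checking each one against its cap."""
    total = 0
    for dimension, cap in MAX_POINTS.items():
        points = breakdown.get(dimension, 0)
        if not 0 <= points <= cap:
            raise ValueError(f"{dimension}={points} is outside 0-{cap}")
        total += points
    return total


# scoreBreakdown from the sample output for stripe.com.
stripe_breakdown = {
    "industryMatch": 15,
    "sizeMatch": 0,
    "techMatch": 10,
    "contactQuality": 0,
    "webPresence": 9,
}
print(total_score(stripe_breakdown))  # 34, matching "leadScore": 34
```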
Company Size Categories
| Category | Employee Range | Midpoint |
|---|---|---|
| Micro | 1-10 | 5 |
| Small | 11-50 | 30 |
| Medium | 51-200 | 125 |
| Large | 201-1000 | 600 |
| Enterprise | 1000+ | 2000 |
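If you already have an employee count from another source and want to bucket it the same way, a small lookup is enough. This is a sketch: `size_category` is a hypothetical helper, and because the table lists 1000 under both Large and Enterprise, it assumes that exactly 1000 employees counts as large.

```python
# (category, lowest count, highest count), from the table above.
SIZE_CATEGORIES = [
    ("micro", 1, 10),
    ("small", 11, 50),
    ("medium", 51, 200),
    ("large", 201, 1000),
]


def size_category(employees):
    """Map an employee count to a size category name."""
    if employees is None or employees < 1:
        return "unknown"
    for name, low, high in SIZE_CATEGORIES:
        if low <= employees <= high:
            return name
    # Assumption: anything above 1000 is enterprise.
    return "enterprise"
```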
Use Cases
- Sales Prospecting: Score and prioritize leads from cold outreach lists
- Market Research: Map competitors' tech stacks and company sizes at scale
- Lead Qualification: Auto-filter leads that match your ICP before CRM import
- Data Enrichment: Append industry, size, and tech data to existing lead lists
- Competitive Intelligence: Monitor which technologies competitors are adopting
Integration Tips
Chain with Other Scrapers
Use this Actor as a post-processing step after collecting domains from:
- Google Maps Email Extractor
- LinkedIn Company Scraper
- Any domain list scraper
Export to CRM
Results can be exported as JSON, CSV, or Excel directly from the Apify dataset. Key fields map naturally to CRM fields:
- emails → Contact Email
- phones → Contact Phone
- industry.primary → Company Industry
- companySize.employeeRange → Company Size
- leadScore → Lead Score / Priority
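The field mapping above can be applied with a short script once you have downloaded the dataset items as JSON. This is a sketch under assumptions: `to_crm_row` and `leads_to_csv` are hypothetical helpers, the column names are the CRM labels listed above, and taking only the first email and phone is our own simplification.

```python
import csv
import io

CRM_COLUMNS = [
    "Contact Email",
    "Contact Phone",
    "Company Industry",
    "Company Size",
    "Lead Score",
]


def to_crm_row(lead: dict) -> dict:
    """Flatten one lead object into CRM-style columns."""
    emails = lead.get("emails") or []
    phones = lead.get("phones") or []
    return {
        # Assumption: the CRM takes a single email and phone per company.
        "Contact Email": emails[0] if emails else "",
        "Contact Phone": phones[0] if phones else "",
        "Company Industry": (lead.get("industry") or {}).get("primary") or "",
        "Company Size": (lead.get("companySize") or {}).get("employeeRange") or "",
        "Lead Score": lead.get("leadScore", 0),
    }


def leads_to_csv(leads: list, min_score: int = 0) -> str:
    """Keep leads at or above min_score and render them as CSV text."""
    rows = [to_crm_row(lead) for lead in leads if lead.get("leadScore", 0) >= min_score]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=CRM_COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Passing a `min_score` lets you drop weak matches before import instead of cleaning them up inside the CRM.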
Webhook Integration
Set up an Apify webhook to automatically send enriched leads to your CRM, Slack, or any HTTP endpoint when a run completes.
Performance
| Metric | Value |
|---|---|
| Speed | ~10-15 seconds per domain (with AI), ~5-8 seconds (keyword-only) |
| Memory | Recommended 4096 MB (for AI model). Minimum 1024 MB (keyword-only) |
| Concurrency | Up to 10 domains in parallel |
| AI Model | distilbert-base-uncased-mnli (~100MB, loaded once, runs locally) |
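These figures allow a back-of-the-envelope run time estimate. The sketch below assumes domains are processed in waves of `concurrency` at the per-domain speeds listed above, and it ignores the one-time model load; `estimate_runtime_seconds` is a hypothetical helper, and real runs will vary with site response times.

```python
import math

# Per-domain speed ranges in seconds, from the table above.
SECONDS_PER_DOMAIN = {"ai": (10, 15), "keyword": (5, 8)}


def estimate_runtime_seconds(domains: int, concurrency: int = 3, mode: str = "ai"):
    """Return a (low, high) estimate of total run time in seconds."""
    if not 1 <= concurrency <= 10:
        raise ValueError("concurrency must be between 1 and 10")
    low, high = SECONDS_PER_DOMAIN[mode]
    # Simplification: each wave takes as long as one domain.
    waves = math.ceil(domains / concurrency)
    return waves * low, waves * high
```

For example, 100 domains at concurrency 10 in AI mode gives 10 waves, or roughly 100 to 150 seconds.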
Limitations
- Contact extraction relies on publicly visible information (emails in HTML, tel: links, social media links)
- Company size estimation is approximate, based on text patterns, careers pages, and tech sophistication
- Some websites may block automated crawlers, resulting in fewer pages analyzed
- AI classification requires 4096 MB memory; use keyword-only mode for lower memory usage
Cost
This Actor runs on the Apify platform. Costs depend on compute usage:
- Memory: 4096 MB recommended (AI mode) or 1024 MB (keyword mode)
- Compute units: ~0.01-0.02 CU per domain analyzed
- No external API keys or additional costs required
