Company Website Enricher: B2B Lead Intelligence
Pricing
from $3.00 / 1,000 enriched leads
Extract company info, emails, phone numbers, social media profiles, and technology stack from any website. Pure HTTP scraping, no browser needed. Perfect for B2B lead enrichment, competitive intelligence, and sales prospecting.
Company Website Enricher
Extract structured company intelligence from any website using plain HTTP requests. Give it a list of domains and get back emails, phone numbers, social media profiles, technology stack, and company metadata, ready for CRM import, lead scoring, or competitive analysis.
What data do you get?
For each domain, the actor crawls the homepage and key subpages (/about, /contact, /team) and returns:
| Field | Description | Example |
|---|---|---|
| companyName | Company name from structured data, Open Graph, or title tag | Apify |
| description | Company description from meta tags | Thousands of tools to automate your business... |
| emails | Email addresses found on the website | ["hello@apify.com"] |
| phoneNumbers | Phone numbers from tel: links and visible text | ["+1 (555) 123-4567"] |
| socialProfiles | LinkedIn, X/Twitter, Facebook, Instagram, YouTube, GitHub | {"linkedin": "https://linkedin.com/company/apify", ...} |
| techStack | Technologies detected from script sources, meta tags, and HTTP headers | ["Next.js", "HubSpot", "Google Tag Manager"] |
| logoUrl | Company logo or Open Graph image URL | https://apify.com/img/og/landing.png |
| language | Page language from HTML lang attribute | en |
| pagesCrawled | Number of pages analyzed for this domain | 3 |
Use cases
- B2B Sales Prospecting: Enrich your lead lists with emails, phone numbers, and social profiles before outreach
- Competitive Intelligence: Discover what technologies your competitors use (CMS, analytics, marketing tools)
- Market Research: Profile hundreds of companies in a target market to identify technology trends
- CRM Enrichment: Bulk-enrich your CRM contacts with missing company data
- Lead Scoring: Use tech stack and social presence as signals for lead qualification
- Agency Pitching: Identify prospects using outdated technology or missing key tools
How it works
- You provide a list of company domains (e.g., apify.com, stripe.com)
- The actor fetches each website's homepage via HTTP (no browser, so it is fast and cheap)
- It discovers and crawls relevant subpages (/about, /contact, /team, etc.)
- Data is extracted from HTML structure, meta tags, HTTP headers, and visible text
- Results are deduplicated, merged across pages, and pushed to the dataset
No browser rendering, no JavaScript execution: pure HTTP requests with Cheerio parsing. This makes it fast, lightweight, and cost-effective.
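The steps above can be sketched in a few lines. This is a minimal illustration, not the actor's code: the function names `candidate_urls` and `merge_pages` are hypothetical, the subpage list contains only the three paths named in this README, and the actual fetching and parsing (done with Cheerio in the actor) is left out.

```python
# Hypothetical sketch of the crawl plan and the merge step described above.
SUBPAGE_PATHS = ["/about", "/contact", "/team"]  # paths named in this README


def candidate_urls(domain: str, max_pages: int = 5) -> list[str]:
    """Homepage first, then the known subpages, capped at max_pages."""
    base = f"https://{domain}"
    urls = [base + "/"] + [base + path for path in SUBPAGE_PATHS]
    return urls[:max_pages]


def merge_pages(pages: list[dict]) -> dict:
    """Merge per-page findings, keeping first-seen order and dropping duplicates."""
    merged = {"emails": [], "phoneNumbers": [], "techStack": []}
    for page in pages:
        for key in ("emails", "phoneNumbers", "techStack"):
            for value in page.get(key, []):
                if value not in merged[key]:
                    merged[key].append(value)
    merged["pagesCrawled"] = len(pages)
    return merged
```

Keeping the merge separate from the fetch makes the deduplication easy to test with plain dictionaries.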
Technologies detected
The actor identifies 40+ technologies across these categories:
| Category | Examples |
|---|---|
| CMS | WordPress, Shopify, Wix, Squarespace, Webflow, Drupal, Joomla, Ghost, Magento |
| Frameworks | Next.js, Nuxt.js, Angular, Vue.js, Gatsby, Docusaurus |
| Analytics | Google Analytics, Google Tag Manager, Segment, Mixpanel, Amplitude, Hotjar |
| Marketing | HubSpot, Mailchimp, Optimizely, LaunchDarkly |
| Support | Intercom, Zendesk, Drift, Crisp, LiveChat |
| Payments | Stripe, PayPal |
| Infrastructure | Cloudflare, Cloudinary, Imgix, Algolia, Sentry, Recaptcha |
| JS Libraries | jQuery, Bootstrap, Tailwind CSS, Font Awesome |
| Servers | Nginx, Apache, Microsoft IIS (from HTTP headers) |
Detection uses structural indicators (script/link URLs, specific DOM markers) rather than keyword matching, which greatly reduces false positives: a page that merely mentions a product in its text is not reported as using it.
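To make the difference between structural detection and keyword matching concrete, here is a small sketch using Python's standard `html.parser`. The signature tables are illustrative assumptions, not the actor's real rules, and `detect_tech` is a hypothetical name. Only asset URLs and the `Server` header are inspected, so visible text never triggers a match.

```python
from html.parser import HTMLParser

# Assumed, illustrative signatures: technology -> substring of a script/link URL.
URL_SIGNATURES = {
    "Google Tag Manager": "googletagmanager.com/gtm.js",
    "jQuery": "jquery",
    "Stripe": "js.stripe.com",
}
# Assumed mapping from a lowercase Server header prefix to a display name.
SERVER_SIGNATURES = {"nginx": "Nginx", "apache": "Apache", "microsoft-iis": "Microsoft IIS"}


class AssetCollector(HTMLParser):
    """Collect URLs from <script src> and <link href>; page text is ignored."""

    def __init__(self) -> None:
        super().__init__()
        self.asset_urls: list[str] = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "script" and attr_map.get("src"):
            self.asset_urls.append(attr_map["src"].lower())
        elif tag == "link" and attr_map.get("href"):
            self.asset_urls.append(attr_map["href"].lower())


def detect_tech(html: str, headers: dict[str, str]) -> list[str]:
    """Return a sorted list of technologies matched by URL or Server header."""
    collector = AssetCollector()
    collector.feed(html)
    found = {
        name
        for name, needle in URL_SIGNATURES.items()
        if any(needle in url for url in collector.asset_urls)
    }
    server = headers.get("server", "").lower()
    for prefix, name in SERVER_SIGNATURES.items():
        if server.startswith(prefix):
            found.add(name)
    return sorted(found)
```

In this sketch a paragraph such as "We accept Stripe payments" does not report Stripe, because no script or link URL points at a Stripe host.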
Input
| Parameter | Type | Default | Description |
|---|---|---|---|
| domains | string[] | required | List of company domains or URLs to enrich. Examples: "apify.com", "https://stripe.com" |
| maxPagesPerDomain | integer | 5 | Maximum pages to crawl per domain (homepage + subpages). More pages = more data but slower |
| extractEmails | boolean | true | Extract email addresses |
| extractPhones | boolean | true | Extract phone numbers |
| extractSocialLinks | boolean | true | Extract social media profile links |
| detectTechStack | boolean | true | Detect website technologies |
| proxyConfiguration | object | none | Proxy settings for the crawler |
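Because `domains` accepts both bare domains and full URLs, it can help to normalize a list before submitting it, so the same company is not enriched twice. The sketch below is a client-side helper under stated assumptions: `prepare_domains` is a hypothetical name, and stripping a leading `www.` is a choice made here, not documented behavior of the actor.

```python
from urllib.parse import urlparse


def normalize_domain(raw: str) -> str:
    """Reduce 'apify.com' or 'https://www.apify.com/about' to a bare host."""
    candidate = raw.strip().lower()
    if "://" not in candidate:
        candidate = "https://" + candidate  # urlparse needs a scheme to find the host
    host = urlparse(candidate).hostname or ""
    return host[4:] if host.startswith("www.") else host


def prepare_domains(domains: list[str]) -> list[str]:
    """Normalize each entry, drop empties and duplicates, keep input order."""
    unique: list[str] = []
    for raw in domains:
        host = normalize_domain(raw)
        if host and host not in unique:
            unique.append(host)
    return unique
```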
Example input
{"domains":["hubspot.com","zendesk.com"],"maxPagesPerDomain":5,"extractEmails":true,"extractSocialLinks":true,"detectTechStack":true}
Output
Each domain produces one result object in the dataset. Here are real results from running the actor:
Example 1: HubSpot (social profiles + global phone numbers)
{"domain":"hubspot.com","url":"https://hubspot.com","companyName":"HubSpot","description":"HubSpot's AI-powered customer platform provides the tools your business needs to grow better.","logoUrl":"https://www.hubspot.com/hubfs/HubSpot_Logos/HubSpot-Inversed-Favicon.png","emails":[],"phoneNumbers":["18884827768","+35315187500","+6569556000","+61291648000","+813-4520-9500","+4930208486000","+442073243700"],"socialProfiles":{"linkedin":"https://www.linkedin.com/company/hubspot","twitter":"https://x.com/HubSpot","facebook":"https://www.facebook.com/hubspot","instagram":"https://www.instagram.com/hubspot","youtube":"https://youtube.com/user/HubSpot","github":null},"techStack":["Cloudflare","HubSpot","jQuery"],"language":"en","pagesCrawled":5,"enrichedAt":"2026-06-26T08:54:23.158Z"}
The actor discovered HubSpot's global contact page and extracted phone numbers for offices in the US, Ireland, Singapore, Australia, Japan, Germany, and the UK, all from a single domain input.
Example 2: Zendesk (emails + tech stack)
{"domain":"zendesk.com","url":"https://zendesk.com","companyName":"Zendesk","description":"Move beyond deflection with AI agents that resolve issues end-to-end.","logoUrl":"https://d1eipm3vz40hy0.cloudfront.net/images/logos/favicons/zendesk-image.png","emails":["ask.philippines@zendesk.com","ask.thailand@zendesk.com","ask.indonesia@zendesk.com","ask.malaysia@zendesk.com","ask.gcr@zendesk.com"],"phoneNumbers":["18888519456"],"socialProfiles":{"linkedin":"https://www.linkedin.com/company/zendesk","twitter":"https://www.x.com/zendesk","facebook":"https://www.facebook.com/zendesk","instagram":"https://www.instagram.com/zendesk","youtube":null,"github":null},"techStack":["Cloudflare","Next.js","Optimizely","Zendesk"],"language":"en-US","pagesCrawled":5,"enrichedAt":"2026-06-26T08:54:44.387Z"}
The actor probed Zendesk's contact pages and found regional sales emails, detected they use Optimizely for A/B testing, and identified they run on their own Zendesk platform.
Results can be exported as JSON, CSV, Excel, XML, or accessed via the Apify API.
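For CRM import, the nested result objects usually need flattening into one row per company. The following sketch assumes the field names shown in the examples above; the column selection and the `"; "` separator are arbitrary choices for illustration, and `flatten_record` and `to_csv` are hypothetical helper names.

```python
import csv
import io

COLUMNS = ["domain", "companyName", "emails", "phoneNumbers", "linkedin", "techStack"]


def flatten_record(record: dict) -> dict:
    """Turn one result object into a flat row; lists become '; '-joined strings."""
    social = record.get("socialProfiles") or {}
    return {
        "domain": record.get("domain", ""),
        "companyName": record.get("companyName", ""),
        "emails": "; ".join(record.get("emails", [])),
        "phoneNumbers": "; ".join(record.get("phoneNumbers", [])),
        "linkedin": social.get("linkedin") or "",  # null profiles become empty cells
        "techStack": "; ".join(record.get("techStack", [])),
    }


def to_csv(records: list[dict]) -> str:
    """Render the flattened rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(flatten_record(record) for record in records)
    return buffer.getvalue()
```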
How emails are extracted
- Scans mailto: links (most reliable source)
- Pattern-matches email addresses in visible page text (script/style/SVG content is stripped first)
- Filters out noise: noreply addresses, system domains (sentry.io, schema.org, etc.), and image file extensions
- Deduplicates across all crawled pages
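The extraction rules above can be approximated with the standard library. This is a sketch under assumptions: the regular expression, the noise lists, and the names `EmailCollector` and `extract_emails` are illustrative and are not taken from the actor's source.

```python
import re
from html.parser import HTMLParser

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SKIPPED_TAGS = {"script", "style", "svg"}
NOISE_LOCAL_PREFIXES = ("noreply", "no-reply")            # assumed list
NOISE_DOMAINS = ("sentry.io", "schema.org")               # from the text above
IMAGE_SUFFIXES = (".png", ".jpg", ".jpeg", ".gif", ".svg", ".webp")


class EmailCollector(HTMLParser):
    """Gather mailto: targets and visible text, skipping script/style/SVG content."""

    def __init__(self) -> None:
        super().__init__()
        self.mailto: list[str] = []
        self.text_parts: list[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1
        href = dict(attrs).get("href") or ""
        if tag == "a" and href.lower().startswith("mailto:"):
            self.mailto.append(href[7:].split("?")[0])  # drop ?subject=... parameters

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.text_parts.append(data)


def is_noise(email: str) -> bool:
    local, _, domain = email.partition("@")
    system_domain = any(domain == d or domain.endswith("." + d) for d in NOISE_DOMAINS)
    return local.startswith(NOISE_LOCAL_PREFIXES) or system_domain or email.endswith(IMAGE_SUFFIXES)


def extract_emails(html: str) -> list[str]:
    """mailto: links first, then text matches; lowercased, filtered, deduplicated."""
    collector = EmailCollector()
    collector.feed(html)
    collector.close()
    candidates = collector.mailto + EMAIL_RE.findall(" ".join(collector.text_parts))
    result: list[str] = []
    for email in (c.strip().lower() for c in candidates):
        if EMAIL_RE.fullmatch(email) and not is_noise(email) and email not in result:
            result.append(email)
    return result
```

The image-suffix filter matters because asset names such as `logo@2x.png` match a naive email pattern.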
How social profiles are detected
- Scans all <a href> links for LinkedIn, X/Twitter, Facebook, Instagram, YouTube, and GitHub URLs
- Excludes share/intent/login links (e.g., facebook.com/sharer is ignored)
- Normalizes Twitter/X URLs to x.com
- Returns the company's actual profile, not generic platform links
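A rough sketch of these rules follows. The host table, the excluded path segments, and the function names are assumptions chosen for illustration; the actor's real exclusion list is not published in this README.

```python
from urllib.parse import urlparse

PLATFORM_HOSTS = {
    "linkedin.com": "linkedin",
    "twitter.com": "twitter",
    "x.com": "twitter",
    "facebook.com": "facebook",
    "instagram.com": "instagram",
    "youtube.com": "youtube",
    "github.com": "github",
}
# Assumed first path segments that indicate a share, intent, or login link.
EXCLUDED_FIRST_SEGMENTS = {"share", "sharer", "sharer.php", "intent", "login"}


def classify_social_link(href: str):
    """Return (platform, normalized_url) for a profile link, or None otherwise."""
    parsed = urlparse(href)
    host = (parsed.hostname or "").lower()
    bare_host = host[4:] if host.startswith("www.") else host
    platform = PLATFORM_HOSTS.get(bare_host)
    segments = [segment for segment in parsed.path.split("/") if segment]
    if platform is None or not segments:
        return None  # unknown host, or a generic link to the platform homepage
    if segments[0].lower() in EXCLUDED_FIRST_SEGMENTS:
        return None
    if bare_host == "twitter.com":
        host = host.replace("twitter.com", "x.com")
    return platform, f"https://{host}{parsed.path.rstrip('/')}"


def collect_social_profiles(hrefs: list[str]) -> dict:
    """Keep the first profile URL found per platform; missing ones stay None."""
    profiles = {name: None for name in
                ("linkedin", "twitter", "facebook", "instagram", "youtube", "github")}
    for href in hrefs:
        hit = classify_social_link(href)
        if hit and profiles[hit[0]] is None:
            profiles[hit[0]] = hit[1]
    return profiles
```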
Performance
- Speed: ~200-400ms per page (HTTP only, no browser overhead)
- Memory: 256 MB minimum, works well at default settings
- Throughput: Processes 10 domains concurrently by default
- Cost: Lightweight; uses minimal compute and no browser instances
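The "10 domains concurrently" behavior corresponds to a bounded-concurrency pattern. The sketch below shows that pattern with `asyncio`; it is not the actor's implementation (which runs on Node.js), and `enrich_all` and its `enrich_one` callback are hypothetical names.

```python
import asyncio


async def enrich_all(domains, enrich_one, concurrency: int = 10):
    """Run enrich_one(domain) for every domain, at most `concurrency` at a time."""
    semaphore = asyncio.Semaphore(concurrency)

    async def guarded(domain):
        async with semaphore:  # waits here while `concurrency` calls are in flight
            return await enrich_one(domain)

    # gather preserves input order, so results line up with `domains`.
    return await asyncio.gather(*(guarded(domain) for domain in domains))
```

With roughly 200-400 ms per page and up to 5 pages per domain, a cap like this keeps throughput high without opening hundreds of connections at once.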
Integrations
This actor works with the full Apify ecosystem:
- API: Call via REST API or Apify client libraries (JavaScript, Python)
- Scheduling: Run on a schedule to keep your company data fresh
- Webhooks: Get notified when a run finishes
- Zapier / Make / n8n: Connect to your automation workflows
- Google Sheets: Export results directly to a spreadsheet
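As an illustration of the API route, the sketch below builds a request against Apify's synchronous "run and get dataset items" endpoint using only the standard library. The actor ID is not stated in this README, so `actor_id` is a placeholder you must supply (typically in `username~actor-name` form); the function names and the timeout are assumptions.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str, actor_input: dict) -> urllib.request.Request:
    """Prepare a POST that runs the actor and returns its dataset items."""
    url = f"{API_BASE}/acts/{actor_id}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(actor_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # API token from your Apify account
        },
        method="POST",
    )


def run_enricher(actor_id: str, token: str, domains: list[str]) -> list[dict]:
    """Send the request and parse the JSON array of result objects."""
    request = build_run_request(actor_id, token, {"domains": domains})
    with urllib.request.urlopen(request, timeout=300) as response:
        return json.load(response)
```

For long domain lists, starting the run asynchronously and reading the dataset afterwards (or using the official client libraries) is likely a better fit than a single synchronous call.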
Limitations
- JavaScript-rendered content: Since this actor uses HTTP requests (no browser), it cannot extract data from websites that require JavaScript to render their content. Most company websites serve key content in the initial HTML.
- Phone numbers: Uses a conservative regex to avoid false positives. Some phone numbers in unusual formats may be missed.
- Paywalled content: Cannot access content behind login walls.
- Anti-bot protection: Some websites with aggressive bot protection (Cloudflare challenges, CAPTCHAs) may block requests. Use proxy configuration to improve success rates.
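To show what a conservative phone pattern can look like, and why some formats are missed, here is a sketch. It is an assumption, not the actor's regex: free-text matches must start with `+`, `tel:` links are trusted more, and digit counts are bounded by the 15-digit E.164 maximum. The function names are hypothetical.

```python
import re

# A '+' followed by digits and common separators, ending on a digit.
CANDIDATE_RE = re.compile(r"\+\d[\d\s().-]{6,18}\d")


def phones_from_text(text: str) -> list[str]:
    """Find '+'-prefixed numbers in visible text; bare digit runs are ignored."""
    found: list[str] = []
    for match in CANDIDATE_RE.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if 8 <= len(digits) <= 15:  # E.164 allows at most 15 digits
            number = "+" + digits
            if number not in found:
                found.append(number)
    return found


def phone_from_tel_href(href: str):
    """Normalize a tel: link to digits, keeping a leading '+' if present."""
    if not href.lower().startswith("tel:"):
        return None
    raw = href[4:].strip()
    digits = re.sub(r"\D", "", raw)
    if not 7 <= len(digits) <= 15:
        return None
    return ("+" if raw.startswith("+") else "") + digits
```

Under these rules a date such as 2024-06-26 or an order number is never reported, but a local number written without a country code in plain text is missed, which is the trade-off the phone number limitation describes.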
