Pricing
from $10.00 / 1,000 results
Website Intelligence Extractor
A powerful Apify actor that crawls websites to extract key intelligence, including emails, phone numbers, social media profiles, technology stack, SEO metadata, and structured data (JSON-LD). Ideal for lead generation, competitive analysis, marketing research, and SEO audits.
What It Extracts
| Category | Details |
|---|---|
| Emails | Email addresses found in page text and mailto: links, with junk filtering |
| Phones | Phone numbers found in page text and tel: links, with international format support |
| Social Media | Facebook, Twitter/X, LinkedIn, Instagram, YouTube, GitHub, TikTok, Reddit, Threads, Bluesky, and 15+ more platforms |
| Tech Stack | CMS (WordPress, Shopify, Webflow...), Frameworks (React, Next.js, Vue...), Analytics (GA, Mixpanel, PostHog...), Marketing tools (HubSpot, Intercom...), CDN, Hosting, Payments (60+ technologies) |
| SEO Data | Title, meta description, canonical URL, OG tags, Twitter cards, heading hierarchy, word count, image alt audit, internal/external links, and a computed SEO Score (0-100) |
| Structured Data | JSON-LD schemas (Organization, Product, Article, FAQ, etc.) |
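To make the "Emails" row more concrete, here is a minimal Python sketch of how email addresses might be collected from page text and mailto: links and then junk-filtered. The regular expressions, the `extract_emails` name, and the junk rules (`JUNK_SUFFIXES`, `JUNK_DOMAINS`) are assumptions for illustration; the actor's real matching and filtering rules are not documented here.

```python
import re
from html import unescape
from urllib.parse import unquote

# Assumed patterns; the actor's actual rules may differ.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
MAILTO_RE = re.compile(r"""href=["']mailto:([^"'?\s>]+)""", re.IGNORECASE)

# Hypothetical junk rules: asset filenames such as logo@2x.png, plus
# service domains that rarely belong to a real contact.
JUNK_SUFFIXES = (".png", ".jpg", ".jpeg", ".gif", ".svg", ".webp")
JUNK_DOMAINS = {"sentry.io", "wixpress.com"}


def extract_emails(html: str) -> list[str]:
    """Return unique, lowercased emails from page text and mailto: links."""
    text = unescape(html)
    # mailto: targets first, then anything in the page that looks like an email.
    candidates = [unquote(target) for target in MAILTO_RE.findall(text)]
    candidates += EMAIL_RE.findall(text)

    seen: set[str] = set()
    result: list[str] = []
    for raw in candidates:
        email = raw.strip().lower()
        if not EMAIL_RE.fullmatch(email):
            continue
        domain = email.rsplit("@", 1)[-1]
        if email.endswith(JUNK_SUFFIXES) or domain in JUNK_DOMAINS:
            continue
        if email not in seen:
            seen.add(email)
            result.append(email)
    return result
```

Lowercasing before deduplication means an address that appears both in a mailto: link and in body text is reported once.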
Quick Start
Input Example
{
  "startUrls": [{ "url": "https://example.com" }],
  "maxPages": 30,
  "maxDepth": 3,
  "extractEmails": true,
  "extractPhones": true,
  "extractSocials": true,
  "detectTechStack": true,
  "extractSEO": true,
  "extractStructuredData": true,
  "proxyConfiguration": { "useApifyProxy": true }
}
Input Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| startUrls | array | required | URLs to crawl |
| maxPages | integer | 20 | Max pages per run (1-500) |
| maxDepth | integer | 3 | Link depth to follow (0-10) |
| extractEmails | boolean | true | Find email addresses |
| extractPhones | boolean | true | Find phone numbers |
| extractSocials | boolean | true | Find social media links |
| detectTechStack | boolean | true | Identify technologies |
| extractSEO | boolean | true | Collect SEO metadata |
| extractStructuredData | boolean | true | Parse JSON-LD |
| proxyConfiguration | object | Apify proxy | Proxy settings |
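When building the input programmatically, it can help to apply the documented defaults and check the documented ranges before starting a run. The Python sketch below does that. The `normalize_input` helper is hypothetical, and rejecting out-of-range values (rather than clamping them) is an assumption; the actor itself may behave differently.

```python
from typing import Any

# Defaults and ranges copied from the parameter table above.
DEFAULTS: dict[str, Any] = {
    "maxPages": 20,
    "maxDepth": 3,
    "extractEmails": True,
    "extractPhones": True,
    "extractSocials": True,
    "detectTechStack": True,
    "extractSEO": True,
    "extractStructuredData": True,
    "proxyConfiguration": {"useApifyProxy": True},
}
LIMITS: dict[str, tuple[int, int]] = {"maxPages": (1, 500), "maxDepth": (0, 10)}


def normalize_input(raw: dict[str, Any]) -> dict[str, Any]:
    """Merge user input over the defaults and validate the numeric ranges."""
    if not raw.get("startUrls"):
        raise ValueError("startUrls is required and must not be empty")
    merged = {**DEFAULTS, **raw}
    for key, (low, high) in LIMITS.items():
        value = merged[key]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError(f"{key} must be an integer")
        if not low <= value <= high:
            raise ValueError(f"{key} must be between {low} and {high}")
    return merged
```

The resulting dictionary could then be passed as `run_input` when calling the actor through the Apify Python client, for example `client.actor(...).call(run_input=...)`, with the actor ID filled in from the Store page.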
Output Format
Per-Page Dataset Record
{
  "url": "https://example.com/about",
  "statusCode": 200,
  "crawledAt": "2025-01-15T10:30:00.000Z",
  "title": "About Us - Example Corp",
  "metaDescription": "Learn about Example Corp...",
  "seoScore": 82,
  "wordCount": 1450,
  "emails": ["hello@example.com", "careers@example.com"],
  "phones": ["+1 (555) 123-4567"],
  "socialLinks": {
    "twitter": ["https://twitter.com/examplecorp"],
    "linkedin": ["https://linkedin.com/company/example"],
    "github": ["https://github.com/example"]
  },
  "techStack": [
    { "name": "Next.js", "category": "Framework" },
    { "name": "Vercel", "category": "Hosting" },
    { "name": "Google Analytics", "category": "Analytics" },
    { "name": "Stripe", "category": "Payments" }
  ],
  "seo": {
    "title": "About Us - Example Corp",
    "titleLength": 25,
    "metaDescription": "Learn about Example Corp...",
    "metaDescriptionLength": 145,
    "canonicalUrl": "https://example.com/about",
    "language": "en",
    "openGraph": { "title": "...", "image": "..." },
    "headings": {
      "h1": ["About Example Corp"],
      "h2": ["Our Mission", "Our Team", "Contact"]
    },
    "totalImages": 12,
    "imagesWithoutAlt": 2,
    "internalLinks": 34,
    "externalLinks": 8,
    "seoScore": 82
  },
  "structuredData": [
    { "@type": "Organization", "name": "Example Corp", "url": "https://example.com" }
  ]
}
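Records in this shape are straightforward to post-process. As one possible approach, the Python sketch below flattens per-page records into CSV rows for a spreadsheet or CRM import. The `records_to_csv` helper and the chosen columns are illustrative, and the code assumes that fields for disabled extractors may be missing from a record.

```python
import csv
import io
from typing import Any, Iterable


def records_to_csv(records: Iterable[dict[str, Any]]) -> str:
    """Flatten per-page records into CSV text, one row per crawled page."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["url", "seoScore", "emails", "phones"])
    for record in records:
        writer.writerow([
            record.get("url", ""),
            record.get("seoScore", ""),
            # Multi-valued fields are joined so each page stays on one row.
            "; ".join(record.get("emails", [])),
            "; ".join(record.get("phones", [])),
        ])
    return buffer.getvalue()
```

Joining multi-valued fields with "; " keeps one row per page; emitting one row per email would be an equally reasonable choice for lead lists.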
Domain Summary (Key-Value Store: DOMAIN_SUMMARY)
After crawling completes, a rolled-up summary is saved:
{
  "totalPagesCrawled": 25,
  "totalUniqueEmails": ["hello@example.com", "sales@example.com"],
  "totalUniquePhones": ["+1 (555) 123-4567"],
  "socialProfiles": {
    "twitter": ["https://twitter.com/examplecorp"],
    "linkedin": ["https://linkedin.com/company/example"]
  },
  "technologiesDetected": [
    { "name": "Next.js", "category": "Framework" },
    { "name": "Stripe", "category": "Payments" }
  ]
}
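The roll-up can be pictured as a deduplicating merge over the per-page records. The Python sketch below rebuilds a summary with the same keys from a list of records. The `summarize` and `_add_unique` helpers are hypothetical, and keeping first-seen order is an assumption; the actor's own aggregation may order or normalize values differently.

```python
from typing import Any


def _add_unique(target: list, items: list) -> None:
    """Append items not already present, preserving first-seen order."""
    for item in items:
        if item not in target:
            target.append(item)


def summarize(records: list[dict[str, Any]]) -> dict[str, Any]:
    """Roll per-page records up into a domain-level summary."""
    emails: list[str] = []
    phones: list[str] = []
    socials: dict[str, list[str]] = {}
    tech: list[dict[str, str]] = []

    for record in records:
        _add_unique(emails, record.get("emails", []))
        _add_unique(phones, record.get("phones", []))
        for platform, links in record.get("socialLinks", {}).items():
            _add_unique(socials.setdefault(platform, []), links)
        _add_unique(tech, record.get("techStack", []))

    return {
        "totalPagesCrawled": len(records),
        "totalUniqueEmails": emails,
        "totalUniquePhones": phones,
        "socialProfiles": socials,
        "technologiesDetected": tech,
    }
```

A list-based membership check is quadratic in the worst case, which is acceptable for the documented maximum of 500 pages per run.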
Use Cases
- Lead Generation: Crawl prospect websites to harvest contact emails and phone numbers
- Competitive Analysis: Discover what tech stack competitors use
- SEO Auditing: Bulk-audit SEO health across hundreds of pages
- Market Research: Map social media presence across an industry
- Sales Intelligence: Enrich CRM records with fresh website data
- Content Analysis: Extract structured data and content metrics
License
MIT. See LICENSE for details.
