Pricing
from $2.00 / 1,000 results
Website Email Scraper - All Contacts
Extract emails and other contact details from websites. This Apify actor crawls pages to discover contact information, with configurable depth, proxy support, and domain filtering. Useful for content research and lead generation.
Website Email & Contact Extractor v2.1
Overview
Website Email & Contact Extractor is an Apify actor that crawls websites and extracts contact information in a clean, consistent format. It finds emails, phone numbers, social media profiles, and physical addresses, which makes it well suited to lead generation, sales outreach, local SEO, and market research.
Key Features
- Contact-First Output: Emails, phones, social profiles, and addresses in one consistent schema
- Social Platform Detection: Automatically identifies LinkedIn, Instagram, Twitter/X, TikTok, YouTube, Facebook, GitHub, Telegram, WhatsApp, Pinterest, Snapchat, and Reddit
- Cloudflare Protection Bypass: Decodes Cloudflare-obfuscated email addresses
- Text + DOM Extraction: Finds contacts in visible text, `mailto:`/`tel:` links, structured markup, and social links
- Adaptive Stealth Browser: Auto-escalates to a headless browser when pages block normal requests
- Domain Filtering: Stay on the same domain or crawl freely
- Consistent Schema: Every flat-mode result has the same 10 fields, with `null` for absent values
Use Cases
- Lead Generation: Build lists of sales prospects from company websites
- Sales Outreach: Extract decision-maker emails and LinkedIn profiles
- Local SEO: Collect NAP (Name, Address, Phone) data
- Market Research: Map social presence across competitor sites
- Recruiting: Find contact details and social profiles of teams
Input Parameters
{"startUrls":[{"url":"https://example.com"}],"mediaType":"all","maxCrawlDepth":2,"maxConcurrency":10,"maxRequestRetries":3,"maxUrlsToCrawl":100,"useProxy":{"useApifyProxy":false,"apifyProxyGroups":[],"apifyProxyCountry":""}}
Parameter Details
| Parameter | Type | Description |
|---|---|---|
| `startUrls` | Array | List of URLs where the crawler will begin |
| `mediaType` | String | Contact type: `all`, `contact`, `email`, `phone`, `social`, or `address` |
| `maxCrawlDepth` | Number | How many links deep the crawler will go |
| `maxConcurrency` | Number | Maximum parallel requests |
| `maxRequestRetries` | Number | Number of retry attempts for failed requests |
| `maxUrlsToCrawl` | Number | Maximum number of pages to process |
| `useProxy` | Object | Configuration for Apify proxy usage |
| `useStealth` | Boolean | Auto-escalate to stealth browser when blocked; auto-enables proxy if none set |
| `solveCloudflare` | Boolean | Automatically solve Cloudflare challenges |
| `includeContactText` | Boolean | Scan visible page text for contacts not wrapped in links |
| `groupByPage` | Boolean | Combine all contacts from one page into a single dataset item (default: `true`) |
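The parameters in the table can be assembled programmatically before starting a run. The following is a minimal sketch, not part of the actor: `build_input` and `ALLOWED_MEDIA_TYPES` are hypothetical helper names, and the validation rules are assumptions based only on the descriptions in the table (the actor may accept or reject other values).

```python
import json
from typing import Any

# Values listed for mediaType in the parameter table.
ALLOWED_MEDIA_TYPES = {"all", "contact", "email", "phone", "social", "address"}


def build_input(
    start_urls: list[str],
    media_type: str = "all",
    max_crawl_depth: int = 2,
    max_urls_to_crawl: int = 100,
    **extra: Any,
) -> dict[str, Any]:
    """Assemble an input dict using the parameter names from the table.

    Hypothetical helper: the checks below are assumptions, not the
    actor's own validation.
    """
    if not start_urls:
        raise ValueError("startUrls must contain at least one URL")
    if media_type not in ALLOWED_MEDIA_TYPES:
        raise ValueError(f"mediaType must be one of {sorted(ALLOWED_MEDIA_TYPES)}")
    if max_crawl_depth < 0 or max_urls_to_crawl < 1:
        raise ValueError("maxCrawlDepth must be >= 0 and maxUrlsToCrawl >= 1")

    run_input: dict[str, Any] = {
        "startUrls": [{"url": url} for url in start_urls],
        "mediaType": media_type,
        "maxCrawlDepth": max_crawl_depth,
        "maxUrlsToCrawl": max_urls_to_crawl,
    }
    # Pass any other documented parameter through unchanged,
    # e.g. useStealth=True or groupByPage=False.
    run_input.update(extra)
    return run_input


if __name__ == "__main__":
    config = build_input(["https://example.com"], media_type="email", useStealth=True)
    print(json.dumps(config, indent=2))
```

Catching a misspelled `mediaType` locally avoids spending a run on a configuration that returns nothing useful.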
Output Format
By default (groupByPage: true) the actor outputs one item per crawled page, combining all contacts found on that page. Set groupByPage: false to emit one flat item per contact instead.
Grouped output (default)
{"sourceUrl":"https://apify.com/","pageTitle":"Apify: Full-stack web scraping and data extraction platform","emails":["hello@apify.com"],"phones":["+1-234-567-8900"],"socials":{"github":[{"url":"https://github.com/apify","handle":"apify"}],"twitter":[{"url":"https://twitter.com/apify","handle":"apify"}]},"addresses":["123 Main St, Los Angeles, CA"],"foundAt":"2026-06-15T15:40:51.184Z"}
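For lead lists, the same address often appears on several pages of a site. As a rough sketch of post-processing grouped items, the hypothetical helper `unique_emails` below collects each address once and records the pages it was seen on. It assumes only the `sourceUrl` and `emails` fields shown in the example, and it tolerates a missing or `null` `emails` value because the exact representation of "no emails" in grouped mode is not documented here.

```python
def unique_emails(items: list[dict]) -> dict[str, list[str]]:
    """Map each lower-cased email to the list of pages where it appeared."""
    seen: dict[str, list[str]] = {}
    for item in items:
        # `emails` may be absent, null, or an empty list; treat all as "none".
        for email in item.get("emails") or []:
            pages = seen.setdefault(email.strip().lower(), [])
            if item["sourceUrl"] not in pages:
                pages.append(item["sourceUrl"])
    return seen


if __name__ == "__main__":
    sample = [
        {"sourceUrl": "https://apify.com/", "emails": ["hello@apify.com"]},
        {"sourceUrl": "https://apify.com/contact", "emails": ["Hello@apify.com"]},
    ]
    for email, pages in unique_emails(sample).items():
        print(email, pages)
```

Lower-casing before comparison is a design choice for deduplication; keep the original casing instead if you need to preserve addresses exactly as published.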
Flat output (groupByPage: false)
Every item uses the same 10-field schema:
{"type":"contact","contactType":"email","value":"info@example.com","url":null,"socialPlatform":null,"socialHandle":null,"sourceUrl":"https://example.com/contact","pageTitle":"Contact Us","foundBy":"mailto","foundAt":"2026-06-15T04:01:58.105Z"}
Output fields (flat mode)
| Field | Description |
|---|---|
| `type` | Always `"contact"` |
| `contactType` | `email`, `phone`, `social`, or `address` |
| `value` | The extracted contact value |
| `url` | Web URL for social profiles; `null` for other types |
| `socialPlatform` | Platform name for social items; `null` otherwise |
| `socialHandle` | Username/handle for social items; `null` otherwise |
| `sourceUrl` | Page where the contact was found |
| `pageTitle` | Title of the source page |
| `foundBy` | Detection method: `dom`, `mailto`, `tel`, `text-scan`, `cfemail` |
| `foundAt` | ISO-8601 timestamp |
Examples
Email from a mailto: link
{"type":"contact","contactType":"email","value":"info@eversquaremedical.ca","url":null,"socialPlatform":null,"socialHandle":null,"sourceUrl":"https://www.eversquaremedical.ca/","pageTitle":"Ever Square Medical","foundBy":"mailto","foundAt":"2026-06-15T04:01:58.105Z"}
Phone from a tel: link
{"type":"contact","contactType":"phone","value":"310-929-6336","url":null,"socialPlatform":null,"socialHandle":null,"sourceUrl":"https://www.conciergehealthcarepartnersinc.com/","pageTitle":"Concierge Healthcare Partners","foundBy":"tel","foundAt":"2026-06-15T04:01:58.084Z"}
Social profile
{"type":"contact","contactType":"social","value":"https://www.instagram.com/example","url":"https://www.instagram.com/example","socialPlatform":"instagram","socialHandle":"example","sourceUrl":"https://example.com/about","pageTitle":"About Us","foundBy":"dom","foundAt":"2026-06-15T04:01:58.200Z"}
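If a run was made with `groupByPage: false` and a per-page view is needed later, flat items can be regrouped client-side. The sketch below approximates the grouped shape using only the fields documented for flat mode. `group_by_page` and `PLURAL_KEYS` are hypothetical names, and the result omits `foundAt`, so it is not guaranteed to match the actor's own grouped output exactly.

```python
# Flat contactType -> list key used in the grouped shape.
PLURAL_KEYS = {"email": "emails", "phone": "phones", "address": "addresses"}


def group_by_page(flat_items: list[dict]) -> list[dict]:
    """Combine flat contact items into one record per sourceUrl."""
    pages: dict[str, dict] = {}
    for item in flat_items:
        source = item["sourceUrl"]
        if source not in pages:
            pages[source] = {
                "sourceUrl": source,
                "pageTitle": item.get("pageTitle"),
                "emails": [],
                "phones": [],
                "socials": {},
                "addresses": [],
            }
        page = pages[source]
        kind = item["contactType"]
        if kind == "social":
            # Socials are keyed by platform, each with url and handle.
            platform = item.get("socialPlatform") or "unknown"
            page["socials"].setdefault(platform, []).append(
                {"url": item.get("url"), "handle": item.get("socialHandle")}
            )
        elif kind in PLURAL_KEYS:
            page[PLURAL_KEYS[kind]].append(item["value"])
    return list(pages.values())
```

Records come back in the order their pages first appear in the input, since Python dictionaries preserve insertion order.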
Best Practices
- Start Small: Begin with a low `maxUrlsToCrawl` value to test results
- Use Stealth for Protected Sites: Enable `useStealth` and `solveCloudflare` for Cloudflare-protected sites. Stealth auto-enables an Apify datacenter proxy if you do not provide one. The actor rotates datacenter IPs on blocks and only escalates to expensive residential proxies after repeated consecutive blocks on the same domain.
- Optimize Depth: Most contact info is found within 1-2 levels of crawl depth
- Target Specific Contact Types: Use `mediaType` to focus on emails, phones, or socials
- Respect Websites: Use reasonable `maxConcurrency` values to avoid overloading sites
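The escalation behaviour described above (stay on datacenter proxies, move to residential only after repeated consecutive blocks on one domain) can be pictured as a per-domain counter. This is an illustrative sketch only, not the actor's code: the class name `ProxyEscalator`, the tier labels, and the default threshold of 3 are all assumptions, since the actual threshold is not documented.

```python
from collections import defaultdict
from urllib.parse import urlparse


class ProxyEscalator:
    """Track consecutive blocks per domain and pick a proxy tier.

    Hypothetical model of the documented behaviour; the threshold
    value is an assumption.
    """

    def __init__(self, residential_after: int = 3) -> None:
        self.residential_after = residential_after
        self._consecutive_blocks: dict[str, int] = defaultdict(int)

    def record(self, url: str, blocked: bool) -> None:
        """Update the block streak for the URL's domain."""
        domain = urlparse(url).netloc
        if blocked:
            self._consecutive_blocks[domain] += 1
        else:
            # Any successful response resets the streak.
            self._consecutive_blocks[domain] = 0

    def tier_for(self, url: str) -> str:
        """Return 'residential' only after enough consecutive blocks."""
        domain = urlparse(url).netloc
        if self._consecutive_blocks[domain] >= self.residential_after:
            return "residential"
        return "datacenter"
```

Counting per domain rather than globally means one heavily protected site does not push the whole crawl onto the more expensive tier.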
Examples
Extract emails only
{"startUrls":[{"url":"https://company.com"}],"mediaType":"email","maxCrawlDepth":2,"maxUrlsToCrawl":50}
Extract all contact types
{"startUrls":[{"url":"https://company.com"}],"mediaType":"all","maxCrawlDepth":2,"maxUrlsToCrawl":100,"includeContactText":true}
Collect social media profiles
{"startUrls":[{"url":"https://company.com"}],"mediaType":"social","maxCrawlDepth":1,"maxUrlsToCrawl":50}
Technical Implementation
The actor uses multiple extraction strategies:
- DOM Selectors: `mailto:`, `tel:`, social links, and structured markup
- Text Scanning: Regex over visible page text
- Cloudflare Decode: Reverses `data-cfemail` obfuscation
- Adaptive Escalation: Rotates datacenter IPs on blocks; only falls back to residential stealth when a domain repeatedly fails with datacenter proxies
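To make the Cloudflare decode step concrete: in the widely documented `data-cfemail` scheme, the attribute holds a hex string whose first byte is an XOR key and whose remaining bytes are the email XORed with that key. The sketch below implements that scheme under this assumption. It is not taken from the actor's source, and `encode_cfemail` is a hypothetical helper included only so the round trip can be checked locally.

```python
def decode_cfemail(encoded: str) -> str:
    """Decode a data-cfemail hex string (first byte is the XOR key)."""
    data = bytes.fromhex(encoded)
    if len(data) < 2:
        raise ValueError("encoded value is too short to contain a key and payload")
    key, payload = data[0], data[1:]
    return bytes(byte ^ key for byte in payload).decode("utf-8")


def encode_cfemail(email: str, key: int = 0x5A) -> str:
    """Inverse of decode_cfemail, for demonstration and testing only."""
    raw = email.encode("utf-8")
    return bytes([key] + [byte ^ key for byte in raw]).hex()


if __name__ == "__main__":
    token = encode_cfemail("info@example.com")
    print(token)
    print(decode_cfemail(token))
```

Because XOR is its own inverse, the same key byte both hides and recovers the address, which is why no secret beyond the attribute value is needed.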
Performance Considerations
- Processing speed depends on website complexity and response times
- Typical extraction rates: 5-10 pages per second without proxy, 2-5 pages per second with proxy
- Memory usage scales with concurrency and page complexity
- Memory usage scales with concurrency and page complexity
- The actor uses datacenter proxies by default and escalates to residential proxies only when necessary, keeping costs low for most contact-extraction tasks
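The typical rates quoted above translate into a simple back-of-the-envelope estimate. The sketch below is illustrative arithmetic only: `estimate_run` is a hypothetical helper, the rates are the typical figures from this section rather than guarantees, and the cost uses the listed starting price of $2.00 per 1,000 results as a lower bound.

```python
def estimate_run(
    pages: int,
    with_proxy: bool,
    results: int,
    price_per_1000: float = 2.00,
) -> dict[str, float]:
    """Estimate crawl duration bounds and minimum cost.

    Rates are the typical pages-per-second ranges quoted in this
    section; real runs depend on the target site.
    """
    low_rate, high_rate = (2.0, 5.0) if with_proxy else (5.0, 10.0)
    return {
        "fastest_seconds": pages / high_rate,
        "slowest_seconds": pages / low_rate,
        "min_cost_usd": results / 1000 * price_per_1000,
    }


if __name__ == "__main__":
    # 100 pages without proxy, producing 250 results.
    print(estimate_run(pages=100, with_proxy=False, results=250))
```

For 100 pages without a proxy this gives a range of 10 to 20 seconds, and 250 results at the starting price come to at least $0.50.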
Integration Ideas
- Connect with Apify Storage for permanent dataset archiving
- Combine with Google Sheets integration for easy team collaboration
- Use with Zapier or Make to automate outreach workflows
