URL: https://apify.com/dominic-quaiser/email-scraper

Email Scraper - Email Extractor for Lead Generation · Apify


Pricing

from $3.50 / 1,000 emails

A lightweight Apify Actor that crawls websites and extracts email addresses using plain HTTP requests. It decodes Cloudflare-protected addresses, RTL obfuscation, and common text obfuscation patterns, and returns structured data. Features include configurable crawl depth, proxy support, and anti-detection measures.

Rating

0.0

(0)

Developer

Dominic M. Quaiser

Maintained by Community

Actor stats

2

Bookmarked

104

Total users

18

Monthly active users

25 days ago

Last modified

A powerful Apify Actor designed to crawl websites and extract email addresses using several detection methods. Provide a list of starting URLs, configure crawl depth and behavior, and the Actor discovers and extracts email addresses from across each website, including those hidden behind obfuscation or Cloudflare email protection.

⚠️ Pre-Release Version: This is a release candidate. Features are complete but may contain bugs. Feedback and issue reports are welcome!

🚀 Features

  • Intelligent Email Discovery: Finds email addresses using multiple sophisticated detection methods, including:
    • Standard text pattern matching
    • Mailto link extraction
    • Cloudflare-protected emails
    • RTL (Right-to-Left) Unicode obfuscation
    • Common text obfuscation patterns
  • Configurable Crawl Depth: Control how deep the crawler follows links from your starting URLs (0-10 levels).
  • Domain-Focused or Broad Crawling: Choose to stay on the same domain or explore external links.
  • Lightweight HTTP Crawling: Fast, efficient method using HTTP requests without the overhead of a browser.
  • Anti-Detection Features: Built-in measures to avoid blocking, including user agent rotation, request delays, and robots.txt compliance.
  • Proxy Support: Integrates seamlessly with Apify's proxy service for IP rotation and avoiding rate limits.
  • Structured JSON Output: Delivers clean, well-structured data with full context about where and how each email was discovered.
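
The simpler detection methods listed above can be illustrated with a few regular expressions. This is a minimal sketch, not the Actor's actual implementation: the patterns, the `find_emails` function, and the rule that the first matching method wins are all assumptions made for illustration.

```python
import re
from html import unescape
from urllib.parse import unquote

# Deliberately simple pattern (assumption); real address syntax is broader.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Captures the address part of href="mailto:..." up to a quote or "?".
MAILTO_RE = re.compile(r'href=["\']mailto:([^"\'?]+)', re.IGNORECASE)
# "[at]" / "(at)" and "[dot]" / "(dot)" with optional surrounding spaces.
AT_RE = re.compile(r"\s*[\[\(]\s*at\s*[\]\)]\s*", re.IGNORECASE)
DOT_RE = re.compile(r"\s*[\[\(]\s*dot\s*[\]\)]\s*", re.IGNORECASE)


def find_emails(html: str) -> dict[str, str]:
    """Map each address to the first (hypothetical) method that found it."""
    found: dict[str, str] = {}

    # 1. Addresses inside mailto: links.
    for raw in MAILTO_RE.findall(html):
        for email in EMAIL_RE.findall(unquote(unescape(raw))):
            found.setdefault(email.lower(), "mailto_link")

    # 2. Addresses that appear as plain text.
    for email in EMAIL_RE.findall(html):
        found.setdefault(email.lower(), "text_standard")

    # 3. Addresses written as "name [at] domain [dot] tld".
    deobfuscated = DOT_RE.sub(".", AT_RE.sub("@", html))
    for email in EMAIL_RE.findall(deobfuscated):
        found.setdefault(email.lower(), "text_obfuscated")

    return found
```

Because `setdefault` keeps the first label, an address that appears both in a mailto link and in the page text is reported once, under the mailto method.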

📥 Input Parameters

Configure the actor's behavior using these fields in the Apify Console Input tab or via API:

| Field | Type | Description | Default | Required |
|---|---|---|---|---|
| start_urls | Array | The URLs to start crawling from. The scraper will extract emails from these pages and follow links up to the specified depth. | [{ "url": "https://www.katjes.de/" }] | Yes |
| max_depth | Integer | Maximum depth of links to follow from start URLs. 0 = only start URLs, 1 = start URLs + one level of links, etc. Range: 0-10. | 2 | No |
| stay_on_domain | Boolean | Only follow links that stay on the same domain as each start URL. When enabled, the crawler won't visit external sites. | true | No |
| max_concurrent_pages | Integer | Maximum number of pages to process simultaneously. Leave empty for auto-tuning (recommended: 50). Range: 1-100. | Auto | No |
| max_pages_per_domain | Integer | Maximum number of pages to crawl from each individual domain. Leave empty for unlimited. This limit applies separately to each domain. | 200 | No |
| max_requests_per_run | Integer | Maximum number of pages to crawl globally across all domains. Leave empty for unlimited. | Unlimited | No |
| request_delay_min | Number | Minimum delay in seconds between requests to avoid detection. Recommended: 1-2 seconds. Range: 0-60. | 1 | No |
| request_delay_max | Number | Maximum delay in seconds between requests. A random delay between min and max will be used. Range: 0-60. | 3 | No |
| respect_robots_txt | Boolean | Honor robots.txt directives including crawl delays and disallowed paths. | false | No |
| rotate_user_agents | Boolean | Use a pool of realistic user agents to appear as different users. | true | No |
| proxy_configuration | Object | Proxy settings to avoid being blocked. Apify Proxy is recommended for large crawls. | {} | No |
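
When the input is assembled in code rather than in the Console, it can help to check values against the documented ranges before starting a run. The sketch below is one way to do that. The `build_input` helper is hypothetical, and the rule that `request_delay_min` must not exceed `request_delay_max` is an assumption inferred from the "random delay between min and max" wording, not something the listing states.

```python
# Ranges copied from the input parameter list: field -> (min, max).
RANGES = {
    "max_depth": (0, 10),
    "max_concurrent_pages": (1, 100),
    "request_delay_min": (0, 60),
    "request_delay_max": (0, 60),
}


def build_input(start_urls: list[str], **overrides) -> dict:
    """Build a run input dict and reject values outside the documented ranges."""
    if not start_urls:
        raise ValueError("start_urls is required")

    # Documented defaults for the fields this sketch validates.
    run_input = {
        "start_urls": [{"url": url} for url in start_urls],
        "max_depth": 2,
        "stay_on_domain": True,
        "request_delay_min": 1,
        "request_delay_max": 3,
    }
    run_input.update(overrides)

    for field, (low, high) in RANGES.items():
        value = run_input.get(field)
        if value is not None and not low <= value <= high:
            raise ValueError(f"{field}={value} is outside {low}-{high}")

    # Assumption: a random delay "between min and max" needs min <= max.
    if run_input["request_delay_min"] > run_input["request_delay_max"]:
        raise ValueError("request_delay_min must not exceed request_delay_max")
    return run_input
```

For example, `build_input(["https://www.example-company.com"], max_depth=1)` returns a dict that can be serialized to JSON and pasted into the Input tab or sent as the run input through the API.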

📤 Output Data Structure

The actor outputs one record per unique email address found during the crawl.

Example Output

[
  {
    "email": "info@example-company.com",
    "found_on_url": "https://www.example-company.com/contact",
    "start_url": "https://www.example-company.com",
    "extraction_method": "mailto_link",
    "depth": 1
  },
  {
    "email": "support@example-company.com",
    "found_on_url": "https://www.example-company.com/about",
    "start_url": "https://www.example-company.com",
    "extraction_method": "text_standard",
    "depth": 1
  },
  {
    "email": "sales@example-company.com",
    "found_on_url": "https://www.example-company.com/impressum",
    "start_url": "https://www.example-company.com",
    "extraction_method": "cloudflare_protected",
    "depth": 2
  }
]
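
Since each record carries the `start_url` and `extraction_method`, the results are easy to summarize after a run. The following sketch groups addresses per start URL and counts records per method. The helper names are hypothetical, and the sample data is a trimmed copy of the example output above.

```python
import json
from collections import Counter, defaultdict

# Trimmed copy of the example output (fields not used here are omitted).
SAMPLE = json.loads("""
[
  {"email": "info@example-company.com",
   "start_url": "https://www.example-company.com",
   "extraction_method": "mailto_link"},
  {"email": "support@example-company.com",
   "start_url": "https://www.example-company.com",
   "extraction_method": "text_standard"},
  {"email": "sales@example-company.com",
   "start_url": "https://www.example-company.com",
   "extraction_method": "cloudflare_protected"}
]
""")


def emails_by_start_url(records: list[dict]) -> dict[str, list[str]]:
    """Collect the addresses found for each start URL, sorted alphabetically."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for record in records:
        grouped[record["start_url"]].append(record["email"])
    return {url: sorted(emails) for url, emails in grouped.items()}


def count_by_method(records: list[dict]) -> Counter:
    """Count how many records each extraction method produced."""
    return Counter(record["extraction_method"] for record in records)
```

Counting by method gives a quick sense of how much a site relies on obfuscation: a run dominated by `cloudflare_protected` records would have returned little to a scraper that only matches plain text.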

📧 Extraction Methods Explained

The actor uses multiple sophisticated techniques to find email addresses, even when websites try to hide them from bots:

| Method | Description |
|---|---|
| mailto_link | Email addresses found in standard mailto: links in the HTML. |
| text_standard | Email addresses found in plain text using standard pattern matching. |
| text_obfuscated | Email addresses that use common text obfuscation like "info [at] example [dot] com". |
| cloudflare_protected | Email addresses protected by Cloudflare's email obfuscation that are decoded from the page. |
| rtl_obfuscated | Email addresses hidden using Right-to-Left (RTL) Unicode characters to confuse simple scrapers. |
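
To make the two less familiar methods concrete, here is a rough sketch of how such decoding can work. It is not taken from the Actor's source. It assumes the widely documented Cloudflare format (a hex string whose first byte is an XOR key for the remaining bytes) and assumes RTL obfuscation means reversed text placed after a RIGHT-TO-LEFT OVERRIDE character (U+202E). The function names, including the `encode_cfemail` helper used for round-trip checks, are hypothetical.

```python
import re

RLO = "\u202e"  # RIGHT-TO-LEFT OVERRIDE
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING (ends the override)
RTL_RUN = re.compile(f"{RLO}([^{PDF}]*){PDF}?")


def decode_cfemail(encoded: str) -> str:
    """Decode a hex string where byte 0 is the XOR key for the rest."""
    data = bytes.fromhex(encoded)
    key = data[0]
    return bytes(b ^ key for b in data[1:]).decode("utf-8")


def encode_cfemail(email: str, key: int = 0x42) -> str:
    """Inverse of decode_cfemail; only here to produce test input."""
    return bytes([key] + [b ^ key for b in email.encode("utf-8")]).hex()


def decode_rtl(text: str) -> str:
    """Reverse every run of text that follows a right-to-left override."""
    return RTL_RUN.sub(lambda match: match.group(1)[::-1], text)
```

A page might store `moc.elpmaxe@ofni` after an override character so that browsers display `info@example.com` while the raw text reads backwards. `decode_rtl` restores the logical order so that an ordinary email pattern can match it.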

💡 Performance Tips

  • For small sites: Keep the default settings for optimal speed.
  • For large crawls: Use proxy rotation to avoid blocking and rate limits.
  • Memory constraints: Set max_concurrent_pages to a lower value (2-5) if running on limited memory.
  • Faster crawling: Increase max_concurrent_pages if you have sufficient resources.

🎯 Use Cases

  • Lead Generation: Build targeted contact lists for sales and marketing outreach.
  • Competitive Research: Discover contact information for companies in your industry.
  • Data Enrichment: Enhance existing company databases with email addresses.
  • Market Analysis: Gather communication channels for businesses in specific sectors or regions.
  • Recruitment: Find contact emails for potential candidates or hiring managers.
  • Partnership Development: Identify contact points for potential business partnerships.

🛠️ Maintainer


🔧 Troubleshooting

No Emails Found

  • Check if the website contains any publicly visible emails
  • Try increasing max_depth to crawl more pages
  • Verify that stay_on_domain isn't preventing you from reaching contact pages on subdomains
  • Check if the website might be blocking the scraper (try enabling proxies)

Actor Running Out of Memory

  • Decrease max_concurrent_pages to process fewer pages simultaneously
  • Use max_requests_per_run to limit the total crawl size
  • Upgrade to a larger memory tier on Apify

Getting Blocked by Websites

  • Enable proxy rotation via proxy_configuration
  • Increase request_delay_min and request_delay_max
  • Enable rotate_user_agents
  • Consider enabling respect_robots_txt to honor crawl delays
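
The blocking advice above can be applied as a set of overrides on an existing input. This is a sketch under stated assumptions: `harden_input` is a hypothetical helper, the doubling factor is arbitrary, and `{"useApifyProxy": True}` follows the usual Apify proxy input convention, which this Actor is assumed (not confirmed) to accept.

```python
import copy


def harden_input(run_input: dict, delay_factor: float = 2.0) -> dict:
    """Return a copy of run_input with more conservative anti-blocking settings."""
    hardened = copy.deepcopy(run_input)

    # Slow down, but stay within the documented 0-60 second range.
    delay_min = run_input.get("request_delay_min", 1) * delay_factor
    delay_max = run_input.get("request_delay_max", 3) * delay_factor
    hardened["request_delay_min"] = min(60, delay_min)
    hardened["request_delay_max"] = min(60, delay_max)

    hardened["rotate_user_agents"] = True
    hardened["respect_robots_txt"] = True  # also honors crawl delays
    # Assumption: the Actor accepts the standard Apify proxy input shape.
    hardened["proxy_configuration"] = {"useApifyProxy": True}
    return hardened
```

Returning a copy keeps the original input intact, so the same base configuration can be retried with progressively larger `delay_factor` values.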
