
URL: https://apify.com/odaudlegur/hitta-se-lead-scraper

Pricing

from $1.00 / 1,000 results

Hitta.se Lead Scraper

Retrieve leads from hitta.se, the easy way. This actor retrieves each business's name, address, email addresses, phone numbers, and social links.

Rating: 0.0 (0)

Developer: SLASH (Maintained by Community)

Actor stats

  • Bookmarked: 3
  • Total users: 23
  • Monthly active users: 1
  • Last modified: 22 days ago

Hitta.se Scraper

Updates in this version

  • If there is no website or it is banned/unavailable, keep all valid detail-page emails (not only generic).
  • Email slug fallback now unescapes HTML and matches both:
    • <slug>-button-email-<email>
    • button-email-<email> (broader pattern)
  • Unescape attribute values before extracting emails to handle entity-encoded addresses.
  • Pagination errors fixed
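
The slug fallback described above could look roughly like the sketch below. It is an illustration, not the actor's actual code: the function name `emails_from_slug_attrs`, the regex, and the `data-track` sample markup are assumptions. Only the `button-email-<email>` pattern and the HTML-unescape step come from the notes above.

```python
import html
import re

# Assumed pattern: matching on "button-email-" alone covers both the
# "<slug>-button-email-<email>" form and the broader "button-email-<email>" form.
SLUG_EMAIL_RE = re.compile(
    r"button-email-([A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*\.[A-Za-z]{2,})"
)

def emails_from_slug_attrs(raw_html: str) -> list[str]:
    """Unescape HTML entities, then pull addresses out of button-email slugs."""
    text = html.unescape(raw_html)  # turns "&#64;" into "@", etc.
    seen: set[str] = set()
    found: list[str] = []
    for match in SLUG_EMAIL_RE.finditer(text):
        email = match.group(1).lower()
        if email not in seen:  # de-duplicate while keeping first-seen order
            seen.add(email)
            found.append(email)
    return found

# Hypothetical markup with entity-encoded addresses.
sample = (
    '<a data-track="exempel-ab-button-email-info&#64;exempel.se">Mail</a>'
    '<a data-track="button-email-support&#64;exempel.se">Mail</a>'
)
print(emails_from_slug_attrs(sample))
```

Unescaping before matching is what lets the pattern see the `@` in entity-encoded attribute values.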

Kept (still working as intended)

  • max_results with “Nästa” pagination (25 per page).
  • Website picked from canonical/og:url/JSON-LD first, then scored anchors; bans junk.
  • website_details: one of ok | 404 | unavailable | banned | n/a.
  • Banned: hitta.dixa.help, dixa.help, biluppgifter.se, specific DNB URL.
  • Detail-page email extraction from Hitta UI + mailto + strict regex + slug fallback.
  • Same-domain mini-crawl for emails + social links when website is OK.
  • Swedish address heuristics; phone extraction; categories.
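
As a rough sketch of how the `website_details` values above might be assigned, the function below maps a URL and an HTTP status code to one of the five labels. The name `classify_website` and the status-code rules (2xx/3xx counts as ok, everything else except 404 counts as unavailable) are assumptions. The specific DNB URL is not given in this README, so it is left out.

```python
from typing import Optional
from urllib.parse import urlparse

# Banned hosts as listed above. The "specific DNB URL" is not spelled out in
# the source, so it is deliberately omitted from this sketch.
BANNED_HOSTS = {"hitta.dixa.help", "dixa.help", "biluppgifter.se"}

def classify_website(url: Optional[str], status_code: Optional[int]) -> str:
    """Return ok | 404 | unavailable | banned | n/a.

    status_code is None when the request failed (timeout, DNS error, ...).
    """
    if not url:
        return "n/a"
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    # Match the banned host itself or any of its subdomains.
    if any(host == b or host.endswith("." + b) for b in BANNED_HOSTS):
        return "banned"
    if status_code is None:
        return "unavailable"
    if status_code == 404:
        return "404"
    if 200 <= status_code < 400:  # assumption: redirects count as reachable
        return "ok"
    return "unavailable"
```

Keeping the classification separate from the HTTP request makes the rules easy to check without network access.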

Features

  • Crawls listing pages on Hitta.se, discovers company detail pages, and pushes normalized contact data.
  • Extracts from detail pages:
    • name
    • categories (best-effort)
    • phone
    • address (Swedish heuristics and JSON-LD)
    • website (structured first, then scored anchors with bans)
    • email1..N (strict parsing, de-duplicated)
    • website_details (status as above)
    • Socials if available: social_facebook, social_instagram, social_linkedin, social_x, social_youtube, social_tiktok, social_pinterest
  • If website is OK, performs a tiny same-domain crawl (configurable) to discover more emails and socials.

Input Configuration

Example input JSON:

{
"start_urls":[
{"url":"https://www.hitta.se/nacka/företag/2"}
],
"max_depth":3,
"headers":{
"User-Agent":"Mozilla/5.0 ...",
"Accept-Language":"sv-SE,sv;q=0.9,en-US;q=0.8,en;q=0.7"
},
"timeout_seconds":30,
"site_email_max_pages":3,
"max_results":0
}

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| start_urls | array | [{ "url": "https://www.hitta.se/nacka/företag/2"}] | One or more Hitta listing URLs. |
| max_depth | integer | 3 | Crawl depth from the start URLs. Listing depth is reused for pagination. |
| headers | object | See code defaults | HTTP headers for requests. |
| timeout_seconds | integer | 30 | Read timeout for HTTP requests. |
| site_email_max_pages | integer | 3 | Max pages to crawl on the same domain as the extracted website for extra contacts. |
| max_results | integer | 0 (no cap) | Limit how many detail results to push. Pagination respects 25/page with “Nästa.” |
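
When calling the actor programmatically, it can help to build the input from the documented defaults and override only what is needed. The helper below is a hypothetical convenience (`build_run_input` is not part of the actor); the field names and defaults are the ones documented above, and `headers` is left out because its defaults are not listed here.

```python
import json
from typing import Any

# Defaults as documented above. "headers" is omitted because the
# source only says "See code defaults".
DEFAULTS: dict[str, Any] = {
    "max_depth": 3,
    "timeout_seconds": 30,
    "site_email_max_pages": 3,
    "max_results": 0,  # 0 means no cap
}

def build_run_input(start_urls: list[str], **overrides: Any) -> dict[str, Any]:
    """Merge documented defaults with caller overrides into an input object."""
    if not start_urls:
        raise ValueError("start_urls must contain at least one listing URL")
    unknown = set(overrides) - set(DEFAULTS) - {"headers"}
    if unknown:
        raise ValueError(f"unknown input fields: {sorted(unknown)}")
    run_input: dict[str, Any] = {
        "start_urls": [{"url": url} for url in start_urls],
        **DEFAULTS,
    }
    run_input.update(overrides)
    return run_input

run_input = build_run_input(
    ["https://www.hitta.se/nacka/företag/2"], max_results=50
)
print(json.dumps(run_input, indent=2))
```

Rejecting unknown field names catches typos such as `max_result` before a run is started.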

How It Works

  1. Listing pages: extracts detail links using /verksamhet/ anchors. Follows “Nästa” to paginate.

  2. Detail pages: extracts:

    • Website from canonical, og:url, and JSON-LD; otherwise scores external anchors and bans junk domains.

    • Address from JSON-LD, microdata, and Swedish heuristics (street tokens + postcode check).

    • Phone from tel: links or strict regex for Swedish formats.

    • Emails from:

      • Hitta UI attributes (unescaped)
      • mailto: links
      • Strict regex on visible text
      • Slug-pattern fallback: <slug>-button-email-<email> and button-email-<email>
  3. Website status: HEAD/GET to classify website_details as ok, 404, unavailable, banned, or n/a.

  4. Same-domain mini-crawl (if website OK): fetch up to site_email_max_pages pages for more emails and socials.

  5. Email policy:

    • If no website or website is banned/unavailable, keep all valid detail-page emails.
    • If website is OK, keep emails that match the website base-domain or are generic providers (Gmail, Outlook, etc.).
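
The email policy in step 5 can be sketched as a small filter. This is a minimal illustration under stated assumptions: `apply_email_policy` and `base_domain` are hypothetical names, the provider set contains only two example domains (the source says "Gmail, Outlook, etc."), and the base-domain logic is deliberately naive. The source does not say how a `404` website is treated, so this sketch handles it like the banned/unavailable case.

```python
from typing import Optional
from urllib.parse import urlparse

# Illustrative only: the source names Gmail and Outlook and then says "etc."
GENERIC_PROVIDERS = {"gmail.com", "outlook.com"}

def base_domain(host: str) -> str:
    """Naive base domain: the last two labels (wrong for suffixes like co.uk)."""
    labels = host.lower().strip(".").split(".")
    return ".".join(labels[-2:])

def apply_email_policy(
    emails: list[str], website: Optional[str], website_details: str
) -> list[str]:
    """Keep all emails unless the website is OK; then filter by domain."""
    if website_details != "ok" or not website:
        # No website, or banned/unavailable (and, by assumption, 404): keep all.
        return list(emails)
    site = base_domain(urlparse(website).hostname or "")
    kept = []
    for email in emails:
        domain = email.rsplit("@", 1)[-1].lower()
        if domain in GENERIC_PROVIDERS or base_domain(domain) == site:
            kept.append(email)
    return kept

found = ["info@exempel.se", "sales@other.se", "owner@gmail.com"]
print(apply_email_policy(found, "https://www.exempel.se", "ok"))
print(apply_email_policy(found, None, "n/a"))
```

A production version would likely use a public-suffix list rather than the last-two-labels shortcut.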

Example Output

{
"source_url":"https://www.hitta.se/foeretag/exempel-ab/123456",
"name":"Exempel AB",
"categories":"Bygg, Renovering",
"phone":"08 123 45 67",
"address":"Exempelgatan 10, 123 45 Stockholm",
"website":"https://www.exempel.se",
"email1":"info@exempel.se",
"email2":"support@exempel.se",
"website_details":"ok",
"social_facebook":"https://www.facebook.com/exempel-ab",
"social_instagram":"https://www.instagram.com/exempel-ab",
"social_linkedin":"https://www.linkedin.com/company/exempel-ab",
"social_x":"https://www.x.com/exempel-ab",
"social_youtube":"https://www.youtube.com/exempel-ab",
"social_tiktok":"https://www.tiktok.com/exempel-ab",
"social_pinterest":"https://www.pinterest.com/exempel-ab"
}

If no valid emails are found, the actor emits "email1": "n/a".
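
The numbered `email1..N` fields and the `"n/a"` placeholder could be produced along these lines. This is a sketch, not the actor's code: `email_fields` is a hypothetical helper, and lower-casing addresses before de-duplication is an assumption.

```python
def email_fields(emails: list[str]) -> dict[str, str]:
    """Turn a list of addresses into email1..N keys, or email1 = "n/a" if empty."""
    unique: list[str] = []
    seen: set[str] = set()
    for email in emails:
        key = email.strip().lower()  # assumption: compare case-insensitively
        if key and key not in seen:
            seen.add(key)
            unique.append(key)
    if not unique:
        return {"email1": "n/a"}
    return {f"email{i}": email for i, email in enumerate(unique, start=1)}

print(email_fields(["Info@exempel.se", "info@exempel.se", "support@exempel.se"]))
print(email_fields([]))
```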


Notes

  • Pagination: respects Hitta’s “Nästa” flow, ~25 results per page.
  • Bans: hitta.dixa.help, dixa.help, biluppgifter.se, and a specific DNB marketing URL are excluded.
  • Email hygiene: strict regex, HTML-unescape, de-duplication, and tracking-pattern filtering.
  • Address quality: prefers JSON-LD PostalAddress, then microdata, then heuristics requiring Swedish postcode + street token.
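
The address preference order above might be sketched as follows. The street-token list is hypothetical (the actor's real list is not published in this README), the function names are illustrative, and substring matching is a crude stand-in for whatever the actor actually does.

```python
import re
from typing import Optional

# Swedish postcodes are five digits, conventionally written "123 45".
POSTCODE_RE = re.compile(r"\b\d{3}\s?\d{2}\b")
# Hypothetical token list: common Swedish street-name endings.
STREET_TOKENS = ("gatan", "vägen", "gränd", "torg", "stigen", "allén")

def looks_like_swedish_address(text: str) -> bool:
    """Require both a postcode and a street token, as the note above describes."""
    lowered = text.lower()
    has_postcode = POSTCODE_RE.search(text) is not None
    has_street = any(token in lowered for token in STREET_TOKENS)
    return has_postcode and has_street

def pick_address(
    json_ld: Optional[str], microdata: Optional[str], free_text: Optional[str]
) -> Optional[str]:
    """Prefer JSON-LD, then microdata, then heuristic-validated free text."""
    for candidate in (json_ld, microdata):
        if candidate:
            return candidate
    if free_text and looks_like_swedish_address(free_text):
        return free_text
    return None

print(looks_like_swedish_address("Exempelgatan 10, 123 45 Stockholm"))
print(looks_like_swedish_address("Ring oss på 08 123 45 67"))
```

Requiring both signals keeps phone numbers, which can contain a postcode-like digit group, from being mistaken for addresses.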

Disclaimer & License

This Apify Actor is provided "as is", without warranty of any kind, express or implied, including but not limited to the warranties of merchantability, fitness for a particular purpose, and non-infringement. Use it, modify it, break it, or improve it, but you do so at your own risk.

© 2025 SLSH. All rights reserved. Copying or modifying the source code is prohibited.
