URL: https://apify.com/dominic-quaiser/impressum-standby-scraper

Impressum Standby Scraper (Playwright Version)

Pricing

from $2.52 / 1,000 results

Scrape German imprint pages instantly, using a headless browser for dynamic, modern sites. This Apify Actor finds and extracts structured contact and legal data from any German website: company name, address, phone, fax, email, VAT ID, register number, social media, and decision makers.

Developer

Dominic M. Quaiser

Maintained by Community

German Imprint Scraper (Standby API)

Find and extract structured contact and legal information from German imprint pages ("Impressum") in real time, one URL per request. Send a homepage URL to the actor's HTTP endpoint and it automatically discovers the site's imprint page and returns clean, structured data: company name, address, phone/fax, email, commercial register number, VAT ID, social media links, and decision-makers.

This actor runs in Apify Standby mode as a long-lived HTTP server. That makes it ideal for on-demand enrichment: low per-request latency, no run start-up overhead per URL, and a simple GET/POST API you can call directly from your application, a workflow tool, or another actor.

ℹ️ Which version is this?

This scraper is published in two variants, optimised for different kinds of websites:

🎭 Playwright version (this actor)

A headless-browser scraper that renders pages with a real Chromium engine. Use it for modern, JavaScript-heavy websites whose imprint links or content only appear after the page renders (e.g. Next.js / React apps). It is more robust but slower, and adds a small headless-browser charge per processed URL.

👉 Most imprint pages are plain server-rendered HTML and don't need a browser. For those, the HTTP version is faster and cheaper.

💡 Features

  • Automatic imprint-page discovery: point the actor at a homepage; it finds the correct "Impressum" page for you.
  • Selective data extraction: request only the fields you need, from basic contact info to ML-extracted decision-makers.
  • Real-time Standby API: GET or POST a single URL and get structured JSON back immediately. One request is processed at a time per container.
  • Proxy support: integrates with Apify Proxy for IP rotation and to reduce blocking.
  • Structured JSON output: clean, predictable records ready for your CRM, database, or downstream pipeline.

🔌 Standby API

In Standby mode the actor exposes an HTTP server. Apify gives every Standby actor a base URL; append the query parameters below and authenticate with your Apify API token (e.g. as a ?token= query parameter or Authorization: Bearer <token> header).

GET / - scrape one URL (query string)

| Parameter | Required | Description |
|---|---|---|
| startUrl | Yes | Homepage URL to scrape. The actor discovers the imprint page automatically. https:// is prepended if the scheme is missing. |
| fieldsToExtract | No | Comma-separated list of fields to extract. Defaults to all fields. |
| metaData | No | true/false: include extra technical details in the response. Default false. |

    curl 'https://dominic-quaiser--impressum-standby-scraper.apify.actor/?startUrl=https://www.renault.de/&fieldsToExtract=company_name,emails,phone_number&token=<APIFY_TOKEN>'
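
The curl example passes startUrl unencoded, which is fine for a bare homepage but can become ambiguous when the target URL carries its own query string. As a minimal sketch, the same GET URL could be built in Python with percent-encoding. The helper name build_get_url is hypothetical, and the base URL is simply the one from the example above.

```python
from urllib.parse import urlencode

# Base URL copied from the curl example; your Standby URL may differ.
BASE_URL = "https://dominic-quaiser--impressum-standby-scraper.apify.actor/"


def build_get_url(start_url, fields=None, meta_data=False, token=None):
    """Return a GET URL for the Standby endpoint (hypothetical helper)."""
    params = {"startUrl": start_url}
    if fields:
        # fieldsToExtract is a comma-separated list in the query string.
        params["fieldsToExtract"] = ",".join(fields)
    if meta_data:
        params["metaData"] = "true"
    if token:
        params["token"] = token
    # urlencode percent-encodes every value, including the nested URL.
    return BASE_URL + "?" + urlencode(params)


url = build_get_url(
    "https://www.renault.de/",
    fields=["company_name", "emails", "phone_number"],
    token="<APIFY_TOKEN>",
)
print(url)
```

Omitting metaData when it is false relies on the documented default.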

POST / - scrape one URL (JSON body)

    curl -X POST 'https://dominic-quaiser--impressum-standby-scraper.apify.actor/?token=<APIFY_TOKEN>' \
      -H 'Content-Type: application/json' \
      -d '{
        "startUrl": "https://www.renault.de/",
        "fieldsToExtract": ["company_name", "emails", "phone_number"],
        "metaData": false
      }'
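
For callers who prefer the Authorization header over the ?token= parameter, a rough Python equivalent of the POST call might look like this. It uses only the standard library and builds the request without sending it. The helper name build_post_request is hypothetical.

```python
import json
import urllib.request

# Base URL copied from the curl example; your Standby URL may differ.
BASE_URL = "https://dominic-quaiser--impressum-standby-scraper.apify.actor/"


def build_post_request(start_url, fields, token, meta_data=False):
    """Build (but do not send) a POST request for the Standby endpoint."""
    body = {
        "startUrl": start_url,
        # In the JSON body, fieldsToExtract is an array, not a comma list.
        "fieldsToExtract": list(fields),
        "metaData": meta_data,
    }
    return urllib.request.Request(
        BASE_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Header-based auth keeps the token out of the URL.
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


request = build_post_request(
    "https://www.renault.de/",
    ["company_name", "emails", "phone_number"],
    token="<APIFY_TOKEN>",
)
# To send it: urllib.request.urlopen(request, timeout=60)
# The 60-second timeout is an assumed value, not one the actor documents.
```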

GET /health - health check & stats

Returns 200 with a snapshot of running counters (total requests, successful scrapes, errors, etc.). Useful for uptime checks.

    curl 'https://dominic-quaiser--impressum-standby-scraper.apify.actor/health'

Responses

| Status | Meaning |
|---|---|
| 200 | Scrape completed. Body is { "url": ..., "result": { ... } }, or { "url": ..., "result": null, "message": "No data extracted" } when nothing could be extracted. |
| 400 | Missing or invalid startUrl, or an invalid JSON body. |
| 500 | Unhandled scraper error. |
| 504 | Processing timed out. |

Each successful result is also pushed to the actor's default dataset, so you can browse or export your scrape history from the Apify Console even when calling the API directly.
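
A client can map the documented status codes to outcomes along these lines. This is a sketch only: ScrapeError and interpret_response are hypothetical names, and treating 504 as retryable is a client-side choice rather than something the actor prescribes.

```python
import json


class ScrapeError(Exception):
    """Raised for non-200 responses (hypothetical exception type)."""

    def __init__(self, status, message, retryable):
        super().__init__(f"HTTP {status}: {message}")
        self.status = status
        self.retryable = retryable


def interpret_response(status, body_text):
    """Return the result dict, or None when nothing could be extracted."""
    if status == 200:
        payload = json.loads(body_text)
        # "result" is null when the actor found no data for the URL.
        return payload.get("result")
    if status == 400:
        raise ScrapeError(status, "missing or invalid startUrl or JSON body", False)
    if status == 504:
        # Assumption: a timeout may succeed on a later attempt.
        raise ScrapeError(status, "processing timed out", True)
    if status == 500:
        raise ScrapeError(status, "unhandled scraper error", False)
    raise ScrapeError(status, "unexpected status", False)


found = interpret_response(
    200,
    '{"url": "https://muster-firma.de/", '
    '"result": {"emails": {"email_1": "kontakt@muster-firma.de"}}}',
)
empty = interpret_response(
    200,
    '{"url": "https://muster-firma.de/", "result": null, '
    '"message": "No data extracted"}',
)
```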

📊 Extractable data

Select any combination of the following fields via fieldsToExtract:

| Field | Description | Type |
|---|---|---|
| company_name | The official company name, with a confidence score for the match. | Object |
| business_address | Full address parsed into full_address, street, house_number, postal_code, city. | Object |
| phone_number | One or more phone numbers, keyed phone_1, phone_2, … | Object |
| fax_number | One or more fax numbers, keyed fax_1, fax_2, … | Object |
| emails | One or more email addresses; emails matching the site's domain are prioritised. | Object |
| register_number | Commercial register number ("Handelsregisternummer") and the registration court ("Registergericht"). | Object |
| vat_id | German VAT ID ("Umsatzsteuer-ID") with checksum validation, e.g. DE123456788. | Object |
| social_media | Links to platforms like LinkedIn, Xing, Facebook, Instagram, etc. | Object |
| decision_makers | (Premium) Names of key decision-makers ("Entscheidungsträger") extracted via an external NER (Named Entity Recognition) model. | Array |
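
The table says the vat_id field is checksum-validated but does not say which algorithm the actor uses. German VAT IDs are commonly checked with the ISO 7064 MOD 11,10 scheme. Assuming that is what is meant, a client-side re-check might look like the sketch below. The function name is hypothetical.

```python
def is_valid_german_vat_id(vat_id):
    """Check "DE" + 9 digits using ISO 7064 MOD 11,10 (assumed scheme)."""
    cleaned = vat_id.replace(" ", "").upper()
    if not (cleaned.startswith("DE") and len(cleaned) == 11):
        return False
    digits = cleaned[2:]
    if not (digits.isascii() and digits.isdigit()):
        return False
    product = 10
    for char in digits[:8]:
        total = (int(char) + product) % 10
        if total == 0:
            total = 10
        product = (2 * total) % 11
    # product is always in 1..10, so the check digit lands in 0..9.
    check_digit = (11 - product) % 10
    return check_digit == int(digits[8])


print(is_valid_german_vat_id("DE123456788"))  # the example ID from the table
```

A passing checksum only shows that the number is well-formed; it does not confirm that the ID is actually registered.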

Numbered outputs (emails, phone numbers, …) are ordered by how likely each value is to be the company's main contact.
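
Because the ranking lives in the key suffix, sorting keys as plain strings would place phone_10 before phone_2. A small sketch for reading numbered objects in rank order follows. The helper names are hypothetical, and keys are assumed to always end in "_<number>".

```python
def ordered_values(numbered):
    """Return values of a dict like {"phone_1": ..., "phone_2": ...} by rank."""

    def rank(key):
        # Assumes every key ends in "_<number>", as in phone_1 or email_2.
        return int(key.rsplit("_", 1)[1])

    return [numbered[key] for key in sorted(numbered, key=rank)]


def primary_value(numbered):
    """Return the top-ranked value, or None if the object is empty or missing."""
    values = ordered_values(numbered or {})
    return values[0] if values else None


# Made-up sample in the documented shape.
phones = {"phone_2": "+493012345679", "phone_1": "+493012345678"}
print(primary_value(phones))
```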

📤 Output structure

The exact fields depend on your fieldsToExtract selection.

    {
      "start_url": "https://muster-firma.de/",
      "imprint_url": "https://muster-firma.de/impressum",
      "company_name": {
        "name": "Muster GmbH",
        "confidence": 1
      },
      "business_address": {
        "full_address": "Musterstraße 123, 12345 Berlin",
        "street": "Musterstraße",
        "house_number": "123",
        "postal_code": "12345",
        "city": "Berlin"
      },
      "phone_number": {"phone_1": "+493012345678"},
      "fax_number": {"fax_1": "+493012345679"},
      "emails": {"email_1": "kontakt@muster-firma.de"},
      "register_number": {
        "number": "HRB 12345 B",
        "court": "Amtsgericht Charlottenburg"
      },
      "vat_id": {"vat_id": "DE123456788"},
      "social_media": {
        "linkedin": "https://www.linkedin.com/company/muster-firma"
      },
      "decision_makers": ["Max Mustermann"],
      "metadata": {
        "domain": "muster-firma.de",
        "fetch_method": "http",
        "fallback_attempted": false,
        "scraped_at": "2026-06-22T12:04:48.003780"
      }
    }

The metadata block is only included when metaData is enabled.
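
Since the record only contains the fields that were requested, downstream code should tolerate missing keys. As an illustration, a record could be flattened into a single-level row for a CRM or CSV export roughly as follows. The function name and the output column names are hypothetical.

```python
def flatten_record(record):
    """Flatten a result record into one row; absent fields become None."""
    address = record.get("business_address") or {}
    street_parts = [address.get("street"), address.get("house_number")]
    street = " ".join(part for part in street_parts if part) or None
    return {
        "imprint_url": record.get("imprint_url"),
        "company": (record.get("company_name") or {}).get("name"),
        "street": street,
        "postal_code": address.get("postal_code"),
        "city": address.get("city"),
        # phone_1 and email_1 are the top-ranked values.
        "phone": (record.get("phone_number") or {}).get("phone_1"),
        "email": (record.get("emails") or {}).get("email_1"),
        "vat_id": (record.get("vat_id") or {}).get("vat_id"),
        "decision_makers": "; ".join(record.get("decision_makers") or []),
    }


# Partial record, as returned when only some fields were requested.
sample = {
    "imprint_url": "https://muster-firma.de/impressum",
    "company_name": {"name": "Muster GmbH", "confidence": 1},
    "business_address": {
        "street": "Musterstraße",
        "house_number": "123",
        "postal_code": "12345",
        "city": "Berlin",
    },
    "emails": {"email_1": "kontakt@muster-firma.de"},
}
row = flatten_record(sample)
print(row["company"], row["street"], row["email"])
```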

⚖️ Legal disclaimer

You are solely responsible for determining the legality of your use of this actor and the data it generates. Scraping and handling data, particularly personal information, is subject to legal frameworks such as the GDPR (DSGVO), copyright law, and the terms of service of the sites you scrape. Ensure your use case is compliant with all applicable laws. This text is not legal advice.

GDPR notice: "Decision Makers" feature

The decision_makers feature uses an external API hosted on a private server in Europe (Germany) to process data.

  • What is processed: the text of the imprint page is sent to the API to identify personal names.
  • Why: the NER model needs the page text to accurately extract decision-makers.
  • Data controller: you, the user, are the data controller; the actor's developer acts as data processor for this task.
  • Location & compliance: all processing occurs within the EU and is subject to the GDPR (DSGVO).
  • Data storage: the text is processed in-memory and is not stored or logged on the external server.
  • Important: this processing is external to the Apify platform and not covered by Apify's DPA. By using this feature you acknowledge this separate processing activity.

🤖 Other actors

🎯 Use cases

  • Lead generation: build targeted contact lists for sales and marketing.
  • Real-time enrichment: call the Standby API to enrich a record the moment a lead enters your CRM.
  • Compliance & verification: check for legally compliant imprint information.
  • Market research: aggregate company data for a specific industry or region.

🛠️ Maintainer
