Pricing
from $0.001 / actor start
Company Website Research
Extracting comprehensive data from the corporate website
Rating: 4.2 (2)
Actor stats
- Bookmarked: 2
- Total users: 25
- Monthly active users: 9
- Issues response: 52 days
- Last modified: 3 months ago
Company Research Actor
Apify Actor for researching a public company website and returning structured website evidence in one JSON result.
This Actor is built for company research, lead enrichment, and downstream automation. It can start from a direct website, a bare domain, or only a company name.
What This Actor Does
- accepts `website_url`, `domain`, or `company_name`
- discovers an official website when only the company name is provided
- prefers Apify's Google Search Results Scraper for company-name discovery and uses the first valid Google organic website result directly
- falls back to the internal heuristic search flow only when the nested Google search actor is unavailable or returns no usable website result
- if discovery still stays ambiguous after fallback, returns `candidate_websites` instead of guessing
- crawls a small set of high-value pages such as `homepage`, `about`, `products/services`, and `contact`
- uses a hybrid crawl strategy:
  - `http-first` when HTML is enough
  - `browser-fallback` when the site is JS-heavy or the HTTP probe is not enough
- fails fast on heavy block signals such as CAPTCHA, WAF, or explicit access denial instead of spending time on low-value salvage attempts
- when running on Apify, prepares a standby Apify Proxy profile and can auto-escalate to proxy for suspicious blocked hosts even if `use_proxy` is left off
- extracts:
  - company name
  - resolved website and domain
  - LinkedIn company URL when found
  - cleaned text from kept pages
  - public emails, phones, and an address candidate
  - rule-based summary, products, and market signals
- returns crawl metadata including `strategy`, `mode`, `confidence`, `failure_reason`, timing breakdown, browser engine, and salvage usage
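The hybrid crawl strategy described above can be pictured with a small decision function. This is only an illustrative sketch, not the Actor's actual logic: `ProbeResult`, `BLOCK_MARKERS`, `visible_text`, `choose_strategy`, the status codes, and the 500-character threshold are all assumptions made for the example.

```python
import re
from dataclasses import dataclass

# Hypothetical block markers; a real crawler would use richer signals.
BLOCK_MARKERS = ("captcha", "access denied", "request blocked")


@dataclass
class ProbeResult:
    """Outcome of a plain HTTP probe (hypothetical structure)."""
    status_code: int
    html: str


def visible_text(html: str) -> str:
    """Crudely strip scripts, styles, and tags to estimate readable text."""
    html = re.sub(r"(?is)<(script|style)\b.*?</\1>", " ", html)
    text = re.sub(r"(?s)<[^>]+>", " ", html)
    return " ".join(text.split())


def choose_strategy(probe: ProbeResult, min_text_chars: int = 500) -> str:
    """Return 'blocked', 'http-first', or 'browser-fallback'."""
    lowered = probe.html.lower()
    # Fail fast on heavy block signals instead of attempting salvage.
    if probe.status_code in (401, 403, 429) or any(m in lowered for m in BLOCK_MARKERS):
        return "blocked"
    # Enough readable text in the raw HTML: no browser needed.
    if probe.status_code == 200 and len(visible_text(probe.html)) >= min_text_chars:
        return "http-first"
    # Otherwise assume the page needs JavaScript rendering.
    return "browser-fallback"
```

A substring check for "captcha" is deliberately crude and would misfire on pages that merely embed a CAPTCHA widget in a contact form; it is here only to show where a fail-fast branch would sit.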
Best Fit
Works best for:
- company websites
- manufacturer and industrial sites
- B2B corporate sites
- one-page company sites
- public product/catalog websites with clear navigation
Less reliable for:
- login-only sites
- CAPTCHA or anti-bot protected sites
- sites with very heavy client-side rendering
- sites where key information is hidden behind forms, PDFs, or gated downloads
Input
Resolution order:
- `website_url`
- `domain`
- discovery from `company_name`
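To make the resolution order concrete, here is a minimal sketch of how a caller might predict which field a given input will start from. The function name `resolve_start_point` and its return shape are hypothetical; only the priority order and the `https://<domain>/` normalization come from this README.

```python
from typing import Optional, Tuple


def resolve_start_point(run_input: dict) -> Tuple[str, Optional[str]]:
    """Return (source_field, value) using website_url > domain > company_name."""
    website_url = (run_input.get("website_url") or "").strip()
    if website_url:
        return "website_url", website_url

    domain = (run_input.get("domain") or "").strip()
    if domain:
        # Drop an accidental scheme and slashes, then normalize to https://<domain>/.
        bare = domain.split("://", 1)[-1].strip("/")
        return "domain", f"https://{bare}/"

    company_name = (run_input.get("company_name") or "").strip()
    if company_name:
        # Only a name is available, so website discovery would be needed.
        return "company_name", company_name

    return "none", None
```

For example, an input that carries both `domain` and `company_name` starts from the normalized domain, and the company name is used only as a hint.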
Main input fields:
- `company_name`: company name for website discovery or as a hint for extraction
- `website_url`: full website URL, highest priority input
- `domain`: bare domain, normalized to `https://<domain>/`
- `social_link`: known company social URL, usually LinkedIn
- `country`: optional discovery hint, available as a dropdown in the Apify input UI
- `mode`: `fast` or `deep`
- `anti_block_mode`: browser hardening level, `off`, `basic`, or `aggressive`
- `use_proxy`: force Apify Proxy from the start for HTTP and browser crawling
- `proxy_groups`: optional Apify Proxy groups such as `RESIDENTIAL`
- `salvage_if_blocked`: try likely subpages if the homepage is blocked or unavailable, except for clearly heavy-blocked sites that are failed fast
- `max_pages`: max number of kept pages in output
- `max_text_chars`: max total extracted text characters across kept pages
- `discover_if_missing`: whether to discover a website when only the company name is given
- `extract_contacts`: whether to extract emails, phones, and address
- `follow_subpages`: whether to crawl internal pages beyond the first page
- `include_path_hints`: preferred path fragments used to prioritize internal links
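Before submitting many runs, it can help to check inputs locally. The sketch below is a hypothetical client-side validator: the allowed values for `mode` and `anti_block_mode` are taken from the field list, while the helper name `validate_input` and the "positive integer" rule for the two limits are assumptions, not documented constraints.

```python
ALLOWED_MODES = {"fast", "deep"}
ALLOWED_ANTI_BLOCK = {"off", "basic", "aggressive"}


def validate_input(run_input: dict) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    errors = []

    # At least one starting point is needed for resolution.
    if not any(run_input.get(k) for k in ("website_url", "domain", "company_name")):
        errors.append("provide website_url, domain, or company_name")

    mode = run_input.get("mode")
    if mode is not None and mode not in ALLOWED_MODES:
        errors.append(f"mode must be one of {sorted(ALLOWED_MODES)}")

    anti_block = run_input.get("anti_block_mode")
    if anti_block is not None and anti_block not in ALLOWED_ANTI_BLOCK:
        errors.append(f"anti_block_mode must be one of {sorted(ALLOWED_ANTI_BLOCK)}")

    # Assumed rule: limits should be positive integers (bool is excluded).
    for key in ("max_pages", "max_text_chars"):
        value = run_input.get(key)
        if value is not None and (
            isinstance(value, bool) or not isinstance(value, int) or value < 1
        ):
            errors.append(f"{key} must be a positive integer")

    return errors
```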
Mode
fast
- lower latency
- stops earlier once enough useful content is found
- good for lead enrichment and bulk runs
deep
- broader page coverage
- better for contacts, products, and company profile quality
- slower than `fast`
Anti-Block Mode
off
- no browser hardening beyond the default crawler setup
basic
- adds browser environment hardening and lightweight blocker dismissal
- recommended default for most runs
aggressive
- adds stronger popup/overlay removal and lightweight resource blocking
- useful for difficult websites, but slightly riskier on fragile sites
Example Inputs
Direct website:
{"website_url":"https://vnsteel.vn/","mode":"fast","max_pages":3,"max_text_chars":7000,"extract_contacts":true,"follow_subpages":true}
Bare domain:
{"domain":"pny.com","mode":"deep","max_pages":3,"max_text_chars":8000,"extract_contacts":true,"follow_subpages":true}
Company name only:
{"company_name":"VNSTEEL","country":"Vietnam","mode":"deep","max_pages":3,"max_text_chars":7000,"discover_if_missing":true,"extract_contacts":true,"follow_subpages":true}
Company name discovery notes:
- when only `company_name` is provided, this Actor first tries to call `apify/google-search-scraper`
- if Google returns a usable organic website result, the Actor uses that website directly for crawling
- the nested search run is executed under the current runner account, so the runner pays for that search usage
- if the nested search run is unavailable or returns no usable website result, the Actor falls back to its internal discovery heuristic
- if discovery is ambiguous, the Actor returns `candidate_websites` and stops instead of crawling the wrong website
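Downstream code usually needs to tell a resolved run apart from an ambiguous one. The helper below is a minimal sketch under one assumption that this README does not state explicitly: that `resolved_website_url` is empty or null when the Actor stops with `candidate_websites`. The function name `discovery_outcome` is hypothetical, and the candidate entries are treated as opaque values because their exact shape is not documented here.

```python
def discovery_outcome(result: dict) -> str:
    """Classify a result as 'resolved', 'ambiguous', or 'not_found'."""
    if result.get("resolved_website_url"):
        return "resolved"
    if result.get("candidate_websites"):
        # The Actor declined to guess; a human or another rule should pick.
        return "ambiguous"
    return "not_found"
```

Routing `ambiguous` results to a manual review queue keeps a wrong website from silently entering an enrichment pipeline.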
Custom path hints:
{"website_url":"https://eup.vn/","mode":"deep","max_pages":4,"max_text_chars":10000,"extract_contacts":true,"follow_subpages":true,"include_path_hints":["about","products","services","contact","gioi-thieu","linh-vuc","lien-he"]}
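One plausible way path hints could influence link selection is sketched below. This is not the Actor's implementation: the function `prioritize_links`, the substring matching, and the idea that earlier hints rank higher are all assumptions, and the example URLs are made up for illustration.

```python
from urllib.parse import urlparse


def prioritize_links(links: list, hints: list) -> list:
    """Sort links so that paths matching earlier hints come first."""

    def score(url: str) -> int:
        path = urlparse(url).path.lower()
        for rank, hint in enumerate(hints):
            if hint.lower() in path:
                return rank
        # No hint matched: push to the end, keeping original order (stable sort).
        return len(hints)

    return sorted(links, key=score)


hints = ["about", "products", "services", "contact", "gioi-thieu", "linh-vuc", "lien-he"]
links = [
    "https://example.com/tin-tuc",
    "https://example.com/lien-he",
    "https://example.com/gioi-thieu",
]
ordered = prioritize_links(links, hints)
```

Including localized fragments such as `gioi-thieu` or `lien-he` matters for sites whose navigation is not in English, since the English hints alone would match nothing.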
Output
The Actor writes one result object to:
- the default dataset
- the `OUTPUT` record in the default key-value store
Output Shape
{
  "company_name": "PNY Technologies Inc.",
  "resolved_website_url": "https://www.pny.com/",
  "resolved_domain": "pny.com",
  "resolved_social_link": "https://www.linkedin.com/company/pny-technologies/",
  "candidate_websites": [],
  "sources": [
    "https://www.pny.com/",
    "https://www.pny.com/professional/support/contact-us"
  ],
  "pages": [
    {
      "url": "https://www.pny.com/",
      "title": "PNY | NVIDIA Graphics, Storage, Networking & Memory Solutions",
      "page_type": "homepage",
      "text": "PNY delivers solutions in over 50 countries...",
      "text_chars": 3200
    }
  ],
  "contacts": {
    "emails": ["gopny@pny.com", "tsupport@pny.com"],
    "phones": ["19735159700"],
    "address": "100 Jefferson Road, Parsippany, New Jersey 07054 US"
  },
  "signals": {
    "about_summary": "PNY delivers solutions in over 50 countries...",
    "products": ["GeForce graphics cards", "Solid state drives", "PC memory"],
    "markets": ["Global"]
  },
  "metadata": {
    "discovery_used": false,
    "strategy": "http-first",
    "mode": "deep",
    "anti_block_mode": "basic",
    "browser_used": false,
    "browser_engine": null,
    "salvage_used": false,
    "pages_crawled": 3,
    "failure_reason": null,
    "confidence": {
      "website": 0.99,
      "contacts": 0.99,
      "summary": 0.85,
      "products": 0.63,
      "overall": 0.92
    },
    "timings": {
      "total_ms": 5472,
      "discovery_ms": 0,
      "crawl_ms": 5472,
      "http_probe_ms": 5472,
      "browser_crawl_ms": 0
    },
    "duration_ms": 5472
  }
}
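Since the result carries per-field confidence scores, a consumer can gate what it trusts. The sketch below shows one way to do that for contacts; the helper name `trusted_contacts` and the 0.8 threshold are assumptions chosen for illustration, and the sample dictionary is a trimmed copy of the output shape above.

```python
def trusted_contacts(result: dict, min_confidence: float = 0.8) -> dict:
    """Return contacts only when the run succeeded and confidence is high enough."""
    metadata = result.get("metadata") or {}
    if metadata.get("failure_reason"):
        return {}

    confidence = (metadata.get("confidence") or {}).get("contacts", 0.0)
    if confidence < min_confidence:
        return {}

    contacts = result.get("contacts") or {}
    return {
        "emails": contacts.get("emails", []),
        "phones": contacts.get("phones", []),
        "address": contacts.get("address"),
    }


# Trimmed copy of the documented output shape.
sample = {
    "contacts": {
        "emails": ["gopny@pny.com", "tsupport@pny.com"],
        "phones": ["19735159700"],
        "address": "100 Jefferson Road, Parsippany, New Jersey 07054 US",
    },
    "metadata": {"failure_reason": None, "confidence": {"contacts": 0.99}},
}
contacts = trusted_contacts(sample)
```

The same pattern applies to `signals.products`, whose confidence in the sample (0.63) would fall below this example threshold.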
