URL: https://apify.com/datavault/product-finder-plus-crawler-extractor

Product Finder Plus: Crawler & Extractor · Apify


Product Finder Plus: Crawler & Extractor

Pricing

from $1.00 / 1,000 product details

Product Finder Plus is a high-end e-commerce crawler built for websites where standard scraping tools fall short. It is designed to extract structured product data from complex, dynamic e-commerce stores and platforms.

Rating: 0.0 (0)

Developer: Datavault

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 11
  • Monthly active users: 1
  • Last modified: 5 months ago

Product Finder Plus - Crawler & Extractor

Recommendation: For simpler sites, we highly recommend trying the Product Finder Crawler & Extractor as a first step. It is generally faster and more cost-effective. This "Plus" version is designed for sites that require more complex solutions, specifically those with dynamic content or advanced anti-bot protections.

Product Finder Plus: Crawler & Extractor is an enhanced, high-performance implementation of our versatile e-commerce scraper. It is designed to extract product information from virtually any website, including modern Single Page Applications (SPAs) and PWA-based stores. It combines multi-threaded concurrency with several parsing strategies (JSON-LD, Microdata, and global JavaScript objects) to maximize data yield with minimal overhead.

Features

  • High-Performance Concurrency: Uses a worker pool to crawl multiple pages in parallel, significantly reducing total execution time.
  • State Persistence & Resume: Automatically saves crawl progress (visited URLs and queue) to the Apify Key-Value Store. If the run is interrupted, it resumes exactly where it left off.
  • Comprehensive Product Discovery: Automatically identifies and extracts products using Schema.org (JSON-LD, Microdata), Meta Tags, and Next.js __NEXT_DATA__.
  • Dynamic JS-Object Extraction: Specifically tuned for ScandiPWA and React stores by extracting data directly from window.actionName and other global JavaScript objects.
  • Multi-Country Proxy Support: Fully integrated with Apify Proxy to bypass geo-blocks and analyze price differences across regions.
  • Pay-per-event (PPE) Integration: Fully compatible with Apify's PPE model, charging only for successful page loads and products found.
  • Configurable Limits: Control maxPagesPerCrawl, maxConcurrency, and maxRetries to manage depth and operational costs.
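To make the worker-pool and resume features above more concrete, here is a minimal sketch of a resumable, concurrent crawl loop. It is not the actor's actual implementation: the function `crawl`, the `fetch_links` callback, the `CRAWL_STATE` key, and the use of a plain dict in place of the Apify Key-Value Store are all assumptions made for illustration.

```python
import json
from collections import deque
from collections.abc import Callable, Iterable, MutableMapping
from concurrent.futures import ThreadPoolExecutor

STATE_KEY = "CRAWL_STATE"  # hypothetical key name, not taken from the actor


def crawl(
    start_urls: Iterable[str],
    fetch_links: Callable[[str], list[str]],
    store: MutableMapping[str, str],
    max_pages: int = 200,
    max_concurrency: int = 5,
) -> set[str]:
    """Crawl in parallel batches and persist progress after every batch.

    `fetch_links(url)` returns the links found on one page. `store` is any
    dict-like object standing in for a key-value store.
    """
    # Resume from saved state if present, otherwise start fresh.
    saved = json.loads(store.get(STATE_KEY, "null"))
    state = saved or {"visited": [], "queue": list(start_urls)}
    visited = set(state["visited"])
    queue = deque(state["queue"])

    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        while queue and len(visited) < max_pages:
            # Take up to max_concurrency unvisited URLs, respecting the page limit.
            batch: list[str] = []
            while (queue and len(batch) < max_concurrency
                   and len(visited) + len(batch) < max_pages):
                url = queue.popleft()
                if url not in visited and url not in batch:
                    batch.append(url)

            results = list(pool.map(fetch_links, batch))  # fetch the batch in parallel
            visited.update(batch)
            for links in results:
                queue.extend(link for link in links if link not in visited)

            # Persist visited URLs and the remaining queue so a later run can resume.
            store[STATE_KEY] = json.dumps(
                {"visited": sorted(visited), "queue": list(queue)}
            )
    return visited
```

Because state is written only after a batch completes, an interruption in the middle of a batch means those URLs are fetched again on resume. That trades a little repeated work for a simpler consistency story.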

Input Parameters

  • startUrls: An array of URLs to start the crawl.
  • crawlSubpages: If checked (default: true), the crawler will follow links found on the pages.
  • maxPagesPerCrawl: The maximum number of pages to visit in a single run.
  • maxConcurrency: How many pages to process in parallel (Default: 5).
  • maxRetries: Number of times to retry a failed page fetch (Default: 3).
  • minRequestDelay: Minimum time in milliseconds to wait between requests.
  • proxyConfiguration: Apify Proxy configuration. Residential proxies are recommended for protected sites.

Output

The scraper outputs a dataset where each item represents a found product. Fields include:

  • url: The product page URL.
  • name: Product name.
  • description: Product description.
  • sku: Stock Keeping Unit.
  • brand: Brand name.
  • price: Product price.
  • currency: Currency code (e.g., USD, NOK).
  • image: URL of the product image.
  • availability: Availability status (e.g., InStock).
  • gtin: Global Trade Item Number (GTIN) such as EAN, UPC, ISBN.
  • rawSchema: The full extracted object for debugging or extra fields.
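When consuming the dataset, it can help to normalize each item to the fields listed above, since not every site exposes all of them. The sketch below assumes items arrive as plain dictionaries keyed by those field names; the helpers `to_row` and `missing_fields` are hypothetical and not part of the actor.

```python
from typing import Any, Optional

# Field names as listed in the Output section above.
FIELDS = (
    "url", "name", "description", "sku", "brand", "price",
    "currency", "image", "availability", "gtin", "rawSchema",
)


def to_row(item: dict[str, Any]) -> dict[str, Optional[Any]]:
    """Keep only the documented fields; fields absent from the item become None."""
    return {field: item.get(field) for field in FIELDS}


def missing_fields(item: dict[str, Any]) -> list[str]:
    """List documented fields that are absent or empty, ignoring rawSchema."""
    return [
        field for field in FIELDS
        if field != "rawSchema" and item.get(field) in (None, "")
    ]


if __name__ == "__main__":
    # Made-up item for illustration only.
    example = {"url": "https://www.example-store.com/p/1", "name": "Mug", "price": "9.90"}
    print(to_row(example))
    print(missing_fields(example))
```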

Sample Input

{
"startUrls":[
{"url":"https://www.example-store.com"}
],
"crawlSubpages":true,
"maxPagesPerCrawl":200,
"maxConcurrency":5,
"proxyConfiguration":{
"useApifyProxy":true,
"apifyProxyGroups":["RESIDENTIAL"]
}
}
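
As a rough illustration, an input like the one above could be submitted from Python with the `apify-client` package. The actor ID is taken from this page's URL; the helper names `build_input` and `run_actor` and the token handling are assumptions for this sketch.

```python
from typing import Any

ACTOR_ID = "datavault/product-finder-plus-crawler-extractor"  # from the store URL


def build_input(
    start_url: str, max_pages: int = 200, max_concurrency: int = 5
) -> dict[str, Any]:
    """Build an input object mirroring the sample input above."""
    return {
        "startUrls": [{"url": start_url}],
        "crawlSubpages": True,
        "maxPagesPerCrawl": max_pages,
        "maxConcurrency": max_concurrency,
        "proxyConfiguration": {
            "useApifyProxy": True,
            "apifyProxyGroups": ["RESIDENTIAL"],
        },
    }


def run_actor(token: str, run_input: dict[str, Any]) -> list[dict[str, Any]]:
    """Start the actor, wait for it to finish, and return the dataset items.

    Requires `pip install apify-client`; imported lazily so that building
    an input does not depend on the package being installed.
    """
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


if __name__ == "__main__":
    import json

    print(json.dumps(build_input("https://www.example-store.com"), indent=2))
```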

How it works

  1. Initialization: The crawler loads any existing state and charges the apify-actor-start event.
  2. Concurrent Fetching: Workers pick URLs from the queue and fetch them using a persistent HTTP client.
  3. Advanced Parsing: It parses the page content using various strategies:
    • Schema.org (JSON-LD, Microdata)
    • Next.js and ScandiPWA data structures
    • Global JavaScript objects and Meta Tags
  4. Resilient Storage: Products are pushed to the Apify Dataset, and the crawl state is periodically saved to the Key-Value Store.
  5. Smart Discovery: New links are identified from both HTML anchors and dynamic JavaScript content to ensure deep coverage.
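
The Schema.org parsing strategy in step 3 can be sketched with the Python standard library alone. This is an illustration of JSON-LD product extraction in general, not the actor's own parser; `JsonLdCollector`, `to_product`, and `find_products` are hypothetical names, and the field mapping (for example, trimming `https://schema.org/InStock` to `InStock`) is an assumption based on the output fields listed earlier.

```python
import json
from html.parser import HTMLParser
from typing import Any


class JsonLdCollector(HTMLParser):
    """Collect the text of every <script type="application/ld+json"> block."""

    def __init__(self) -> None:
        super().__init__()
        self._in_jsonld = False
        self._buffer: list[str] = []
        self.blocks: list[str] = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def to_product(node: dict[str, Any], page_url: str = "") -> dict[str, Any]:
    """Map a Schema.org Product node to a flat record (assumed mapping)."""
    offers = node.get("offers") or {}
    if isinstance(offers, list):  # several offers: take the first one
        offers = offers[0] if offers else {}
    if not isinstance(offers, dict):
        offers = {}
    brand = node.get("brand")
    if isinstance(brand, dict):  # brand may be a nested Brand object
        brand = brand.get("name")
    image = node.get("image")
    if isinstance(image, list):
        image = image[0] if image else None
    availability = offers.get("availability")
    if isinstance(availability, str):  # "https://schema.org/InStock" -> "InStock"
        availability = availability.rsplit("/", 1)[-1]
    return {
        "url": node.get("url") or page_url,
        "name": node.get("name"),
        "sku": node.get("sku"),
        "brand": brand,
        "price": offers.get("price"),
        "currency": offers.get("priceCurrency"),
        "image": image,
        "availability": availability,
        "rawSchema": node,
    }


def find_products(html: str, page_url: str = "") -> list[dict[str, Any]]:
    """Return one record per Product node found in the page's JSON-LD blocks."""
    collector = JsonLdCollector()
    collector.feed(html)
    collector.close()
    products = []
    for block in collector.blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue  # skip malformed blocks rather than failing the page
        nodes = data if isinstance(data, list) else [data]
        for node in [n for n in nodes if isinstance(n, dict)]:
            graph = node.get("@graph")
            candidates = [node] + (graph if isinstance(graph, list) else [])
            for candidate in candidates:
                if not isinstance(candidate, dict):
                    continue
                types = candidate.get("@type")
                types = types if isinstance(types, list) else [types]
                if "Product" in types:
                    products.append(to_product(candidate, page_url))
    return products
```

Microdata, `__NEXT_DATA__`, and global JavaScript objects would each need their own extractor; JSON-LD is shown here only because it is the simplest of the listed strategies to demonstrate.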

Common issues when there are no results

  • Blocking: Some sites might require Residential Proxies or specific User-Agent headers.
  • Non-Standard Structures: If a site doesn't use standard markup or common HTML patterns, generic extraction might fail.

Tip

Try setting just one URL of your site in the list of startUrls and set crawlSubpages to false. See if you get any result before scaling up the crawl.
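
The tip above amounts to narrowing an existing input before scaling up. A small sketch, assuming the input is held as a Python dictionary; `smoke_test_input` is a hypothetical helper and the product URL is a placeholder.

```python
import copy
from typing import Any


def smoke_test_input(full_input: dict[str, Any], url: str) -> dict[str, Any]:
    """Return a copy of `full_input` limited to one start URL with link-following off."""
    trial = copy.deepcopy(full_input)  # leave the original input untouched
    trial["startUrls"] = [{"url": url}]
    trial["crawlSubpages"] = False
    return trial


if __name__ == "__main__":
    full = {
        "startUrls": [{"url": "https://www.example-store.com"}],
        "crawlSubpages": True,
        "maxPagesPerCrawl": 200,
    }
    # Placeholder product URL for illustration only.
    print(smoke_test_input(full, "https://www.example-store.com/products/some-product"))
```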


Feedback & Improvements

If the results don't align with your goals, please reach out and leave us a message. We use your feedback to continuously update and refine our extraction engine, helping us make the Product Finder better for everyone.

You might also like

Product Finder: Crawler & Extractor

datavault/product-finder-crawler-extractor

The Product Finder Crawler & Extractor is a versatile scraper designed to extract product information from virtually any website, with a focus on e-commerce. Features include Comprehensive Product Discovery, Up-to-Date Pricing, and Multi-Country Price Comparison.

E-commerce Scraping Tool

apify/e-commerce-scraping-tool

Scrape data from e-commerce websites with E-commerce Scraping Tool. Scrape almost any retail site in minutes, extract e-commerce data and use it to monitor price details over time or compare different e-commerce sites’ offerings.

Ecommerce-Product-Scraper

digicovai/ecommerce-product-scraper

Scrape data from e-commerce websites with E-commerce Scraping Tool. Scrape almost any retail site in minutes, extract e-commerce data and use it to monitor price details over time or compare different e-commerce sites’ offerings.

E-commerce Product Matching Tool

tri_angle/e-commerce-product-matching-tool

Match products across e-commerce datasets with E-Commerce Product Matching Tool. Use it with E-commerce Scraping Tool datasets to automatically find identical and similar products and power price monitoring or catalog comparison.

Tri⟁angle

3

E-commerce Email Scraper 🔍🛒📧 - Cheap & Advanced

scrapestorm/e-commerce-email-scraper---cheap-advanced

🔍 Scrape E-commerce Emails Easily. Enter your search parameters (e.g., product keywords, email domains & platform) to collect verified seller or store contacts along with product title, store description & more. 📊 Perfect for e-commerce lead generation, B2B outreach, product research & market analysis.

119

5.0

Advanced Ebay Scraper – Extract Product Data, Prices & Reviews

sovanza.inc/advanced-ebay-scraper-extract-product-data-prices-reviews

The eBay Product Scraper is a powerful Apify actor designed to extract detailed product data from eBay listings, including price, images, seller information, product variants, and reviews. It is ideal for e-commerce research, competitor analysis, and price monitoring.

E-commerce Email Scraper - Low-cost 💲🔥🔍🛒

delectable_incubator/e-commerce-email-scraper-low-cost

Scrape e-commerce contacts and store data 🔍🛒 with a powerful email scraper. Extract verified seller emails, contacts, product titles, store descriptions, and source links using keywords, domains, or platforms. Ideal for B2B lead generation, outreach campaigns and e-commerce market intelligence 📊

Noon Product Info Scraper

getdataforme/noon-productInfo-scraper

Project Cheerio Crawler Typescript is a web scraping tool that extracts detailed product data from e-commerce sites using the Cheerio library....

Ecommerce Price Scraper

fipper_ai/Ecommerce-Price-Scraper

Scrape product prices, ratings, and details from e-commerce websites in real-time. Ideal for price tracking, competitor analysis, dropshipping research, and market insights. Fast, reliable, and easy to use for automation and data collection.

Web scraping, E-commerce, Developer tools
