E-Commerce Product Scraper: Any Store, Any Country
Pricing
from $1.50 / 1,000 results
Scrape product data from any online store: price, title, stock, images, brand, SKU, specs. Works on Amazon, Rozetka, Walmart, eBay, AliExpress and 50+ more. 4-layer extraction: JSON-LD, Open Graph, Microdata, CSS. HTTP-first with Playwright fallback. No API key needed. Universal and reliable.
E-Commerce Product Scraper
Extract structured product data from any e-commerce website: title, price, original price, currency, availability, images, specs, reviews, and more.
Works with 100+ online stores worldwide. Uses a 4-layer extraction engine (JSON-LD with ProductGroup variant resolution, Open Graph, Microdata, and expanded CSS heuristics with 30+ selectors) plus smart old-price detection. HTTP-first fetching with automatic Playwright fallback for JavaScript-heavy sites. TLD-based currency inference for 30+ countries.
Features
- Universal extraction: works with any e-commerce site, not just specific stores
- 4-layer parsing: JSON-LD → Open Graph → Microdata → CSS heuristics for maximum coverage
- Smart rendering: tries fast HTTP first; switches to headless browser only when needed
- Structured output: clean JSON with title, price, currency, stock status, images, brand, SKU, specs
- Multi-currency: auto-detects UAH, USD, EUR, GBP, PLN, CZK, RON
- Breadcrumbs: extracts product category path when available
- Proxy support: works with Apify proxy for anti-bot bypass
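The TLD-based currency inference mentioned above can be pictured roughly as follows. This is a minimal sketch, not the actor's actual table: the `TLD_CURRENCY` map and the `infer_currency` helper are hypothetical names, and the map covers only a handful of country-code TLDs.

```python
from typing import Optional
from urllib.parse import urlparse

# Hypothetical TLD -> ISO 4217 map; the actor's real table is not shown in this README.
TLD_CURRENCY = {
    "ua": "UAH",
    "us": "USD",
    "de": "EUR",
    "fr": "EUR",
    "uk": "GBP",
    "pl": "PLN",
    "cz": "CZK",
    "ro": "RON",
}


def infer_currency(url: str, default: Optional[str] = None) -> Optional[str]:
    """Guess a currency from the last label of the URL's hostname."""
    host = urlparse(url).hostname or ""
    tld = host.rsplit(".", 1)[-1].lower()
    # Generic TLDs such as .com carry no country signal, so they fall through to `default`.
    return TLD_CURRENCY.get(tld, default)
```

A guess like this would presumably only be used when the page markup itself states no currency, since a generic TLD such as `.com` says nothing about the store's country.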
Supported Stores (tested)
| Region | Stores |
|---|---|
| 🇺🇦 Ukraine | Rozetka, Foxtrot, Epicentr, Comfy, Allo, Citrus, Moyo, Prom.ua |
| 🇪🇺 Europe | Amazon.de, MediaMarkt, Notino, Zara, H&M, IKEA |
| 🌍 Global | Amazon.com, eBay, AliExpress*, Best Buy, Walmart |
*AliExpress requires Playwright mode (set forcePlaywright: true)
The scraper also works with any other e-commerce site that uses standard product markup (JSON-LD, Open Graph, or Microdata), which is the vast majority of online stores.
Input
{
  "urls": [
    "https://rozetka.com.ua/ua/some-product/p123456/",
    "https://www.amazon.com/dp/B0EXAMPLE/"
  ],
  "forcePlaywright": false,
  "maxConcurrency": 5
}
| Field | Type | Description |
|---|---|---|
| `urls` | string[] | Required. Product page URLs to scrape |
| `forcePlaywright` | boolean | Force headless browser for all URLs (default: false) |
| `maxConcurrency` | integer | Max parallel pages (default: 5, max: 20) |
| `proxyConfiguration` | object | Proxy settings (Apify proxy recommended for protected sites) |
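As an illustration of the input fields above, here is a small Python sketch that assembles and validates an input object before a run. The `build_input` helper is hypothetical. The lower bound of 1 for `maxConcurrency` and the `{"useApifyProxy": true}` proxy shape are assumptions; the latter follows the usual Apify proxy input convention, which this README does not spell out.

```python
import json

MAX_CONCURRENCY_LIMIT = 20  # documented maximum for maxConcurrency


def build_input(urls, force_playwright=False, max_concurrency=5, use_apify_proxy=False):
    """Build an input dict using the field names from the table above."""
    if not urls:
        raise ValueError("urls is required and must not be empty")
    # Assumption: at least one parallel page; the README only states the maximum.
    if not 1 <= max_concurrency <= MAX_CONCURRENCY_LIMIT:
        raise ValueError(f"maxConcurrency must be between 1 and {MAX_CONCURRENCY_LIMIT}")
    run_input = {
        "urls": list(urls),
        "forcePlaywright": bool(force_playwright),
        "maxConcurrency": max_concurrency,
    }
    if use_apify_proxy:
        # Assumption: the common Apify proxy input shape.
        run_input["proxyConfiguration"] = {"useApifyProxy": True}
    return run_input


if __name__ == "__main__":
    example = build_input(
        ["https://rozetka.com.ua/ua/some-product/p123456/"],
        max_concurrency=3,
        use_apify_proxy=True,
    )
    print(json.dumps(example, indent=2))
```

With the official `apify-client` package, a dict like this would typically be passed as `run_input` to `client.actor(...).call(...)`; the actor ID is not given in this section, so it is left out here.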
Output
Each product is saved to the dataset as a JSON object:
{
  "url": "https://rozetka.com.ua/ua/samsung-galaxy-s24/p395058825/",
  "store": "rozetka.com.ua",
  "title": "Samsung Galaxy S24 Ultra 12/256GB Titanium Black",
  "price": 51999.0,
  "currency": "UAH",
  "in_stock": true,
  "image": "https://content.rozetka.com.ua/...",
  "brand": "Samsung",
  "sku": "SM-S928BZKDSEK",
  "description": "Смартфон Samsung Galaxy S24 Ultra...",
  "rating": 4.8,
  "review_count": 342,
  "breadcrumbs": ["Смартфони", "Samsung", "Galaxy S24"],
  "extraction_method": "json-ld"
}
Output fields
| Field | Type | Description |
|---|---|---|
| `url` | string | Original URL |
| `store` | string | Store domain |
| `title` | string | Product name |
| `price` | float | Price as a number |
| `currency` | string | ISO currency code (UAH, USD, EUR, etc.) |
| `in_stock` | boolean | Availability status |
| `image` | string | Main product image URL |
| `brand` | string | Brand name |
| `sku` | string | Product SKU or MPN |
| `description` | string | Short description (max 500 chars) |
| `rating` | float | Average rating (if available) |
| `review_count` | integer | Number of reviews (if available) |
| `breadcrumbs` | string[] | Category path |
| `specs` | object | Technical specifications (if available) |
| `extraction_method` | string | Which extraction layer succeeded |
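Downstream code will usually have to tolerate fields that are missing for a given product. The sketch below shows one way to consume dataset items as plain dicts and pick the cheapest in-stock offer per currency. The `cheapest_in_stock` helper is hypothetical, and it assumes that unavailable fields are either absent or `null`, which this README does not specify.

```python
def cheapest_in_stock(items):
    """Return {currency: item} holding the lowest-priced in-stock item per currency."""
    best = {}
    for item in items:
        price = item.get("price")
        currency = item.get("currency")
        # Skip records that are out of stock or lack a usable price or currency.
        if not item.get("in_stock") or price is None or currency is None:
            continue
        if currency not in best or price < best[currency]["price"]:
            best[currency] = item
    return best


if __name__ == "__main__":
    sample = [
        {"store": "rozetka.com.ua", "price": 51999.0, "currency": "UAH", "in_stock": True},
        {"store": "comfy.ua", "price": 50499.0, "currency": "UAH", "in_stock": True},
        {"store": "amazon.com", "price": 1199.0, "currency": "USD", "in_stock": False},
    ]
    for currency, item in cheapest_in_stock(sample).items():
        print(currency, item["store"], item["price"])
```

Prices are compared only within the same currency because the output carries no exchange-rate information.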
How it works
The scraper uses a 4-layer extraction strategy, running each layer in order and filling in missing data:
- JSON-LD (highest confidence): parses `<script type="application/ld+json">` with `@type: Product`
- Open Graph: reads `<meta property="og:*">` and `<meta property="product:*">` tags
- Microdata: finds `itemscope itemtype="schema.org/Product"` attributes
- CSS Heuristics: falls back to common CSS selector patterns for price, title, etc.
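To make the "run in order, fill in missing data" idea concrete, here is a simplified sketch of the first two layers using only the Python standard library. It is not the actor's code: all names are hypothetical, the Microdata and CSS layers are left out, nested `@graph` structures are ignored, and the `"open-graph"` label for `extraction_method` is an assumption (the README only shows `"json-ld"`).

```python
import json
from html.parser import HTMLParser


class _MarkupCollector(HTMLParser):
    """Collects JSON-LD script bodies and <meta property=...> tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}
        self.jsonld_blocks = []
        self._in_jsonld = False
        self._buf = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "script" and (a.get("type") or "").lower() == "application/ld+json":
            self._in_jsonld, self._buf = True, []
        elif tag == "meta" and a.get("property") and a.get("content") is not None:
            self.meta[a["property"]] = a["content"]

    def handle_data(self, data):
        if self._in_jsonld:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.jsonld_blocks.append("".join(self._buf))
            self._in_jsonld = False


def _to_float(value):
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def from_jsonld(blocks):
    """Layer 1: first top-level JSON-LD node with @type Product."""
    for raw in blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue
        for node in data if isinstance(data, list) else [data]:
            if not isinstance(node, dict) or node.get("@type") != "Product":
                continue
            offers = node.get("offers") or {}
            if isinstance(offers, list):
                offers = offers[0] if offers else {}
            if not isinstance(offers, dict):
                offers = {}
            brand = node.get("brand")
            if isinstance(brand, dict):
                brand = brand.get("name")
            return {
                "title": node.get("name"),
                "price": _to_float(offers.get("price")),
                "currency": offers.get("priceCurrency"),
                "brand": brand,
                "sku": node.get("sku"),
            }
    return {}


def from_open_graph(meta):
    """Layer 2: Open Graph and product:* meta tags."""
    return {
        "title": meta.get("og:title"),
        "price": _to_float(meta.get("product:price:amount")),
        "currency": meta.get("product:price:currency"),
        "image": meta.get("og:image"),
    }


def extract(html):
    """Run layers in order; later layers only fill fields that are still empty."""
    collector = _MarkupCollector()
    collector.feed(html)
    layers = (
        ("json-ld", from_jsonld(collector.jsonld_blocks)),
        ("open-graph", from_open_graph(collector.meta)),
    )
    result = {}
    for name, layer in layers:
        for key, value in layer.items():
            if value is not None and result.get(key) is None:
                result[key] = value
                # Records the first layer that contributed anything.
                result.setdefault("extraction_method", name)
    return result
```

Because earlier layers win on conflicts, a page whose JSON-LD and Open Graph titles differ keeps the JSON-LD title, while a field found only in Open Graph (such as the image here) is still picked up.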
If the HTTP fetch returns weak data (no title or no price), the scraper automatically retries with a headless Chromium browser (Playwright) to handle JavaScript-rendered pages.
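The fallback decision can be sketched as below. The fetchers and the extractor are passed in as plain callables so the control flow stays visible; `scrape`, `is_weak`, and the `rendered_with` field are hypothetical names for illustration, not part of the actor's output.

```python
from typing import Callable, Dict


def is_weak(product: Dict) -> bool:
    """Weak data as described above: no title or no price."""
    return not product.get("title") or product.get("price") is None


def scrape(
    url: str,
    fetch_http: Callable[[str], str],
    fetch_browser: Callable[[str], str],
    extract: Callable[[str], Dict],
    force_playwright: bool = False,
) -> Dict:
    """Try the cheap HTTP path first; fall back to a rendered page when needed."""
    if not force_playwright:
        product = extract(fetch_http(url))
        if not is_weak(product):
            return {**product, "url": url, "rendered_with": "http"}
    # Either forced, or the static HTML lacked a title or price.
    product = extract(fetch_browser(url))
    return {**product, "url": url, "rendered_with": "playwright"}
```

In a real setup `fetch_browser` would wrap a Playwright page load; keeping it injectable also makes the decision logic easy to test without launching a browser.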
Use Cases
- Price monitoring: track competitor prices across multiple stores
- Market research: collect pricing data for analysis
- Product catalog: build product databases from multiple sources
- Dropshipping: check prices and availability across suppliers
- Price comparison: aggregate offers for the same product
Tips
- For best results with protected sites (Cloudflare, AWS WAF), enable Apify Proxy
- Set `forcePlaywright: true` for sites known to require JavaScript (AliExpress, some fashion stores)
- Keep `maxConcurrency` at 3-5 for sites with aggressive rate limiting
- The scraper respects `robots.txt`; use responsibly
Cost estimate
| Mode | Compute units per URL | Cost* |
|---|---|---|
| HTTP only | ~0.005 | ~$0.0005 |
| Playwright | ~0.05-0.1 | ~$0.005-0.01 |
| Mixed (auto) | ~0.01-0.03 avg | ~$0.001-0.003 |
*Based on Apify platform pricing. Actual costs depend on page complexity and proxy usage.
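For a rough budget, the per-URL figures in the table can be combined with an assumed share of URLs that end up needing Playwright. The sketch below uses only the approximate numbers from the table and ignores proxy usage; `estimate_cost` and the 20% share in the example are illustrative assumptions.

```python
HTTP_COST = 0.0005                # ~$ per URL, HTTP only (from the table)
PLAYWRIGHT_COST = (0.005, 0.01)   # ~$ per URL, Playwright low/high (from the table)


def estimate_cost(total_urls: int, playwright_share: float):
    """Return a (low, high) dollar estimate for a run."""
    if not 0.0 <= playwright_share <= 1.0:
        raise ValueError("playwright_share must be between 0 and 1")
    browser_urls = total_urls * playwright_share
    http_urls = total_urls - browser_urls
    low = http_urls * HTTP_COST + browser_urls * PLAYWRIGHT_COST[0]
    high = http_urls * HTTP_COST + browser_urls * PLAYWRIGHT_COST[1]
    return round(low, 2), round(high, 2)


if __name__ == "__main__":
    # 10,000 URLs with an assumed 20% falling back to Playwright.
    print(estimate_cost(10_000, 0.2))  # (14.0, 24.0)
```

That example works out to about $0.0014 to $0.0024 per URL, which sits inside the "Mixed (auto)" range given in the table.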
