Pricing
from $1.00 / 1,000 results
E-commerce Website Scraping Tool
Scrape public e-commerce pages for product names, prices, stock status, reviews, sellers, and price changes. Use URLs or keywords, export clean data, and keep runs cheap with low-cost defaults.
Extract product, price, review, seller, and price-change data from public e-commerce websites.
E-commerce Website Scraping Tool helps you turn online store pages into clean, downloadable data. Add product URLs, category or listing pages, review pages, seller pages, or search keywords, then run the actor to collect structured results you can open in Apify, download to a spreadsheet, send to an API, or connect to your own tools.
The actor is designed for public web pages and everyday e-commerce research tasks such as competitor price checks, product catalog tracking, review analysis, seller monitoring, and market discovery.
What can this actor do?
- Collect product names, prices, currencies, descriptions, stock status, categories, images, ratings, and identifiers.
- Extract reviews when they are visible on the page.
- Capture seller and offer information where the website shows it.
- Discover product pages from search keywords and optional domain filters.
- Monitor prices and stock over time by comparing a new run with a previous dataset.
- Export results from Apify in formats such as JSON, CSV, Excel, XML, or through the Apify API.
- Keep costs low by using lightweight HTTP scraping by default, with browser mode available only when needed.
Common use cases
| Use case | How this actor helps |
|---|---|
| Price monitoring | Track product prices and stock status on competitor websites. |
| Market research | Build a list of products, brands, categories, and prices in a market. |
| Product comparison | Collect product details from several stores and compare them in one dataset. |
| Review analysis | Gather customer reviews for sentiment analysis, product feedback, or quality research. |
| Seller research | Extract visible seller names, seller pages, offers, shipping details, and prices. |
| Catalog checks | Monitor whether products are in stock, discounted, renamed, or removed. |
| AI and analytics workflows | Send structured e-commerce data into spreadsheets, dashboards, databases, or AI tools. |
What data can you extract?
The exact fields depend on what the website shows publicly, but the actor can collect the following types of information.
| Product data | Price and availability | Reviews and sellers |
|---|---|---|
| Product name | Current price | Review title |
| Product URL | Currency | Review text |
| Description | List price, when available | Review rating |
| Brand | Discount text, when available | Review date |
| Category and breadcrumbs | Stock status | Verified purchase flag, when visible |
| Product identifiers such as SKU, GTIN, EAN, UPC, ISBN, ASIN, or MPN | Shipping text, when visible | Seller name |
| Images | Product variants | Seller URL |
| Rating and review count | Offers from sellers, when visible | Seller or offer price |
How it works
- You provide product URLs, listing URLs, review URLs, seller URLs, or keywords.
- The actor opens those public pages and reads the visible product information.
- It organizes the extracted data into clean rows.
- You view the results in Apify or export them to your preferred format.
- For price monitoring, you can schedule repeat runs and compare the latest data with an earlier dataset.
For most simple product pages, the actor uses fast HTTP scraping. For pages that load important data with JavaScript, you can enable browser scraping or automatic browser fallback.
How to use E-commerce Website Scraping Tool
- Open the actor in Apify Console.
- Choose what you want to extract in the Operation field.
- Add one or more product, listing, review, seller, or storefront URLs in Start URLs.
- Optionally add Keywords and Search domains to discover product pages from search results.
- Keep the default Lowest cost settings for a cheap first run.
- Click Start.
- Open the dataset when the run finishes and download the data or connect it to another tool.
Operations
| Operation | Best for | What it returns |
|---|---|---|
| Products | Product pages and product listings | Product rows with price, stock, category, rating, images, variants, and identifiers. |
| Reviews | Review pages or product pages with visible reviews | One row per review. |
| Sellers/offers | Seller pages, offer blocks, or product pages with visible sellers | Seller and offer rows. |
| Discovery | Finding product pages from keywords, search pages, or listing pages | Discovered product URLs and basic product details. |
| Food delivery | Public menu or delivery pages | Menu/product rows, using your delivery location when prices depend on location. |
| Influencer storefront | Public storefront or promoted-product pages | Storefront posts, product links, images, and creator names where visible. |
| Price monitor | Scheduled tracking of known product URLs | Current product rows plus price or stock change records when compared with a previous dataset. |
Input options
| Input | Required? | Plain-English explanation |
|---|---|---|
| Operation | Yes | Choose whether you want products, reviews, sellers, discovery, food delivery, influencer storefronts, or price monitoring. |
| Start URLs | Usually yes | Product pages, category pages, review pages, seller pages, menu pages, or storefront pages to scrape. |
| Keywords | Optional | Search terms used to find product pages. Use this when you do not already have exact URLs. |
| Search domains | Optional | Limit keyword search to specific websites, such as example.com. |
| Country code and locale | Optional | Helps with localized search and output preferences. |
| Preferred currency | Optional | Used when a page shows a price but does not clearly show the currency. |
| Delivery location | Required for food delivery | Address, ZIP code, city, or region used for location-sensitive menu pages. |
| Maximum records | Optional | Stops the run after a set number of output rows. |
| Maximum requests per run | Optional | Adds a hard safety limit so a run does not crawl too many pages. |
| Scrape mode | Optional | Choose fast HTTP scraping, browser scraping, or automatic mode. |
| Proxy configuration | Optional | Leave off for lower cost. Turn on only when a site blocks normal requests or requires a specific location. |
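Before starting a run, it can help to sanity-check the input locally. The sketch below is a hypothetical helper, not part of the actor: `check_input` and its rules are assumptions, and it only uses field names that appear in this README's example inputs (`operation`, `startUrls`, `maxItems`, `maxRequestsPerRun`).

```python
from urllib.parse import urlparse


def check_input(run_input: dict) -> list:
    """Return a list of human-readable problems; an empty list means no issue was found.

    Hypothetical pre-flight check. The actor performs its own validation,
    which may be stricter or looser than this.
    """
    problems = []

    # The Operation field is marked as required in the input table.
    if not run_input.get("operation"):
        problems.append("operation is required")

    # Every start URL should be an absolute http(s) URL.
    for entry in run_input.get("startUrls", []):
        parsed = urlparse(entry.get("url", ""))
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append(f"not an absolute http(s) URL: {entry.get('url')!r}")

    # Limits only make sense as positive integers when they are present.
    for key in ("maxItems", "maxRequestsPerRun"):
        if key in run_input:
            value = run_input[key]
            if not isinstance(value, int) or value <= 0:
                problems.append(f"{key} must be a positive integer")

    return problems
```

Calling `check_input` on the example input shown in this README should return an empty list.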
Example input
This is a simple low-cost product scraping run.
{"operation":"PRODUCTS","costPreset":"LOWEST_COST","scrapeMode":"HTTP","startUrls":[{"url":"https://example.com/products/demo-headphones"}],"maxItems":25,"maxConcurrency":3,"maxRequestsPerRun":100,"allowBrowserFallback":false,"proxyConfiguration":{"useApifyProxy":false}}
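If you prefer to start the run from code, the same input can be sent through the Apify Python client. This is a minimal sketch, assuming the `apify-client` package is installed; `build_products_input` and `run_actor` are illustrative names, and the actor ID and token are placeholders you must supply yourself.

```python
def build_products_input(urls, max_items=25):
    """Build the low-cost product input from this README for a list of URLs."""
    return {
        "operation": "PRODUCTS",
        "costPreset": "LOWEST_COST",
        "scrapeMode": "HTTP",
        "startUrls": [{"url": url} for url in urls],
        "maxItems": max_items,
        "maxConcurrency": 3,
        "maxRequestsPerRun": 100,
        "allowBrowserFallback": False,
        "proxyConfiguration": {"useApifyProxy": False},
    }


def run_actor(actor_id, run_input, token):
    """Start the actor, wait for it to finish, and return the dataset rows.

    actor_id and token are placeholders: use the ID shown in Apify Console
    and your own API token.
    """
    # Imported lazily so the input builder works without the package installed.
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

A call such as `run_actor("YOUR_ACTOR_ID", build_products_input(["https://example.com/products/demo-headphones"]), "YOUR_APIFY_TOKEN")` would then return the product rows as Python dictionaries.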
Example price monitoring input
Use this when you already have product URLs and want to compare the latest scrape with a previous Apify dataset.
{"operation":"PRICE_MONITOR","scrapeMode":"HTTP","startUrls":[{"url":"https://example.com/products/demo-headphones"}],"previousDatasetId":"YOUR_PREVIOUS_DATASET_ID","maxItems":100}
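To make the comparison idea concrete, here is a rough sketch of how two sets of product rows could be diffed by URL. It is not the actor's own logic: the function name and the output keys (`previousPrice`, `currentPrice`, `priceDelta`, `stockChanged`) are assumptions for illustration, and rows are expected to follow the product shape shown in the example output.

```python
def price_changes(previous_rows, current_rows):
    """Compare product rows from two runs and report price or stock changes."""
    # Index the earlier run by product URL for quick lookup.
    previous_by_url = {row["url"]: row for row in previous_rows if row.get("url")}

    changes = []
    for row in current_rows:
        old = previous_by_url.get(row.get("url"))
        if old is None:
            continue  # product not seen before, nothing to compare against

        old_price = (old.get("price") or {}).get("amount")
        new_price = (row.get("price") or {}).get("amount")
        both_priced = old_price is not None and new_price is not None
        price_changed = both_priced and old_price != new_price
        stock_changed = old.get("availability") != row.get("availability")

        if price_changed or stock_changed:
            changes.append({
                "url": row["url"],
                "previousPrice": old_price,
                "currentPrice": new_price,
                # Rounded to cents to hide floating-point noise.
                "priceDelta": round(new_price - old_price, 2) if both_priced else None,
                "stockChanged": stock_changed,
            })
    return changes
```

Products that appear only in the current run are skipped here; a real monitor might report them separately as new listings.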
Example output
Results are saved to the actor dataset. A product row can look like this:
{"recordType":"product","url":"https://example.com/products/demo-headphones","name":"Demo Wireless Headphones","brand":{"name":"Demo Brand"},"price":{"amount":129.99,"currency":"USD","text":"$129.99"},"availability":"IN_STOCK","category":{"breadcrumbs":["Electronics","Headphones"],"path":"Electronics > Headphones"},"rating":{"ratingValue":4.6,"bestRating":5,"reviewCount":248},"source":{"domain":"example.com"},"extraction":{"confidence":0.86,"mode":"HTTP"}}
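Because product rows are nested, spreadsheet work is often easier after flattening them. The sketch below shows one possible flattening of the row above into a single-level record and a CSV string. The column names (`price_amount`, `category_path`, and so on) are my own choices, not fields produced by the actor.

```python
import csv
import io

# Trimmed copy of the example product row from this README.
SAMPLE_ROW = {
    "recordType": "product",
    "url": "https://example.com/products/demo-headphones",
    "name": "Demo Wireless Headphones",
    "brand": {"name": "Demo Brand"},
    "price": {"amount": 129.99, "currency": "USD", "text": "$129.99"},
    "availability": "IN_STOCK",
    "category": {"path": "Electronics > Headphones"},
    "rating": {"ratingValue": 4.6, "bestRating": 5, "reviewCount": 248},
    "extraction": {"confidence": 0.86, "mode": "HTTP"},
}


def flatten_product(row):
    """Pick the most commonly used nested values into a flat record."""
    price = row.get("price") or {}
    rating = row.get("rating") or {}
    return {
        "url": row.get("url"),
        "name": row.get("name"),
        "brand": (row.get("brand") or {}).get("name"),
        "price_amount": price.get("amount"),
        "price_currency": price.get("currency"),
        "availability": row.get("availability"),
        "category_path": (row.get("category") or {}).get("path"),
        "rating_value": rating.get("ratingValue"),
        "review_count": rating.get("reviewCount"),
        "confidence": (row.get("extraction") or {}).get("confidence"),
    }


def to_csv(rows):
    """Serialize flattened product rows to CSV text; missing values become empty cells."""
    flat = [flatten_product(row) for row in rows]
    if not flat:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(flat[0].keys()))
    writer.writeheader()
    writer.writerows(flat)
    return buffer.getvalue()
```

Missing nested objects are treated as empty, so a row without a rating or brand still produces a complete record with blank cells.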
Dataset views
The actor includes ready-made dataset views so you can quickly inspect the data in Apify.
| View | What it shows |
|---|---|
| Products | Product names, brands, prices, availability, ratings, URLs, and confidence score. |
| Reviews | Review text, rating, date, author alias, helpful votes, and product link. |
| Sellers and offers | Seller names, seller URLs, offer prices, shipping text, availability, and product link. |
| Price changes | Previous price, current price, price delta, stock change, alert severity, and detection time. |
| All fields | The full raw output for advanced analysis. |
The actor also writes a run summary to the key-value store under OUTPUT. Price monitor runs with detected changes also write an ALERTS record.
Keeping costs low
This actor is configured to start cheaply by default.
- Lowest cost is the default cost preset.
- HTTP mode is the default scrape mode because it is faster and cheaper than launching a browser.
- Browser fallback is off by default.
- Apify Proxy is off by default.
- Discovery enrichment is off by default, so keyword discovery does not automatically open every discovered product page.
- Review media is off by default to reduce output size.
- Default limits are modest: 25 records, 100 requests per run, 3 concurrent requests, 1 retry, and a 30-second request timeout.
For the cheapest test run, use one product URL, keep HTTP mode, leave proxy disabled, and set a small maximum record limit.
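As a rough guide to what a run might cost, the listed price of "from $1.00 / 1,000 results" can be turned into simple arithmetic. This sketch assumes that starting price applies to your plan and ignores any other platform charges, so treat the numbers as a lower-bound estimate rather than a quote.

```python
# Starting price shown on the actor page; your actual rate may differ.
PRICE_PER_1000_RESULTS_USD = 1.00


def estimate_result_cost(result_count, price_per_1000=PRICE_PER_1000_RESULTS_USD):
    """Estimate the per-result charge in USD for a given number of output rows."""
    return result_count / 1000 * price_per_1000


def estimate_monitoring_cost(products, runs_per_day, days=30):
    """Estimate the per-result charge for a recurring price monitor."""
    return estimate_result_cost(products * runs_per_day * days)
```

Under these assumptions, a default 25-record test run comes to about $0.025, and monitoring 100 products once a day for 30 days produces 3,000 results, or about $3.00.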
For harder websites, use the settings below only when needed:
{"costPreset":"BALANCED","scrapeMode":"AUTO","allowBrowserFallback":true,"maxBrowserFallbackPages":5,"proxyConfiguration":{"useApifyProxy":true}}
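One way to apply these settings "only when needed" is to start with the cheap input, inspect the rows, and escalate only if the result looks incomplete. The sketch below illustrates that idea. `looks_incomplete` and its 50% threshold are a hypothetical heuristic of mine, not something the actor provides.

```python
import copy

# Overrides copied from the "harder websites" settings in this README.
HARDER_SITE_OVERRIDES = {
    "costPreset": "BALANCED",
    "scrapeMode": "AUTO",
    "allowBrowserFallback": True,
    "maxBrowserFallbackPages": 5,
    "proxyConfiguration": {"useApifyProxy": True},
}


def looks_incomplete(rows, min_priced_share=0.5):
    """Hypothetical heuristic: no rows, or too few rows carry a price amount."""
    if not rows:
        return True
    priced = sum(
        1 for row in rows if (row.get("price") or {}).get("amount") is not None
    )
    return priced / len(rows) < min_priced_share


def escalate(run_input):
    """Return a copy of the input with the harder-site overrides applied."""
    merged = copy.deepcopy(run_input)
    merged.update(copy.deepcopy(HARDER_SITE_OVERRIDES))
    return merged
```

The original input is left untouched, so the cheap settings remain available for sites that do not need the extra cost.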
Tips for better results
- Start with direct product URLs when accuracy matters most.
- Use listing or category pages when you want to collect many products from the same store.
- Use keywords plus search domains when you want to discover product pages from selected websites.
- Use browser mode for websites that show empty pages or load prices only after JavaScript runs.
- Use proxy only when direct requests are blocked or location matters.
- Keep `maxItems` and `maxRequestsPerRun` low while testing a new website.
- Schedule recurring price monitor runs to build your own price history over time.
Frequently asked questions
Can I scrape Amazon, Walmart, eBay, Shopify stores, or other e-commerce sites?
You can use the actor with many public e-commerce websites, including marketplaces, regional retailers, and independent stores. Results depend on how each website displays its data and whether the page is publicly accessible. Heavily protected or JavaScript-heavy websites may require browser mode, proxy settings, or a more specialized actor.
Can it scrape pages behind a login?
No. This actor is intended for publicly available pages. It does not log in to accounts, bypass paywalls, or access private customer information.
Can it collect historical prices?
It captures the current data visible at the time of each run. To build history, schedule regular runs and save each dataset. Price monitor mode can compare the current run with a previous dataset and create price or stock change records.
Why are some fields empty?
Some websites do not show every field, or they load certain fields later with JavaScript. Empty fields usually mean the information was not visible in the page version the actor could access. Try browser mode, automatic fallback, or proxy settings if the page looks incomplete.
How often should I run it for price monitoring?
Daily runs are enough for many products. Use hourly runs for fast-changing categories, flash sales, or highly competitive products. Use weekly runs for slower catalogs or long-term market tracking.
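Apify schedules are defined with cron expressions, so these cadences can be written down as follows. The specific times (06:00, Mondays) are arbitrary choices for illustration, and the `CRON_BY_CADENCE` mapping is just a convenience, not an actor setting.

```python
# Standard five-field cron expressions: minute hour day-of-month month day-of-week.
CRON_BY_CADENCE = {
    "hourly": "0 * * * *",   # at minute 0 of every hour
    "daily": "0 6 * * *",    # every day at 06:00
    "weekly": "0 6 * * 1",   # every Monday at 06:00
}


def cron_for(cadence):
    """Look up the cron expression for a cadence name, failing loudly on typos."""
    try:
        return CRON_BY_CADENCE[cadence]
    except KeyError:
        raise ValueError(f"unknown cadence: {cadence!r}") from None
```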
Can I export the data?
Yes. Apify datasets can be downloaded in common formats such as JSON, CSV, Excel, XML, and HTML. You can also use the Apify API, integrations, webhooks, or scheduling to send results into other systems.
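For scripted exports, the Apify API exposes dataset items at a URL that takes a `format` query parameter. The helper below is a small sketch that builds such a URL. The function name is illustrative, and depending on your dataset's access settings you may also need to send your API token.

```python
from urllib.parse import urlencode

# Formats accepted by the Apify dataset items endpoint.
EXPORT_FORMATS = {"json", "jsonl", "csv", "xlsx", "xml", "html", "rss"}


def dataset_export_url(dataset_id, fmt="csv"):
    """Build the download URL for a dataset in the requested format."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    query = urlencode({"format": fmt})
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{query}"
```

For example, `dataset_export_url("YOUR_DATASET_ID", "xlsx")` gives a link that downloads the run's results as an Excel file.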
Is e-commerce scraping legal?
This actor is designed for publicly available information. Web scraping rules vary by website and country, so you should review the target website's terms and follow applicable laws and policies.
Local development
Most users do not need this section. It is only for developers who want to run or modify the actor locally.
npm install
npm run build
npm test
npm run dev
Support
If a page does not work as expected, include the target URL, your input settings, and the run ID when reporting the issue. This makes it much easier to reproduce the problem and improve extraction for that website.
