Product Data Extractor (price, stock, rating)
Pricing
Pay per usage
Extract clean, normalized product data (name, price, currency, availability, brand, rating, SKU/GTIN, image) from public product pages via JSON-LD, microdata, and OpenGraph. HTML-only, fast, structured output.
Product Data Extractor (Apify Actor)
Give it public product page URLs and get back clean, normalized product data (name, price, currency, availability, in-stock, brand, rating, SKU/GTIN/MPN, image) pulled from JSON-LD, microdata, and OpenGraph. It is HTML-only (no headless browser), so it's fast and cheap. Ideal for price monitoring, competitor tracking, catalog enrichment, and feed building.
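To illustrate the JSON-LD path, here is a minimal sketch of pulling a schema.org Product node out of raw HTML. The function name `findJsonLdProduct` is hypothetical, and the regex-based script matching is a simplification chosen to keep the example dependency-free; it is not necessarily how src/extract.js works.

```javascript
// Hypothetical helper (not the actor's actual code): return the first
// schema.org Product node found in any JSON-LD <script> block, or null.
function findJsonLdProduct(html) {
  const scriptRe =
    /<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;

  for (const match of html.matchAll(scriptRe)) {
    let data;
    try {
      data = JSON.parse(match[1]);
    } catch {
      continue; // malformed JSON-LD is common; skip the block
    }

    // Breadth-first walk over top-level arrays and @graph containers.
    const queue = [data];
    while (queue.length > 0) {
      const node = queue.shift();
      if (Array.isArray(node)) {
        queue.push(...node);
        continue;
      }
      if (node === null || typeof node !== "object") continue;

      // @type may be a string or an array of strings.
      const types = [].concat(node["@type"] ?? []);
      if (types.includes("Product")) return node;
      if (Array.isArray(node["@graph"])) queue.push(...node["@graph"]);
    }
  }
  return null;
}
```

A page with no matching node returns null, which would correspond to falling through to microdata or OpenGraph, assuming the sources are tried in that order.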
Why it's useful (and money-first)
Price/stock monitoring is one of the most-demanded scraping jobs. This actor turns messy
product markup (which comes in dozens of shapes: Offer vs AggregateOffer, price as string vs
number, 1.299,00 vs $1,299.00, availability URLs vs text) into one stable, tidy record.
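The normalization described above can be sketched as follows. Both helper names (`parsePrice`, `normalizeAvailability`) are hypothetical, and the separator heuristics are assumptions made for illustration. A lone separator such as "1,299" or "1.299" is genuinely ambiguous, and the actor may resolve it differently.

```javascript
// Hypothetical helper: accept a number or a locale-formatted string such as
// "1.299,00" or "$1,299.00" and return a plain number (or null).
function parsePrice(raw) {
  if (typeof raw === "number") return Number.isFinite(raw) ? raw : null;
  if (typeof raw !== "string") return null;

  // Drop currency symbols and spaces, then stray leading/trailing separators.
  const cleaned = raw.replace(/[^\d.,]/g, "").replace(/^[.,]+|[.,]+$/g, "");
  if (!/\d/.test(cleaned)) return null;

  const lastDot = cleaned.lastIndexOf(".");
  const lastComma = cleaned.lastIndexOf(",");
  let normalized;

  if (lastDot !== -1 && lastComma !== -1) {
    // Both present: whichever appears last is the decimal separator.
    const decimal = lastDot > lastComma ? "." : ",";
    const thousands = decimal === "." ? "," : ".";
    normalized = cleaned.split(thousands).join("").replace(decimal, ".");
  } else if (lastComma !== -1) {
    // Commas only. Assumption: one comma not followed by exactly three
    // digits is a decimal comma; anything else is thousands grouping.
    const parts = cleaned.split(",");
    const isDecimal = parts.length === 2 && parts[1].length !== 3;
    normalized = isDecimal ? parts.join(".") : parts.join("");
  } else {
    // Dots only. Assumption: several dots mean thousands grouping.
    const parts = cleaned.split(".");
    normalized = parts.length > 2 ? parts.join("") : cleaned;
  }

  const value = Number(normalized);
  return Number.isFinite(value) ? value : null;
}

// Hypothetical helper: map "https://schema.org/InStock" or "in stock"
// to a bare schema.org token such as "InStock" (or null if unrecognized).
function normalizeAvailability(raw) {
  if (typeof raw !== "string" || raw.trim() === "") return null;
  const token = raw.trim().split(/[\/#]/).pop().replace(/[\s_-]+/g, "").toLowerCase();
  const known = {
    instock: "InStock",
    outofstock: "OutOfStock",
    preorder: "PreOrder",
    backorder: "BackOrder",
    soldout: "SoldOut",
    discontinued: "Discontinued",
  };
  return known[token] ?? null;
}
```

Returning null rather than guessing keeps unparseable values visible downstream instead of silently producing a wrong price.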
Input
{"startUrls":[{"url":"https://scrapeme.live/shop/Bulbasaur/"}],"maxConcurrency":5,"maxPages":100}
maxPages capped at 200, maxConcurrency at 20 (cost guard).
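A cost guard of this kind might be sketched as below. The caps (200 and 20) come from the text above. The fallback defaults are an assumption borrowed from the example input, and `clampInput` is a hypothetical name.

```javascript
// Caps stated in this README.
const LIMITS = { maxPages: 200, maxConcurrency: 20 };
// Assumption: fallbacks mirror the example input; the real defaults may differ.
const DEFAULTS = { maxPages: 100, maxConcurrency: 5 };

// Hypothetical helper: coerce user input to positive integers within the caps.
function clampInput(input = {}) {
  const clamp = (value, fallback, max) => {
    const n = Number.parseInt(value, 10);
    if (!Number.isFinite(n) || n < 1) return fallback;
    return Math.min(n, max);
  };
  return {
    ...input,
    maxPages: clamp(input.maxPages, DEFAULTS.maxPages, LIMITS.maxPages),
    maxConcurrency: clamp(input.maxConcurrency, DEFAULTS.maxConcurrency, LIMITS.maxConcurrency),
  };
}
```

For example, `clampInput({ maxPages: 5000, maxConcurrency: 50 })` yields `maxPages: 200` and `maxConcurrency: 20`.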
Output: one STABLE record per URL (ok and error rows share the shape)
{"status":"ok","requested_url":"https://shop.example.com/widget","final_url":"https://shop.example.com/widget","http_status":200,"found":true,"source":"json-ld","name":"Acme Widget","brand":"Acme","price":19.99,"currency":"USD","availability":"InStock","in_stock":true,"rating_value":4.5,"rating_count":231,"sku":"AW-1","gtin":"0123456789012","mpn":null,"image":"https://cdn.example.com/w.jpg","description":"...","offers_count":1,"extracted_at":"2026-05-29T..."}
source is json-ld | microdata | opengraph | none. found:false means no product data
was present in the page markup (e.g. a blog or a JS-rendered shop). Failed fetches return the
same keys with status:"error" + error.
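One way to keep the shape stable is to build every row from a single template. This is a sketch only: the key list is taken from the example record above, while the null/0 placeholder values and the names `BASE_RECORD`, `makeRecord`, and `makeErrorRecord` are assumptions.

```javascript
// Keys copied from the example output; placeholder values are assumptions.
const BASE_RECORD = Object.freeze({
  status: "ok", requested_url: null, final_url: null, http_status: null,
  found: false, source: "none", name: null, brand: null, price: null,
  currency: null, availability: null, in_stock: null, rating_value: null,
  rating_count: null, sku: null, gtin: null, mpn: null, image: null,
  description: null, offers_count: 0, extracted_at: null,
});

// Hypothetical helper: overlay extracted fields onto the template,
// ignoring any key that is not part of the stable shape.
function makeRecord(requestedUrl, fields = {}) {
  const record = {
    ...BASE_RECORD,
    requested_url: requestedUrl,
    extracted_at: new Date().toISOString(),
  };
  for (const key of Object.keys(BASE_RECORD)) {
    if (fields[key] !== undefined) record[key] = fields[key];
  }
  return record;
}

// Hypothetical helper: same keys, plus status:"error" and an error message.
function makeErrorRecord(requestedUrl, message) {
  return { ...makeRecord(requestedUrl), status: "error", error: String(message) };
}
```

Because both kinds of row start from the same template, downstream consumers can rely on every column being present regardless of whether the fetch succeeded.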
Run locally / test
npm install
npm test   # unit tests on the pure extractor (node:test)
Publish to Apify (account-holder's step)
npm install -g apify-cli
apify login   # free Apify account
apify push    # from this directory
Keep it free initially; enable pricing later via the adult account-holder once it shows repeat organic usage and clears a margin gate.
Notes / safety
- SSRF-guarded (scheme + private/metadata IP block + redirect re-check), robots-respecting, rate-limited, cost-capped: all via the shared src/lib/actor_runner.js.
- Stores only derived product fields: no raw page bodies / PII.
- HTML-only: client-rendered shops that inject product JSON via JS will return found:false (no server-side markup to read).
- Core logic in src/extract.js (pure, unit-tested).
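The scheme and IP checks in the first note could look roughly like the sketch below. It is not the code in src/lib/actor_runner.js. The function names are hypothetical, and it only inspects literal hosts in the URL. A production guard would also need to resolve DNS names and check the resulting addresses, and the redirect re-check amounts to calling the same function on every redirect target.

```javascript
// Hypothetical helper: true for loopback, private, and link-local IPv4
// literals (169.254.0.0/16 includes the 169.254.169.254 metadata address).
function isBlockedIPv4(host) {
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (!m) return false;
  const a = Number(m[1]);
  const b = Number(m[2]);
  return (
    a === 0 ||
    a === 10 ||
    a === 127 ||
    (a === 169 && b === 254) ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168)
  );
}

// Hypothetical helper: allow only http(s) URLs whose literal host is public.
function isUrlAllowed(rawUrl) {
  let url;
  try {
    url = new URL(rawUrl);
  } catch {
    return false; // not a parseable absolute URL
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") return false;

  const host = url.hostname.toLowerCase();
  if (host === "localhost" || host.endsWith(".localhost")) return false;
  // Simplification: IPv6 literals (hostname starts with "[") are rejected outright.
  if (host.startsWith("[")) return false;
  return !isBlockedIPv4(host);
}
```

Rejecting IPv6 literals wholesale is a deliberately conservative choice for a sketch; public product pages are almost always addressed by hostname.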
