URL: https://apify.com/datavault/schemaorg

Structured Data Scraper (Schema.org) · Apify


๐Ÿ‘ Structured Data Scraper (Schema.org) avatar

Structured Data Scraper (Schema.org)

Pricing

Pay per event

Fast, lightweight scraper that extracts structured data (JSON-LD & microdata) from HTML pages. Ideal for e-commerce and sites that embed schema.org markup without heavy client-side rendering.

Rating

0.0 (0)

Developer

๐Ÿ‘ Datavault

Datavault

Maintained by Community

Actor stats

  • Bookmarked: 1
  • Total users: 56
  • Monthly active users: 1
  • Last modified: 4 months ago

A fast scraper optimized for sites that publish schema.org structured data in their static HTML, without heavy client-side rendering. It works well for e-commerce sites.

Speed first. Lightweight because it parses static HTML instead of launching a browser. Pages that require client-side rendering may need a headless browser (for example Playwright or Puppeteer).

What you get

  • Schema.org payloads collected from JSON-LD <script> tags and microdata attributes.
  • Final URL, status code, and page title for quick validation.
  • Dataset output suitable for feeding into validation tools or downstream pipelines.
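To make the JSON-LD part of the list above concrete, here is a minimal sketch of collecting JSON-LD payloads from static HTML using only the Python standard library. It is not the actor's actual implementation; the class name `JsonLdCollector` and the function `extract_json_ld` are hypothetical, and skipping malformed blocks is an assumption about how errors might be handled.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collects parsed contents of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            # The type attribute may be missing or valueless, hence the `or ""`.
            script_type = (dict(attrs).get("type") or "").strip().lower()
            self._in_jsonld = script_type == "application/ld+json"
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                # Assumption: a malformed block is skipped, not fatal.
                pass


def extract_json_ld(html):
    """Return every JSON-LD block found in the given HTML string."""
    parser = JsonLdCollector()
    parser.feed(html)
    parser.close()
    return parser.blocks
```

Because this only reads the HTML that the server returns, markup injected later by client-side JavaScript would not be seen, which matches the static-HTML trade-off described earlier.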

Input

Provide at least one URL via url (string, array, or Apify request object) or urls (array). Optional settings:

  • maxRequestsPerCrawl – stop the crawl after N requests (defaults to the number of provided URLs).
  • proxyConfiguration – standard Apify proxy configuration block.
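The accepted input shapes can be illustrated with a short normalization sketch. This is an assumption about how such input might be flattened, not the actor's own code: the helper names `normalize_start_urls` and `effective_max_requests` are hypothetical, and request objects are assumed to carry the address under a `url` key, as in the sample input.

```python
def normalize_start_urls(actor_input):
    """Flatten `url` / `urls` into an ordered list of URL strings."""
    candidates = []
    for key in ("url", "urls"):
        value = actor_input.get(key)
        if value is None:
            continue
        # A bare string or a single request object counts as one entry.
        candidates.extend(value if isinstance(value, list) else [value])

    urls = []
    for entry in candidates:
        # Assumption: request objects hold the address under "url".
        url = entry.get("url") if isinstance(entry, dict) else entry
        if isinstance(url, str) and url.strip():
            urls.append(url.strip())

    if not urls:
        raise ValueError("Provide at least one URL via 'url' or 'urls'")
    return urls


def effective_max_requests(actor_input):
    """Fall back to the number of provided URLs when no limit is set."""
    limit = actor_input.get("maxRequestsPerCrawl")
    return limit if limit else len(normalize_start_urls(actor_input))
```

With this sketch, `{"url": "https://a.example/"}`, `{"url": [{"url": "https://a.example/"}]}`, and `{"urls": ["https://a.example/"]}` all resolve to the same single-entry list.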

Output

Each dataset item contains:

  • inputUrl, loadedUrl, statusCode, title, retrievedAt
  • schema.jsonLd – parsed JSON-LD blocks
  • schema.microdata – microdata trees normalised into nested objects
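As a rough illustration of consuming these items downstream, the sketch below lists the schema.org types declared in an item's `schema.jsonLd` blocks. The field names come from the list above; the helper names `iter_json_ld_nodes` and `summarize_item` are hypothetical, and the exact shape of each JSON-LD block (single object, list, or `@graph` wrapper) is an assumption based on common JSON-LD usage.

```python
def iter_json_ld_nodes(block):
    """Yield every dict node in a JSON-LD block, descending into lists and @graph."""
    if isinstance(block, list):
        for entry in block:
            yield from iter_json_ld_nodes(entry)
    elif isinstance(block, dict):
        yield block
        for entry in block.get("@graph", []):
            yield from iter_json_ld_nodes(entry)


def summarize_item(item):
    """Return the loaded URL, status code, and distinct @type values of one item."""
    schema = item.get("schema") or {}
    types = []
    for block in schema.get("jsonLd") or []:
        for node in iter_json_ld_nodes(block):
            declared = node.get("@type", [])
            # @type may be a single string or a list of strings.
            names = [declared] if isinstance(declared, str) else declared
            for name in names:
                if name not in types:
                    types.append(name)
    return {
        "url": item.get("loadedUrl"),
        "status": item.get("statusCode"),
        "types": types,
    }
```

A summary like this can help check quickly whether a page exposes the expected types (for example `Product`) before passing the full payload to a validator.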

Sample INPUT.json

{
  "url": [
    {
      "url": "https://schema.dev/blog/schema-markup-builder-video-walkthroughs/"
    },
    {
      "url": "https://schema.dev/blog/schema-seo-boost-your-websites-visibility-with-structured-data/"
    },
    {
      "url": "https://schema.dev/blog/schema-tests-unleashing-the-full-potential-of-your-seo-strategy/"
    },
    {
      "url": "https://schema.dev/blog/understanding-product-schema-a-key-to-better-product-visibility-online/"
    },
    {
      "url": "https://schema.dev/blog/5-types-of-schema-markup-every-legal-service-should-use-for-seo/"
    }
  ]
}

You might also like

Structured Data Extractor

automation-lab/structured-data-extractor

This actor extracts structured data markup from web pages. It parses all three major formats: JSON-LD (`<script type="application/ld+json">`), Microdata (`itemscope`/`itemprop`), and RDFa (`typeof`/`property`). For each page, it returns the full structured data objects, detected Schema.org...

๐Ÿ‘ User avatar

Stas Persiianenko

16

Structured Data Scraper & Validator

taroyamada/structured-data-validator

Crawl websites to extract JSON-LD and Microdata, validate schema markup syntax, and flag missing fields across massive URL lists.

JSON-LD Schema & Meta Tag Extractor

logiover/json-ld-schema-meta-tag-extractor

Bulk JSON-LD structured data scraper and meta tag extractor for any URL. Export Schema.org, OpenGraph and Twitter Cards to CSV/JSON. No API.