URL: https://apify.com/scrappy_garden/structured-data-extractor

Structured Data Extractor - JSON-LD, Microdata & RDFa · Apify


๐Ÿ‘ Structured Data Extractor - JSON-LD, Microdata & RDFa avatar

Structured Data Extractor - JSON-LD, Microdata & RDFa

Pricing

$4.99/month + usage

Extract and validate structured data from any web page for SEO. Parses JSON-LD, detects Microdata and RDFa, highlights schema.org types, and reports common markup issues.

Rating

0.0

(0)

Developer

๐Ÿ‘ Bikram Adhikari

Bikram Adhikari

Maintained by Community

Actor stats

0

Bookmarked

4

Total users

0

Monthly active users

5 months ago

Last modified

Structured Data Extractor (JSON-LD, Microdata & RDFa)

Extract structured data from any web page for SEO audits.

This Actor:

  • Extracts JSON-LD blocks (<script type="application/ld+json">)
  • Detects & extracts Microdata (itemscope, itemtype, itemprop)
  • Detects & extracts basic RDFa (property, typeof, about, resource, vocab)
  • Highlights detected schema.org types and reports common issues (missing @type, non-schema.org @context, parse errors)
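
The JSON-LD part of the list above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Actor's actual implementation: the names `JsonLdCollector` and `audit_json_ld` are hypothetical, the report keys merely borrow the field names listed under Outputs, and `@graph` containers are not unpacked.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collects the raw text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._in_jsonld = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").strip().lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def audit_json_ld(html):
    """Parse each JSON-LD block and report the three issues named above (hypothetical helper)."""
    collector = JsonLdCollector()
    collector.feed(html)
    collector.close()
    report = {"jsonLdCount": len(collector.blocks), "schemaTypes": [], "warnings": [], "errors": []}
    for index, raw in enumerate(collector.blocks):
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as exc:
            report["errors"].append(f"block {index}: parse error ({exc.msg})")
            continue
        # A block may hold a single object or a list of objects.
        nodes = data if isinstance(data, list) else [data]
        for node in nodes:
            if not isinstance(node, dict):
                continue
            context = node.get("@context")
            if not (isinstance(context, str) and "schema.org" in context):
                report["warnings"].append(f"block {index}: non-schema.org @context")
            node_type = node.get("@type")
            if node_type is None:
                report["warnings"].append(f"block {index}: missing @type")
                continue
            for type_name in node_type if isinstance(node_type, list) else [node_type]:
                if type_name not in report["schemaTypes"]:
                    report["schemaTypes"].append(type_name)
    return report
```

Calling `audit_json_ld(page_html)` on a fetched page returns a dictionary that separates hard failures (`errors`) from softer markup issues (`warnings`), which mirrors the split the Actor describes in its output.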

Input

  • Start URLs: pages to analyze.
  • Follow internal links (optional): crawl additional pages for site-wide audits.
  • Extraction toggles for JSON-LD / Microdata / RDFa.

Output

  • Dataset: one item per analyzed page (counts, detected types, warnings/errors).
  • Key-Value Store:
    • SUMMARY: run summary + top schema.org types
    • REPORT: compact per-page report

Example API call (input JSON)

{
  "startUrls": [{"url": "https://example.com"}, {"url": "https://json-ld.org/"}],
  "maxPages": 10,
  "followLinks": false,
  "validateSchemaOrg": true
}
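
The input above can be sent to the Actor through the Apify REST API. The sketch below only builds the request; it assumes the Actor ID is the store path written in the API's `username~actor-name` form and uses the API's `run-sync-get-dataset-items` endpoint. The helper name `build_run_request` and the `YOUR_APIFY_TOKEN` placeholder are illustrative.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
# Assumption: the store path "scrappy_garden/structured-data-extractor" in "username~actor-name" form.
ACTOR_ID = "scrappy_garden~structured-data-extractor"


def build_run_request(token, run_input):
    """Build (but do not send) a POST request that runs the Actor and returns its dataset items."""
    query = urllib.parse.urlencode({"token": token})
    url = f"{API_BASE}/acts/{ACTOR_ID}/run-sync-get-dataset-items?{query}"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


run_input = {
    "startUrls": [{"url": "https://example.com"}, {"url": "https://json-ld.org/"}],
    "maxPages": 10,
    "followLinks": False,
    "validateSchemaOrg": True,
}
request = build_run_request("YOUR_APIFY_TOKEN", run_input)

# To actually start a run (requires a valid token and network access):
# with urllib.request.urlopen(request) as response:
#     items = json.load(response)
```

Keeping request construction separate from sending makes it easy to inspect the URL and body before spending any usage credits.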

Quick start

Store page: https://apify.com/scrappy_garden/structured-data-extractor

Paste this into Input and click Run:

{
  "startUrls": [
    {
      "url": "https://example.com/"
    }
  ],
  "proxyConfiguration": {
    "useApifyProxy": false
  }
}

Outputs (what you get)

  • Dataset: each item typically includes the fields url, statusCode, title, jsonLdCount, microdataItemCount, rdfaStatementCount, schemaTypes, warnings, errors, and extractedAt.
  • Key-value store: the REPORT and SUMMARY records.
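
For automation, the dataset items can be reduced to a short audit summary. The sketch below assumes that `schemaTypes`, `warnings`, and `errors` are lists (the listing names the fields but not their types); the helper `summarize_items` and the sample items are invented for illustration and are not real Actor output.

```python
from collections import Counter


def summarize_items(items):
    """Count schema.org types across pages and list pages that reported any issue."""
    type_counts = Counter()
    problem_pages = []
    for item in items:
        # Assumption: schemaTypes is a list of type names; tolerate missing or null values.
        type_counts.update(item.get("schemaTypes") or [])
        if item.get("errors") or item.get("warnings"):
            problem_pages.append(item.get("url"))
    return {
        "pages": len(items),
        "topTypes": type_counts.most_common(5),
        "problemPages": problem_pages,
    }


# Invented sample items using the field names listed above.
sample_items = [
    {"url": "https://example.com/a", "schemaTypes": ["Article", "Organization"], "warnings": [], "errors": []},
    {"url": "https://example.com/b", "schemaTypes": ["Article"], "warnings": [], "errors": ["parse error"]},
]
summary = summarize_items(sample_items)
```

The same function works on items downloaded from the dataset API or exported as JSON from the Apify Console.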

Tips (trust + predictable results)

  • Start with 1–3 URLs to validate behavior, then scale up.
  • If a target blocks requests, enable Proxy and/or slow down concurrency in Input.
  • Use the SUMMARY / REPORT keys (when present) for automation pipelines and monitoring.

Search keywords

structured data extractor, structured data extractor - json-ld, microdata & rdfa, website audit, seo

You might also like

Structured Data Extractor

automation-lab/structured-data-extractor

This actor extracts structured data markup from web pages. It parses all three major formats: JSON-LD (`<script type="application/ld+json">`), Microdata (`itemscope`/`itemprop`), and RDFa (`typeof`/`property`). For each page, it returns the full structured data objects, detected Schema.org...

๐Ÿ‘ User avatar

Stas Persiianenko

15

Structured Data Validator (JSON-LD / OG)

jungle_synthesizer/structured-data-validator-pro

Extract and validate structured data from any URL: JSON-LD, Open Graph, Twitter Cards, microdata, RDFa, meta tags. Local schema.org validation. Flags Google rich-result eligibility and AI-discovery readiness. Pure HTTP. Built for SEO audits and structured-data debugging at scale.

๐Ÿ‘ User avatar

BowTiedRaccoon

3

Schema Markup Scraper & SEO Auditor

autofacts/metadata-scraper

Extract JSON-LD, Microdata, RDFa, Open Graph & Twitter Cards. Runs a 0-100 SEO audit: checks canonical, hreflang, headings, image alt, EEAT author signals. Detects 80+ schema.org types including LocalBusiness with NAP, geo coordinates, and Google Place IDs.

142

5.0

Structured Data Scraper & Validator

taroyamada/structured-data-validator

Crawl websites to extract JSON-LD and Microdata, validate schema markup syntax, and flag missing fields across massive URL lists.

Structured Data Scraper (Schema.org)

datavault/schemaorg

Fast, lightweight scraper that extracts structured data (JSON-LD & microdata) from HTML pages. Ideal for e-commerce and sites that embed schema.org markup without heavy client-side rendering.