Pricing
Pay per event
Structured Data Validator (JSON-LD / OG)
Extract and validate structured data from any URL: JSON-LD, Open Graph, Twitter Cards, microdata, RDFa, meta tags. Local schema.org validation. Flags Google rich-result eligibility and AI-discovery readiness. Pure HTTP. Built for SEO audits and structured-data debugging at scale.
Structured Data Validator Pro: JSON-LD, Open Graph & Schema Markup Testing Tool
A structured data testing tool and schema markup validator that extracts and validates structured data from any URL (JSON-LD, Open Graph, Twitter Cards, microdata, RDFa, and meta tags) in one pass. Run a rich results check and schema.org validation across a whole sitemap, not one URL at a time. Includes local schema.org validation, a Google rich-result eligibility check, and an AI-discovery readiness score. Pure HTTP, no browser.
Structured Data Validator & Schema Markup Checker Features
- Extracts six structured-data formats per URL: JSON-LD, Open Graph, Twitter Cards, microdata, RDFa, and meta tags.
- Validates JSON-LD blocks against a bundled schema.org rule set with required-field gates per type (Article, Recipe, Product, Event, FAQPage, HowTo, VideoObject).
- Flags Google rich-result eligibility: true when any block satisfies the relevant rich-result requirement set.
- Scores AI-discovery readiness on a 0-100 scale, weighted toward the signals LLM crawlers actually use.
- Detects and lists every schema.org @type found across all formats.
- Optional raw-HTML dump to the key-value store (KVS) for offline debugging.
- Pure HTTP fetch via CheerioCrawler: no browser, no proxy by default. The cheap default.
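The required-field gates in the list above can be pictured with a minimal Python sketch. The REQUIRED_FIELDS table and the missing_required helper are hypothetical and purely illustrative; the actor's bundled rule set is not documented here and may well differ.

```python
# Hypothetical required-field table: illustrative only, NOT the actor's bundled rules.
REQUIRED_FIELDS = {
    "Recipe": ["name", "image"],
    "Product": ["name"],
    "Event": ["name", "startDate", "location"],
    "FAQPage": ["mainEntity"],
}


def missing_required(block: dict) -> list[str]:
    """Return the required fields absent or empty in one JSON-LD block."""
    schema_type = block.get("@type")
    # Types outside the table (or non-string @type values) skip the gate.
    required = REQUIRED_FIELDS.get(schema_type, []) if isinstance(schema_type, str) else []
    return [field for field in required if block.get(field) in (None, "", [], {})]


event = {"@context": "https://schema.org", "@type": "Event", "name": "Launch"}
print(missing_required(event))  # ['startDate', 'location']
```

A type that is not in the table, such as Organization, returns an empty list, which mirrors the "parse but skip required-field validation" behaviour described under Limits.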
Who Uses Structured Data Audits?
- SEO teams: audit rich-result eligibility across a sitemap before chasing rank changes that turn out to be markup bugs.
- Content engineering: verify JSON-LD blocks ship with every article, product, or recipe page.
- AI / LLM-discovery auditors: score how well a site speaks to AI crawlers, since LLMs lean heavily on structured data.
- Migration QA: diff structured-data coverage before and after a CMS swap or template refactor.
- Competitive research: see exactly which schema.org types competitors mark up, and which ones they miss.
How Structured Data Validator Works
- Pass in a list of URLs. The actor processes at most maxItems URLs per run (default 5, hard cap 15) to stay inside the Apify tester's 5-minute timeout.
- CheerioCrawler fetches the static HTML over plain HTTP. The handler runs all six extractors in parallel.
- JSON-LD blocks are validated against the bundled schema.org rule set. Each issue is recorded with severity, path, type, and message.
- The actor flags Google rich-result eligibility and computes the AI-discovery readiness score, then emits one row per URL.
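As a rough illustration of the extraction step, the sketch below pulls JSON-LD blocks out of static HTML using only Python's standard library. It is not the actor's code: the actor uses CheerioCrawler, the JsonLdExtractor class is a hypothetical stand-in, and only one of the six formats is covered.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the parsed contents of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []  # successfully parsed JSON-LD values
        self.errors = []  # parse failures, recorded rather than raised

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so accumulate it.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError as exc:
                self.errors.append(str(exc))


html = (
    '<html><head><script type="application/ld+json">'
    '{"@context":"https://schema.org","@type":"Organization","name":"Apify"}'
    "</script></head></html>"
)
parser = JsonLdExtractor()
parser.feed(html)
parser.close()
print([block["@type"] for block in parser.blocks])  # ['Organization']
```

Because the parser only sees the HTML returned by the server, markup injected later by client-side JavaScript never reaches it, which is the same constraint noted under Limits.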
Input
{"urls":["https://schema.org/Article","https://www.apify.com"],"maxItems":5,"extractWhich":["json-ld","open-graph","twitter-cards","microdata","rdfa","meta-tags"],"validateAgainst":"schema.org","includeRawHtml":false}
| Field | Type | Default | Description |
|---|---|---|---|
| urls | array | required | URLs to extract and validate structured data from. |
| maxItems | integer | 5 | Hard cap on URLs per run. Range 1-15. |
| extractWhich | array | all six | Formats to extract: json-ld, open-graph, twitter-cards, microdata, rdfa, meta-tags. |
| validateAgainst | enum | schema.org | Validation rule set. schema.org runs the bundled gates; none skips validation. |
| includeRawHtml | boolean | false | Save the fetched HTML to KVS and link via rawHtmlKvsKey on each row. |
| proxyConfiguration | object | none | Optional. Default is no proxy. |
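A small client-side check can catch out-of-range input before a run is started. The build_input helper below is a hypothetical sketch based only on the field table above; whether the actor itself rejects, clamps, or ignores bad values is not stated here, and treating an empty urls list as an error is an assumption.

```python
ALLOWED_FORMATS = ["json-ld", "open-graph", "twitter-cards", "microdata", "rdfa", "meta-tags"]


def build_input(urls, max_items=5, extract_which=None,
                validate_against="schema.org", include_raw_html=False):
    """Build a run-input dict, rejecting values outside the documented ranges."""
    if not urls:
        # Assumption: an empty list is treated the same as a missing required field.
        raise ValueError("urls is required and must not be empty")
    if not 1 <= max_items <= 15:
        raise ValueError("maxItems must be between 1 and 15")
    formats = list(ALLOWED_FORMATS) if extract_which is None else list(extract_which)
    unknown = [name for name in formats if name not in ALLOWED_FORMATS]
    if unknown:
        raise ValueError(f"unknown extractWhich values: {unknown}")
    if validate_against not in ("schema.org", "none"):
        raise ValueError("validateAgainst must be 'schema.org' or 'none'")
    return {
        "urls": list(urls),
        "maxItems": max_items,
        "extractWhich": formats,
        "validateAgainst": validate_against,
        "includeRawHtml": include_raw_html,
    }


run_input = build_input(["https://www.apify.com"], extract_which=["json-ld", "open-graph"])
print(run_input["maxItems"], run_input["extractWhich"])  # 5 ['json-ld', 'open-graph']
```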
Structured Data Validator Output Fields
{"url":"https://www.apify.com","finalUrl":"https://www.apify.com/","jsonLd":["{\"@context\":\"https://schema.org\",\"@type\":\"Organization\",\"name\":\"Apify\"}"],"openGraph":{"og:title":"Apify - The Web Scraping Platform","og:type":"website","og:url":"https://apify.com/","og:image":"https://apify.com/img/social.png"},"twitterCard":{"twitter:card":"summary_large_image"},"microdata":[],"rdfa":[],"metaTags":{"viewport":"width=device-width, initial-scale=1","robots":"index, follow"},"validationErrors":[],"schemaTypes":["Organization"],"googleRichResultEligible":false,"aiDiscoveryReadiness":{"hasJsonLd":true,"hasArticleSchema":false,"hasFAQ":false,"hasHowTo":false,"hasOpenGraph":true,"score":60},"rawHtmlKvsKey":"","status":"success","errorMsg":"","extractedAt":"2026-04-30T12:00:00Z"}
| Field | Type | Description |
|---|---|---|
| url | string | Audited URL. |
| finalUrl | string | URL after redirects. |
| jsonLd | array | Parsed JSON-LD blocks as JSON-stringified objects (CSV/Excel safe). |
| openGraph | object | All og:* meta tags flattened into a single object. |
| twitterCard | object | All twitter:* meta tags flattened into a single object. |
| microdata | array | itemscope/itemtype blocks as JSON-stringified objects. |
| rdfa | array | property/typeof/resource blocks as JSON-stringified objects. |
| metaTags | object | All <meta name> and <meta http-equiv> tags as a flat object. |
| validationErrors | array | Issues formatted as <severity> [<path>] (<type>) <message>. |
| schemaTypes | array | Detected schema.org types (e.g. Article, Recipe, Product). |
| googleRichResultEligible | boolean | True when any block satisfies a Google rich-result requirement set. |
| aiDiscoveryReadiness | object | {hasJsonLd, hasArticleSchema, hasFAQ, hasHowTo, hasOpenGraph, score 0-100}. |
| rawHtmlKvsKey | string | KVS key for raw HTML when includeRawHtml=true (else empty). |
| status | string | success, not_found, or error. |
| errorMsg | string | Error message on failure (empty on success). |
| extractedAt | string | ISO timestamp. |
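Since jsonLd entries arrive as JSON strings and validationErrors as formatted text, downstream code usually needs to decode both. The sketch below shows one way to do that, assuming the <severity> [<path>] (<type>) <message> layout from the table; the sample error string and its path syntax are made up for illustration and are not real actor output.

```python
import json
import re

# Mirrors "<severity> [<path>] (<type>) <message>"; lazy groups tolerate
# brackets inside the path, e.g. "offers[0].price".
ISSUE_RE = re.compile(
    r"^(?P<severity>error|warn|info) \[(?P<path>.*?)\] \((?P<type>.*?)\) (?P<message>.*)$"
)


def parse_issue(line):
    """Split one validationErrors string into its four parts, or return None."""
    match = ISSUE_RE.match(line)
    return match.groupdict() if match else None


def decode_row(row):
    """Return (decoded JSON-LD blocks, parsed issues) for one dataset row."""
    blocks = [json.loads(text) for text in row.get("jsonLd", [])]
    issues = [parse_issue(text) for text in row.get("validationErrors", [])]
    return blocks, [issue for issue in issues if issue is not None]


row = {
    "jsonLd": ['{"@context":"https://schema.org","@type":"Product","name":"Widget"}'],
    # Hypothetical sample, written only to match the documented layout.
    "validationErrors": ["error [offers[0].price] (Product) missing required field"],
}
blocks, issues = decode_row(row)
errors_only = [issue for issue in issues if issue["severity"] == "error"]
print(blocks[0]["@type"], errors_only[0]["path"])  # Product offers[0].price
```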
Pricing
Token charge: functionally free. Apify rejects truly $0 pay-per-event (PPE) events, so the per-record price is the smallest practical floor.
| Event | Price |
|---|---|
| Actor start | $0.10 |
| Per record | $0.0001 |
| Volume | Cost |
|---|---|
| 100 records | $0.11 |
| 1,000 records | $0.20 |
| 10,000 records | $1.10 |
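The volume figures follow directly from the two event prices. A minimal sketch of the arithmetic is shown below; the estimate_cost helper and its starts parameter are hypothetical, and the table rows correspond to a single actor start. If a volume has to be split across several runs, the assumption here is that each run adds one more start charge.

```python
from decimal import Decimal

ACTOR_START = Decimal("0.10")   # price per actor start, from the event table
PER_RECORD = Decimal("0.0001")  # price per record, from the event table


def estimate_cost(records: int, starts: int = 1) -> Decimal:
    """Estimated charge in USD: start events plus per-record events."""
    return ACTOR_START * starts + PER_RECORD * records


for volume in (100, 1_000, 10_000):
    print(volume, estimate_cost(volume))
```

Decimal is used instead of float so that prices such as 0.0001 add up exactly.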
This actor is the cheap discovery primitive that pairs with paid downstream actors. Audit liberally.
Limits
- maxItems caps at 15 per run (the default is 5), sized for the Apify tester's 5-minute timeout.
- The schema.org validator covers the common Google-rich-result types (Article, Recipe, Product, Event, FAQPage, HowTo, VideoObject). Other types parse but skip required-field validation.
- The actor uses HTTP fetch only. Sites that require JS rendering for structured data won't surface anything; pair with a render crawler upstream.
- includeRawHtml=true writes one KVS entry per URL. KVS quotas apply.
- Validation severity is internal: validationErrors strings start with error, warn, or info for downstream filtering.
FAQ
How do I test JSON-LD and schema markup across a whole site instead of one URL?
Feed the actor a URL list (or pipe in a sitemap from Sitemap Walker Pro). It runs the same rich results check and schema.org validation on every URL in one run, so you audit the whole site in a single pass rather than pasting URLs into a single-page validator one at a time.
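For URL lists longer than the per-run cap described under Limits, one option is to split the list into batches before submitting it. The batches helper below is a hypothetical sketch and assumes the 15-URL cap applies to each run; how each batch is actually submitted is left out.

```python
def batches(urls, size=15):
    """Yield consecutive slices of urls, each at most size items long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


# Placeholder URLs for illustration only.
urls = [f"https://example.com/page-{n}" for n in range(40)]
print([len(batch) for batch in batches(urls)])  # [15, 15, 10]
```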
Related Actors
- Sitemap Walker Pro: feed discovered URLs straight into this validator for site-wide structured-data audits.
- SSL & Security Headers Checker: pair for full SEO + security audits per URL.
- Angular SSR State Extractor: for sites where the structured data lives inside Angular's TransferState payload.
Need More Features?
Need additional schema.org types, custom validation rules, or a render-crawler variant? File an issue or get in touch.
Why Use Structured Data Validator Pro?
- Functionally free: $0.0001 per record. Audit your whole sitemap and barely move the needle.
- Six formats, one pass: JSON-LD, Open Graph, Twitter Cards, microdata, RDFa, and meta tags in a single dataset row. Most tools cover one, maybe two.
- AI-discovery score baked in: rich-result eligibility plus an LLM-readiness score, so you know how the site reads to both Google and Claude.
Built by OrbTop.
