URL: https://apify.com/automation-lab/schema-markup-validator

Schema Markup Validator for JSON-LD and SEO Audits · Apify


Pricing

Pay per event

Schema Markup Validator

Validate JSON-LD, Microdata, RDFa, Open Graph, and Twitter Cards across public pages and sitemaps for bulk structured-data SEO QA.

Rating

0.0

(0)

Developer

๐Ÿ‘ Stas Persiianenko

Stas Persiianenko

Maintained by Community

Actor stats

0

Bookmarked

2

Total users

1

Monthly active users

4 days ago

Last modified

Bulk validate structured data, schema.org markup, JSON-LD, Microdata, RDFa, Open Graph, and Twitter Cards from public web pages.

Use this actor when you need repeatable SEO QA at scale: crawl a list of URLs, expand an XML sitemap, detect schema types, parse markup, and export page-level warnings before a release, migration, or content audit.

What does Schema Markup Validator do?

Schema Markup Validator fetches public HTML pages and inspects the markup that search engines and social platforms use.

It extracts and validates:

  • ✅ JSON-LD blocks from application/ld+json scripts
  • ✅ Microdata entities from itemscope and itemprop
  • ✅ RDFa entities from typeof and property
  • ✅ schema.org type names across all detected formats
  • ✅ Open Graph meta tags such as og:title and og:description
  • ✅ Twitter Card tags such as twitter:card
  • ✅ JSON parse errors and missing context/type warnings
  • ✅ Local rich-result readiness hints for common schema types

The output is one dataset row per URL, which makes it easy to export to CSV, JSON, Google Sheets, BI tools, or automated QA pipelines.

Who is it for?

This actor is built for teams that manage SEO-critical websites and need consistent structured-data checks.

  • ๐Ÿ” Technical SEO agencies auditing client templates
  • ๐Ÿ“ฐ Publishers validating Article and NewsArticle pages
  • ๐Ÿ›’ Ecommerce teams checking Product schema before launches
  • ๐Ÿข Local SEO teams checking LocalBusiness and Organization markup
  • ๐Ÿง‘โ€๐Ÿ’ป Developers adding schema.org markup to templates
  • ๐Ÿ“ˆ Growth teams monitoring regression after CMS changes
  • ๐Ÿงช QA teams adding SEO checks to release workflows

Why use it?

Manual validators are useful for one page, but they are slow for dozens or thousands of pages.

Schema Markup Validator is designed for repeatable bulk audits:

  • Run the same validation before every release
  • Compare schema coverage across page templates
  • Spot invalid JSON-LD in large URL lists
  • Export issues to spreadsheets or ticket systems
  • Monitor important pages after CMS or theme changes
  • Validate social-card metadata alongside schema markup

Data you can extract

Field               Description
url                 Input page URL
finalUrl            Final URL after redirects
statusCode          HTTP response status
pageTitle           HTML title text
canonicalUrl        Canonical link URL when present
schemaTypes         Detected schema.org types
jsonLdCount         Number of JSON-LD blocks
microdataCount      Number of Microdata entities
rdfaCount           Number of RDFa entities
openGraphCount      Number of Open Graph tags
twitterCardCount    Number of Twitter Card tags
errors              Blocking validation errors
warnings            Non-blocking quality warnings
richResultHints     Local required/recommended field hints
jsonLd              Parsed JSON-LD blocks
microdata           Extracted Microdata entities
rdfa                Extracted RDFa entities
openGraph           Open Graph metadata
twitterCard         Twitter Card metadata
rawMarkup           Optional raw snippets for debugging
fetchedAt           Validation timestamp

How much does it cost to validate schema markup?

This actor uses pay-per-event pricing.

You pay a small run-start fee and then a per-page validation fee for each dataset row produced.

The exact live prices are shown on the Apify Store pricing tab. The actor is designed as an HTTP-first tool, so it avoids browser automation by default and keeps validation runs inexpensive.

Cost-control tips:

  • Start with 10-25 representative URLs
  • Use maxPages when testing sitemaps
  • Disable raw markup output for smaller exports
  • Crawl links only when you need discovery
  • Use sitemaps for controlled bulk validation
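The pricing model described above reduces to simple arithmetic: one run-start fee plus a per-page fee for each dataset row. A minimal sketch of that estimate follows. The function name and both prices are placeholders chosen for illustration; the real prices are the ones shown on the Apify Store pricing tab.

```python
def estimate_run_cost(pages: int, start_fee: float, per_page_fee: float) -> float:
    """Estimate one run's cost: a single start fee plus a fee per dataset row."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return start_fee + pages * per_page_fee


# Placeholder prices for illustration only; these are NOT the actor's real prices.
EXAMPLE_START_FEE = 0.01
EXAMPLE_PER_PAGE_FEE = 0.002

# Estimate a 25-URL test run before scaling up to a full sitemap.
print(round(estimate_run_cost(25, EXAMPLE_START_FEE, EXAMPLE_PER_PAGE_FEE), 4))
```

Running the estimate with your intended maxPages value before a large sitemap audit gives a rough upper bound, since the actor produces one row per validated page.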

How to use Schema Markup Validator

  1. Add page URLs to startUrls.
  2. Optionally add XML sitemap URLs to sitemapUrls.
  3. Set maxPages to the number of pages you want to validate.
  4. Keep crawlLinks disabled unless you want link discovery.
  5. Run the actor.
  6. Open the dataset table.
  7. Filter rows with errors or warnings.
  8. Export the dataset to CSV, JSON, XLSX, or your integration target.

Input example

{
    "startUrls": [
        {"url": "https://schema.org/Article"},
        {"url": "https://schema.org/Product"}
    ],
    "maxPages": 25,
    "includeRawMarkup": false,
    "validateRichResultHints": true
}

Sitemap input example

{
    "startUrls": [
        {"url": "https://example.com/"}
    ],
    "sitemapUrls": [
        {"url": "https://example.com/sitemap.xml"}
    ],
    "maxPages": 100,
    "crawlLinks": false
}
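To illustrate what expanding a sitemap under a maxPages cap can look like, here is a minimal sketch using only the Python standard library. It is not the actor's implementation: the function name is hypothetical, the sample sitemap is made up, and sitemap index files (sitemaps that point to other sitemaps) are not handled.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_urls(xml_text: str, max_pages: int) -> list[str]:
    """Return at most max_pages <loc> URLs from a sitemap document."""
    root = ET.fromstring(xml_text)
    urls: list[str] = []
    for loc in root.iter(f"{SITEMAP_NS}loc"):
        if len(urls) >= max_pages:
            break  # stop as soon as the cap is reached
        if loc.text and loc.text.strip():
            urls.append(loc.text.strip())
    return urls


# Made-up sitemap used only to exercise the function.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/products</loc></url>
  <url><loc>https://example.com/blog</loc></url>
</urlset>"""

print(sitemap_urls(SAMPLE, max_pages=2))
```

Capping the list while iterating, rather than after collecting everything, keeps a test run small even when the sitemap lists thousands of URLs.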

Output example

{
    "url": "https://schema.org/Article",
    "finalUrl": "https://schema.org/Article",
    "statusCode": 200,
    "pageTitle": "Article - Schema.org Type",
    "canonicalUrl": "https://schema.org/Article",
    "schemaTypes": ["Article"],
    "jsonLdCount": 1,
    "microdataCount": 0,
    "rdfaCount": 0,
    "openGraphCount": 3,
    "twitterCardCount": 2,
    "errors": [],
    "warnings": [],
    "richResultHints": [
        {
            "type": "Article",
            "eligible": false,
            "missingRequired": ["headline", "image"],
            "missingRecommended": ["publisher"]
        }
    ]
}
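Once rows like the one above are exported, a short script can pick out the pages that need attention. The sketch below assumes only the field names shown in the output example; the function name and the second sample row are hypothetical, and the shape of individual entries inside errors and warnings is not documented here, so they are only counted.

```python
def rows_needing_attention(items: list[dict]) -> list[dict]:
    """Keep rows that have errors, warnings, or missing required hint fields."""
    flagged = []
    for row in items:
        # Flatten missingRequired across every rich-result hint on the page.
        missing = [
            field
            for hint in row.get("richResultHints", [])
            for field in hint.get("missingRequired", [])
        ]
        if row.get("errors") or row.get("warnings") or missing:
            flagged.append({
                "url": row.get("url"),
                "errors": len(row.get("errors", [])),
                "warnings": len(row.get("warnings", [])),
                "missingRequired": missing,
            })
    return flagged


items = [
    {"url": "https://schema.org/Article", "errors": [], "warnings": [],
     "richResultHints": [{"type": "Article", "eligible": False,
                          "missingRequired": ["headline", "image"],
                          "missingRecommended": ["publisher"]}]},
    # Hypothetical clean row, included to show that it is filtered out.
    {"url": "https://example.com/ok", "errors": [], "warnings": [],
     "richResultHints": []},
]

print(rows_needing_attention(items))
```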

JSON-LD validation

The actor parses every application/ld+json script block independently.

It reports:

  • invalid JSON syntax
  • missing @context
  • missing @type or @graph
  • detected schema.org types
  • optional raw block text

This helps teams find broken template snippets without waiting for a search-engine recrawl.
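The checks listed above can be approximated locally with the Python standard library. This is a rough sketch under stated assumptions, not the actor's code: the class and function names, the report shape, and the warning strings are all invented for illustration.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collect the raw text of each <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._in_jsonld = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        type_attr = (dict(attrs).get("type") or "").strip().lower()
        if tag == "script" and type_attr == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def check_jsonld(html: str) -> list[dict]:
    """Parse every JSON-LD block independently and report problems per block."""
    collector = JsonLdCollector()
    collector.feed(html)
    reports = []
    for raw in collector.blocks:
        report = {"errors": [], "warnings": [], "types": []}
        reports.append(report)
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as exc:
            report["errors"].append(f"invalid JSON: {exc.msg}")
            continue  # one broken block must not hide the others
        for node in data if isinstance(data, list) else [data]:
            if not isinstance(node, dict):
                continue
            if "@context" not in node:
                report["warnings"].append("missing @context")
            if "@type" not in node and "@graph" not in node:
                report["warnings"].append("missing @type or @graph")
            graph = node.get("@graph", [])
            for item in [node, *(graph if isinstance(graph, list) else [graph])]:
                declared = item.get("@type") if isinstance(item, dict) else None
                if isinstance(declared, str):
                    report["types"].append(declared)
                elif isinstance(declared, list):
                    report["types"].extend(t for t in declared if isinstance(t, str))
    return reports


# Made-up page: one valid block, one with a trailing comma, one with no type.
HTML = '''<html><head>
<script type="application/ld+json">{"@context": "https://schema.org", "@type": "Article", "headline": "Hi"}</script>
<script type="application/ld+json">{"@type": "Product",}</script>
<script type="application/ld+json">{"name": "No type"}</script>
</head></html>'''

for block_report in check_jsonld(HTML):
    print(block_report)
```

Reporting per block rather than per page is what makes a single broken template snippet easy to locate.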

Microdata validation

The actor extracts Microdata from elements with itemscope, itemtype, itemid, and itemprop.

Each entity includes the item type, optional ID, and detected properties.

This is useful for older templates, ecommerce themes, and CMS plugins that still generate Microdata instead of JSON-LD.
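As a rough illustration of Microdata extraction, the sketch below walks the HTML with the standard-library parser and records each itemscope entity with its type, ID, and property names. It is a simplified approximation rather than the actor's implementation: names are hypothetical, property values and itemref are ignored, and nested entities are returned as a flat list.

```python
from html.parser import HTMLParser

# Elements without a closing tag; they must never be pushed on the open-element stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class MicrodataCollector(HTMLParser):
    """Record every itemscope entity and the itemprop names that belong to it."""

    def __init__(self):
        super().__init__()
        self.entities = []
        self._stack = []  # (tag, entity opened by that element or None)

    def _current_entity(self):
        for _tag, entity in reversed(self._stack):
            if entity is not None:
                return entity
        return None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        # An itemprop belongs to the nearest enclosing itemscope, even when the
        # same element also opens a nested entity of its own.
        owner = self._current_entity()
        if "itemprop" in attr and owner is not None:
            owner["properties"].extend((attr["itemprop"] or "").split())
        opened = None
        if "itemscope" in attr:
            opened = {"type": attr.get("itemtype"), "id": attr.get("itemid"),
                      "properties": []}
            self.entities.append(opened)
        if tag not in VOID_TAGS:
            self._stack.append((tag, opened))

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        # Pop back to the matching open tag, tolerating unclosed elements.
        for index in range(len(self._stack) - 1, -1, -1):
            if self._stack[index][0] == tag:
                del self._stack[index:]
                break


def extract_microdata(html: str) -> list[dict]:
    collector = MicrodataCollector()
    collector.feed(html)
    return collector.entities


# Made-up product snippet with a nested Offer.
HTML = '''<div itemscope itemtype="https://schema.org/Product" itemid="#p1">
  <span itemprop="name">Widget</span>
  <meta itemprop="sku" content="W-1">
  <div itemprop="offers" itemscope itemtype="https://schema.org/Offer">
    <span itemprop="price">9.99</span>
  </div>
  <span itemprop="brand">Acme</span>
</div>'''

for entity in extract_microdata(HTML):
    print(entity)
```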

RDFa validation

The actor extracts RDFa-like entities from elements with typeof, property, resource, and about attributes.

RDFa is less common than JSON-LD, but many older sites and semantic templates still use it.

Open Graph checks

Open Graph tags control link previews on platforms such as Facebook, LinkedIn, Slack, and many messaging apps.

The actor extracts all og:* properties and warns when common core fields are missing.

Common tags include:

  • og:title
  • og:description
  • og:image
  • og:url
  • og:type

Twitter Card checks

Twitter Card metadata controls previews on X/Twitter and other tools that read twitter:* tags.

The actor extracts all Twitter Card tags and warns when twitter:card is missing.
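The Open Graph and Twitter Card checks described above can be sketched with the standard-library HTML parser. The names and warning strings below are hypothetical, and treating the five common og:* tags as the "core" set is an assumption; the exact set the actor warns about is not specified here.

```python
from html.parser import HTMLParser

# Assumption: the five common tags listed above are treated as the core set.
CORE_OG_TAGS = ("og:title", "og:description", "og:image", "og:url", "og:type")


class SocialMetaCollector(HTMLParser):
    """Collect og:* and twitter:* values from <meta> tags."""

    def __init__(self):
        super().__init__()
        self.open_graph = {}
        self.twitter = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Open Graph uses property=, Twitter Cards usually use name=; accept both.
        key = (attr.get("property") or attr.get("name") or "").strip().lower()
        content = attr.get("content") or ""
        # Repeated tags (for example several og:image) keep only the last value here.
        if key.startswith("og:"):
            self.open_graph[key] = content
        elif key.startswith("twitter:"):
            self.twitter[key] = content


def check_social_meta(html: str) -> dict:
    collector = SocialMetaCollector()
    collector.feed(html)
    warnings = [f"missing {tag}" for tag in CORE_OG_TAGS
                if tag not in collector.open_graph]
    if "twitter:card" not in collector.twitter:
        warnings.append("missing twitter:card")
    return {"openGraph": collector.open_graph,
            "twitterCard": collector.twitter,
            "warnings": warnings}


# Made-up page head with an incomplete set of tags.
HTML = '''<head>
<meta property="og:title" content="Example">
<meta property="og:image" content="https://example.com/a.png" />
<meta name="twitter:title" content="Example">
</head>'''

print(check_social_meta(HTML))
```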

Rich-result hints

The actor includes deterministic local hints for common schema types.

Supported hint families include:

  • Article
  • NewsArticle
  • BlogPosting
  • Product
  • LocalBusiness
  • Organization
  • FAQPage
  • HowTo
  • Recipe
  • Event
  • JobPosting
  • BreadcrumbList

These hints are not a replacement for Google's official tools. They are fast local checks for common required and recommended fields.
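A deterministic local hint of this kind is essentially a lookup table of required and recommended fields per type. The sketch below shows the idea only. The rule table is hypothetical (the Article entry mirrors the output example above and the Product entry is invented), and deriving eligible from "no required field is missing" is an assumption, not documented actor behavior or Google's criteria.

```python
from typing import Optional

# Hypothetical rule table for illustration; not the actor's or Google's real rules.
HINT_RULES = {
    "Article": {"required": ["headline", "image"], "recommended": ["publisher"]},
    "Product": {"required": ["name"], "recommended": ["offers"]},
}


def rich_result_hint(entity: dict) -> Optional[dict]:
    """Return a hint for one parsed entity, or None for types without rules."""
    entity_type = entity.get("@type")
    if not isinstance(entity_type, str) or entity_type not in HINT_RULES:
        return None
    rules = HINT_RULES[entity_type]

    def missing(fields):
        # Treat absent and empty values the same way.
        return [name for name in fields if entity.get(name) in (None, "", [], {})]

    missing_required = missing(rules["required"])
    return {
        "type": entity_type,
        # Assumption: eligible means no required field is missing.
        "eligible": not missing_required,
        "missingRequired": missing_required,
        "missingRecommended": missing(rules["recommended"]),
    }


article = {"@context": "https://schema.org", "@type": "Article", "author": "Jane"}
print(rich_result_hint(article))
```

Because the rules are a plain table, the same input always produces the same hint, which is what makes the check safe to run in release QA.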

Integrations

You can connect Schema Markup Validator to many workflows:

  • Send dataset rows to Google Sheets for SEO review
  • Trigger Slack alerts when errors appear
  • Store historical validation exports in S3
  • Compare staging and production templates
  • Add structured-data checks to release QA
  • Feed warnings into Jira, Linear, or GitHub issues
  • Monitor high-value product or article pages weekly

API usage with Node.js

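import { ApifyClient } from 'apify-client';

// Read the API token from the environment rather than hard-coding it.
const client = new ApifyClient({ token: process.env.APIFY_TOKEN });

// Start the actor and wait for the run to finish.
const run = await client.actor('automation-lab/schema-markup-validator').call({
    startUrls: [{ url: 'https://schema.org/Article' }],
    maxPages: 10,
});

// Fetch the validation rows from the run's default dataset.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items);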

API usage with Python

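from apify_client import ApifyClient

# Replace the placeholder with your Apify API token.
client = ApifyClient('YOUR_APIFY_TOKEN')

# Start the actor and wait for the run to finish.
run = client.actor('automation-lab/schema-markup-validator').call(run_input={
    'startUrls': [{'url': 'https://schema.org/Article'}],
    'maxPages': 10,
})

# Read the validation rows from the run's default dataset.
items = client.dataset(run['defaultDatasetId']).list_items().items
print(items)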

API usage with cURL

curl -X POST "https://api.apify.com/v2/acts/automation-lab~schema-markup-validator/runs?token=$APIFY_TOKEN" \
  -H 'Content-Type: application/json' \
  -d '{"startUrls":[{"url":"https://schema.org/Article"}],"maxPages":10}'

MCP usage

Use the actor from Claude Desktop, Claude Code, or other MCP-compatible clients through Apify MCP Server.

MCP endpoint:

https://mcp.apify.com/?tools=automation-lab/schema-markup-validator

Claude Code setup:

$ claude mcp add apify-schema-validator --transport http "https://mcp.apify.com/?tools=automation-lab/schema-markup-validator"

Claude Desktop JSON config:

{
    "mcpServers": {
        "apify-schema-validator": {
            "url": "https://mcp.apify.com/?tools=automation-lab/schema-markup-validator"
        }
    }
}

Example prompts:

  • "Validate schema markup for these 20 product URLs and summarize missing fields."
  • "Check whether our article pages have valid JSON-LD and Open Graph tags."
  • "Audit this sitemap and give me a CSV of pages missing Twitter Cards."

Tips for best results

  • Validate representative template URLs first.
  • Use sitemaps for controlled bulk audits.
  • Keep includeRawMarkup off unless you need debugging snippets.
  • Use crawlLinks only for small site discovery runs.
  • Treat rich-result hints as local guidance, not official Google eligibility.
  • Export results and track error counts over time.

Troubleshooting

Why do I see no structured data?

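The page may not include schema markup in its server-rendered HTML, or it may generate markup only in the browser after JavaScript runs. This actor is HTTP-first for cost and reliability and avoids browser automation by default, so markup injected client-side does not appear in the HTML it fetches.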

Why does a page show rich-result warnings even with schema present?

The actor checks common required and recommended fields for popular schema types. A warning means the detected entity may be missing fields commonly expected for that rich-result family.

Why did a URL return status code 0?

Status code 0 means the request failed before a normal HTTP response was available. Check whether the site blocks automated requests, redirects unusually, or requires login.
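When triaging a large export, it can help to separate rows that never received an HTTP response from rows that returned an HTTP error. A minimal sketch, assuming only the url and statusCode fields from the output table; the function name, group names, and sample rows are hypothetical.

```python
def triage_by_status(items: list[dict]) -> dict[str, list[str]]:
    """Group row URLs by fetch outcome using the statusCode field."""
    groups: dict[str, list[str]] = {"failed": [], "httpError": [], "ok": []}
    for row in items:
        status = row.get("statusCode", 0)
        if status == 0:
            # No normal HTTP response: blocked, unusual redirect, login wall, etc.
            groups["failed"].append(row["url"])
        elif status >= 400:
            groups["httpError"].append(row["url"])
        else:
            groups["ok"].append(row["url"])
    return groups


# Made-up rows covering each outcome.
rows = [
    {"url": "https://example.com/", "statusCode": 200},
    {"url": "https://example.com/private", "statusCode": 0},
    {"url": "https://example.com/gone", "statusCode": 404},
]

print(triage_by_status(rows))
```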

Legality and ethical use

This actor validates public page markup supplied by the user. Use it only on websites you are allowed to audit and follow the target site's terms, robots policies, and applicable laws.

Do not use it to overload websites. Keep maxPages reasonable and run recurring audits at responsible intervals.

Related scrapers and SEO tools

Explore other automation-lab actors on Apify:

Changelog

  • Initial version: HTTP-first schema.org, JSON-LD, Microdata, RDFa, Open Graph, and Twitter Card validation.

Support

If you need a validation field that is not included yet, open an issue on the Apify actor page with an example URL and the expected output.
