Schema Markup & JSON-LD Scraper - Structured Data API
Pricing
from $2.00 / 1,000 urls
Extract schema markup, JSON-LD, Open Graph, Twitter Cards, and meta tags from any URL. Structured data scraper/API for SEO audits, rich result checks, schema validation, and competitor research.
Schema Markup & SEO Data Extractor
Extract JSON-LD structured data, Open Graph tags, Twitter Card metadata, and meta tags from any URL. Built for SEO auditors, developers, and data engineers who need structured page metadata at scale.
Pricing: $0.002 per URL (~$2 per 1,000 URLs)
What It Extracts
| Data Type | Examples |
|---|---|
| JSON-LD | Product, Article, BreadcrumbList, FAQPage, LocalBusiness, WebSite, Person, Organization |
| Open Graph | og:title, og:description, og:image, og:url, og:type, og:site_name |
| Twitter Card | twitter:card, twitter:title, twitter:description, twitter:image, twitter:site |
| Meta Tags | description, keywords, author, robots, viewport, canonical |
| Schema Types | Deduplicated list of all @type values found on the page |
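One way the deduplicated type list could be derived is a recursive walk over the parsed JSON-LD that records each `@type` in first-seen order. This is a minimal sketch, not the actor's actual implementation; the function name `collect_schema_types` is hypothetical, and descending into nested objects (including `@graph` arrays) is an assumption based on the sample output listing both `Product` and `Offer`.

```python
from typing import Any


def collect_schema_types(json_ld_blocks: list[Any]) -> list[str]:
    """Return every @type value found, deduplicated, in first-seen order."""
    seen: dict[str, None] = {}  # dict preserves insertion order

    def walk(node: Any) -> None:
        if isinstance(node, dict):
            raw = node.get("@type")
            # @type may be a single string or a list of strings.
            if isinstance(raw, str):
                types = [raw]
            elif isinstance(raw, list):
                types = raw
            else:
                types = []
            for t in types:
                if isinstance(t, str):
                    seen.setdefault(t, None)
            # Descend into nested objects, which also covers "@graph".
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(json_ld_blocks)
    return list(seen)


blocks = [{
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Widget Pro",
    "offers": {"@type": "Offer", "price": "29.99", "priceCurrency": "USD"},
}]
print(collect_schema_types(blocks))  # ['Product', 'Offer']
```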
Input
| Field | Type | Default | Description |
|---|---|---|---|
| urls | array | required | URLs to extract from |
| includeJsonLd | boolean | true | Parse JSON-LD script blocks |
| includeOpenGraph | boolean | true | Parse og: meta properties |
| includeTwitterCard | boolean | true | Parse twitter: meta tags |
| includeMetaTags | boolean | true | Parse all <meta name=...> tags |
| concurrency | integer | 5 | Parallel requests (1-20) |
| timeout | integer | 30 | Per-URL timeout in seconds |
| maxResults | integer | 50 | Cap on URLs processed |
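As an illustration, the input object could be assembled and sanity-checked client-side before starting a run. The field names and defaults below come from the table above; the helper `build_input` is hypothetical, and the checks for a non-empty `urls` list and positive `timeout`/`maxResults` are assumptions rather than documented behavior.

```python
from typing import Any


def build_input(
    urls: list[str],
    *,
    concurrency: int = 5,
    timeout: int = 30,
    max_results: int = 50,
    include_json_ld: bool = True,
    include_open_graph: bool = True,
    include_twitter_card: bool = True,
    include_meta_tags: bool = True,
) -> dict[str, Any]:
    """Build the actor input dict, rejecting values outside the documented ranges."""
    if not urls:
        # Assumption: an empty list is treated the same as a missing field.
        raise ValueError("urls is required and must not be empty")
    if not 1 <= concurrency <= 20:
        raise ValueError("concurrency must be between 1 and 20")
    if timeout <= 0 or max_results <= 0:
        # Assumption: non-positive values are not meaningful here.
        raise ValueError("timeout and max_results must be positive")
    return {
        "urls": list(urls),
        "includeJsonLd": include_json_ld,
        "includeOpenGraph": include_open_graph,
        "includeTwitterCard": include_twitter_card,
        "includeMetaTags": include_meta_tags,
        "concurrency": concurrency,
        "timeout": timeout,
        "maxResults": max_results,
    }


run_input = build_input(["https://example.com/product/widget"], concurrency=10)
print(run_input["concurrency"], run_input["maxResults"])  # 10 50
```

Since `maxResults` caps the number of URLs processed, a list longer than that cap would presumably be truncated, so it may be worth raising the cap explicitly for large batches.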
Output
Each URL produces one dataset record:
{
  "url": "https://example.com/product/widget",
  "jsonLd": [
    {
      "@context": "https://schema.org",
      "@type": "Product",
      "name": "Widget Pro",
      "description": "A professional widget",
      "offers": {
        "@type": "Offer",
        "price": "29.99",
        "priceCurrency": "USD"
      }
    }
  ],
  "openGraph": {
    "title": "Widget Pro - Best Widgets",
    "description": "A professional widget for professionals",
    "image": "https://example.com/widget.jpg",
    "type": "product"
  },
  "twitterCard": {
    "card": "summary_large_image",
    "title": "Widget Pro",
    "image": "https://example.com/widget-twitter.jpg"
  },
  "metaTags": [
    {"name": "description", "content": "A professional widget for professionals"},
    {"name": "keywords", "content": "widget, pro, professional"}
  ],
  "schemaTypes": ["Product", "Offer"]
}
If a URL fails to fetch or parse, the record includes an error field and empty arrays/objects for the structured data fields.
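When consuming the dataset, failed records can be separated from successful ones by checking for the `error` field. The sketch below assumes that `error` holds a non-empty value (such as a message string) on failure and is absent or empty otherwise; the exact shape of that field is not specified here, and `split_records` is a hypothetical helper name.

```python
from typing import Any

Record = dict[str, Any]


def split_records(records: list[Record]) -> tuple[list[Record], list[Record]]:
    """Partition dataset records into (successful, failed) by the error field."""
    ok: list[Record] = []
    failed: list[Record] = []
    for record in records:
        # Assumption: a truthy "error" value marks a failed fetch or parse.
        if record.get("error"):
            failed.append(record)
        else:
            ok.append(record)
    return ok, failed


records = [
    {"url": "https://example.com/product/widget", "jsonLd": [{"@type": "Product"}],
     "schemaTypes": ["Product"]},
    {"url": "https://example.com/missing", "error": "request failed",
     "jsonLd": [], "openGraph": {}, "twitterCard": {}, "metaTags": [], "schemaTypes": []},
]
ok, failed = split_records(records)
print(len(ok), len(failed))  # 1 1
```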
Use Cases
- SEO audits: verify JSON-LD is present and correct across hundreds of pages
- Competitor research: see what schema types competitors implement
- Rich result eligibility: check if pages qualify for Google rich results (Product, FAQ, Article, etc.)
- Content aggregation: extract og:image and og:title for link previews
- Schema validation: identify missing or malformed structured data before a site launch
- Crawl pipelines: feed output into downstream validators or dashboards
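For the audit-style use cases above, a simple downstream check might compare each record's `schemaTypes` against the types a page is expected to carry. This is a rough sketch with hypothetical helper names (`missing_types`, `audit`); the presence of a type only shows the markup exists, and does not by itself establish rich result eligibility.

```python
from typing import Any, Iterable


def missing_types(record: dict[str, Any], expected: Iterable[str]) -> list[str]:
    """Return the expected schema types that are absent from one record."""
    found = set(record.get("schemaTypes") or [])
    return sorted(set(expected) - found)


def audit(records: list[dict[str, Any]], expected: Iterable[str]) -> dict[str, list[str]]:
    """Map each URL to the expected types it lacks; fully covered pages are omitted."""
    expected = list(expected)
    report: dict[str, list[str]] = {}
    for record in records:
        gaps = missing_types(record, expected)
        if gaps:
            report[record["url"]] = gaps
    return report


pages = [
    {"url": "https://example.com/product/widget", "schemaTypes": ["Product", "Offer"]},
    {"url": "https://example.com/about", "schemaTypes": []},
]
print(audit(pages, ["Product", "BreadcrumbList"]))
# {'https://example.com/product/widget': ['BreadcrumbList'],
#  'https://example.com/about': ['BreadcrumbList', 'Product']}
```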
Notes
- Uses a pure HTTP client: no browser required, fast and cost-efficient
- Handles `@graph` arrays in JSON-LD (common on WordPress/Yoast sites)
- Handles both `property="twitter:..."` and `name="twitter:..."` meta tag formats
- Follows up to 10 redirects per URL
- Response body capped at 10 MB per page
- No API key required
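To illustrate the note on Twitter meta tag formats, the sketch below accepts the `twitter:` key from either the `name` or the `property` attribute, using Python's standard `html.parser`. It is not the actor's own parser; the class name `TwitterCardParser` is hypothetical, and keeping the first occurrence of a duplicated key is an assumption.

```python
from html.parser import HTMLParser
from typing import Optional


class TwitterCardParser(HTMLParser):
    """Collect twitter:* meta tags declared via name= or property=."""

    def __init__(self) -> None:
        super().__init__()
        self.card: dict[str, str] = {}

    def handle_starttag(self, tag: str, attrs: list[tuple[str, Optional[str]]]) -> None:
        if tag != "meta":
            return
        attr = dict(attrs)
        # Accept the key from either attribute; sites use both conventions.
        key = (attr.get("name") or attr.get("property") or "").lower()
        content = attr.get("content")
        if key.startswith("twitter:") and content is not None:
            # Strip the prefix so keys match the output shape ("card", "title").
            # Assumption: the first occurrence of a key wins.
            self.card.setdefault(key[len("twitter:"):], content)


html = (
    '<meta name="twitter:card" content="summary_large_image">'
    '<meta property="twitter:title" content="Widget Pro" />'
    '<meta name="description" content="ignored here">'
)
parser = TwitterCardParser()
parser.feed(html)
print(parser.card)  # {'card': 'summary_large_image', 'title': 'Widget Pro'}
```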
