JSON Schema Auto-Generator (Infer from Samples)
Provide one or more JSON samples (inline or from URLs) and get an inferred JSON Schema (Draft 7 / 2020-12) describing their shape. Bootstrap API validators, Apify input schemas, BigQuery / DuckDB schemas. Powered by genson. $0.01 per inference.
Why this exists
You hit an API that returns JSON. You want to validate downstream payloads, store them in a typed table, or auto-generate TypeScript types, but writing a JSON Schema by hand from 30 nested fields is tedious.
This actor takes one or more sample payloads and infers the schema. The result is a valid, spec-conformant JSON Schema (Draft 7 or 2020-12) you can drop into Ajv, BigQuery, OpenAPI, Apify input_schema.json, etc.
What you get
{
  "_type": "schema",
  "samples_used": 3,
  "schema_uri": "https://json-schema.org/draft/2020-12/schema",
  "title": "User",
  "inferred_schema": {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "type": "object",
    "title": "User",
    "additionalProperties": false,
    "properties": {
      "id": {"type": "integer"},
      "email": {"type": "string"},
      "tags": {"type": "array", "items": {"type": "string"}},
      "meta": {
        "type": "object",
        "properties": {
          "created": {"type": "string"},
          "verified": {"type": "boolean"}
        },
        "required": ["created"]
      }
    },
    "required": ["id", "email"]
  },
  "schema_str": "<pretty-printed schema as text>",
  "top_level_keys": ["id", "email", "tags", "meta"],
  "top_level_required": ["id", "email"]
}
The full schema is also saved as inferred_schema.json in the run's key-value store, where it is easy to download.
Quick start
Single sample
{"sample":{"id":1,"email":"test@example.com","tags":["a","b"],"meta":{"created":"2024-01-01","verified":true}}}
Merge multiple samples (recommended: better union types)
{"samples":[{"id":1,"name":"foo"},{"id":2,"name":"bar","deleted_at":"2024-01-01"},{"id":3,"name":"baz","deleted_at":null}]}
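To see why merging several samples helps, here is a simplified, stdlib-only sketch of the idea. It is not genson's actual algorithm and only handles flat objects; `json_type` and `infer_flat_schema` are hypothetical names used for illustration. A key counts as required only if every sample contains it, and a field's type becomes a union of every type seen.

```python
from typing import Any


def json_type(value: Any) -> str:
    """Map a Python value to its JSON Schema type name."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # bool must be checked before int
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    return "object"


def infer_flat_schema(samples: list[dict[str, Any]]) -> dict[str, Any]:
    """Illustrative only: union the observed types, intersect the observed keys."""
    seen_types: dict[str, set[str]] = {}
    for sample in samples:
        for key, value in sample.items():
            seen_types.setdefault(key, set()).add(json_type(value))

    # A key is required only if every sample contains it.
    required = [key for key in seen_types if all(key in s for s in samples)]

    properties = {}
    for key, seen in seen_types.items():
        names = sorted(seen)
        properties[key] = {"type": names[0] if len(names) == 1 else names}
    return {"type": "object", "properties": properties, "required": required}


samples = [
    {"id": 1, "name": "foo"},
    {"id": 2, "name": "bar", "deleted_at": "2024-01-01"},
    {"id": 3, "name": "baz", "deleted_at": None},
]
schema = infer_flat_schema(samples)
```

With the three samples above, `deleted_at` ends up optional with a string-or-null type, which a single sample could not reveal.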
Fetch samples from URLs
{"sampleUrls":["https://api.github.com/users/torvalds","https://api.github.com/users/octocat"],"schemaTitle":"GitHub User","schemaUri":"https://json-schema.org/draft/2020-12/schema"}
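If the endpoints need authentication, or you simply want control over the requests, one option is to fetch the payloads yourself and pass them as `samples`. The sketch below is a minimal stdlib example; `http_get` and `build_input` are hypothetical helpers, and it assumes `schemaTitle` can be combined with `samples` the same way it is combined with `sampleUrls` above.

```python
import json
from typing import Any, Callable
from urllib.request import urlopen


def http_get(url: str) -> bytes:
    """Default fetcher: a plain GET with a timeout."""
    with urlopen(url, timeout=10) as response:
        return response.read()


def build_input(
    urls: list[str],
    title: str,
    fetch: Callable[[str], bytes] = http_get,
) -> dict[str, Any]:
    """Download each URL, parse the body as JSON, and wrap the results as actor input."""
    samples = [json.loads(fetch(url)) for url in urls]
    # Assumption: "schemaTitle" is accepted alongside "samples".
    return {"samples": samples, "schemaTitle": title}


# Example (performs real HTTP requests when run):
# payload = build_input(
#     ["https://api.github.com/users/torvalds", "https://api.github.com/users/octocat"],
#     "GitHub User",
# )
```

Passing a custom `fetch` callable is where you would add headers, retries, or caching.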
Pricing
Pay-Per-Event: $0.01 per schema inference.
Cheap, fixed-cost. Run as many times as you want during API evolution.
Use cases
- Bootstrap Apify input_schema.json: Use it on a sample input, drop the schema into your actor's .actor/input_schema.json
- REST API validators: Generate Ajv-compatible schemas from real production responses
- BigQuery / DuckDB tables: Convert JSON Schema to DDL with a follow-up tool
- OpenAPI: Drop schemas under components.schemas for type-safe SDK generation
- Comparison: Run on prod vs staging payloads; diff the schemas to spot drift
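For the comparison use case, a small diff over two inferred schemas is often enough to spot drift. The following is a rough sketch using only the standard library; `flatten` and `diff_schemas` are hypothetical helpers, and the diff only looks at object properties and their `type`, not at `required`, array items, or other keywords.

```python
from typing import Any


def flatten(schema: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Map each property path (e.g. "meta.created") to its declared type."""
    paths: dict[str, Any] = {}
    for name, sub in schema.get("properties", {}).items():
        path = f"{prefix}{name}"
        paths[path] = sub.get("type")
        if "properties" in sub:  # recurse into nested objects
            paths.update(flatten(sub, prefix=f"{path}."))
    return paths


def diff_schemas(old: dict[str, Any], new: dict[str, Any]) -> dict[str, list[str]]:
    """Report property paths that were added, removed, or changed type."""
    a, b = flatten(old), flatten(new)
    return {
        "added": sorted(b.keys() - a.keys()),
        "removed": sorted(a.keys() - b.keys()),
        "changed": sorted(p for p in a.keys() & b.keys() if a[p] != b[p]),
    }


prod = {
    "type": "object",
    "properties": {
        "id": {"type": "integer"},
        "meta": {"type": "object", "properties": {"created": {"type": "string"}}},
    },
}
staging = {
    "type": "object",
    "properties": {
        "id": {"type": "string"},
        "meta": {
            "type": "object",
            "properties": {
                "created": {"type": "string"},
                "verified": {"type": "boolean"},
            },
        },
    },
}
drift = diff_schemas(prod, staging)
```

Here `drift` flags `id` as changed and `meta.verified` as added; feed it the `inferred_schema` objects from two runs.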
Output details
- inferred_schema: the schema as a JS object (use this in code)
- schema_str: pretty-printed text (paste into a file)
- top_level_keys: convenience for renaming / sorting fields
- additionalProperties: false is set by default for objects (strict mode). To allow extras, remove it before using the schema.
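Removing `additionalProperties: false` by hand gets tedious for nested schemas. One possible approach is a recursive walk like the sketch below; `allow_extras` is a hypothetical helper, not part of the actor or of genson.

```python
import copy
from typing import Any


def allow_extras(schema: Any) -> Any:
    """Return a deep copy of the schema with every "additionalProperties": false removed."""

    def strip(node: Any) -> None:
        if isinstance(node, dict):
            # Only drop the keyword when it is literally false, so a property
            # that happens to be named "additionalProperties" is left alone.
            if node.get("additionalProperties") is False:
                del node["additionalProperties"]
            for child in node.values():
                strip(child)
        elif isinstance(node, list):
            for child in node:
                strip(child)

    result = copy.deepcopy(schema)
    strip(result)
    return result


strict = {
    "type": "object",
    "additionalProperties": False,
    "properties": {
        "meta": {
            "type": "object",
            "additionalProperties": False,
            "properties": {"created": {"type": "string"}},
        }
    },
}
loose = allow_extras(strict)
```

The original schema is left untouched, so you can keep the strict version for validation and use the loose one where extra fields are expected.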
Limitations / gotchas
- More samples = better schema. With a single sample, every field that appears is marked required, so the schema can't distinguish required from optional fields. Pass 5-10 samples covering all common cases.
- null handling: When a field is sometimes null, the inferred type becomes ["string", "null"] (or similar). This is correct JSON Schema but some tools (e.g. old Avro) don't like unions.
- Format detection: genson doesn't infer format: date-time etc. by default. For format-aware generation, post-process manually.
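Manual post-processing for formats can be as simple as checking the sample values again. The sketch below is a heuristic, not a full RFC 3339 validator: it only annotates top-level properties, and `guess_format` and `add_formats` are hypothetical helpers.

```python
import re
from datetime import date, datetime
from typing import Any, Callable, Optional

DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}$")
DATETIME_RE = re.compile(
    r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})$"
)


def _parses(parse: Callable[[str], Any], text: str) -> bool:
    """True if the parser accepts the text."""
    try:
        parse(text)
        return True
    except ValueError:
        return False


# (format name, shape check, parser that rejects impossible dates such as 2024-02-30)
CHECKS = [
    ("date", DATE_RE, date.fromisoformat),
    # Older Pythons reject a trailing "Z", so normalise it to an explicit offset.
    ("date-time", DATETIME_RE, lambda v: datetime.fromisoformat(v.replace("Z", "+00:00"))),
]


def guess_format(values: list[Any]) -> Optional[str]:
    """Return a format name if every non-null value is a string in that format."""
    strings = [v for v in values if v is not None]
    if not strings or not all(isinstance(v, str) for v in strings):
        return None
    for name, pattern, parse in CHECKS:
        if all(pattern.match(v) and _parses(parse, v) for v in strings):
            return name
    return None


def add_formats(schema: dict[str, Any], samples: list[dict[str, Any]]) -> dict[str, Any]:
    """Annotate top-level properties in place when all samples agree on a format."""
    for name, sub in schema.get("properties", {}).items():
        fmt = guess_format([s[name] for s in samples if name in s])
        if fmt:
            sub["format"] = fmt
    return schema
```

Run it with the same samples you sent to the actor, so a format is only added when every observed value matches.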
Engine
- genson v1.2+: the reference Python implementation. Well-maintained, used in many data pipelines.
Feedback
A short review helps developers find it: Leave a review on Apify Store
