URL: https://apify.com/gochujang/json-schema-generator

JSON Schema Auto-Generator (Infer from Samples)

Provide one or more JSON samples (inline or from URLs) and get an inferred JSON Schema (Draft 7 / 2020-12) describing their shape. Use it to bootstrap API validators, Apify input schemas, and BigQuery / DuckDB schemas. Powered by genson. $0.01 per inference.

Developer: Hojun Lee (maintained by community)

Why this exists

You hit an API that returns JSON. You want to validate downstream payloads, store them in a typed table, or auto-generate TypeScript types, but writing a JSON Schema by hand for 30 nested fields is tedious.

This actor takes one or more sample payloads and infers the schema. The result is a standard JSON Schema (Draft 7 or 2020-12) that you can drop into Ajv, BigQuery, OpenAPI, Apify input_schema.json, etc.


What you get

{
  "_type": "schema",
  "samples_used": 3,
  "schema_uri": "https://json-schema.org/draft/2020-12/schema",
  "title": "User",
  "inferred_schema": {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "type": "object",
    "title": "User",
    "additionalProperties": false,
    "properties": {
      "id": {"type": "integer"},
      "email": {"type": "string"},
      "tags": {"type": "array", "items": {"type": "string"}},
      "meta": {
        "type": "object",
        "properties": {
          "created": {"type": "string"},
          "verified": {"type": "boolean"}
        },
        "required": ["created"]
      }
    },
    "required": ["id", "email"]
  },
  "schema_str": "<pretty-printed schema as text>",
  "top_level_keys": ["id", "email", "tags", "meta"],
  "top_level_required": ["id", "email"]
}

The full schema is also saved as inferred_schema.json in the run's key-value store, where it is easy to download.


Quick start

Single sample

{
  "sample": {
    "id": 1,
    "email": "test@example.com",
    "tags": ["a", "b"],
    "meta": {"created": "2024-01-01", "verified": true}
  }
}

Merge multiple samples (recommended: better union types)

{
  "samples": [
    {"id": 1, "name": "foo"},
    {"id": 2, "name": "bar", "deleted_at": "2024-01-01"},
    {"id": 3, "name": "baz", "deleted_at": null}
  ]
}
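
To see why merging helps, here is a minimal Python sketch of the idea, using only the standard library. It is not genson's implementation and not this actor's code: infer_flat_schema is a hypothetical helper that handles flat objects only. A key counts as required only when every sample contains it, and a field seen with several types gets a union type.

```python
from typing import Any


def json_type(value: Any) -> str:
    """Map a Python value to a JSON Schema type name (bool is checked before int)."""
    if value is None:
        return "null"
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    return "object"


def infer_flat_schema(samples: list[dict[str, Any]]) -> dict[str, Any]:
    """Hypothetical helper: infer a schema for flat (non-nested) objects."""
    seen_types: dict[str, set[str]] = {}
    for sample in samples:
        for key, value in sample.items():
            seen_types.setdefault(key, set()).add(json_type(value))
    # A key is required only if it appears in every sample.
    required = sorted(k for k in seen_types if all(k in s for s in samples))
    properties = {}
    for key, names in seen_types.items():
        ordered = sorted(names)
        # One observed type stays a plain string; several become a union list.
        properties[key] = {"type": ordered[0] if len(ordered) == 1 else ordered}
    return {"type": "object", "properties": properties, "required": required}


samples = [
    {"id": 1, "name": "foo"},
    {"id": 2, "name": "bar", "deleted_at": "2024-01-01"},
    {"id": 3, "name": "baz", "deleted_at": None},
]
schema = infer_flat_schema(samples)
print(schema["required"])                      # ['id', 'name']
print(schema["properties"]["deleted_at"])      # {'type': ['null', 'string']}
```

With only the first sample, deleted_at would be missing from the schema entirely, which is why several samples give a more useful result.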

Fetch samples from URLs

{
  "sampleUrls": [
    "https://api.github.com/users/torvalds",
    "https://api.github.com/users/octocat"
  ],
  "schemaTitle": "GitHub User",
  "schemaUri": "https://json-schema.org/draft/2020-12/schema"
}

Pricing

Pay-Per-Event: $0.01 per schema inference.

Cheap, fixed-cost. Run as many times as you want during API evolution.


Use cases

  1. Bootstrap Apify input_schema.json: run it on a sample input, then drop the schema into your actor's .actor/input_schema.json
  2. REST API validators: generate Ajv-compatible schemas from real production responses
  3. BigQuery / DuckDB tables: convert JSON Schema to DDL with a follow-up tool
  4. OpenAPI: drop schemas under components.schemas for type-safe SDK generation
  5. Comparison: run on prod vs staging payloads, then diff the schemas to spot drift
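
For the comparison use case, the diff step could look roughly like the Python sketch below. It assumes you have already run the actor twice and loaded both inferred_schema objects as dicts. diff_schemas is a hypothetical helper, not part of this actor, and it only compares properties, type, and required.

```python
from typing import Any


def diff_schemas(old: dict[str, Any], new: dict[str, Any], path: str = "") -> list[str]:
    """Hypothetical helper: list property-level differences between two schemas."""
    changes: list[str] = []
    old_props = old.get("properties", {})
    new_props = new.get("properties", {})
    for key in sorted(set(old_props) | set(new_props)):
        here = f"{path}.{key}" if path else key
        if key not in new_props:
            changes.append(f"removed: {here}")
        elif key not in old_props:
            changes.append(f"added: {here}")
        else:
            a, b = old_props[key], new_props[key]
            if a.get("type") != b.get("type"):
                changes.append(f"type changed: {here} {a.get('type')} -> {b.get('type')}")
            # Recurse so nested objects are compared as well.
            changes.extend(diff_schemas(a, b, here))
    old_req = set(old.get("required", []))
    new_req = set(new.get("required", []))
    for key in sorted(old_req ^ new_req):
        here = f"{path}.{key}" if path else key
        state = "now required" if key in new_req else "no longer required"
        changes.append(f"{state}: {here}")
    return changes


# Illustrative schemas only; real ones would come from two actor runs.
prod = {
    "type": "object",
    "properties": {"id": {"type": "integer"}, "email": {"type": "string"}},
    "required": ["id", "email"],
}
staging = {
    "type": "object",
    "properties": {
        "id": {"type": "string"},
        "email": {"type": "string"},
        "plan": {"type": "string"},
    },
    "required": ["id"],
}
drift = diff_schemas(prod, staging)
for line in drift:
    print(line)
```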

Output details

  • inferred_schema: the schema as a JSON object (use this in code)
  • schema_str: pretty-printed text (paste into a file)
  • top_level_keys: convenience for renaming / sorting fields
  • additionalProperties: false is set by default for objects (strict mode). To allow extras, remove it before using the schema.
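
Removing additionalProperties by hand gets tedious for nested objects. A small Python sketch of one way to do it, assuming the schema has been loaded as a dict (allow_extras is a hypothetical helper, not something the actor provides):

```python
import copy
from typing import Any


def allow_extras(schema: Any) -> Any:
    """Return a copy of the schema with every 'additionalProperties: false' removed."""
    relaxed = copy.deepcopy(schema)

    def walk(node: Any) -> None:
        if isinstance(node, dict):
            # Only drop the strict setting; other values are left as they are.
            if node.get("additionalProperties") is False:
                del node["additionalProperties"]
            for child in node.values():
                walk(child)
        elif isinstance(node, list):
            for child in node:
                walk(child)

    walk(relaxed)
    return relaxed


strict = {
    "type": "object",
    "additionalProperties": False,
    "properties": {
        "meta": {
            "type": "object",
            "additionalProperties": False,
            "properties": {"created": {"type": "string"}},
        }
    },
}
loose = allow_extras(strict)
print("additionalProperties" in loose)  # False
```

The input is deep-copied, so the strict version stays available for places where you do want to reject unknown fields.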

Limitations / gotchas

  • More samples = better schema. A single sample can't tell required vs optional. Pass 5-10 samples covering all common cases.
  • null handling: when a field is sometimes null, the inferred type becomes ["string", "null"] (or similar). This is correct JSON Schema but some tools (e.g. old Avro) don't like unions.
  • Format detection: genson doesn't infer format: date-time etc. by default. For format-aware generation, post-process manually.
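
As an illustration of the manual post-processing mentioned above, the Python sketch below adds a format keyword to top-level string fields whose sample values all parse as dates. detect_format and add_formats are hypothetical helpers. Python's fromisoformat is used as an approximation; it is not a strict RFC 3339 check, so treat the result as a hint to review.

```python
from datetime import date, datetime
from typing import Any, Optional


def detect_format(values: list[str]) -> Optional[str]:
    """Return 'date' or 'date-time' if every value parses as such, else None."""
    if not values:
        return None
    # Try 'date' first: datetime.fromisoformat also accepts date-only strings.
    for name, parser in (("date", date.fromisoformat), ("date-time", datetime.fromisoformat)):
        try:
            for value in values:
                parser(value)
        except ValueError:
            continue
        return name
    return None


def add_formats(schema: dict[str, Any], samples: list[dict[str, Any]]) -> dict[str, Any]:
    """Annotate top-level string properties only; nested objects are left untouched."""
    props = {k: dict(v) for k, v in schema.get("properties", {}).items()}
    for key, prop in props.items():
        types = prop.get("type")
        types = [types] if isinstance(types, str) else (types or [])
        if "string" not in types:
            continue
        # Ignore nulls and missing keys; only string values are checked.
        values = [s[key] for s in samples if isinstance(s.get(key), str)]
        fmt = detect_format(values)
        if fmt:
            prop["format"] = fmt
    return {**schema, "properties": props}


schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "deleted_at": {"type": ["null", "string"]},
        "seen": {"type": "string"},
    },
}
samples = [
    {"name": "foo", "seen": "2024-01-01T12:00:00+00:00"},
    {"name": "bar", "deleted_at": "2024-01-01", "seen": "2024-02-01T08:30:00+00:00"},
    {"name": "baz", "deleted_at": None, "seen": "2024-03-01T00:00:00+00:00"},
]
annotated = add_formats(schema, samples)
print(annotated["properties"]["seen"])
```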

Engine

  • genson v1.2+: a Python library that builds JSON Schema from sample objects. Well-maintained, used in many data pipelines.
