Generating Sample JSON Data from a Schema
JsonSchema.Net.DataGeneration is a tool that can create JSON data instances using a JSON schema as a framework.
For example, given the schema:
{"type":"object","properties":{"foo":{"type":"string"}}}
it can generate a JSON document like
{"foo":"bar"}
Under the covers, the library uses the fabulous Bogus library, which is commonly used to generate random test data, and a few other tricks.
One of the more practical uses of a data generator is checking whether a schema actually says what you think it says. The generator just follows the rules, so if the output looks wrong, the schema isn’t strict enough.
required
Suppose you want a user record that always has a username:
{"type":"object","properties":{"username":{"type":"string"},"email":{"type":"string","format":"email"}}}
properties only describes what a property looks like if it shows up. It doesn’t make the property show up. So the generator is perfectly happy producing:
{}
or
{"email":"someone@example.com"}
Both are valid. Adding "required": ["username"] is what actually makes username mandatory, and the generator will reflect that.
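One way to watch this happen is to generate a handful of instances from the schema with "required" added and look for username in each. This is only a sketch: it assumes the JsonSchema.Net and JsonSchema.Net.DataGeneration packages are installed, that JsonSchema.FromText is available for parsing a schema from a string, and a compiler recent enough for raw string literals. The loop count of 5 is arbitrary.

```csharp
using System;
using Json.Schema;
using Json.Schema.DataGeneration;

// The user schema from above, with "required" added.
var schema = JsonSchema.FromText("""
{
  "type": "object",
  "properties": {
    "username": { "type": "string" },
    "email": { "type": "string", "format": "email" }
  },
  "required": ["username"]
}
""");

// Generate a few samples; each successful one should now include "username".
for (var i = 0; i < 5; i++)
{
    var result = schema.GenerateData();
    Console.WriteLine(result.IsSuccess ? $"{result.Result}" : result.ErrorMessage);
}
```

Removing the "required" line and running it again should bring back outputs without username.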
A schema for an age field written as:
{"type":"number"}
will cheerfully produce 3.14 or -7.9. Those are valid numbers, just not valid ages. The schema should be:
{"type":"integer","minimum":0,"maximum":130}
additionalProperties Surprises
Without "additionalProperties": false, the generator can (and will) tack on extra properties beyond whatever is listed in properties:
{"type":"object","properties":{"id":{"type":"integer"}}}
might produce:
{"id":42,"xQ7":true,"lorem":"ipsum dolor"}
If you only want id, say so with "additionalProperties": false.
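To spot these extras in generated output, it can help to compare the instance's property names against the ones the schema declares. The helper below is hypothetical and not part of the library. It uses only System.Text.Json and looks at the top level only, so treat it as a rough sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Hypothetical helper: names of instance properties that the schema's
// "properties" keyword does not declare (top level only).
static List<string> UndeclaredProperties(string schemaJson, string instanceJson)
{
    using var schema = JsonDocument.Parse(schemaJson);
    using var instance = JsonDocument.Parse(instanceJson);

    var declared = new HashSet<string>();
    if (schema.RootElement.TryGetProperty("properties", out var props))
        foreach (var p in props.EnumerateObject())
            declared.Add(p.Name);

    return instance.RootElement.EnumerateObject()
        .Select(p => p.Name)
        .Where(name => !declared.Contains(name))
        .ToList();
}

var schemaJson = """{"type":"object","properties":{"id":{"type":"integer"}}}""";
var sample = """{"id":42,"xQ7":true,"lorem":"ipsum dolor"}""";

// Lists the properties that only appear because additional properties are allowed.
Console.WriteLine(string.Join(", ", UndeclaredProperties(schemaJson, sample)));
```

If this prints anything for a schema you expected to be closed, the schema is probably missing "additionalProperties": false.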
This library is quite powerful. It supports most JSON Schema keywords, including if/then/else and aggregation keywords (oneOf, allOf, etc.).
It currently does not support:
- $dynamicRef
- title, description
- content* keywords
- dependencies / dependent* keywords
Everything else should be mostly supported. Feel free to open an issue if you find something isn’t working as you expect.
$ref support does not check for infinite loops, such as those that occur with schemas like { "$ref": "#" }. If your schema includes a reference like this, a stack overflow exception is likely.
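Since the library does not guard against this, a cheap pre-check before generating may be worthwhile. The function below is a hypothetical helper, not a library feature. It uses only System.Text.Json and catches only the literal "$ref": "#" case, not longer reference cycles. It can also misfire if that text appears inside data such as a const or enum value.

```csharp
using System;
using System.Text.Json;

// Hypothetical pre-check: true if any object in the schema document
// contains "$ref": "#". Longer reference cycles are NOT detected.
static bool HasRootSelfReference(JsonElement node)
{
    switch (node.ValueKind)
    {
        case JsonValueKind.Object:
            foreach (var prop in node.EnumerateObject())
            {
                if (prop.Name == "$ref" &&
                    prop.Value.ValueKind == JsonValueKind.String &&
                    prop.Value.GetString() == "#")
                    return true;
                if (HasRootSelfReference(prop.Value)) return true;
            }
            return false;
        case JsonValueKind.Array:
            foreach (var item in node.EnumerateArray())
                if (HasRootSelfReference(item)) return true;
            return false;
        default:
            return false;
    }
}

using var doc = JsonDocument.Parse(
    """{"type":"object","properties":{"child":{"$ref":"#"}}}""");

// Report whether the schema should be reviewed before calling the generator.
Console.WriteLine(HasRootSelfReference(doc.RootElement));
```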
Without any additional parameters, string generation uses Bogus’s Lorem Ipsum generator to create some nice (but oddly readable) garbage text.
format
All of the formats listed in the draft 2020-12 specification are supported, at least to the extent that they can be validated by JsonSchema.Net.
If a format is specified, it will be used.
pattern
Regular expressions specified via pattern support combined constraint evaluation, including scenarios where multiple required patterns must be satisfied together.
Supported scenarios include:
- pattern constraints across composed schemas
- not
- pattern and minLength/maxLength
- pattern and format
Some highly complex or mutually incompatible regex combinations may still be impossible to satisfy. In those cases, generation fails with detailed error information.
Integer and number generation each have custom algorithms that produce values that align with minimums, maximums, multiples, and even anti-multiples (numbers that should not be divisors).
For this schema,
{"type":"integer","minimum":0,"maximum":100,"multipleOf":4,"allOf":[{"not":{"minimum":40,"maximum":60}},{"not":{"multipleOf":3}}]}
the only valid integers are 4, 8, 16, 20, 28, 32, 64, 68, 76, 80, 88, 92, and 100.
The library will generate such values with ease.
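The valid set is small enough to enumerate by brute force, which is a handy way to double-check what a numeric schema really allows. This sketch applies the constraints directly in plain C# rather than through the library, and it reads the two not clauses as both applying.

```csharp
using System;
using System.Linq;

// Brute-force the schema's constraints over the closed range [0, 100].
var valid = Enumerable.Range(0, 101)
    .Where(n => n % 4 == 0)              // multipleOf: 4
    .Where(n => !(n >= 40 && n <= 60))   // not { minimum: 40, maximum: 60 }
    .Where(n => n % 3 != 0)              // not { multipleOf: 3 }
    .ToArray();

Console.WriteLine(string.Join(", ", valid));
```

Note that 0 drops out because it counts as a multiple of 3.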
Care needs to be taken when specifying arrays that can have additional items or objects that can have additional properties. This library will unsubtly create moderately deep trees of data if allowed.
For example, this schema doesn’t specify what the items should look like:
{"type":"array"}
So, the generator will happily create literally any JSON value for the items, including unconstrained objects and arrays.
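Constraining the schema itself is one remedy: say what the items look like and cap how many there can be. A sketch, again assuming the two packages are installed and that JsonSchema.FromText is available. The specific limits are illustrative only.

```csharp
using System;
using Json.Schema;
using Json.Schema.DataGeneration;

// An array schema that pins down both the item shape and the item count.
var schema = JsonSchema.FromText("""
{
  "type": "array",
  "items": { "type": "integer", "minimum": 0, "maximum": 9 },
  "maxItems": 5
}
""");

// With "items" specified, the output should be a short, flat array of small integers.
var result = schema.GenerateData();
Console.WriteLine(result.IsSuccess ? $"{result.Result}" : result.ErrorMessage);
```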
To combat this, there are some built-in limitations:
All you need to generate data is a schema object. This can be built inline or read in from an external source. The instructions for that are on the “Overview” tab.
Once you have your schema object, simply call the .GenerateData() extension method, and it will return a result to you.
var schema = JsonSchema.FromFile("myFile.json");
var generationResult = schema.GenerateData();
var sampleData = generationResult.Result;
The result object has several properties:
- IsSuccess indicates whether the system was able to generate a value
- Result holds the value as a JsonElement, if successful
- ErrorMessage holds any error message, if unsuccessful
- InnerResults holds result objects from nested generations. This can be useful for debugging.
- Location (if available) identifies where generation failed in the target instance, as a JsonPointer
- SchemaLocations (if available) identifies one or more schema locations related to the failure, also as JsonPointers
When generation fails, start with the top-level GenerationResult returned by .GenerateData():
- If IsSuccess is false, inspect ErrorMessage and InnerResults.
- InnerResults contains nested failures from branches, properties, array items, or composed schemas.
- Use Location for the relative instance path that failed.
- Use SchemaLocations for the schema path(s) involved in that failure.
In practice, a single generation failure can contain multiple nested reasons. Walking the InnerResults tree is the best way to produce a full error report.
using System;
using Json.Schema;
using Json.Schema.DataGeneration;

// Recursively prints every failure reason in the result tree.
void PrintFailures(GenerationResult result, string indent = "")
{
    // Successful branches have nothing to report.
    if (result.IsSuccess) return;

    if (!string.IsNullOrWhiteSpace(result.ErrorMessage))
    {
        Console.WriteLine($"{indent}Reason: {result.ErrorMessage}");
        if (result.Location != null)
            Console.WriteLine($"{indent}At: {result.Location}");
        if (result.SchemaLocations is { Count: > 0 })
        {
            Console.WriteLine($"{indent}Schema path(s):");
            foreach (var schemaLocation in result.SchemaLocations)
                Console.WriteLine($"{indent}- {schemaLocation}");
        }
    }

    // Descend into nested results, indenting one level per depth.
    if (result.InnerResults == null) return;
    foreach (var inner in result.InnerResults)
        PrintFailures(inner, indent + "  ");
}

var schema = JsonSchema.FromFile("myFile.json");
var generationResult = schema.GenerateData();
if (!generationResult.IsSuccess)
    PrintFailures(generationResult);
So, uh, yeah. I guess that’s it really.
Happy generating.