Constraining the key in JSON schema - json

I would like to put constraints on the keys in a JSON document, using JSON Schema. For example, I may have a JSON document that looks like this:
{
"id": 1,
"name": "a green door",
"price": 12.50,
"tags": ["home", "green"]
}
I don't care about which particular keys are being used, but I'd like to enforce in the schema that no key is longer than a certain number of characters; let's say 4 characters for the sake of argument. The example above would then fail schema validation, because "price" is 5 characters long.
I know how to validate the length of the value -- here, I care about the key.

You can use patternProperties to restrict property names to those that match a regular expression. In the case of your example, it might look like this:
{ "$schema": "http://json-schema.org/draft-04/schema#",
"type": "object",
"additionalProperties": false,
"patternProperties": {
"^.{1,4}$": {}
}
}
Note that "additionalProperties": false is necessary as well: without it, keys that do not match the pattern would still be accepted.
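To see what this schema accepts without running a validator, the same rule can be sketched in plain Python. This is only an illustration, not a JSON Schema implementation: the helper name keys_too_long is made up here, and it hard-codes the pattern from the schema above.

```python
import json
import re

# Same pattern as in the schema. JSON Schema patterns are not anchored
# implicitly, so ^ and $ are what force the whole key to be 1-4 characters.
KEY_PATTERN = re.compile(r"^.{1,4}$")

def keys_too_long(document: str) -> list[str]:
    """Return the top-level keys the schema would reject (hypothetical helper)."""
    data = json.loads(document)
    # With "additionalProperties": false, every key matching no pattern is an error.
    return [key for key in data if not KEY_PATTERN.search(key)]

example = '{"id": 1, "name": "a green door", "price": 12.50, "tags": ["home", "green"]}'
print(keys_too_long(example))  # ['price']
```

Only "price" is reported, which matches the failure the question expects.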

Related

Valid string sequences in JSON Schema

Can anyone advise how to code up a JSON Schema document to describe a string that can be one of three possible sequences? Say a string "fruit" can be only the following: "apple", "banana" or "coconut".
I was thinking it might be possible to use regex but not sure how to indicate the regex constraint in JSON Schema.
https://json-schema.org/draft/2020-12/json-schema-core.html#rfc.section.6.1
Here is what I have so far:
{
"$id": "https://example.com/person.schema.json",
"$schema": "https://json-schema.org/draft/2020-12/schema",
"title": "TestSession",
"type": "object",
"properties": {
"fruit": {
"type": "string",
"description": "only three legal possibilities: apple, banana, or coconut"
}
}
}
You need to use the enum keyword for this.
The value of this keyword MUST be an array. This array SHOULD have
at least one element. Elements in the array SHOULD be unique.
An instance validates successfully against this keyword if its
value is equal to one of the elements in this keyword's array
value.
Elements in the array might be of any type, including null.
https://datatracker.ietf.org/doc/html/draft-bhutton-json-schema-validation-00#section-6.1.2
For example "enum": [ "apple", "banana", "coconut" ].
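As a rough illustration of what "type": "string" combined with enum means for the fruit property, here is the same check written by hand in Python. The names ALLOWED_FRUITS and is_valid_fruit are assumptions made for this sketch; a real validator does this for you.

```python
ALLOWED_FRUITS = ["apple", "banana", "coconut"]

def is_valid_fruit(value) -> bool:
    # "type": "string" plus "enum": the value must be a string that is
    # equal to one of the listed elements.
    return isinstance(value, str) and value in ALLOWED_FRUITS

print(is_valid_fruit("apple"))   # True
print(is_valid_fruit("cherry"))  # False
```

No regex is needed for a fixed list of values; enum compares for equality.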

JSON schema for key (unknown column name): value (list of integers) pairs

I want to write a JSON schema for JSON that looks something like this (it's for constructing regressors with delays):
{"x1": [1, 6, 2], "col5": [0], "y": [1, 6, 3, 8]}
I know neither the column names nor the lengths of the lists in advance. The only thing I know is that each column name is a string and each list of values is an array. Any advice on how to construct it?
I'm open to a more suitable JSON format and its schema.
Although it is possible to achieve this with patternProperties and a .* pattern, the more straightforward way is to use the additionalProperties keyword, e.g.:
{
"type": "object",
"additionalProperties": {
"type": "array",
"items": {
"type": "integer"
}
}
}
In this example I also restricted array element type to integer.
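For comparison, here is a hand-written Python sketch of the same rule (any key, value must be an array of integers). It is not a JSON Schema validator; the function name is_regressor_map is made up for the example.

```python
import json

def is_regressor_map(document: str) -> bool:
    """True if the document is an object whose values are all lists of integers."""
    data = json.loads(document)
    if not isinstance(data, dict):
        return False
    for values in data.values():
        if not isinstance(values, list):
            return False
        for v in values:
            # bool is a subclass of int in Python, but JSON true/false are not integers.
            # Simplification: a float such as 1.0 is rejected here, although some
            # JSON Schema drafts treat it as an integer.
            if not isinstance(v, int) or isinstance(v, bool):
                return False
    return True

print(is_regressor_map('{"x1": [1, 6, 2], "col5": [0], "y": [1, 6, 3, 8]}'))  # True
print(is_regressor_map('{"x1": [1.5]}'))  # False
```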
This sounds like a perfect use case for JSON Schema. It allows you to add as few or as many constraints as are needed. The following schema requires that the JSON be an object where all properties must be arrays. An array of what? It could be anything. It's unconstrained.
{
"type": "object",
"patternProperties": {
".*": { "type": "array" }
}
}

Enforcing "style" rules in JSON Schema files?

I am looking at using JSON Schemas for an upcoming project, and looking for a way to validate our naming conventions/style and consistency rules in the JSON Schema file. Somewhat similar to StyleCop or Checkstyle.
Using this sample from JSON Schema Lint to illustrate:
{
"description": "Any validation failures are shown in the right-hand Messages pane.",
"type": "object",
"properties": {
"foo": {
"type": "number"
},
"bar": {
"type": "string",
"enum": [
"a",
"b",
"c"
]
}
}
}
Imagine another developer wants to add a new property, but I want to prevent property names from starting with an upper-case letter (baz instead of Baz), or maybe require that boolean properties start with "is" (isBaz). Is there a way to "unit test" the JSON Schema file and check for that?
"Baz": {
"type": "boolean"
},
It feels like a custom validator for the JSON Schema file (vs. using the JSON Schema to validate the JSON output). Does something like that already exist, or do I just parse the JSON schema file myself and write the rules?
It's completely possible to write a meta-schema that enforces this constraint on your schemas. Let's construct it step-by-step:
1. Constraining property names
The key part is to use patternProperties to specify which property names are allowed, and additionalProperties to disallow anything else:
{
"patternProperties": {
"^[a-z]+([A-Z][a-z]*)*$": {}
},
"additionalProperties": false
}
(For this example, I've used the regex ^[a-z]+([A-Z][a-z]*)*$ to detect alphabetic-only lowerCamelCase)
Note that it doesn't matter whether you provide any constraints for suitably-named properties (here it's just the empty schema {}). However, the presence of this definition means that any matching property is allowed, while anything else is banned by additionalProperties.
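If you want to try the regex on a few candidate names before putting it in a schema, a quick Python check is enough. The helper name is_lower_camel is made up for this sketch.

```python
import re

# The lowerCamelCase pattern used in patternProperties above.
LOWER_CAMEL = re.compile(r"^[a-z]+([A-Z][a-z]*)*$")

def is_lower_camel(name: str) -> bool:
    return LOWER_CAMEL.search(name) is not None

for name in ["foo", "isBaz", "Baz", "foo_bar", "foo2"]:
    print(name, is_lower_camel(name))
# foo True
# isBaz True
# Baz False
# foo_bar False
# foo2 False
```

Because the pattern is alphabetic-only, names containing digits or underscores are rejected along with names that start upper-case.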
Fancier constraints
For other constraints (such as your "boolean properties must start with is" one), you just add more complex entries here.
This answer focuses more on how to make a generic recursive naming-style schema. It's already pretty long, so if you're looking for guidance on how to express a specific constraint, then it might be neater to ask as a separate question.
2. Applying to the properties property
This bit's pretty simple - make these constraints apply to the appropriate part of the schema:
{
"properties": {
"properties": {"$ref": "#/definitions/propertyStyleRule"}
},
"definitions": {
"propertyStyleRule": {
"patternProperties": {
"^[a-z]+([A-Z][a-z]*)*$": {}
},
"additionalProperties": false
}
}
}
3. Make it recursive
In fact, you don't just want to cover sub-schemas inside "properties", but also "items", "anyOf", etc.
Here it gets quite long, so I'll omit most of it, but basically you go through every keyword that might contain a schema, and make sure they are subject to the same naming-scheme by referencing the root schema:
{
"properties": {
"properties": {"$ref": "#/definitions/propertyStyleRule"},
"additionalProperties": {"$ref": "#"},
"items": {"$ref": "#"},
"not": {"$ref": "#"},
"allOf": {"$ref": "#"},
...
},
"definitions": {
"propertyStyleRule": {
"patternProperties": {
"^[a-z]+([A-Z][a-z]*)*$": {"$ref": "#"}
},
"additionalProperties": false
}
}
}
Note: we've also now replaced the empty schema ({}) in our "propertyStyleRule" definition with a reference back to the root ({"$ref": "#"}), so the sub-schemas inside properties also recurse properly.
4. Hang on, some of those keywords can be arrays, or booleans, or...
OK, so there's an obvious problem here: "not" holds a schema, so that's fine, but "allOf" holds an array of schemas, "items" can hold either, and "additionalProperties" can be a boolean.
We could do some fancy switching with different types, or we could simply add an items entry to our root schema:
{
"items": {"$ref": "#"},
"properties": {
...
},
"definitions": {
"propertyStyleRule": {...}
}
}
Because we haven't specified a type, our root schema actually allows instances to be objects/arrays/boolean/string/whatever - and if the instance isn't an object, then properties is just ignored.
Similarly, items is ignored unless the instance is an array - but if it is an array, then the entries must also follow the root schema. So it doesn't matter whether the value of "items" is a schema or an array of schemas, it recurses properly either way.
5. Schema maps
For a few keywords (like "patternProperties" or "definitions") the value is not a schema, it's a map of strings to schemas, so you can't just reference the root schema. For these, we'll make a definition "schemaMap", and reference that instead:
{
"items": {"$ref": "#"},
"properties": {
"properties": {"$ref": "#/definitions/propertyStyleRule"},
"additionalProperties": {"$ref": "#"},
"items": {"$ref": "#"},
"not": {"$ref": "#"},
"allOf": {"$ref": "#"},
...
"patternProperties": {"$ref": "#/definitions/schemaMap"},
...
},
"definitions": {
"schemaMap": {
"type": "object",
"additionalProperties": {"$ref": "#"}
},
"propertyStyleRule": {...}
}
}
... and you're done!
I've left out details, but hopefully it's clear enough how to write the full version.
Also, once you've written this once, it should be pretty easy to adapt it for different style rules, or even applying similar constraints to the names in "definitions", etc. If you do write a schema like this, please consider posting it somewhere so that other people can adapt it! :)
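The question also mentions the alternative of parsing the schema file yourself. A minimal sketch of that approach in Python, mirroring the recursion described above, might look like the following. The names badly_named_properties, SCHEMA_MAPS and SUBSCHEMAS are invented for this example, and the keyword lists are deliberately incomplete.

```python
import re

LOWER_CAMEL = re.compile(r"^[a-z]+([A-Z][a-z]*)*$")

# Keywords whose value is a map of names to schemas, and keywords whose value
# is a schema or a list of schemas. Neither tuple is exhaustive.
SCHEMA_MAPS = ("properties", "patternProperties", "definitions")
SUBSCHEMAS = ("items", "additionalProperties", "not", "allOf", "anyOf", "oneOf")

def badly_named_properties(schema, path="#"):
    """Yield pointer-like paths of property names that break the style rule."""
    if isinstance(schema, list):
        # e.g. the value of "allOf", or the array form of "items"
        for index, item in enumerate(schema):
            yield from badly_named_properties(item, f"{path}/{index}")
    elif isinstance(schema, dict):
        for keyword, value in schema.items():
            if keyword in SCHEMA_MAPS and isinstance(value, dict):
                for name, subschema in value.items():
                    # Only names under "properties" are style-checked.
                    if keyword == "properties" and not LOWER_CAMEL.search(name):
                        yield f"{path}/properties/{name}"
                    yield from badly_named_properties(subschema, f"{path}/{keyword}/{name}")
            elif keyword in SUBSCHEMAS:
                # Booleans (e.g. "additionalProperties": false) fall through harmlessly.
                yield from badly_named_properties(value, f"{path}/{keyword}")

schema = {
    "type": "object",
    "properties": {
        "foo": {"type": "number"},
        "Baz": {"type": "boolean"},
        "bar": {"type": "array", "items": {"properties": {"Bad_name": {}}}},
    },
}
print(list(badly_named_properties(schema)))
# ['#/properties/Baz', '#/properties/bar/items/properties/Bad_name']
```

This is easy to drop into a unit test, at the cost of maintaining the keyword lists by hand; the meta-schema approach keeps everything in JSON Schema itself.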

How would you design JSON Schema for an arbitrary key?

I have the following JSON output data:
{
"label_name_0" : 0,
"label_name_5" : 3,
.
.
.
"label_name_XXX" : 4
}
The output is simple: a key[1] name associated with an integer value. If the key name didn't change, I could easily come up with a JSON Schema similar to this:
{
"type": "array",
"title": "Data output",
"items": {
"properties": {
"label_name": {
"type": "integer",
"default": 0,
"readonly": true
}
}
}
}
Since the key name itself is not known and keeps changing, I have to design a schema for it. The only thing I know is that the key is a string of no more than 100 characters. How do I define a JSON Schema for a key label_name_XXX that keeps changing?
[1] Not sure if I am using the right terminology
On json-schema.org you will find something appropriate in the File System Example section. You can define patternProperties inside an object.
{
"type": "object",
"properties": {
"/": {}
},
"patternProperties": {
"^(label_name_[0-9]+)+$": { "type": "integer" }
},
"additionalProperties": false
}
The regular expression ^(label_name_[0-9]+)+$ should fit your needs. Regular expressions in JSON Schema are not implicitly anchored, so ^ and $ have to be written explicitly. Each key must consist of label_name_ followed by at least one digit ([0-9]+), and there can be arbitrarily many digits. Note that the outer + lets the whole group repeat within a single key (so label_name_1label_name_2 also matches); it does not require the object to contain at least one property. Use ^label_name_[0-9]+$ if exactly one occurrence is intended.
Setting additionalProperties to false rejects any property that is neither listed in properties nor matched by the regular expression.
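The behaviour of the pattern from the schema is easy to check directly. The Python sketch below compares it with a variant that has no outer group; the variable names repeated and single are made up for the example.

```python
import re

repeated = re.compile(r"^(label_name_[0-9]+)+$")  # pattern from the schema
single = re.compile(r"^label_name_[0-9]+$")       # variant without the outer +

keys = ["label_name_0", "label_name_123", "label_name_1label_name_2", "label_name_", "other"]
for key in keys:
    print(key, bool(repeated.search(key)), bool(single.search(key)))
# label_name_0 True True
# label_name_123 True True
# label_name_1label_name_2 True False
# label_name_ False False
# other False False
```

The only difference is the third key: the outer + accepts the group repeated inside one key name.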
As Konrad's answer stated, use patternProperties. But use it in place of properties, which is not needed; I think Konrad just pasted it from his reference example, which was expecting a path starting with /. In the example below, the regex ^.*$ accepts any property name, the anyOf restricts values to string or null, and "additionalProperties": false rejects anything the pattern does not match.
"patternProperties": {
"^.*$": {
"anyOf": [
{"type": "string"},
{"type": "null"}
]
}
},
"additionalProperties": false
Simpler solution than patternProperties, since OP does not have any requirement on the key names (documentation):
{
"type": "object",
"additionalProperties": {
"type": "integer",
"default": 0,
"readonly": true
}
}
default and readonly included because they were included in the OP's initial suggestion, but they are not required.
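The schema above does not express the 100-character limit on keys that the question mentions. As a rough sketch of both rules together (integer values, keys of at most 100 characters), here is a hand-written Python check. It is not a schema validator, and check_labels and MAX_KEY_LENGTH are names invented for this example.

```python
import json

MAX_KEY_LENGTH = 100  # limit stated in the question

def check_labels(document: str) -> list[str]:
    """Return a list of problems; an empty list means the document passes."""
    data = json.loads(document)
    if not isinstance(data, dict):
        return ["top-level value is not an object"]
    problems = []
    for key, value in data.items():
        if len(key) > MAX_KEY_LENGTH:
            problems.append(f"key longer than {MAX_KEY_LENGTH} characters: {key[:20]}...")
        # bool is a subclass of int in Python, so exclude it explicitly.
        if not isinstance(value, int) or isinstance(value, bool):
            problems.append(f"value of {key!r} is not an integer")
    return problems

print(check_labels('{"label_name_0": 0, "label_name_5": 3}'))  # []
print(check_labels('{"label_name_0": "x"}'))  # ["value of 'label_name_0' is not an integer"]
```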

How can i define a single, unique Key-Valuepair in JsonSchema?

The Schema should allow only the following constellation: {"status":"nok"}.
The Key must always be "status" and the value should allow "ok","nok","inProgress"
No different or additional objects should be allowed.
I have tried this:
{
"description": "blabla",
"type": "object",
"properties": {
"status": {
"type": "string",
"enum": [
"ok",
"inProgress",
"nok"
],
"required": true,
"additionalItems": false
}
},
"required": true,
"additionalProperties": false
}
This works, but this schema still lets me send the same key/value pair twice, like {"status":"nok","status":"nok"}.
I would also be happy if it worked without the "object" container that I'm using, to reduce overhead.
Maybe someone knows a solution, thanks.
There is a more fundamental issue with that input:
{"status":"nok","status":"nok"}
namely, that input is not interoperable JSON. RFC 4627, section 2.2, explicitly states that "The names within an object SHOULD be unique", and in your case they are not.
Which means the JSON parser you use can do whatever it wants to with such an input. Some JSON APIs will grab whatever value they come upon first, other parsers will grab the last value they read, others will even coalesce values -- none of this is illegal as per the RFC.
In essence: given such input, you cannot guarantee what the output of the JSON parser will be; and as such, you cannot guarantee JSON Schema validation of such an input either.
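If duplicates have to be rejected, that check therefore belongs in the parsing step, before schema validation. As one possible sketch, Python's standard json module accepts an object_pairs_hook that sees every name/value pair, including repeated names. The function names reject_duplicate_keys and load_strict are made up for this example.

```python
import json

def reject_duplicate_keys(pairs):
    """object_pairs_hook that raises on repeated names instead of silently keeping one."""
    result = {}
    for key, value in pairs:
        if key in result:
            raise ValueError(f"duplicate key: {key!r}")
        result[key] = value
    return result

def load_strict(text: str):
    return json.loads(text, object_pairs_hook=reject_duplicate_keys)

# The default parser silently keeps the last value it reads:
print(json.loads('{"status":"nok","status":"ok"}'))  # {'status': 'ok'}

try:
    load_strict('{"status":"nok","status":"nok"}')
except ValueError as error:
    print(error)  # duplicate key: 'status'
```

The result of load_strict can then be handed to whichever schema validator you use; other languages and parsers need their own equivalent of this hook.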