JSON Schema: check an entire JSON document for a certain property with null values

I'm trying to write a JSON schema that searches any JSON document, whatever its structure, for all occurrences of a certain property called "field_name" and checks that the property has a value. There can't be an empty "field_name".
The property "field_name" can be at any level in the JSON file, e.g.
https://raw.githubusercontent.com/stopopol/deims_apps/master/metadata_models/smm.json
So far I have this, but it never complains when a "field_name" is empty.
{
"$schema": "http://json-schema.org/schema#",
"title": "Metadata Model",
"type": "object",
"required": [
"name",
"abbreviation",
"version",
"releaseDate",
"scope",
"content"
],
"patternProperties": {
"field_name": {
"type": "string",
"minLength": 1
}
}
}
I thought that I could just check for any occurrence of the property "field_name" and require it to be a string with a length of at least 1.

You can do this with a surprisingly simple recursive schema. The properties and additionalProperties keywords only apply when the data being validated is an object. If the data is not an object, these keywords get ignored. This allows us to express the "if the value is an object" part simply by leaving out the "type": "object" declaration.
The use of allOf/definitions shows how to express the recursive constraint without making the entire schema recursive.
{
"title": "Metadata Model",
"type": "object",
"required": [
"name",
"abbreviation",
"version",
"releaseDate",
"scope",
"content"
],
"allOf": [{ "$ref": "#/definitions/field_name-not-empty-deep" }],
"definitions": {
"field_name-not-empty-deep": {
"properties": {
"field_name": {
"type": "string",
"minLength": 1
}
},
"additionalProperties": { "$ref": "#/definitions/field_name-not-empty-deep" }
}
}
}
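To see what the recursion in this schema amounts to, here is a minimal Python sketch that mirrors its logic by hand rather than running a real validator. The function name `field_names_ok` is hypothetical. Like the schema as written, it only follows object properties; it does not look inside arrays, which would need an `items` keyword in the schema.

```python
def field_names_ok(value):
    """Return True if no "field_name" reachable through nested objects is empty."""
    if not isinstance(value, dict):
        # Non-objects (strings, numbers, arrays, null) are accepted as-is,
        # just as properties/additionalProperties are ignored for them.
        return True
    for key, child in value.items():
        if key == "field_name":
            # Equivalent of {"type": "string", "minLength": 1}.
            if not isinstance(child, str) or len(child) < 1:
                return False
        elif not field_names_ok(child):
            # Equivalent of additionalProperties pointing back at the definition.
            return False
    return True


good = {"content": {"section": {"field_name": "title"}}}
bad = {"content": {"section": {"field_name": ""}}}
print(field_names_ok(good), field_names_ok(bad))
```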

{
"anyOf" :
[
{
"not" :
{
"type" : "object"
}
},
{
"properties" :
{
"field_name" :
{
"not" :
{
"type" : "null"
}
}
},
"additionalProperties" :
{
"$ref" : "#"
}
}
]
}
Each instance encountered is either not an object, or it is an object whose "field_name" property (if present) must not be null. Every other property is validated against the same schema through the $ref, which points to the root, so the check is applied recursively to all possible sub-objects.
(I assume by "empty" you mean the property is set and equal to null.)

Why is required not an element property?

Current schema example:
{
"$schema": "http://json-schema.org/schema#",
"type": "object",
"properties": {
"id": {
"type": "string",
"uniqueItems": true
},
"name": {
"type": "string"
},
"age": {
"type": "number"
},
"description": {
"type": "string"
}
},
"required": ["id", "name", "age"]
}
This is counterintuitive to me. It requires repeating the property names, and repetition is bad. I would have expected this instead:
{
"$schema": "http://json-schema.org/schema#",
"type": "object",
"properties": {
"id": {
"type": "string",
"uniqueItems": true,
"required": true
},
"name": {
"type": "string",
"required": true
},
"age": {
"type": "number",
"required": true
},
"description": {
"type": "string"
}
}
}
Is there a technical reason for required being an array where you have to repeat the property names? Is this approach superior in any way?
The set of required keys is an attribute of an object, not of its individual properties. That is, a predefined property
{
...
"$defs": {
"age_property": {
"type": "number"
}
}
...
}
may be required by one object
{
"type": "object",
"properties": {
"age": { "$ref": "#/$defs/age_property" },
...
},
"required": ["age", ...]
}
but not another
{
"type": "object",
"properties": {
"age": { "$ref": "#/$defs/age_property" },
...
},
"required": [...]
}
tl;dr
It has to do with what the keyword is actually evaluating. It's evaluating the container for the property's presence; the subschema in /properties is checking the value, if there is one.
Explanation
(source: I'm one of the specification authors and a validator implementor)
required used to be a keyword that was contained inside a property definition. As of draft 4, it was moved to its own root-level keyword.
The value inside properties is always to be a schema. This subschema should stand alone, unaware that it's contained within a larger schema. As a schema, its function is to evaluate a value, but it has no knowledge of the origin of the value. In the case of properties, this is a value from a key-value pair. Again, it has no knowledge of the key or the object that contains it.
If required were part of the property definition, it would be validating not the value of the property, but the object that contains it. This is the responsibility of the parent schema.
An example:
// schema
{
"type": "object",
"properties": {
"a": { "type": "string" },
"b": { "required": true }
}
}
// instance
{ "b": "some value" }
/properties/b ({"required":true}) is instructed to evaluate "some value". How can required know that this value comes from an object and is under the b property? It would need knowledge of the value's parent to do that. (JSON Schema validators had to bend themselves into funny shapes in order to support this.)
The solution was to move required out of the property and into the schema that is evaluating the object itself.
// schema
{
"type": "object",
"properties": {
"a": { "type": "string" }
},
"required": [ "b" ]
}
// instance
{ "b": "some value" }
Now, required can evaluate the full object, and it can check whether that object contains a b property. Because there is no /properties/b in this case, any value is fine, so long as b is present.
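The split of responsibilities described here can be sketched in a few lines of Python. This is a hand-written illustration, not how any particular validator is implemented; `validate_object` and `is_string` are hypothetical names.

```python
def is_string(value):
    # Stand-in for the subschema {"type": "string"}: it sees only the value.
    return isinstance(value, str)


def validate_object(instance, properties, required):
    """Collect errors for an object; the parent owns the presence checks."""
    errors = []
    # "required" looks at the container: is the key there at all?
    for name in required:
        if name not in instance:
            errors.append("missing required property: " + name)
    # "properties" hands each present value to its subschema, which has
    # no knowledge of the key or of the containing object.
    for name, check in properties.items():
        if name in instance and not check(instance[name]):
            errors.append("invalid value for property: " + name)
    return errors


print(validate_object({"b": "some value"}, {"a": is_string}, ["b"]))
print(validate_object({"a": 1}, {"a": is_string}, ["b"]))
```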
Unfortunately, the discussion around moving this keyword has been lost as the current GitHub repo was set up after the move from draft 3 to draft 4.
The written specification is not based on real-world application. See https://github.com/json-schema-org/json-schema-spec/issues/725 from the person who has probably made the most practical use of JSON Schema (the author of the ajv library).
There is no right or wrong in this approach, but whether it is useful for a wide range of applications is questionable. There are TONS of debates around this specification.
IMO, yes, a separate required array makes an impossible state possible (the list can get out of sync with the properties).
There is no technical reason; whoever designed JSON Schema decided to use an array instead of element properties, perhaps because this way all the required element names sit next to each other.

How can I validate with JSON Schema that an object is either empty or has the required properties?

I want to validate a JSON array of objects against a schema inside a function. Each object must have exactly one of these formats:
empty object
object with four properties
I tried to wrap required properties in oneOf, but I got the following error: Invalid input: data[1].prop should match exactly one schema in oneOf
{
"type": "array",
"items": {
"type": "object",
"properties": {
"prop": {
"type": "object",
"properties": {
"name": {
"prop1": "string"
},
"type": {
"prop2": "string"
},
"amount": {
"prop3": "number"
},
"operation": {
"prop4": "string"
}
},
"oneOf": [
{ "required": ["prop1", "prop2", "prop3", "prop4"] },
{ "required": [] }
]
}
}
}
}
I would move the oneOf out so that it's just under the items keyword.
In one of the subschemas, you have the properties keyword along with the required keyword for those properties plus an additionalProperties: false. This portion would satisfy the "exactly four properties" condition.
In the other subschema, just identify that it needs to be an object, but don't declare any properties. Use additionalProperties: false in this one, too. This satisfies the "empty object" condition.
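As a rough illustration of the two alternatives described in this answer, the following Python sketch checks the same conditions by hand. It assumes the four properties are the ones named in the question (`name`, `type`, `amount`, `operation`) holding string, string, number, and string values; the function names are hypothetical.

```python
# Assumed property names and value types, taken from the question's schema.
EXPECTED = {"name": str, "type": str, "amount": (int, float), "operation": str}


def is_empty_object(item):
    # Second subschema: an object with no properties at all.
    return isinstance(item, dict) and len(item) == 0


def is_full_object(item):
    # First subschema: all four properties required, nothing else allowed.
    if not isinstance(item, dict) or set(item) != set(EXPECTED):
        return False
    return all(
        isinstance(item[key], kind) and not isinstance(item[key], bool)
        for key, kind in EXPECTED.items()
    )


def items_ok(items):
    # oneOf: exactly one of the two alternatives must match each item.
    return isinstance(items, list) and all(
        is_empty_object(item) != is_full_object(item) for item in items
    )
```

Because the two alternatives cannot both match the same object, anyOf would behave the same way here; oneOf simply states the intent more directly.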

Using multiple anyOf inside oneOf

I want to create a schema with multiple objects inside "oneOf", each of which contains several objects in anyOf format, where some of the keys can be required (this part works).
My schema:
{
"description": "schema v6",
"type": "object",
"oneOf": [
{
"properties": {
"Speed": {
"items": {
"anyOf": [
{
"$ref": "#/definitions/speed"
},
{
"$ref": "#/definitions/SituationType"
}
]
},
"required": [
"speed"
]
}
},
"additionalProperties": false
}
],
"definitions": {
"speed": {
"description": "Speed",
"type": "integer"
},
"SituationType": {
"type": "string",
"description": "Situation Type",
"enum": [
"Advice",
"Depend"
]
}
}
}
But when I validate against this schema, it accepts some incorrect values, such as:
{
"Speed": {
"speed": "ABC",//required
"SituationType1": "Advisory1" //optional but key needs to be correct
}
}
The correct instance I was expecting to pass is:
{
"Speed": {
"speed": "1",
"SituationType": "Advise"
}
}
First, you need to declare the schema version correctly, otherwise implementations may assume you're using the latest JSON Schema version (currently draft-7).
So, in your schema root, you need the following:
"$schema": "http://json-schema.org/draft-06/schema#",
Second, items is only applicable if the target is an array.
Currently your schema only checks the following:
If the root object has a property of "Speed", it must have a key of
"speed". The root object must not have any other properties.
And nothing else.
Your use of definitions and how you reference them is probably not what you intended.
It looks like you want Speed to contain speed, which must be an integer, and optionally SituationType, which must be a string limited by an enum, and nothing else.
Here's the schema I have based on that, which passes and fails correctly based on your given example data:
{
"$schema": "http://json-schema.org/draft-06/schema#",
"type": "object",
"oneOf": [
{
"properties": {
"Speed": {
"properties":{
"speed": {
"$ref": "#/definitions/speed"
},
"SituationType": {
"$ref": "#/definitions/SituationType"
}
},
"required": [
"speed"
],
"additionalProperties": false
}
},
"additionalProperties": false
}
],
"definitions": {
"speed": {
"description": "Speed",
"type": "integer"
},
"SituationType": {
"type": "string",
"description": "Situation Type",
"enum": [
"Advice",
"Depend"
]
}
}
}
You need to define the properties for Speed, because otherwise you can't prevent additional properties: additionalProperties is only affected by the properties and patternProperties keywords adjacent to it. We are looking to create a new keyword in draft-8 to support this kind of behaviour, but it doesn't look like you need it in your example (there is a huge GitHub issue about it).
Adding additionalProperties false to the Speed schema now prevents other keys in that object.
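To make the effect of the corrected schema concrete, here is a small Python sketch that applies the same rules by hand. It is not a real validator; `is_integer`, `speed_object_ok`, and `root_ok` are hypothetical names, and the enum values are copied from the definitions above.

```python
SITUATION_TYPES = ["Advice", "Depend"]


def is_integer(value):
    # JSON Schema "integer": a number with no fractional part (booleans excluded).
    if isinstance(value, bool):
        return False
    return isinstance(value, int) or (isinstance(value, float) and value.is_integer())


def speed_object_ok(value):
    """Mirror the subschema under properties/Speed."""
    if not isinstance(value, dict):
        return True  # the subschema declares no "type", so non-objects pass
    if "speed" not in value:
        return False  # "required": ["speed"]
    if set(value) - {"speed", "SituationType"}:
        return False  # "additionalProperties": false
    if not is_integer(value["speed"]):
        return False  # "type": "integer"
    if "SituationType" in value:
        situation = value["SituationType"]
        if not isinstance(situation, str) or situation not in SITUATION_TYPES:
            return False  # "type": "string" plus "enum"
    return True


def root_ok(instance):
    """Mirror the root: an object whose only allowed property is "Speed"."""
    if not isinstance(instance, dict) or set(instance) - {"Speed"}:
        return False
    return "Speed" not in instance or speed_object_ok(instance["Speed"])
```

Under these rules a string such as "1" is not an integer and "Advise" is not one of the enum values, so an instance using either would be rejected.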
I SUSPECT that given your question title, there may be more schema at play here, and you've simplified it for this question. If you have a more detailed schema with more complex issues, I'd be happy to help also.

How to tell JSON schema validator to pick schema from property value?

For example, consider a schema for a file system in which a directory contains a list of files. The schema consists of the specification of a file, followed by a subtype "image" and another one, "text".
At the bottom there is the main directory schema. Directory has a property content which is an array of items that should be sub types of file.
Basically what I am looking for is a way to tell the validator to look up the value of a "$ref" from a property in the json object being validated.
Example json:
{
"name":"A directory",
"content":[
{
"fileType":"http://x.y.z/fs-schema.json#definitions/image",
"name":"an-image.png",
"width":1024,
"height":800
},
{
"fileType":"http://x.y.z/fs-schema.json#definitions/text",
"name":"readme.txt",
"lineCount":101
},
{
"fileType":"http://x.y.z/extended-fs-schema-video.json",
"name":"demo.mp4",
"hd":true
}
]
}
The "pseudo" schema. Note that the "image" and "text" definitions are included in the same schema, but they might be defined elsewhere:
{
"id": "http://x.y.z/fs-schema.json",
"definitions": {
"file": {
"type": "object",
"properties": {
"name": { "type": "string" },
"fileType": {
"type": "string",
"format": "uri"
}
}
},
"image": {
"allOf": [
{ "$ref": "#/definitions/file" },
{
"properties": {
"width": { "type": "integer" },
"height": { "type": "integer"}
}
}
]
},
"text": {
"allOf": [
{ "$ref": "#/definitions/file" },
{ "properties": { "lineCount": { "type": "integer"}}}
]
}
},
"type": "object",
"properties": {
"name": { "type": "string"},
"content": {
"type": "array",
"items": {
"allOf": [
{ "$ref": "#/definitions/file" },
{ *"$refFromProperty"*: "fileType" } // the magic thing
]
}
}
}
}
The validation parts of JSON Schema alone cannot do this - it represents a fixed structure. What you want requires resolving/referencing schemas at validation-time.
However, you can express this using JSON Hyper-Schema, and a rel="describedby" link:
{
"title": "Directory entry",
"type": "object",
"properties": {
"fileType": {"type": "string", "format": "uri"}
},
"links": [{
"rel": "describedby",
"href": "{+fileType}"
}]
}
So here, it takes the value from "fileType" and uses it to calculate a link with relation "describedby" - which means "the schema at this location also describes the current data".
The problem is that most validators do not take any notice of any links (including "describedby" ones). You need to find a "hyper-validator" that does.
UPDATE: the tv4 library has added this as a feature
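For illustration only, here is a minimal Python sketch of what a validation-time lookup driven by the "fileType" value could look like. It does not use JSON Hyper-Schema or any real validator, and it does not fetch anything over the network; the `CHECKERS` registry and the function names are assumptions made for this example, with the "image" and "text" shapes taken from the question.

```python
def is_int(value):
    return isinstance(value, int) and not isinstance(value, bool)


def check_image(entry):
    # width and height are optional in the question's schema, but must be integers.
    return all(is_int(entry[key]) for key in ("width", "height") if key in entry)


def check_text(entry):
    return "lineCount" not in entry or is_int(entry["lineCount"])


# Hypothetical registry: maps a fileType value to the check that describes it.
CHECKERS = {
    "http://x.y.z/fs-schema.json#definitions/image": check_image,
    "http://x.y.z/fs-schema.json#definitions/text": check_text,
}


def check_entry(entry):
    if not isinstance(entry, dict):
        return False
    if "name" in entry and not isinstance(entry["name"], str):
        return False
    file_type = entry.get("fileType")
    if not isinstance(file_type, str):
        return False
    checker = CHECKERS.get(file_type)
    if checker is None:
        return False  # unknown type: this sketch rejects instead of resolving it
    return checker(entry)


def check_directory(directory):
    content = directory.get("content", [])
    return isinstance(content, list) and all(check_entry(e) for e in content)
```

A validator that honours "describedby" links would resolve the schema from the link instead of consulting a fixed table like this one.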
I think cloudfeet's answer is a valid solution. You could also use the same approach described here.
You would have a file object type which could be "anyOf" all the subtypes you want to define. You would use an enum in order to be able to reference and validate against each of the subtypes.
If the subtype schemas are in the same JSON Schema file, you don't need to reference the URI explicitly with "$ref". A correct draft4 validator will find the enum value and will try to validate against that "subschema" in the JSON Schema tree.
In draft5 (in progress) a "switch" statement has been proposed, which will allow alternatives to be expressed in a more explicit way.

support for std::map< std::string, T > in json schema

Is there a standard approach to specifying a property to be a dictionary or map keyed by string with a value type T specified somewhere else in the schema?
For example, suppose you want to model a user's favorite movies where the key type is the name of the movie and the value type is some set of attributes about the movie (year made, budget, gross income, etc.)
I imagine you could model first a MovieDataPair as a type with name property and a value property containing the desired attributes. Then the map would be an array of those. But, then you would need a special unique constraint that ensured any movie name only appeared once.
Is there something in json schema to support this, or a standard pattern used for it?
If there is no built-in support in JSON Schema, what about other schema solutions?
After some study I've come up with the following answer:
The best way to see this in action is to find some examples. It
happens that there are several examples of this in the draft04 schema
itself (definitions, properties, patternProperties,...) and they
usually follow the same pattern.
For example, the definitions property of the draft04 schema defines what
should appear in a schema at the definitions property. Here is the
subschema associated with the definitions property:
"definitions": {
"type": "object",
"additionalProperties": { "$ref": "#" },
"default": {}
},
This says the entry at "#/definitions/" must be an object. The fact
that it is a json object means it will have unique keys itself. Now
for the values in the object, that is what additionalProperties is
designed to describe. In this case it says that the value of each
property must itself conform to the root of the schema "#". What this
means is that each value in the definitions property object of a valid json schema
object must also be a schema.
If this were typed like C++ it might look like:
std::map< std::string, Schema > definitions;
Effectively a map with a string key can be thought of as like a json
object with a structured value type. So, to create your own:
std::map< std::string, T >
First define the schema for T. For example:
"definitions" : {
"movie" : {
"properties": {
"title" : { "type" : "string" },
"year_made" : { "type" : "integer" },
"rating" : { "type" : "integer" }
}
}
}
For the value type T stored, decide if you want to allow any
properties, as long as these specified properties are typed as
specified above. If you only want these properties, add
"additionalProperties" : false
"definitions" : {
"movie" : {
"additionalProperties" : false,
"properties": {
"title" : { "type" : "string" },
"year_made" : { "type" : "integer" },
"rating" : { "type" : "integer" }
}
}
}
Also decide if you actually require all of the properties to be
present for the movie to be valid. If so, add a required entry.
"definitions" : {
"movie" : {
"additionalProperties": false,
"required" : [ "title", "year_made", "rating" ],
"properties": {
"title" : { "type" : "string" },
"year_made" : { "type" : "integer" },
"rating" : { "type" : "integer" }
}
}
}
Now the shape T for movie is defined. Create a definition for
the collection, or map of movies referencing the movie schema
defined as was done by definitions in the draft schema. Note: in
the "movie_map" additionalProperties has a different meaning than
that of "movie". In the case of "movie" it is a boolean false
which indicates no additional properties beyond what is listed in
properties. In the case of "movie_map" it means - if there are
additional properties, they must look like this schema. But,
since no properties have been specified in movie_map it really means
all properties in the object instance must conform to #/definitions/movie. Now all
values in a "movie_map" will look like the defined movie schema.
{
"definitions" : {
"movie" : {
"additionalProperties": false,
"required" : [ "title", "year_made", "rating" ],
"properties": {
"title" : { "type" : "string" },
"year_made" : { "type" : "integer" },
"rating" : { "type" : "integer" }
}
},
"movie_map" : {
"type": "object",
"additionalProperties": { "$ref": "#/definitions/movie" },
"default": {}
}
}
}
Now use the defined schema movie_map somewhere within the schema:
{
"title" : "movie data",
"additionalProperties" : false,
"required" : [ "movies" ],
"properties" : {
"movies" : { "$ref" : "#/definitions/movie_map" }
},
"definitions" : {
"movie" : {
"additionalProperties": false,
"required" : [ "title", "year_made", "rating" ],
"properties": {
"title" : { "type" : "string" },
"year_made" : { "type" : "integer" },
"rating" : { "type" : "integer" }
}
},
"movie_map" : {
"type": "object",
"additionalProperties": { "$ref": "#/definitions/movie" },
"default": {}
}
}
}
Here is a sample object, which can be thought of as a map, of movies
that validates against the schema:
{
"movies" : {
"the mission" : {
"title":"The Mission",
"year_made":1986,
"rating":5
},
"troll 2" : {
"title":"Troll 2",
"year_made":1990,
"rating":2
}
}
}
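The same map-of-T idea can be sketched by hand in Python, which may help to see what the schema is checking. This is an illustration, not a real validator: `is_movie` and `is_movie_map` are hypothetical names, and the sketch assumes a movie must be an object, which the schema above would only enforce if "type": "object" were added to the movie definition.

```python
# Property names and types copied from the "movie" definition.
MOVIE_FIELDS = {"title": str, "year_made": int, "rating": int}


def is_movie(value):
    # Assumption: a movie must be an object (see the note above).
    if not isinstance(value, dict):
        return False
    # "required" lists every property and "additionalProperties" is false,
    # so the key set has to match exactly.
    if set(value) != set(MOVIE_FIELDS):
        return False
    return all(
        isinstance(value[key], kind) and not isinstance(value[key], bool)
        for key, kind in MOVIE_FIELDS.items()
    )


def is_movie_map(value):
    # Any string key is allowed; every value must look like a movie.
    # Key uniqueness comes for free from the object itself.
    return isinstance(value, dict) and all(is_movie(m) for m in value.values())


movies = {
    "the mission": {"title": "The Mission", "year_made": 1986, "rating": 5},
    "troll 2": {"title": "Troll 2", "year_made": 1990, "rating": 2},
}
print(is_movie_map(movies))
```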
If I wanted to model a structure for a user's favorite movies (remember that JSON Schema is intended for structure validation), I would write something like:
{
"description":"moviesFan",
"properties": {
"favoriteMovies": {
"type":"array",
"uniqueItems":true,
"items": { "$ref": "#/definitions/movie" }
}
},
"definitions": {
"movie": {
"type": "object",
"properties": {
"yearMade": {}
...
}
}
}
}
Does it make sense to you?
Here's my way to support a map. Hope it helps.
{
"type": "object",
"title": "map data",
"required": [
"map"
],
"properties": {
"sOnePurRecord": {
"title": "map",
"additionalProperties": false,
"properties": {
"mapItem": {
"type": "object",
"maxProperties": 10,
"minProperties": 1,
"patternProperties": {
"^[a-zA-Z0-9]{5,20}$": {
"$ref": "#/definitions/value"
}
},
"additionalProperties": {
"$ref": "#/definitions/value"
}
}
},
"required": [
"mapItem"
]
}
},
"definitions": {
"value": {
"type": "object",
"properties": {
"name": {
"type": "string"
},
"id": {
"type": "integer"
}
}
}
}
}