Create MongoDB collection with standard JSON schema - json

I want to create a MongoDB collection using an JSON schema file.
Suppose the JSON file address.schema.json contain address information schema (this file is one of the Json-schema.org's examples):
{
"$id": "https://example.com/address.schema.json",
"$schema": "http://json-schema.org/draft-07/schema#",
"description": "An address similar to http://microformats.org/wiki/h-card",
"type": "object",
"properties": {
"post-office-box": {
"type": "string"
},
"extended-address": {
"type": "string"
},
"street-address": {
"type": "string"
},
"locality": {
"type": "string"
},
"region": {
"type": "string"
},
"postal-code": {
"type": "string"
},
"country-name": {
"type": "string"
}
},
"required": [ "locality", "region", "country-name" ],
"dependencies": {
"post-office-box": [ "street-address" ],
"extended-address": [ "street-address" ]
}
}
What is the MongoDB command, such as mongoimport or db.createCollection to create a MongoDB collection using the above schema?
It can be nice if I can use the file directly in MongoDB with the need of changing the file format manually.
I wonder if the JSON schema format is a standard one, why do I need to change it to adopt it for MongoDB. Why MongoDB does not have this functionality built-in?

You can create it via createCollection command or shell helper:
db.createCollection(
"mycollection",
{validator:{$jsonSchema:{
"description": "An address similar to http://microformats.org/wiki/h-card",
"type": "object",
"properties": {
"post-office-box": { "type": "string" },
"extended-address": { "type": "string" },
"street-address": { "type": "string" },
"locality": { "type": "string" },
"region": { "type": "string" },
"postal-code": { "type": "string" },
"country-name": { "type": "string" }
},
"required": [ "locality", "region", "country-name" ],
"dependencies": {
"post-office-box": [ "street-address" ],
"extended-address": [ "street-address" ]
}
}}})
You only need to specify bsonType instead of type if you want to use a type that exists in bson but not in generic json schema. You do have to remove the lines $id and $schema as those are not supported by MongoDB JSON schema support (documented here)

The only option which you can use is adding jsonSchema validator during collection creating: https://docs.mongodb.com/manual/reference/operator/query/jsonSchema/#document-validator. It will mean that any document which you will insert/update in your collection will have to match with the provided schema

Related

Unique value for a property validation using json schema

I have a JSON object like:
{
"result": [
{
"name" : "abc",
"email": "abc.test#mail.com"
},
{
"name": "def",
"email": "def.test#mail.com"
},
{
"name": "xyz",
"email": "abc.test#mail.com"
}
]
}
and schema for this:
{
"definitions": {},
"$schema": "http://json-schema.org/draft-07/schema#",
"$id": "https://example.com/object1607582431.json",
"title": "Root",
"type": "object",
"required": [
"result"
],
"properties": {
"result": {
"$id": "#root/result",
"title": "Result",
"type": "array",
"default": [],
"uniqueItems": true,
"items": {
"$id": "#root/result/items",
"title": "Items",
"type": "object",
"required": [
"name",
"email"
],
"properties": {
"name": {
"$id": "#root/result/items/name",
"title": "Name",
"type": "string"
},
"email": {
"$id": "#root/result/items/email",
"title": "Email",
"type": "string"
}
}
}
}
}
}
I am looking for an option to check uniqueness for email irrespective of name. How I can validate that every email should be unique?
You can't. There are no keywords that let you compare one particular data value against another, other than uniqueItems, which compares an array element in toto against another.
The JsonSchema specification does not currently support this.
You can see the active GitHub issue here: https://github.com/json-schema-org/json-schema-vocabularies/issues/22
However, there are various extensions of JsonSchema that do validate unique fields within lists of objects.
If you happen to be using Python you can use the package (I created) JsonVL. It can be installed with pip install jsonvl and then run with jsonvl data.json schema.json.
Code examples in the GitHub repo: https://github.com/gregorybchris/jsonvl

Azure logic apps Parse Json throws an error

Background
I have a custom connector which returns a JSON response .I am trying to parse the response to JSON since i want to use the response later in other flows. So that I am using Parse JSON Action from Data operations connector. Following is the JSON response and the JSON schema i provided to Parse JSON.
Response
[
[
{
"key":"Customer_Key",
"value":{
"id":"abfa48ad-392d-e511-80d3-005056b34214",
"name":"90033"
}
},
{
"key":"Status",
"value":"Done"
}
]
]
Schema
{
"type": "array",
"items": {
"type": "array",
"items": {
"type": "object",
"properties": {
"key": {
"type": "string"
},
"value": {
"type": "object",
"properties": {
"id": {
"type": "string"
},
"name": {
"type": "string"
}
}
}
},
"required": [
"key",
"value"
]
}
}
}
Exception
{
"message": "Invalid type. Expected Object but got String.",
"lineNumber": 0,
"linePosition": 0,
"path": "[0][2].value",
"value": "90033",
"schemaId": "#/items/items/properties/value",
"errorType": "type",
"childErrors": []
},
Any one knows what is the issue on this ?How we can I convert above json response
Looks like the Use sample payload to generate schema couldn't generate right schema.
So you could go to this liquid studio site and paste the JSON payload then click the Generate Schema button, then you will get the Json Schema.
And I test the Schema, it worked perfectly.
Hope this could help you, if you still have other questions,please let me know.
Schema looks incorrect. try with the below schema:
{
"type": "array",
"items": [
{
"type": "array",
"items": [
{
"type": "object",
"properties": {
"key": {
"type": "string"
},
"value": {
"type": "object",
"properties": {
"id": {
"type": "string"
},
"name": {
"type": "string"
}
},
"required": [
"id",
"name"
]
}
},
"required": [
"key",
"value"
]
},
{
"type": "object",
"properties": {
"key": {
"type": "string"
},
"value": {
"type": "string"
}
},
"required": [
"key",
"value"
]
}
]
}
]
}

JSON Schema reporting error only for first element of array

I have the below JSON document.
[
{
"name": "aaaa",
"data": {
"key": "id",
"value": "aaaa"
}
},
{
"name": "bbbb",
"data": {
"key": "id1",
"value": "bbbb"
}
}
]
Below is the JSON Schema I have created for the above content.
{
"$schema": "http://json-schema.org/draft-04/schema#",
"type": "array",
"items": [
{
"type": "object",
"properties": {
"name": {
"type": "string"
},
"data": {
"type": "object",
"properties": {
"key": {
"type": "string",
"enum": [
"id",
"temp",
]
},
"value": {
"type": "string",
}
},
"required": [
"key",
"value"
]
}
},
"required": [
"name",
"data"
]
}
]
}
As per the schema, the value of data.key is invalid for second item in the array. but any online schema validator does not find that. If we use different value in first array element, it throws the excepted error.
I assume that my schema is wrong somehow. what I expect is that any child items of the array should be reported if they have values out of the enum list.
It's an easy mistake to make, so don't beat yourself up about this one!
items can be an array or an object. If it's an array, it validates the object at that position in the instance array. Here's an excerpt from the JSON Schema spec (draft-7)
The value of "items" MUST be either a valid JSON Schema or an array of
valid JSON Schemas.
If "items" is a schema, validation succeeds if all elements in the
array successfully validate against that schema.
If "items" is an array of schemas, validation succeeds if each element
of the instance validates against the schema at the same position, if
any.
JSON Schema (validation) draft-7 items
Removing the square braces provides you with the correct schema...
{
"$schema": "http://json-schema.org/draft-04/schema#",
"type": "array",
"items":
{
"type": "object",
"properties": {
"name": {
"type": "string"
},
"data": {
"type": "object",
"properties": {
"key": {
"type": "string",
"enum": [
"id",
"temp",
]
},
"value": {
"type": "string",
}
},
"required": [
"key",
"value"
]
}
},
"required": [
"name",
"data"
]
}
}

JSON Schema - Foreign Key (FK) Validation

I have the JSON schemas below:
Schema A:
{
"$schema": "http://json-schema.org/draft-04/hyper-schema#",
"title": "A",
"type": "object",
"properties": {
"id": {
"type": "string"
},
"name": {
"type": "string"
},
"entityBId": {
"type": "string"
}
},
"required": [
"name",
"entityBId"
],
"links": [
{
"rel": "B",
"href": "myapi/EntityB?id={entityBId}"
}
]
}
Schema B :
{
"$schema": "http://json-schema.org/draft-04/schema#",
"title": "B",
"type": "object",
"properties": {
"id": {
"type": "string"
},
"name": {
"type": "string"
}
},
"required": [
"name"
]
}
I'm trying to figure out if there is a way to run something like a integrity check based in a JSON Schema with external links/references.
For instance: When I receive an Object A with a entityBId = 1, I'd like to fetch this entity B running a GET in the endpoint declared in the links href and check if there is a valid object from the received id.
It will run like a deep validation and can be useful in a scenario without a defined DB schema.
As suggested by Raj Kamal, I made my own "link validation".
There is a function that looks for link attribute on schemas and validate the externals references directly on database and throw an exception if a valid reference is not found.

Schema definition generated online Schema-Generator is not accepted by BigQuery while using Load Table API

Any relevant help will be appreciated.
I have several different JSON docs whic need to be inserted into BigQuery. Now to avoid generating schema manually, I am using the help of online Json Schema Generation tools available. But the schema generated by them are not being accepted by BigQuery Load Data wizard.
For eaxmple: for a Json data like this:
{"_id":100,"actor":"KK","message":"CCD is good in Pune",
"comment":[{"actor":"Subho","message":"CCD is not as good in Kolkata."},
{"actor":"bisu","message":"CCD is costly too in Kolkata"}]
}
the generated schema by online tool is:
{
"$schema": "http://json-schema.org/draft-04/schema#",
"description": "Generated from c:jsonccd.json with shasum a003286a350a6889b152
b3e33afc5458f3771e9c",
"type": "object",
"required": [
"_id",
"actor",
"message",
"comment"
],
"properties": {
"_id": {
"type": "integer"
},
"actor": {
"type": "string"
},
"message": {
"type": "string"
},
"comment": {
"type": "array",
"minItems": 1,
"uniqueItems": true,
"items": {
"type": "object",
"required": [
"actor",
"message"
],
"properties": {
"actor": {
"type": "string"
},
"message": {
"type": "string"
}
}
}
}
}
}
But when I put it into BigQuery in the Load Data wizard, it fails with errors.
How can this be mitigated?
Thanks.
The schema generated by that tool is way more complex than what BigQuery requires.
Look at the sample in the docs:
"schema": {
"fields": [
{"name":"f1", "type":"STRING"},
{"name":"f2", "type":"INTEGER"}
]
},
https://developers.google.com/bigquery/loading-data-into-bigquery?hl=en#loaddatapostrequest
Meanwhile the tool mentioned in the question adds fields like $schema, description, type, required, properties that are not necessary and confusing to the BigQuery schema parser.