JSON Schema for a complex JSON? - json

I have a JSON document that uses an object of objects in place of an array, so that the data can easily be looked up by key.
How can I validate this against a schema without hardcoding the keys into the schema? Should I convert the object into an array before validating?
Example JSON:
{
"item_value": {
"id": 123,
"name": "Item Value",
"colors": {
"red_value": {
"id": 1231,
"name": "Red Value"
},
"blue_value": {
"id": 1231,
"name": "Blue Value"
}
}
},
"another_item_value": {
"id": 133,
"name": "Another Item Value",
"colors": {
"red_value_xyz": {
"id": 1331,
"name": "Red Value Xyz"
},
"blue_value_bar": {
"id": 1331,
"name": "Blue Value Bar"
}
}
}
}

You can use patternProperties to validate properties whose names match a regular expression or, if all of your properties follow the same subschema, just use additionalProperties with that subschema.
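As a rough illustration of the additionalProperties approach, the sketch below writes a schema for the example JSON as Python dictionaries. The names ROOT_SCHEMA, ITEM_SCHEMA, COLOR_SCHEMA and the small validate function are made up for this example; the function understands only the keywords used here and is a stand-in for a real JSON Schema validator, not a replacement for one.

```python
# Subschema shared by every colour entry, whatever its key happens to be.
COLOR_SCHEMA = {
    "type": "object",
    "properties": {"id": {"type": "integer"}, "name": {"type": "string"}},
    "required": ["id", "name"],
}

# Subschema shared by every top-level item.
ITEM_SCHEMA = {
    "type": "object",
    "properties": {
        "id": {"type": "integer"},
        "name": {"type": "string"},
        # No colour key is hardcoded: each value must match COLOR_SCHEMA.
        "colors": {"type": "object", "additionalProperties": COLOR_SCHEMA},
    },
    "required": ["id", "name", "colors"],
}

# No item key is hardcoded either.
ROOT_SCHEMA = {"type": "object", "additionalProperties": ITEM_SCHEMA}

_TYPES = {"object": dict, "string": str}


def _has_type(value, name):
    if name == "integer":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, int) and not isinstance(value, bool)
    return isinstance(value, _TYPES[name])


def validate(instance, schema, path="$"):
    """Return a list of error strings (simplified: only the keywords above)."""
    expected = schema.get("type")
    if expected is not None and not _has_type(instance, expected):
        return [f"{path}: expected {expected}"]
    if not isinstance(instance, dict):
        return []
    errors = []
    for key in schema.get("required", []):
        if key not in instance:
            errors.append(f"{path}: missing required property '{key}'")
    props = schema.get("properties", {})
    extra = schema.get("additionalProperties")
    for key, value in instance.items():
        if key in props:
            errors.extend(validate(value, props[key], f"{path}.{key}"))
        elif isinstance(extra, dict):
            # Any key not listed in "properties" is checked against the
            # additionalProperties subschema.
            errors.extend(validate(value, extra, f"{path}.{key}"))
    return errors
```

Calling validate(document, ROOT_SCHEMA) on the example JSON should return an empty list, since every item and colour follows the shared subschema regardless of its key.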

Related

How to get JSON data from attributes that start with # in TypeScript

I am working with a specific API that returns a JSON that looks like the below sample.
I want to get both values that contain the #text and #attr, but I get error messages in TypeScript when I try to get the values.
Try using:
album[0]["#attr"]
album[0]["artist"]["#text"]
In JSON you can read a value by its key name using bracket notation, and this works the same way whether or not the key starts with #.
See the code below to get the value of your specified key:
Sample JSON:
{
"weeklyalbumchart": {
"album": [
{
"artist": {
"mbid": "data",
"#text": "Flying Lotus"
},
"mbid": "data",
"url": "",
"name": "",
"#attr": {
"rank": "1"
},
"playcount": "21"
},
{
"artist": {
"mbid": "data",
"#text": "Flying Lotus"
},
"mbid": "data",
"url": "",
"name": "",
"#attr": {
"rank": "1"
},
"playcount": "21"
}
]
}
}
Read JSON:
#attr ===> json["weeklyalbumchart"]["album"][0]["#attr"]
#text ===> json["weeklyalbumchart"]["album"][0]["artist"]["#text"]
Hope this will help you to understand it.
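For comparison, here is the same lookup sketched in Python rather than TypeScript, where dictionary subscripts play the role of bracket notation. The variable names are made up for the example, and the sample is trimmed to a single album.

```python
import json

# Trimmed version of the sample JSON above.
raw = """
{
  "weeklyalbumchart": {
    "album": [
      {
        "artist": {"mbid": "data", "#text": "Flying Lotus"},
        "#attr": {"rank": "1"},
        "playcount": "21"
      }
    ]
  }
}
"""

data = json.loads(raw)
first_album = data["weeklyalbumchart"]["album"][0]

# Keys starting with "#" are ordinary strings, so a subscript reaches them.
rank = first_album["#attr"]["rank"]
artist = first_album["artist"]["#text"]
```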

Converting nested JSON into CSV by adding the nested objects to a single column in Apache NiFi

I have a nested object JSON structure as given below:
{
"Bikes":[
{
"Name":"KTM",
"Model":"2017",
"Colour":"Yellow"
},
{
"Name":"Yamaha",
"Model":"2020",
"Colour":"Black"
}
],
"Cars":[
{
"Name":"BMW",
"Model":"2017",
"Colour":"Yellow"
},
{
"Name":"Audi",
"Model":"2020",
"Colour":"Black"
}
]
}
My output CSV should look like this:
Bikes (Column 1)
{
"Name":"KTM",
"Model":"2017",
"Colour":"Yellow"
}
{
"Name":"Yamaha",
"Model":"2020",
"Colour":"Black"
}
Cars (Column 2)
{
"Name":"BMW",
"Model":"2017",
"Colour":"Yellow"
}
{
"Name":"Audi",
"Model":"2020",
"Colour":"Black"
}
I need to store the entire Bikes object in a single column, and likewise Cars in a single column.
I am currently using a ConvertRecord processor to convert from JSON to CSV. My Avro schema for both JSON and CSV looks like this:
{
"name": "Sydney",
"type": "record",
"namespace": "sydney",
"fields": [
{
"name": "Bikes",
"type": {
"type": "array",
"items": {
"name": "Vehicle",
"type": "record",
"fields": [
{
"name": "Name",
"type": "string"
},
{
"name": "Model",
"type": "string"
},
{
"name": "Colour",
"type": "string"
}
]
}
}
},
{
"name": "Cars",
"type": {
"type": "array",
"items": {
"name": "Vehicle",
"type": "record",
"fields": [
{
"name": "Name",
"type": "string"
},
{
"name": "Model",
"type": "string"
},
{
"name": "Colour",
"type": "string"
}
]
}
}
}
]
}
But the ConvertRecord processor gives me this error:
ConvertRecord[id=4d909c18-0177-1000-c1cd-9456a1775358] Failed to process StandardFlowFileRecord[uuid=63d04fc1-7edd-405b-8a9d-000bcdaa3d6c,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1611930760172-32, container=default, section=32], offset=744686, length=4],offset=0,name=test.json,size=4]; will route to failure: IOException thrown from ConvertRecord[id=4d909c18-0177-1000-c1cd-9456a1775358]: org.codehaus.jackson.JsonParseException: Unexpected character ('T' (code 84)): expected a valid value (number, String, array, object, 'true', 'false' or 'null')
at [Source: org.apache.nifi.stream.io.NonCloseableInputStream#5b8393b6; line: 1, column: 2]
Could anyone help me out on this?
You probably want a Jolt transform for this, which NiFi provides through its own processor (JoltTransformJSON). Restructuring a simple nested JSON like this is one of Jolt's documented examples.
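To make the target shape concrete, here is a minimal Python sketch, outside NiFi and not a Jolt spec, that writes one CSV column per top-level key and stores each nested array as a JSON string in a single cell. The function name arrays_to_single_row_csv is made up for this example.

```python
import csv
import io
import json


def arrays_to_single_row_csv(document):
    """Return CSV text with one column per top-level key of `document`.

    Each nested value is serialized back to JSON so that it fits in one cell.
    """
    columns = list(document)
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(columns)  # header row, e.g. Bikes,Cars
    writer.writerow([json.dumps(document[name]) for name in columns])
    return buffer.getvalue()
```

The csv module quotes each cell and doubles the embedded quotation marks, so the JSON text survives a round trip through a CSV reader.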

NiFi: manipulate all JSON keys from InvokeHTTP

I have a JSON that comes from InvokeHTTP. I used SplitJson and a Jolt transform to get the keys and values, but I need to change all the keys from camelCase to snake_case.
My keys will be different with every InvokeHTTP call. I've tried AttributesToJSON, EvaluateJsonPath and some ReplaceText, but haven't figured out a way to dynamically change just the keys and then merge them back with the values without writing a custom processor.
Original Data from InvokeHTTP:
{
"data": {
"Table": [
{
"Age": 51,
"FirstName": "Bob",
"LastName": "Doe"
},
{
"Age": 26,
"FirstName": "Ryan",
"LastName": "Doe"
}
]
}
}
Input after SplitJson (which gives me each JSON in a separate flowfile) and Jolt:
[
{
"Key": "Age",
"Value": 51
},
{
"Key": "FirstName",
"Value": "Bob"
},
{
"Key": "LastName",
"Value": "Doe"
}
]
Desired Output:
{
"data": {
"Table": [
{
"age": 51,
"first_name": "Bob",
"last_name": "Doe"
},
{
"age": 26,
"first_name": "Ryan",
"last_name": "Doe"
}
]
}
}
If you know the fields, you can use JoltTransformJSON on the original input JSON so you don't have to use SplitJson. Here's a spec that will do the (explicit) field name conversion:
[
{
"operation": "shift",
"spec": {
"data": {
"Table": {
"*": {
"Age": "data.Table[#2].age",
"FirstName": "data.Table[#2].first_name",
"LastName": "data.Table[#2].last_name"
}
}
}
}
}
]
You could also use UpdateRecord; you'd just need separate schemas for the JsonTreeReader and JsonRecordSetWriter.
I wrote an answer to a similar question that uses ReplaceText to replace . in JSON keys with _. The same logic could be applied here (template available in link). As I pointed out in that answer, a cleaner solution would be to use ExecuteScript, especially since converting camelCase to snake_case is easy in most scripting languages.
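As a sketch of the scripting route, the key conversion itself could look like the Python below. It covers only the renaming logic, not the ExecuteScript wiring (reading and writing the flowfile), and the function names are made up for this example. It assumes the records live under data.Table as in the question and leaves those two wrapper keys untouched, matching the desired output.

```python
import re


def to_snake_case(name):
    # Insert "_" before every capital letter except a leading one, then
    # lowercase. Caveat: runs of capitals are split too ("ID" -> "i_d").
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def rename_record_keys(document):
    """Return a copy of `document` with the keys of each data.Table record
    converted to snake_case; the keys need not be known in advance."""
    rows = [
        {to_snake_case(key): value for key, value in row.items()}
        for row in document["data"]["Table"]
    ]
    return {"data": {"Table": rows}}
```

Because the keys are discovered at run time, the same function works when each InvokeHTTP response carries different field names.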

Using jsonschema to validate that a key has a unique value within an array of objects?

How do I validate JSON, with jsonschema, that within an array of objects, a specific key in each object must be unique? For example, validating the uniqueness of each Name k-v pair should fail:
"test_array": [
{
"Name": "name1",
"Description": "unique_desc_1"
},
{
"Name": "name1",
"Description": "unique_desc_2"
}
]
Using uniqueItems on test_array won't work because the Description values differ, so the objects as a whole are still unique.
I found an alternative: use a schema that allows arbitrary property names, and convert the array of objects with the key "Name" into an object whose property names are the Name values. The one caveat is that JSON text may still contain duplicate object keys, and with most parsers a later duplicate silently overrides the earlier one instead of being rejected.
For example, the following JSON:
"test_object": {
"name1": {
"Desc": "Description 1"
},
"name2": {
"Desc": "Description 2"
}
}
would have the following schema:
{
"type": "object",
"properties": {
"test_object": {
"type": "object",
"patternProperties": {
"^.*$": {
"type": "object",
"properties": {
"Desc": {"type" : "string"}
},
"required": ["Desc"]
}
},
"minProperties": 1,
"additionalProperties": false
}
},
"required": ["test_object"]
}
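The conversion and the duplicate-key caveat described above can be sketched in Python as follows. Both function names are made up for this example; json.loads and its object_pairs_hook parameter are standard library features.

```python
import json


def index_by_key(items, key="Name"):
    """Turn a list of objects into an object keyed by `key`,
    raising ValueError if two items share the same value."""
    indexed = {}
    for item in items:
        name = item[key]
        if name in indexed:
            raise ValueError(f"duplicate {key}: {name!r}")
        # Keep every field except the one that became the property name.
        indexed[name] = {k: v for k, v in item.items() if k != key}
    return indexed


def reject_duplicate_keys(pairs):
    """object_pairs_hook for json.loads: fail on repeated object keys
    instead of silently keeping the last one."""
    result = {}
    for key, value in pairs:
        if key in result:
            raise ValueError(f"duplicate object key: {key!r}")
        result[key] = value
    return result
```

Passing object_pairs_hook=reject_duplicate_keys to json.loads closes the gap the schema cannot see, because a plain parse would have discarded the earlier duplicate before validation ever ran.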

Use object property keys as enum in JSON schema

I'm trying to validate a JSON file using JSON Schema, in order to find cases of "broken references". Essentially my file consists of items and groups, with each item belonging to a single group referenced by the groups property key, like so:
{
"items": {
"banana": {
"name": "Banana",
"group": "fruits"
},
"apple": {
"name": "Apple",
"group": "fruits"
},
"carrot": {
"name": "Carrot",
"group": "vegetables"
},
"potato": {
"name": "Potato",
"group": "vegetables"
},
"cheese": {
"name": "Cheese",
"group": "dairy"
}
},
"groups": {
"fruits": {
"name": "Fruits"
},
"vegetables": {
"name": "Vegetables"
}
}
}
In the example above the item cheese is to be considered invalid, as there is no dairy property in the groups object. I've tried to validate this using the following schema:
{
"$schema": "http://json-schema.org/draft-06/schema#",
"title": "Food",
"id": "food",
"type": "object",
"properties": {
"items": {
"type": "object",
"patternProperties": {
"^[A-Za-z0-9-_.:=]+$": {
"properties": {
"name": {
"type": "string",
"pattern": "^[A-Za-z- ]+$"
},
"group": {
"pattern": "^[a-z]+$",
"enum": {
"$data": "/groups"
}
}
}
}
}
},
"groups": {
"type": "object",
"patternProperties": {
"^[A-Za-z0-9-_]+$": {
"properties": {
"name": {
"type": "string",
"pattern": "^[A-Za-z- ]+$"
}
}
}
}
}
},
"additionalProperties": false
}
This has the effect that the enum for group is populated by the property values in groups, but what I want to do is use the property keys defined in groups.
If I add a property like e.g. groupIds and let that be an array of all property keys found in groups and specify the enum as "$data": "/groupIds" it does work, so I take this to be a JSON pointer issue.
The enum keyword in JSON Schema is defined as:
The value of this keyword MUST be an array. This array SHOULD have at least one element. Elements in the array SHOULD be unique.
So if I could only get JSON pointer to reference an object's keys rather than its values I guess the enum validation would just work. I'm thinking something like "$data": "/groups/.keys", "$data": "/groups/$keys" or similar, but haven't found it while googling or reading the spec. Is there such a thing or has it ever been proposed?
There is no such thing. What you describe comes very close to general expressions inside JSON, which may have some use cases, but no specification defines it.
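Since the schema cannot reference the keys of groups, one fallback is to check the references in code, either instead of or in addition to schema validation. The Python sketch below assumes the document layout from the question; the function name find_broken_references is made up for this example.

```python
def find_broken_references(document):
    """Return the sorted ids of items whose "group" is not a key of "groups"."""
    known_groups = set(document.get("groups", {}))
    return sorted(
        item_id
        for item_id, item in document.get("items", {}).items()
        if item.get("group") not in known_groups
    )
```

The same set of keys could also be written into a helper property such as the groupIds array mentioned in the question before validating, which keeps the check inside the schema at the cost of a preprocessing step.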