How to flatten nested JSON objects that have different names? - json

Very beginner here.
I have a JSON file with nested objects with different object names: first, second, third.
I'd like to either flatten the nested objects, remove the object names, or rename all the object names to the same name.
Is it possible to do this using the JSON schema?
{
"first": {
"name": "test",
"age": 39
}
,
"second": {
"name": "test123",
"age": 39
}
,
"third": {
"name": "test456",
"age": 25,
"height": 159
}
}
This is the schema, where I'd like all object names to be "items", or at least removed:
{
"type": "object",
"properties": {
"items": {
"type": "object",
"properties": {
"name": {
"type": "string"
},
"age": {
"type": "integer"
},
"height": {
"type": "integer"
}
}
}
}
}
And this is what I'd like the output to be:
{
"items": {
"name": "test",
"age": 39
}
,
"items": {
"name": "test123",
"age": 39
}
,
"items": {
"name": "test456",
"age": 25,
"height": 159
}
}
Or to remove "items" / object name completely?
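JSON Schema describes and validates a document; it does not rewrite one, so the renaming has to happen in code. The desired output above also repeats the "items" name inside a single object, and most parsers keep only the last of the duplicates, so a list is the usual target shape. A minimal sketch in Python, using the sample data from the question (the function name flatten_named_objects is made up for illustration):

```python
import json

def flatten_named_objects(doc):
    """Drop the outer names (first, second, ...) and keep the inner objects as a list."""
    return {"items": list(doc.values())}

source = json.loads("""
{
  "first":  {"name": "test",    "age": 39},
  "second": {"name": "test123", "age": 39},
  "third":  {"name": "test456", "age": 25, "height": 159}
}
""")

result = flatten_named_objects(source)
print(json.dumps(result, indent=2))
```

A schema for that shape would declare "items" with "type": "array" and put the name/age/height object schema under the array's own "items" keyword.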

Related

Converting nested JSON into CSV by adding the nested objects to a single column in Apache NiFi

I have a nested object JSON structure as given below:
{
"Bikes":[
{
"Name":"KTM",
"Model":"2017",
"Colour":"Yellow"
},
{
"Name":"Yamaha",
"Model":"2020",
"Colour":"Black"
}
],
"Cars":[
{
"Name":"BMW",
"Model":"2017",
"Colour":"Yellow"
},
{
"Name":"Audi",
"Model":"2020",
"Colour":"Black"
}
]
}
My output CSV should look like this:
Bikes (Column 1)
{
"Name":"KTM",
"Model":"2017",
"Colour":"Yellow"
}
{
"Name":"Yamaha",
"Model":"2020",
"Colour":"Black"
}
Cars (Column 2)
{
"Name":"BMW",
"Model":"2017",
"Colour":"Yellow"
}
{
"Name":"Audi",
"Model":"2020",
"Colour":"Black"
}
I need to store the entire Bikes object in a single column, and likewise Cars in a single column.
I am currently using a ConvertRecord processor to convert from JSON to CSV. My Avro schema for both JSON and CSV looks like this:
{
"name": "Sydney",
"type": "record",
"namespace": "sydney",
"fields": [
{
"name": "Bikes",
"type": {
"type": "array",
"items": {
"name": "Vehicle",
"type": "record",
"fields": [
{
"name": "Name",
"type": "string"
},
{
"name": "Model",
"type": "string"
},
{
"name": "Colour",
"type": "string"
}
]
}
}
},
{
"name": "Cars",
"type": {
"type": "array",
"items": {
"name": "Vehicle",
"type": "record",
"fields": [
{
"name": "Name",
"type": "string"
},
{
"name": "Model",
"type": "string"
},
{
"name": "Colour",
"type": "string"
}
]
}
}
}
]
}
But in the ConvertRecord processor I am getting this error:
ConvertRecord[id=4d909c18-0177-1000-c1cd-9456a1775358] Failed to process StandardFlowFileRecord[uuid=63d04fc1-7edd-405b-8a9d-000bcdaa3d6c,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1611930760172-32, container=default, section=32], offset=744686, length=4],offset=0,name=test.json,size=4]; will route to failure: IOException thrown from ConvertRecord[id=4d909c18-0177-1000-c1cd-9456a1775358]: org.codehaus.jackson.JsonParseException: Unexpected character ('T' (code 84)): expected a valid value (number, String, array, object, 'true', 'false' or 'null')
at [Source: org.apache.nifi.stream.io.NonCloseableInputStream#5b8393b6; line: 1, column: 2]
Could anyone help me out on this?
You probably want to use a Jolt transform for this, which has its own processor (JoltTransformJSON). Restructuring a simple nested JSON like this is one of Jolt's documented examples.
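The answer does not include a Jolt spec, so none is guessed at here. To pin down the target shape instead, here is a rough Python sketch, run outside NiFi, that writes one CSV column per top-level key with the whole array serialized into a single cell. It is not a NiFi or Jolt configuration, and the function name to_single_row_csv is hypothetical.

```python
import csv
import io
import json

def to_single_row_csv(doc):
    """Write one CSV row with one column per top-level key.

    Each cell holds the JSON text of that key's value; the csv module
    takes care of quoting the embedded commas and double quotes.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(doc.keys()))
    writer.writeheader()
    writer.writerow({key: json.dumps(value) for key, value in doc.items()})
    return buffer.getvalue()

doc = {
    "Bikes": [
        {"Name": "KTM", "Model": "2017", "Colour": "Yellow"},
        {"Name": "Yamaha", "Model": "2020", "Colour": "Black"},
    ],
    "Cars": [
        {"Name": "BMW", "Model": "2017", "Colour": "Yellow"},
        {"Name": "Audi", "Model": "2020", "Colour": "Black"},
    ],
}

csv_text = to_single_row_csv(doc)
print(csv_text)
```

Reading the row back with csv.DictReader and json.loads on each cell recovers the original arrays, which is a quick way to check that the quoting survived.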

Import JSON with objects as nested fields into Elasticsearch

I have a log with thousands of records of aggregated data in JSON:
{
"count": 25,
"domain": "domain.tld",
"geoips": {
"AU": 5,
"NZ": 20
},
"ips": {
"1.2.3.4": 5,
"1.2.3.5": 1,
"1.2.3.6": 1,
"1.2.3.7": 1,
"1.2.3.8": 1,
"1.2.3.9": 9,
"1.2.3.10": 7
},
"subdomains": {
"a.domain.tld": 1,
"b.domain.tld": 1,
"c.domain.tld": 1,
"domain.tld": 22
},
"tld": "tld",
"types": {
"1": 3,
"43": 22
}
}
and I have this mapping in ES:
"mappings": {
"properties": {
"count": {
"type": "long"
},
"domain": {
"type": "keyword"
},
"ips": {
"type": "nested",
"properties": {
"key": {
"type": "keyword"
},
"val": {
"type": "long"
}
}
},
"geoips": {
"type": "nested",
"properties": {
"key": {
"type": "keyword"
},
"val": {
"type": "long"
}
}
},
"subdomains": {
"type": "nested",
"properties": {
"key": {
"type": "keyword"
},
"val": {
"type": "long"
}
}
},
"tld": {
"type": "keyword"
},
"types": {
"type": "nested",
"properties": {
"key": {
"type": "keyword"
},
"val": {
"type": "long"
}
}
}
}
}
Is there a simple way to import these lines into ES as nested objects? If I use a bulk insert without modification, ES will modify the mapping by adding a new field for each IP/subdomain/GeoIP instead of adding it as a simple key/val object.
Or is the only way to regenerate the JSON with key/val nested fields?
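Your mapping is already very good, but the data doesn't fit it: each nested field is mapped as objects with key and val properties, normally supplied as an array, while your data has a single object whose property names are the IPs, subdomains, and so on. So you'll need to transform your nested objects into arrays of key-value pairs like so: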
...
"ips": [
{
"key": "1.2.3.4",
"val": 5
},
{
"key": "1.2.3.5",
"val": 1
},
...
],
"subdomains": [
{
"key": "a.domain.tld",
"val": 1
},
{
"key": "b.domain.tld",
"val": 1
},
...
]
...
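A minimal sketch of that transformation in Python, assuming one JSON record per log line. The field list is taken from the mapping in the question, and the function name to_key_val_pairs is made up for illustration.

```python
import json

# Fields mapped as "nested" in the question's mapping.
NESTED_FIELDS = ("ips", "geoips", "subdomains", "types")

def to_key_val_pairs(record, fields=NESTED_FIELDS):
    """Return a copy of record where each listed dict field becomes a list of {"key", "val"} objects."""
    converted = dict(record)
    for field in fields:
        value = converted.get(field)
        if isinstance(value, dict):
            converted[field] = [{"key": k, "val": v} for k, v in value.items()]
    return converted

line = (
    '{"count": 25, "domain": "domain.tld", '
    '"geoips": {"AU": 5, "NZ": 20}, '
    '"ips": {"1.2.3.4": 5, "1.2.3.5": 1}, '
    '"tld": "tld", "types": {"1": 3, "43": 22}}'
)

doc = to_key_val_pairs(json.loads(line))
print(json.dumps(doc))
```

Each converted line can then go into the bulk request in place of the original one; scalar fields such as count and domain pass through unchanged.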

JSON Schema $ref to another file

With this schema:
{
"$schema": "http://json-schema.org/draft-04/schema#",
"b": {
"type": "object",
"properties": {
"c": {
"type": "string"
}
}
},
"type": "object",
"properties": {
"a": {
"$ref": "#/b"
}
}
}
I can validate this example:
{
"a": {
"c": "test"
}
}
Now I want to create a new schema file for the "b" element and refer to it in my first schema. How can I do this? I tried a lot of things, but I always get jsonspec.reference.exceptions.NotFound: u'b.json' not registered.
After a couple of hours of googling and reading various docs, I finally managed to compose two JSON schemas using "$ref" : "file:...".
My example data looks like this:
john-doe.json:
{
"first_name": "john",
"last_name": "doe",
"age": 42,
"address": {
"street_address": "foo street 42",
"city": "baklavastan",
"state": "foobar",
"foo": 42
}
}
And then I have two JSON schema files, which live in the same directory on my file system.
customer.json:
{
"type": "object",
"properties": {
"first_name": { "type": "string" },
"last_name": { "type": "string" },
"age" : { "type" : "integer" },
"address" : { "$ref" : "file:address.json#" }
},
"required": ["first_name", "last_name", "age", "address"],
"additionalProperties": false
}
address.json:
{
"type": "object",
"properties": {
"street_address": { "type": "string" },
"city": { "type": "string" },
"state": { "type": "string" }
},
"required": ["street_address", "city", "state"],
"additionalProperties": false
}
I am able to validate against the schema in Clojure using the library scjsv (which is a wrapper around the Java json-schema-validator library).
It gives me the following validation error (in Clojure EDN) for the john-doe.json example:
{ :level "error",
:schema {:loadingURI "file:address.json#", :pointer ""},
:instance {:pointer "/address"},
:domain "validation",
:keyword "additionalProperties",
:message
"object instance has properties which are not allowed by the schema: [\"foo\"]",
:unwanted ["foo"] }
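To make the mechanics of a whole-file reference concrete, here is an illustrative Python sketch that inlines every {"$ref": "file:<name>#"} node with the content of that file before validation. This is not how scjsv or the underlying Java library works internally; real validators also handle fragments, remote URIs, and recursive references. The function name inline_file_refs is hypothetical, and the two schemas are trimmed versions of the ones above, written to a temporary directory so the example is self-contained.

```python
import json
import tempfile
from pathlib import Path

def inline_file_refs(schema, base_dir):
    """Recursively replace {"$ref": "file:<name>#"} nodes with the parsed content of <name>.

    Only whole-document references (empty fragment) are handled in this sketch;
    any other $ref is left untouched.
    """
    if isinstance(schema, list):
        return [inline_file_refs(item, base_dir) for item in schema]
    if not isinstance(schema, dict):
        return schema
    ref = schema.get("$ref")
    if isinstance(ref, str) and ref.startswith("file:"):
        file_name, _, fragment = ref[len("file:"):].partition("#")
        if fragment == "":
            target = Path(base_dir) / file_name
            return inline_file_refs(json.loads(target.read_text()), base_dir)
    return {key: inline_file_refs(value, base_dir) for key, value in schema.items()}

with tempfile.TemporaryDirectory() as schema_dir:
    # Trimmed stand-in for address.json.
    Path(schema_dir, "address.json").write_text(json.dumps({
        "type": "object",
        "properties": {"street_address": {"type": "string"}},
        "additionalProperties": False,
    }))
    # Trimmed stand-in for customer.json.
    customer = {
        "type": "object",
        "properties": {"address": {"$ref": "file:address.json#"}},
    }
    resolved = inline_file_refs(customer, schema_dir)

print(json.dumps(resolved, indent=2))
```

The relative name is resolved against the directory passed in, which mirrors why both schema files need to sit where the validator looks for them.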

How can I explicitly constrain multiple items in a JSON Schema array?

I am creating a JSON schema and want to define an array containing only exact matches for certain items:
An example of the sort of JSON (snippet) would look like:
{
"results":
[
{ "id": 1, "test": true, "volts": 700, "duration": 100 },
{ "id": 2, "test": false }
]
}
This seems to be a combination of OneOf and "additionalProperties": false but I can't work out how that should be used. So far I have:
{
"results":
{
"type": "array",
"items":
{
"type": "object",
"OneOf":
[
{
"id": { "type": "integer" },
"test": { "type": "boolean" },
"volts": { "type": "integer" },
"duration": { "type": "integer" }
},
{
"id": { "type": "integer" },
"test": { "type": "boolean" }
}
],
"additionalProperties": false
}
}
}
I'm using http://www.jsonschemavalidator.net/ to check my JSON.
But when I validate the following JSON against my schema it says it's valid; is the website incorrect or have I done something wrong?
{
"results": [
{
"fred": 7,
"id": 7,
"test": true,
"volts": 7,
"duration": 7
},
{
"fish": 7
}
]
}
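A likely reason the validator accepts that document is that the schema constrains almost nothing. Keywords are case-sensitive, so "OneOf" is ignored (the keyword is "oneOf"). Property definitions have to sit under a "properties" keyword, both for "results" at the top level and inside each alternative. And "additionalProperties": false only takes into account the "properties" declared in the same schema object, so it belongs inside each oneOf branch. A sketch of a corrected schema, built as a Python dict so it can be dumped to JSON; the "required" lists are an assumption based on the "exact matches" wording, and the helper name exact_object is made up.

```python
import json

def exact_object(properties):
    """Build an object schema that requires exactly the given properties and forbids any others."""
    return {
        "type": "object",
        "properties": properties,
        "required": sorted(properties),
        "additionalProperties": False,
    }

full_result = exact_object({
    "id": {"type": "integer"},
    "test": {"type": "boolean"},
    "volts": {"type": "integer"},
    "duration": {"type": "integer"},
})
short_result = exact_object({
    "id": {"type": "integer"},
    "test": {"type": "boolean"},
})

schema = {
    "type": "object",
    "properties": {
        "results": {
            "type": "array",
            # Each array item must match exactly one of the two shapes.
            "items": {"oneOf": [full_result, short_result]},
        }
    },
}

print(json.dumps(schema, indent=2))
```

With this shape, an item carrying "fred" or "fish" matches neither branch, and a full four-field item cannot also match the short branch because its extra fields are rejected there.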

JSON Schema for tree structure

I have to build a tree-like structure of JSON data. Each node has an id (an integer, required), a label (a string, optional), and an array of child nodes (optional). Can you help me write a JSON schema for this data? I need to set Id as required in child nodes as well.
{
"Id": 1,
"Label": "A",
"Child": [
{
"Id": 2,
"Label": "B",
"Child": [
{
"Id": 5,
"Label": "E"
}, {
"Id": 6,
"Label": "E"
}, {
"Id": 7,
"Label": "E"
}
]
}, {
"Id": 3,
"Label": "C"
}, {
"Id": 4,
"Label": "D",
"Child": [
{
"Id": 8,
"Label": "H"
}, {
"Id": 9,
"Label": "I"
}
]
}
]
}
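A schema for this structure only needs a definition of a node and a reference to that node. The Children property (renamed from Child in the question's data) is an array whose items reference the same node definition, which makes the schema recursive and applies the required Id at every level. Data that still uses the name Child would not have its child nodes checked.
Here's the schema: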
{
"$schema": "http://json-schema.org/draft-04/schema#",
"$ref": "#/definitions/node",
"definitions": {
"node": {
"properties": {
"Id": {
"type": "integer"
},
"Label": {
"type": "string"
},
"Children": {
"type": "array",
"items": {
"$ref": "#/definitions/node"
}
}
},
"required": [
"Id"
]
}
}
}
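The recursion in the schema can be mirrored by a small hand-written check in Python, shown here only to illustrate how the same node rules apply at every depth. It is not a JSON Schema validator, the function name check_node is hypothetical, and unlike the schema (which has no "type": "object" on the node) it assumes every node is an object.

```python
def check_node(node, path="$"):
    """Return a list of problems found in node, applying the same rules to every child."""
    if not isinstance(node, dict):
        # Assumption of this sketch: nodes must be objects.
        return [f"{path}: expected an object"]
    problems = []
    if "Id" not in node:
        problems.append(f"{path}: missing required Id")
    elif isinstance(node["Id"], bool) or not isinstance(node["Id"], int):
        # bool is a subclass of int in Python, so exclude it explicitly.
        problems.append(f"{path}.Id: expected an integer")
    if "Label" in node and not isinstance(node["Label"], str):
        problems.append(f"{path}.Label: expected a string")
    children = node.get("Children", [])
    if not isinstance(children, list):
        problems.append(f"{path}.Children: expected an array")
    else:
        # Same check for each child: the counterpart of "$ref": "#/definitions/node".
        for index, child in enumerate(children):
            problems.extend(check_node(child, f"{path}.Children[{index}]"))
    return problems

tree = {
    "Id": 1,
    "Label": "A",
    "Children": [
        {"Id": 2, "Label": "B", "Children": [{"Id": 5, "Label": "E"}]},
        {"Label": "C"},  # deliberately missing Id
    ],
}

problems = check_node(tree)
print(problems)
```

The second child has no Id, so the check reports it with its path inside the tree, while the nested node under the first child passes.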