Python JSON Parsing example - json

{
    "fruits" : {
        "fruit" : [
            {
                "name" : "apple",
                "size" : 1,
                "price" : 1
            },
            {
                "name" : "banana",
                "size" : 1,
                "price" : 2
            }
        ]
    },
    "sports" : {
        "sport" : [
            {
                "name" : "baseball",
                "population" : 9
            },
            {
                "name" : "soccer",
                "population" : 11
            }
        ]
    }
}
This is my example JSON file, which I wrote myself.
If the format is not valid JSON, please tell me.
I want to get the value of each name key, using Python.
I can read the JSON file and convert it to a dictionary,
but I can't read the value of a specific key (or "tag", maybe).
import json
#json file reading
with open('C:\Users\sub2\Desktop\example.json') as data_file:
    data = json.load(data_file)
#dictionary key
dic_key = []
for i in data:
    dic_key.append(i)
#dictionary value
for i in dic_key:
    print data[i]
#name tag
for i in dic_key:
    for j in data[i]:
        print j.get('name')
How do I get the value of each name?

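This should give you all the names. The records sit one level deeper than your loops reach: data[i] is a dictionary, data[i][j] is the list, and each element of that list is the dictionary that holds name. So you need a third loop: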
import json
#json file reading
with open('out.json') as data_file:
    data = json.load(data_file)
#dictionary key
dic_key = []
for i in data:
    dic_key.append(i)
#dictionary value
for i in dic_key:
    print(data[i])
#name tag
for i in dic_key:
    # data[i] is a dictionary such as {"fruit": [...]}; j is its key
    for j in data[i]:
        # data[i][j] is the list of records; each k is a dictionary
        for k in data[i][j]:
            print(k.get('name'))
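The three nested loops above can also be written against the dictionary values directly. The following is only a sketch under some assumptions: it targets Python 3, it embeds the example document as a string so that it runs without a file, and the helper name collect_names is made up for illustration.

```python
import json

# Inline copy of the example document so the sketch runs without a file.
raw = """
{
  "fruits": {"fruit": [{"name": "apple", "size": 1, "price": 1},
                       {"name": "banana", "size": 1, "price": 2}]},
  "sports": {"sport": [{"name": "baseball", "population": 9},
                       {"name": "soccer", "population": 11}]}
}
"""

def collect_names(data):
    """Hypothetical helper: gather every "name" found at category -> list -> record."""
    names = []
    for category in data.values():         # e.g. {"fruit": [...]}
        for records in category.values():  # e.g. [{"name": "apple", ...}, ...]
            for record in records:         # each record is a dictionary
                names.append(record.get("name"))
    return names

names = collect_names(json.loads(raw))
print(names)
```

Iterating over .values() avoids building the dic_key list first; the nesting depth is the same as in the loops above.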

Related

Splitting Json to multiple jsons in NIFI

I have the JSON file below, which I want to split in NiFi.
Input:
[ {
"id" : 123,
"ticket_id" : 345,
"events" : [ {
"id" : 3322,
"type" : "xyz"
}, {
"id" : 6675,
"type" : "abc",
"value" : "sample value",
"field_name" : "subject"
}, {
"id" : 9988,
"type" : "abc",
"value" : [ "text_file", "json_file" ],
"field_name" : "tags"
}]
}]
My output should be three separate JSON documents, like this:
{
"id" : 123,
"ticket_id" : 345,
"events.id" : 3322,
"events.type" : "xyz"
}
{
"id" : 123,
"ticket_id" : 345,
"events.id" : 6675,
"events.type" : "abc",
"events.value" : "sample value",
"events.field_name" : "subject"
}
{
"id" : 123,
"ticket_id" : 345,
"events.id" : 9988,
"events.type" : "abc",
"events.value" : [ "text_file", "json_file" ],
"events.field_name" : "tags"
}
Can this be done with SplitJson? That is, can SplitJson split the JSON based on the array of JSON objects nested inside it?
Please let me know if there is a way to achieve this.
If you want 3 different flow files, each containing one JSON object from the array, you should be able to do it with SplitJson using a JSONPath of $ and/or $.*
Using the reduce function (JavaScript):
function split(json) {
  return json.reduce((acc, item) => {
    const events = item.events.map((evt) => {
      const obj = {id: item.id, ticket_id: item.ticket_id};
      for (const k in evt) {
        obj[`events.${k}`] = evt[k];
      }
      return obj;
    });
    return [...acc, ...events];
  }, []);
}
const input = [{"id":123,"ticket_id":345,"events":[{"id":3322,"type":"xyz"},{"id":6675,"type":"abc","value":"sample value","field_name":"subject"},{"id":9988,"type":"abc","value":["text_file","json_file"],"field_name":"tags"}]}];
const res = split(input);
console.log(res);
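For comparison, here is the same flattening written in Python. This is a sketch of the transformation only, not a NiFi processor configuration; the function name split_events is invented for illustration, and it assumes every top-level record has id, ticket_id and an events list, as in the sample input.

```python
def split_events(records):
    """Hypothetical helper: emit one flat dict per event, with event keys prefixed by "events."."""
    flat = []
    for item in records:
        for event in item["events"]:
            # Start each output row with the parent's identifying fields.
            row = {"id": item["id"], "ticket_id": item["ticket_id"]}
            for key, value in event.items():
                row["events." + key] = value
            flat.append(row)
    return flat

sample = [{
    "id": 123,
    "ticket_id": 345,
    "events": [
        {"id": 3322, "type": "xyz"},
        {"id": 6675, "type": "abc", "value": "sample value", "field_name": "subject"},
        {"id": 9988, "type": "abc", "value": ["text_file", "json_file"], "field_name": "tags"},
    ],
}]

result = split_events(sample)
for row in result:
    print(row)
```

Array values such as the tags list are carried over unchanged rather than converted to strings.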

Adding an attribute to a JSON file

I have a JSON file in this format:
{"latitude":28.488069,"longitude":-81.407208,"data":[{"time":1462680000,"summary":"Clear"},{"time":1462683600,"summary":"Clear"},{"time":1462694400,"summary":"Clear"}]}
I want to add an id attribute to each element inside data:
val result = jsonfile
val Jsonobject = Json.parse(result).as[JsObject]
val res = Jsonobject ++ Json.obj("id" -> 1234)
println(Json.prettyPrint(res))
the output should be
{"latitude":28.488069,"longitude":-81.407208,"data":[{"time":1462680000,"summary":"Clear","id":"1234"},{"time":1462683600,"summary":"Clear","id":"1235"},{"time":1462694400,"summary":"Clear","id":"1236"}]}
but my output is
{"latitude":28.488069,"longitude":-81.407208,"data":[{"time":1462680000,"summary":"Clear"},{"time":1462683600,"summary":"Clear"},{"time":1462694400,"summary":"Clear"}],"id":"1234"}
There are a couple of things missing in your code. In your expected output the ID is incremented, but adding "id" -> 1234 once to the top-level object cannot produce that: it attaches a single id next to data, not one inside each element.
You have to loop through the elements in data and set the ID on each of them. Something like this works:
val j = Json.parse(result).as[JsObject]
val res = j ++ Json.obj("data" ->
  // get the 'data' array and loop inside
  (j \ "data").as[JsArray].value.zipWithIndex.map {
    // zipWithIndex lets us use the index to increment the ID
    case (x,i) => x.as[JsObject] + ("id" , JsString((1234 + i).toString)) })
println(Json.prettyPrint(res))
{
"latitude" : 28.488069,
"longitude" : -81.407208,
"data" : [ {
"time" : 1462680000,
"summary" : "Clear",
"id" : "1234"
}, {
"time" : 1462683600,
"summary" : "Clear",
"id" : "1235"
}, {
"time" : 1462694400,
"summary" : "Clear",
"id" : "1236"
} ]
}
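The same idea (pair each element of data with its index and add an incremented ID) can be sketched in Python with enumerate. This is an illustration using only the standard json module, not a translation of the Play JSON API; the function name add_ids and its start parameter are assumptions.

```python
import json

raw = ('{"latitude":28.488069,"longitude":-81.407208,"data":['
       '{"time":1462680000,"summary":"Clear"},'
       '{"time":1462683600,"summary":"Clear"},'
       '{"time":1462694400,"summary":"Clear"}]}')

def add_ids(doc, start=1234):
    """Hypothetical helper: return a copy of doc whose data entries carry string IDs."""
    out = dict(doc)  # shallow copy so the input dictionary is left untouched
    out["data"] = [
        dict(entry, id=str(start + i))  # enumerate supplies the increment
        for i, entry in enumerate(doc["data"])
    ]
    return out

result = add_ids(json.loads(raw))
print(json.dumps(result, indent=2))
```

The IDs are written as strings to match the expected output in the question.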

Search inside JSON with Elastic

I have an index/type in ES which has the following type of records:
body "{\"Status\":\"0\",\"Time\":\"2017-10-3 16:39:58.591\"}"
type "xxxx"
source "11.2.21.0"
The body field is a JSON string. I want to search, for example, for the records whose JSON body contains Status:0.
The query should look something like this (it doesn't work):
GET <host>:<port>/index/type/_search
{
"query": {
"match" : {
"body" : "Status:0"
}
}
}
Any ideas?
You have to change the analyser settings of your index.
For the JSON pattern you presented you will need to have a char_filter and a tokenizer which remove the JSON elements and then tokenize according to your needs.
Your analyser should contain a tokenizer and a char_filter like these ones here:
{
"tokenizer" : {
"type": "pattern",
"pattern": ","
},
"char_filter" : [ {
"type" : "mapping",
"mappings" : [ "{ => ", "} => ", "\" => " ]
} ],
"text" : [ "{\"Status\":\"0\",\"Time\":\"2017-10-3 16:39:58.591\"}" ]
}
Explanation: the char_filter will remove the characters: { } ". The tokenizer will tokenize by the comma.
These can be tested using the Analyze API. If you execute the above JSON against this API you will get these tokens:
{
"tokens" : [ {
"token" : "Status:0",
"start_offset" : 2,
"end_offset" : 13,
"type" : "word",
"position" : 0
}, {
"token" : "Time:2017-10-3 16:39:58.591",
"start_offset" : 15,
"end_offset" : 46,
"type" : "word",
"position" : 1
} ]
}
The first token ("Status:0") which is retrieved by the Analyze API is the one you were using in your search.
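To see why those two tokens come out, the char_filter and tokenizer steps can be imitated in plain Python. This is only an emulation of the behaviour described above (strip the characters { } " and then split on commas); it does not call Elasticsearch, and the function name analyze is made up for illustration.

```python
body = '{"Status":"0","Time":"2017-10-3 16:39:58.591"}'

def analyze(text):
    """Hypothetical emulation of the mapping char_filter plus the "," pattern tokenizer."""
    # char_filter step: map each of { } " to the empty string.
    for ch in '{}"':
        text = text.replace(ch, "")
    # tokenizer step: split on the comma, skipping empty pieces.
    return [token for token in text.split(",") if token]

tokens = analyze(body)
print(tokens)
```

With the body stored as these tokens, a search for Status:0 has an exact term to match against.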

Mongolite group by/aggregate on JSON object

I have a JSON document like this in my MongoDB collection (updated document):
{
"_id" : ObjectId("59da4aef8c5d757027a5a614"),
"input" : "hi",
"output" : "Hi. How can I help you?",
"intent" : "[{\"intent\":\"greeting\",\"confidence\":0.8154089450836182}]",
"entities" : "[]",
"context" : "{\"conversation_id\":\"48181e58-dd51-405a-bb00-c875c01afa0a\",\"system\":{\"dialog_stack\":[{\"dialog_node\":\"root\"}],\"dialog_turn_counter\":1,\"dialog_request_counter\":1,\"_node_output_map\":{\"node_5_1505291032665\":[0]},\"branch_exited\":true,\"branch_exited_reason\":\"completed\"}}",
"user_id" : "50001",
"time_in" : ISODate("2017-10-08T15:57:32.000Z"),
"time_out" : ISODate("2017-10-08T15:57:35.000Z"),
"reaction" : "1"
}
I need to group by the intent.intent field. I'm using RStudio with the mongolite library.
What I have tried is:
pp = '[{"$unwind": "$intent"},{"$group":{"_id":"$intent.intent", "count": {"$sum":1} }}]'
stats <- chat$aggregate(
  pipeline=pp,
  options = '{"allowDiskUse":true}'
)
print(stats)
But it's not working. The output of the code above is:
_id count
1 NA 727
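If the intent attribute is stored as a string and you want to keep it that way, $unwind has nothing to unwind and "$intent.intent" resolves to nothing, which is why _id comes back as NA.
You can instead split the string on \" into an array and take the element at index 3 (the fourth item), which is the intent name.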
db.getCollection('test1').aggregate([
    { "$project": { intent_text : { $arrayElemAt : [ { $split: ["$intent", "\""] } ,3 ] } } },
    { "$group": {"_id": "$intent_text" , "count": {"$sum":1} }}
])
Result:
{
"_id" : "greeting",
"count" : 1.0
}
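The split trick is easier to follow with the pieces written out. The Python sketch below reproduces the $split / $arrayElemAt step on the sample intent string. It also shows an alternative, parsing the string as JSON on the client side, which is an assumption about what might suit your data and not part of the pipeline above.

```python
import json
from collections import Counter

intent = '[{"intent":"greeting","confidence":0.8154089450836182}]'

# Same idea as $split on the double quote followed by $arrayElemAt with index 3.
parts = intent.split('"')
print(parts)  # the pieces around each double quote
intent_text = parts[3]

# Alternative (an assumption, not from the answer): parse the string as JSON.
parsed_intent = json.loads(intent)[0]["intent"]

# Grouping and counting, analogous to the $group stage, over a list of such strings.
counts = Counter(json.loads(s)[0]["intent"] for s in [intent])
print(intent_text, parsed_intent, dict(counts))
```

The positional split depends on intent being the first key in the string; parsing the JSON does not have that limitation.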

Parsing a .json file with Python

There is a lot of information on loading .json files, but I just cannot figure out what the problem is:
I have an external file called LocationHistory.json with various coordinates inside. For reference, this is how the data is laid out:
{
  "data" : {
    "items" : [ {
      "kind" : "latitude#location",
      "timestampMs" : "1374870896803",
      "latitude" : 34.9482949,
      "longitude" : -85.3245474,
      "accuracy" : 2149
    }, {
      "kind" : "latitude#location",
      "timestampMs" : "1374870711762",
      "latitude" : 34.9857898,
      "longitude" : -85.3526902,
      "accuracy" : 2016
    }, {
      "kind" : "latitude#location",
      "timestampMs" : "1374870651752",
      "latitude" : 34.9857898,
      "longitude" : -85.3526902,
      "accuracy" : 2016
    } ]
  }
}
I'm trying to parse this information with:
import json
json_file = open ('LocationHistory.json')
json_string = json_file.read()
json_data = json.loads (json_string)
locations = json_data ["data"]
for location in locations:
    print location["timestampMS"], location["latitude"], location["longitude"], location["accuracy"]
Why am I getting the error:
line 10, in <module>
    print location["timestampMS"], location["latitude"], location["longitude"], location["accuracy"]
TypeError: string indices must be integers
Everything I can find about parsing .json files describes the same approach I'm using. Where am I going wrong?
Thanks in advance, I'm sure it should be a simple mistake...
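json_data["data"] is a dictionary, so iterating over it yields its keys (here the string "items"), and indexing a string with "timestampMS" raises the TypeError. You want to iterate over the items list instead:
locations = json_data["data"]["items"]
for location in locations: # now "locations" is a list of dictionaries
    # ...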
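Put together, a version of the loop might look like the sketch below. It assumes Python 3 and embeds the sample data as a string so that it runs without LocationHistory.json. Note that the key in the sample data is spelled "timestampMs", so the "timestampMS" lookup in the question would raise a KeyError once the iteration is fixed.

```python
import json

raw = """
{"data": {"items": [
  {"kind": "latitude#location", "timestampMs": "1374870896803",
   "latitude": 34.9482949, "longitude": -85.3245474, "accuracy": 2149},
  {"kind": "latitude#location", "timestampMs": "1374870711762",
   "latitude": 34.9857898, "longitude": -85.3526902, "accuracy": 2016},
  {"kind": "latitude#location", "timestampMs": "1374870651752",
   "latitude": 34.9857898, "longitude": -85.3526902, "accuracy": 2016}
]}}
"""

json_data = json.loads(raw)

# Iterating over the "data" dictionary yields its keys, which are strings.
keys = [key for key in json_data["data"]]

# Iterating over the "items" list yields one dictionary per location.
rows = []
for location in json_data["data"]["items"]:
    rows.append((location["timestampMs"], location["latitude"],
                 location["longitude"], location["accuracy"]))

for row in rows:
    print(*row)
```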