Find specific value in a large json file - json

I have a simple JSON array similar to the one below. I'd like to find the record that contains a matching value, for example name == "Jack".
{
  "data": [
    {
      "Record": {
        "attributes": {
          "name": "Jack",
          "age": "38",
          "description": "1234"
        }
      }
    }
  ]
}
The Python code below works, but it is very slow. Is there any way to get the result faster?
import json

# Parse the whole file into memory, then scan the records one by one.
with open('records.json') as f:
    input_dict = json.load(f)
input_var = input_dict["data"]
for i in input_var:
    if i["Record"]["attributes"]["name"] == "Jack":
        print(i["Record"]["attributes"])
        break
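One possible direction, sketched under assumptions rather than offered as a definitive answer: if the same file is queried more than once, the linear scan can be replaced by a dictionary that is built in a single pass and then answers each lookup directly. The helper name `build_name_index` and the second sample record ("Jill") are hypothetical. This sketch does not reduce the cost of `json.load` itself, which still parses the whole file.

```python
import json

def build_name_index(records):
    """Map each name to the attributes of the first record carrying that name."""
    index = {}
    for rec in records:
        attrs = rec["Record"]["attributes"]
        # setdefault keeps the first match, mirroring the `break` in the loop above.
        index.setdefault(attrs["name"], attrs)
    return index

# Inline sample standing in for the parsed contents of records.json.
# The "Jill" record is made up for illustration.
sample = json.loads('''
{"data": [
  {"Record": {"attributes": {"name": "Jack", "age": "38", "description": "1234"}}},
  {"Record": {"attributes": {"name": "Jill", "age": "35", "description": "5678"}}}
]}
''')

index = build_name_index(sample["data"])
print(index.get("Jack"))    # attributes of the first record named "Jack"
print(index.get("Nobody"))  # None when no record matches
```

Building the index costs one pass over the data, so it only pays off when several names are looked up against the same parsed file.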

Related

How to get all index values in Groovy JSON xpath

Please find below the Groovy code that I am using to get a particular field from the response body.
Query 1:
It retrieves the result when I use a fixed index value: data.RenewalDetails[0] gives Value1 and data.RenewalDetails[1] gives Value2.
But in my real case I will never know the number of blocks in the response, so I want to get all the values that satisfy the condition. I tried data.RenewalDetails[*], but it does not work. Can you please help?
Query 2:
Apart from the above condition, I want to add one more filter, "FamilyCode": "PREMIUM" in Itemdetails. Can you help with that as well?
def BoundId = new groovy.json.JsonSlurper().parseText('{"data":{"RenewalDetails":[{"ExpiryDetails":{"duration":"xxxxx","destination":"LHR","from":"AUH","value":2,"segments":[{"valudeid":"xxx-xx6262-xxxyyy-1111-11-11-1111"}]},"Itemdetails":[{"BoundId":"Value1","isexpired":true,"FamilyCode":"PREMIUM","availabilityDetails":[{"travelID":"AAA-AB1234-AAABBB-2022-11-10-1111","quota":"X","scale":"XXX","class":"X"}]}]},{"ExpiryDetails":{"duration":"xxxxx","destination":"LHR","from":"AUH","value":2,"segments":[{"valudeid":"xxx-xx6262-xxxyyy-1111-11-11-1111"}]},"Itemdetails":[{"BoundId":"Value2","isexpired":true,"FamilyCode":"PREMIUM","availabilityDetails":[{"travelID":"AAA-AB1234-AAABBB-2022-11-10-1111","quota":"X","scale":"XXX","class":"X"}]}]}]},"warnings":[{"code":"xxxx","detail":"xxxxxxxx","title":"xxxxxxxx"}]}')
    .data.RenewalDetails[0].Itemdetails.find { itemDetail ->
        itemDetail.availabilityDetails[0].travelID.length() == 33
    }?.BoundId
println "Hello " + BoundId
Something like this:
def txt = '''\
{
  "data": {
    "RenewalDetails": [
      {
        "ExpiryDetails": {
          "duration": "xxxxx",
          "destination": "LHR",
          "from": "AUH",
          "value": 2,
          "segments": [
            {
              "valudeid": "xxx-xx6262-xxxyyy-1111-11-11-1111"
            }
          ]
        },
        "Itemdetails": [
          {
            "BoundId": "Value1",
            "isexpired": true,
            "FamilyCode": "PREMIUM",
            "availabilityDetails": [
              {
                "travelID": "AAA-AB1234-AAABBB-2022-11-10-1111",
                "quota": "X",
                "scale": "XXX",
                "class": "X"
              }
            ]
          }
        ]
      },
      {
        "ExpiryDetails": {
          "duration": "xxxxx",
          "destination": "LHR",
          "from": "AUH",
          "value": 2,
          "segments": [
            {
              "valudeid": "xxx-xx6262-xxxyyy-1111-11-11-1111"
            }
          ]
        },
        "Itemdetails": [
          {
            "BoundId": "Value2",
            "isexpired": true,
            "FamilyCode": "PREMIUM",
            "availabilityDetails": [
              {
                "travelID": "AAA-AB1234-AAABBB-2022-11-10-1111",
                "quota": "X",
                "scale": "XXX",
                "class": "X"
              }
            ]
          }
        ]
      }
    ]
  },
  "warnings": [
    {
      "code": "xxxx",
      "detail": "xxxxxxxx",
      "title": "xxxxxxxx"
    }
  ]
}'''
def json = new groovy.json.JsonSlurper().parseText txt
List<String> BoundIds = json.data.RenewalDetails.Itemdetails*.find { itemDetail ->
    itemDetail.availabilityDetails[0].travelID.size() == 33 && itemDetail.FamilyCode == 'PREMIUM'
}?.BoundId
assert BoundIds.toString() == '[Value1, Value2]'
Note that you will get BoundIds as a List.
If you amend your code like this:
def json = new groovy.json.JsonSlurper().parse(prev.getResponseData())
you would be able to access the number of returned items as:
def size = json.data.RenewalDetails.size()
as RenewalDetails represents a List
Just add as many conditions as you want using Groovy's && operator:
find { itemDetail ->
    itemDetail.availabilityDetails[0].travelID.length() == 33 &&
    itemDetail.FamilyCode.equals('PREMIUM')
}
More information:
Apache Groovy - Parsing and producing JSON
Apache Groovy: What Is Groovy Used For?

pyjq - how to use "select" with both query and value as variables

I am writing code in Python 3 and am struggling to use variables with pyjq. The code works without variables, but the variables are not being parsed inside pyjq.
The documentation I referred to is https://github.com/doloopwhile/pyjq/blob/master/README.md#api
Please check the code below and suggest a fix.
My code:
import json, os
import pyjq
from flask import Flask, request, jsonify
def query_records():
    args = {"meta.antivirus.enabled": "true"}
    for key, value in args.items():
        with open('/tmp/data.txt', 'r') as f:
            print (key)
            print (value)
            data = f.read()
            records = json.loads(data)
            query = ("." + key)
            print (query)
            #jq '.[]|select(.meta.antivirus.enabled=="true")' filename.json works,issue with variable substitution in python
            match = pyjq.all('.[]|select(["$query"]==$value)', records, vars={"value": value,"query": query})
            print (match)
query_records()
Content of file "/tmp/data.txt"
[
  {
    "name": "alpharetta",
    "meta": {
      "antivirus": {
        "enabled": "true"
      },
      "limits": {
        "cpu": {
          "enabled": "true",
          "value": "250m"
        }
      }
    }
  },
  {
    "meta": {
      "allergens": {
        "eggs": "true",
        "nuts": "false",
        "seafood": "false"
      },
      "calories": 230,
      "carbohydrates": {
        "dietary-fiber": "4g",
        "sugars": "1g"
      },
      "fats": {
        "saturated-fat": "0g",
        "trans-fat": "1g"
      }
    },
    "name": "sandwich-nutrition"
  },
  {
    "meta": {
      "allergens": {
        "eggs": "true",
        "nuts": "false",
        "seafood": "true"
      },
      "calories": 440,
      "carbohydrates": {
        "dietary-fiber": "4g",
        "sugars": "2g"
      },
      "fats": {
        "saturated-fat": "0g",
        "trans-fat": "1g"
      }
    },
    "name": "random-nutrition"
  }
]
Expected output (which works without variables):
{
  "name": "alpharetta",
  "meta": {
    "antivirus": {
      "enabled": "true"
    },
    "limits": {
      "cpu": {
        "enabled": "true",
        "value": "250m"
      }
    }
  }
}
Current output: []
It seems that the variable "query" is not being passed correctly. Help would be appreciated.
Edit 1
It works if I hardcode "query" -
match = pyjq.all('.[]|select(.meta.antivirus.enabled==$value)', records, vars={"value": value,"query": query})
but not the other way around,
which probably narrows the problem down to the variable "query".
jq is not a necessity and I can use other libraries too, as long as JSON is returned.
Variables are intended to be used for values, not for jq expressions (at least not directly).
I think the easiest option here is to use an f-string:
match = pyjq.all(f'.[]|select({query}==$value)', records, vars={"value": value})
and it probably makes sense to prepend the period inside the f-string:
match = pyjq.all(f'.[]|select(.{key}==$value)', records, vars={"value": value})
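Since the question notes that jq is not a requirement, the same filter can also be sketched in plain Python by walking the dotted key one segment at a time. This is a minimal sketch, not a replacement for jq's full path syntax: the helper names `get_path` and `select_records` are hypothetical, it assumes every path segment is a plain dictionary key, and the inline records are a trimmed copy of the sample file.

```python
from functools import reduce

_MISSING = object()  # sentinel for "path not present in this record"

def get_path(record, dotted_key):
    """Follow 'a.b.c' through nested dicts; return _MISSING if any step fails."""
    def step(node, part):
        if isinstance(node, dict) and part in node:
            return node[part]
        return _MISSING
    return reduce(step, dotted_key.split("."), record)

def select_records(records, dotted_key, value):
    """Return the records whose value at dotted_key equals value."""
    return [r for r in records if get_path(r, dotted_key) == value]

# Trimmed copy of the sample data from the question.
records = [
    {"name": "alpharetta", "meta": {"antivirus": {"enabled": "true"}}},
    {"name": "sandwich-nutrition", "meta": {"calories": 230}},
]

match = select_records(records, "meta.antivirus.enabled", "true")
print(match)
```

Both the key and the value stay ordinary Python strings here, so no query text has to be assembled from user input.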

How to convert json to csv with single header and multiple values?

I have the following input:
data = [
  {
    "details": [
      {
        "health": "Good",
        "id": "1",
        "timestamp": 1579155574
      },
      {
        "health": "Bad",
        "id": "1",
        "timestamp": 1579155575
      }
    ]
  },
  {
    "details": [
      {
        "health": "Good",
        "id": "2",
        "timestamp": 1588329978
      },
      {
        "health": "Good",
        "device_id": "2",
        "timestamp": 1588416380
      }
    ]
  }
]
Now I want to convert it to CSV, something like the following:
id,health
1,Good - 1579155574,Bad - 1579155575
2,Good - 1588329978,Good - 1588416380
Is this possible?
Currently I am converting this to a simple CSV. My code and its output are below:
import csv
f = csv.writer(open("test.csv", "w", newline=""))
f.writerow(["id", "health", "timestamp"])
for data in data:
    for details in data['details']:
        f.writerow([details['id'],
                    details["health"],
                    details["timestamp"],
                    ])
Response:
id,health,timestamp
1,Good,1579155574
1,Bad,1579155575
2,Good,1579261319
2,Good,1586911295
So how could I get the expected output? I am using python3.
You have almost finished the job; I don't think you need the csv module here.
"CSV" is only a convention that tells people how the file is laid out. To the computer, CSV, TXT, and JSON files are all plain text.
I don't know every pattern in your data, but the following produces the output you want.
# datas is the list that the question calls data
output = 'id,health\n'
for data in datas:
    # Start each row with the id of the first detail record.
    output += f'{data["details"][0]["id"]},'
    for d in data["details"]:
        if 'health' in d:
            output += f'{d["health"]} - {d["timestamp"]},'
        else:
            output += f'{d["battery_health"]} - {d["timestamp"]},'
    # Drop the trailing comma and end the row.
    output = output[:-1] + '\n'
with open('test.csv', 'w') as op:
    op.write(output)
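For comparison, the same row layout can also be produced with the csv module, which takes care of quoting if a value ever contains a comma. This is only a sketch under assumptions: the helper name `to_rows` is hypothetical, and the fallback from "id" to "device_id" is a guess based on the one sample record that uses that key. It writes to an in-memory buffer so the result can be inspected directly.

```python
import csv
import io

# Sample input copied from the question.
data = [
    {"details": [
        {"health": "Good", "id": "1", "timestamp": 1579155574},
        {"health": "Bad", "id": "1", "timestamp": 1579155575},
    ]},
    {"details": [
        {"health": "Good", "id": "2", "timestamp": 1588329978},
        {"health": "Good", "device_id": "2", "timestamp": 1588416380},
    ]},
]

def to_rows(groups):
    """Yield one row per group: the id, then one 'health - timestamp' cell per detail."""
    for group in groups:
        details = group["details"]
        first = details[0]
        # Assumption: the id is stored under "id" or, failing that, "device_id".
        row = [first.get("id", first.get("device_id"))]
        row += [f'{d["health"]} - {d["timestamp"]}' for d in details]
        yield row

# Swap the buffer for open("test.csv", "w", newline="") to write a real file.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["id", "health"])
writer.writerows(to_rows(data))
print(buffer.getvalue())
```

Note that rows produced this way have more cells than the two-column header, exactly as in the expected output shown in the question.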

parsing nested json data - access directly to a member

I have JSON data like this:
data = {
  "id":1,
  "name":"abc",
  "address": {
    "items":[
      "streetName":"cde",
      "streetId":"SID"
    ]
  }
}
How can I access the streetName value directly?
Your JSON is actually invalid. If you have control over the JSON generation, first change it to this:
data = {
  "id": 1,
  "name": "abc",
  "address": {
    "items": [{
      "streetName": "cde",
      "streetId": "SID"
    }]
  }
}
Notice the additional braces around streetName and streetId. Then, to access streetName, do this:
var streetName = data.address.items[0].streetName;

Trouble Parsing Array vs Non-Array JSON with JSONPath

I have JSON that looks like the below. I'm trying to use JSONPath to grab the __content__ value where the SKU is "8A-OK9F-9LI8" AND the Component.Type == 'Principal'. Right now, I am playing around with this JSON Path Expression Tester.
This JSONPath expression grabs all of the component information I need:
$.Order..Fulfillment[?(@.SKU=='8A-OK9F-9LI8')]..Component
But filtering further, such as $.Order..Fulfillment[?(@.SKU=='8A-OK9F-9LI8')]..Component[?(@.Type=='Principal')], grabs only one (I believe the array one) of the two Component elements I need. I suspect this is because one is an array and the other is a single JSON object. Is it possible to grab both with one expression, or do I have to combine several expressions (one for the array and one for the single object)? If so, how can I grab the other Component information that I am not currently getting with:
$.Order..Fulfillment[?(@.SKU=='8A-OK9F-9LI8')]..Component[?(@.Type=='Principal')]
Again, my goal is to grab the "__content__" value, filtered by a specific SKU and by Component.Type == 'Principal'. Something like:
$.Order..Fulfillment[?(@.SKU=='8A-OK9F-9LI8')]..Component[?(@.Type=='Principal')]..Amount..__content__
I'm expecting to get back ["8.49", "8.49"].
Here is the JSON I am testing with:
{
  "SettlementData": {},
  "Order": [
    {
      "OrderID": "XXX",
      "Fulfillment": {
        "Item": {
          "SKU": "8A-OK9F-9LI8",
          "Quantity": "1",
          "ItemPrice": {
            "Component": [
              {
                "Type": "Principal",
                "Amount": {
                  "__content__": "8.49",
                  "currency": "USD"
                }
              },
              {
                "Type": "Tax",
                "Amount": {
                  "__content__": "0.74",
                  "currency": "USD"
                }
              }
            ]
          }
        }
      }
    },
    {
      "OrderID": "XXX",
      "Fulfillment": {
        "Item": {
          "SKU": "8A-OK9F-9LI8",
          "Quantity": "1",
          "ItemPrice": {
            "Component": {
              "Type": "Principal",
              "Amount": {
                "__content__": "8.49",
                "currency": "USD"
              }
            }
          }
        }
      }
    }
  ]
}
I was able to solve this in two passes. In this example, #{sku} is Ruby string interpolation that inserts the SKU I am passing in:
$.Order..Fulfillment[?(@.SKU=='#{sku}')]..ItemPrice..[?(@.Type=='Principal')].Amount.__content__
$.Order..Fulfillment..Item[?(@.SKU=='#{sku}')]..ItemPrice..[?(@.Type=='Principal')].Amount.__content__
Using the Ruby gem "jsonpath", I was able to get the amounts I needed like this:
amount = JsonPath.on(settlement, "$.Order..Fulfillment[?(@.SKU=='#{sku}')]..ItemPrice..[?(@.Type=='Principal')].Amount.__content__")
  .map(&:to_f).inject(:+)
amount2 = JsonPath.on(settlement, "$.Order..Fulfillment..Item[?(@.SKU=='#{sku}')]..ItemPrice..[?(@.Type=='Principal')].Amount.__content__")
  .map(&:to_f).inject(:+)
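The array versus single-object mismatch can also be handled without JSONPath by normalizing each node to a list before filtering. The following is a minimal Python sketch of that idea, not the approach used above: the helper names `as_list` and `principal_amounts` are hypothetical, the `settlement` value is a trimmed copy of the sample JSON, and treating Fulfillment and Item as "either an object or an array" is an assumption.

```python
def as_list(value):
    """Wrap a lone object in a list so arrays and single objects are handled alike."""
    return value if isinstance(value, list) else [value]

def principal_amounts(settlement, sku):
    """Collect __content__ of every Principal component belonging to the given SKU."""
    amounts = []
    for order in settlement["Order"]:
        # Assumption: Fulfillment and Item may each be an object or an array.
        for fulfillment in as_list(order["Fulfillment"]):
            for item in as_list(fulfillment["Item"]):
                if item["SKU"] != sku:
                    continue
                for component in as_list(item["ItemPrice"]["Component"]):
                    if component["Type"] == "Principal":
                        amounts.append(component["Amount"]["__content__"])
    return amounts

# Trimmed copy of the sample JSON: Component is an array in the first order
# and a single object in the second.
settlement = {
    "Order": [
        {"Fulfillment": {"Item": {"SKU": "8A-OK9F-9LI8", "ItemPrice": {"Component": [
            {"Type": "Principal", "Amount": {"__content__": "8.49", "currency": "USD"}},
            {"Type": "Tax", "Amount": {"__content__": "0.74", "currency": "USD"}},
        ]}}}},
        {"Fulfillment": {"Item": {"SKU": "8A-OK9F-9LI8", "ItemPrice": {"Component":
            {"Type": "Principal", "Amount": {"__content__": "8.49", "currency": "USD"}}
        }}}},
    ]
}

amounts = principal_amounts(settlement, "8A-OK9F-9LI8")
print(amounts)
```

Normalizing first means a single pass covers both shapes, at the cost of spelling out the traversal by hand.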