Parsing json values using jq

I am trying to get the "en" values of a JSON structure using jq on the Linux command line.
find . -name "*.json" -exec jq -r \
  '(input_filename | gsub("^\\./|\\.json$";"")) as $fname
   | (map(.tags) | .[] | .[] | .tag.en ) as $tags
   | "\($fname)&\($tags)"' '{}' +
I have more than 5000 files, starting from 0001.json, 0002.json .. 5000.json
This is a sample file, 0001.json:
{
"result": {
"tags": [
{ "confidence": 100, "tag": { "en": "turbine" } },
{ "confidence": 64.8014373779297, "tag": { "en": "wind" } },
{ "confidence": 63.3033409118652, "tag": { "en": "generator" } },
{ "confidence": 7.27894926071167, "tag": { "en": "device" } },
{ "confidence": 7.01708889007568, "tag": { "en": "line" } }
]
},
"status": { "text": "", "type": "success" }
}
I get this result:
0001&turbine
0001&wind
0001&generator
0001&device
0001&line
jq: error (at ./0001.json:0): Cannot iterate over null (null)
Output..
jq: error (at ./0002.json:0): Cannot iterate over null (null)
Output..
jq: error (at ./0003.json:0): Cannot iterate over null (null)
My desired output, in a single file, combining the results from all the JSON files:
filename&enValue:confidenceValue
0001&turbine:100,wind:64,generator:63,device:7,line:7
0002&...
0003&...
0004&...

The jq filter you want can be written as follows:
(input_filename | gsub("^\\./|\\.json$";"")) as $fname
| ( [ .result.tags[] | [.tag.en, (.confidence | floor)] | join(":") ]
| join(",") ) as $tags
| "\($fname)&\($tags)"
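To cross-check what the filter computes, here is a rough Python sketch of the same steps. The function name format_tags and the shortened inline sample are illustrative assumptions, not part of the original answer; math.floor stands in for jq's floor.

```python
import json
import math
import re


def format_tags(filename: str, doc: dict) -> str:
    """Build 'name&tag:confidence,...' the way the jq filter above does."""
    # Mirror gsub("^\\./|\\.json$"; ""): drop a leading "./" and a trailing ".json".
    fname = re.sub(r"^\./|\.json$", "", filename)
    # One "en:confidence" pair per tag, with the confidence rounded down.
    pairs = [
        f"{t['tag']['en']}:{math.floor(t['confidence'])}"
        for t in doc["result"]["tags"]
    ]
    return f"{fname}&{','.join(pairs)}"


# Shortened version of 0001.json from the question.
sample = json.loads("""
{"result": {"tags": [
  {"confidence": 100, "tag": {"en": "turbine"}},
  {"confidence": 64.8014373779297, "tag": {"en": "wind"}},
  {"confidence": 7.01708889007568, "tag": {"en": "line"}}
]}, "status": {"text": "", "type": "success"}}
""")

print(format_tags("./0001.json", sample))
```

The key difference from the attempt in the question is that the tags are reached through .result.tags rather than by mapping over the top-level object, which also visits "status" and hits null. The jq filter itself should slot into the same find ... -exec jq -r '...' '{}' + invocation used in the question.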

Related

Use JQ to output JSON nested object into array, before conversion to CSV

Question is an extension of previous solution:
Use JQ to parse JSON array of objects, using select to match specified key-value in the object element, then convert to CSV
Data Source:
{
"Other": [],
"Objects": [
{
"ObjectElementName": "Test 123",
"ObjectElementArray": [],
"ObjectNested": {
"0": 20,
"1": 10.5
},
"ObjectElementUnit": "1"
},
{
"ObjectElementName": "Test ABC 1",
"ObjectElementArray": [],
"ObjectNested": {
"0": 0
},
"ObjectElementUnit": "2"
},
{
"ObjectElementName": "Test ABC 2",
"ObjectElementArray": [],
"ObjectNested": {
"0": 15,
"1": 20
},
"ObjectElementUnit": "5"
}
],
"Language": "en-US"
}
JQ command to extract [FAILS]
jq -r '.Objects[]
| select(.ObjectElementName | test("ABC"))
| [.ObjectElementName,.ObjectNested,.ObjectElementUnit]
| @csv' input.json
Output CSV required (or variation, so long as ObjectNested appears into a single column in CSV)
ObjectElementName,ObjectNested,ObjectElementUnit
"Test ABC 1","0:0","2"
"Test ABC 2","0:15,1:20","5"

With keys_unsorted and string interpolation, it's easy to turn ObjectNested into the form you desired:
.Objects[] | select(.ObjectElementName | index("ABC")) | [
.ObjectElementName,
([.ObjectNested | keys_unsorted[] as $k | "\($k):\(.[$k])"] | join(",")),
.ObjectElementUnit
] | @csv
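As a sanity check on the expected CSV, the same flattening can be sketched in Python. This is only an illustration: rows_matching is a made-up helper, the data is a trimmed copy of the question's input, and csv.QUOTE_ALL is used so that every field is quoted like the string fields in the desired output.

```python
import csv
import io


def rows_matching(doc: dict, needle: str) -> str:
    """Return CSV text with ObjectNested flattened to 'key:value,...'."""
    out = io.StringIO()
    writer = csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n")
    for obj in doc["Objects"]:
        # Same role as select(.ObjectElementName | index("ABC")).
        if needle not in obj["ObjectElementName"]:
            continue
        # Join each key and value with ":" and the pairs with ",",
        # keeping the keys in their original order (like keys_unsorted).
        nested = ",".join(f"{k}:{v}" for k, v in obj["ObjectNested"].items())
        writer.writerow([obj["ObjectElementName"], nested, obj["ObjectElementUnit"]])
    return out.getvalue()


data = {
    "Objects": [
        {"ObjectElementName": "Test 123", "ObjectNested": {"0": 20, "1": 10.5}, "ObjectElementUnit": "1"},
        {"ObjectElementName": "Test ABC 1", "ObjectNested": {"0": 0}, "ObjectElementUnit": "2"},
        {"ObjectElementName": "Test ABC 2", "ObjectNested": {"0": 15, "1": 20}, "ObjectElementUnit": "5"},
    ]
}

print(rows_matching(data, "ABC"), end="")
```

The point in both versions is that the nested object has to become a single string before it reaches the CSV formatter, because a CSV cell cannot hold an object.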

Filter object or array

I would like to list all the Ids and role names in a given JSON document. The problem is that where there is only a single role, it is provided as an object rather than as an array of one element, so if I use "[]?" I get the error Cannot index string with string "Name".
Extract (example.json):
{
"Person": [
{
"Roles": {
"Role": {
"#Id": "1",
"Name": "Job1"
}
}
},
{
"Roles": {
"Role": [
{
"#Id": "2",
"Name": "Job2"
},
{
"#Id": "3",
"Name": "Job3"
}
]
}
}
]
}
I hoped this may work:
jq -r . | '.Roles.Role[]?>.#Id + "," + .Roles.Role[]?>.Name'
This is the output I'd like (so I can pipe to a csv)
1,Job1
2,Job2
3,Job3

The following produces the CSV shown below. It would be easy to tweak the program to remove the double-quotation marks, etc.
.Person[]
| .Roles.Role
| if type == "array" then .[] else . end
| [.["#Id"], .Name]
| @csv
Output
"1","Job1"
"2","Job2"
"3","Job3"
Adding the index in .Person
.Person
| range(0; length) as $ix
| .[$ix]
| .Roles.Role
| if type == "array" then .[] else . end
| [$ix, .["#Id"], .Name]
| @csv
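The "if type == "array" then .[] else . end" step is the heart of this answer: it normalises "one object or a list of objects" into a stream of objects. A minimal Python sketch of the same idea follows; role_rows is a hypothetical name, and the key "#Id" is copied from the example data as shown.

```python
def role_rows(doc: dict) -> list:
    """Return (person index, id, name) for every role, whether Role is an object or a list."""
    rows = []
    for ix, person in enumerate(doc["Person"]):
        role = person["Roles"]["Role"]
        # Wrap a lone object in a list so both shapes are handled the same way.
        roles = role if isinstance(role, list) else [role]
        for r in roles:
            rows.append((ix, r["#Id"], r["Name"]))
    return rows


example = {
    "Person": [
        {"Roles": {"Role": {"#Id": "1", "Name": "Job1"}}},
        {"Roles": {"Role": [{"#Id": "2", "Name": "Job2"}, {"#Id": "3", "Name": "Job3"}]}},
    ]
}

for row in role_rows(example):
    print(",".join(str(field) for field in row))
```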

jq get the value of x based on y in a complex json file

jq strikes again. Trying to get the value of DATABASES_DEFAULT based on the name in a json file that has a whole lot of names and I'm completely lost.
My file looks like the following (output of an aws ecs describe-task-definition) only much more complex; I've stripped this to the most basic example I can where the structure is still intact.
{
"taskDefinition": {
"status": "bar",
"family": "bar2",
"volumes": [],
"taskDefinitionArn": "bar3",
"containerDefinitions": [
{
"dnsSearchDomains": [],
"environment": [
{
"name": "bar4",
"value": "bar5"
},
{
"name": "bar6",
"value": "bar7"
},
{
"name": "DATABASES_DEFAULT",
"value": "foo"
}
],
"name": "baz",
"links": []
},
{
"dnsSearchDomains": [],
"environment": [
{
"name": "bar4",
"value": "bar5"
},
{
"name": "bar6",
"value": "bar7"
},
{
"name": "DATABASES_DEFAULT",
"value": "foo2"
}
],
"name": "boo",
"links": []
}
],
"revision": 1
}
}
I need the value of DATABASES_DEFAULT where the name is baz. Note that there are a lot of keypairs with name, I'm specifically talking about the one outside of environment.
I've been tinkering with this but only got this far before realizing that I don't understand how to access nested values.
jq '.[] | select(.name==DATABASES_DEFAULT) | .value'
which is returning
jq: error: DATABASES_DEFAULT/0 is not defined at <top-level>, line 1:
.[] | select(.name==DATABASES_DEFAULT) | .value
jq: 1 compile error
Obviously this a) doesn't work, and b) even if it did, it's independent of the name value. My thought was to return all the db defaults and then identify the one with baz, but I don't know if that's the right approach.

I like to think of it as digging down into the structure, so first you open the outer layers:
.taskDefinition.containerDefinitions[]
Now select the one you want:
select(.name == "baz")
Open the inner structure:
.environment[]
Select the desired object:
select(.name == "DATABASES_DEFAULT")
Choose the key you want:
.value
Taken together:
parse.jq
.taskDefinition.containerDefinitions[] |
select(.name == "baz") |
.environment[] |
select(.name == "DATABASES_DEFAULT") |
.value
Run it like this:
<infile jq -f parse.jq
Output:
"foo"
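If it helps to see the same "dig down, select, dig down, select" idea as ordinary loops, here is a small Python sketch. The function env_value and the trimmed task dictionary are assumptions made for illustration. Unlike the jq pipeline, which emits every match, this sketch returns only the first one.

```python
def env_value(task: dict, container: str, variable: str):
    """Return the value of `variable` in the environment of the named container, or None."""
    # Outer layer: .taskDefinition.containerDefinitions[]
    for definition in task["taskDefinition"]["containerDefinitions"]:
        # select(.name == "baz")
        if definition["name"] != container:
            continue
        # Inner layer: .environment[] | select(.name == "DATABASES_DEFAULT") | .value
        for entry in definition["environment"]:
            if entry["name"] == variable:
                return entry["value"]
    return None


task = {
    "taskDefinition": {
        "containerDefinitions": [
            {"name": "baz", "environment": [
                {"name": "bar4", "value": "bar5"},
                {"name": "DATABASES_DEFAULT", "value": "foo"},
            ]},
            {"name": "boo", "environment": [
                {"name": "DATABASES_DEFAULT", "value": "foo2"},
            ]},
        ]
    }
}

print(env_value(task, "baz", "DATABASES_DEFAULT"))
```

Note that the container's own name is checked before descending into environment, which is what keeps the two different "name" keys from being confused.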

The following seems to work:
.taskDefinition.containerDefinitions[] |
select(
select(
.environment[] | .name == "DATABASES_DEFAULT"
).name == "baz"
)
The output is the object with the name key mapped to "baz".
$ jq '.taskDefinition.containerDefinitions[] | select(select(.environment[]|.name == "DATABASES_DEFAULT").name=="baz")' tmp.json
{
"dnsSearchDomains": [],
"environment": [
{
"name": "bar4",
"value": "bar5"
},
{
"name": "bar6",
"value": "bar7"
},
{
"name": "DATABASES_DEFAULT",
"value": "foo"
}
],
"name": "baz",
"links": []
}

How to get a flat output based on conditional jq query?

I have the following JSON:
[
{
"name": "InstanceA",
"tags": [
{
"key": "environment",
"value": "production"
},
{
"key": "group",
"value": "group1"
}
]
},
{
"name": "InstanceB",
"tags": [
{
"key": "group",
"value": "group2"
},
{
"key": "environment",
"value": "staging"
}
]
}
]
I'm trying to get a flat output of value based on the condition key == 'environment'. I already tried select(boolean_expression), but I cannot get the desired output, like:
"InstanceA, production"
"InstanceB, staging"
Does jq support this kind of output? If so, how to do it?

Yes.
For example:
$ jq '.[] | "\(.name), \(.tags | from_entries | .environment)"' input.json
Output:
"InstanceA, production"
"InstanceB, staging"
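What from_entries does here is turn the list of key/value pairs into a plain object, so that .environment becomes an ordinary lookup. A rough Python equivalent, with environment_lines as an illustrative name only:

```python
def environment_lines(instances: list) -> list:
    """Return 'name, environment' for each instance."""
    lines = []
    for inst in instances:
        # Like jq's from_entries: [{"key": k, "value": v}, ...] becomes {k: v}.
        tags = {t["key"]: t["value"] for t in inst["tags"]}
        lines.append(f"{inst['name']}, {tags.get('environment')}")
    return lines


instances = [
    {"name": "InstanceA", "tags": [
        {"key": "environment", "value": "production"},
        {"key": "group", "value": "group1"},
    ]},
    {"name": "InstanceB", "tags": [
        {"key": "group", "value": "group2"},
        {"key": "environment", "value": "staging"},
    ]},
]

for line in environment_lines(instances):
    print(line)
```

Because the tags are converted to a mapping first, the order in which "environment" and "group" appear does not matter.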

jq '.[] | .name + ", " + (.tags[] | select(.key == "environment").value)' f.json

Here is a solution using join
.[]
| [.name, (.tags[] | if .key == "environment" then .value else empty end)]
| join(", ")

Json search and print

I've been trying to use jq parser to help me extract information from json files.
Here is an example snippet
{
"main_attribute": {
"name": {
"display_name": "abc"
},
"address": {
"unit": "1",
"street": "Dundas",
"suburb": "Syd",
"state": "NSW"
},
"financial_debt": {
"bank_loan": true
}
},
"secondary_attr": {
"income": {
"pretax": 100000
},
"automobile": {
"make": "Citroen",
"model": 2015,
"new": true
},
"property": {
"property_owned": 1,
"owned_since": 2000,
"first_sale": true
},
"education": {
"degree": "MS",
"graduated": 1990,
"financial_debt": {
"bank_loan": false
}
}
}
}
I need to find the blocks where "financial_debt" is true. This field could be either in the main_attribute (as a global value) or in the secondary attribute.
Expected output:
financial_debt: bank_loan on "automobile" and "property"
Can you please advise how to go about doing this search using jq?

This is by no means the most efficient way, but it is functional. It returns a boolean value specifying whether or not there is a true boolean value under the financial_debt property.
jq '[recurse | .financial_debt? | select(. != null) | recurse | booleans] | any'

tostream can be used to find paths containing "financial_debt" as follows:
tostream
| select(length==2)
| select(.[0] | contains(["financial_debt"]))
With this filter in filter.jq and the data in data.json, the command
$ jq -M -c -f filter.jq data.json
produces
[["main_attribute","financial_debt","bank_loan"],true]
[["secondary_attr","education","financial_debt","bank_loan"],false]
This intermediate result can be used along with reduce, setpath, getpath and a filter such as
. as $d
| reduce ( tostream
| select(length==2)
| select(.[0] | contains(["financial_debt"]))) as [$p,$v] (
{}
; setpath($p[:-1]; $d | getpath($p[:-1]))
)
to produce
{
"main_attribute": {
"financial_debt": {
"bank_loan": true
}
},
"secondary_attr": {
"education": {
"financial_debt": {
"bank_loan": false
}
}
}
}
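For anyone unsure what the tostream step yields, the [path, leaf] events can be imitated with a short recursive generator in Python. This is a sketch under assumptions: leaf_paths and debt_paths are hypothetical names, and the data is a cut-down copy of the question's snippet.

```python
def leaf_paths(node, path=()):
    """Yield (path, value) for every leaf, similar to jq's tostream events of length 2."""
    if isinstance(node, dict) and node:
        for key, value in node.items():
            yield from leaf_paths(value, path + (key,))
    elif isinstance(node, list) and node:
        for index, value in enumerate(node):
            yield from leaf_paths(value, path + (index,))
    else:
        # Scalars and empty containers are treated as leaves.
        yield list(path), node


def debt_paths(doc):
    """Keep only the leaves whose path passes through a "financial_debt" key."""
    return [(p, v) for p, v in leaf_paths(doc) if "financial_debt" in p]


data = {
    "main_attribute": {
        "name": {"display_name": "abc"},
        "financial_debt": {"bank_loan": True},
    },
    "secondary_attr": {
        "income": {"pretax": 100000},
        "education": {"degree": "MS", "financial_debt": {"bank_loan": False}},
    },
}

for path, value in debt_paths(data):
    print(path, value)
```

One difference worth knowing: jq's contains compares strings by substring, so contains(["financial_debt"]) would also match a key such as "financial_debt_old", whereas the Python "in" test above requires an exact key.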
