JQ - Join nested arrays and filter - json

I'm trying to use JQ to create the following paths:
/staging-data-0/cassandra/cassandra-client-port
/staging-data-0/cassandra/cassandra-gossip-port
from the following blob of JSON (I've stripped unnecessary bits out):
{
"DebugConfig": {
"ServerPort": 8300,
"Services": [
{
"Checks": [
{
"CheckID": "cassandra-client-port",
"Timeout": "1s"
},
{
"CheckID": "cassandra-gossip-port",
"Timeout": "1s"
}
],
"Name": "cassandra"
},
{
"Checks": [
{
"CheckID": "cockroachdb-tcp",
"Timeout": "1s"
}
],
"Name": "cockroachdb"
}
]
},
"Member": {
"Name": "staging-data-0"
},
"Meta": {
"consul-network-segment": ""
}
}
I'm struggling with the JQ manual to generate the paths, I can only pull out the last part so far with
jq '.DebugConfig.Services | map(select(.Name=="cassandra")) | map(.Checks[].CheckID)'
The final path should be /{.Member.Name}/{.DebugConfig.Services.Name}/{.DebugConfig.Services.Checks.CheckID}

Only cassandra
jq -r '{a:.Member.Name, b:.DebugConfig.Services[]} | select(.b.Name=="cassandra") | {a:.a, b:.b.Name, c:.b.Checks[].CheckID} | [.a, .b, .c] | join("/")'
staging-data-0/cassandra/cassandra-client-port
staging-data-0/cassandra/cassandra-gossip-port
Both
jq -r '{a:.Member.Name, b:.DebugConfig.Services[]} | {a:.a, b:.b.Name, c:.b.Checks[].CheckID} | [.a, .b, .c] | join("/")'
staging-data-0/cassandra/cassandra-client-port
staging-data-0/cassandra/cassandra-gossip-port
staging-data-0/cockroachdb/cockroachdb-tcp

With your input, the jq filter:
.DebugConfig.Services[] as $s
| "/\(.Member.Name)/\($s.Name)/\($s.Checks[].CheckID)"
produces:
"/staging-data-0/cassandra/cassandra-client-port"
"/staging-data-0/cassandra/cassandra-gossip-port"
"/staging-data-0/cockroachdb/cockroachdb-tcp"
Since you only want the "cassandra" strings, you just need to interject a "select" filter:
.DebugConfig.Services[] as $s
| "/\(.Member.Name)/\($s.Name)/" +
($s
| select(.Name == "cassandra")
| .Checks[].CheckID)
but it's worth noting how easy it is to process all the "Checks" items.

Related

jq: map arrays to csv field headers

Is there a way to export a json like this:
{
"id":"2261026",
"meta":{
"versionId":"1",
"lastUpdated":"2021-11-08T15:13:39.318+01:00",
},
"address": [
"string-value1",
"string-value2"
],
"identifier":[
{
"system":"urn:oid:2.16.724.4.9.20.93",
"value":"6209"
},
{
"system":"urn:oid:2.16.724.4.9.20.2",
"value":"00042"
},
{
"system":"urn:oid:2.16.724.4.9.20.90",
"value":"UAB2"
}
]
}
{
"id":"2261027",
"meta":{
"versionId":"1",
"lastUpdated":"2021-11-08T15:13:39.318+01:00",
},
"address": [
"string-value1",
"string-value2",
"string-value3",
"string-value4"
],
"identifier":[
{
"system":"urn:oid:2.16.724.4.9.20.93",
"value":"6205"
},
{
"system":"urn:oid:2.16.724.4.9.20.2",
"value":"05041"
}
]
}
I'd like to get something like this:
"id","meta_versionId","meta_lastUpdated","address","identifier0_system","identifier0_value","identifier1_system","identifier1_value","identifier2_system","identifier2_value"
"2261026","1","2021-11-08T15:13:39.318+01:00","string-value1|string-value2","urn:oid:2.16.724.4.9.20.93","6209","urn:oid:2.16.724.4.9.20.2","00042","urn:oid:2.16.724.4.9.20.90","UAB2"
"2261027","1","2021-11-08T15:13:39.318+01:00","string-value1|string-value2|string-value3|string-value4","urn:oid:2.16.724.4.9.20.93","6205","urn:oid:2.16.724.4.9.20.2","05041",,
In short:
address array field string values has to be mapped joining its values using "|" character. Example: "string-value1|string-value2"
identifiers array field objects have to be mapped to "n-field-header". Example: "identifier0_system","identifier0_value","identifier1_system","identifier1_value","identifier2_system","identifier2_value,..."
Any ideas?
Try this
jq -r '[
.id,
(.meta | .versionId, .lastUpdated),
(.address | join("|")),
(.identifier[] | .system, .value)
] | #csv'
Demo
To prepend a header row with the number of identifierX_system and identifierX_value field pairs in it matching the length of the input's longest identifier array, try this
jq -rs '[
"id",
"meta_versionId", "meta_lastUpdated",
"address",
(
range([.[].identifier | length] | max)
| "identifier\(.)_system", "identifier\(.)_value"
)
], (.[] | [
.id,
(.meta | .versionId, .lastUpdated),
(.address | join("|")),
(.identifier[] | .system, .value)
]) | #csv'
Demo

JSON/JQ: Merge 2 files on key-value with condition

I have 2 JSON files. I would like to use jq to take the value of "capital" from File 2 and merge it with File 1 for each element where the same "name"-value pair occurs. Otherwise, the element from File 2 should not occur in the output. If there is no "name"-value pair for an element in File 1, it should have empty text for "capital."
File 1:
{
"countries":[
{
"name":"china",
"continent":"asia"
},
{
"name":"france",
"continent":"europe"
}
]
}
File 2:
{
"countries":[
{
"name":"china",
"capital":"beijing"
},
{
"name":"argentina",
"capital":"buenos aires"
}
]
}
Desired result:
{
"countries":[
{
"name":"china",
"continent":"asia",
"capital":"beijing"
},
{
"name":"france",
"continent":"europe",
"capital":""
}
]
}
You could first construct a dictionary from File2, and then perform the update, e.g. like so:
jq --argfile dict File2.json '
($dict.countries | map( {(.name): .capital}) | add) as $capitals
| .countries |= map( .capital = ($capitals[.name] // ""))
' File2.json
From a JSON-esque perspective, it would probably be better to use null for missing values; in that case, you could simplify the above by omitting // "".
Using INDEX/2
If your jq has INDEX/2, then the $capitals dictionary could be constructed using the expression:
INDEX($dict.countries[]; .name) | map_values(.capital)
Using INDEX makes the intention clearer, but if efficiency were a major concern, you'd probably be better off using reduce explicitly:
reduce $dict.countries[] as $c ({}; . + ($c | {(.name): .capital}))
One way:
$ jq --slurpfile file2 file2.json '
{ countries:
[ .countries[] |
. as $curr |
$curr + { capital: (($file2[0].countries[] | select(.name == $curr.name) | .capital) // "") }
]
}' file1.json
{
"countries": [
{
"name": "china",
"continent": "asia",
"capital": "beijing"
},
{
"name": "france",
"continent": "europe",
"capital": ""
}
]
}
An alternative:
$ jq -n '{ countries: ([inputs] | map(.countries) | flatten | group_by(.name) |
map(select(.[] | has("continent")) | add | .capital //= ""))
}' file[12].json

Parsing json values using jq

I am trying to get values "en" of a JSON structure using jq on the linux command line.
find . -name "*.json" -exec jq -r \ '(input_filename | gsub("^\\./|\\.json$";"")) as $fname (map(.tags) | .[] | .[] | .tag.en ) as $tags | "\($fname)&\($tags)"' '{}' +
i have more than 5000 files, start from 0001.json 0002.json .. 5000.json
This is a simple file 0001.json
{
"result": {
"tags": [
{ "confidence": 100, "tag": { "en": "turbine" } },
{ "confidence": 64.8014373779297, "tag": { "en": "wind" } },
{ "confidence": 63.3033409118652, "tag": { "en": "generator" } },
{ "confidence": 7.27894926071167, "tag": { "en": "device" } },
{ "confidence": 7.01708889007568, "tag": { "en": "line" } }
]
},
"status": { "text": "", "type": "success" }
}
i get this result :
0001&turbine
0001&wind
0001&generator
0001&device
0001&line
jq: error (at ./0001.json:0): Cannot iterate over null (null)
Ouptut..
jq: error (at ./0002.json:0): Cannot iterate over null (null)
Output..
jq: error (at ./0003.json:0): Cannot iterate over null (null)
My Desired Output in one file from all json files results.
filename&enValue:confidenceValue
0001&turbine:100,wind:64,generator:63,device:7,line:7
0002&...
0003&...
0004&...
The jq filter you want can be written as follows:
(input_filename | gsub("^\\./|\\.json$";"")) as $fname
| ( [ .result.tags[] | [.tag.en, (.confidence | floor)] | join(":") ]
| join(",") ) as $tags
| "\($fname)&\($tags)"

Filter object or array

I would like to list all the Ids and roles in a given json but where there is only a single role, rather than an array of 1 it provides it as an object, so if I run "[]?" I get the error Cannot index string with string "Name".
Extract (example.json):
{
"Person": [
{
"Roles": {
"Role": {
"#Id": "1",
"Name": "Job1"
}
}
},
{
"Roles": {
"Role": [
{
"#Id": "2",
"Name": "Job2"
},
{
"#Id": "3",
"Name": "Job3"
}
]
}
}
]
}
I hoped this may work:
jq -r . | '.Roles.Role[]?>.#Id + "," + .Roles.Role[]?>.Name'
This is the output I'd like (so I can pipe to a csv)
1,Job1
2,Job2
3,Job3
The following produces the CSV shown below. It would be easy to tweak the program to remove the double-quotation marks, etc.
.Person[]
| .Roles.Role
| if type == "array" then .[] else . end
| [.["#Id"], .Name]
| #csv
Output
"1","Job1"
"2","Job2"
"3","Job3"
Adding the index in .Person
.Person
| range(0; length) as $ix
| .[$ix]
| .Roles.Role
| if type == "array" then .[] else . end
| [$ix, .["#Id"], .Name]
| #csv

Json search and print

I've been trying to use jq parser to help me extract information from json files.
Here is an example snippet
{
"main_attribute": {
"name": {
"display_name": "abc"
},
"address": {
"unit": "1",
"street": "Dundas",
"suburb": "Syd",
"state": "NSW"
},
"financial_debt": {
"bank_loan": true
}
},
"secondary_attr": {
"income": {
"pretax": 100000
},
"automobile": {
"make": "Citroen",
"model": 2015,
"new": true
},
"property": {
"property_owned": 1,
"owned_since": 2000,
"first_sale": true
},
"education": {
"degree": "MS",
"graduated": 1990,
"financial_debt": {
"bank_loan": false
}
}
}
}
I need to find the blocks where "financial_debt" is true. This field could be either in the main_attribute (as a global value) or in the secondary attribute.
Expected output:
financial_debt: bank_loan on "automobile" and "property"
Can you please advise how to go about doing this search using jq?
This is by no means the most efficient way, but it is functional. It returns a boolean value specifying whether or not there is a true boolean value under the financial_debt property.
jq '[recurse | .financial_debt? | select(. != null) | recurse | booleans] | any'
tostream can be used to find paths containing "financial_debt" as follows:
tostream
| select(length==2)
| select(.[0] | contains(["financial_debt"]))
with this filter in filter.jq and data in data.json
$ jq -M -c -f filter.jq data.json
produces
[["main_attribute","financial_debt","bank_loan"],true]
[["secondary_attr","education","financial_debt","bank_loan"],false]
This intermediate result can be used along with reduce, setpath, getpath and a filter such as
. as $d
| reduce ( tostream
| select(length==2)
| select(.[0] | contains(["financial_debt"]))) as [$p,$v] (
{}
; setpath($p[:-1]; $d | getpath($p[:-1]))
)
to produce
{
"main_attribute": {
"financial_debt": {
"bank_loan": true
}
},
"secondary_attr": {
"education": {
"financial_debt": {
"bank_loan": false
}
}
}
}