I have JSON data like this:
{
"profiles": {
"auto_scaler": [
{
"auto_scaler_group_name": "myasg0",
"auto_scaler_group_options": {
":availability_zones": ["1a", "1b", "1c"],
":max_size": 1,
":min_size": 1,
":subnets": ["a", "b", "c"],
":tags": [
{":key": "Name", ":value": "app0" },
{":key": "env", ":value": "dev" },
{":key": "role", ":value": "app" },
{":key": "domain", ":value": "example.com" },
{":key": "fonzi_app", ":value": "true"},
{":key": "vpc", ":value": "nonprod"}
]
},
"dns_name": "fonz1"
},
{
"auto_scaler_group_name": "myasg1",
"auto_scaler_group_options": {
":availability_zones": ["1a", "1b", "1c"],
":max_size": 1,
":min_size": 1,
":subnets": ["a", "b", "c"],
":tags": [
{":key": "Name", ":value": "app1" },
{":key": "env", ":value": "dev" },
{":key": "role", ":value": "app" },
{":key": "domain", ":value": "example.com" },
{":key": "bozo_app", ":value": "true"},
{":key": "vpc", ":value": "nonprod"}
]
},
"dns_name": "bozo1"
}
]
}
}
I want to write a jq query that first selects the Hash element in the Array at .profiles.auto_scaler whose Array of Hashes at .auto_scaler_group_options[":tags"] contains a Hash with a ":key" key whose value contains "fonzi" and a ":value" key whose value is exactly "true", and then returns the value of the key dns_name.
In the example, the query would simply return "fonz1".
Does anyone know how to do this, if it is possible, using jq?
In brief, yes.
In long:
.profiles.auto_scaler[]
| .dns_name as $name
| .auto_scaler_group_options
| select( any(.[":tags"][];
(.[":key"] | index("fonzi")) and (.[":value"] == "true")) )
| $name
The output of the above is:
"fonz1"
The trick here is to extract the candidate .dns_name before diving more deeply into your "complex nested JSON".
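For reference, a complete invocation could look like the following, where profiles.json is just a placeholder name for a file holding the JSON shown in the question:
jq '.profiles.auto_scaler[]
    | .dns_name as $name
    | .auto_scaler_group_options
    | select( any(.[":tags"][];
        (.[":key"] | index("fonzi")) and (.[":value"] == "true")) )
    | $name' profiles.json
This prints "fonz1".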
An alternative
If your jq does not have any, you could (in this particular case) get away without it by replacing the select expression above with:
select( .[":tags"][]
| (.[":key"] | index("fonzi")) and (.[":value"] == "true") )
Be warned, though, that the semantics of the two expressions are slightly different. (Homework exercise: what is the difference?)
If your jq doesn't have any and if you want the semantics of any, then you could easily roll your own, or simply upgrade :-)
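For example, a minimal (non-short-circuiting) stand-in for any/2 could be defined along these lines:
def any(generator; condition):
  reduce (generator | condition) as $x (false; . or $x);
Unlike the builtin, this evaluates the entire stream rather than stopping at the first match, but it is sufficient for the select above.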
You can look up values in JSON in a number of ways:
by checking whether a property exists or not
by using the [property] syntax
by using the 'property' in object syntax
For your case, here's a small sample; you can further loop over the array and look for an element
containing a "key" key whose value contains EEE and a "value" key
whose value is exactly FFF
for (var k = 0; k < p['AAA']['BBB'].length; k++) {
    // inspect each element of the array at p['AAA']['BBB']
    console.log(p['AAA']['BBB'][k]);
}
where p is the parsed JSON object.
Hope that helps
Related
I'm interacting with an API that returns an array of objects related to a product in JSON format:
[
{
"type": "category",
"name": "food"
},
{
"type": "category",
"name": "fruit"
},
{
"type": "barcode",
"name": "123456"
}
]
I'm trying to use the jq tool in bash, in the shortest and tidiest form, to check if a product has a barcode and is categorized as food. In other words, check if an object with type=barcode exists in the array, and then check if there is an object with type=category together with name=food.
This outputs true or false based on your requirements:
any(.type=="barcode") and any(.type=="category" and .name=="food")
jq -e will set the exit code of the program accordingly:
if jq -e '...'; then
...
fi
Without -e:
if test "$(jq '...')" = true; then
...
fi
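Putting the two together, a sketch of the whole check, where products.json is only a placeholder name for wherever the API response is stored:
if jq -e 'any(.type=="barcode") and any(.type=="category" and .name=="food")' products.json > /dev/null; then
  echo "product has a barcode and is categorized as food"
fi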
And not necessarily shorter or tidier, but semantically easier to follow:
group_by(.type)
| map({
key: first.type,
value: map(.name)
})
| from_entries
| .category as $cat
| .barcode and ("food"|IN($cat[]))
This first builds an intermediate object of the form:
{
"barcode": [
"123456"
],
"category": [
"food",
"fruit"
]
}
You can then query this object, e.g.: does it have a barcode, and is "food" one of the categories?
Let's say I have an input JSON file as below, and I want to extract the output in CSV format with the fields below. Specifically, I want to get the value of the key "Gamma" in the output if the key "Gamma" exists in the "tags" map. If the key doesn't exist, it should just print a NULL value. The expected output is below.
generated_time,platform,id,,
2021-09-09:12:03:12,earth,2eeee67748,Ray,2021-08-25 09:41:06
2021-09-09:12:03:12,sun,xxxxx12334,NULL,2021-08-25 10:11:31
[
{
"generated_time": "generated_time",
"platform": "platform",
"id": "id"
},
{
"generated_time": "2021-09-09:12:03:12",
"platform": "earth",
"id": "2eeee67748",
"tags": {
"app": "map",
"Gamma": "Ray",
"null": [
"allow-all-humans"
]
},
"created": "2021-08-25 09:41:06"
},
{
"generated_time": "2021-09-09:12:03:12",
"platform": "sun",
"id": "xxxxx12334",
"tags": {
"component": "machine",
"environment": "hot",
"null": [
"aallow-all-humans"
]
},
"created": "2021-08-25 10:11:31"
}
]
jq has a builtin @csv which renders an array
as CSV with double quotes for strings, and quotes escaped by repetition.
If the additional quoting (as compared to your expected output) isn't an issue, the following
jq --raw-output '
# produce an array for each element in the input array
.[] | [
# containing the first three columns unchanged
.generated_time, .platform, .id,
# if the input element has a field named "tags"
if has("tags")
# then add two more columns and replace a nonexistent Gamma with "NULL"
then (.tags.Gamma // "NULL", .created)
# otherwise add two empty columns instead
else (null, null) end
# and convert the array into CSV format
] | @csv
' input.json
will produce
"generated_time","platform","id",,
"2021-09-09:12:03:12","earth","2eeee67748","Ray","2021-08-25 09:41:06"
"2021-09-09:12:03:12","sun","xxxxx12334","NULL","2021-08-25 10:11:31"
Preface: If the following is not possible with jq, then I completely accept that as an answer and will try to force this with bash.
I have two files that contain some IDs that, with some massaging, should be able to be combined into a single file. I have some static content that I'll add to that as well (as seen in the output). Essentially, the "mitre_test" values in in2.json should get compared to the "mitre_test" values in in1.json. When they match, the "mitreid" from in1.json becomes "techniqueID" in the output (and is generally the unifying field of each output object).
Caveats:
There are some junk "desc" values placed in the in1.json that are there to make sure this is as programmatic as possible, and there are actually numerous junk inputs on the true input file I am using.
some of the mitre_test values have pairs and are not in a real array. I can split on those and break them out, but find myself losing the other information from in1.json.
Notice in the "metadata" for the output that is contains the "number" values from in1.json, and stored in a weird way (but the way that the receiving tool requires).
in1.json
[
{
"test": "Execution",
"mitreid": "T1204.001",
"mitre_test": "90b"
},
{
"test": "Defense Evasion",
"mitreid": "T1070.001",
"mitre_test": "afa"
},
{
"test": "Credential Access",
"mitreid": "T1556.004",
"mitre_test": "14b"
},
{
"test": "Initial Access",
"mitreid": "T1200",
"mitre_test": "f22"
},
{
"test": "Impact",
"mitreid": "T1489",
"mitre_test": "fa2"
}
]
in2.json
[
{
"number": "REL0001346",
"desc": "apple",
"mitre_test": "afa"
},
{
"number": "REL0001343",
"desc": "pear",
"mitre_test": "90b"
},
{
"number": "REL0001366",
"desc": "orange",
"mitre_test": "14b,f22"
},
{
"number": "REL0001378",
"desc": "pineapple",
"mitre_test": "90b"
}
]
The output:
[{
"techniqueID": "T1070.001",
"tactic": "defense-evasion",
"score": 1,
"color": "",
"comment": "",
"enabled": true,
"metadata": [{
"name": "DET_ID",
"value": "REL0001346"
}],
"showSubtechniques": true
},
{
"techniqueID": "T1204.001",
"tactic": "execution",
"score": 1,
"color": "",
"comment": "",
"enabled": true,
"metadata": [{
"name": "DET_ID",
"value": "REL0001343"
},
{
"name": "DET_ID",
"value": "REL0001378"
}],
"showSubtechniques": true
},
{
"techniqueID": "T1556.004",
"tactic": "credential-access",
"score": 1,
"color": "",
"comment": "",
"enabled": true,
"metadata": [{
"name": "DET_ID",
"value": "REL0001366"
}],
"showSubtechniques": true
},
{
"techniqueID": "T1200",
"tactic": "initial-access",
"score": 1,
"color": "",
"comment": "",
"enabled": true,
"metadata": [{
"name": "DET_ID",
"value": "REL0001366"
}],
"showSubtechniques": true
}
]
I'm assuming I have some splitting to do on mitre_test with something like .mitre_test |= split(","), and I'm assuming there are some joins as well, but doing so causes data loss or mixing up of the data. You'll notice the static data in the output exists as well, but it is likely easy to place in and as such isn't as much of an issue.
Edit: reduced some of the match IDs so that it is easier to look at while analyzing the in1 and in2 files. Also simplified the two inputs to have a similar structure so that the answer is easier to understand later.
The requirements are somewhat opaque but it's fairly clear that if the task can be done by computer, it can be done using jq.
From the description, it would appear that one of the unusual aspects of the problem is that the "dictionary" defined by in1.json must be derived by splitting the key names that are CSV (comma-separated values). Here therefore is a jq def that will do that:
# Input: a JSON dictionary for which some keys are CSV,
# Output: a JSON dictionary with the CSV keys split on the commas
def refine:
. as $in
| reduce keys_unsorted[] as $k ({};
if ($k|index(","))
then ($k/",") as $keys
| . + ($keys | map( {(.): $in[$k]}) | add)
else .[$k] = $in[$k]
end );
You can see how this works by running:
INDEX($mitre[]; .mitre_test) | refine
using an invocation of jq such as:
jq --argfile mitre in1.json -f program.jq in2.json
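Alternatively, to see refine in isolation, here is a quick self-contained check; the def is repeated so the one-liner runs on its own, and the sample keys and values are made up:
echo '{"14b,f22": "orange", "90b": "pear"}' |
jq 'def refine:
      . as $in
      | reduce keys_unsorted[] as $k ({};
          if ($k|index(","))
          then ($k/",") as $keys | . + ($keys | map( {(.): $in[$k]} ) | add)
          else .[$k] = $in[$k]
          end);
    refine'
This yields {"14b": "orange", "f22": "orange", "90b": "pear"}.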
For the joining part of the problem, there are many relevant Q&As on SO, e.g.
How to join JSON objects on particular fields using jq?
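For illustration only, here is one sketch of the whole join against the simplified files shown above. It differs from the INDEX/refine approach in that it accumulates every matching "number" per key (so a key such as "90b" that occurs in several in2.json records keeps all of them), and it derives the tactic names by lowercasing and hyphenating the "test" field, which is an assumption based on the sample output:
jq -n --slurpfile t in1.json --slurpfile r in2.json '
  # build a dictionary mapping each individual mitre_test key to all matching numbers
  ( reduce ( $r[0][] | .number as $n | (.mitre_test / ",")[] | {key: ., n: $n} ) as $e
      ({}; .[$e.key] += [$e.n]) ) as $dict
  # shape one output object per in1.json entry, keeping only entries with matches
  | [ $t[0][]
      | { techniqueID: .mitreid,
          tactic: (.test | ascii_downcase | gsub(" "; "-")),
          score: 1, color: "", comment: "", enabled: true,
          metadata: [ ($dict[.mitre_test] // [])[] | {name: "DET_ID", value: .} ],
          showSubtechniques: true }
      | select(.metadata | length > 0) ]
'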
There is probably a much more elegant way to do this, but I ended up manually walking around things and piping to new output.
Explanation:
Read in both files, pull the fields I need.
Break out the mitre_test values that were previously just a comma-separated set of values, using map and try.
Store the unchanging fields as a variable and then manipulate mitre_test to become an appropriately split array, removing nulls.
Group by mitre_test values, since they are the common thing that the output is based on.
Clean up more nulls.
Sort output to look like I want it.
jq . in1.json in2.json | \
jq '.[] |{number: .number, test: .test, mitreid: .mitreid, mitre_test: .mitre_test}' |\
jq -s '[. |map(try(.mitre_test |= split(",")) // .)|
.[] | [.number,.test,.mitreid] as $h | .mitre_test[] |$h + [.] |
{DET_ID: .[0], tactic: .[1], techniqueID: .[2], mitre_test: .[3]}] |
del(.[][] | nulls)' |jq '[group_by(.mitre_test)[]|{mitre_test: .[0].mitre_test, techniqueID: [.[].techniqueID],tactic: [.[].tactic], DET_ID: [.[].DET_ID]}]|
del(.[].techniqueID[] | nulls) | del(.[].tactic[] | nulls) | del(.[].DET_ID[] | nulls)' | \
jq '.[]| [{techniqueID: .techniqueID[0],tactic: .tactic[0], metadata: [{name: "DET_ID",value: .DET_ID[]}]}] | .[] |
select((.metadata|length)>0)'
It was a long line, so I split it among some of the basic ideas.
I have this JSON document:
{
"1": {
"a": "G1"
},
"2": {
"a": "GM1"
}
}
My expected result should be:
1,G1
2,GM1
With *.a I get
[
"G1",
"GM1"
]
but I am absolutely stuck for the rest.
Sadly, there is not much you can do that would fully match your use case and scale properly.
This is because JMESPath does not have a way to reference the parent element; this has been requested before, and would have allowed something like
*.[join(',', [keys($), a])]
You can definitely extract a list of keys and values, thanks to the function keys:
@.{keys: keys(@), values: *.a}
That gives
{
"keys": [
"1",
"2"
],
"values": [
"G1",
"GM1"
]
}
But then you just fall under the same case as this other question, because keys will give you a list of keys.
You can also end up with a list of lists:
@.[keys(@), *.a]
Will give you:
[
[
"1",
"2"
],
[
"G1",
"GM1"
]
]
And you can even go further and flatten it if needed:
@.[keys(@), *.a] []
Gives:
[
"1",
"2",
"G1",
"GM1"
]
With all this, if you happen to have a list of exactly two items, then a solution would be to use a combination of join and slice:
@.[join(',',[keys(@),*.a][] | [::2]), join(',',[keys(@),*.a][] | [1::2])]
That would give the expected:
[
"1,G1",
"2,GM1"
]
But, sadly, as soon as you have more than two items to consider, you would end up with buggy output:
[
"1,3,G1,GM3",
"2,4,GM1,GM4"
]
With a data set of
{
"1": {
"a": "G1"
},
"2": {
"a": "GM1"
},
"3": {
"a": "GM3"
},
"4": {
"a": "GM4"
}
}
And then, of course, the same can be achieved by hardcoding indexes:
@.[join(',', [keys(@)[0], *.a | [0]]), join(',', [keys(@)[1], *.a | [1]])]
That also gives the expected:
[
"1,G1",
"2,GM1"
]
But, sadly, this only works if you know in advance the number of rows that are going to be returned to you.
And if you want a single string, given that the place where you want to feed the data accepts \n as a newline, you can join the whole array again:
@.[join(',', [keys(@)[0], *.a | [0]]), join(',', [keys(@)[1], *.a | [1]])].join(`\n`,@)
Will give:
"1,G1\n2,GM1"
Finally this expression worked 100% for me:
[{key1:keys(@)[0],a:*.a| [0]},{key1:keys(@)[1],a:*.a| [1]}]
I have some JSON output I am trying to parse with jq. I read some examples on filtering, but I don't really understand them, and my output is more complicated than the examples. I have no idea where to even begin beyond jq '.[]', as I don't understand the syntax of jq beyond that, and the hierarchy and terminology are challenging as well. My JSON output is below. I want to return the value of Valid where ItemName equals Item_2. How can I do this?
"1"
[
{
"GroupId": "1569",
"Title": "My_title",
"Logo": "logo.jpg",
"Tags": [
"tag1",
"tag2",
"tag3"
],
"Owner": [
{
"Name": "John Doe",
"Id": "53335"
}
],
"ItemId": "209766",
"Item": [
{
"Id": 47744,
"ItemName": "Item_1",
"Valid": false
},
{
"Id": 47872,
"ItemName": "Item_2",
"Valid": true
},
{
"Id": 47872,
"ItemName": "Item_3",
"Valid": false
}
]
}
]
"Browse"
"8fj9438jgge9hdfv0jj0en34ijnd9nnf"
"v9er84n9ogjuwheofn9gerinneorheoj"
Except for the initial and trailing JSON scalars, you'd simply write:
.[] | .Item[] | select( .ItemName == "Item_2" ) | .Valid
In your particular case, to ensure the top-level JSON scalars are ignored, you could prefix the above with:
arrays |
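So, assuming the whole response (scalars and all) is saved in a file named response.json (a placeholder name), the full command would be something like:
jq 'arrays | .[] | .Item[] | select( .ItemName == "Item_2" ) | .Valid' response.json
which prints true.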