jq: group and key by property - json

I have a list of objects that look like this:
[
{
"ip": "1.1.1.1",
"component": "name1"
},
{
"ip": "1.1.1.2",
"component": "name1"
},
{
"ip": "1.1.1.3",
"component": "name2"
},
{
"ip": "1.1.1.4",
"component": "name2"
}
]
Now I'd like to group and key that by the component and assign a list of ips to each of the components:
{
"name1": [
"1.1.1.1",
"1.1.1.2"
]
},{
"name2": [
"1.1.1.3",
"1.1.1.4"
]
}

I figured it out myself. I first group by .component and then just create new lists of ips that are indexed by the component of the first object of each group:
jq ' group_by(.component)[] | {(.[0].component): [.[] | .ip]}'

The accepted answer doesn't produce valid JSON as a whole; it outputs a stream of separate objects:
{
"name1": [
"1.1.1.1",
"1.1.1.2"
]
}
{
"name2": [
"1.1.1.3",
"1.1.1.4"
]
}
The name1 and name2 objects are each valid JSON on their own, but the output as a whole isn't.
The following jq statement results in the desired output as specified in the question:
group_by(.component) | map({ key: (.[0].component), value: [.[] | .ip] }) | from_entries
Output:
{
"name1": [
"1.1.1.1",
"1.1.1.2"
],
"name2": [
"1.1.1.3",
"1.1.1.4"
]
}
Suggestions for simpler approaches are welcome.
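To see why the from_entries version yields a single object, it helps to inspect the intermediate entries array that map(...) builds (note map(.ip) is equivalent to [.[] | .ip]). A self-contained sketch using the sample data from the question; the file name sample.json is just illustrative:

```shell
cat > sample.json <<'EOF'
[{"ip":"1.1.1.1","component":"name1"},
 {"ip":"1.1.1.2","component":"name1"},
 {"ip":"1.1.1.3","component":"name2"},
 {"ip":"1.1.1.4","component":"name2"}]
EOF

# The {key, value} entries array that from_entries collapses into one object:
jq -c 'group_by(.component) | map({key: .[0].component, value: map(.ip)})' sample.json
# → [{"key":"name1","value":["1.1.1.1","1.1.1.2"]},{"key":"name2","value":["1.1.1.3","1.1.1.4"]}]
```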
If human readability is preferred over valid json, I'd suggest something like ...
jq -r 'group_by(.component)[] | "IPs for " + .[0].component + ": " + (map(.ip) | tostring)'
... which results in ...
IPs for name1: ["1.1.1.1","1.1.1.2"]
IPs for name2: ["1.1.1.3","1.1.1.4"]

As a further example of @replay's technique, after many failures using other methods, I finally built a filter that condenses this Wazuh report (excerpted for brevity):
{
"took" : 228,
"timed_out" : false,
"hits" : {
"total" : {
"value" : 2806,
"relation" : "eq"
},
"hits" : [
{
"_source" : {
"agent" : {
"name" : "100360xx"
},
"data" : {
"vulnerability" : {
"severity" : "High",
"package" : {
"condition" : "less than 78.0",
"name" : "Mozilla Firefox 68.11.0 ESR (x64 en-US)"
}
}
}
}
},
{
"_source" : {
"agent" : {
"name" : "100360xx"
},
"data" : {
"vulnerability" : {
"severity" : "High",
"package" : {
"condition" : "less than 78.0",
"name" : "Mozilla Firefox 68.11.0 ESR (x64 en-US)"
}
}
}
}
},
...
Here is the jq filter I use to produce a stream of objects, each consisting of an agent name paired with an array of names of the agent's vulnerable packages:
jq '.hits.hits |= unique_by(._source.agent.name, ._source.data.vulnerability.package.name)
    | .hits.hits
    | group_by(._source.agent.name)[]
    | { (.[0]._source.agent.name): [ .[]._source.data.vulnerability.package.name ] }'
Here is an excerpt of the output produced by the filter:
{
"100360xx": [
"Mozilla Firefox 68.11.0 ESR (x64 en-US)",
"VLC media player",
"Windows 10"
]
}
{
"WIN-KD5C4xxx": [
"Windows Server 2019"
]
}
{
"fridxxx": [
"java-1.8.0-openjdk",
"kernel",
"kernel-headers",
"kernel-tools",
"kernel-tools-libs",
"python-perf"
]
}
{
"mcd-xxx-xxx": [
"dbus",
"fribidi",
"gnupg2",
"graphite2",
...


Array and String cannot have their containment checked error when trying to search array using jq

I have a json file that looks roughly like this:
{
"default": [
{
"name" : "Joe Bloggs",
"email" : "joe.bloggs#business.org"
}
],
"groups": [
{
"recipients" : [
{
"name" : "Jane Bloggs",
"email" : "jane.bloggs#business.org"
}
],
"orgs" : [
"Service A",
"Service B",
"Service C"
]
},
{
"recipients" : [
{
"name" : "Bill Gates",
"email" : "bill.gates#business.org"
}
],
"orgs" : [
"Service D",
"Service E"
]
},
{
"recipients" : [
{
"name" : "Steve Jobs",
"email" : "steve.jobs#me.com"
}
],
"orgs" : [
"Service F",
"Service G"
]
}
]
}
Using jq I want to be able to search using one of the orgs, for example 'Service A', and return only the recipients information.
I can search recipients easily enough using jq like:
cat /path/to/file.json | jq -r '.groups[] | .recipients[] | select(.name | contains("Jobs"))'
to return
{
"name": "Steve Jobs",
"email": "steve.jobs#me.com"
}
But If I try to search via the orgs array, I get an error:
cat /path/to/file.json | jq -r '.groups[] | select(.orgs | contains("Service A"))'
jq: error (at <stdin>:46): array (["Service A...) and string ("Service A") cannot have their containment checked
Is it possible to do what I am looking for with jq?
Instead of contains you'll need index to check whether the array has an element with the value "Service A":
.groups[] | select(.orgs | index("Service A"))
Will output:
{
"recipients": [
{
"name": "Jane Bloggs",
"email": "jane.bloggs#business.org"
}
],
"orgs": [
"Service A",
"Service B",
"Service C"
]
}
We can extend that to output only the recipients like so:
.groups[] | select(.orgs | index("Service A")) | .recipients | first
Where we use first to select the first object from the .recipients array. The output will be:
{
"name": "Jane Bloggs",
"email": "jane.bloggs#business.org"
}
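For what it's worth, contains does accept an array on the right-hand side, but for arrays of strings it checks substring containment element-wise, which can over-match; index tests for an exactly equal element. A quick illustration:

```shell
jq -n '["Service A","Service B"] | contains(["Service A"])'  # → true
jq -n '["Service A","Service B"] | contains(["Service"])'    # → true (substring match!)
jq -n '["Service A","Service B"] | index("Service")'         # → null (no exact element)
```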

Fill arrays in the first input with elements from the second based on common field

I have two files and I need to merge the elements of the second file into the object array in the first file, matching on the reference field.
The first file:
[
{
"reference": 25422,
"order_number": "10_1",
"details" : []
},
{
"reference": 25423,
"order_number": "10_2",
"details" : []
}
]
The second file:
[
{
"record_id" : 1,
"reference": 25422,
"row_description": "descr_1_0"
},
{
"record_id" : 2,
"reference": 25422,
"row_description": "descr_1_1"
},
{
"record_id" : 3,
"reference": 25423,
"row_description": "descr_2_0"
}
]
I would like to get:
[
{
"reference": 25422,
"order_number": "10_1",
"details" : [
{
"record_id" : 1,
"reference": 25422,
"row_description": "descr_1_0"
},
{
"record_id" : 2,
"reference": 25422,
"row_description": "descr_1_1"
}
]
},
{
"reference": 25423,
"order_number": "10_2",
"details" :[
{
"record_id" : 3,
"reference": 25423,
"row_description": "descr_2_0"
}
]
}
]
Below is my code in the es_func.jq file, launched by this command:
jq -n --argfile f1 es_file1.json --argfile f2 es_file2.json -f es_func.jq
INDEX($f2[] ; .reference) as $details
| $f1
| map( ($details[.reference|tostring]| .row_description) as $vn
| if $vn then .details = [{"row_description" : $vn}] else . end)
For reference 25422 I get only the last record, with "row_description": "descr_1_1"; "row_description": "descr_1_0" is missing:
[
{
"reference": 25422,
"order_number": "10_1",
"details": [
{
"row_description": "descr_1_1"
}
]
},
{
"reference": 25423,
"order_number": "10_2",
"details": [
{
"row_description": "descr_2_0"
}
]
}
]
I think I'm close to the solution but something is still missing. Thank you
This would be way easier if you used reduce instead.
jq 'reduce inputs[] as $rec (INDEX(.reference);
.[$rec.reference | tostring].details += [$rec]
) | map(.)' es_file1.json es_file2.json
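To unpack the reduce approach: INDEX(.reference) (a jq 1.6 builtin) re-keys the first array by the stringified reference field, giving the reduce an object to append into. A minimal sketch of that first step, with made-up data:

```shell
echo '[{"reference":25422,"details":[]},{"reference":25423,"details":[]}]' |
  jq -c 'INDEX(.reference)'
# → {"25422":{"reference":25422,"details":[]},"25423":{"reference":25423,"details":[]}}
```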
Here's a straightforward, reduce-free solution:
jq '
group_by(.reference)
| INDEX(.[]; .[0]|.reference|tostring) as $dict
| input
| map_values(. + {details: $dict[.reference|tostring]})
' 2.json 1.json

Parse and Map 2 Arrays with jq

I am working with a JSON file similar to the one below:
{ "Response" : {
"TimeUnit" : [ 1576126800000 ],
"metaData" : {
"errors" : [ ],
"notices" : [ "query served by:1"]
},
"stats" : {
"data" : [ {
"identifier" : {
"names" : [ "apiproxy", "response_status_code", "target_response_code", "target_ip" ],
"values" : [ "IO", "502", "502", "7.1.143.6" ]
},
"metric" : [ {
"env" : "dev",
"name" : "sum(message_count)",
"values" : [ 0.0]
} ]
} ]
} } }
My objective is to display a mapping of the identifier names and values like:
apiproxy=IO
response_status_code=502
target_response_code=502
target_ip=7.1.143.6
I have been able to parse both names and values with
.[].stats.data[] | (.identifier.names[]) and .[].stats.data[] | (.identifier.values[])
but I need help with the jq way to map the values.
The whole thing can be done in jq using the -r command-line option:
.[].stats.data[]
| [.identifier.names, .identifier.values]
| transpose[]
| "\(.[0])=\(.[1])"
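Put together as a runnable command (the file name response.json and the trimmed document below are illustrative):

```shell
cat > response.json <<'EOF'
{"Response":{"stats":{"data":[{"identifier":{
  "names":["apiproxy","response_status_code","target_response_code","target_ip"],
  "values":["IO","502","502","7.1.143.6"]}}]}}}
EOF

# transpose pairs each name with the value at the same position.
jq -r '.[].stats.data[]
  | [.identifier.names, .identifier.values]
  | transpose[]
  | "\(.[0])=\(.[1])"' response.json
# → apiproxy=IO
#   response_status_code=502
#   target_response_code=502
#   target_ip=7.1.143.6
```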

Moving a json nested key-value pair up one level with jq

I want to use jq to move a nested key:value pair up one level. So given a geojson array of objects like this:
{
"type" : "FeatureCollection",
"features" : [ {
"type" : "Feature",
"geometry" : {
"type" : "MultiLineString",
"coordinates" : [ [ [ -74, 40 ], [ -73, 40 ] ] ]
},
"properties" : {
"startTime" : "20160123T162547-0500",
"endTime" : "20160123T164227-0500",
"activities" : [ {
"activity" : "car",
"group" : "car"
} ]
}
} ]
}
I want to return the exact same object, but with "group": "car" moved up into the properties object. So the result would look something like this:
{
"type" : "FeatureCollection",
"features" : [ {
"type" : "Feature",
"geometry" : {
"type" : "MultiLineString",
"coordinates" : [ [ [ -74, 40 ], [ -73, 40 ] ] ]
},
"properties" : {
"type" : "move",
"startTime" : "20160123T162547-0500",
"endTime" : "20160123T164227-0500",
"group" : "car",
"activities" : [ {
"activity" : "car"
} ]
}
} ]
}
This seems simple, but somehow I'm struggling to figure out how to do it with jq. Help appreciated!
jq solution:
jq '(.features[0].properties.group = .features[0].properties.activities[0].group)
| del(.features[0].properties.activities[0].group)' input.json
The output:
{
"type": "FeatureCollection",
"features": [
{
"type": "Feature",
"geometry": {
"type": "MultiLineString",
"coordinates": [
[
[
-74,
40
],
[
-73,
40
]
]
]
},
"properties": {
"startTime": "20160123T162547-0500",
"endTime": "20160123T164227-0500",
"activities": [
{
"activity": "car"
}
],
"group": "car"
}
}
]
}
In two steps (first add, then delete):
.features[0].properties |= (.group = .activities[0].group)
| del(.features[0].properties.activities[0].group)
Or still more succinctly:
.features[0].properties |=
((.group = .activities[0].group) | del(.activities[0].group))
The problem doesn't discuss what should be done if there are no activities, or if there is more than one activity, so the following filter encapsulates the property change in a function:
def update_activity:
if .activities|length<1 then .
else
.group = .activities[0].group
| del(.activities[0].group)
end
;
.features[].properties |= update_activity
.properties is left unmodified when there are no activities; otherwise the group of the first activity is moved up into the properties, leaving the other activities unmodified.
So if the sample input (slightly abbreviated) were instead
{
"type" : "FeatureCollection",
"features" : [ {
"properties" : {
"activities" : [ {
"activity" : "car",
"group" : "car"
}, {
"activity" : "bike",
"group" : "bike"
} ]
}
} ]
}
the result would be
{
"type": "FeatureCollection",
"features" : [ {
"properties" : {
"group": "car",
"activities": [ {
"activity": "car"
}, {
"activity": "bike",
"group": "bike"
} ]
}
} ]
}
This approach offers a specific place to put the logic dealing with other
variations. E.g. this version of update_activity removes the .group of
all activities:
def update_activity:
if .activities|length<1 then .
else
.group = .activities[0].group
| del(.activities[].group)
end
;
and this version also assigns .group to null in the event there are no activities:
def update_activity:
if .activities|length<1 then
.group = null
else
.group = .activities[0].group
| del(.activities[].group)
end
;
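A quick sanity check of this last variant against an object with no activities (the input object is made up for illustration):

```shell
jq -nc '
  def update_activity:
    if .activities|length<1 then .group = null
    else
      .group = .activities[0].group
      | del(.activities[].group)
    end;
  {"activities": []} | update_activity'
# → {"activities":[],"group":null}
```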
Here is a generalized solution:
# move the key/value specified by apath up to the containing JSON object:
def up(apath):
def trim:
if .[-1] | type == "number" then .[0:-2] | trim
else .
end;
. as $in
| (null | path(apath)) as $p
| ($p|last) as $last
| $in
| getpath($p) as $v
| setpath(($p[0:-1]|trim) + [$last]; $v)
| del(apath)
;
With this definition, the solution is simply:
up( .features[0].properties.activities[0].group )
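To check the behavior on a small made-up object, here is the definition repeated inline; up(.a.b[0].c) hoists c past the array index into the object containing b:

```shell
jq -nc '
  def up(apath):
    def trim:
      if .[-1] | type == "number" then .[0:-2] | trim
      else . end;
    . as $in
    | (null | path(apath)) as $p
    | ($p|last) as $last
    | $in
    | getpath($p) as $v
    | setpath(($p[0:-1]|trim) + [$last]; $v)
    | del(apath);
  {"a":{"b":[{"c":1,"d":2}]}} | up(.a.b[0].c)'
# → {"a":{"b":[{"d":2}],"c":1}}
```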

Replacing specific fields in JSON from text file

I have a JSON structure and would like to replace the strings in 2 fields with values from a separate text file.
Here is the json file with 2 records:
{
"events" : {
"-KKQQIUR7FAVxBOPOFhr" : {
"dateAdded" : 1487592568926,
"owner" : "62e6aaa0-a50c-4448-a381-f02efde2316d",
"type" : "boycott"
},
"-KKjjM-pAXvTuEjDjoj_" : {
"dateAdded" : 1487933370561,
"owner" : "62e6aaa0-a50c-4448-a381-f02efde2316d",
"type" : "boycott"
}
},
"geo" : {
"-KKQQIUR7FAVxBOPOFhr" : {
".priority" : "qw3yttz1k9",
"g" : "qw3yttz1k9",
"l" : [ 40.762632, -73.973837 ]
},
"-KKjjM-pAXvTuEjDjoj_" : {
".priority" : "qw3yttx6bv",
"g" : "qw3yttx6bv",
"l" : [ 41.889019, -87.626291 ]
}
},
"log" : "null",
"users" : {
"62e6aaa0-a50c-4448-a381-f02efde2316d" : {
"events" : {
"-KKQQIUR7FAVxBOPOFhr" : {
"type" : "boycott"
},
"-KKjjM-pAXvTuEjDjoj_" : {
"type" : "boycott"
}
}
}
}
}
And here is the txt file that I want to substitute in:
49.287130, -123.124026
36.129770, -115.172811
There are lots more records but I kept this to 2 for brevity.
Any help would be appreciated. Thank you.
The problem description seems to assume that the ordering of the key-value pairs within a JSON object is fixed. Different JSON-oriented tools (and indeed different versions of jq) have different takes on this. In any case, the following assumes a version of jq that respects the ordering (e.g. jq 1.5); it also assumes that inputs is available, though that is inessential.
The key to the following solution is the helper function, map_nth_value/2, which modifies the value of the nth key in a JSON object:
def map_nth_value(n; filter):
to_entries
| (.[n] |= {"key": .key, "value": (.value | filter)} )
| from_entries ;
[inputs | select(length > 0) | split(",") | map(tonumber)] as $lists
| reduce range(0; $lists|length) as $i
( $object;
.geo |= map_nth_value($i; .l = $lists[$i] ) )
With the above jq program in a file (say program.jq), and with the text file in a file (say input.txt) and the JSON object in a file (say object.json), the following invocation:
jq -R -n --argfile object object.json -f program.jq input.txt
produces:
{
"events": {
"-KKQQIUR7FAVxBOPOFhr": {
"dateAdded": 1487592568926,
"owner": "62e6aaa0-a50c-4448-a381-f02efde2316d",
"type": "boycott"
},
"-KKjjM-pAXvTuEjDjoj_": {
"dateAdded": 1487933370561,
"owner": "62e6aaa0-a50c-4448-a381-f02efde2316d",
"type": "boycott"
}
},
"geo": {
"-KKQQIUR7FAVxBOPOFhr": {
".priority": "qw3yttz1k9",
"g": "qw3yttz1k9",
"l": [
49.28713,
-123.124026
]
},
"-KKjjM-pAXvTuEjDjoj_": {
".priority": "qw3yttx6bv",
"g": "qw3yttx6bv",
"l": [
36.12977,
-115.172811
]
}
},
"log": "null",
"users": {
"62e6aaa0-a50c-4448-a381-f02efde2316d": {
"events": {
"-KKQQIUR7FAVxBOPOFhr": {
"type": "boycott"
},
"-KKjjM-pAXvTuEjDjoj_": {
"type": "boycott"
}
}
}
}
}
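As a footnote, map_nth_value can be exercised on its own; a minimal sketch with made-up data, updating the value of the 1st key (0-based, in document order):

```shell
jq -nc '
  def map_nth_value(n; filter):
    to_entries
    | (.[n] |= {"key": .key, "value": (.value | filter)})
    | from_entries;
  {"a": 1, "b": 2, "c": 3} | map_nth_value(1; . + 10)'
# → {"a":1,"b":12,"c":3}
```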