JQ: Merging array entities with same attribute name - json

I have a data structure like this:
[
{
"some_id": "123",
"items_1": [
{
"label": "my_name"
}
],
"items_2": []
},
{
"some_id": "123",
"items_1": [],
"items_2": [
"value_1",
"value_3"
]
},
{
"some_id": "123",
"items_1": [],
"items_2": [
"value_1",
"value_2"
]
}
]
And I want to modify the data into something like
[
{
"some_id": "123",
"items_1": [
{
"label": "my_name"
}
],
"items_2": [
"value_1",
"value_2",
"value_3"
]
}
]
Basically taking any fields that are the same and concatenating the arrays together. Similarly, items_1 can have some value for the same id down the line and I want to concatenate that array with another if needed.
I have tried using JQ with something like
jq -Mr '[ group_by(.media_url)[] | add | tojson ] | join(",\n")' test.json
However, this doesn't seem to be working.

Would the following work for you?
group_by(.some_id) | map({
some_id: map(.some_id) | first,
items_1: map(.items_1) | add | unique,
items_2: map(.items_2) | add | unique })
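To check the filter against the sample from the question, here is a minimal sketch. It assumes jq 1.5 or later is installed; the shell variable names input and merged are only illustrative, and the parentheses around each value are added just to make the grouping explicit.

```shell
# Sample input from the question, compacted.
input='[
  {"some_id":"123","items_1":[{"label":"my_name"}],"items_2":[]},
  {"some_id":"123","items_1":[],"items_2":["value_1","value_3"]},
  {"some_id":"123","items_1":[],"items_2":["value_1","value_2"]}
]'

# Group by some_id, then concatenate and de-duplicate each array per group.
merged=$(printf '%s' "$input" | jq -c '
  group_by(.some_id) | map({
    some_id: (map(.some_id) | first),
    items_1: (map(.items_1) | add | unique),
    items_2: (map(.items_2) | add | unique)
  })')

printf '%s\n' "$merged"
```

Note that unique also sorts, which is why value_2 ends up before value_3 in items_2 even though it appears later in the input.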

Related

Access to first key of a nested json using jq and obtain its value

I have the following JSON:
{
"query": "rest ec",
"elected_facts_mapping": {
"AWS": {
"ECS": {
"attachments": [
"restart_ecs"
],
"text": [
"Great!"
]
}
}
},
"top_facts_mapping": {
"AWS": {
"ECS": {
"attachments": [
"restart_ecs"
],
"text": [
"Great!"
]
},
"EC2": {
"attachments": [
"create_ec2"
],
"text": [
"Awesome"
]
}
},
"GitHub": {
"Pull": {
"attachments": [
"pull_req"
],
"text": [
"Be right on it"
]
}
},
"testtttt": {
"test": {
"attachments": [
"hello_world"
],
"text": [
"Be right on it"
]
}
},
"fgjgh": {
"fnfgj": {
"attachments": [
"hello_world"
],
"text": [
"Be right on it"
]
}
},
"tessttertre": {
"gfdgfdgfd": {
"attachments": [
"hello_world"
],
"text": [
"Great!"
]
}
}
},
"elected_facts_with_prefix_text": null
}
I want to access top_facts_mapping's first key, AWS, and then its first key, ECS.
I am trying to do this (in my DSL):
'.span | fromjson'
'.span_data.top_facts_mapping | keys[0]'
'.span_data.top_facts_mapping[${top_facts_prepare_top_fact_topic}] | keys[0]'
'.top_facts_prepare_top_fact_topic_subtopic[${top_facts_prepare_top_fact_topic}][${top_facts_prepare_top_fact_topic_subtopic}]'
You could use to_entries to turn the object into an array of key-value pairs, then select the first value using [0].value
.top_facts_mapping | to_entries[0].value | to_entries[0].value
{
"attachments": [
"restart_ecs"
],
"text": [
"Great!"
]
}
If at one level the object may be empty, you can prepend each to_entries with try (optionally followed by a catch clause)
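As a rough illustration of the try variant, the sketch below feeds in a made-up document whose top_facts_mapping is empty (empty_case and result are hypothetical names, not from the question). Without try, the second to_entries would fail because it receives null.

```shell
# Hypothetical input in which top_facts_mapping is an empty object.
empty_case='{"top_facts_mapping": {}}'

# {} | to_entries[0].value yields null, and null | to_entries raises an
# error; try suppresses it, so the filter emits nothing instead of failing.
result=$(printf '%s' "$empty_case" |
  jq -c '.top_facts_mapping | try to_entries[0].value | try to_entries[0].value')

printf 'output: [%s]\n' "$result"
```

With this input the filter produces no output at all, so the script prints an empty pair of brackets.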
Here's a stream-based approach which disassembles the input using the --stream option, filters for the "top_facts_mapping" key on top level .[0][0], truncates the stream to descend 3 levels, re-assembles the stream using fromstream, and outputs the first match:
jq --stream -n 'first(fromstream(3| truncate_stream(inputs | select(.[0][0] == "top_facts_mapping"))))'
{
"attachments": [
"restart_ecs"
],
"text": [
"Great!"
]
}
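To see what the --stream option hands to the filter, and why .[0][0] is the top-level key, it may help to look at the events for a tiny document. The input here is made up for illustration and is not part of the question.

```shell
# A tiny, made-up document to show the shape of streaming events.
# Each event is [path, leaf_value], or [path] alone when a container closes.
events=$(printf '%s' '{"a":{"b":1}}' | jq -c --stream '.')

printf '%s\n' "$events"
```

This prints [["a","b"],1], then [["a","b"]], then [["a"]]. In every event .[0] is the path, so .[0][0] is its first component, which is what the select in the answer compares against "top_facts_mapping".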
You could use the keys_unsorted builtin, since the underlying object is a dictionary and not a list
.top_facts_mapping | keys_unsorted[0] as $k | .[$k] | .[keys_unsorted[0]]
The above filter could be re-written with a simple function
def get_firstkey_val: keys_unsorted[0] as $k | .[$k];
.top_facts_mapping |
get_firstkey_val | get_firstkey_val
Or, with some jq trickery, assuming the top_facts_mapping key is guaranteed to exist:
getpath([ paths | select(.[-3] == "top_facts_mapping" ) ] | first)
Since the paths built-in produces every path from the root as an array, we select all paths whose third-to-last element (denoted by .[-3]) is "top_facts_mapping", which yields the paths two levels below that key.
first then picks the first of those paths, i.e. the list below:
[
"top_facts_mapping",
"AWS",
"ECS"
]
Use getpath/1 to obtain the JSON value at the obtained path.
If the key top_facts_mapping may be absent from the JSON, getpath/1 as written above would raise an error. Guard against that with a check:
([ paths | select(.[-3] == "top_facts_mapping" ) ] | first) as $p |
if $p | length > 0 then getpath($p) else empty end

jq - how to filter values in an inner array without knowing the key

I have the following JSON:
{
"ids": {
"sda": [
"scsi-0QEMU_QEMU_HARDDISK_drive-scsi2"
],
"sdb": [
"scsi-0QEMU_QEMU_HARDDISK_drive-scsi0"
],
"sdb1": [
"lvm-pv-uuid-lvld3A-oA4k-hC19-DXzv-D0Fq-xyME-BwgJid",
"scsi-0QEMU_QEMU_HARDDISK_drive-scsi0-part1"
],
"sdc": [
"lvm-pv-uuid-pWes2W-dgYF-l8hG-La48-9ozH-hPdU-MOkOtf",
"scsi-0QEMU_QEMU_HARDDISK_drive-scsi1"
]
}
}
What I want to achieve is to search for .*scsi0$ in the values of the inner array and get sdb as the result.
Using jq's endswith to filter results:
.ids | to_entries[] | select(.value[] | endswith("scsi0")) | .key
Results in:
"sdb"
Try it here: https://jqplay.org/s/DAhKosXXgiA
First get .ids:
{
"sda": [
"scsi-0QEMU_QEMU_HARDDISK_drive-scsi2"
],
"sdb": [
"scsi-0QEMU_QEMU_HARDDISK_drive-scsi0"
],
"sdb1": [
"lvm-pv-uuid-lvld3A-oA4k-hC19-DXzv-D0Fq-xyME-BwgJid",
"scsi-0QEMU_QEMU_HARDDISK_drive-scsi0-part1"
],
"sdc": [
"lvm-pv-uuid-pWes2W-dgYF-l8hG-La48-9ozH-hPdU-MOkOtf",
"scsi-0QEMU_QEMU_HARDDISK_drive-scsi1"
]
}
...then pipe the results to the to_entries function to convert that to an array of {key, value} objects, .ids | to_entries:
[
{
"key": "sda",
"value": [
"scsi-0QEMU_QEMU_HARDDISK_drive-scsi2"
]
},
{
"key": "sdb",
"value": [
"scsi-0QEMU_QEMU_HARDDISK_drive-scsi0"
]
},
{
"key": "sdb1",
"value": [
"lvm-pv-uuid-lvld3A-oA4k-hC19-DXzv-D0Fq-xyME-BwgJid",
"scsi-0QEMU_QEMU_HARDDISK_drive-scsi0-part1"
]
},
{
"key": "sdc",
"value": [
"lvm-pv-uuid-pWes2W-dgYF-l8hG-La48-9ozH-hPdU-MOkOtf",
"scsi-0QEMU_QEMU_HARDDISK_drive-scsi1"
]
}
]
...next stream the list of objects, .ids | to_entries[]:
{
"key": "sda",
"value": [
"scsi-0QEMU_QEMU_HARDDISK_drive-scsi2"
]
}
{
"key": "sdb",
"value": [
"scsi-0QEMU_QEMU_HARDDISK_drive-scsi0"
]
}
{
"key": "sdb1",
"value": [
"lvm-pv-uuid-lvld3A-oA4k-hC19-DXzv-D0Fq-xyME-BwgJid",
"scsi-0QEMU_QEMU_HARDDISK_drive-scsi0-part1"
]
}
{
"key": "sdc",
"value": [
"lvm-pv-uuid-pWes2W-dgYF-l8hG-La48-9ozH-hPdU-MOkOtf",
"scsi-0QEMU_QEMU_HARDDISK_drive-scsi1"
]
}
...and select from that stream the entries where any value ends with "scsi0", .ids | to_entries[] | select(.value[] | endswith("scsi0")):
{
"key": "sdb",
"value": [
"scsi-0QEMU_QEMU_HARDDISK_drive-scsi0"
]
}
...finally, get the key value, .ids | to_entries[] | select(.value[] | endswith("scsi0")) | .key:
"sdb"
Command line:
jq '.ids | to_entries[] | select(.value[] | endswith("scsi0")) | .key'
Try it here: https://jqplay.org/s/DAhKosXXgiA
.ids | to_entries[] | select(.value[] | test(".*scsi0$")) | .key
Will print the key if any value in the array matches your regex.
If none match, there will be no output.
to_entries is used to easily capture the key of each object.
select filters the entries.
.value[] | test(".*scsi0$") checks each value against the regex.
.key outputs the key of each matching entry.
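One caveat with select(.value[] | ...): the key is emitted once per matching value, so a key with two matching values appears twice. A possible variant, sketched here with made-up data (the ids, per_value and per_key names are illustrative), wraps the test in any/2, which needs jq 1.5 or later.

```shell
# Made-up input in which two values under the same key end in scsi0.
ids='{"ids":{"sda":["x-scsi2"],"sdb":["a-scsi0","b-scsi0"]}}'

# Plain select emits the key once per matching value.
per_value=$(printf '%s' "$ids" |
  jq -r '.ids | to_entries[] | select(.value[] | test("scsi0$")) | .key')

# any/2 collapses the matches into a single boolean per entry.
per_key=$(printf '%s' "$ids" |
  jq -r '.ids | to_entries[] | select(any(.value[]; test("scsi0$"))) | .key')

printf 'per value:\n%s\n' "$per_value"
printf 'per key:\n%s\n' "$per_key"
```

For the data in the question both forms give the same result, since only one value under sdb matches.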

convert json to csv using jq bash

I have JSON data like below:
"metric": {
"name" : "name1"
},
"values": [
[
16590879,
"0.043984349"
],
"values": [
[
16590876,
"0.043983444"
]
]
}
}
I wrote the jq below, but it does not give the proper result:
jq -r '[.metric.name,(.values[] | map(.) | @csv)'
Actual result
[
"name1",
"16590879",\"0.043984349\"",
"16590876",\"0.043983444\"",
"16590874",\"0.043934345\""
Expected result
name1,16590879,0.043984349
name1,16590876,0.043983444
name2,16590874,0.043934345
The sample data provided is invalid as JSON, but assuming it has been adjusted as shown below, we would have:
< sample.json jq -r '[.metric.name] + .values[] | @csv'
"name1",16590879,"0.043984349"
"name1",16590876,"0.043983444"
If you don't want the quotation marks, then use join(",") instead of @csv.
sample.json
{
"metric": {
"name": "name1"
},
"values": [
[
16590879,
"0.043984349"
],
[
16590876,
"0.043983444"
]
]
}
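For the unquoted form, a minimal sketch with the adjusted sample inlined instead of read from sample.json (the sample and csv variable names are illustrative). It assumes jq 1.5 or later, where join accepts numbers as well as strings.

```shell
# The adjusted sample from the answer, inlined for convenience.
sample='{"metric":{"name":"name1"},"values":[[16590879,"0.043984349"],[16590876,"0.043983444"]]}'

# Prepend the metric name to each [timestamp, value] pair, then join with commas.
csv=$(printf '%s' "$sample" | jq -r '[.metric.name] + .values[] | join(",")')

printf '%s\n' "$csv"
```

This prints name1,16590879,0.043984349 followed by name1,16590876,0.043983444. Unlike @csv, join does no escaping, so it is only safe when the fields contain no commas or quotes.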

Retrieve value based on contents of another value

I have this JSON from which I am trying to get just the id, based on a contains match against another value. I am able to jq the contains part, but when I add | .id I cannot get a result.
{
"restrictions": [
{
"id": 1,
"database": {
"match": "exact",
"value": "db_contoso"
},
"measurement": {},
"permissions": [
"write"
]
},
{
"id": 2,
"database": {
"match": "exact",
"value": "db2_contoso"
},
"measurement": {},
"permissions": [
"write"
]
}
]
}
When I run
jq -r '.restrictions[] | .database.value | select(contains("conto")?)'
I get the values db_contoso and db2_contoso, but I am trying to pull just the id based on that. When I add | .id to the end of that command I get nothing.
Select the whole object that matches the condition, then get its .id:
jq '.restrictions[] | select(.database.value | contains("conto")).id'
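The reason the original attempt cannot reach .id is that after .database.value the current value is just a string, and the enclosing object is gone. A small sketch of the corrected form follows; the third restriction (id 3, db_other) is made up here to show a non-matching entry being filtered out.

```shell
# Trimmed version of the question's data, plus one hypothetical non-matching entry.
restrictions='{"restrictions":[
  {"id":1,"database":{"match":"exact","value":"db_contoso"}},
  {"id":2,"database":{"match":"exact","value":"db2_contoso"}},
  {"id":3,"database":{"match":"exact","value":"db_other"}}
]}'

# select keeps the whole restriction object, so .id is still available afterwards.
matching_ids=$(printf '%s' "$restrictions" |
  jq '.restrictions[] | select(.database.value | contains("conto")) | .id')

printf '%s\n' "$matching_ids"
```

This prints 1 and 2, one per line; the made-up id 3 is dropped because its database value does not contain "conto".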

jq: output values of ids instead of numbers

Here's my input json:
{
"channels": [
{ "id": 1, "name": "Pop"},
{ "id": 2, "name": "Rock"}
],
"links": [
{ "id": 2, "streams": [ {"url": "http://example.com/rock"} ] },
{ "id": 1, "streams": [ {"url": "http://example.com/pop"} ] }
]
}
This is what I want as an output:
"http://example.com/pop"
"Pop"
"http://example.com/rock"
"Rock"
So I need jq to replace .channels[].id with .links[].streams[0].url based on .links[].id
I don't know if it's right, but this is how I managed to output the urls:
(.channels[].id | tostring) as $ids | [.links[]] | map({(.id | tostring): .streams[0].url}) | add as $urls | $urls[$ids]
"http://example.com/pop"
"http://example.com/rock"
The question is, how do I add .channels[].name to it?
You sometimes have to be careful what you ask for, but this will produce the result you said you want:
.channels[] as $channel
| $channel.name,
(.links[] | select(.id == $channel.id) | .streams[0].url)
Output for the given input:
"Pop"
"http://example.com/pop"
"Rock"
"http://example.com/rock"
Here is a solution which uses reduce and setpath to make a $urls lookup table from .links and then scans .channels generating corresponding urls and names.
(
reduce .links[] as $l (
{};
setpath([ $l.id|tostring ]; [$l.streams[].url])
)
) as $urls
| .channels[]
| $urls[ .id|tostring ][], .name
If multiple urls are present in the "streams" attribute, this will print them all before printing the name. E.g. if the input is
{
"channels": [
{ "id": 1, "name": "Pop"},
{ "id": 2, "name": "Rock"}
],
"links": [
{ "id": 2, "streams": [ {"url": "http://example.com/rock"},
{"url": "http://example.com/hardrock"} ] },
{ "id": 1, "streams": [ {"url": "http://example.com/pop"} ] }
]
}
the output will be
"http://example.com/pop"
"Pop"
"http://example.com/rock"
"http://example.com/hardrock"
"Rock"