I have the following JSON snippet:
{
"a": [ 1, "a:111" ],
"b": [ 2, "a:111", "irrelevant" ],
"c": [ 1, "a:222" ],
"d": [ 1, "b:222" ],
"e": [ 2, "b:222", "irrelevant"]
}
and I would like to swap each key with the second element of its array, accumulating entries that share that second value and discarding any array elements that come after the second one:
{ "a:111": [ [ 1, "a" ], [ 2, "b" ] ],
"a:222": [ [ 1, "c" ] ],
"b:222": [ [ 1, "d" ], [ 2, "e" ] ]
}
My initial solution is the following:
echo '{
"a": [ 1, "a:111" ],
"b": [ 2, "a:111", "irrelevant" ],
"c": [ 1, "a:222" ],
"d": [ 1, "b:222" ],
"e": [ 2, "b:222", "irrelevant"]
}' \
| jq 'to_entries
| map({(.value[1]|tostring) : [[.value[0], .key]]})
| reduce .[] as $o ({}; reduce ($o|keys)[] as $key (.; .[$key] += $o[$key]))'
This produces the needed result, but it is probably not very robust, hard to read, and excessively long. I suspect there is a much more readable solution using with_entries, but it has eluded me so far.
Short jq approach:
jq 'reduce to_entries[] as $o ({};
.[$o.value[1]] += [[$o.value[0], $o.key]])' input.json
The output:
{
"a:111": [
[
1,
"a"
],
[
2,
"b"
]
],
"a:222": [
[
1,
"c"
]
],
"b:222": [
[
1,
"d"
],
[
2,
"e"
]
]
}
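For readers who want to check the accumulation logic outside jq, here is an equivalent sketch in plain Python (the variable names are illustrative, not part of the jq answer): the second array element becomes the new key, and `[first element, original key]` pairs accumulate under it.

```python
import json

src = {
    "a": [1, "a:111"],
    "b": [2, "a:111", "irrelevant"],
    "c": [1, "a:222"],
    "d": [1, "b:222"],
    "e": [2, "b:222", "irrelevant"],
}

# Mirror of the jq reduce: extra elements after the second are
# simply never read, so they are discarded for free.
result = {}
for key, value in src.items():
    result.setdefault(value[1], []).append([value[0], key])

print(json.dumps(result, indent=2))
```
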
I have an array of objects with 2 properties, say "key" and "value":
[
{key: 1, value: a},
{key: 2, value: b},
{key: 1, value: c}
]
Now, I would like to merge the values of the "value" properties of objects with the same "key" property value. That is the previous array is transformed into:
[
{key: 1, value: [a, c]},
{key: 2, value: [b]}
]
I tried something like:
$ echo '[{"key": "1", "val": "a"}, {"key": "2", "val": "b"}, {"key": "1", "val": "c"}]' | jq '. | group_by(.["key"]) | .[] | reduce .[] as $in ({"val": []}; {"key": $in.key, "val": [$in.val] + .["val"]})'
But it triggers a jq syntax error and I have no idea why. I am stuck.
Any idea ?
Thanks
B
Your approach using reduce could be sanitized to
jq 'group_by(.["key"]) | .[] |= reduce .[] as $in (
{value: []}; .key = $in.key | .value += [$in.value]
)'
[
{
"value": [
"a",
"c"
],
"key": 1
},
{
"value": [
"b"
],
"key": 2
}
]
Another approach using map would be
jq 'group_by(.key) | map({key: .[0].key, value: map(.value)})'
[
{
"key": 1,
"value": [
"a",
"c"
]
},
{
"key": 2,
"value": [
"b"
]
}
]
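For comparison, here is a rough Python counterpart of the group_by/map approach (a sketch, not part of the jq answer); like jq's group_by, it sorts by the grouping key before grouping.

```python
import json
from itertools import groupby

data = [
    {"key": 1, "value": "a"},
    {"key": 2, "value": "b"},
    {"key": 1, "value": "c"},
]

# groupby only merges adjacent runs, hence the sort first --
# the same reason jq's group_by sorts internally.
grouped = [
    {"key": k, "value": [item["value"] for item in items]}
    for k, items in groupby(sorted(data, key=lambda o: o["key"]),
                            key=lambda o: o["key"])
]

print(json.dumps(grouped, indent=2))
```
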
I have JSON data of the form below. I want to transform it in a streaming fashion, making the key of each record into a field of that record. My problem: I don't know how to do that without truncating the key and losing it. I have inferred the required structure of the stream; see at the bottom.
Question: how do I transform the input data into a stream without losing the key?
Data:
{
"foo" : {
"a" : 1,
"b" : 2
},
"bar" : {
"a" : 1,
"b" : 2
}
}
A non-streaming transformation uses:
jq 'with_entries(.value += {key}) | .[]'
yielding:
{
"a": 1,
"b": 2,
"key": "foo"
}
{
"a": 1,
"b": 2,
"key": "bar"
}
Now, if my data file is very very large, I'd prefer to stream:
jq -ncr --stream 'fromstream(1|truncate_stream(inputs))'
The problem: this truncates the keys "foo" and "bar". On the other hand, not truncating the stream and just calling fromstream(inputs) is pretty meaningless: this makes the whole --stream part a no-op and jq reads everything into memory.
The structure of the stream is the following, using . | tostream:
[
[
"foo",
"a"
],
1
]
[
[
"foo",
"b"
],
2
]
[
[
"foo",
"b"
]
]
[
[
"bar",
"a"
],
1
]
[
[
"bar",
"b"
],
2
]
[
[
"bar",
"b"
]
]
[
[
"bar"
]
]
while with truncation, . as $dot | (1|truncate_stream($dot | tostream)), the structure is:
[
[
"a"
],
1
]
[
[
"b"
],
2
]
[
[
"b"
]
]
[
[
"a"
],
1
]
[
[
"b"
],
2
]
[
[
"b"
]
]
So it looks like that in order for me to construct a stream the way I need it, I will have to generate the following structure (I have inserted a [["foo"]] after the first record is finished):
[
[
"foo",
"a"
],
1
]
[
[
"foo",
"b"
],
2
]
[
[
"foo",
"b"
]
]
[
[
"foo"
]
]
[
[
"bar",
"a"
],
1
]
[
[
"bar",
"b"
],
2
]
[
[
"bar",
"b"
]
]
[
[
"bar"
]
]
Making this into a string jq can consume, I indeed get what I need (see also the snippet here: https://jqplay.org/s/iEkMfm_u92):
fromstream([ [ "foo", "a" ], 1 ],[ [ "foo", "b" ], 2 ],[ [ "foo", "b" ] ],[["foo"]],[ [ "bar", "a" ], 1 ],[ [ "bar", "b" ], 2 ],[ [ "bar", "b" ] ],[ [ "bar" ] ])
yielding:
{
"foo": {
"a": 1,
"b": 2
}
}
{
"bar": {
"a": 1,
"b": 2
}
}
The final result (see https://jqplay.org/s/-UgbEC4BN8) would be:
fromstream([ [ "foo", "a" ], 1 ],[ [ "foo", "b" ], 2 ],[ [ "foo", "b" ] ],[["foo"]],[ [ "bar", "a" ], 1 ],[ [ "bar", "b" ], 2 ],[ [ "bar", "b" ] ],[ [ "bar" ] ]) | with_entries(.value += {key}) | .[]
yielding
{
"a": 1,
"b": 2,
"key": "foo"
}
{
"a": 1,
"b": 2,
"key": "bar"
}
A generic function, atomize(s), for converting objects to key-value objects is provided in the jq Cookbook. Using it, the solution to the problem here is simply:
atomize(inputs) | to_entries[] | .value + {key}
({key} is shorthand for {key: .key}.)
For reference, here is the def:
# Convert an object (presented in streaming form as the stream s) into
# a stream of single-key objects
# Example:
# atomize(inputs) (used in conjunction with "jq -n --stream")
def atomize(s):
fromstream(foreach s as $in ( {previous:null, emit: null};
if ($in | length == 2) and ($in|.[0][0]) != .previous and .previous != null
then {emit: [[.previous]], previous: ($in|.[0][0])}
else { previous: ($in|.[0][0]), emit: null}
end;
(.emit // empty), $in
) ) ;
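The core idea behind atomize can be sketched in Python (illustrative only; event and function names are made up here): walk the stream of `[path, value]` events and emit a synthetic closing event `[[k]]` whenever the top-level key changes, so each key's sub-object can be assembled independently. The event list mirrors the "Data" example above.

```python
events = [
    [["foo", "a"], 1],
    [["foo", "b"], 2],
    [["foo", "b"]],
    [["bar", "a"], 1],
    [["bar", "b"], 2],
    [["bar", "b"]],
    [["bar"]],
]

def with_closers(stream):
    previous = None
    for event in stream:
        top = event[0][0]
        # A value event ([path, value], i.e. length 2) under a new
        # top-level key means the previous key's object is complete:
        # emit its closing event before continuing.
        if len(event) == 2 and previous is not None and top != previous:
            yield [[previous]]
        previous = top
        yield event

out = list(with_closers(events))
```

Running this inserts `[["foo"]]` right after foo's events, matching the hand-built stream shown above.
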
I've been thinking and searching for a long time, but I haven't found what I'm looking for.
I'm using jq to parse tshark (-ek) JSON output, but I'm a jq newbie.
When a frame is multivalue I have a JSON similar to this:
{
"timestamp": "1525627021656",
"layers": {
"frame_time_epoch": [
"1525627021.656417000"
],
"ip_src": [
"10.10.10.10"
],
"ip_src_host": [
"test"
],
"ip_dst": [
"10.10.10.11"
],
"ip_dst_host": [
"dest_test"
],
"diameter_Event-Timestamp": [
"May 6, 2018 19:17:02.000000000 CEST",
"May 6, 2018 19:17:02.000000000 CEST"
],
"diameter_Origin-Host": [
"TESTHOST",
"TESTHOST"
],
"diameter_Destination-Host": [
"DESTHOST",
"DESTHOST"
],
"diameter_CC-Request-Type": [
"2",
"2"
],
"diameter_CC-Request-Number": [
"10",
"3"
],
"diameter_Rating-Group": [
"9004",
"9001"
],
"diameter_Called-Station-Id": [
"testing",
"testing"
],
"diameter_User-Name": [
"testuser",
"testuser"
],
"diameter_Subscription-Id-Data": [
"66666666666",
"77777777777"
],
"gtp_qos_version": [
"0x00000008",
"0x00000005"
],
"gtp_qos_max_dl": [
"8640",
"42"
],
"diameter_Session-Id": [
"test1;sessionID1;test1",
"test2;sessionID2;test2"
]
}
}
As you can see, many keys are arrays, and I want to iterate over them to create separate JSON objects, with a result like this:
{
"frame_time_epoch": [
"1525627021.656417000"
],
"ip_src": [
"10.10.10.10"
],
"ip_src_host": [
"test"
],
"ip_dst": [
"10.10.10.11"
],
"ip_dst_host": [
"dest_test"
],
"diameter_Event-Timestamp": [
"May 6, 2018 19:17:02.000000000 CEST"
],
"diameter_Origin-Host": [
"TESTHOST"
],
"diameter_Destination-Host": [
"DESTHOST"
],
"diameter_CC-Request-Type": [
"2"
],
"diameter_CC-Request-Number": [
"3"
],
"diameter_Rating-Group": [
"9001"
],
"diameter_Called-Station-Id": [
"testing"
],
"diameter_User-Name": [
"testuser"
],
"diameter_Subscription-Id-Data": [
"77777777777"
],
"gtp_qos_version": [
"0x00000005"
],
"gtp_qos_max_dl": [
"42"
],
"diameter_Session-Id": [
"test2;sessionID2;test2"
]
}
{
"frame_time_epoch": [
"1525627021.656417000"
],
"ip_src": [
"10.10.10.10"
],
"ip_src_host": [
"test"
],
"ip_dst": [
"10.10.10.11"
],
"ip_dst_host": [
"dest_test"
],
"diameter_Event-Timestamp": [
"May 6, 2018 19:17:02.000000000 CEST"
],
"diameter_Origin-Host": [
"TESTHOST"
],
"diameter_Destination-Host": [
"DESTHOST"
],
"diameter_CC-Request-Type": [
"2"
],
"diameter_CC-Request-Number": [
"10"
],
"diameter_Rating-Group": [
"9004"
],
"diameter_Called-Station-Id": [
"testing"
],
"diameter_User-Name": [
"testuser"
],
"diameter_Subscription-Id-Data": [
"66666666666"
],
"gtp_qos_version": [
"0x00000008"
],
"gtp_qos_max_dl": [
"8640"
],
"diameter_Session-Id": [
"test1;sessionID1;test1"
]
}
Another hand-made example:
INPUT:
{
"key_single": ["single_value"],
"key2": ["single_value"],
"multiple_value_key": ["value1" , "value2"],
"any_key_name": ["value4" ,"value5"]
}
{
"key_single": ["single_value"],
"key2": ["single_value"],
"multiple_value_key": ["value6" , "value7", "value8"],
"any_key_name": ["value9" ,"value10" , "value11"]
}
Desired output:
{
"key_single": ["single_value"],
"key2": ["single_value"],
"multiple_value_key": ["value1"],
"any_key_name": ["value4"]
}
{
"key_single": ["single_value"],
"key2": ["single_value"],
"multiple_value_key": ["value2"],
"any_key_name": ["value5"]
}
{
"key_single": ["single_value"],
"key2": ["single_value"],
"multiple_value_key": ["value6"],
"any_key_name": ["value9"]
}
{
"key_single": ["single_value"],
"key2": ["single_value"],
"multiple_value_key": ["value7"],
"any_key_name": ["value10"]
}
{
"key_single": ["single_value"],
"key2": ["single_value"],
"multiple_value_key": ["value8"],
"any_key_name": ["value11"]
}
Could you help me?
Thanks in advance.
It looks like you want to take, in turn, the i-th element of the selected arrays. Using your second example, this could be done like so:
range(0; .multiple_value_key|length) as $i
| . + { multiple_value_key: [.multiple_value_key[$i]],
any_key_name: [.any_key_name[$i]] }
The output in compact form:
{"key_single":["single_value"],"key2":["single_value"],"multiple_value_key":["value1"],"any_key_name":["value4"]}
{"key_single":["single_value"],"key2":["single_value"],"multiple_value_key":["value2"],"any_key_name":["value5"]}
{"key_single":["single_value"],"key2":["single_value"],"multiple_value_key":["value6"],"any_key_name":["value9"]}
{"key_single":["single_value"],"key2":["single_value"],"multiple_value_key":["value7"],"any_key_name":["value10"]}
{"key_single":["single_value"],"key2":["single_value"],"multiple_value_key":["value8"],"any_key_name":["value11"]}
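Assuming, as this answer does, that the arrays to slice are known in advance, the same per-index expansion can be sketched in Python (names here are illustrative):

```python
import json

record = {
    "key_single": ["single_value"],
    "key2": ["single_value"],
    "multiple_value_key": ["value1", "value2"],
    "any_key_name": ["value4", "value5"],
}

# For each index i, copy the record but replace the two multi-valued
# arrays with a one-element array holding their i-th item.
sliced = [
    dict(record,
         multiple_value_key=[record["multiple_value_key"][i]],
         any_key_name=[record["any_key_name"][i]])
    for i in range(len(record["multiple_value_key"]))
]

for obj in sliced:
    print(json.dumps(obj))
```
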
Here is a simple solution to the problem as described in the comments, though the output differs slightly from that shown in the question.
For clarity, a helper function is defined for producing the $i-th slice of an object,
that is, for all array-valued keys with array-length greater than 1,
the value is replaced by the $i-th item in the array.
def slice($i):
map_values(if (type == "array" and length>1)
then [.[$i]]
else . end);
The solution is then simply:
.layers
| range(0; [.[] | length] | max) as $i
| slice($i)
Output
{
"frame_time_epoch": [
"1525627021.656417000"
],
"ip_src": [
"10.10.10.10"
],
"ip_src_host": [
"test"
],
"ip_dst": [
"10.10.10.11"
],
"ip_dst_host": [
"dest_test"
],
"diameter_Event-Timestamp": [
"May 6, 2018 19:17:02.000000000 CEST"
],
"diameter_Origin-Host": [
"TESTHOST"
],
"diameter_Destination-Host": [
"DESTHOST"
],
"diameter_CC-Request-Type": [
"2"
],
"diameter_CC-Request-Number": [
"10"
],
"diameter_Rating-Group": [
"9004"
],
"diameter_Called-Station-Id": [
"testing"
],
"diameter_User-Name": [
"testuser"
],
"diameter_Subscription-Id-Data": [
"66666666666"
],
"gtp_qos_version": [
"0x00000008"
],
"gtp_qos_max_dl": [
"8640"
],
"diameter_Session-Id": [
"test1;sessionID1;test1"
]
}
{
"frame_time_epoch": [
"1525627021.656417000"
],
"ip_src": [
"10.10.10.10"
],
"ip_src_host": [
"test"
],
"ip_dst": [
"10.10.10.11"
],
"ip_dst_host": [
"dest_test"
],
"diameter_Event-Timestamp": [
"May 6, 2018 19:17:02.000000000 CEST"
],
"diameter_Origin-Host": [
"TESTHOST"
],
"diameter_Destination-Host": [
"DESTHOST"
],
"diameter_CC-Request-Type": [
"2"
],
"diameter_CC-Request-Number": [
"3"
],
"diameter_Rating-Group": [
"9001"
],
"diameter_Called-Station-Id": [
"testing"
],
"diameter_User-Name": [
"testuser"
],
"diameter_Subscription-Id-Data": [
"77777777777"
],
"gtp_qos_version": [
"0x00000005"
],
"gtp_qos_max_dl": [
"42"
],
"diameter_Session-Id": [
"test2;sessionID2;test2"
]
}
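The generic slice helper from this answer can also be mirrored in Python, as a sketch (the trimmed-down layers object below is an assumption for brevity): for every array-valued field longer than one element, keep only the i-th item, wrapped in a list; single-valued fields pass through unchanged.

```python
def slice_i(obj, i):
    # Python rendering of the jq slice($i) helper.
    return {
        k: [v[i]] if isinstance(v, list) and len(v) > 1 else v
        for k, v in obj.items()
    }

layers = {
    "ip_src": ["10.10.10.10"],
    "diameter_CC-Request-Number": ["10", "3"],
    "gtp_qos_max_dl": ["8640", "42"],
}

# One output frame per index, up to the longest array,
# mirroring range(0; [.[] | length] | max) in the jq solution.
max_len = max(len(v) for v in layers.values())
frames = [slice_i(layers, i) for i in range(max_len)]
```
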
I have a json file, example.json:
[
[
"126",
1522767000
],
[
"122",
1522859400
],
[
"126",
1523348520
]
]
...and would like to add multiple parent items with the desired output:
{
"target": "Systolic",
"datapoints": [
[
"126",
1522767000
],
[
"122",
1522859400
],
[
"126",
1523348520
]
]
}
I'm having trouble, attempting things like:
cat example.json | jq -s '{target:.[]}', which adds the one key, but I don't understand how to give target a value and add another key, datapoints.
With a straightforward jq expression:
jq '{target: "Systolic", datapoints: .}' example.json
The output:
{
"target": "Systolic",
"datapoints": [
[
"126",
1522767000
],
[
"122",
1522859400
],
[
"126",
1523348520
]
]
}
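The wrapping step is the same in any language; here is an illustrative Python version for comparison, building a new object with the literal "target" value and the whole input array under "datapoints":

```python
import json

datapoints = [["126", 1522767000], ["122", 1522859400], ["126", 1523348520]]

# Equivalent of {target: "Systolic", datapoints: .} in jq.
wrapped = {"target": "Systolic", "datapoints": datapoints}

print(json.dumps(wrapped, indent=2))
```
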