I have the following JSON snippet:
{
"a": [ 1, "a:111" ],
"b": [ 2, "a:111", "irrelevant" ],
"c": [ 1, "a:222" ],
"d": [ 1, "b:222" ],
"e": [ 2, "b:222", "irrelevant"]
}
and I would like to swap each key with the second element of its array, accumulating entries that share that second value and discarding any array elements that come after the second one:
{ "a:111": [ [ 1, "a" ], [ 2, "b" ] ],
"a:222": [ [ 1, "c" ] ],
"b:222": [ [ 1, "d" ], [ 2, "e" ] ]
}
My initial solution is the following:
echo '{
"a": [ 1, "a:111" ],
"b": [ 2, "a:111", "irrelevant" ],
"c": [ 1, "a:222" ],
"d": [ 1, "b:222" ],
"e": [ 2, "b:222", "irrelevant"]
}' \
| jq 'to_entries
| map({(.value[1]|tostring) : [[.value[0], .key]]})
| reduce .[] as $o ({}; reduce ($o|keys)[] as $key (.; .[$key] += $o[$key]))'
This produces the needed result, but it is probably not very robust, hard to read, and excessively long. I suspect there is a much more readable solution using with_entries, but it has eluded me so far.
Short jq approach:
jq 'reduce to_entries[] as $o ({};
.[$o.value[1]] += [[$o.value[0], $o.key]])' input.json
The output:
{
"a:111": [
[
1,
"a"
],
[
2,
"b"
]
],
"a:222": [
[
1,
"c"
]
],
"b:222": [
[
1,
"d"
],
[
2,
"e"
]
]
}
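For readers who want to check the accumulation logic outside jq, here is an equivalent sketch in plain Python (the variable names are illustrative, not part of the jq answer): the second array element becomes the new key, and `[first element, original key]` pairs accumulate under it.

```python
import json

src = {
    "a": [1, "a:111"],
    "b": [2, "a:111", "irrelevant"],
    "c": [1, "a:222"],
    "d": [1, "b:222"],
    "e": [2, "b:222", "irrelevant"],
}

# Mirror of the jq reduce: extra elements after the second are
# simply never read, so they are discarded for free.
result = {}
for key, value in src.items():
    result.setdefault(value[1], []).append([value[0], key])

print(json.dumps(result, indent=2))
```
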
I have an array of objects with 2 properties, say "key" and "value":
[
{key: 1, value: a},
{key: 2, value: b},
{key: 1, value: c}
]
Now, I would like to merge the values of the "value" properties of objects with the same "key" property value. That is the previous array is transformed into:
[
{key: 1, value: [a, c]},
{key: 2, value: [b]}
]
I tried something like:
$ echo '[{"key": "1", "val": "a"}, {"key": "2", "val": "b"}, {"key": "1", "val": "c"}]' | jq '. | group_by(.["key"]) | .[] | reduce .[] as $in ({"val": []}; {"key": $in.key, "val": [$in.val] + .["val"]})'
But it triggers a jq syntax error and I have no idea why. I am stuck.
Any idea ?
Thanks
B
Your approach using reduce could be sanitized to
jq 'group_by(.["key"]) | .[] |= reduce .[] as $in (
{value: []}; .key = $in.key | .value += [$in.value]
)'
[
{
"value": [
"a",
"c"
],
"key": 1
},
{
"value": [
"b"
],
"key": 2
}
]
Another approach using map would be
jq 'group_by(.key) | map({key: .[0].key, value: map(.value)})'
[
{
"key": 1,
"value": [
"a",
"c"
]
},
{
"key": 2,
"value": [
"b"
]
}
]
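For comparison, here is a rough Python counterpart of the group_by/map approach (a sketch, not part of the jq answer); like jq's group_by, it sorts by the grouping key before grouping.

```python
import json
from itertools import groupby

data = [
    {"key": 1, "value": "a"},
    {"key": 2, "value": "b"},
    {"key": 1, "value": "c"},
]

# groupby only merges adjacent runs, hence the sort first --
# the same reason jq's group_by sorts internally.
grouped = [
    {"key": k, "value": [item["value"] for item in items]}
    for k, items in groupby(sorted(data, key=lambda o: o["key"]),
                            key=lambda o: o["key"])
]

print(json.dumps(grouped, indent=2))
```
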
I have JSON data of the form below. I want to transform it in a streaming fashion, making the key of each record into a field of that record. My problem: I don't know how to do that without truncating the key and losing it. I have inferred the required structure of the stream; see at the bottom.
Question: how do I transform the input data into a stream without losing the key?
Data:
{
"foo" : {
"a" : 1,
"b" : 2
},
"bar" : {
"a" : 1,
"b" : 2
}
}
A non-streaming transformation uses:
jq 'with_entries(.value += {key}) | .[]'
yielding:
{
"a": 1,
"b": 2,
"key": "foo"
}
{
"a": 1,
"b": 2,
"key": "bar"
}
Now, if my data file is very very large, I'd prefer to stream:
jq -ncr --stream 'fromstream(1|truncate_stream(inputs))'
The problem: this truncates the keys "foo" and "bar". On the other hand, not truncating the stream and just calling fromstream(inputs) is pretty meaningless: this makes the whole --stream part a no-op and jq reads everything into memory.
The structure of the stream is the following, using . | tostream:
[
[
"foo",
"a"
],
1
]
[
[
"foo",
"b"
],
2
]
[
[
"foo",
"b"
]
]
[
[
"bar",
"a"
],
1
]
[
[
"bar",
"b"
],
2
]
[
[
"bar",
"b"
]
]
[
[
"bar"
]
]
while with truncation, . as $dot | (1|truncate_stream($dot | tostream)), the structure is:
[
[
"a"
],
1
]
[
[
"b"
],
2
]
[
[
"b"
]
]
[
[
"a"
],
1
]
[
[
"b"
],
2
]
[
[
"b"
]
]
So it looks like that in order for me to construct a stream the way I need it, I will have to generate the following structure (I have inserted a [["foo"]] after the first record is finished):
[
[
"foo",
"a"
],
1
]
[
[
"foo",
"b"
],
2
]
[
[
"foo",
"b"
]
]
[
[
"foo"
]
]
[
[
"bar",
"a"
],
1
]
[
[
"bar",
"b"
],
2
]
[
[
"bar",
"b"
]
]
[
[
"bar"
]
]
Making this into a string jq can consume, I indeed get what I need (see also the snippet here: https://jqplay.org/s/iEkMfm_u92):
fromstream([ [ "foo", "a" ], 1 ],[ [ "foo", "b" ], 2 ],[ [ "foo", "b" ] ],[["foo"]],[ [ "bar", "a" ], 1 ],[ [ "bar", "b" ], 2 ],[ [ "bar", "b" ] ],[ [ "bar" ] ])
yielding:
{
"foo": {
"a": 1,
"b": 2
}
}
{
"bar": {
"a": 1,
"b": 2
}
}
The final result (see https://jqplay.org/s/-UgbEC4BN8) would be:
fromstream([ [ "foo", "a" ], 1 ],[ [ "foo", "b" ], 2 ],[ [ "foo", "b" ] ],[["foo"]],[ [ "bar", "a" ], 1 ],[ [ "bar", "b" ], 2 ],[ [ "bar", "b" ] ],[ [ "bar" ] ]) | with_entries(.value += {key}) | .[]
yielding
{
"a": 1,
"b": 2,
"key": "foo"
}
{
"a": 1,
"b": 2,
"key": "bar"
}
A generic function, atomize(s), for converting objects to key-value objects is provided in the jq Cookbook. Using it, the solution to the problem here is simply:
atomize(inputs) | to_entries[] | .value + {key}
({key} is shorthand for {key: .key}.)
For reference, here is the def:
# Convert an object (presented in streaming form as the stream s) into
# a stream of single-key objects
# Example:
# atomize(inputs) (used in conjunction with "jq -n --stream")
def atomize(s):
fromstream(foreach s as $in ( {previous:null, emit: null};
if ($in | length == 2) and ($in|.[0][0]) != .previous and .previous != null
then {emit: [[.previous]], previous: ($in|.[0][0])}
else { previous: ($in|.[0][0]), emit: null}
end;
(.emit // empty), $in
) ) ;
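The core idea behind atomize can be sketched in Python (illustrative only; event and function names are made up here): walk the stream of `[path, value]` events and emit a synthetic closing event `[[k]]` whenever the top-level key changes, so each key's sub-object can be assembled independently. The event list mirrors the "Data" example above.

```python
events = [
    [["foo", "a"], 1],
    [["foo", "b"], 2],
    [["foo", "b"]],
    [["bar", "a"], 1],
    [["bar", "b"], 2],
    [["bar", "b"]],
    [["bar"]],
]

def with_closers(stream):
    previous = None
    for event in stream:
        top = event[0][0]
        # A value event ([path, value], i.e. length 2) under a new
        # top-level key means the previous key's object is complete:
        # emit its closing event before continuing.
        if len(event) == 2 and previous is not None and top != previous:
            yield [[previous]]
        previous = top
        yield event

out = list(with_closers(events))
```

Running this inserts `[["foo"]]` right after foo's events, matching the hand-built stream shown above.
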
I've been thinking and searching for a long time, but I haven't found what I'm looking for.
I'm using jq to parse tshark (-ek) JSON output, but I'm a jq newbie.
When a frame is multivalue I have a JSON similar to this:
{
"timestamp": "1525627021656",
"layers": {
"frame_time_epoch": [
"1525627021.656417000"
],
"ip_src": [
"10.10.10.10"
],
"ip_src_host": [
"test"
],
"ip_dst": [
"10.10.10.11"
],
"ip_dst_host": [
"dest_test"
],
"diameter_Event-Timestamp": [
"May 6, 2018 19:17:02.000000000 CEST",
"May 6, 2018 19:17:02.000000000 CEST"
],
"diameter_Origin-Host": [
"TESTHOST",
"TESTHOST"
],
"diameter_Destination-Host": [
"DESTHOST",
"DESTHOST"
],
"diameter_CC-Request-Type": [
"2",
"2"
],
"diameter_CC-Request-Number": [
"10",
"3"
],
"diameter_Rating-Group": [
"9004",
"9001"
],
"diameter_Called-Station-Id": [
"testing",
"testing"
],
"diameter_User-Name": [
"testuser",
"testuser"
],
"diameter_Subscription-Id-Data": [
"66666666666",
"77777777777"
],
"gtp_qos_version": [
"0x00000008",
"0x00000005"
],
"gtp_qos_max_dl": [
"8640",
"42"
],
"diameter_Session-Id": [
"test1;sessionID1;test1",
"test2;sessionID2;test2"
]
}
}
As you can see, many keys are arrays, and I want to iterate over them to create separate JSON objects, with a result like this:
{
"frame_time_epoch": [
"1525627021.656417000"
],
"ip_src": [
"10.10.10.10"
],
"ip_src_host": [
"test"
],
"ip_dst": [
"10.10.10.11"
],
"ip_dst_host": [
"dest_test"
],
"diameter_Event-Timestamp": [
"May 6, 2018 19:17:02.000000000 CEST"
],
"diameter_Origin-Host": [
"TESTHOST"
],
"diameter_Destination-Host": [
"DESTHOST"
],
"diameter_CC-Request-Type": [
"2"
],
"diameter_CC-Request-Number": [
"3"
],
"diameter_Rating-Group": [
"9001"
],
"diameter_Called-Station-Id": [
"testing"
],
"diameter_User-Name": [
"testuser"
],
"diameter_Subscription-Id-Data": [
"77777777777"
],
"gtp_qos_version": [
"0x00000005"
],
"gtp_qos_max_dl": [
"42"
],
"diameter_Session-Id": [
"test2;sessionID2;test2"
]
}
{
"frame_time_epoch": [
"1525627021.656417000"
],
"ip_src": [
"10.10.10.10"
],
"ip_src_host": [
"test"
],
"ip_dst": [
"10.10.10.11"
],
"ip_dst_host": [
"dest_test"
],
"diameter_Event-Timestamp": [
"May 6, 2018 19:17:02.000000000 CEST"
],
"diameter_Origin-Host": [
"TESTHOST"
],
"diameter_Destination-Host": [
"DESTHOST"
],
"diameter_CC-Request-Type": [
"2"
],
"diameter_CC-Request-Number": [
"10"
],
"diameter_Rating-Group": [
"9004"
],
"diameter_Called-Station-Id": [
"testing"
],
"diameter_User-Name": [
"testuser"
],
"diameter_Subscription-Id-Data": [
"66666666666"
],
"gtp_qos_version": [
"0x00000008"
],
"gtp_qos_max_dl": [
"8640"
],
"diameter_Session-Id": [
"test1;sessionID1;test1"
]
}
Another hand-made example:
INPUT:
{
"key_single": ["single_value"],
"key2": ["single_value"],
"multiple_value_key": ["value1" , "value2"],
"any_key_name": ["value4" ,"value5"]
}
{
"key_single": ["single_value"],
"key2": ["single_value"],
"multiple_value_key": ["value6" , "value7", "value8"],
"any_key_name": ["value9" ,"value10" , "value11"]
}
Desired output:
{
"key_single": ["single_value"],
"key2": ["single_value"],
"multiple_value_key": ["value1"],
"any_key_name": ["value4"]
}
{
"key_single": ["single_value"],
"key2": ["single_value"],
"multiple_value_key": ["value2"],
"any_key_name": ["value5"]
}
{
"key_single": ["single_value"],
"key2": ["single_value"],
"multiple_value_key": ["value6"],
"any_key_name": ["value9"]
}
{
"key_single": ["single_value"],
"key2": ["single_value"],
"multiple_value_key": ["value7"],
"any_key_name": ["value10"]
}
{
"key_single": ["single_value"],
"key2": ["single_value"],
"multiple_value_key": ["value8"],
"any_key_name": ["value11"]
}
Could you help me?
Thanks in advance.
It looks like you want to take, in turn, the i-th element of the selected arrays. Using your second example, this could be done like so:
range(0; .multiple_value_key|length) as $i
| . + { multiple_value_key: [.multiple_value_key[$i]],
any_key_name: [.any_key_name[$i]] }
The output in compact form:
{"key_single":["single_value"],"key2":["single_value"],"multiple_value_key":["value1"],"any_key_name":["value4"]}
{"key_single":["single_value"],"key2":["single_value"],"multiple_value_key":["value2"],"any_key_name":["value5"]}
{"key_single":["single_value"],"key2":["single_value"],"multiple_value_key":["value6"],"any_key_name":["value9"]}
{"key_single":["single_value"],"key2":["single_value"],"multiple_value_key":["value7"],"any_key_name":["value10"]}
{"key_single":["single_value"],"key2":["single_value"],"multiple_value_key":["value8"],"any_key_name":["value11"]}
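Assuming, as this answer does, that the arrays to slice are known in advance, the same per-index expansion can be sketched in Python (names here are illustrative):

```python
import json

record = {
    "key_single": ["single_value"],
    "key2": ["single_value"],
    "multiple_value_key": ["value1", "value2"],
    "any_key_name": ["value4", "value5"],
}

# For each index i, copy the record but replace the two multi-valued
# arrays with a one-element array holding their i-th item.
sliced = [
    dict(record,
         multiple_value_key=[record["multiple_value_key"][i]],
         any_key_name=[record["any_key_name"][i]])
    for i in range(len(record["multiple_value_key"]))
]

for obj in sliced:
    print(json.dumps(obj))
```
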
Here is a simple solution to the problem as described in the comments, though the output differs slightly from that shown in the question.
For clarity, a helper function is defined for producing the $i-th slice of an object,
that is, for all array-valued keys with array-length greater than 1,
the value is replaced by the $i-th item in the array.
def slice($i):
map_values(if (type == "array" and length>1)
then [.[$i]]
else . end);
The solution is then simply:
.layers
| range(0; [.[] | length] | max) as $i
| slice($i)
Output
{
"frame_time_epoch": [
"1525627021.656417000"
],
"ip_src": [
"10.10.10.10"
],
"ip_src_host": [
"test"
],
"ip_dst": [
"10.10.10.11"
],
"ip_dst_host": [
"dest_test"
],
"diameter_Event-Timestamp": [
"May 6, 2018 19:17:02.000000000 CEST"
],
"diameter_Origin-Host": [
"TESTHOST"
],
"diameter_Destination-Host": [
"DESTHOST"
],
"diameter_CC-Request-Type": [
"2"
],
"diameter_CC-Request-Number": [
"10"
],
"diameter_Rating-Group": [
"9004"
],
"diameter_Called-Station-Id": [
"testing"
],
"diameter_User-Name": [
"testuser"
],
"diameter_Subscription-Id-Data": [
"66666666666"
],
"gtp_qos_version": [
"0x00000008"
],
"gtp_qos_max_dl": [
"8640"
],
"diameter_Session-Id": [
"test1;sessionID1;test1"
]
}
{
"frame_time_epoch": [
"1525627021.656417000"
],
"ip_src": [
"10.10.10.10"
],
"ip_src_host": [
"test"
],
"ip_dst": [
"10.10.10.11"
],
"ip_dst_host": [
"dest_test"
],
"diameter_Event-Timestamp": [
"May 6, 2018 19:17:02.000000000 CEST"
],
"diameter_Origin-Host": [
"TESTHOST"
],
"diameter_Destination-Host": [
"DESTHOST"
],
"diameter_CC-Request-Type": [
"2"
],
"diameter_CC-Request-Number": [
"3"
],
"diameter_Rating-Group": [
"9001"
],
"diameter_Called-Station-Id": [
"testing"
],
"diameter_User-Name": [
"testuser"
],
"diameter_Subscription-Id-Data": [
"77777777777"
],
"gtp_qos_version": [
"0x00000005"
],
"gtp_qos_max_dl": [
"42"
],
"diameter_Session-Id": [
"test2;sessionID2;test2"
]
}
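The generic slice helper from this answer can also be mirrored in Python, as a sketch (the trimmed-down layers object below is an assumption for brevity): for every array-valued field longer than one element, keep only the i-th item, wrapped in a list; single-valued fields pass through unchanged.

```python
def slice_i(obj, i):
    # Python rendering of the jq slice($i) helper.
    return {
        k: [v[i]] if isinstance(v, list) and len(v) > 1 else v
        for k, v in obj.items()
    }

layers = {
    "ip_src": ["10.10.10.10"],
    "diameter_CC-Request-Number": ["10", "3"],
    "gtp_qos_max_dl": ["8640", "42"],
}

# One output frame per index, up to the longest array,
# mirroring range(0; [.[] | length] | max) in the jq solution.
max_len = max(len(v) for v in layers.values())
frames = [slice_i(layers, i) for i in range(max_len)]
```
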
I have a json file, example.json:
[
[
"126",
1522767000
],
[
"122",
1522859400
],
[
"126",
1523348520
]
]
...and would like to add multiple parent items with the desired output:
{
"target": "Systolic",
"datapoints": [
[
"126",
1522767000
],
[
"122",
1522859400
],
[
"126",
1523348520
]
]
}
I'm having trouble, attempting things like:
cat example.json | jq -s '{target:.[]}', which adds the one key, but I don't understand how to give target a value and add another key, datapoints.
With a straightforward jq expression:
jq '{target: "Systolic", datapoints: .}' example.json
The output:
{
"target": "Systolic",
"datapoints": [
[
"126",
1522767000
],
[
"122",
1522859400
],
[
"126",
1523348520
]
]
}
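The wrapping step is the same in any language; here is an illustrative Python version for comparison, building a new object with the literal "target" value and the whole input array under "datapoints":

```python
import json

datapoints = [["126", 1522767000], ["122", 1522859400], ["126", 1523348520]]

# Equivalent of {target: "Systolic", datapoints: .} in jq.
wrapped = {"target": "Systolic", "datapoints": datapoints}

print(json.dumps(wrapped, indent=2))
```
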