Flatten JSON with jq retaining key names

I'm trying to flatten a JSON consisting of nested objects. The top layer contains several key/value pairs, where each value is itself an array of a number of objects (the bottom layer).
What I would like to get, using jq, is simply an array of objects containing all the objects of the bottom layer, each of which with an additional key/value pair identifying the top-layer key it originally belonged to.
In other words, I would like to turn a JSON
{
"key1": [obj1, obj2],
"key2": [obj3]
}
into a plain array
[OBJ1, OBJ2, OBJ3]
where each OBJi is simply the original object with an extra key/value pair
"parent-key-name": keyx
where keyx would be the top-layer key obji belonged to, i.e. "key1" for obj1 and obj2, and "key2" for obj3.
I'm struggling with the fact that when referencing the objects in the bottom layer, e.g. via .[], jq does not seem to have inbuilt functionality to access associated top-layer information. However, I'm new to jq, and hope there is an easy solution after all.

Given the following input :
{
"key1": [{"name":"Emma"},{"name":"Bob"}],
"key2": [{"name":"Jean"}]
}
You can split the object into entries with to_entries, store each key in a variable, and add that key to every object in the entry's value array:
jq '[ to_entries[] | .key as $parent | .value[] |
.["parent-key-name"] |= (.+ $parent) ] ' test.json
which gives the following output :
[
{
"name": "Emma",
"parent-key-name": "key1"
},
{
"name": "Bob",
"parent-key-name": "key1"
},
{
"name": "Jean",
"parent-key-name": "key2"
}
]
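For cross-checking the jq result outside jq, the same transformation can be sketched in Python. This is only an illustration under the assumption that every top-level value is an array of objects; the function name flatten_with_parent is made up for this example.

```python
import json

# Same sample input as in the answer above.
data = json.loads(
    '{"key1": [{"name": "Emma"}, {"name": "Bob"}], "key2": [{"name": "Jean"}]}'
)

def flatten_with_parent(obj, parent_field="parent-key-name"):
    """Return one flat list; each item is copied and tagged with its top-level key."""
    return [
        {**item, parent_field: key}   # copy the item, add the parent key
        for key, items in obj.items() # like to_entries[]
        for item in items             # like .value[]
    ]

flat = flatten_with_parent(data)
print(json.dumps(flat, indent=2))
```

The two nested loops play the role of to_entries[] and .value[] in the jq filter; the key is simply still in scope in the inner loop.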

The solution presented below consists of two steps, each of which might be helpful separately, e.g. if someone wants to "flatten" the JSON in a slightly different way.
First, let's make the changes to obj[i] "in-place":
with_entries( .key as $k | .value[] |= ( . + {"parent-key-name": $k} ) )
Example:
$ jq -n -c -f program.jq
Input:
{
"key1": [{a:1}, {a:2}],
"key2": [{b:3}]
}
Output:
{
"key1": [
{
"a": 1,
"parent-key-name": "key1"
},
{
"a": 2,
"parent-key-name": "key1"
}
],
"key2": [
{
"b": 3,
"parent-key-name": "key2"
}
]
}
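Appending | [.[]] to the above filter collects the values into an array, but each value is itself an array, so the result is still nested:
[[{"a":1,"parent-key-name":"key1"},{"a":2,"parent-key-name":"key1"}],[{"b":3,"parent-key-name":"key2"}]]
To get a single flat array, append | [.[][]] (or | [.[]] | add) instead.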
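The difference between collecting the values (as [.[]] does) and collecting the elements of the values (as [.[][]] does) may be easier to see in a small Python sketch. The variable names are made up for illustration, and the sketch assumes every top-level value is a list of objects.

```python
# Input from the example above, written as a Python literal.
doc = {"key1": [{"a": 1}, {"a": 2}], "key2": [{"b": 3}]}

# Step 1: annotate each bottom-layer object, keeping the top-layer structure.
annotated = {
    key: [{**obj, "parent-key-name": key} for obj in objs]
    for key, objs in doc.items()
}

# Step 2a: collecting the values gives an array of arrays (like [.[]]).
nested = list(annotated.values())

# Step 2b: collecting the elements of each value gives one flat array (like [.[][]]).
flat = [obj for objs in annotated.values() for obj in objs]

print(nested)
print(flat)
```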

Related

jq with multiple select statements and an array

I've got some JSON like the following (I've filtered the output here):
[
{
"Tags": [
{
"Key": "Name",
"Value": "example1"
},
{
"Key": "Irrelevant",
"Value": "irrelevant"
}
],
"c7n:MatchedFilters": [
"tag: example_tag_rule"
],
"another_key": "another_value_I_dont_want"
},
{
"Tags": [
{
"Key": "Name",
"Value": "example2"
}
],
"c7n:MatchedFilters": [
"tag:example_tag_rule",
"tag: example_tag_rule2"
]
}
]
I'd like to create a csv file with the value within the Name key and all of the "c7n:MatchedFilters" in the array. I've made a few attempts but still can't get quite the output I expect. There's some example code and the output below:
#Prints the key that I'm after.
cat new.jq | jq '.[] | [.Tags[], {"c7n:MatchedFilters"}] | .[] | select(.Key=="Name")|.Value'
"example1"
"example2"
#Prints all the filters in an array I'm after.
cat new.jq | jq -r '.[] | [.Tags[], {"c7n:MatchedFilters"}] | .[] | select(."c7n:MatchedFilters") | .[]'
[
"tag: example_tag_rule"
]
[
"tag:example_tag_rule",
"tag: example_tag_rule2"
]
#Prints *all* the tags (including ones I don't want) and all the filters in the array I'm after.
cat new.jq | jq '.[] | [.Tags[], {"c7n:MatchedFilters"}] | select((.[].Key=="Name") and (.[]."c7n:MatchedFilters"))'
[
{
"Key": "Name",
"Value": "example1"
},
{
"Key": "Irrelevant",
"Value": "irrelevant"
},
{
"c7n:MatchedFilters": [
"tag: example_tag_rule"
]
}
]
[
{
"Key": "Name",
"Value": "example2"
},
{
"c7n:MatchedFilters": [
"tag:example_tag_rule",
"tag: example_tag_rule2"
]
}
]
I hope this makes sense, let me know if I've missed anything.
Your attempts are not working because you start out with [.Tags[], {"c7n:MatchedFilters"}] to construct one array containing all the tags and an object containing the filters. You are then struggling to find a way to process this entire array at once because it jumbles together these unrelated things without any distinction. You will find it much easier if you don't combine them in the first place!
You want to find the single tag with a Key of "Name". Here's one way to find that:
first(
.Tags[]|
select(.Key=="Name")
).Value as $name
By using a variable binding we can save it for later and worry about constructing the array separately.
You say (in the comments) that you just want to concatenate the filters with spaces. You can do that easily enough:
(
."c7n:MatchedFilters"|
join(" ")
) as $filters
You can combine all this together as follows. Note that each variable binding leaves the input stream unchanged, so it's easy to compose everything.
jq --raw-output '
.[]|
first(
.Tags[]|
select(.Key=="Name")
).Value as $name|
(
."c7n:MatchedFilters"|
join(" ")
) as $filters|
[$name, $filters]|
@csv
'
Hopefully that's easy enough to read and separates out each concept. We break up the array into a stream of objects. For each object, we find the name and bind it to $name, we concatenate the filters and bind them to $filters, then we construct an array containing both, then we convert the array to a CSV string.
We don't need to use variables. We could just have a big array constructor wrapped around the expression to find the name and the expression to find the filters. But I hope you can see the variables make things a bit flatter and easier to understand.
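As a way to sanity-check the expected CSV, here is a rough Python sketch of the same steps: find the Name tag, join the filters with spaces, emit one row per record. It is an illustration only; the helper name rows is made up, and quoting every string field is an assumption chosen to resemble what @csv prints.

```python
import csv
import io

# The sample records from the question, as Python literals.
records = [
    {
        "Tags": [
            {"Key": "Name", "Value": "example1"},
            {"Key": "Irrelevant", "Value": "irrelevant"},
        ],
        "c7n:MatchedFilters": ["tag: example_tag_rule"],
        "another_key": "another_value_I_dont_want",
    },
    {
        "Tags": [{"Key": "Name", "Value": "example2"}],
        "c7n:MatchedFilters": ["tag:example_tag_rule", "tag: example_tag_rule2"],
    },
]

def rows(recs):
    """Yield [name, filters] for each record that has a Name tag."""
    for rec in recs:
        # First tag whose Key is "Name"; None if there is no such tag.
        name = next((t["Value"] for t in rec["Tags"] if t["Key"] == "Name"), None)
        if name is None:
            continue  # skip records without a Name tag
        yield [name, " ".join(rec["c7n:MatchedFilters"])]

buf = io.StringIO()
writer = csv.writer(buf, quoting=csv.QUOTE_NONNUMERIC, lineterminator="\n")
writer.writerows(rows(records))
csv_text = buf.getvalue()
print(csv_text, end="")
```

Keeping the name lookup and the filter join as two separate expressions mirrors the two variable bindings in the jq program.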

JQ: key selection from numeric objects

I use jq 1.6 in a Windows 10 PowerShell environment and am trying to select keys from JSON objects whose key names happen to be numeric.
JSON example:
{
"alliances_info":{
"744085325458334213":{
"emblem":3,
"name":"wellwell",
"member_count":1,
"level":1,
"military_might":1035,
"public":false,
"tag":"MELL",
"slogan":"",
"id":744085325458334213
},
"744128593839677958":{
"emblem":0,
"name":"Brave",
"member_count":1,
"level":1,
"military_might":1035,
"public":false,
"tag":"GABA",
"slogan":"",
"id":744128593839677958
},
"746034084459209223":{
"emblem":0,
"name":"Queen",
"member_count":1,
"level":1,
"military_might":1035,
"public":false,
"tag":"QUE",
"slogan":"",
"id":746034084459209223
},
"750446471312466445":{
"emblem":0,
"name":"Phoenix Inc",
"member_count":35,
"level":6,
"military_might":453369,
"public":true,
"tag":"PHOI",
"slogan":"",
"id":750446471312466445
},
"750446518934594062":{
"emblem":11,
"name":"Australia",
"member_count":44,
"level":8,
"military_might":957211,
"public":true,
"tag":"AUST",
"slogan":"Go Australia",
"id":750446518934594062
}
},
"server_version":"v7.190.4-master.000000006"
}
I tried several jq commands:
.alliances_info | .[] | [{alliance_name: .name, alliance_count: .member_count, alliance_level: .level, alliance_power: .military_might, alliance_tag: .tag, alliance_slogan: .slogan, alliance_id: .id}]
or
.alliances_info | .. | objects | [{alliance_name: .name, alliance_count: .member_count, alliance_level: .level, alliance_power: .military_might, alliance_tag: .tag, alliance_slogan: .slogan, alliance_id: .id}]
But I always get a jq error: parse error: Invalid numeric literal at line 1, column 3
If I drop the object construction in the first command (and build only an array), it works. But I need the objects. Any tips?
BR
Timo
Your first query works perfectly well with the given JSON sample. Perhaps you're invoking jq incorrectly. If you have the jq program in a file, say select.jq, you'd invoke jq like so:
jq -f select.jq sample.json
If that doesn't help, then try:
jq empty sample.json
If that fails, there might be something wrong with the encoding of the JSON.
I'm not sure I understand what you want.
Your first attempt works for me, but generates one output for each JSON value in the input. That is, I created a file named so.json and put in it your JSON from above:
{
"alliances_info": {
"744085325458334213": {
"emblem": 3,
⋮
}
When I run your program , I get:
$ jq '.alliances_info | .[] | [{alliance_name: .name, alliance_count: .member_count, alliance_level: .level, alliance_power: .military_might, alliance_tag: .tag, alliance_slogan: .slogan, alliance_id: .id}]' so.json
[
{
"alliance_name": "wellwell",
"alliance_count": 1,
"alliance_level": 1,
"alliance_power": 1035,
"alliance_tag": "MELL",
"alliance_slogan": "",
"alliance_id": 744085325458334200
}
]
[
{
"alliance_name": "Brave",
⋮
]
If you want an array at all, you probably want one array containing all the alliances like this:
$ jq '.alliances_info | [ .[] | { alliance_name: .name, alliance_id: .id } ]' so.json
[
{
"alliance_name": "wellwell",
"alliance_id": 744085325458334200
},
{
"alliance_name": "Brave",
"alliance_id": 744128593839678000
},
{
"alliance_name": "Queen",
"alliance_id": 746034084459209200
},
{
"alliance_name": "Phoenix Inc",
"alliance_id": 750446471312466400
},
{
"alliance_name": "Australia",
"alliance_id": 750446518934594000
}
]
Starting from the left,
- .alliances_info looks in its input object for the field named "alliances_info" and outputs its value
- the | next says take the output from the left-hand side and pass those as inputs to the right-hand side.
- right after that first |, I have a [ «jq expressions» ] which tells jq to create one JSON array output for each input; the elements of that array are the outputs of that inner «jq expressions»
- that inner expression starts with .[] which means to produce one output for each JSON value (ignoring the keys) in the input object. For us, that will be the objects named "744085325458334213", "744128593839677958", …
- The next | uses those objects as input and for each, generates a JSON object { alliance_name: .name, alliance_id: .id }
That's why I end up with one JSON array containing 5 JSON objects.
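The walk-through above maps fairly directly onto a list comprehension. The following Python sketch is only an analogy, using two of the alliances from the sample with most fields left out. Note that Python keeps every digit of the large id values.

```python
import json

# Cut-down version of the sample input: two alliances, two fields each.
doc = json.loads("""
{
  "alliances_info": {
    "744085325458334213": {"name": "wellwell", "id": 744085325458334213},
    "744128593839677958": {"name": "Brave", "id": 744128593839677958}
  }
}
""")

alliances = [                                   # [ ... ] builds one array
    {"alliance_name": a["name"],                # { alliance_name: .name,
     "alliance_id": a["id"]}                    #   alliance_id: .id }
    for a in doc["alliances_info"].values()     # .alliances_info | .[]
]

print(json.dumps(alliances, indent=2))
```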
As far as I can tell, you are mostly just renaming a bunch of the fields. For that, you could just do something like this:
$ jq --argjson renameMap '{ "name": "alliance_name", "member_count": "alliance_count", "level": "alliance_level", "military_might": "alliance_power", "tag": "alliance_tag", "slogan": "alliance_slogan"}' '.alliances_info |= ( . | [ to_entries[] | ( .value |= ( . | [ to_entries[] | ( .key |= ( if $renameMap[.] then $renameMap[.] else . end ) ) ] | from_entries ) ) ] | from_entries )' so.json
{
"alliances_info": {
"744085325458334213": {
"emblem": 3,
"alliance_name": "wellwell",
"alliance_count": 1,
"alliance_level": 1,
"alliance_power": 1035,
"public": false,
"alliance_tag": "MELL",
"alliance_slogan": "",
"id": 744085325458334200
},
"744128593839677958": {
"emblem": 0,
"alliance_name": "Brave",
"alliance_count": 1,
"alliance_level": 1,
"alliance_power": 1035,
"public": false,
"alliance_tag": "GABA",
"alliance_slogan": "",
"id": 744128593839678000
"id": 744128593839678000
},
⋮
},
"server_version": "v7.190.4-master.000000006"
}
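The renaming idea (look each key up in a map and fall back to the original key when it is not listed) can be sketched in Python as follows. This is an illustration only: rename_keys is a made-up helper, and the map here is keyed on "slogan", the field name that actually occurs in the sample data.

```python
rename_map = {
    "name": "alliance_name",
    "member_count": "alliance_count",
    "level": "alliance_level",
    "military_might": "alliance_power",
    "tag": "alliance_tag",
    "slogan": "alliance_slogan",
}

def rename_keys(obj, mapping):
    """Rename keys found in mapping; keys that are not listed stay unchanged."""
    return {mapping.get(key, key): value for key, value in obj.items()}

# One alliance from the sample input.
alliance = {
    "emblem": 3, "name": "wellwell", "member_count": 1, "level": 1,
    "military_might": 1035, "public": False, "tag": "MELL", "slogan": "",
    "id": 744085325458334213,
}

renamed = rename_keys(alliance, rename_map)
print(renamed)
```

A key that is missing from the map, such as emblem or id here, passes through unchanged, so a misspelled map key silently leaves the field as it was.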
Well, I am an idiot (to be totally clear here). I found the reason (and it is normally a no-brainer...). I read the input from a file, and the funny thing is that the file is Unicode but not UTF-8. After re-encoding it, the command works fine. Thanks for the help.
BR
Timo
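For anyone hitting the same parse error, a quick way to check a file's encoding before handing it to jq is to look for a byte order mark. The Python sketch below assumes the file was UTF-16 with a BOM; the post only says "Unicode but not UTF-8", so that is a guess. The helper name load_json_bytes is made up, and UTF-32 is not handled.

```python
import codecs
import json

def load_json_bytes(raw: bytes):
    """Decode raw bytes as UTF-8 or BOM-marked UTF-16, then parse as JSON."""
    if raw.startswith(codecs.BOM_UTF8):
        text = raw.decode("utf-8-sig")   # UTF-8 with a BOM: strip the BOM
    elif raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        text = raw.decode("utf-16")      # the BOM tells the codec the byte order
    else:
        text = raw.decode("utf-8")       # assume plain UTF-8 otherwise
    return json.loads(text)

# Simulate a file saved as UTF-16 (Python's encoder prepends a BOM).
utf16_bytes = '{"name": "wellwell", "level": 1}'.encode("utf-16")
parsed = load_json_bytes(utf16_bytes)

# Re-encode as UTF-8 so that tools expecting UTF-8 can read it.
utf8_bytes = json.dumps(parsed).encode("utf-8")
print(utf8_bytes)
```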

Merge arrays in object

I have an object that is just a bunch of arbitrary keys with each an array:
{
"foo": [
"hello",
"world"
],
"bar": [
"foobar"
]
}
How can I return the merged arrays in this object. The expected output would be:
[
"hello",
"world",
"foobar"
]
Create a list of the values and concatenate the elements in that list:
[.[]] | add
Create a list of each element in each array:
[.[][]]
I'd prefer the first one since it parses easier in my mind.
Generalizing a bit:
jq '[..|scalars]' input.json
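The answers above take two different approaches: concatenating the top-level arrays, or collecting every scalar at any depth. A Python sketch of both, for comparison only (the generator name scalars is borrowed from jq, and the nested example value is made up):

```python
from itertools import chain

obj = {"foo": ["hello", "world"], "bar": ["foobar"]}

# Concatenate the top-level arrays (comparable to [.[]] | add or [.[][]]).
merged = list(chain.from_iterable(obj.values()))

def scalars(value):
    """Yield every non-container value at any depth (comparable to ..|scalars)."""
    if isinstance(value, dict):
        for child in value.values():
            yield from scalars(child)
    elif isinstance(value, list):
        for child in value:
            yield from scalars(child)
    else:
        yield value

all_scalars = list(scalars(obj))

# The two approaches differ once the arrays contain nested structures.
nested_obj = {"foo": ["hello", ["deep"]], "bar": [{"k": "v"}]}
merged_nested = list(chain.from_iterable(nested_obj.values()))
scalars_nested = list(scalars(nested_obj))

print(merged, all_scalars, merged_nested, scalars_nested)
```

For the flat sample both give the same list; with nested values the concatenation keeps inner containers intact while the recursive version digs into them.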

jq how to combine multiple paths into one object

I have an object that I can filter to several paths I wish to keep. The paths are of the form:
[
"key1",
"key2",
"mykey"
]
[
"key3",
"key4",
"mykey"
]
What I want is:
{ "key1":{
"key2": .key2
},
"key3":{
"key4": .key4
}
}
The closest I can get is:
{ "key1":{
"key2": .key2
}
}
{ "key3":{
"key4": .key4
}
}
using:
(paths(objects)|select(last(.[])=="mykey")) as $path|
getpath([$path[0],$path[1]]) as $getpath|
{($path[0]):{($path[1]):$getpath}}
Though I can pipe this output to a jq -s '.' command, I cannot find a way to sum the reconstructions of the paths together within the original set of filters. It appears that the filters reset at the end of each object. $path appears to hold only one path array at a time, rather than be an array of paths. This prevents me from iterating over $path in a reduce function.
I have created the following script that works but I am interested in finding how to use the paths() function, as well. I have not figured out how to make it very useful to me, as yet.
(to_entries |
map(select(.["value"][]?|has("mykey")?))|[.[].key]) as $rooms|
(to_entries |
map(select(.["value"][]?|has("mykey")?))|[.[].value]) as $roomvals| #allows room paths to be avail
##### creates object containing only those locations and sensors within the locations that include "mykey" objs
reduce
range(0;$rooms|length) as $i ({};
.+{($rooms[$i]): ($roomvals[$i] | to_entries |
map(select(.["value"]|has("mykey")))|{(.[]["key"]):.[]["value"]})})
Any assistance on these approaches or the suggestion of alternative approaches is appreciated.
With guidance from peak's responses I can now better pose and answer my question.
I have expanded the data object that peak used in his example so it better represents the problem I was trying to solve. Now the data better illustrates that "key2" and "key10" have "mykey" while "key6", "key8", and "key11" do not. What I was trying to express in my question is that I wanted to retain the full paths of "key2" and "key10" which would include "another" as well as "mykey". I realized that a small modification to peak's example selection criterion would do that. I only need to add a filter to the selection to retain the 1st two elements of each path that includes "mykey":
def data:
{ "key1": { "key2": {"mykey": {"a": 123},
"another":{"q": "six"} },
"key6": {"key7": {"d": 997} } },
"key3": { "key8": {"key9": {"b": 234} },
"key10": {"mykey": {"d": 997},
"another":{"q": "seven"} } },
"key4": { "key11": {"key5" : {"a": 123} } }
};
def selection(mykey):
. as $in
| reduce (paths(objects) | select(last(.[])==mykey)|.[0:2]) as $path
(null; setpath($path; $in|getpath($path)) );
data | selection("mykey")
Which returns what I desired:
{"key1":{"key2":{"mykey":{"a":123},
"another":{"q":"six"}}},
"key3":{"key10":{"mykey":{"d":997},
"another":{"q":"seven"}}}
}
You will note the selection now includes "|.[0:2]".
Thanks to peak for setting me on the right track. I have voted for his answer as it helped me better phrase my question and gave me the insight to solve it.
The output that you have indicated you want is not JSON, and it's unclear to me what the desired output is, but the following seems to come close:
reduce inputs as $d (null; setpath( $d[0: ($d|length) - 1]; $d[-1] ) )
With your input, and using jq -n, this produces:
{
"key1": {
"key2": "mykey"
},
"key3": {
"key4": "mykey"
}
}
Hopefully, you'll be able to take it from here.
Example:
def data:
{ "key1": { "key2": {"mykey": {"a": 123} } },
"key3": { "key4": {"mykey": {"b": 234} } },
"key4": { "key1": {"key2" : {"a": 123} } }
};
def selection(mykey):
. as $in
| reduce (paths(objects) | select(.[-1] == mykey)) as $path
(null; setpath($path; $in|getpath($path)) );
data | selection("mykey")
Invocation: jq -n -c -f example.jq
Output:
{"key1":{"key2":{"mykey":{"a":123}}},
"key3":{"key4":{"mykey":{"b":234}}}}
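The reduce/setpath/getpath pattern used above can be mimicked in Python to see what each piece does. This is a rough sketch, not a port of jq: the helpers below are hand-written stand-ins that handle nested objects only (no arrays), and their names are merely borrowed from jq.

```python
def paths(node, prefix=()):
    """Yield the path to every nested dict (loosely like paths(objects))."""
    if isinstance(node, dict):
        for key, child in node.items():
            if isinstance(child, dict):
                yield prefix + (key,)
            yield from paths(child, prefix + (key,))

def getpath(node, path):
    """Follow path down from node and return what is found there."""
    for key in path:
        node = node[key]
    return node

def setpath(root, path, value):
    """Set value at path inside root, creating intermediate dicts as needed."""
    cur = root
    for key in path[:-1]:
        cur = cur.setdefault(key, {})
    cur[path[-1]] = value
    return root

def selection(data, mykey):
    """Keep only the branches whose path ends in mykey."""
    out = {}
    for path in paths(data):
        if path[-1] == mykey:
            setpath(out, path, getpath(data, path))
    return out

data = {
    "key1": {"key2": {"mykey": {"a": 123}}},
    "key3": {"key4": {"mykey": {"b": 234}}},
    "key4": {"key1": {"key2": {"a": 123}}},
}

result = selection(data, "mykey")
print(result)
```

The loop in selection corresponds to the reduce: the accumulator starts empty and each matching path copies one branch from the input into it.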

is it possible to use jq to replace a value in one json file from a another dictionary json file?

I have two JSON files: one contains a list that maps field names to types, the other is a flat JSON file.
E.g. the first file contains something like this:
[ { "field": "col1", "type": "int" }, { "field" : "col2", "type" : "string" }]
The second file is a large file of JSON objects, one per line:
{ "col1":123, "col2": "foo"}
{ "col1":123, "col2": "foo"}
...
can I use JQ to generate an output json like this:
{ "col1":{ "int" : 123 }, "col2": { "string" : "foo"} }
{ "col1":{ "int" : 123 }, "col2": { "string" : "foo"} }
....
Sure. You might want to transform your first file into an easier-to-consume format first: build an object that maps each .field to its .type (to use as a dictionary):
reduce .[] as $i ({}; .[$i.field] = $i.type)
Then you could go through your second file to use these mappings to update the values. Use --argfile to read the contents of the first file into a variable.
$ jq --argfile file1 file1.json '
(reduce $file1[] as $i ({}; .[$i.field] = $i.type)) as $map
| with_entries(.value = { ($map[.key]): .value })
' file2.json
which yields:
{
"col1": {
"int": 123
},
"col2": {
"string": "foo"
}
}
{
"col1": {
"int": 123
},
"col2": {
"string": "foo"
}
}
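The two steps (build a field-to-type dictionary, then wrap each value under its type name) look like this in a Python sketch. It is meant only as a cross-check of the expected output; type_of and wrap are made-up names, and a field missing from the dictionary raises a KeyError here.

```python
import json

# First file: list of field/type pairs.
fields = json.loads(
    '[ { "field": "col1", "type": "int" }, { "field" : "col2", "type" : "string" }]'
)

# Second file: one JSON object per line.
lines = [
    '{ "col1":123, "col2": "foo"}',
    '{ "col1":123, "col2": "foo"}',
]

# Step 1: turn the list into a dictionary from field name to type name.
type_of = {entry["field"]: entry["type"] for entry in fields}

def wrap(record):
    """Replace each value v of field k with {type_of[k]: v}."""
    return {key: {type_of[key]: value} for key, value in record.items()}

# Step 2: apply the wrapping to every input line.
wrapped = [wrap(json.loads(line)) for line in lines]

for rec in wrapped:
    print(json.dumps(rec))
```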
Yes. You could use the --slurpfile option but your dictionary is already a single JSON entity (a JSON array in your case), so it would be simpler to read the dictionary using the --argfile option.
Assuming that:
your jq filter is in a file, say merge.jq;
your dictionary is in dictionary.json;
your input stream is in input.json
the jq invocation would look like this:
jq -f merge.jq --argfile dict dictionary.json input.json
With the above, you would of course refer to the dictionary as $dict in merge.jq
(Of course you could specify the filter on the jq command line, if that's what you prefer.)
Now, over to you!