JQ: Convert Dictionary with List as Values to flat CSV

Original Data
I have the following JSON:
{
"foo":[
"asd",
"fgh"
],
"bar":[
"abc",
"xyz",
"ert"
],
"baz":[
"something"
]
}
Now I want to transform it to a "flat" CSV, such that for every key in my object the list in the value is expanded to n rows with n being the number of entries in the respective list.
Expected Output
foo;asd
foo;fgh
bar;abc
bar;xyz
bar;ert
baz;something
Approaches
I guess I need to use to_entries and then, for each .value, repeat the same .key in the first column. The jq docs state that:
Thus as functions as something of a foreach loop.
So I tried combining:
to_entries, to give the keys and values from my dictionary an accessible name
then build a kind of foreach loop around the .value entries
and pass the result to @csv
to_entries|map(.value) as $v|what goes here?|@csv
I prepared something that at least compiles here

You don't need to_entries here; a simple key/value lookup and string interpolation should suffice:
keys_unsorted[] as $k | "\($k);\( .[$k][])"
The expression .[$k][] first looks up the value associated with each key (e.g. .foo) and then iterates over its elements, so the string interpolation produces one output line per list element for each key stored in $k. Run jq with -r (--raw-output) so the lines are printed without the surrounding JSON quotes.
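For anyone who wants to check the logic outside jq, here is a minimal Python sketch of the same expansion. It is only an illustration, not part of the original answer; the function name flatten_to_rows and the sep parameter are made up for this example.

```python
import json

def flatten_to_rows(obj, sep=";"):
    """Yield one 'key;value' line per list element, keeping the key order."""
    for key, values in obj.items():      # plays the role of: keys_unsorted[] as $k
        for value in values:             # plays the role of: .[$k][]
            yield f"{key}{sep}{value}"   # plays the role of the string interpolation

doc = json.loads(
    '{"foo": ["asd", "fgh"], "bar": ["abc", "xyz", "ert"], "baz": ["something"]}'
)
rows = list(flatten_to_rows(doc))
print("\n".join(rows))
```

With the sample input from the question this prints the six lines listed under Expected Output, in the same order.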

Related

Sorting a JSON file by outer object name

I have a json file input.json thus:
{
"foo":{
"prefix":"abc",
"body":[1,2,3]
},
"bar":{
"prefix":"def",
"body":[4,5,6]
}
}
I would like to sort it by the outer object names, with "bar" coming before "foo" in alphabetical order like so:
{
"bar":{
"prefix":"def",
"body":[4,5,6]
},
"foo":{
"prefix":"abc",
"body":[1,2,3]
}
}
to produce file output.json.
Versions of this question have been asked of Java/Javascript (here and here)
Is there a way to accomplish this using a command line tool like sed/awk or boost.json?
Using jq, you could use the keys built-in to get the key names in sorted order, build a single-key object for each one, and merge them back into one object with add:
jq '[keys[] as $k | { ($k) : .[$k] }] | add' input.json
Note that jq does have a --sort-keys (-S) option, but it cannot be used here because it also sorts the keys of the inner objects.
Here's a variable-free jq solution:
to_entries | sort_by(.key) | from_entries
It is also worth noting that gojq, the Go implementation of jq, currently always sorts the keys within all JSON objects.
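If jq is not available, the same idea (sort only the outer keys, leave the inner objects untouched) can be sketched in Python. This is an illustration under the assumption that the input is a single JSON object; sort_outer_keys is a hypothetical helper name.

```python
import json

def sort_outer_keys(obj):
    """Return a new dict with top-level keys sorted; inner objects are left as-is."""
    return {key: obj[key] for key in sorted(obj)}

doc = json.loads(
    '{"foo": {"prefix": "abc", "body": [1, 2, 3]},'
    ' "bar": {"prefix": "def", "body": [4, 5, 6]}}'
)
result = sort_outer_keys(doc)
print(json.dumps(result, indent=2))
```

Python dicts keep insertion order, so the inner keys stay in their original order (prefix before body), which is the behaviour --sort-keys would not give you.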

JQ - Select JSON subarray by value then update by index

I'm trying to process some JSON output and modify a value but struggling to get anywhere.
I have no control over the source data, which looks like this:
[
[
"dave",
"likes",
"rabbits"
],
[
"brian",
"likes",
"fish"
]
]
In pseudo code, I need to:
Select the subarray with value "brian" at index 0
Change the value at index [2] in the selected array to "cats"
Return the complete modified array
I've managed to use map and select to get the subarray I want (jq -r -c 'map(select(.[]=="brian"))'), but have not managed to build that into anything more useful...
Help much appreciated!
Select the subarray by the value at index 0 and update the required index with the |= update-assignment operator:
map(select(.[0] == "brian")[2] |= "cats" )
This also sets [2] to "cats" even if there was previously no value at that index.
The match does not have to be tied to index 0 either; to select any subarray that contains "brian" at any position:
map(select(any(.[]; . == "brian"))[2] |= "cats")
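To make the semantics of the first filter explicit, here is a rough Python equivalent. It is a sketch only; the function name and parameters are invented for illustration, and the padding with None mimics how jq pads a short array with null when assigning past its end.

```python
def set_where_first_is(rows, first, index, value):
    """Return a copy of rows where every row starting with `first` has row[index] = value."""
    result = []
    for row in rows:
        row = list(row)                   # copy, so the input is not mutated
        if row and row[0] == first:       # like select(.[0] == "brian")
            while len(row) <= index:      # pad short rows, as jq does with null
                row.append(None)
            row[index] = value            # like [2] |= "cats"
        result.append(row)                # unmatched rows pass through unchanged
    return result

data = [["dave", "likes", "rabbits"], ["brian", "likes", "fish"]]
updated = set_where_first_is(data, "brian", 2, "cats")
print(updated)
```

The complete outer array is returned, with only the matching subarray changed.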

jq join on common key

I'm very new to jq and this post is a result of not understanding the mechanics behind jq.
I could develop a bash script which does what I want, but jq and its JSON super-powers have intrigued me and I'd like to learn it by applying it to real-world scenarios. Here's one...
BTW, I've tried to make use of the existing jq related SO solutions for merging/joining JSONs but have failed.
The closest I came to what I needed was to use an INDEX and a concatenation of $x + . , however I was only getting the LAST item from my second (c2) json.
So, my problem is as follows:
There are two JSON files:
JSON #1 will have unique "id" and "type" keys - among other key/value pairs, which I've removed for better clarity of my post.
JSON #2 will contain multiples/non-unique "type" keys, which I'd like to match these two JSON files on. This JSON #2 will also contain other key/value pairs, which are expected to be contained in the resultant output.
My output requirements are:
I'd like to obtain a list (one per line or as a single array) of all combinations of matching key/value pairs between the c1 and c2 arrays, where the value of the "type" key (a string) matches exactly between c1 and c2.
One more question, how much more difficult would it be to scale the solution to perform similar matching/joining between three JSON files at once - again on the same value of a particular key?
Any assistance or even just hints on how to solve and understand how to solve this would be greatly appreciated!
1st input file: JSON #1, Array c1 (collection 1)
{ "c1":
[
{ "c1id":1, "type":"alpha" },
{ "c1id":2, "type":"beta" }
]
}
2nd input file: JSON #2, Array c2 (collection 2)
{
"c2":
[
{ "c2id":1,"type":"alpha","serial":"DDBB001"} ,
{ "c2id":2,"type":"beta","serial":"DDBB007"} ,
{ "c2id":3,"type":"alpha","serial":"DDTT005"} ,
{ "c2id":4,"type":"beta","serial":"DDAA002"} ,
{ "c2id":5,"type":"yotta","serial":"DDCC017"}
]
}
Expected output:
{"c1id":1,"type":"alpha","c2id":1,"serial":"DDBB001"}
{"c1id":1,"type":"alpha","c2id":3,"serial":"DDTT005"}
{"c1id":2,"type":"beta","c2id":2,"serial":"DDBB007"}
{"c1id":2,"type":"beta","c2id":4,"serial":"DDAA002"}
You will notice that type "yotta" from the c2 is not included in the output. This is expected. Only "types" which exist in c1 and match c2 are expected to be in the results. I guess this is implied by this being a matching/joining exercise - I added it just for clarity - I hope it worked.
Here's an example of using INDEX and JOIN:
jq --compact-output --slurpfile c1 c1.json '
INDEX(
$c1[0].c1[];
.type
) as $index |
JOIN(
$index;
.c2[];
.type;
select(.[1]) | reverse | add
)
' c2.json
The first argument to INDEX needs to produce a stream of items, which is why we apply [] to get the items from the array individually. The second argument selects our index key.
We use the four-argument version of JOIN. The first argument is the index itself, the second is a stream of objects to be joined to the index, the third argument selects the lookup key from the streamed objects, and the fourth argument is an expression to assemble the join object. That expression is applied to each of a stream of two-item arrays, each looking something like this:
[{"c2id":1,"type":"alpha","serial":"DDBB001"},{"c1id":1,"type":"alpha"}]
JOIN does not drop unmatched items: for a c2 entry whose type is missing from the index (here "yotta"), the second element is null, so select(.[1]) filters those pairs out. To combine all the keys and values from the two objects we use add, but we first reverse the array so that the c1 fields come before the c2 fields. Apart from the row order, which follows c2, the end result is as you hoped:
{"c1id":1,"type":"alpha","c2id":1,"serial":"DDBB001"}
{"c1id":2,"type":"beta","c2id":2,"serial":"DDBB007"}
{"c1id":1,"type":"alpha","c2id":3,"serial":"DDTT005"}
{"c1id":2,"type":"beta","c2id":4,"serial":"DDAA002"}
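Since the question asks for help understanding the mechanics, here is a small Python sketch of what the index-then-join steps amount to. It is an assumption-laden illustration rather than how jq implements them; join_on is a hypothetical name, and entries of c2 without a match in c1 are skipped.

```python
def join_on(c1, c2, key="type"):
    """Inner-join two lists of dicts on `key`; c1 values of `key` must be unique."""
    index = {item[key]: item for item in c1}   # like INDEX($c1[0].c1[]; .type)
    for item in c2:                            # like streaming .c2[]
        match = index.get(item[key])           # lookup by .type
        if match is not None:                  # skip types that c1 does not have
            yield {**match, **item}            # c1 fields first, then c2 fields

c1 = [{"c1id": 1, "type": "alpha"}, {"c1id": 2, "type": "beta"}]
c2 = [
    {"c2id": 1, "type": "alpha", "serial": "DDBB001"},
    {"c2id": 2, "type": "beta", "serial": "DDBB007"},
    {"c2id": 3, "type": "alpha", "serial": "DDTT005"},
    {"c2id": 4, "type": "beta", "serial": "DDAA002"},
    {"c2id": 5, "type": "yotta", "serial": "DDCC017"},
]
joined = list(join_on(c1, c2))
for row in joined:
    print(row)
```

Joining a third collection on the same key would, under the same assumptions, just be another pass: build an index for it and merge its match into each joined row.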

JSON: using jq with variable keys

I have input JSON data in a bunch of files, with an IP address as one of the keys. I need to iterate over the files and get "stuff" out of them. The IP address is different for each file, but I'd like to use a single jq command to get the data. I have tried a bunch of things; the closest I've come is this:
jq '.facts | keys | keys as $ip | .[0]' a_file
On my input in a_file of:
{
"facts": {
"1.1.1.1": {
"stuff":"value"
}
}
}
it returns the IP address, i.e. 1.1.1.1, but then how do I go back and do something like this (which is obviously wrong, but I hope you get the idea):
jq '.facts.$ip[0].stuff' a_file
In my mind I'm trying to populate a variable, and then use the value of that variable to rewind the input and scan it again.
=== EDIT ===
Note that my input was actually more like this:
{
"facts": {
"1.1.1.1": {
"stuff": {
"thing1":"value1"
}
},
"outer_thing": "outer_value"
}
}
So I got an error:
jq: error (at <stdin>:9): Cannot index string with string "stuff"
This fixed it (note the question mark after .stuff):
.facts | keys_unsorted[] as $k | .[$k].stuff?
You almost got it right, but you need to iterate over the keys with keys_unsorted[] and then look each one up with .[$k] to get the corresponding value:
.facts | keys_unsorted[] as $k | .[$k].stuff
This assumes that inside facts you have one object keyed by the IP address and that you want to extract .stuff from it.
To guard against values under facts that are not objects (such as the string under outer_thing), you can append ? as in .[$k].stuff?. You could also validate the keys against an IP-address regex and keep only the values for matching keys.
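As a cross-check of what the filter with the trailing ? does, here is a minimal Python sketch. It is illustrative only; stuff_values is a made-up name, and the behaviour for objects lacking "stuff" (yielding None) is chosen to mirror jq yielding null.

```python
def stuff_values(doc):
    """Yield .stuff for every object under doc['facts'], skipping non-objects."""
    for key, value in doc["facts"].items():   # like keys_unsorted[] as $k | .[$k]
        if isinstance(value, dict):           # like the ? suppressing the index error
            yield value.get("stuff")          # None if the object has no "stuff"

doc = {
    "facts": {
        "1.1.1.1": {"stuff": {"thing1": "value1"}},
        "outer_thing": "outer_value",
    }
}
found = list(stuff_values(doc))
print(found)
```

The string under outer_thing is skipped instead of raising an error, which is the effect the question mark has in the jq filter.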

Having separate arrays, how to extract value based on the column name?

I am trying to extract data from some JSON with JQ - I have already got it down to the last level of data that I need to extract from, but I am completely stumped as to how to proceed with how this part of the data is formatted.
An example would be:
{
"values": [
[
1483633677,
42
]
],
"columns": [
"time",
"count_value"
],
"name": "response_time_error"
}
I would want to extract just the value for a certain column (e.g. count_value) and I can extract it by using [-1] in this specific case, but I want to select the column by its name in case they change in the future.
If you're only extracting a single value and the arrays will always correspond with each other, you could find the index in the columns array and then use that index into the values array.
It seems like values is an array of rows with those values. Assuming you want to output the values of all rows with the selected column:
$ jq --arg col 'count_value' '.values[][.columns | index($col)]' input.json
If the specified column name does not exist in .columns, then the filter above will fail with a rather obscure error message. It might therefore be preferable to check whether the column name is found. Here is an illustration of how to do so:
jq --arg col count_value '
(.columns | index($col)) as $ix
| if $ix then .values[][$ix] else empty end' input.json
If you want an informative error message to be printed, then replace empty with something like:
error("specified column name, \($col), not found")

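The lookup-by-column-name idea, including the explicit "not found" error, can be sketched in Python as follows. This is a hedged illustration; column_values is a hypothetical helper, and raising KeyError is just one possible way to report the missing column.

```python
def column_values(doc, col):
    """Return the values of column `col` from every row of doc['values']."""
    try:
        ix = doc["columns"].index(col)   # like (.columns | index($col)) as $ix
    except ValueError:
        # like replacing `empty` with error("specified column name, ..., not found")
        raise KeyError(f"specified column name, {col}, not found") from None
    return [row[ix] for row in doc["values"]]   # like .values[][$ix]

doc = {
    "values": [[1483633677, 42]],
    "columns": ["time", "count_value"],
    "name": "response_time_error",
}
counts = column_values(doc, "count_value")
print(counts)
```

Because the index is resolved from the column name each time, the result stays correct if the order of the columns changes in the future.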