How to convert object keys to arrays with jq - json

I am trying to convert a csv where the headers are keys and the values in the column are a list.
For example I have the following csv
mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21 6 160 110 3.9 2.62 16.46 0 1 4 4
Mazda RX4 Wag 21 6 160 110 3.9 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108 93 3.85 2.32 18.61 1 1 4 1
I would like the following format.
{
"field1" : [Mazda RX4 ,Mazda RX4 Wag,Datsun 710],
"mpg" : [21,21,22.8],
"cyl" : [6,6,6],
"disp" : [160,160,108],
...
}
Note that the numerical values are not quoted. I am assuming that the columns all have the same type.
I am using the following jq command.
curl https://raw.githubusercontent.com/vincentarelbundock/Rdatasets/master/csv/datasets/mtcars.csv cars.csv | head -n4 | csvtojson | jq '.'
[
{
"field1": "Mazda RX4",
"mpg": "21",
"cyl": "6",
"disp": "160",
"hp": "110",
"drat": "3.9",
"wt": "2.62",
"qsec": "16.46",
"vs": "0",
"am": "1",
"gear": "4",
"carb": "4"
},
{
"field1": "Mazda RX4 Wag",
"mpg": "21",
"cyl": "6",
"disp": "160",
"hp": "110",
"drat": "3.9",
"wt": "2.875",
"qsec": "17.02",
"vs": "0",
"am": "1",
"gear": "4",
"carb": "4"
},
{
"field1": "Datsun 710",
"mpg": "22.8",
"cyl": "4",
"disp": "108",
"hp": "93",
"drat": "3.85",
"wt": "2.32",
"qsec": "18.61",
"vs": "1",
"am": "1",
"gear": "4",
"carb": "1"
}
]
Complete working solution
cat <csv_data> | csvtojson | jq '. as $in | reduce (.[0] | keys_unsorted[]) as $k ( {}; .[$k] = ($in|map(.[$k])))'
jq play - Converting all numbers to strings
https://jqplay.org/s/HKjHLVp9KZ

Here's a concise, efficient, and conceptually simple solution based on just map and reduce:
. as $in
| reduce (.[0] | keys_unsorted[]) as $k ( {}; .[$k] = ($in|map(.[$k])))
Converting all number-valued strings to numbers
. as $in
| reduce (.[0] | keys_unsorted[]) as $k ( {};
.[$k] = ($in|map(.[$k] | (tonumber? // .))))

Related

how to format the result with jq

I have the following printout,
{
"metric": {
"container": "container1",
"namespace": "namespace1",
"pod": "pod1"
},
"values": [
[
1664418600,
"1"
],
[
1664418900,
"2"
],
[
1664419200,
"6"
],
[
1664419500,
"8"
],
[
1664419800,
"7"
],
[
1664420100,
"9"
]
]
}
{
"metric": {
"container": "container2",
"namespace": "namespace2",
"pod": "pod2"
},
"values": [
[
1664420100,
"1"
]
]
}
What I want:
container=container1,namespace=namespace1,pod=pod1
1 1664418600
2 1664418900
6 1664419200
8 1664419500
7 1664419800
9 1664420100
container=container2,namespace=namespace2,pod=pod2
1 1664420100
Build it from two JSON programs:
Header lines: .metric | to_entries | map(join("=")) | join(",")
Get metric object: .metric
Convert to an array of key-value pairs: to_entries
Map each key-value pair object to a string "key=value": map(join("="))
Join all pairs by comma: join(",")
Value lists: .values[] | [last,first] | join(" ")
Stream values: .values[]
Reverse each two-valued array: [last,first]
Join items by blank: join(" ")
An alternative for 2.2. and 2.3. could be "\(last) \(first)", i.e. values[] | "\(last) \(first)". Or [last,first] could be replaced with reverse: .values[] | reverse | join(" ").
Putting the two programs together:
(.metric | to_entries | map(join("=")) | join(",")),
(.values[] | [last,first] | join(" "))
And then execute with raw output enabled: jq -r (.metrics|to_entries…
Output:
container=container1,namespace=namespace1,pod=pod1
1 1664418600
2 1664418900
6 1664419200
8 1664419500
7 1664419800
9 1664420100
container=container2,namespace=namespace2,pod=pod2
1 1664420100

JQ - Denormalize nested object

I've been trying to convert some JSON to csv and I have the following problem:
I have the following input json:
{"id": 100, "a": [{"t" : 1,"c" : 2 }, {"t": 2, "c" : 3 }] }
{"id": 200, "a": [{"t": 2, "c" : 3 }] }
{"id": 300, "a": [{"t": 1, "c" : 3 }] }
And I expect the following CSV output:
id,t1,t2
100,2,3
200,,3
300,3,
Unfortunately JQ doesn't output if one of select has no match.
Example:
echo '{ "id": 100, "a": [{"t" : 1,"c" : 2 }, {"t": 2, "c" : 3 }] }' | jq '{t1: (.a[] | select(.t==1)).c , t2: (.a[] | select(.t==2)).c }'
output:
{ "t1": 2, "t2": 3 }
but if one of the objects select returns no match it doesn't return at all.
Example:
echo '{ "id": 100, "a": [{"t" : 1,"c" : 2 }] }' | jq '{t1: (.a[] | select(.t==1)).c , t2: (.a[] | select(.t==2)).c }'
Expected output:
{ "t1": 2, "t2": null }
Does anyone know how to achieve this with JQ?
EDIT:
Based on a comment made by #peak I found the solution that I was looking for.
jq -r '["id","t1","t2"],[.id, (.a[] | select(.t==1)).c//null, (.a[] | select(.t==2)).c//null ]|#csv'
The alternative operator does exactly what I was looking for.
Alternative Operator
Here's a simple solution that does not assume anything about the ordering of the items in the .a array, and easily generalizes to arbitrarily many .t values:
# Convert an array of {t, c} to a dictionary:
def tod: map({(.t|tostring): .c}) | add;
["id", "t1", "t2"], # header
(inputs
| (.a | tod) as $dict
| [.id, (range(1;3) as $i | $dict[$i|tostring]) ])
| #csv
Command-line options
Use the -n option (because inputs is being used), and the -r option (to produce CSV).
This is an absolute mess, but it works:
$ cat tmp.json
{"id": 100, "a": [{"t" : 1,"c" : 2 }, {"t": 2, "c" : 3 }] }
{"id": 200, "a": [{"t": 2, "c" : 3 }] }
{"id": 300, "a": [{"t": 1, "c" : 3 }] }
$ cat filter.jq
def t(id):
.a |
map({key: "t\(.t)", value: .c}) |
({t1:null, t2:null, id:id} | to_entries) + . | from_entries
;
inputs |
map(.id as $id | t($id)) |
(.[0] | keys) as $hdr |
([$hdr] + map(to_entries |map(.value)))[]|
#csv
$ jq -rn --slurp -f filter.jq tmp.json
"id","t1","t2"
2,3,100
,3,200
3,,300
In short, you produce a direct object containing the values from your input, then add it to a "default" object to fill in the missing keys.

How to create key:value list from JSON? Key name should contain some values from object itself

I'm trying to parse JSON and store certain values as metrics in Graphite.
In order to make my Graphite more user-friendly I have to form a metric name, that contains some values from its object.
I got working solution on bash loops + jq, but it's really slow. So I'm asking for help :)
Here is my input:
{
...
},
"Johnny Cage": {
"firstname": "Johnny",
"lastname": "Cage",
"height": 183,
"weight": 82,
"hands": 2,
"legs": 2,
...
},
...
}
Desired output:
mk.fighter.Johnny.Cage.firstname Johnny
mk.fighter.Johnny.Cage.lastname Cage
mk.fighter.Johnny.Cage.height 183
mk.fighter.Johnny.Cage.weight 82
mk.fighter.Johnny.Cage.hands 2
mk.fighter.Johnny.Cage.legs 2
...
With single jq command:
Sample input.json:
{
"Johnny Cage": {
"firstname": "Johnny",
"lastname": "Cage",
"height": 183,
"weight": 82,
"hands": 2,
"legs": 2
}
}
jq -r 'to_entries[] | (.key | sub(" "; ".")) as $name
| .value | to_entries[]
| "mk.fighter.\($name).\(.key) \(.value)"' input.json
To get $name as a combination of inner firstname and lastname keys replace (.key | sub(" "; ".")) as $name with "\(.value.firstname).\(.value.lastname)" as $name
The output:
mk.fighter.Johnny.Cage.firstname Johnny
mk.fighter.Johnny.Cage.lastname Cage
mk.fighter.Johnny.Cage.height 183
mk.fighter.Johnny.Cage.weight 82
mk.fighter.Johnny.Cage.hands 2
mk.fighter.Johnny.Cage.legs 2

Convert JSON categories tree to database table

Imagine I have a categories tree like this JSON file:
[
{
"id": "1",
"text": "engine",
"children": [
{
"id": "2",
"text": "exhaust",
"children": []
},
{
"id": "3",
"text": "cooling",
"children": [
{
"id": "4",
"text": "cooling fan",
"children": []
},
{
"id": "5",
"text": "water pump",
"children": []
}
]
}
]
},
{
"id": "6",
"text": "frame",
"children": [
{
"id": "7",
"text": "wheels",
"children": []
},
{
"id": "8",
"text": "brakes",
"children": [
{
"id": "9",
"text": "brake calipers",
"children": []
}
]
},
{
"id": "10",
"text": "cables",
"children": []
}
]
}
]
How can I convert it to this flat table?
id parent_id text
1 NULL engine
2 1 exhaust
3 1 cooling
4 3 cooling fan
5 3 water pump
6 NULL frame
7 6 wheels
8 6 brakes
9 8 brake calipers
10 6 cables
I found similar questions and inverted questions (from table to JSON) but I can't figure it out with jq and its #tsv filter. Also I noticed the "flatten" filter is not often referenced in the answers (while it looks to be the exact tool I need) but it might be because it was introduced recently in the latests versions of jq.
Here is another solution which uses jq's recurse builtin:
["id","parent_id","text"]
, (
.[]
| recurse(.id as $p| .children[] | .parent=$p )
| [.id, .parent, .text]
)
| #tsv
Sample Run (assumes filter in filter.jq and sample data in data.json)
$ jq -Mr -f filter.jq data.json
id parent_id text
1 engine
2 1 exhaust
3 1 cooling
4 3 cooling fan
5 3 water pump
6 frame
7 6 wheels
8 6 brakes
9 8 brake calipers
10 6 cables
Try it online!
The key here is to define a recursive function, like so:
def children($parent_id):
.id as $id
| [$id, $parent_id, .text],
(.children[] | children($id)) ;
With your data, the filter:
.[]
| children("NULL")
| #tsv
produces the tab-separated values shown below. It is now easy to add headers, convert to fixed-width format if desired, etc.
1 NULL engine
2 1 exhaust
3 1 cooling
4 3 cooling fan
5 3 water pump
6 NULL frame
7 6 wheels
8 6 brakes
9 8 brake calipers
10 6 cables
Here is a solution which uses a recursive function:
def details($parent):
[.id, $parent, .text], # details for current node
(.id as $p | .children[] | details($p)) # details for children
;
["id","parent_id","text"] # header
, (.[] | details(null)) # details
| #tsv # convert to tsv
Sample Run (assumes filter in filter.jq and sample data in data.json)
$ jq -Mr -f filter.jq data.json
id parent_id text
1 engine
2 1 exhaust
3 1 cooling
4 3 cooling fan
5 3 water pump
6 frame
7 6 wheels
8 6 brakes
9 8 brake calipers
10 6 cables
Try it online!

How to generate continuing indices for multiple objects in nested arrays that are in an array

Given
[{
"objects": [{
"key": "value"
},{
"key": "value"
}]
}, {
"objects": [{
"key": "value"
}, {
"key": "value"
}]
}]
How do I generate
[{
"objects": [{
"id": 0,
"key": "value"
},{
"id": 1,
"key": "value"
}]
}, {
"objects": [{
"id": 2,
"key": "value"
}, {
"id": 3,
"key": "value"
}]
}]
Using jq?
I tried to use this one, but ids are all 0:
jq '[(-1) as $i | .[] | {objects: [.objects[] | {id: ($i + 1 as $i | $i), key}]}]'
The key to a simple solution here is to break the problem down into easy pieces. This can be accomplished by defining a helper function, addId/1. Once that is done, the rest is straightforward:
# starting at start, add {id: ID} to each object in the input array
def addId(start):
reduce .[] as $o
([];
length as $l
| .[length] = ($o | (.id = start + $l)));
reduce .[] as $o
( {start: -1, answer: []};
(.start + 1) as $next
| .answer += [$o | (.objects |= addId($next))]
| .start += ($o.objects | length) )
| .answer
Inspired by #peak answer, I came up with this solution. Not much difference, just shorter way to generate IDs and opt for foreach instead of reduce since there is intermediate result involved.
def addIdsStartWith($start):
[to_entries | map((.value.id = .key + $start) | .value)];
[foreach .[] as $set (
{start: 0};
.set = $set |
.start as $start | .set.objects |= addIdsStartWith($start) |
.start += ($set.objects | length);
.set
)]