I am trying to convert a csv where the headers are keys and the values in the column are a list.
For example I have the following csv
mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21 6 160 110 3.9 2.62 16.46 0 1 4 4
Mazda RX4 Wag 21 6 160 110 3.9 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108 93 3.85 2.32 18.61 1 1 4 1
I would like the following format.
{
"field1" : [Mazda RX4 ,Mazda RX4 Wag,Datsun 710],
"mpg" : [21,21,22.8],
"cyl" : [6,6,6],
"disp" : [160,160,108],
...
}
Note that the numerical values are not quoted. I am assuming that the columns all have the same type.
I am using the following jq command.
curl https://raw.githubusercontent.com/vincentarelbundock/Rdatasets/master/csv/datasets/mtcars.csv cars.csv | head -n4 | csvtojson | jq '.'
[
{
"field1": "Mazda RX4",
"mpg": "21",
"cyl": "6",
"disp": "160",
"hp": "110",
"drat": "3.9",
"wt": "2.62",
"qsec": "16.46",
"vs": "0",
"am": "1",
"gear": "4",
"carb": "4"
},
{
"field1": "Mazda RX4 Wag",
"mpg": "21",
"cyl": "6",
"disp": "160",
"hp": "110",
"drat": "3.9",
"wt": "2.875",
"qsec": "17.02",
"vs": "0",
"am": "1",
"gear": "4",
"carb": "4"
},
{
"field1": "Datsun 710",
"mpg": "22.8",
"cyl": "4",
"disp": "108",
"hp": "93",
"drat": "3.85",
"wt": "2.32",
"qsec": "18.61",
"vs": "1",
"am": "1",
"gear": "4",
"carb": "1"
}
]
Complete working solution
cat <csv_data> | csvtojson | jq '. as $in | reduce (.[0] | keys_unsorted[]) as $k ( {}; .[$k] = ($in|map(.[$k])))'
jq play - Converting all numbers to strings
https://jqplay.org/s/HKjHLVp9KZ
Here's a concise, efficient, and conceptually simple solution based on just map and reduce:
. as $in
| reduce (.[0] | keys_unsorted[]) as $k ( {}; .[$k] = ($in|map(.[$k])))
Converting all number-valued strings to numbers
. as $in
| reduce (.[0] | keys_unsorted[]) as $k ( {};
.[$k] = ($in|map(.[$k] | (tonumber? // .))))
Related
I have the following printout,
{
"metric": {
"container": "container1",
"namespace": "namespace1",
"pod": "pod1"
},
"values": [
[
1664418600,
"1"
],
[
1664418900,
"2"
],
[
1664419200,
"6"
],
[
1664419500,
"8"
],
[
1664419800,
"7"
],
[
1664420100,
"9"
]
]
}
{
"metric": {
"container": "container2",
"namespace": "namespace2",
"pod": "pod2"
},
"values": [
[
1664420100,
"1"
]
]
}
What I want:
container=container1,namespace=namespace1,pod=pod1
1 1664418600
2 1664418900
6 1664419200
8 1664419500
7 1664419800
9 1664420100
container=container2,namespace=namespace2,pod=pod2
1 1664420100
Build it from two JSON programs:
Header lines: .metric | to_entries | map(join("=")) | join(",")
Get metric object: .metric
Convert to an array of key-value pairs: to_entries
Map each key-value pair object to a string "key=value": map(join("="))
Join all pairs by comma: join(",")
Value lists: .values[] | [last,first] | join(" ")
Stream values: .values[]
Reverse each two-valued array: [last,first]
Join items by blank: join(" ")
An alternative for 2.2. and 2.3. could be "\(last) \(first)", i.e. values[] | "\(last) \(first)". Or [last,first] could be replaced with reverse: .values[] | reverse | join(" ").
Putting the two programs together:
(.metric | to_entries | map(join("=")) | join(",")),
(.values[] | [last,first] | join(" "))
And then execute with raw output enabled: jq -r (.metrics|to_entries…
Output:
container=container1,namespace=namespace1,pod=pod1
1 1664418600
2 1664418900
6 1664419200
8 1664419500
7 1664419800
9 1664420100
container=container2,namespace=namespace2,pod=pod2
1 1664420100
I've been trying to convert some JSON to csv and I have the following problem:
I have the following input json:
{"id": 100, "a": [{"t" : 1,"c" : 2 }, {"t": 2, "c" : 3 }] }
{"id": 200, "a": [{"t": 2, "c" : 3 }] }
{"id": 300, "a": [{"t": 1, "c" : 3 }] }
And I expect the following CSV output:
id,t1,t2
100,2,3
200,,3
300,3,
Unfortunately JQ doesn't output if one of select has no match.
Example:
echo '{ "id": 100, "a": [{"t" : 1,"c" : 2 }, {"t": 2, "c" : 3 }] }' | jq '{t1: (.a[] | select(.t==1)).c , t2: (.a[] | select(.t==2)).c }'
output:
{ "t1": 2, "t2": 3 }
but if one of the objects select returns no match it doesn't return at all.
Example:
echo '{ "id": 100, "a": [{"t" : 1,"c" : 2 }] }' | jq '{t1: (.a[] | select(.t==1)).c , t2: (.a[] | select(.t==2)).c }'
Expected output:
{ "t1": 2, "t2": null }
Does anyone know how to achieve this with JQ?
EDIT:
Based on a comment made by #peak I found the solution that I was looking for.
jq -r '["id","t1","t2"],[.id, (.a[] | select(.t==1)).c//null, (.a[] | select(.t==2)).c//null ]|#csv'
The alternative operator does exactly what I was looking for.
Alternative Operator
Here's a simple solution that does not assume anything about the ordering of the items in the .a array, and easily generalizes to arbitrarily many .t values:
# Convert an array of {t, c} to a dictionary:
def tod: map({(.t|tostring): .c}) | add;
["id", "t1", "t2"], # header
(inputs
| (.a | tod) as $dict
| [.id, (range(1;3) as $i | $dict[$i|tostring]) ])
| #csv
Command-line options
Use the -n option (because inputs is being used), and the -r option (to produce CSV).
This is an absolute mess, but it works:
$ cat tmp.json
{"id": 100, "a": [{"t" : 1,"c" : 2 }, {"t": 2, "c" : 3 }] }
{"id": 200, "a": [{"t": 2, "c" : 3 }] }
{"id": 300, "a": [{"t": 1, "c" : 3 }] }
$ cat filter.jq
def t(id):
.a |
map({key: "t\(.t)", value: .c}) |
({t1:null, t2:null, id:id} | to_entries) + . | from_entries
;
inputs |
map(.id as $id | t($id)) |
(.[0] | keys) as $hdr |
([$hdr] + map(to_entries |map(.value)))[]|
#csv
$ jq -rn --slurp -f filter.jq tmp.json
"id","t1","t2"
2,3,100
,3,200
3,,300
In short, you produce a direct object containing the values from your input, then add it to a "default" object to fill in the missing keys.
I'm trying to parse JSON and store certain values as metrics in Graphite.
In order to make my Graphite more user-friendly I have to form a metric name, that contains some values from its object.
I got working solution on bash loops + jq, but it's really slow. So I'm asking for help :)
Here is my input:
{
...
},
"Johnny Cage": {
"firstname": "Johnny",
"lastname": "Cage",
"height": 183,
"weight": 82,
"hands": 2,
"legs": 2,
...
},
...
}
Desired output:
mk.fighter.Johnny.Cage.firstname Johnny
mk.fighter.Johnny.Cage.lastname Cage
mk.fighter.Johnny.Cage.height 183
mk.fighter.Johnny.Cage.weight 82
mk.fighter.Johnny.Cage.hands 2
mk.fighter.Johnny.Cage.legs 2
...
With single jq command:
Sample input.json:
{
"Johnny Cage": {
"firstname": "Johnny",
"lastname": "Cage",
"height": 183,
"weight": 82,
"hands": 2,
"legs": 2
}
}
jq -r 'to_entries[] | (.key | sub(" "; ".")) as $name
| .value | to_entries[]
| "mk.fighter.\($name).\(.key) \(.value)"' input.json
To get $name as a combination of inner firstname and lastname keys replace (.key | sub(" "; ".")) as $name with "\(.value.firstname).\(.value.lastname)" as $name
The output:
mk.fighter.Johnny.Cage.firstname Johnny
mk.fighter.Johnny.Cage.lastname Cage
mk.fighter.Johnny.Cage.height 183
mk.fighter.Johnny.Cage.weight 82
mk.fighter.Johnny.Cage.hands 2
mk.fighter.Johnny.Cage.legs 2
Imagine I have a categories tree like this JSON file:
[
{
"id": "1",
"text": "engine",
"children": [
{
"id": "2",
"text": "exhaust",
"children": []
},
{
"id": "3",
"text": "cooling",
"children": [
{
"id": "4",
"text": "cooling fan",
"children": []
},
{
"id": "5",
"text": "water pump",
"children": []
}
]
}
]
},
{
"id": "6",
"text": "frame",
"children": [
{
"id": "7",
"text": "wheels",
"children": []
},
{
"id": "8",
"text": "brakes",
"children": [
{
"id": "9",
"text": "brake calipers",
"children": []
}
]
},
{
"id": "10",
"text": "cables",
"children": []
}
]
}
]
How can I convert it to this flat table?
id parent_id text
1 NULL engine
2 1 exhaust
3 1 cooling
4 3 cooling fan
5 3 water pump
6 NULL frame
7 6 wheels
8 6 brakes
9 8 brake calipers
10 6 cables
I found similar questions and inverted questions (from table to JSON) but I can't figure it out with jq and its #tsv filter. Also I noticed the "flatten" filter is not often referenced in the answers (while it looks to be the exact tool I need) but it might be because it was introduced recently in the latests versions of jq.
Here is another solution which uses jq's recurse builtin:
["id","parent_id","text"]
, (
.[]
| recurse(.id as $p| .children[] | .parent=$p )
| [.id, .parent, .text]
)
| #tsv
Sample Run (assumes filter in filter.jq and sample data in data.json)
$ jq -Mr -f filter.jq data.json
id parent_id text
1 engine
2 1 exhaust
3 1 cooling
4 3 cooling fan
5 3 water pump
6 frame
7 6 wheels
8 6 brakes
9 8 brake calipers
10 6 cables
Try it online!
The key here is to define a recursive function, like so:
def children($parent_id):
.id as $id
| [$id, $parent_id, .text],
(.children[] | children($id)) ;
With your data, the filter:
.[]
| children("NULL")
| #tsv
produces the tab-separated values shown below. It is now easy to add headers, convert to fixed-width format if desired, etc.
1 NULL engine
2 1 exhaust
3 1 cooling
4 3 cooling fan
5 3 water pump
6 NULL frame
7 6 wheels
8 6 brakes
9 8 brake calipers
10 6 cables
Here is a solution which uses a recursive function:
def details($parent):
[.id, $parent, .text], # details for current node
(.id as $p | .children[] | details($p)) # details for children
;
["id","parent_id","text"] # header
, (.[] | details(null)) # details
| #tsv # convert to tsv
Sample Run (assumes filter in filter.jq and sample data in data.json)
$ jq -Mr -f filter.jq data.json
id parent_id text
1 engine
2 1 exhaust
3 1 cooling
4 3 cooling fan
5 3 water pump
6 frame
7 6 wheels
8 6 brakes
9 8 brake calipers
10 6 cables
Try it online!
Given
[{
"objects": [{
"key": "value"
},{
"key": "value"
}]
}, {
"objects": [{
"key": "value"
}, {
"key": "value"
}]
}]
How do I generate
[{
"objects": [{
"id": 0,
"key": "value"
},{
"id": 1,
"key": "value"
}]
}, {
"objects": [{
"id": 2,
"key": "value"
}, {
"id": 3,
"key": "value"
}]
}]
Using jq?
I tried to use this one, but ids are all 0:
jq '[(-1) as $i | .[] | {objects: [.objects[] | {id: ($i + 1 as $i | $i), key}]}]'
The key to a simple solution here is to break the problem down into easy pieces. This can be accomplished by defining a helper function, addId/1. Once that is done, the rest is straightforward:
# starting at start, add {id: ID} to each object in the input array
def addId(start):
reduce .[] as $o
([];
length as $l
| .[length] = ($o | (.id = start + $l)));
reduce .[] as $o
( {start: -1, answer: []};
(.start + 1) as $next
| .answer += [$o | (.objects |= addId($next))]
| .start += ($o.objects | length) )
| .answer
Inspired by #peak answer, I came up with this solution. Not much difference, just shorter way to generate IDs and opt for foreach instead of reduce since there is intermediate result involved.
def addIdsStartWith($start):
[to_entries | map((.value.id = .key + $start) | .value)];
[foreach .[] as $set (
{start: 0};
.set = $set |
.start as $start | .set.objects |= addIdsStartWith($start) |
.start += ($set.objects | length);
.set
)]