I have the following json file which I would like to convert to csv:
{
"id": 1,
"date": "2014-05-05T19:07:48.577"
}
{
"id": 2,
"date": null
}
Converting it to csv with the following jq produces:
$ jq -sr '(map(keys) | add | unique) as $cols | map(. as $row | $cols | map($row[.])) as $rows | $cols, $rows[] | @csv' < test.json
"date","id"
"2014-05-05T19:07:48.577",1
,2
Unfortunately, for the line with "id" equal to "2", the date column was not set to "null" - instead, it was empty. This in turn makes MySQL error on import if it's a datetime column (it expects a literal "null" if we don't have a date, and errors on "").
How can I make jq print the literal "null", and not ""?
I'd go with:
(map(keys_unsorted) | add | unique) as $cols
| $cols,
(.[] | [.[$cols[]]] | map(. // "null") )
| @csv
First, using keys_unsorted avoids useless sorting.
Second, [.[$cols[]]] is an important, recurrent and idiomatic pattern, used to ensure an array is constructed in the correct order without resorting to the reduce sledge-hammer.
Third, although map(. // "null") seems to be appropriate here, it should be noted that this expression will also replace false with "null", so, it would not be appropriate in general. Instead, to preserve false, one could write map(if . == null then "null" else . end).
Fourth, it should be noted that using map(. // "null") as above will also mask missing values of any of the keys, so if one wants some other behavior (e.g., raising an error if id is missing), then an alternative approach would be warranted.
The above assumes the stream of JSON objects shown in the question is "slurped", e.g. using jq's -s command-line option.
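To make the third point concrete, here is a minimal sketch that runs both null-handling variants over a made-up pair of records. The `active` field and the sample data are assumptions added for illustration (they are not in the question), and the sketch assumes jq 1.5 or later is on the PATH.

```shell
# Hypothetical input: one record carries a false flag, the other a null date.
input='{"id":1,"active":false,"date":"2014-05-05T19:07:48.577"}
{"id":2,"active":true,"date":null}'

# Variant A: the alternative operator replaces null AND false.
with_alternative=$(printf '%s\n' "$input" | jq -sr '
  (map(keys_unsorted) | add | unique) as $cols
  | .[] | [.[$cols[]]] | map(. // "null") | @csv')

# Variant B: only a real null is replaced, so false survives.
with_explicit_test=$(printf '%s\n' "$input" | jq -sr '
  (map(keys_unsorted) | add | unique) as $cols
  | .[] | [.[$cols[]]] | map(if . == null then "null" else . end) | @csv')

printf 'A:\n%s\nB:\n%s\n' "$with_alternative" "$with_explicit_test"
```

In variant A the first row starts with "null" where the input had false; in variant B it starts with false. Note that @csv quotes every string, so the replacement is emitted as "null" including the double quotes.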
Use the alternative operator // for the cell value (note the -s option, which slurps the stream of objects into an array):
jq -sr '(map(keys) | add | unique) as $cols | map(. as $row | $cols | map($row[.] // "null")) as $rows | $cols, $rows[] | @csv' < test.json
(The whole filter is explained well here: https://stackoverflow.com/a/32965227/16174836)
You can "stringify" the value using tostring by changing map($row[.]) into map($row[.]|tostring):
$ cat so2332.json
{
"id": 1,
"date": "2014-05-05T19:07:48.577"
}
{
"id": 2,
"date": null
}
$ jq --slurp --raw-output '(map(keys) | add | unique) as $cols | map(. as $row | $cols | map($row[.]|tostring)) as $rows | $cols, $rows[] | @csv' so2332.json
"date","id"
"2014-05-05T19:07:48.577","1"
"null","2"
Note that the use of tostring will cause the numbers to be converted to strings.
I have a JSON document that is basically an array, but in a weird format that I cannot change.
Is there any way to get the url with jq by searching for the name, like this?
{
"servers": {
"servers[0].name": "abc",
"servers[0].url": "www.abc.test.com",
"servers[1].name": "xyz",
"servers[1].url": "www.xyz.test.com"
}
}
jq -r '.servers | select(.name=="abc") | .url'
Assuming the "=" can be naively changed to ":":
sed 's/ = /: /' | jq '
.servers
| keys_unsorted[] as $k
| select(.[$k] == "abc")
| ($k | sub("[.]name"; ".url")) as $k
| .[$k]
'
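As a runnable variation on the filter above, the sketch below passes the name in with --arg and only considers keys ending in ".name", so that a url value that happens to equal the search term is not matched. The helper name `url_for`, the variable `servers_json`, and the extra endswith check are assumptions for illustration; the input is the sample object as displayed in the question, so no sed step is needed here.

```shell
# Sample input copied from the question.
servers_json='{"servers":{"servers[0].name":"abc","servers[0].url":"www.abc.test.com","servers[1].name":"xyz","servers[1].url":"www.xyz.test.com"}}'

# url_for NAME (hypothetical helper): print the url stored next to NAME.
url_for() {
  printf '%s\n' "$servers_json" | jq -r --arg name "$1" '
    .servers
    | keys_unsorted[] as $k
    # only ".name" keys may match, so a url equal to NAME is ignored
    | select(($k | endswith(".name")) and .[$k] == $name)
    | ($k | sub("[.]name$"; ".url")) as $url_key
    | .[$url_key]
  '
}

url_for abc
```

Calling `url_for xyz` looks up the second pseudo-array element in the same way.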
If you are looking for a general way to build a JSON array or object from such a source, here's one way using reduce and setpath, with regexes for splitting up the keys:
def build:
reduce (to_entries[] | .key |= [
splits("(?=\\[\\d+\\])|\\.")
| capture("\\[(?<index>\\d+)\\]|(?<field>.+)")
| (.index | tonumber)? // .field
]) as {$key, $value} (null; setpath($key; $value));
.servers | build.servers[] | select(.name == "abc").url
Input json:
{
"food_group": "fruit",
"glycemic_index": "low",
"fruits": {
"fruit_name": "apple",
"size": "large",
"color": "red"
}
}
The following two jq commands work:
# jq -r 'keys_unsorted[] as $key | "\($key), \(.[$key])"' food.json
food_group, fruit
glycemic_index, low
fruits, {"fruit_name":"apple","size":"large","color":"red"}
# jq -r 'keys_unsorted[0:2] as $key | "\($key)"' food.json
["food_group","glycemic_index"]
How to get values for the first two keys using jq in the same manner? I tried below
# jq -r 'keys_unsorted[0:2] as $key | "\($key), \(.[$key])"' food.json
jq: error (at food.json:9): Cannot index object with array
Expected output:
food_group, fruit
glycemic_index, low
To iterate over an object, you can use to_entries, which transforms it into an array of key/value entries.
You can then use select to keep only the rows you want.
jq -r 'to_entries[]| select( ( .value | type ) == "string" ) | "\(.key), \(.value)" '
You can use to_entries
to_entries[] | select(.key=="food_group" or .key=="glycemic_index") | "\(.key), \(.value)"
Demo
https://jqplay.org/s/Aqvos4w7bo
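For completeness, the attempt in the question can also be kept almost as written. keys_unsorted[0:2] produces one array, and an object cannot be indexed with an array, which is what the error says. Appending [] iterates over the slice so that $key is bound to one string at a time. A minimal sketch, with the question's input inlined in a shell variable (`food_json` is a name chosen here for illustration):

```shell
# Input copied from the question, inlined so the example is self-contained.
food_json='{"food_group":"fruit","glycemic_index":"low","fruits":{"fruit_name":"apple","size":"large","color":"red"}}'

# Slice the first two keys, then iterate over the slice with [].
first_two=$(printf '%s\n' "$food_json" |
  jq -r 'keys_unsorted[0:2][] as $key | "\($key), \(.[$key])"')

printf '%s\n' "$first_two"
```

This prints the two expected lines, "food_group, fruit" and "glycemic_index, low", and relies on key order in the file rather than on key names or value types.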
We are trying to parse a JSON file to a tsv file. We are having problems trying to eliminate duplicate Id with unique.
JSON file
[
{"Id": "101",
"Name": "Yugi"},
{"Id": "101",
"Name": "Yugi"},
{"Id": "102",
"Name": "David"}
]
cat getEvent_all.json | jq -cr '.[] | [.Id] | unique_by(.[].Id)'
jq: error (at <stdin>:0): Cannot iterate over string ("101")
A reasonable approach would be to use unique_by, e.g.:
unique_by(.Id)[]
| [.Id, .Name]
| @tsv
Alternatively, you could form the pairs first:
map([.Id, .Name])
| unique_by(.[0])[]
| @tsv
uniques_by/2
For very large arrays, though, or if you want to respect the original ordering, a sort-free alternative to unique_by should be considered. Here is a suitable, generic, stream-oriented alternative:
def uniques_by(stream; f):
foreach stream as $x ({};
($x|f) as $s
| ($s|type) as $t
| (if $t == "string" then $s
else ($s|tostring) end) as $y
| if .[$t][$y] then .emit = false
else .emit = true | (.item = $x) | (.[$t][$y] = true)
end;
if .emit then .item else empty end );
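The definition above is shown without a call, so here is a sketch of one way it might be invoked on the question's data: the array elements are passed in as the stream via .[] and .Id is used as the key. The shell variable names are assumptions for illustration, and the three-argument foreach requires jq 1.5 or later.

```shell
# The question's records, with the duplicate entry kept.
events='[{"Id":"101","Name":"Yugi"},{"Id":"101","Name":"Yugi"},{"Id":"102","Name":"David"}]'

deduped=$(printf '%s\n' "$events" | jq -r '
  def uniques_by(stream; f):
    foreach stream as $x ({};
      ($x|f) as $s
      | ($s|type) as $t
      | (if $t == "string" then $s else ($s|tostring) end) as $y
      | if .[$t][$y] then .emit = false
        else .emit = true | (.item = $x) | (.[$t][$y] = true)
        end;
      if .emit then .item else empty end);
  # stream the array elements, keep the first record seen for each Id
  uniques_by(.[]; .Id) | [.Id, .Name] | @tsv')

printf '%s\n' "$deduped"
```

Each record is emitted the first time its Id is seen, so the output keeps the input order instead of being sorted by key.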
I have the following JSON:
{
"A": {
"C": {
"D": "T1",
"E": 1
},
"F": {
"D": "T2",
"E": 2
}
},
"B": {
"C": {
"D": "T3",
"E": 3
}
}
}
I want to convert it to CSV as follows:
A,C,T1,1
A,F,T2,2
B,C,T3,3
Description of the output: the parent keys are printed until a leaf child is reached; once the leaf child is reached, its values are printed.
I tried the following without success:
cat my.json | jq -r '(map(keys) | add | unique) as $cols | map(. as $row | $cols | map($row[.])) as $rows | $rows[] | @csv'
It throws an error.
I can't hardcode the parent keys, as the actual JSON has too many records, but the structure of the JSON is similar. What am I missing?
Some of the requirements are unclear, but the following solves one interpretation of the problem:
paths as $path
| {path: $path, value: getpath($path)}
| select(.value|type == "object" )
| select( [.value[]][0] | type != "object")
| .path + ([.value[]])
| @csv
(This program could be optimized but the presentation here is intended to make the separate steps clear.)
Invocation:
jq -r -f leaves-to-csv.jq input.json
Output:
"A","C","T1",1
"A","F","T2",2
"B","C","T3",3
Unquoted strings
To avoid the quotation marks around strings, you could replace the last component of the pipeline above with:
join(",")
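Put together, the unquoted variant might look like the sketch below, with the question's input inlined in a shell variable (`nested` and `rows` are names chosen here). It assumes jq 1.5 or later, where join converts numbers to strings.

```shell
# Input copied from the question.
nested='{"A":{"C":{"D":"T1","E":1},"F":{"D":"T2","E":2}},"B":{"C":{"D":"T3","E":3}}}'

rows=$(printf '%s\n' "$nested" | jq -r '
  paths as $path
  | {path: $path, value: getpath($path)}
  | select(.value | type == "object")          # keep objects only
  | select([.value[]][0] | type != "object")   # whose first child is a leaf
  | .path + [.value[]]                         # parent keys, then leaf values
  | join(",")')

printf '%s\n' "$rows"
```

Unlike the CSV formatter, join does no escaping, so this form is only safe when the keys and values contain no commas, double quotes, or newlines.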
Here is a solution using tostream and group_by
[
tostream
| select(length == 2) # e.g. [["A","C","D"],"T1"]
| .[0][:-1] + [.[1]] # ["A","C","T1"]
]
| group_by(.[:-1]) # [[["A","C","T1"],["A","C",1]],...
| .[] # [["A","C","T1"],["A","C",1]]
| .[0][0:2] + map(.[-1]|tostring) # ["A","C","T1","1"]
| join(",") # "A,C,T1,1"
I'd like to filter the output from the JSON file below to get all keys that start with "tag_Name":
{
...
"tag_Name_abc": [
"10_1_4_3",
"10_1_6_2",
"10_1_5_3",
"10_1_5_5"
],
"tag_Name_efg": [
"10_1_4_5"
],
...
}
I tried the following, but it failed:
$ cat output.json |jq 'map(select(startswith("tag_Name")))'
jq: error (at <stdin>:1466): startswith() requires string inputs
There are plenty of ways to do this, but the simplest is to convert the object to entries so that you have access to the keys, filter the entries by the names you want, and then convert back again.
with_entries(select(.key | startswith("tag_Name")))
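The attempt in the question fails because map on an object iterates over the values (here, arrays), not the keys, so startswith never receives a string. A small sketch of the with_entries filter on a cut-down input follows; the `other` key stands in for the elided parts of the question's file and is purely hypothetical.

```shell
# Cut-down sample; "other" is a made-up key representing the elided entries.
tags='{"other":["x"],"tag_Name_abc":["10_1_4_3","10_1_6_2"],"tag_Name_efg":["10_1_4_5"]}'

# with_entries exposes each key as .key, so startswith sees a string.
filtered=$(printf '%s\n' "$tags" |
  jq -c 'with_entries(select(.key | startswith("tag_Name")))')

printf '%s\n' "$filtered"
```

Only the two tag_Name entries remain in the result, with their arrays unchanged.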
Here are a few more solutions:
1) combining values for matching keys with add
. as $d
| keys
| map( select(startswith("tag_Name")) | {(.): $d[.]} )
| add
2) filtering out non-matching keys with delpaths
delpaths([
keys[]
| select(startswith("tag_Name") | not)
| [.]
])
3) filtering out non-matching keys with reduce and del
reduce keys[] as $k (
.
; if ($k|startswith("tag_Name")) then . else del(.[$k]) end
)