JQ sorting key by arbitrary order - json

I am trying to sort keys in an arbitrary order (the key 'event' should come first).
I know this sorts the keys alphabetically:
jq -S '.' file.json
But is there a function that sorts the keys so that a given key always comes first, rather than sorting alphabetically?
The goal is to make the objects more human-readable by putting the most significant key first.
Currently I have:
{key1:value, shouldbeFirstKey:value2, ...}
I would like:
{shouldbeFirstKey:value2, key1:value, ...}

Suppose we have an object with a bunch of keys, and want some of those keys to appear in a certain order, while leaving the others as-is. Then the technique illustrated by the following example can be used:
$ jq -n '{a:2,b:3,first:0,second:1} | . as $in | {first,second} + $in'
The result:
{
"first": 0,
"second": 1,
"a": 2,
"b": 3
}
rekey
Let's call the object that defines the key ordering the "template object" ({first,second} above). Notice that with the technique described above, the keys in the template object always appear in the result. If we want the template object's keys to appear in the result only if they appear in the input, we can modify the approach using the following function:
def rekey(obj):
. as $in
| reduce (obj|keys_unsorted)[] as $k ({};
if $in|has($k) then . + {($k): $in[$k]} else . end)
| . + $in ;
For example:
{a:2,b:3,first:0,second:1} | rekey({first,second,third})
produces:
{
"first": 0,
"second": 1,
"a": 2,
"b": 3
}
With walk/1
If one wants to reorder keys recursively, one can use walk/1 (as defined at https://github.com/stedolan/jq/blob/master/src/builtin.jq),
as illustrated here using the above definition of rekey:
walk(if type == "object" then rekey($template) else . end)
where $template represents the "template object".
There is no need to define a special variant of walk/1. Simply define a function that takes as input an arbitrary object, and that produces the desired reordering.
(If your jq comes with the version of walk/1 that uses keys, then you should consider updating your jq, or redefine walk/1 to use keys_unsorted.)
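As a rough end-to-end sketch of the recursive variant, the command below passes the template object in with --argjson and applies rekey at every level. The sample input and the choice of "id" as the pinned key are made up for illustration, and the sketch assumes a jq that ships walk/1 as a builtin (1.6 or later).

```shell
# Hypothetical sample input; only "id" is pinned to the front of each object.
input='{"a":[{"name":"x","id":3}],"id":0}'

result=$(printf '%s' "$input" | jq -c --argjson template '{"id":null}' '
  # rekey/1 as defined above: template keys first, but only if present
  def rekey(obj):
    . as $in
    | reduce (obj|keys_unsorted)[] as $k ({};
        if $in|has($k) then . + {($k): $in[$k]} else . end)
    | . + $in ;
  walk(if type == "object" then rekey($template) else . end)
')

echo "$result"
```

With this input, "id" should end up first in both the outer object and the object nested inside the array, while the remaining keys keep their relative order.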

Here is a filter (derived from walk/1) which will reorder the keys of subobjects using a function:
def reorder(order):
. as $in
| if type == "object" then
reduce (keys_unsorted|order)[] as $key(
{}; . + { ($key): ($in[$key] | reorder(order)) }
)
elif type == "array" then map( reorder(order) )
else .
end;
The order function is expected to return the set of keys in whatever order is desired. For example, the following function moves "shouldbeFirstKey" to the first position:
def neworder:
"shouldbeFirstKey" as $s
| index($s) as $i
| if $i==null then . else [$s] + .[:$i] + .[$i+1:] end
;
so that
{key1:"value", shouldbeFirstKey:"value2", other:{key3:"value3", shouldbeFirstKey:"value4"}}
| reorder(neworder)
produces the output
{
"shouldbeFirstKey": "value2",
"key1": "value",
"other": {
"shouldbeFirstKey": "value4",
"key3": "value3"
}
}

idea from #jq170727 's answer:
jq '
def reorder(a; n):
def order(a; n):
a as $a |
n as $n |
index($a) as $i |
if $i == null then .
else
if $i<$n then .[:$i] + .[$i+1:$n+1] + [$a] + .[$n+1:]
elif $n<$i then .[:$n] + [$a] + .[$n:$i] + .[$i+1:]
else .
end
end;
. as $in
| if type == "object" then
reduce (keys_unsorted|order(a;n))[] as $key(
{}; . + { ($key): ($in[$key] | reorder(a; n)) }
)
elif type == "array" then map( reorder(a; n) )
else .
end;
[.[] | reorder("id"; 0) | reorder("size"; 3) | reorder("name"; 1)]' data.json

Related

Transformation working on small files fails when used with the "--stream" option (required due to file size)

JQ play snippet: https://jqplay.org/s/D5-FZl8wOs
I'm using jq to flatten a json array to be used for sql.
json:
{
"0123":[
{"i":0,"p":"file 1","l":100},
{"i":1,"p":"file 2","l":200}
],
"0234":[
{"i":0,"p":"file 1","l":100},
{"i":1,"p":"file 2","l":200}
]
}
jq:
jq -r 'to_entries[] | {hash: .key, val: .value[]} | [.hash, .val.i, .val.p, .val.l]'
Desired output:
[
"0123",
0,
"file 1",
100
]
[
"0123",
1,
"file 2",
200
]
[
"0234",
0,
"file 1",
100
]
[
"0234",
1,
"file 2",
200
]
The above worked only while the file was small; now that it has grown larger, I get memory errors or the OS kills the process.
If I pass the --stream parameter, I get the error:
jq: error (at <stdin>:9): Cannot index array with string "i"
How can I solve this?
Something like the following will work for your sample input:
foreach inputs as $pv ([[],[]]; # [A, B]
if ($pv|length) == 2 # if pv is a path-value pair
then .[0] |= if . == [] # if A is empty
then . + [$pv[0][0],$pv[1]] # add first key from path definition and the value located at path to A
else . + [$pv[1]] end # add value to A
else [[],.[0]] end; # move A to B's place, leave A empty
if .[0] == [] and .[1] != [] # if A is empty but B is not
then .[1] else empty end # print B
)
invocation:
jq --stream -n 'foreach inputs as $pv ([[],[]]; if ($pv|length) == 2 then (.[0] |= if . == [] then . + [$pv[0][0],$pv[1]] else . + [$pv[1]] end) else [[],.[0]] end; if .[0] == [] and .[1] != [] then .[1] else empty end)' file
jqplay: https://jqplay.org/s/Q81EZahkjG
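To see what the foreach above is consuming, it can help to print the events that --stream produces for a tiny input. The input below is a cut-down, hypothetical version of the question's data. Each leaf arrives as a [path, value] pair, and each closing object or array arrives as a one-element [path] event, which is why a filter written for the whole document no longer applies directly.

```shell
# Tiny hypothetical input with the same shape as the question's data.
events=$(printf '%s' '{"0123":[{"i":0,"p":"file 1"}]}' | jq -c --stream '.')

# One event per line: two leaf events, then three closing events
# (inner object, array, top-level object).
echo "$events"
```

The length-2 versus length-1 test in the foreach is what distinguishes the leaf events from the closing events.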
I need a way to get to_entries[] working with streaming
Here's a def that does just that:
def atomize(s):
fromstream(foreach s as $in ( {previous:null, emit: null};
if ($in | length == 2) and ($in|.[0][0]) != .previous and .previous != null
then {emit: [[.previous]], previous: $in|.[0][0]}
else { previous: ($in|.[0][0]), emit: null}
end;
(.emit // empty), $in) ) ;
With this def, you can use your filter by prepending atomize(inputs) assuming you invoke jq with both the -n and --stream options. That is, your main filter would be:
atomize(inputs)
| to_entries[]
| {hash: .key, val: .value[]}
| [.hash, .val.i, .val.p, .val.l]
Alternative
If the JSON is completely regular, as in the example, you could alternatively write:
atomize(inputs)
| to_entries[]
| .value[] as $value
| [.key, $value[]]

Convert even odd index in array to key value pairs in json using jq

I'm trying to use jq to parse Solr 6.5 metrics into key value pairs:
{
"responseHeader": {
"status": 0,
"QTime": 7962
},
"metrics": [
"solr.core.shard1",
"QUERY./select",
"solr.core.shard2",
"QUERY./update"
...
]
}
I'd like to pair up the even- and odd-indexed entries of the metrics array and put them together into a single object as key-value pairs, like this:
{
"solr.core.shard1": "QUERY./select",
"solr.core.shard2": "QUERY./update",
...
}
So far, I have only been able to come up with:
.metrics | to_entries | .[] | {(select(.key % 2 == 0).value): select(.key % 2 == 1).value}
But this returns an error or no results.
I'd be grateful if someone could point me in the right direction. I feel like the answer is probably in the map operator, but I haven't been able to figure it out.
jq solution:
jq '[ .metrics as $m | range(0; $m | length; 2)
| {($m[.]): $m[(. + 1)]} ] | add' jsonfile
The output:
{
"solr.core.shard1": "QUERY./select",
"solr.core.shard2": "QUERY./update"
}
https://stedolan.github.io/jq/manual/v1.5/#range(upto),range(from;upto)range(from;upto;by)
Here's a helper function which makes the solution trivial:
# Emit a stream consisting of pairs of items taken from `stream`
def pairwise(stream):
foreach stream as $i ([];
if length == 1 then . + [$i] else [$i] end;
select(length == 2));
From here there are several good options, e.g. we could start with:
.metrics
| [pairwise(.[]) | {(.[0]): .[1]}]
| add
With your input, this produces:
{
"solr.core.shard1": "QUERY./select",
"solr.core.shard2": "QUERY./update"
}
So you might want to write:
.metrics |= ([pairwise(.[]) | {(.[0]): .[1]}] | add)
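For completeness, here is a minimal runnable sketch of that last filter as a full command. The input is a shortened, hypothetical version of the Solr response from the question (the elided entries are left out), and foreach requires jq 1.5 or later.

```shell
input='{"responseHeader":{"status":0},"metrics":["solr.core.shard1","QUERY./select","solr.core.shard2","QUERY./update"]}'

result=$(printf '%s' "$input" | jq -c '
  # pairwise/1 as defined above
  def pairwise(stream):
    foreach stream as $i ([];
      if length == 1 then . + [$i] else [$i] end;
      select(length == 2));
  # replace the flat array with an object built from consecutive pairs
  .metrics |= ([pairwise(.[]) | {(.[0]): .[1]}] | add)
')

echo "$result"
```

Using |= keeps the rest of the response (here responseHeader) intact and only rewrites .metrics.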

Remove all null values

I am trying to remove null values from a json object using jq. I found this issue on their github and so now I'm trying to remove them with del.
I have this:
'{ id: $customerId, name, phones: ([{ original: .phone },
{ original: .otherPhone}]), email} | del(. | nulls)'
This doesn't seem to do anything. However if I replace nulls with .phones it does remove the phone numbers.
The following illustrates how to remove all the null-valued keys from a JSON object:
jq -n '{"a":1, "b": null, "c": null} | with_entries( select( .value != null ) )'
{
"a": 1
}
Alternatively, paths/0 can be used as follows:
. as $o | [paths[] | {(.) : ($o[.])} ] | add
By the way, del/1 can also be used to achieve the same result, e.g. using this filter:
reduce keys[] as $k (.; if .[$k] == null then del(.[$k]) else . end)
Or less obviously, but more succinctly:
del( .[ (keys - [paths[]])[] ] )
And for the record, here are two ways to use delpaths/1:
jq -n '{"a":1, "b": null, "c": null, "d":2} as $o
| $o
| delpaths( [ keys[] | select( $o[.] == null ) ] | map( [.]) )'
$ jq -n '{"a":1, "b": null, "c": null, "d":2}
| [delpaths((keys - paths) | map([.])) ] | add'
In both these last two cases, the output is the same:
{
"a": 1,
"d": 2
}
For reference, if you wanted to remove null-valued keys from all JSON objects in a JSON text (i.e., recursively), you could use walk/1, or:
del(.. | objects | (to_entries[] | select(.value==null) | .key) as $k | .[$k])
This answer by Michael Homer on https://unix.stackexchange.com has a very concise solution which works since jq 1.6:
del(..|nulls)
It deletes all null-valued properties (and values) from your JSON. Simple and sweet :)
nulls is a builtin filter and can be replaced by custom selects:
del(..|select(. == "value to delete"))
To remove elements based on multiple conditions, e.g. remove all bools and all numbers:
del(..|booleans,numbers)
or, to only delete nodes not matching a condition:
del(..|select(. == "value to keep" | not))
(The last example is only illustrative – of course you could swap == for !=, but sometimes this is not possible. e.g. to keep all truthy values: del(..|select(.|not)))
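A small sketch of the first two forms on a made-up input, assuming jq 1.6 or later (where .. also visits null values):

```shell
# Hypothetical input with nulls at several depths, including inside an array.
input='{"a":1,"b":null,"c":{"d":null,"e":[1,null,2]}}'

# Remove every null, whether it is an object value or an array element.
no_nulls=$(printf '%s' "$input" | jq -c 'del(..|nulls)')

# Remove every boolean and number instead, leaving the nulls in place.
no_numbers=$(printf '%s' "$input" | jq -c 'del(..|booleans,numbers)')

echo "$no_nulls"
echo "$no_numbers"
```

Note that the array element is removed outright rather than replaced, so the array gets shorter.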
All the other answers to date here are workarounds for old versions of jq, and it isn't clear how to do this simply in the latest released version. In jq 1.6 or newer this will do the job to remove nulls recursively:
$ jq 'walk( if type == "object" then with_entries(select(.value != null)) else . end)' input.json
Sourced from this comment on the issue where adding the walk() function was discussed upstream.
[WARNING: the definition of walk/1 given in this response is problematic, not least for the reason given in the first comment; note also that jq 1.6 defines walk/1 differently.]
I am adding this answer to present an extended version of the script by #jeff-mercado. My version of the script treats the following as empty values:
null;
[] - empty arrays;
{} - empty objects.
The removal of empty arrays and objects was borrowed from https://stackoverflow.com/a/26196653/3627676.
def walk(f):
. as $in |
if type == "object" then
reduce keys[] as $key
( {}; . + { ($key): ( $in[$key] | walk(f) ) } ) | f
elif type == "array" then
select(length > 0) | map( walk(f) ) | f
else
f
end;
walk(
if type == "object" then
with_entries(select( .value != null and .value != {} and .value != [] ))
elif type == "array" then
map(select( . != null and . != {} and .!= [] ))
else
.
end
)
That's not what del/1 was meant to be used for. Given an object as input, if you wanted to remove the .phones property, you'd do:
del(.phones)
In other words, the parameter to del is the path to the property you wish to remove.
If you wanted to use del for this, you would have to figure out all the paths to null values and pass them in. That would be more of a hassle though.
Streaming the input in could make this task even simpler.
fromstream(tostream | select(length == 1 or .[1] != null))
Otherwise for a more straightforward approach, you'll have to walk through the object tree to find null values. If found, filter it out. Using walk/1, your filter could be applied recursively to exclude the null values.
walk(
(objects | with_entries(select(.value != null)))
// (arrays | map(select(. != null)))
// values
)
So if you had this for input:
{
"foo": null,
"bar": "bar",
"biz": [1,2,3,4,null],
"baz": {
"a": 1,
"b": null,
"c": ["a","b","c","null",32,null]
}
}
This filter would yield:
{
"bar": "bar",
"baz": {
"a": 1,
"c": ["a","b","c","null",32]
},
"biz": [1,2,3,4]
}
Elsewhere on this page, some interest has been expressed in
using jq to eliminate recursively occurrences of [] and {} as well as null.
Although it is possible to use the built-in definition of walk/1 to do
this, it is a bit tricky to do so correctly. Here therefore is a variant version
of walk/1 which makes it trivial to do so:
def traverse(f):
if type == "object" then map_values(traverse(f)) | f
elif type == "array" then map( traverse(f) ) | f
else f
end;
In order to make it easy to modify the criterion for removing elements,
we define:
def isempty: .==null or ((type|(.=="array" or .=="object")) and length==0);
The solution is now simply:
traverse(select(isempty|not))
With newer versions of jq (1.6 and later)
You can use this expression to remove null-valued keys recursively:
jq 'walk( if type == "object" then with_entries(select(.value != null)) else . end)'
[WARNING: the response below has several problems, not least those arising from the fact that 0|length is 0.]
Elaborating on an earlier answer, in addition to removing properties with null values, it's often helpful to remove JSON properties with empty array or object values (i.e., [] or {}), too.
Note: jq's walk() function (walk/1) makes this easy. walk() will be available in a future version of jq (> jq-1.5), but its definition can be added to current filters.
The condition to pass to walk() for removing nulls and empty structures is:
walk(
if type == "object" then with_entries(select(.value|length > 0))
elif type == "array" then map(select(length > 0))
else .
end
)
Given this JSON input:
{
"notNullA": "notNullA",
"nullA": null,
"objectA": {
"notNullB": "notNullB",
"nullB": null,
"objectB": {
"notNullC": "notNullC",
"nullC": null
},
"emptyObjectB": {},
"arrayB": [
"b"
],
"emptyArrayB": []
},
"emptyObjectA": {},
"arrayA": [
"a"
],
"emptyArrayA": []
}
Using this function gives the result:
{
"notNullA": "notNullA",
"objectA": {
"notNullB": "notNullB",
"objectB": {
"notNullC": "notNullC"
},
"arrayB": [
"b"
]
},
"arrayA": [
"a"
]
}

How to apply a function to all strings in record's structure recursively using jq

Is it possible to apply a recursive transformation to a record to return the same record, but having all string values mapped?
For example:
{"x":"1", "a": {"b": 2, "c": ["a"]}, "d": {"e": "z"}}
with a mapping of "add prime" applied:
{"x":"1'", "a": {"b": 2, "c": ["a'"]}, "d": {"e": "z'"}}
I've tried using a combination of recurse, map, string and select with little luck. Any ideas?
You can also do this easily with the recursive descent operator:
jq '(.. | strings) += "'\''"'
Here .. generates a stream by recursively iterating through every element of the input, strings keeps only the elements that are strings, += appends the right-hand value to each element selected on the left, and "'" is a string literal containing the "prime" you seek. Because the jq program itself is wrapped in single quotes, the shell needs the apostrophe written as '\''.
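One way to sidestep the shell quoting altogether is to pass the suffix in as a jq variable. The sketch below does that with --arg; the variable name suffix is just an illustrative choice, and the input is the record from the question.

```shell
input='{"x":"1","a":{"b":2,"c":["a"]},"d":{"e":"z"}}'

# --arg binds the shell string to $suffix, so the jq program itself
# contains no apostrophe and can stay inside plain single quotes.
result=$(printf '%s' "$input" | jq -c --arg suffix "'" '(.. | strings) += $suffix')

echo "$result"
```

Only the string values change; the number under "b" and the overall structure are left as they were.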
Yes, use walk/1. It is explained in the jq manual.
If your jq does not have it defined, here is its definition from builtin.jq:
# Apply f to composite entities recursively, and to atoms
def walk(f):
. as $in
| if type == "object" then
reduce keys[] as $key
( {}; . + { ($key): ($in[$key] | walk(f)) } ) | f
elif type == "array" then map( walk(f) ) | f
else f
end;
Here is a solution which uses paths/1 to identify string values and updates them with reduce, setpath and getpath:
reduce paths(type == "string") as $p (
.
; setpath($p; getpath($p) + "'")
)

Transforming the name of key deeper in the JSON structure with jq

I have the following JSON:
{
"vertices": [
{
"__cp": "foo",
"__type": "metric",
"__eid": "foobar",
"name": "Undertow Metrics~Sessions Created",
"_id": 45056,
"_type": "vertex"
},
...
]
"edges": [
...
and I would like to achieve this format:
{
"nodes": [
{
"cp": "foo",
"type": "metric",
"label": "metric: Undertow Metrics~Sessions Created",
"name": "Undertow Metrics~Sessions Created",
"id": 45056
},
...
]
"edges": [
...
So far I was able to create this expression:
jq '{nodes: .vertices} | del(.nodes[]."_type", .nodes[]."__eid")'
That is, it renames 'vertices' to 'nodes' and removes '_type' and '__eid'. How can I rename a key nested deeper in the JSON?
You can change the names of properties of objects if you use with_entries(filter). This converts an object to an array of key/value pairs and applies a filter to the pairs and converts back to an object. So you would just want to update the key of those objects to your new names.
Depending on which version of jq you're using, the next part can be tricky. String replacement doesn't get introduced until jq 1.5. If that was available, you could then do this:
{
nodes: .vertices | map(with_entries(
.key |= sub("^_+"; "")
)),
edges
}
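As a quick check of the with_entries approach, here is a sketch run against a trimmed-down, hypothetical vertex. It assumes jq 1.5 or later built with regex support, since sub/2 depends on it.

```shell
# Hypothetical, shortened input: one vertex with keys that have
# one or two leading underscores, plus an empty edges array.
input='{"vertices":[{"__cp":"foo","_id":45056,"name":"n"}],"edges":[]}'

result=$(printf '%s' "$input" | jq -c '
  {
    # strip any run of leading underscores from every key of every vertex
    nodes: (.vertices | map(with_entries(.key |= sub("^_+"; "")))),
    edges
  }
')

echo "$result"
```

The values are untouched; only the key names change, and "name" passes through as-is because it has no leading underscore.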
Otherwise if you're using jq 1.4, then you'll have to remove them manually. A recursive function can help with that since the number of underscores varies.
def ltrimall(str): str as $str |
if startswith($str)
then ltrimstr($str) | ltrimall(str)
else .
end;
{
nodes: .vertices | map(with_entries(
.key |= ltrimall("_")
)),
edges
}
The following program works with jq 1.4 or jq 1.5.
It uses walk/1 to remove leading underscores from any key, no matter where it occurs in the input JSON.
The version of ltrim provided here uses recurse/1 for efficiency and portability, but any suitable substitute may be used.
def ltrim(c):
reduce recurse( if .[0:1] == c then .[1:] else null end) as $x
(null; $x);
# Apply f to composite entities recursively, and to atoms
def walk(f):
. as $in
| if type == "object" then
reduce keys[] as $key
( {}; . + { ($key): ($in[$key] | walk(f)) } ) | f
elif type == "array" then map( walk(f) ) | f
else f
end;
.nodes = .vertices
| del(.vertices)
| (.nodes |= walk(
if type == "object"
then with_entries( .key |= ltrim("_") )
else .
end ))
From your example data it looks like you intend lots of little manipulations so I'd break things out into stages like this:
.nodes = .vertices # \ first take care of renaming
| del(.vertices) # / .vertices to .nodes
| .nodes = [
.nodes[] # then scan each node
| del(._type, .__eid) # \ whatever key-specific tweaks like
| .label = "metric: \(.name)" # / calculating .label you want can go here
| . as $n # snapshot taken after the tweaks, so $n includes .label
| reduce keys[] as $k ( # final reduce to handle renaming
{}; # of any keys that start with _
.[$k | sub("^_+";"")] = $n[$k]
)
]