Remove all null values - json

I am trying to remove null values from a JSON object using jq. I found this issue on their GitHub, so now I'm trying to remove them with del.
I have this:
'{ id: $customerId, name, phones: ([{ original: .phone },
{ original: .otherPhone}]), email} | del(. | nulls)'
This doesn't seem to do anything. However if I replace nulls with .phones it does remove the phone numbers.

The following illustrates how to remove all the null-valued keys from a JSON object:
jq -n '{"a":1, "b": null, "c": null} | with_entries( select( .value != null ) )'
{
"a": 1
}
Alternatively, paths/0 can be used as follows:
. as $o | [paths[] | {(.) : ($o[.])} ] | add
By the way, del/1 can also be used to achieve the same result, e.g. using this filter:
reduce keys[] as $k (.; if .[$k] == null then del(.[$k]) else . end)
Or less obviously, but more succinctly:
del( .[ (keys - [paths[]])[] ] )
And for the record, here are two ways to use delpaths/1:
jq -n '{"a":1, "b": null, "c": null, "d":2} as $o
| $o
| delpaths( [ keys[] | select( $o[.] == null ) ] | map( [.]) )'
$ jq -n '{"a":1, "b": null, "c": null, "d":2}
| [delpaths((keys - paths) | map([.])) ] | add'
In both these last two cases, the output is the same:
{
"a": 1,
"d": 2
}
For reference, if you wanted to remove null-valued keys from all JSON objects in a JSON text (i.e., recursively), you could use walk/1, or:
del(.. | objects | (to_entries[] | select(.value==null) | .key) as $k | .[$k])

This answer by Michael Homer on https://unix.stackexchange.com has a super concise solution which works since jq 1.6:
del(..|nulls)
It deletes all null-valued properties (and values) from your JSON. Simple and sweet :)
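For readers who want a reference to compare the filter's result against outside jq, here is a minimal Python sketch of the same recursive pruning. The name strip_nulls is made up for this illustration, and the sketch assumes the document has already been parsed with json.loads; it says nothing about how jq implements del.

```python
import json

def strip_nulls(node):
    """Return a copy of node with every None removed from dicts and lists."""
    if isinstance(node, dict):
        return {k: strip_nulls(v) for k, v in node.items() if v is not None}
    if isinstance(node, list):
        return [strip_nulls(v) for v in node if v is not None]
    return node  # scalars are returned unchanged

doc = json.loads('{"a": 1, "b": null, "c": [1, null, {"d": null, "e": 2}]}')
print(json.dumps(strip_nulls(doc)))  # prints {"a": 1, "c": [1, {"e": 2}]}
```

Empty containers left behind by the pruning are kept, which matches what the text above describes: only the nulls themselves go away.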
nulls is a builtin filter and can be replaced by custom selects:
del(..|select(. == "value to delete"))
To remove elements based on multiple conditions, e.g. remove all bools and all numbers:
del(..|booleans,numbers)
or, to only delete nodes not matching a condition:
del(..|select(. == "value to keep" | not))
(The last example is only illustrative – of course you could swap == for !=, but sometimes this is not possible. e.g. to keep all truthy values: del(..|select(.|not)))

All the other answers to date here are workarounds for old versions of jq, and it isn't clear how to do this simply in the latest released version. In jq 1.6 or newer this will do the job to remove nulls recursively:
$ jq 'walk( if type == "object" then with_entries(select(.value != null)) else . end)' input.json
Sourced from this comment on the issue where adding the walk() function was discussed upstream.

[WARNING: the definition of walk/1 given in this response is problematic, not least for the reason given in the first comment; note also that jq 1.6 defines walk/1 differently.]
I am adding this answer to emphasize the extended version of the script by jeff-mercado. My version of the script assumes the empty values are as follows:
null;
[] - empty arrays;
{} - empty objects.
The removal of empty arrays and objects was borrowed from here: https://stackoverflow.com/a/26196653/3627676.
def walk(f):
. as $in |
if type == "object" then
reduce keys[] as $key
( {}; . + { ($key): ( $in[$key] | walk(f) ) } ) | f
elif type == "array" then
select(length > 0) | map( walk(f) ) | f
else
f
end;
walk(
if type == "object" then
with_entries(select( .value != null and .value != {} and .value != [] ))
elif type == "array" then
map(select( . != null and . != {} and .!= [] ))
else
.
end
)

That's not what del/1 was meant to be used for. Given an object as input, if you wanted to remove the .phones property, you'd do:
del(.phones)
In other words, the parameter to del is the path to the property you wish to remove.
If you wanted to use del for this, you would have to figure out all the paths to null values and pass them in. That would be more of a hassle, though.
Streaming the input in could make this task even simpler.
fromstream(tostream | select(length == 1 or .[1] != null))
Otherwise for a more straightforward approach, you'll have to walk through the object tree to find null values. If found, filter it out. Using walk/1, your filter could be applied recursively to exclude the null values.
walk(
(objects | with_entries(select(.value != null)))
// (arrays | map(select(. != null)))
// values
)
So if you had this for input:
{
"foo": null,
"bar": "bar",
"biz": [1,2,3,4,null],
"baz": {
"a": 1,
"b": null,
"c": ["a","b","c","null",32,null]
}
}
This filter would yield:
{
"bar": "bar",
"baz": {
"a": 1,
"c": ["a","b","c","null",32]
},
"biz": [1,2,3,4]
}

Elsewhere on this page, some interest has been expressed in
using jq to eliminate recursively occurrences of [] and {} as well as null.
Although it is possible to use the built-in definition of walk/1 to do
this, it is a bit tricky to do so correctly. Here therefore is a variant version
of walk/1 which makes it trivial to do so:
def traverse(f):
if type == "object" then map_values(traverse(f)) | f
elif type == "array" then map( traverse(f) ) | f
else f
end;
In order to make it easy to modify the criterion for removing elements,
we define:
def isempty: .==null or ((type|(.=="array" or .=="object")) and length==0);
The solution is now simply:
traverse(select(isempty|not))

With newer versions of jq (1.6 and later)
You can use this expression to remove null-valued keys recursively:
jq 'walk( if type == "object" then with_entries(select(.value != null)) else . end)'

[WARNING: the response below has several problems, not least those arising from the fact that 0|length is 0.]
Elaborating on an earlier answer, in addition to removing properties with null values, it's often helpful to remove JSON properties with empty array or object values (i.e., [] or {}), too.
Note: jq's walk() function (walk/1) makes this easy. walk() will be available in a future version of jq (> jq-1.5), but its definition can be added to current filters.
The condition to pass to walk() for removing nulls and empty structures is:
walk(
if type == "object" then with_entries(select(.value|length > 0))
elif type == "array" then map(select(length > 0))
else .
end
)
Given this JSON input:
{
"notNullA": "notNullA",
"nullA": null,
"objectA": {
"notNullB": "notNullB",
"nullB": null,
"objectB": {
"notNullC": "notNullC",
"nullC": null
},
"emptyObjectB": {},
"arrayB": [
"b"
],
"emptyBrrayB": []
},
"emptyObjectA": {},
"arrayA": [
"a"
],
"emptyArrayA": []
}
Using this function gives the result:
{
"notNullA": "notNullA",
"objectA": {
"notNullB": "notNullB",
"objectB": {
"notNullC": "notNullC"
},
"arrayB": [
"b"
]
},
"arrayA": [
"a"
]
}

Related

JQ sorting key by arbitrary order

I am trying to sort keys in an arbitrary order (the key 'event' should come first).
I know this sorts the keys alphabetically:
jq -S '.' file.json
but is there a function to sort keys so that the first one is always the same, rather than alphabetical?
The goal is to make the objects more human-readable, with the most significant key first.
Currently have:
{key1:value, shouldbeFirstKey:value2, ...}
Would like
{shouldbeFirstKey:value2, key1:value, ...}
Suppose we have an object with a bunch of keys, and want some of those keys to appear in a certain order, while leaving the others as-is. Then the technique illustrated by the following example can be used:
$ jq -n '{a:2,b:3,first:0,second:1} | . as $in | {first,second} + $in'
The result:
{
"first": 0,
"second": 1,
"a": 2,
"b": 3
}
rekey
Let's call the object that defines the key ordering the "template object" ({first,second} above). Notice that with the technique described above, the keys in the "template object" always appear in the result. If we want the template object's keys to appear in the result only when they appear in the input, we can modify the above approach using the following function:
def rekey(obj):
. as $in
| reduce (obj|keys_unsorted)[] as $k ({};
if $in|has($k) then . + {($k): $in[$k]} else . end)
| . + $in ;
For example:
{a:2,b:3,first:0,second:1} | rekey({first,second,third})
produces:
{
"first": 0,
"second": 1,
"a": 2,
"b": 3
}
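The same "template keys first, everything else after" idea can be sketched in Python, whose dicts preserve insertion order (Python 3.7+). This is only an illustration under that assumption; the function below borrows the name rekey but takes a plain list of keys rather than a template object.

```python
def rekey(obj, template_keys):
    """Put the template keys that exist in obj first, then the remaining keys."""
    front = {k: obj[k] for k in template_keys if k in obj}
    # Merging keeps the position of keys already in `front`,
    # so the remaining keys follow in their original order.
    return {**front, **obj}

result = rekey({"a": 2, "b": 3, "first": 0, "second": 1}, ["first", "second", "third"])
print(list(result))  # prints ['first', 'second', 'a', 'b']
```

As in the jq version, "third" is absent from the input and therefore absent from the result.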
With walk/1
If one wants to reorder keys recursively, one can use walk/1 (defined as at https://github.com/stedolan/jq/blob/master/src/builtin.jq),
as illustrated here using the above definition of rekey:
walk(if type == "object" then rekey($template) else . end)
where $template represents the "template object".
There is no need to define a special variant of walk/1. Simply define a function that takes as input an arbitrary object, and that produces the desired reordering.
(If your jq comes with the version of walk/1 that uses keys, then you should consider updating your jq, or redefine walk/1 to use keys_unsorted.)
Here is a filter (derived from walk/1) which will reorder the keys of subobjects using a function:
def reorder(order):
. as $in
| if type == "object" then
reduce (keys_unsorted|order)[] as $key(
{}; . + { ($key): ($in[$key] | reorder(order)) }
)
elif type == "array" then map( reorder(order) )
else .
end;
The order function is expected to return the set of keys in whatever order is desired. For example, the following function moves "shouldbeFirstKey" to the first position:
def neworder:
"shouldbeFirstKey" as $s
| index($s) as $i
| if $i==null then . else [$s] + .[:$i] + .[$i+1:] end
;
so that
{key1:"value", shouldbeFirstKey:"value2", other:{key3:"value3", shouldbeFirstKey:"value4"}}
| reorder(neworder)
produces the output
{
"shouldbeFirstKey": "value2",
"key1": "value",
"other": {
"shouldbeFirstKey": "value4",
"key3": "value3"
}
}
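For comparison, the recursive reordering can be sketched in Python as follows. The helper names reorder and move_first are hypothetical, and the sketch relies on Python dicts preserving insertion order; the order callback receives the list of keys and returns them in the desired order, much like the order argument above.

```python
import json

def move_first(first):
    """Build an ordering function that moves `first` to the front when present."""
    def order(keys):
        if first not in keys:
            return list(keys)
        return [first] + [k for k in keys if k != first]
    return order

def reorder(node, order):
    """Rebuild every object with its keys in the order chosen by `order`."""
    if isinstance(node, dict):
        return {k: reorder(node[k], order) for k in order(list(node))}
    if isinstance(node, list):
        return [reorder(v, order) for v in node]
    return node

doc = {"key1": "value", "shouldbeFirstKey": "value2",
       "other": {"key3": "value3", "shouldbeFirstKey": "value4"}}
result = reorder(doc, move_first("shouldbeFirstKey"))
print(json.dumps(result, indent=2))
```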
An idea based on jq170727's answer:
jq '
def reorder(a; n):
def order(a; n):
a as $a |
n as $n |
index($a) as $i |
if $i == null then .
else
if $i<$n then .[:$i] + .[$i+1:$n+1] + [$a] + .[$n+1:]
elif $n<$i then .[:$n] + [$a] + .[$n:]
else .
end
end;
. as $in
| if type == "object" then
reduce (keys_unsorted|order(a;n))[] as $key(
{}; . + { ($key): ($in[$key] | reorder(a; n)) }
)
elif type == "array" then map( reorder(a; n) )
else .
end;
[.[] | reorder("id"; 0)| reorder("size"; 3) | reorder("name"; 1)]'
data.json

How to apply a function to all strings in record's structure recursively using jq

Is it possible to apply a recursive transformation to a record to return the same record, but having all string values mapped?
For example:
{"x":"1", "a": {"b": 2, "c": ["a"]}, "d": {"e": "z"}}
with a mapping of "add prime" applied:
{"x":"1'", "a": {"b": 2, "c": ["a'"]}, "d": {"e": "z'"}}
I've tried using a combination of recurse, map, string and select with little luck. Any ideas?
You can also do this easily with the recurse operator:
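jq '(.. | strings) += "'\''"'
Where .. generates a stream by recursively iterating through every element of the input, strings keeps only the elements of that stream that are strings, += appends the right-hand value to every element of the left-hand stream, and "'" is a string literal containing the "prime" you seek. (The '\'' sequence is only shell quoting: it closes the single-quoted program, adds a literal single quote, and reopens the quotes.)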
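To make the intended result concrete, here is a small Python sketch of "apply a function to every string value, leave everything else alone". The name map_strings is invented for this example; like the jq filter, it touches string values only and leaves object keys unchanged.

```python
def map_strings(node, fn):
    """Apply fn to every string value in a nested structure of dicts and lists."""
    if isinstance(node, str):
        return fn(node)
    if isinstance(node, dict):
        return {k: map_strings(v, fn) for k, v in node.items()}
    if isinstance(node, list):
        return [map_strings(v, fn) for v in node]
    return node  # numbers, booleans and None pass through untouched

record = {"x": "1", "a": {"b": 2, "c": ["a"]}, "d": {"e": "z"}}
primed = map_strings(record, lambda s: s + "'")
print(primed)
```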
Yes, use walk/1. It is explained in the jq manual.
If your jq does not have it defined, here is its definition from builtin.jq:
# Apply f to composite entities recursively, and to atoms
def walk(f):
. as $in
| if type == "object" then
reduce keys[] as $key
( {}; . + { ($key): ($in[$key] | walk(f)) } ) | f
elif type == "array" then map( walk(f) ) | f
else f
end;
Here is a solution which uses paths/1 to identify string values and updates them with reduce, setpath and getpath
reduce paths(type == "string") as $p (
.
; setpath($p; getpath($p) + "'")
)

Transforming the name of key deeper in the JSON structure with jq

I have the following JSON:
{
"vertices": [
{
"__cp": "foo",
"__type": "metric",
"__eid": "foobar",
"name": "Undertow Metrics~Sessions Created",
"_id": 45056,
"_type": "vertex"
},
...
]
"edges": [
...
and I would like to achieve this format:
{
"nodes": [
{
"cp": "foo",
"type": "metric",
"label": "metric: Undertow Metrics~Sessions Created",
"name": "Undertow Metrics~Sessions Created",
"id": 45056
},
...
]
"edges": [
...
So far I was able to create this expression:
jq '{nodes: .vertices} | del(.nodes[]."_type", .nodes[]."__eid")'
I.e. it renames 'vertices' to 'nodes' and removes '_type' and '__eid'. How can I rename a key nested deeper in the JSON?
You can change the names of properties of objects if you use with_entries(filter). This converts an object to an array of key/value pairs and applies a filter to the pairs and converts back to an object. So you would just want to update the key of those objects to your new names.
Depending on which version of jq you're using, the next part can be tricky. String replacement doesn't get introduced until jq 1.5. If that was available, you could then do this:
{
nodes: .vertices | map(with_entries(
.key |= sub("^_+"; "")
)),
edges
}
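As a cross-check of what the key renaming should produce, here is a minimal Python sketch of the same transformation. The function names are invented for this example, and the sample document is a trimmed-down version of the one in the question.

```python
import re

def strip_leading_underscores(obj):
    """Rename every key of one object by removing its leading underscores."""
    return {re.sub(r"^_+", "", k): v for k, v in obj.items()}

def to_nodes(doc):
    """Rename .vertices to .nodes and clean the keys of each vertex."""
    return {
        "nodes": [strip_leading_underscores(v) for v in doc["vertices"]],
        "edges": doc["edges"],
    }

doc = {"vertices": [{"__cp": "foo", "__type": "metric", "name": "n", "_id": 45056}],
       "edges": []}
print(to_nodes(doc))
```

In this sketch, two keys that collapse to the same name (such as __type and _type in the sample data) collide and the later one wins, so it is safer to delete the unwanted key before renaming.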
Otherwise if you're using jq 1.4, then you'll have to remove them manually. A recursive function can help with that since the number of underscores varies.
def ltrimall(str): str as $str |
if startswith($str)
then ltrimstr($str) | ltrimall(str)
else .
end;
{
nodes: .vertices | map(with_entries(
.key |= ltrimall("_")
)),
edges
}
The following program works with jq 1.4 or jq 1.5.
It uses walk/1 to remove leading underscores from any key, no matter where it occurs in the input JSON.
The version of ltrim provided here uses recurse/1 for efficiency and portability, but any suitable substitute may be used.
def ltrim(c):
reduce recurse( if .[0:1] == c then .[1:] else null end) as $x
(null; $x);
# Apply f to composite entities recursively, and to atoms
def walk(f):
. as $in
| if type == "object" then
reduce keys[] as $key
( {}; . + { ($key): ($in[$key] | walk(f)) } ) | f
elif type == "array" then map( walk(f) ) | f
else f
end;
.nodes = .vertices
| del(.vertices)
| (.nodes |= walk(
if type == "object"
then with_entries( .key |= ltrim("_") )
else .
end ))
From your example data it looks like you intend lots of little manipulations so I'd break things out into stages like this:
.nodes = .vertices                     # \ first take care of renaming
| del(.vertices)                       # / .vertices to .nodes
| .nodes = [
    .nodes[]                           #   then scan each node
  | del(._type, .__eid)                # \ whatever key-specific tweaks like
  | .label = "metric: \(.name)"        # / calculating .label you want can go here
  | . as $n                            #   capture the tweaked node, including .label
  | reduce keys[] as $k (              # \
      {};                              # | final reduce to handle renaming
      .[$k | sub("^_+";"")] = $n[$k]   # | any keys that start with _
    )                                  # /
  ]

Parse json and extract items in array upon condition of nested values

I've already asked a similar question here - Parse json and choose items\keys upon condition, but this time it's slightly different.
This is the Example:
[
{
"item1": "value123",
"array0": [
{
"item2": "aaa"
}
]
},
{
"item1": "value456",
"array1": [
{
"item2": "bbb"
}
]
},
{
"item1": "value789",
"array2": [
{
"item2": "ccc"
}
]
}
]
I'd like to get the value of "item1", only when "item2" has a specific value.
Let's say if item2 equals "bbb", then all I want to get back is "value456".
I've tried to solve it with jq, as that worked for me in the question mentioned above, but to no avail: I can't extract values from a "higher" level than the one I'm searching in with jq's select and map.
An easier solution is available thanks to the magic powers of the recurse operator, ..:
jq -r '.[] | select(.. | .item2? == "bbb").item1'
Basically what this does is, for each (.[]) object in the original array, pick (select) only those in which any of the keys recursively (..) named .item2 equals "bbb", and then select the .item1 property of said object.
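The logic of that one-liner (keep a top-level element if any item2 nested anywhere inside it equals the wanted value, then read its item1) can be sketched in Python as below. Both function names are hypothetical. Unlike the jq filter, which can emit an element once per match, this sketch returns each element at most once.

```python
def values_for_key(node, key):
    """Yield every value stored under `key` at any depth of dicts and lists."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            yield from values_for_key(v, key)
    elif isinstance(node, list):
        for v in node:
            yield from values_for_key(v, key)

def item1_where_item2(items, wanted):
    """Return item1 of each element containing a nested item2 equal to `wanted`."""
    return [it["item1"] for it in items if wanted in values_for_key(it, "item2")]

data = [
    {"item1": "value123", "array0": [{"item2": "aaa"}]},
    {"item1": "value456", "array1": [{"item2": "bbb"}]},
    {"item1": "value789", "array2": [{"item2": "ccc"}]},
]
print(item1_where_item2(data, "bbb"))  # prints ['value456']
```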
I found a solution in jq's manual (see the if-then-else and Objects sections) - http://stedolan.github.io/jq/manual. It assumes every element keeps its nested list under the same key, array, unlike array0, array1 and array2 in the example above:
jq -r '.[] | {name1: .item1, name2: .array[].item2} |
if .name2 == "bbb" then .name1 else empty end'
If you find the magic of .. too powerful for your purposes, the following may be useful. It restricts the search for the "item2" key.
def some(condition): reduce .[] as $x
(false; if . then . else $x|condition end);
.[]
| to_entries
| select( some( .value | (type == "array") and some(.item2 == "bbb")) )
| from_entries
| .item1
If your jq has any/1, then use it instead of some/1.
Here is a solution which uses tostream and getpath
foreach (tostream|select(length==2)) as [$p,$v] (.;.;
if $p[-1] == "item2" and $v == "bbb" then getpath($p[:-3]).item1 else empty end
)
EDIT: I now realize a filter of the form foreach E as $X (.; .; R) can almost always be rewritten as E as $X | R so the above is really just
(tostream | select(length==2)) as [$p,$v]
| if $p[-1] == "item2" and $v == "bbb" then getpath($p[:-3]).item1 else empty end

jq: selecting a subset of keys from an object

Given an input json string of keys from an array, return an object with only the entries that had keys in the original object and in the input array.
I have a solution but I think that it isn't elegant ({($k):$input[$k]} feels especially clunky...) and that this is a chance for me to learn.
jq -n '{"1":"a","2":"b","3":"c"}' \
| jq --arg keys '["1","3","4"]' \
'. as $input
| ( $keys | fromjson )
| map( . as $k
| $input
| select(has($k))
| {($k):$input[$k]}
)
| add'
Any ideas how to clean this up?
I feel like Extracting selected properties from a nested JSON object with jq is a good starting place, but I cannot get it to work.
A solution with an inside check:
jq 'with_entries(select([.key] | inside(["key1", "key2"])))'
inside works most of the time, but it has a pitfall: it compares strings by substring containment, so it can select keys you did not ask for. Suppose the input is { "key1": val1, "key2": val2, "key12": val12 }; selecting with inside(["key12"]) returns both "key1" and "key12".
Use the in operator if you need an exact match. The following selects only .key2 and .key12:
jq 'with_entries(select(.key | in({"key2":1, "key12":1})))'
Because in only checks whether a key exists in an object (or whether an index exists in an array), the wanted keys have to be written as the keys of an object; the values do not matter. That makes in a less than perfect fit for this purpose. I would like to see a reversed version of the JavaScript ES6 includes API implemented as a jq builtin:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/includes
jq 'with_entries(select(.key | included(["key2", "key12"])))'
Here included is the proposed builtin, not an existing one; it would check whether an item such as .key is included in an array.
You can use this filter:
with_entries(
select(
.key as $k | any($keys | fromjson[]; . == $k)
)
)
Here is some additional clarification
For the input object {"key1":1, "key2":2, "key3":3} I would like to drop all keys that are not in the set of desired keys ["key1","key3","key4"]
jq -n --argjson desired_keys '["key1","key3","key4"]' \
--argjson input '{"key1":1, "key2":2, "key3":3}' \
' $input
| with_entries(
select(
.key == ($desired_keys[])
)
)'
with_entries converts {"key1":1, "key2":2, "key3":3} into the following array of key-value pairs, maps the select statement over that array, and then turns the resulting array back into an object.
Here is the intermediate array inside the with_entries statement.
[
{
"key": "key1",
"value": 1
},
{
"key": "key2",
"value": 2
},
{
"key": "key3",
"value": 3
}
]
we can then select the keys from this array that meet our criteria.
This is where the magic happens. Here is a look at what's going on in the middle of this command. The following command takes the entries of the expanded array as a stream of objects that we can select from.
jq -cn '{"key":"key1","value":1}, {"key":"key2","value":2}, {"key":"key3","value":3}
| select(.key == ("key1", "key3", "key4"))'
This will yield the following result
{"key":"key1","value":1}
{"key":"key3","value":3}
The with_entries command can be a little tricky, but it's easy to remember that it takes a filter and is defined as follows
def with_entries(f): to_entries|map(f)|from_entries;
This is the same as
def with_entries(f): [to_entries[] | f] | from_entries;
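The three stages of that definition can be mimicked in Python to see what each one contributes. This is a sketch only: the Python with_entries and keep_keys below are made-up helpers, and f is modelled as a generator so that, like a jq filter, it can produce zero or more entries per input.

```python
def with_entries(obj, f):
    """Mimic to_entries | map(f) | from_entries for a single object."""
    entries = [{"key": k, "value": v} for k, v in obj.items()]  # to_entries
    mapped = [out for e in entries for out in f(e)]              # map(f)
    return {e["key"]: e["value"] for e in mapped}                # from_entries

def keep_keys(wanted):
    """Build a filter that passes an entry through only if its key is wanted."""
    def f(entry):
        if entry["key"] in wanted:
            yield entry
    return f

result = with_entries({"key1": 1, "key2": 2, "key3": 3},
                      keep_keys({"key1", "key3", "key4"}))
print(result)  # prints {'key1': 1, 'key3': 3}
```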
The other part of the question that confuses people is the multiple matches on the right-hand side of the ==
Consider the following command. The output is the Cartesian product of the left-hand values and the right-hand values: each input is compared with each value on the right.
jq -cn '1,2,3| . == (1,1,3)'
true
true
false
false
false
false
false
false
true
If that predicate is in a select statement, we keep the input when the predicate is true. Note you can duplicate the inputs here too.
jq -cn '1,2,3| select(. == (1,1,3))'
1
1
3
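The same pairing can be spelled out with nested loops in Python, which may make the order of the results easier to follow. This is just an illustration of the pairing; the variable names are arbitrary.

```python
inputs = [1, 2, 3]      # the stream on the left of the pipe
candidates = [1, 1, 3]  # the values produced on the right of ==

# One comparison per (input, candidate) pair, with the input varying slowest.
comparisons = [x == c for x in inputs for c in candidates]

# select() keeps the input once for every comparison that is true.
selected = [x for x in inputs for c in candidates if x == c]

print(comparisons)
print(selected)  # prints [1, 1, 3]
```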
Jeff's answer has a couple of unnecessary inefficiencies, both of which are addressed by the following, on the assumption that --argjson keys is used instead of --arg keys:
with_entries( select( .key as $k | $keys | index($k) ) )
Even better, if your jq has IN:
with_entries(select(.key | IN($keys[])))
If you are sure that all keys in the input array are present in the original object, you can use the object construction shortcut.
$ echo '{"1":"a","2":"b","3":"c"}' | jq '{"1", "3"}'
{
"1": "a",
"3": "c"
}
Numbers should be quoted to force jq to interpret them as keys instead of literals. In the case of keys not resembling a number, quotes are not needed:
$ echo '{"key1":"a","key2":"b","key3":"c"}' | jq '{key1, key3}'
{
"key1": "a",
"key3": "c"
}
Adding a non-existent key will yield a null value, which is unlikely to be what the OP wanted:
$ echo '{"1":"a","2":"b","3":"c"}' | jq '{"1", "3", "4"}'
{
"1": "a",
"3": "c",
"4": null
}
but those can be filtered out:
$ echo '{"1":"a","2":"b","3":"c"}' | jq '{"1", "3", "4"} | with_entries(select(.value != null))'
{
"1": "a",
"3": "c"
}
Although this answer doesn't take a JSON array of keys as input, as the OP asked, I find it useful for filtering keys you know are present.
An example use case: get aud and iss from a JWT. The following is very succinct:
echo "jwt-as-json" | jq '{aud, iss}'