Using jq, I need to delete a specific key:value pair if the value contains the string "null". The string could be " null" or "null ", so I need to use contains rather than an exact string match. If the value is not a string, it will be a number.
My sample json is as below: (The null value is expected only in the 'th' and 'sf' keys)
'dets':{
'S1':{
'type':'class',
'input': [12,7,6,19],
'config':{
'file':'sfile1',
'th': -10,
'sf': 'null'
}
},
'S2':{
'type':'class',
'input': [12,7,6,19],
'config':{
'file':'sfile2',
'th': -5,
'sf': 3
}
},
'S3':{
'type':'bottom',
'input': [12,7,16],
'config':{
'file':'sfile3',
'th': ' null',
'sf': 'null '
}
}
}
The required output should be like:
'dets':{
'S1':{
'type':'class',
'input': [12,7,6,19],
'config':{
'file':'sfile1',
'th': -10
}
},
'S2':{
'type':'class',
'input': [12,7,6,19],
'config':{
'file':'sfile2',
'th': -5,
'sf': 3
}
},
'S3':{
'type':'bottom',
'input': [12,7,16],
'config':{
'file':'sfile3'
}
}
}
I believe I need something along the lines of del(.[][].config.smoothing_span | select(contains("null"))), but I am running into a problem because the values have different types.
The given data is not valid JSON but it is valid HJSON, so the first step in the following jq-oriented solution is to use hjson to convert the data to JSON:
hjson -j input.hjson
The concept of what values should be regarded as "null" might change over time, so in the following let's define a filter that can be used to capture whichever definition is appropriate, e.g.
def isnull: . == null or (type=="string" and test("null"));
(Perhaps a better definition would use test("^ *null *$").)
If you want to delete all keys whose value isnull, you could use walk/1:
walk(if type=="object"
then with_entries(select(.value|isnull|not))
else . end)
(If your jq does not have walk, you could simply copy-and-paste its definition from https://github.com/stedolan/jq/blob/master/src/builtin.jq or elsewhere on the web.)
Assuming your jq has walk, we could therefore write:
hjson -j input.hjson |
jq 'def isnull: . == null or (type=="string" and test("null"));
walk(if type=="object"
then with_entries(select(.value|isnull|not))
else . end)'
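For readers who want to cross-check the logic outside jq, here is a minimal Python sketch of the same idea: recursively walk the parsed data and drop every key whose value is null-like. It is only an illustration, not part of the jq solution. The names is_null_like and drop_null_like are hypothetical, and the sketch assumes the stricter "^ *null *$" reading suggested above rather than a plain substring test.

```python
import re

# Assumption: "null-like" means JSON null, or a string that is exactly
# "null" with optional surrounding spaces (the stricter pattern).
NULL_PATTERN = re.compile(r" *null *")

def is_null_like(value):
    """Return True for None and for strings such as 'null', ' null', 'null '."""
    return value is None or (
        isinstance(value, str) and NULL_PATTERN.fullmatch(value) is not None
    )

def drop_null_like(node):
    """Recursively rebuild node, omitting object entries with null-like values."""
    if isinstance(node, dict):
        return {k: drop_null_like(v) for k, v in node.items() if not is_null_like(v)}
    if isinstance(node, list):
        # Array elements are kept as they are; only object entries are dropped.
        return [drop_null_like(item) for item in node]
    return node

# Trimmed-down version of the sample data from the question.
data = {"dets": {
    "S1": {"config": {"file": "sfile1", "th": -10, "sf": "null"}},
    "S3": {"config": {"file": "sfile3", "th": " null", "sf": "null "}},
}}
cleaned = drop_null_like(data)
```

Like the walk-based filter, this touches every object in the document, not just the config objects.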
If you want to "walk" the input but restrict attention to specific keys, you can easily modify the selection criterion inside with_entries(select(...)).
delkey
If you want the scope of the changes to be as narrow as possible, you could use map_values, e.g. in conjunction with a helper function for checking and possibly deleting a specific key:
# key is assumed to be a filter, e.g. .foo
def delkey(key): if key | isnull then del(key) else . end;
.dets |= map_values(.config |= (delkey(.th) | delkey(.sf)))
delkeys
If there are several specific keys to be checked, it might be more convenient to define a function for checking a list of keys:
# keys is assumed to be an array of strings
def delkeys(keys):
with_entries(select(.key as $k
| (keys|index($k)) and (.value|isnull) | not));
.dets |= map_values(.config |= delkeys(["th", "sf"]))
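The narrow-scope variant can be sketched in Python as well. This is an illustration under assumptions, not a translation of the jq code: the function names are hypothetical, "null-like" is taken to mean null or the string "null" with optional surrounding spaces, and every entry under dets is assumed to have a config object.

```python
def is_null_like(value):
    # Assumption: None, or "null" with optional surrounding spaces.
    return value is None or (isinstance(value, str) and value.strip(" ") == "null")

def delkeys(obj, keys):
    """Return a copy of obj without the listed keys whose values are null-like."""
    return {k: v for k, v in obj.items() if not (k in keys and is_null_like(v))}

def clean_dets(doc, keys=("th", "sf")):
    """Apply delkeys only to the "config" object of each entry under "dets"."""
    return {
        **doc,
        "dets": {
            name: {**entry, "config": delkeys(entry["config"], keys)}
            for name, entry in doc["dets"].items()
        },
    }

# "file" also holds "null" here, but it is not in the list of checked keys.
doc = {"dets": {"S1": {"type": "class",
                       "config": {"file": "null", "th": -10, "sf": "null"}}}}
result = clean_dets(doc)
```

Because only th and sf are checked, the file entry survives even though its value is "null", which is the point of keeping the scope narrow.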
I've got two different JSON structures to retrieve a specific object value from, basically something like this
{
"one": {
"foo": {
"bar": "baz"
}
}
}
and another like that
{
"two": {
"foo": {
"bar": "qux"
}
}
}
I'd like to return the bar value in both cases, plus a fallback value of error in case neither case 1 (baz) nor case 2 (qux) matches anything (i.e. the result is null).
Is there a simple way to do that with just jq 1.6?
Update:
Here are snippets of actual JSON files:
/* manifest.json, variant A */
{
"browser_specific_settings": {
"gecko": {
"id": "{95ad7b39-5d3e-1029-7285-9455bcf665c0}",
"strict_min_version": "68.0"
}
}
}
/* manifest.json, variant B */
{
"applications": {
"gecko": {
"id": "j30D-3YFPUvj9u9izFoPSjlNYZfF22xS#foobar",
"strict_min_version": "53.0"
}
}
}
I need the id values (*gecko.id so to speak) or error if there is none:
{95ad7b39-5d3e-1029-7285-9455bcf665c0}
j30D-3YFPUvj9u9izFoPSjlNYZfF22xS#foobar
error
You can use a filter like the one below, which works with both of the sample JSON documents provided:
jq '.. | if type == "object" and has("id") then .id else empty end'
See them live jqplay - VariantA and jqplay - VariantB
Note: This only gets the value of .id when it is present; see the other answers (e.g. oguz ismail's) for displaying error when the object does not contain the required field.
(.. | objects | select(has("id"))).id // "error"
This will work with multiple files and files containing multiple separate entities.
jqplay demo
You can use a combination of the ? "error suppression" and // "alternative" operators:
jq -n --slurpfile variantA yourFirstFile --slurpfile variantB yourSecondFile \
'(
($variantA[0].browser_specific_settings.gecko.id)?,
($variantB[0].applications.gecko.id)?
) // "error"'
This outputs the id from the first file and/or the id from the second file if either exists, without raising errors when they don't, and outputs error if neither can be found.
The command can be shortened as follows if it makes sense in your context :
jq -n --slurpfile variantA yourFirstFile --slurpfile variantB yourSecondFile \
'(($variantA[0].browser_specific_settings, $variantB[0].applications) | .gecko.id)? // "error"'
I think you are looking for hasOwnProperty()
for example:
var value;
if(applications.gecko.hasOwnProperty('id'))
value = applications.gecko.id;
else
value = 'error';
I need the id values (*gecko.id so to speak) or error if there is none:
In accordance with your notation "*gecko.id", here are two solutions, the first interpreting the "*" as a single unknown key (or index), the second interpreting it (more or less) as any number of keys or indices:
.[] | .gecko.id // "error"
.. | objects | select(has("gecko")) | (.gecko.id // "error")
If you don't really care about whether there's a "gecko" key, you might wish to consider:
first(.. | objects | select(has("id")).id ) // "error"
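As a cross-check of the "first id anywhere, else error" logic, here is a hedged Python sketch. The helper names iter_objects and first_id are hypothetical, and the traversal only approximates jq's `.. | objects`.

```python
def iter_objects(node):
    """Yield every dict found in node, depth-first (roughly `.. | objects`)."""
    if isinstance(node, dict):
        yield node
        for value in node.values():
            yield from iter_objects(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_objects(item)

def first_id(doc, default="error"):
    """Return the first "id" value found anywhere in doc, else default."""
    for obj in iter_objects(doc):
        if "id" in obj:
            value = obj["id"]
            # jq's `//` treats null and false as "no result", so mirror that.
            return default if value is None or value is False else value
    return default

variant_a = {"browser_specific_settings": {"gecko": {
    "id": "{95ad7b39-5d3e-1029-7285-9455bcf665c0}", "strict_min_version": "68.0"}}}
variant_b = {"applications": {"gecko": {
    "id": "j30D-3YFPUvj9u9izFoPSjlNYZfF22xS#foobar", "strict_min_version": "53.0"}}}
results = [first_id(variant_a), first_id(variant_b), first_id({"name": "no id here"})]
```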
Does anyone know how to use jq to sort keys and their array values in JSON?
For example:
Before sort:
{
z:["c","b","a"],
y:["e", "d", "f"],
x:["g", "i", "h"]
}
After sort:
{
x:["g", "h", "i"],
y:["d", "e", "f"],
z:["a","b","c"]
}
I am trying to use
jq --sort-keys
but it sorts only the keys, not their array values.
Thanks!
If you are willing to rely on the --sort-keys command-line option to sort the keys, then you can ensure all arrays are sorted by writing:
walk(if type=="array" then sort else . end)
If you want the object keys to be sorted internally (i.e. before the final output is generated), then you could augment the above by using the following filter:
walk(if type=="array" then sort
elif type == "object" then to_entries | sort | from_entries
else . end)
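The same recursive idea, sketched in Python for comparison. The name sort_deep is hypothetical, and the sketch assumes each array holds mutually comparable values (for example, all strings): jq's sort defines an order across mixed types, whereas Python's sorted raises TypeError for them.

```python
def sort_deep(node):
    """Return a copy of node with object keys and array elements sorted."""
    if isinstance(node, dict):
        # Python dicts keep insertion order, so inserting keys in sorted
        # order yields an object whose keys are sorted.
        return {key: sort_deep(node[key]) for key in sorted(node)}
    if isinstance(node, list):
        return sorted(sort_deep(item) for item in node)
    return node

data = {"z": ["c", "b", "a"], "y": ["e", "d", "f"], "x": ["g", "i", "h"]}
result = sort_deep(data)
```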
Alternatives
If for some reason you wish not to use walk, then you can roll your own solution using some combination of sort (for JSON arrays) and to_entries|sort|from_entries (for JSON objects).
I am processing a very large JSON file in which I need to filter the inner JSON objects by the value of a key. My JSON looks as follows:
{"userActivities":{"L3ATRosRdbDgSmX75Z":{"deviceId":"60ee32c2fae8dcf0","dow":"Friday","localDate":"2018-01-20"},"L3ATSFGrpAYRkIIKqrh":{"deviceId":"60ee32c2fae8dcf0","dow":"Friday","localDate":"2018-01-21"},"L3AVHvmReBBPNGluvHl":{"deviceId":"60ee32c2fae8dcf0","dow":"Friday","localDate":"2018-01-22"},"L3AVIcqaDpZxLf6ispK":{"deviceId":"60ee32c2fae8dcf0","dow":"Friday","localDate":"2018-01-19"}}}
I want to filter on localDate so that only objects whose localDate is "2018-01-20" or "2018-01-21" are kept, giving output like this:
{"userActivities":{"L3ATRosRdbDgSmX75Z":{"deviceId":"60ee32c2fae8dcf0","dow":"Friday","localDate":"2018-01-20"},"L3ATSFGrpAYRkIIKqrh":{"deviceId":"60ee32c2fae8dcf0","dow":"Friday","localDate":"2018-01-21"}}}
I have asked a similar question here and realised that I need to put filter on multiple values and retain the original structure of JSON.
https://stackoverflow.com/questions/52324497/how-to-filter-json-using-jq-stream
Thanks a ton in advance!
From the jq Cookbook, let's borrow def atomize(s):
# Convert an object (presented in streaming form as the stream s) into
# a stream of single-key objects
# Examples:
# atomize({a:1,b:2}|tostream)
# atomize(inputs) (used in conjunction with "jq -n --stream")
def atomize(s):
fromstream(foreach s as $in ( {previous:null, emit: null};
if ($in | length == 2) and ($in|.[0][0]) != .previous and .previous != null
then {emit: [[.previous]], previous: $in|.[0][0]}
else { previous: ($in|.[0][0]), emit: null}
end;
(.emit // empty), $in) ) ;
Since the top-level object described by the OP contains just one key, we can select the August 2018 objects as follows:
atomize(1|truncate_stream(inputs))
| select( .[].localDate[0:7] == "2018-08")
If you want these collected into a composite object, you might have to be careful about memory, so you might want to pipe the selected objects to another program (e.g. awk or jq). Otherwise, I'd go with:
def add(s): reduce s as $x (null; .+$x);
{"userActivities": add(
atomize(1|truncate_stream(inputs | select(.[0][0] == "userActivities")))
| select( .[].localDate[0:7] =="2018-01") ) }
Variation
If the top-level object has more than one key, then the following variation would be appropriate:
atomize(1|truncate_stream(inputs | select(.[0][0] == "userActivities")))
| select( .[].localDate[0:7] =="2018-08")
I'm looking to transform JSON using jq to a delimiter-separated and flattened structure.
There have been attempts at this. For example, Flatten nested JSON using jq.
However the solutions on that page fail if the JSON contains arrays. For example, if the JSON is:
{"a":{"b":[1]},"x":[{"y":2},{"z":3}]}
The solution above will fail to transform the above to:
{"a.b.0":1,"x.0.y":2,"x.1.z":3}
In addition, I'm looking for a solution that will also allow for an arbitrary delimiter. For example, suppose the space character is the delimiter. In this case, the result would be:
{"a b 0":1,"x 0 y":2,"x 1 z":3}
I'm looking to have this functionality accessed via a Bash (4.2+) function as is found in CentOS 7, something like this:
flatten_json()
{
local JSONData="$1"
# jq command to flatten $JSONData, putting the result to stdout
jq ... <<<"$JSONData"
}
The solution should work with all JSON data types, including null and boolean. For example, consider the following input:
{"a":{"b":["p q r"]},"w":[{"x":null},{"y":false},{"z":3}]}
It should produce:
{"a b 0":"p q r","w 0 x":null,"w 1 y":false,"w 2 z":3}
If you stream the data in, you'll get pairs of paths and values for all leaf values; any event that is not a pair is a lone path marking the end of the object or array at that path. Using leaf_paths, as you found, gives you only the paths to truthy leaf values, so you miss null and false values. With a stream you don't have this problem.
There are many ways this could be combined to an object, I'm partial to using reduce and assignment in these situations.
$ cat input.json
{"a":{"b":["p q r"]},"w":[{"x":null},{"y":false},{"z":3}]}
$ jq --arg delim '.' 'reduce (tostream|select(length==2)) as $i ({};
.[[$i[0][]|tostring]|join($delim)] = $i[1]
)' input.json
{
"a.b.0": "p q r",
"w.0.x": null,
"w.1.y": false,
"w.2.z": 3
}
Here's the same solution broken up a bit to allow room for explanation of what's going on.
$ jq --arg delim '.' 'reduce (tostream|select(length==2)) as $i ({};
[$i[0][]|tostring] as $path_as_strings
| ($path_as_strings|join($delim)) as $key
| $i[1] as $value
| .[$key] = $value
)' input.json
Converting the input to a stream with tostream, we'll receive multiple values of pairs/paths as input to our filter. With this, we can pass those multiple values into reduce which is designed to accept multiple values and do something with them. But before we do, we want to filter those pairs/paths by only the pairs (select(length==2)).
Then in the reduce call, we're starting with a clean object and assigning new values using a key derived from the path and the corresponding value. Remember that every value produced in the reduce call is used for the next value in the iteration. Binding values to variables doesn't change the current context, and assignments effectively "modify" the current value (the initial object) and pass it along.
$path_as_strings is just the path, an array of strings and numbers, converted to an array of strings only. [$i[0][]|tostring] is a shorthand I use as an alternative to map when the array I want to map is not the current input. It is more compact because the mapping is done as a single expression, instead of having to write ($i[0]|map(tostring)) to get the same result. The outer parentheses might not be necessary in general, but it's still two separate filter expressions versus one (and more text).
Then from there we convert that array of strings to the desired key using the provided delimiter. Then assign the appropriate values to the current object.
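To make the path-to-key step concrete, here is a small Python sketch of the same flattening, offered only as an illustration. The function name flatten_json and its parameters are hypothetical. As in the streamed jq version, null and false leaves are kept, and empty arrays or objects are treated as leaves.

```python
def flatten_json(node, delim=".", path=()):
    """Return {joined_path: leaf_value} for every leaf in node."""
    if isinstance(node, dict) and node:
        children = node.items()
    elif isinstance(node, list) and node:
        children = enumerate(node)
    else:
        # Scalar leaf (including None and False) or an empty container.
        return {delim.join(path): node}
    flat = {}
    for key, value in children:
        # Array indices become strings, like `tostring` in the jq filter.
        flat.update(flatten_json(value, delim, path + (str(key),)))
    return flat

data = {"a": {"b": ["p q r"]}, "w": [{"x": None}, {"y": False}, {"z": 3}]}
dotted = flatten_json(data)
spaced = flatten_json(data, delim=" ")
```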
The following has been tested with jq 1.4, jq 1.5 and the current "master" version. The requirement about including paths to null and false is the reason for "allpaths" and "all_leaf_paths".
# all paths, including paths to null
def allpaths:
def conditional_recurse(f): def r: ., (select(.!=null) | f | r); r;
path(conditional_recurse(.[]?)) | select(length > 0);
def all_leaf_paths:
def isscalar: type | (. != "object" and . != "array");
allpaths as $p
| select(getpath($p)|isscalar)
| $p ;
. as $in
| reduce all_leaf_paths as $path ({};
. + { ($path | map(tostring) | join($delim)): $in | getpath($path) })
With this jq program in flatten.jq:
$ cat input.json
{"a":{"b":["p q r"]},"w":[{"x":null},{"y":false},{"z":3}]}
$ jq --arg delim . -f flatten.jq input.json
{
"a.b.0": "p q r",
"w.0.x": null,
"w.1.y": false,
"w.2.z": 3
}
Collisions
Here is a helper function that illustrates an alternative path-flattening algorithm. It converts keys that contain the delimiter to quoted strings, and array elements are presented in square brackets (see the example below):
def flattenPath(delim):
reduce .[] as $s ("";
if $s|type == "number"
then ((if . == "" then "." else . end) + "[\($s)]")
else . + ($s | tostring | if index(delim) then "\"\(.)\"" else . end)
end );
Example: Using flattenPath instead of map(tostring) | join($delim), the object:
{"a.b": [1]}
would become:
{
"\"a.b\"[0]": 1
}
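The collision that motivates this helper is easy to reproduce. The Python sketch below is illustrative only, and naive_key is a hypothetical name. It joins path components the way map(tostring) | join($delim) does, and shows two different paths producing the same flattened key.

```python
def naive_key(path, delim="."):
    """Join path components with delim, without any quoting."""
    return delim.join(str(part) for part in path)

# Two different paths ...
path_one = ["a.b", 0]      # key "a.b", then array index 0
path_two = ["a", "b", 0]   # key "a", then key "b", then array index 0

# ... that the naive join cannot tell apart.
key_one = naive_key(path_one)
key_two = naive_key(path_two)
collision = key_one == key_two
```

Quoting keys that contain the delimiter and writing array indices in square brackets, as flattenPath does, is one way to keep such paths distinguishable.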
To add a new option to the solutions already given, jqg is a script I wrote to flatten any JSON file and then search it using a regex. For your purposes your regex would simply be '.' which would match everything.
$ echo '{"a":{"b":[1]},"x":[{"y":2},{"z":3}]}' | jqg .
{
"a.b.0": 1,
"x.0.y": 2,
"x.1.z": 3
}
and can produce compact output:
$ echo '{"a":{"b":[1]},"x":[{"y":2},{"z":3}]}' | jqg -q -c .
{"a.b.0":1,"x.0.y":2,"x.1.z":3}
It also handles the more complicated example that @peak used:
$ echo '{"a":{"b":["p q r"]},"w":[{"x":null},{"y":false},{"z":3}]}' | jqg .
{
"a.b.0": "p q r",
"w.0.x": null,
"w.1.y": false,
"w.2.z": 3
}
as well as empty arrays and objects (and a few other edge-case values):
$ jqg . test/odd-values.json
{
"one.start-string": "foo",
"one.null-value": null,
"one.integer-number": 101,
"two.two-a.non-integer-number": 101.75,
"two.two-a.number-zero": 0,
"two.true-boolean": true,
"two.two-b.false-boolean": false,
"three.empty-string": "",
"three.empty-object": {},
"three.empty-array": [],
"end-string": "bar"
}
(reporting empty arrays & objects can be turned off with the -E option).
jqg was tested with jq 1.6
Note : I am the author of the jqg script.
Given an input json string of keys from an array, return an object with only the entries that had keys in the original object and in the input array.
I have a solution but I think that it isn't elegant ({($k):$input[$k]} feels especially clunky...) and that this is a chance for me to learn.
jq -n '{"1":"a","2":"b","3":"c"}' \
| jq --arg keys '["1","3","4"]' \
'. as $input
| ( $keys | fromjson )
| map( . as $k
| $input
| select(has($k))
| {($k):$input[$k]}
)
| add'
Any ideas how to clean this up?
I feel like Extracting selected properties from a nested JSON object with jq is a good starting place, but I cannot get it to work.
A solution using an inside check:
jq 'with_entries(select([.key] | inside(["key1", "key2"])))'
The inside filter works most of the time, but it has a pitfall: because it compares strings by substring, it can select keys that were not requested. Suppose the input is { "key1": val1, "key2": val2, "key12": val12 }; selecting with inside(["key12"]) picks both "key1" and "key12".
Use the in filter if you need an exact match. The following selects only .key2 and .key12:
jq 'with_entries(select(.key | in({"key2":1, "key12":1})))'
Because in only checks whether a key exists in an object (or whether an index exists in an array), the desired keys have to be written as an object whose values do not matter. This makes in an imperfect fit for the purpose; I would like to see a reversed version of the JavaScript ES6 Array includes API implemented as a jq builtin:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/includes
jq 'with_entries(select(.key | included(["key2", "key12"])))'
Such an included filter (hypothetical, not an existing builtin) would check whether the item .key is included in an array.
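The difference between substring matching and exact membership can be illustrated with a short Python sketch. jq_like_contains is a hypothetical helper that only models the string rule behind inside/contains (substring match); it is not a full reimplementation of either filter.

```python
def jq_like_contains(container, item):
    """Rough model of the string rule in jq's contains: substring match."""
    return any(item in element for element in container)

wanted = ["key12"]
keys = ["key1", "key2", "key12"]

# Substring semantics: "key1" is a substring of "key12", so it slips through.
substring_matches = [k for k in keys if jq_like_contains(wanted, k)]

# Exact membership: only keys that are literally in the wanted list.
exact_matches = [k for k in keys if k in set(wanted)]
```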
You can use this filter:
with_entries(
select(
.key as $k | any($keys | fromjson[]; . == $k)
)
)
Here is some additional clarification
For the input object {"key1":1, "key2":2, "key3":3} I would like to drop all keys that are not in the set of desired keys ["key1","key3","key4"]
jq -n --argjson desired_keys '["key1","key3","key4"]' \
--argjson input '{"key1":1, "key2":2, "key3":3}' \
' $input
| with_entries(
select(
.key == ($desired_keys[])
)
)'
with_entries converts {"key1":1, "key2":2, "key3":3} into the following array of key value pairs and maps the select statement on the array and then turns the resulting array back into an object.
Here is the inner object in the with_entries statement.
[
{
"key": "key1",
"value": 1
},
{
"key": "key2",
"value": 2
},
{
"key": "key3",
"value": 3
}
]
we can then select the keys from this array that meet our criteria.
This is where the magic happens. Here is a look at what's going on in the middle of this command. The following command takes the expanded array of entries and presents them as a stream of objects that we can select from.
jq -cn '{"key":"key1","value":1}, {"key":"key2","value":2}, {"key":"key3","value":3}
| select(.key == ("key1", "key3", "key4"))'
This will yield the following result
{"key":"key1","value":1}
{"key":"key3","value":3}
with_entries can be a little tricky, but it's easy to remember that it takes a filter and is defined as follows:
def with_entries(f): to_entries|map(f)|from_entries;
This is the same as
def with_entries(f): [to_entries[] | f] | from_entries;
The other part of the question that confuses people is the multiple matches on the right hand side of the ==
Consider the following command. The output is the Cartesian product of the left-hand values and the right-hand values: one comparison result for every pairing.
jq -cn '1,2,3| . == (1,1,3)'
true
true
false
false
false
false
false
false
true
If that predicate is in a select statement, we keep the input when the predicate is true. Note you can duplicate the inputs here too.
jq -cn '1,2,3| select(. == (1,1,3))'
1
1
3
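The pairing behaviour can be mimicked in Python with itertools.product. This is a sketch for illustration only; the variable names are arbitrary.

```python
from itertools import product

inputs = [1, 2, 3]
candidates = [1, 1, 3]

# One comparison result per (input, candidate) pair, inputs varying slowest.
comparisons = [left == right for left, right in product(inputs, candidates)]

# select(...) emits the input once for every comparison that is true,
# which is why 1 appears twice.
selected = [left for left, right in product(inputs, candidates) if left == right]
```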
Jeff's answer has a couple of unnecessary inefficiencies, both of which are addressed by the following, on the assumption that --argjson keys is used instead of --arg keys:
with_entries( select( .key as $k | $keys | index($k) ) )
Even better, if your jq has IN:
with_entries(select(.key | IN($keys[])))
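For comparison, the same "keep only the listed keys" operation is a one-line dictionary comprehension in Python. The name pick_keys is hypothetical; membership is exact, so a key listed in the array but absent from the object is simply ignored.

```python
def pick_keys(obj, wanted):
    """Keep only the entries of obj whose key is listed in wanted."""
    wanted_set = set(wanted)  # exact membership, no substring matching
    return {k: v for k, v in obj.items() if k in wanted_set}

picked = pick_keys({"1": "a", "2": "b", "3": "c"}, ["1", "3", "4"])
```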
If you are sure that all keys in the input array are present in the original object, you can use the object construction shortcut.
$ echo '{"1":"a","2":"b","3":"c"}' | jq '{"1", "3"}'
{
"1": "a",
"3": "c"
}
Numbers should be quoted to force jq to interpret them as keys instead of literals. In the case of keys not resembling a number, quotes are not needed:
$ echo '{"key1":"a","key2":"b","key3":"c"}' | jq '{key1, key3}'
{
"key1": "a",
"key3": "c"
}
Adding a non-existent key yields a null value, which is probably not what the OP wanted:
$ echo '{"1":"a","2":"b","3":"c"}' | jq '{"1", "3", "4"}'
{
"1": "a",
"3": "c",
"4": null
}
but those can be filtered out:
$ echo '{"1":"a","2":"b","3":"c"}' | jq '{"1", "3", "4"} | with_entries(select(.value != null))'
{
"1": "a",
"3": "c"
}
Although this answer doesn't receive a valid input json array as OP asked, I find it useful for just filtering some keys you know are present.
An example use case: get aud and iss from a JWT. The following is very succinct:
echo "jwt-as-json" | jq '{aud, iss}'