Select only non-empty objects from array - json

I have a JSON array with objects as items. Some of the objects have properties, some don't. How can I filter the list with jq to keep only non-empty objects in the list?
Input:
[
{},
{ "x": 42 },
{},
{ "y": 21 },
{},
{ "z": 13 }
]
Expected output:
[
{ "x": 42 },
{ "y": 21 },
{ "z": 13 }
]
I tried jq 'select(. != {})', but the result still contains all items. Using jq 'map(select(empty))' doesn't keep any item and returns an empty array!

You need map to make select test individual objects:
jq 'map(select(. != {}))'

map is required to apply the filter to every item in the array. Then any can be used: it will return true for objects with at least one property, and false if the object is empty.
$ jq -n '{} | any'
false
$ jq -n '{x:42} | any'
true
empty is not a test, but a generator for "no value" (docs).
Solution:
map(select(any))
As noted by user Philippe in the comments, using any only works if the values of the objects' properties are truthy, that is, neither false nor null. It would detect objects such as {"x":false} as "empty". A better approach is comparing against the empty object directly or using length:
map(select(length>0))
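To see the difference between the two filters, here is a small sketch. The input is made up for illustration (it is not the asker's data) and deliberately contains objects whose only value is false or null:

```shell
# Hypothetical input: an empty object, two objects with falsy values, one ordinary object.
input='[{}, {"x": false}, {"y": null}, {"z": 13}]'

# any tests the property VALUES for truthiness, so the false/null-only objects are dropped too.
with_any=$(printf '%s' "$input" | jq -c 'map(select(any))')

# length counts the keys, regardless of what the values are.
with_length=$(printf '%s' "$input" | jq -c 'map(select(length > 0))')

echo "$with_any"     # [{"z":13}]
echo "$with_length"  # [{"x":false},{"y":null},{"z":13}]
```

Only the length-based filter keeps every object that actually has a property.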

Just use the del filter:
jq 'del(.[] | select(length == 0))'
or
jq 'del(.[] | select(. == {}))'

Related

failed to extract data from json with jq command

I have this JSON (shown below).
I want to extract the entries whose test field equals true.
I tried with jq and got the error below. Is there a fix?
$ jq '.[].name | select(.[].test == "true")' ddd
jq: error (at ddd:12): Cannot iterate over string ("AA")
[
{
"name": "AA",
"program_url": "https://www.google.com",
"test": false
},
{
"name": "BB",
"program_url": "https://yahoo.com",
"test": true
}
]
Are you looking for this? It iterates over the array with .[], selects those item objects whose .test field evaluates to true (implicitly, without a comparison), and traverses further down to the .name field. Using the --raw-output (or -r) option renders the output as raw text (instead of a JSON string in this case).
jq -r '.[] | select(.test).name' ddd
BB
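As a side note, the original attempt had a second problem besides iterating over the name string: it compared against the string "true", while the data holds the boolean true. A minimal sketch on a trimmed copy of the sample (the program_url fields are left out here for brevity):

```shell
input='[{"name":"AA","test":false},{"name":"BB","test":true}]'

# The string "true" is never equal to the boolean true, so nothing is selected.
as_string=$(printf '%s' "$input" | jq -r '.[] | select(.test == "true").name')

# Comparing against the boolean literal matches the second item.
as_boolean=$(printf '%s' "$input" | jq -r '.[] | select(.test == true).name')

echo "string comparison:  [$as_string]"
echo "boolean comparison: [$as_boolean]"
```

select(.test) and select(.test == true) agree on this data; they would differ only if test could hold other truthy values such as numbers or strings.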

Why is adding parentheses to a filter in 'jq' producing valid JSON and without parentheses, multiple outputs of objects?

With jq, I would like to set a property within JSON data and let jq output the original JSON with the updated value. I found, more or less due to trial and error, a solution, and want to understand why and how it works.
I have the following JSON data:
{
"notifications": [
{
"source": "observer01",
"channel": "error",
"time": "2021-01-01 01:01:01"
},
{
"source": "observer01",
"channel": "info",
"time": "2021-02-02 02:02:02"
}
]
}
My goal is to update the time property of an object with a specific source and channel (the original JSON is way longer with lots of objects in the notifications array of the same format).
(In the following example, I want to update the time property of observer01 with channel info, so the second object in the example data above.)
My first try, not producing the desired output, was the following jq command:
jq '.notifications[] | select(.source == "observer01" and .channel == "info").time = "NEWTIME"' data.json
That produces the following output:
{
"source": "observer01",
"channel": "error",
"time": "2021-01-01 01:01:01"
},
{
"source": "observer01",
"channel": "info",
"time": "NEWTIME"
}
Which is just a list of the JSON objects within the notifications array. I understand that this can be useful, for example piping the objects to other command line tools.
Now let's try the following jq command, which is the same as above plus one pair of parentheses:
jq '(.notifications[] | select(.source == "observer01" and .channel == "info").time) = "NEWTIME"' data.json
This produces the desired output, the original valid JSON with the updated time property:
{
"notifications": [
{
"source": "observer01",
"channel": "error",
"time": "2021-01-01 01:01:01"
},
{
"source": "observer01",
"channel": "info",
"time": "NEWTIME"
}
]
}
Why is adding the parentheses to the jq filter in the case above producing a different output?
The parentheses just change the precedence. It's documented in man jq:
Parenthesis work as a grouping operator just as in any typical programming language.
jq '(. + 2) * 5'
   1
=> 15
Let's have a simpler example:
echo '[{"a":1}, {"a":2}]' | jq '.[] | .a |= .+1'
It outputs
{
"a": 2
}
{
"a": 3
}
because it's interpreted as
                                      ↓         ↓
echo '[{"a":1}, {"a":2}]' | jq '.[] | (.a |= .+1)'
The first filter .[] outputs the elements as separate objects, which are then modified by the second filter.
Placing the parentheses after the first two elements changes the precedence:
                                ↓        ↓
echo '[{"a":1}, {"a":2}]' | jq '(.[] | .a) |= .+1'
and produces a different output:
[
{
"a": 2
},
{
"a": 3
}
]
BTW, this is the same output as from
echo '[{"a":1}, {"a":2}]' | jq '.[].a |= .+1'
It changes the value associated with the "a" key in each element of the array, and outputs the whole array.
Let's compare the two.
.notifications[] | select(...).time = "NEWTIME"
(.notifications[] | select(...).time) = "NEWTIME"
In the first one, the top-level filter is defined by |. The input is an object, and the output is the result of applying select(...).time = "NEWTIME" to each value produced by .notifications[]. In essence, the original object is "lost".
In the second one, the top-level filter is defined by =. x = y returns its input as output, but with the value addressed by x replaced. It does so by:
1. Determining what the path expression x refers to in the input,
2. Evaluating the filter y on the input (even an expression like "NEWTIME" is just a filter: one that ignores its input and returns the string "NEWTIME"),
3. Assigning the result of y to the thing addressed by x.
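One way to check which location the left-hand side addresses is jq's path(...) builtin, which prints the path instead of the value. A rough sketch using the two-element sample from the question:

```shell
data='{
  "notifications": [
    {"source": "observer01", "channel": "error", "time": "2021-01-01 01:01:01"},
    {"source": "observer01", "channel": "info",  "time": "2021-02-02 02:02:02"}
  ]
}'

# Which path does the parenthesized expression address in the input?
addressed=$(printf '%s' "$data" | jq -c 'path(.notifications[] | select(.source == "observer01" and .channel == "info").time)')

# Assign to that path, then read the updated value back from the full document.
new_time=$(printf '%s' "$data" | jq -r '(.notifications[] | select(.source == "observer01" and .channel == "info").time) = "NEWTIME" | .notifications[1].time')

echo "$addressed"  # ["notifications",1,"time"]
echo "$new_time"   # NEWTIME
```

Because the assignment is the top-level operation, its output is still the whole document, which is why .notifications[1].time can be read from it afterwards.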

JQ: key selection from numeric objects

I use jq 1.6 in a Windows 10 PowerShell environment and am trying to select keys from JSON objects whose names happen to be numeric.
JSON example:
{
"alliances_info":{
"744085325458334213":{
"emblem":3,
"name":"wellwell",
"member_count":1,
"level":1,
"military_might":1035,
"public":false,
"tag":"MELL",
"slogan":"",
"id":744085325458334213
},
"744128593839677958":{
"emblem":0,
"name":"Brave",
"member_count":1,
"level":1,
"military_might":1035,
"public":false,
"tag":"GABA",
"slogan":"",
"id":744128593839677958
},
"746034084459209223":{
"emblem":0,
"name":"Queen",
"member_count":1,
"level":1,
"military_might":1035,
"public":false,
"tag":"QUE",
"slogan":"",
"id":746034084459209223
},
"750446471312466445":{
"emblem":0,
"name":"Phoenix Inc",
"member_count":35,
"level":6,
"military_might":453369,
"public":true,
"tag":"PHOI",
"slogan":"",
"id":750446471312466445
},
"750446518934594062":{
"emblem":11,
"name":"Australia",
"member_count":44,
"level":8,
"military_might":957211,
"public":true,
"tag":"AUST",
"slogan":"Go Australia",
"id":750446518934594062
}
},
"server_version":"v7.190.4-master.000000006"
}
I tried several jq commands:
.alliances_info | .[] | [{alliance_name: .name, alliance_count: .member_count, alliance_level: .level, alliance_power: .military_might, alliance_tag: .tag, alliance_slogan: .slogan, alliance_id: .id}]
or
.alliances_info | .. | objects | [{alliance_name: .name, alliance_count: .member_count, alliance_level: .level, alliance_power: .military_might, alliance_tag: .tag, alliance_slogan: .slogan, alliance_id: .id}]
But I always get a jq error: parse error: Invalid numeric literal at line 1, column 3
If I give up building the object in the first command (and build only an array), it works. But I need those objects. Any tips?
BR
Timo
Your first query works perfectly well with the given JSON sample. Perhaps you're invoking jq incorrectly. If you have the jq program in a file, say select.jq, you'd invoke jq like so:
jq -f select.jq sample.json
If that doesn't help, then try:
jq empty sample.json
If that fails, there might be something wrong with the encoding of the JSON.
I'm not sure I understand what you want.
Your first attempt works for me, but generates one output for each JSON value in the input object. That is, I created a file named so.json and put your JSON from above in it:
{
"alliances_info": {
"744085325458334213": {
"emblem": 3,
⋮
}
When I run your program, I get:
$ jq '.alliances_info | .[] | [{alliance_name: .name, alliance_count: .member_count, alliance_level: .level, alliance_power: .military_might, alliance_tag: .tag, alliance_slogan: .slogan, alliance_id: .id}]' so.json
[
{
"alliance_name": "wellwell",
"alliance_count": 1,
"alliance_level": 1,
"alliance_power": 1035,
"alliance_tag": "MELL",
"alliance_slogan": "",
"alliance_id": 744085325458334200
}
]
[
{
"alliance_name": "Brave",
⋮
]
If you want an array at all, you probably want one array containing all the alliances like this:
$ jq '.alliances_info | [ .[] | { alliance_name: .name, alliance_id: .id } ]' so.json
[
{
"alliance_name": "wellwell",
"alliance_id": 744085325458334200
},
{
"alliance_name": "Brave",
"alliance_id": 744128593839678000
},
{
"alliance_name": "Queen",
"alliance_id": 746034084459209200
},
{
"alliance_name": "Phoenix Inc",
"alliance_id": 750446471312466400
},
{
"alliance_name": "Australia",
"alliance_id": 750446518934594000
}
]
Starting from the left,
- .alliances_info looks in its input object for the field named "alliances_info" and outputs its value
- the | next says take the output from the left-hand side and pass those as inputs to the right-hand side.
- right after that first |, I have a [ «jq expressions» ] which tells jq to create one JSON array output for each input; the elements of that array are the outputs of that inner «jq expressions»
- that inner expression starts with .[] which means to produce one output for each JSON value (ignoring the keys) in the input object. For us, that will be the objects named "744085325458334213", "744128593839677958", …
- The next | uses those objects as input and for each, generates a JSON object { alliance_name: .name, alliance_id: .id }
That's why I end up with one JSON array containing 5 JSON objects.
As far as I can tell, you are mostly just renaming a bunch of the fields. For that, you could just do something like this:
$ jq --argjson renameMap '{ "name": "alliance_name", "member_count": "alliance_count", "level": "alliance_level", "military_might": "alliance_power", "tag": "alliance_tag", "slogan": "alliance_slogan"}' '.alliances_info |= ( . | [ to_entries[] | ( .value |= ( . | [ to_entries[] | ( .key |= ( if $renameMap[.] then $renameMap[.] else . end ) ) ] | from_entries ) ) ] | from_entries )' so.json
{
"alliances_info": {
"744085325458334213": {
"emblem": 3,
"alliance_name": "wellwell",
"alliance_count": 1,
"alliance_level": 1,
"alliance_power": 1035,
"public": false,
"alliance_tag": "MELL",
"alliance_slogan": "",
"id": 744085325458334200
},
"744128593839677958": {
"emblem": 0,
"alliance_name": "Brave",
"alliance_count": 1,
"alliance_level": 1,
"alliance_power": 1035,
"public": false,
"alliance_tag": "GABA",
"alliance_slogan": "",
"id": 744128593839678000
},
⋮
},
"server_version": "v7.190.4-master.000000006"
}
Well, I am an idiot (to be totally clear here). I found the reason (and it is normally a no-brainer...). I read the input from a file, and the funny thing is that the file is Unicode but not UTF-8. After re-encoding it, the command works fine. Thanks for the help.
BR
Timo

jq: How to match one of array and get sibling value

I have some JSON like this:
{
"x": [
{
"name": "Hello",
"id": "211"
},
{
"name": "Goodbye",
"id": "221"
},
{
"name": "Christmas",
"id": "171"
}
],
"y": "value"
}
Using jq, given a name value (e.g. Christmas), how can I get its associated id (i.e. 171)?
I've got as far as being able to check for the presence of the name in one of the array's objects, but I can't work out how to filter it down:
jq -r 'select(.x[].name == "Christmas")'
jq approach:
jq -r '.x[] | select(.name == "Christmas").id' file
171
The function select(boolean_expression) produces its input unchanged if boolean_expression returns true for that input, and produces no output otherwise.
It can also be done like this:
jq '.x[] | select(.name == "Christmas").id'
You can also try this online with jq play.
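If the name comes from a shell variable rather than being hard-coded, jq's --arg option can pass it in as a string. A minimal sketch, where the variable name wanted and the jq variable $name are just illustrative choices:

```shell
input='{"x":[{"name":"Hello","id":"211"},{"name":"Goodbye","id":"221"},{"name":"Christmas","id":"171"}],"y":"value"}'

# The name to look up; in a real script this might come from user input.
wanted='Christmas'

# --arg binds the shell value to the jq variable $name (always as a string).
id=$(printf '%s' "$input" | jq -r --arg name "$wanted" '.x[] | select(.name == $name).id')

echo "$id"  # 171
```

This avoids splicing the value into the jq program text, so names containing quotes do not break the filter.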

Getting only desired properties from nested array values with jq

The structure I ultimately want would be:
{
"catalog": [
{
"name": "X",
"catalog": [
{ "name": "Y", "uniqueId": "Z" },
{ "name": "Q", "uniqueId": "B" }
]
}
]
}
This is what the existing structure looks like except there are many other properties at each level (https://gist.github.com/ajcrites/e0e0ca4ca3a08ff2dc401ec872e6094c). I just want to filter those out and get a JSON format that looks specifically like this.
I have started out with: jq '.catalog', but this returns only the array. I still want the catalog property name there. I can do this with jq '{catalog: .catalog[]}', but this prints out each catalog object individually, which makes the whole output invalid JSON. I still want the properties to be in the array. Is there a way to filter specific property key-values within arrays using jq?
The following transforms the given input to the desired output and may well be what you want:
{catalog}
| .catalog |= map( {name, catalog} )
| .catalog[].catalog |= map( {name, uniqueId} )
| .catalog |= .[0:1]
However, it's not clear to me that this is really what you want, as you don't discuss the duplication in the given JSON input. So maybe you don't really want the last line in the above, or maybe you want duplicates to be handled in some other way, or ....
Anyway, the trick to keeping things simple here is to use |=.
An alternative approach would be to use del to delete the unwanted properties (rather than selecting the ones you want), but in the present case, that would be (at best) tedious.
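To see what the |= steps do in isolation, here is a sketch on a small made-up document (the extra and other properties are placeholders for the many unwanted properties; this is not the gist's data, and the final .[0:1] step is left out):

```shell
input='{"catalog":[{"name":"X","extra":1,"catalog":[{"name":"Y","uniqueId":"Z","extra":2}]}],"other":true}'

# {catalog} keeps only the top-level key; each |= rewrites one nesting level in place.
trimmed=$(printf '%s' "$input" | jq -c '
  {catalog}
  | .catalog |= map({name, catalog})
  | .catalog[].catalog |= map({name, uniqueId})
')

echo "$trimmed"  # {"catalog":[{"name":"X","catalog":[{"name":"Y","uniqueId":"Z"}]}]}
```

Each update replaces the array at its own path and leaves the surrounding structure alone, which is why the outer catalog key survives.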
You could start by using tostream to convert your sample.json
into a stream of [path, value] arrays as you can see by running
jq -c tostream sample.json
This will generate
[["catalog",0,"catalog",0,"name"],"Y"]
[["catalog",0,"catalog",0,"prop11"],""]
[["catalog",0,"catalog",0,"uniqueId"],"Z"]
[["catalog",0,"catalog",0,"uniqueId"]]
[["catalog",0,"catalog",1,"name"],"Y"]
[["catalog",0,"catalog",1,"prop11"],""]
...
reduce and setpath can be used to convert back into the
original form with a filter such as:
reduce (tostream|select(length==2)) as [$p,$v] (
{};
setpath($p;$v)
)
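As a quick sanity check of that round trip, the following sketch uses a tiny made-up input rather than sample.json. It prints the stream events and then rebuilds the document from the [path, value] events only:

```shell
input='{"a":[1,{"b":2}]}'

# Show the stream: [path, value] events plus the length-1 closing events.
printf '%s' "$input" | jq -c 'tostream'

# Keep only the [path, value] events and replay them with setpath.
rebuilt=$(printf '%s' "$input" | jq -c '
  reduce (tostream | select(length == 2)) as [$p, $v] ({}; setpath($p; $v))
')

echo "$rebuilt"  # {"a":[1,{"b":2}]}
```

Note that starting the reduction from {} assumes the top-level value is an object.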
Adding conditionals makes it easy to omit properties at any level.
For example the following removes leaf attributes starting with "prop":
reduce (tostream|select(length==2)) as [$p,$v] (
{};
if $p[-1]|startswith("prop")
then .
else setpath($p;$v)
end
)
With your sample.json this produces
{
"catalog": [
{
"catalog": [
{
"name": "Y",
"uniqueId": "Z"
},
{
"name": "Y",
"uniqueId": "Z"
}
],
"name": "X"
},
{
"catalog": [
{
"name": "Y",
"uniqueId": "Z"
},
{
"name": "Y",
"uniqueId": "Z"
}
],
"name": "X"
}
]
}
If the goal is to remove certain properties, then one could do so using walk/1. For example, to remove properties whose names start with "prop":
walk(if type == "object"
then with_entries(select(.key|startswith("prop") | not))
else . end)
The same approach would also be applicable if the focus is on retaining certain properties, e.g.:
walk(if type == "object"
then with_entries(select(.key == "name" or .key == "uniqueId" or .key == "catalog"))
else . end)
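For instance, the first walk filter could be tried on a small made-up document like this (the prop1 and prop11 names are only stand-ins for the unwanted properties, and walk needs a jq version that ships it as a builtin):

```shell
input='{"catalog":[{"name":"Y","prop11":"","uniqueId":"Z"}],"name":"X","prop1":1}'

# walk visits every value; objects get their "prop*" keys filtered out at every depth.
pruned=$(printf '%s' "$input" | jq -c '
  walk(if type == "object"
       then with_entries(select(.key | startswith("prop") | not))
       else . end)
')

echo "$pruned"  # {"catalog":[{"name":"Y","uniqueId":"Z"}],"name":"X"}
```

Because walk recurses through arrays and objects alike, the same condition is applied at the top level and inside the nested catalog entries.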
You could build up a file that contains paths into the json (expressed as arrays) that you want to keep. Then filter out values that do not fit in those paths.
paths.json:
["catalog","name"]
["catalog","catalog","name"]
["catalog","catalog","uniqueId"]
Then filter values based on their paths. Using streams is a great way to go for this since it gives you access to these paths directly:
$ jq --slurpfile paths paths.json '
def keep_path($path): any($paths[]; . == [$path[] | select(strings)]);
fromstream(tostream | select(length == 1 or keep_path(.[0])))
' input.json