jq: How can I remove keys based on a regex?

I would like to remove all keys that start with "hide". Note that the keys may be nested at many levels. I'd like to see an answer that uses a regex, although I recognise that in my example a simple contains would suffice. (I don't know how to do this with contains either, BTW.)
Input JSON 1:
{
"a": 1,
"b": 2,
"hideA": 3,
"c": {
"d": 4,
"hide4": 5
}
}
Desired output JSON:
{
"a": 1,
"b": 2,
"c": {
"d": 4
}
}
Input JSON 2:
{
"a": 1,
"b": 2,
"hideA": 3,
"c": {
"d": 4,
"hide4": 5
},
"e": null,
"f": "hiya",
"g": false,
"h": [{
"i": 343.232,
"hide9": "private",
"so_smart": true
}]
}
Thanks!

Since you're only checking the start of each key, startswith/1 is enough in this case; for an actual regex, use test/1 or test/2. Either way, collect the matching paths and pass them to delpaths/1.
You'll want to filter the last path component with strings (or convert it to a string) beforehand, because path components are numbers for array elements in your tree. Note that the second filter below passes the "i" flag, so it also matches keys such as "HideA".
delpaths([paths | select(.[-1] | strings | startswith("hide"))])
delpaths([paths | select(.[-1] | strings | test("^hide"; "i"))])
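To check the filter against a document that contains an array, here is a minimal sketch that runs the startswith variant on a trimmed-down copy of Input JSON 2. The trimming and the shell variable names are mine, and it assumes jq 1.5 or later is on the PATH. The path to an array element ends in a number, which is why strings is there: without it, startswith would be handed a number and fail.

```shell
# Trimmed-down copy of Input JSON 2 (shortened for brevity; an assumption).
input='{"a":1,"hideA":3,"c":{"d":4,"hide4":5},"h":[{"hide9":"private","so_smart":true}]}'

# Collect every path whose last component is a string starting with "hide",
# then delete all of those paths in one call.
result=$(printf '%s' "$input" |
  jq -c 'delpaths([paths | select(.[-1] | strings | startswith("hide"))])')

printf '%s\n' "$result"
# prints {"a":1,"c":{"d":4},"h":[{"so_smart":true}]}
```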

A straightforward approach to the problem is to use walk in conjunction with with_entries, e.g.
walk(if type == "object"
then with_entries(select(.key | test("^hide") | not))
else . end)
If your jq does not have walk/1 simply include its def (available e.g. from https://raw.githubusercontent.com/stedolan/jq/master/src/builtin.jq) before invoking it.
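Since the question also asks about contains: inside with_entries the key is a plain string, so any string predicate can stand in for test. Here is a small sketch on Input JSON 1, assuming a jq that ships walk (1.5 or later). Keep in mind that contains("hide") matches the substring anywhere in the key, not only at the start, so the two versions agree on this input but not in general.

```shell
# Input JSON 1 from the question, on one line.
input='{"a":1,"b":2,"hideA":3,"c":{"d":4,"hide4":5}}'

# Regex version: drop keys that start with "hide".
with_test=$(printf '%s' "$input" | jq -c '
  walk(if type == "object"
       then with_entries(select(.key | test("^hide") | not))
       else . end)')

# Substring version: drop keys that contain "hide" anywhere.
with_contains=$(printf '%s' "$input" | jq -c '
  walk(if type == "object"
       then with_entries(select(.key | contains("hide") | not))
       else . end)')

printf '%s\n%s\n' "$with_test" "$with_contains"
# both lines print {"a":1,"b":2,"c":{"d":4}}
```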

Related

Count elements in nested JSON with jq

I am trying to count all elements in a nested JSON document with jq.
Given the following JSON-document
{"a": true, "b": [1, 2], "c": {"a": {"aa":1, "bb": 2}, "b": "blue"}}
I want to calculate the result 6.
In order to do this, I tried the following:
echo '{"a": true, "b": [1, 2], "c": {"a": {"aa":1, "bb": 2}, "b": "blue"}}' \
| jq 'reduce (.. | if (type == "object" or type == "array")
then length else 0 end) as $counts
(1; . + $counts)'
# Actual output: 10
# Desired output: 6
However, this also counts the objects and arrays it encounters and therefore yields 10 instead of the desired 6.
So, how can I only count the document's elements/leaf-nodes?
Thanks in advance for your help!
Edit: What would be an efficient approach to count empty arrays and objects as well?
You can use the scalars filter to find leaf nodes. Scalars are all "simple" JSON values, i.e. null, true, false, numbers and strings. Alternatively you can compare the type of each item and use length to determine if an object or array has children.
I've expanded your input data a little to distinguish a few more corner cases:
Input:
{
"a": true,
"b": [1, 2],
"c": {
"a": {
"aa": 1,
"bb": 2
},
"b": "blue"
},
"d": [],
"e": [[], []],
"f": {}
}
This has 15 JSON entities:
5 of them are arrays or objects with children.
4 of them are empty arrays or objects.
6 of them are scalars.
Depending on what you're trying to do, you might consider only scalars to be "leaf nodes", or you might consider both scalars and empty arrays and objects to be leaf nodes.
Here's a filter that counts scalars:
[..|scalars]|length
Output:
6
And here's a filter that counts all entities which have no children. It checks for the four scalar types explicitly (a JSON value can only be one of six types); anything else must be an array or object, and length tells us how many children it has.
[
..|
select(
(type|IN("boolean","number","string","null")) or
length==0
)
]|
length
Output:
10
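The breakdown of the 15 entities given earlier can be reproduced in one pass. This is only a sketch, assuming jq 1.5 or later; the helper name container and the output keys are my own choices. It relies on "and" not evaluating its right-hand side when the left-hand side is false, so length is never applied to a boolean.

```shell
# The expanded input from above, on one line.
input='{"a":true,"b":[1,2],"c":{"a":{"aa":1,"bb":2},"b":"blue"},"d":[],"e":[[],[]],"f":{}}'

counts=$(printf '%s' "$input" | jq -c '
  # True for arrays and objects, false for every scalar.
  def container: type == "array" or type == "object";
  [..] as $all
  | {
      "total":    ($all | length),
      "scalars":  ($all | map(scalars) | length),
      "empty":    ($all | map(select(container and length == 0)) | length),
      "nonempty": ($all | map(select(container and length > 0)) | length)
    }')

printf '%s\n' "$counts"
# prints {"total":15,"scalars":6,"empty":4,"nonempty":5}
```

Adding the scalars and empty figures gives the 10 reported by the second filter.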

How can I spread an object's properties in jq?

If I need to access the properties of an object, I'm currently accessing each property manually:
echo '{"a": {"a1":1, "a2": 2}, "b": 3}' | jq '{a1:.a.a1, a2: .a.a2,b}'
{
"a1": 1,
"a2": 2,
"b": 3
}
I'd like to avoid specifying every property. Is there an equivalent to the Object spread operator in JS, something like jq '{...a, b}'?
You can add objects together to combine their contents. If a key exists in both the left and right objects, the value from the right object wins.
echo '{"a": {"a1":1, "a2": 2}, "b": 3}' | jq '.a+{b}'
{
"a1": 1,
"a2": 2,
"b": 3
}
If you want a completely generic solution:
[..|objects|with_entries(select(.value|type!="object"))]|add
Or, if values from outer objects should win over values from nested ones when keys collide, replace add by reverse|add.
The above of course comes with the understanding that add resolves conflicts in a lossy way. If you don’t want any lossiness, choose a different method for combining objects, or maybe don’t combine them at all.
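To see how the generic filter resolves a collision, here is a small sketch on a made-up input in which x occurs at two depths (the input and the shell variable names are hypothetical). Because .. visits an object before anything nested inside it, add lets the nested value win, while reverse | add merges the outermost object last.

```shell
# Hypothetical input: "x" appears at the top level and inside "a".
input='{"x":1,"a":{"x":2,"y":3}}'

# The generic filter from above, without the final merge step.
flatten='[.. | objects | with_entries(select(.value | type != "object"))]'

# add: objects later in the traversal (here the nested one) overwrite earlier ones.
deep_wins=$(printf '%s' "$input" | jq -c "$flatten | add")

# reverse | add: the outermost object is merged last, so its values win.
outer_wins=$(printf '%s' "$input" | jq -c "$flatten | reverse | add")

printf '%s\n%s\n' "$deep_wins" "$outer_wins"
# prints {"x":2,"y":3}
# then   {"x":1,"y":3}
```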
Here is a solution that only examines the top-level values, without referring to any key by name:
with_entries(if .value|type=="object" then .value|to_entries[] else . end)
For the example, this produces:
{
"a1": 1,
"a2": 2,
"b": 3
}
Note that even though this solution doesn't use add explicitly, it comes with a similar caveat about key collisions.

Accumulate an array of key-value pairs into a single object

How can I use jq to transform this:
[
{
"k": "a",
"v": 123
},
{
"k": "b",
"v": 456
}
]
into this:
{
"a": 123,
"b": 456
}
Turn each element into a single-key object, then add them all together into one object.
map({(.k): .v}) | add
If your input is a large dataset, reduce might be a better choice in terms of performance.
reduce .[] as {$k,$v} ({}; . + {($k): $v})
Another option: since your objects already resemble entries, you could map them to key/value pairs and convert the result to an object with from_entries.
map({key: .k, value: .v}) | from_entries
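As a quick sanity check, the three filters can be run side by side on the sample input. This sketch assumes jq 1.6 or later (the reduce form uses destructuring); the shell variable names are mine.

```shell
# Sample input from the question, on one line.
input='[{"k":"a","v":123},{"k":"b","v":456}]'

# 1. Build one single-key object per element, then merge them.
via_add=$(printf '%s' "$input" | jq -c 'map({(.k): .v}) | add')

# 2. Fold the elements into an accumulator object.
via_reduce=$(printf '%s' "$input" | jq -c 'reduce .[] as {$k,$v} ({}; . + {($k): $v})')

# 3. Rename the fields to key/value and let from_entries do the rest.
via_entries=$(printf '%s' "$input" | jq -c 'map({key: .k, value: .v}) | from_entries')

printf '%s\n' "$via_add" "$via_reduce" "$via_entries"
# each line prints {"a":123,"b":456}
```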

Linux command to print all jsons of same key

I have json as a string "Str"
"{
"A": {
"id": 4
},
"B": {//Something},
"C": {
"A": {
"id": 2
}
},
"E": {
"A": null
},
"F": {//Something}
}"
I want all non-null values of "A", which can appear anywhere in the JSON. The output should be the contents of each "A":
{"id": 4}
{"id": 2}
Can you please help me with a Linux command to get this?
Instead of a line-oriented tool, use one that actually parses JSON. An example using jq:
$ json_value='{"A":{"id":4},"B":{"foo":0},"C":{"A":{"id":2}},"E":{"A":null},"F":{"foo":0}}'
$
$ jq -c '..|objects|.A//empty' <<< "$json_value"
{"id":4}
{"id":2}
..              # visit every node recursively
| objects       # keep only the objects
| .A // empty   # emit A's value, skipping null, false, and missing keys
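One caveat worth spelling out: the // operator treats false the same as null, so an "A": false would be dropped as well. If that matters, an explicit null check is one way around it. This is a sketch on a made-up variant of the data (the input and variable names are hypothetical); it pipes the JSON in with printf so it also works in shells without here-strings.

```shell
# Hypothetical variant of the data in which one "A" is false.
json_value='{"A":{"id":4},"C":{"A":false},"E":{"A":null}}'

# The alternative operator drops both null and false.
alt=$(printf '%s' "$json_value" | jq -c '.. | objects | .A // empty')

# An explicit null check keeps false and drops only null (or missing) values.
strict=$(printf '%s' "$json_value" | jq -c '.. | objects | .A | select(. != null)')

printf '%s\n' "$alt"
# prints {"id":4}
printf '%s\n' "$strict"
# prints {"id":4}
# then   false
```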

Almost automatically sorting keys with `jq`, but keep "id" key, if present, on top

Is there a way to sort the keys of a JSON using jq but keeping keys named "id" as first descendants on all trees? It's nice to have a way to easily compare JSON files to one another and normalizing key order and formatting is a great way to ensure they are easy to match, but sometimes the "id" key is the one we are looking for and it's not always easy to find if it's buried in the middle of the tree.
As an example, this:
{
"z-displacement": 3,
"absorption": 0.4,
"collections": [
{
"b": 12,
"a": 18,
"id": 190
},
{
"m": 22,
"id": 169,
"n": 3
}
],
"id": 256767
}
Would become something like:
{
"id": 256767,
"absorption": 0.4,
"collections": [
{
"id": 190,
"a": 18,
"b": 12
},
{
"id": 169,
"m": 22,
"n": 3
}
],
"z-displacement": 3
}
Assuming you are using jq 1.4 or later, the following will do what is requested for all the JSON objects in the input, not only those at the top level:
def reorder:
(if has("id") then {id} else null end) + (to_entries | sort | from_entries );
walk(if type == "object" then reorder else . end)
If your jq does not have walk/1, you can copy its def from the jq FAQ (https://github.com/stedolan/jq/wiki/FAQ) or from the "master" version of builtin.jq.
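Here is the filter run end to end, as a sketch that assumes a jq with walk built in (1.5 or later). The input is the example from the question with its JSON syntax slips repaired (the colon after "id" and the trailing commas); the shell variable names are mine.

```shell
# The question's example as valid JSON, on one line.
input='{"z-displacement":3,"absorption":0.4,"collections":[{"b":12,"a":18,"id":190},{"m":22,"id":169,"n":3}],"id":256767}'

sorted=$(printf '%s' "$input" | jq -c '
  # Put "id" first when present, then add the remaining keys in sorted order.
  def reorder:
    (if has("id") then {id} else null end) + (to_entries | sort | from_entries);
  walk(if type == "object" then reorder else . end)')

printf '%s\n' "$sorted"
# prints {"id":256767,"absorption":0.4,"collections":[{"id":190,"a":18,"b":12},{"id":169,"m":22,"n":3}],"z-displacement":3}
```

This works because adding an object to {id} keeps id in its original (first) position and only overwrites its value.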
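I have no idea how robust this is, and it only handles the top level: -S sorts the keys of every object, and {id} + . then moves the top-level id to the front (adding "id": null if there is none). An id inside a nested object stays in its sorted position.
jq -S '.' | jq '{id} + .'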