Add values to an array of keys with jq

I have a simple JSON array:
[
"smoke-tests",
"other-tests"
]
I'd like to convert it to a simple JSON object:
{"smoke-tests": true,
"other-tests": true
}
I've tried several jq examples, but none seem to do what I want.
jq '.[] | walk(.key = true)' produces a compile error.

If you like the efficiency of reduce but don't want to use reduce explicitly:
. as $in | {} | .[$in[]] = true

$ s='["smoke-tests", "other-tests"]'
$ jq '[.[] | {(.): true}] | add' <<<"$s"
{
"smoke-tests": true,
"other-tests": true
}
Breaking down how that works: .[] | {(.): true} converts each item into a dictionary mapping the value (as a key) to true. Surrounding that in [ ] means we generate a list of such objects; sending that to add combines them into a single object.

Here is a solution using add. It's close to Charles's solution but uses the behavior of Object construction to implicitly return multiple objects when used with an expression which returns multiple results.
[{(.[]):true}]|add

With reduce:
jq 'reduce .[] as $k ({}; .[$k]=true)' file
The output:
{
"smoke-tests": true,
"other-tests": true
}
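The filters in the answers above should all build the same object. A minimal way to check that, assuming jq 1.5 or later is on the PATH (the shell variable names are only illustrative):

```shell
input='["smoke-tests", "other-tests"]'

# The four filters from the answers above; -c prints compact output.
a=$(echo "$input" | jq -c '. as $in | {} | .[$in[]] = true')
b=$(echo "$input" | jq -c '[.[] | {(.): true}] | add')
c=$(echo "$input" | jq -c '[{(.[]): true}] | add')
d=$(echo "$input" | jq -c 'reduce .[] as $k ({}; .[$k] = true)')

# Each line should read {"smoke-tests":true,"other-tests":true}
printf '%s\n' "$a" "$b" "$c" "$d"
```

One difference worth knowing: for an empty input array, the add-based filters yield null (add of an empty array), while the reduce and assignment forms yield {}.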

Related

jq - Looping through json and concatenate the output to single string

I am currently learning how to use jq. I have a JSON file, and I am able to loop through it and filter out the values I need. However, I run into an issue when I try to combine the output into a single string instead of having it on multiple lines.
File svcs.json:
[
{
"name": "svc-A",
"run" : "True"
},
{
"name": "svc-B",
"run" : "False"
},
{
"name": "svc-C",
"run" : "True"
}
]
I was using jq to output the names of the services whose run value is True:
jq -r '.[] | select(.run=="True") | .name ' svcs.json
I was getting the output as follows:
svc-A
svc-C
I would like to get the output as a single string separated by commas.
Expected Output:
"svc-A,svc-C"
I tried using join, but was unable to get it to work so far.
The .[] expression explodes the array into a stream of its elements. You'll need to collect the transformed stream (the names) back into an array. Then you can use the @csv filter for the final output:
$ jq -r '[ .[] | select(.run=="True") | .name ] | @csv' svcs.json
"svc-A","svc-C"
But here's where map comes in handy to operate on an array's elements:
$ jq -r 'map(select(.run=="True") | .name) | @csv' svcs.json
"svc-A","svc-C"
Keep the array using map instead of decomposing it with .[], then join with a glue string:
jq -r 'map(select(.run=="True") | .name) | join(",")' svcs.json
svc-A,svc-C
If your goal is to create CSV output, there is a special @csv format that takes care of quoting, escaping, etc.
jq -r 'map(select(.run=="True") | .name) | @csv' svcs.json
"svc-A","svc-C"
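The difference between join(",") and @csv only shows up when a value itself contains a comma or a quote. A small sketch, using a hypothetical input in which one service name contains a comma (the input and the shell variable names are made up for illustration):

```shell
# Hypothetical input: the second service name contains a comma.
svcs='[{"name":"svc-A","run":"True"},{"name":"svc,B","run":"True"}]'

joined=$(echo "$svcs" | jq -r 'map(select(.run=="True") | .name) | join(",")')
csv=$(echo "$svcs" | jq -r 'map(select(.run=="True") | .name) | @csv')

echo "$joined"   # svc-A,svc,B      (the embedded comma is now ambiguous)
echo "$csv"      # "svc-A","svc,B"  (each string is quoted)
```

If the names are known to be free of commas and quotes, join(",") gives the bare string the question asks for; otherwise @csv is the safer choice.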

output arrays values as single object

I need to produce the following without hard-coding the array indexes, so that I don't need to know the input array's length:
echo '[{"name":"John", "age":30, "car":null},{"name":"Marc", "age":32, "car":null}]' | jq -r '{(.[0].name):.[0].age,(.[1].name):.[1].age}'
Produces :
{ "John": 30, "Marc": 32}
Use add to merge the objects.
jq '[ .[] | { (.name) : .age } ] | add'
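As a quick check of the add approach on the question's input, here is a sketch that also shows what happens when two entries share a name. The duplicate-name input is hypothetical, and jq is assumed to be on the PATH:

```shell
people='[{"name":"John","age":30,"car":null},{"name":"Marc","age":32,"car":null}]'

# Build one single-key object per element, then merge them with add.
merged=$(echo "$people" | jq -c '[ .[] | { (.name) : .age } ] | add')
echo "$merged"   # {"John":30,"Marc":32}

# Hypothetical input with a repeated name: the later entry overwrites the earlier one.
dup=$(echo '[{"name":"John","age":30},{"name":"John","age":31}]' \
  | jq -c '[ .[] | { (.name) : .age } ] | add')
echo "$dup"      # {"John":31}
```

Because object addition keeps the right-hand value on a key clash, this only gives one entry per person when the names are unique.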

Find out parent key when a certain child value is met with jq

Here's the json:
{
"vendors": {
"vendor1": {
"vendor_version": "LS TT1706-POL",
"vendor_name": "toyota"
},
"vendor2": {
"vendor_version": "LSGS-2002-RC",
"vendor_name": "honda"
},
"vendor3": {
"vendor_version": "LS1903",
"vendor_name": "suzuki"
}
}
}
I basically need the jq expression to get "vendor2" when I am given LSGS-2002-RC. I've tried using select, map, variables, and every combination thereof.
Here is something that didn't work:
jq -r '.vendors|to_entries[]|.value|select(.vendor_version=="LSGS-2002-RC")'
Basically, I always end up with the keys (vendor1, vendor2, etc.) stripped.
I am a little stumped. Note that the JSON structure and values cannot be altered. Thanks.
You almost had it, but the right filter should have been to use the select() function on the .value.vendor_version and pick out the key name
jq -r '.vendors | to_entries[] | select(.value.vendor_version=="LSGS-2002-RC").key'
Also, don't interpolate dynamic strings into the filter; pass them in as variables with --arg:
jq -r --arg vendor "LSGS-2002-RC" '.vendors | to_entries[] | select(.value.vendor_version == $vendor).key'
An alternate, less readable version than select() would be to use keys[]
.vendors | keys[] as $k | if .[$k].vendor_version == "LSGS-2002-RC" then $k else empty end
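To see why the key survives in the select-based filter, it can help to look at what to_entries produces before anything is filtered. A sketch on a shortened copy of the question's data (two vendors only; the shell variable names are illustrative):

```shell
json='{"vendors":{"vendor1":{"vendor_version":"LS TT1706-POL","vendor_name":"toyota"},"vendor2":{"vendor_version":"LSGS-2002-RC","vendor_name":"honda"}}}'

# to_entries wraps each member as {"key": ..., "value": ...}, so the key
# stays available as long as the filter does not step into .value.
first=$(echo "$json" | jq -c '.vendors | to_entries[0]')
echo "$first"

# Select on the nested field, then read .key from the same entry.
key=$(echo "$json" | jq -r --arg vendor "LSGS-2002-RC" \
  '.vendors | to_entries[] | select(.value.vendor_version == $vendor).key')
echo "$key"   # vendor2
```

The attempt in the question piped through .value before select, which is the step that discarded the key.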

jq add value of a key in nested array and given to a new key

I have a stream of JSON arrays like this
[{"id":"AQ","Count":0}]
[{"id":"AR","Count":1},{"id":"AR","Count":3},{"id":"AR","Count":13},
{"id":"AR","Count":12},{"id":"AR","Count":5}]
[{"id":"AS","Count":0}]
I want to use jq to get a new json like this
{"id":"AQ","Count":0}
{"id":"AR","Count":34}
{"id":"AS","Count":0}
34=1+3+13+12+5 which are in the second array.
I don't know how to describe it in detail. But the basic idea is shown in my example.
I use bash and prefer to use jq to solve this problem. Thank you!
If you want an efficient but generic solution that does NOT assume each input array has the same ids, then the following helper function makes a solution easy:
# Input: a JSON object representing the subtotals
# Output: the object augmented with additional subtotals
def adder(stream; id; filter):
reduce stream as $s (.; .[$s|id] += ($s|filter));
Assuming your jq has inputs, then the most efficient approach is to use it (but remember to use the -n command-line option):
reduce inputs as $row ({}; adder($row[]; .id; .Count) )
This produces:
{"AQ":0,"AR":34,"AS":0}
From here, it's easy to get the answer you want, e.g. using to_entries[] | {id: .key, Count: .value}
If your jq does not have inputs and if you don't want to upgrade, then use the -s option (instead of -n) and replace inputs by .[]
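Put together, the inputs-based approach might look like the following sketch. It assumes jq 1.5 or later (for inputs) and feeds the question's three arrays as a stream; the shell variable names are illustrative:

```shell
stream='[{"id":"AQ","Count":0}]
[{"id":"AR","Count":1},{"id":"AR","Count":3},{"id":"AR","Count":13},{"id":"AR","Count":12},{"id":"AR","Count":5}]
[{"id":"AS","Count":0}]'

# -n stops jq from consuming the first input itself, so that
# "inputs" sees every array in the stream.
totals=$(printf '%s\n' "$stream" | jq -n -c '
  def adder(stream; id; filter):
    reduce stream as $s (.; .[$s|id] += ($s|filter));
  reduce inputs as $row ({}; adder($row[]; .id; .Count))
  | to_entries[] | {id: .key, Count: .value}')

echo "$totals"
```

This should print one object per id, with AR summed to 34.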
Assuming the .id is the same in each array:
first + {Count: map(.Count) | add}
Or perhaps more intelligibly:
(map(.Count) | add) as $sum | first | .Count = $sum
Or more declaratively:
{ id: (first|.id), Count: (map(.Count) | add) }
It's a bit kludgey, but given your input:
jq -c '
reduce .[] as $item ({}; .[($item.id)] += ($item.Count))
| to_entries
| .[] | {"id": .key, "Count": .value}
'
Yields the output:
{"id":"AQ","Count":0}
{"id":"AR","Count":34}
{"id":"AS","Count":0}

Flatten nested JSON using jq

I'd like to flatten a nested json object, e.g. {"a":{"b":1}} to {"a.b":1} in order to digest it in solr.
I have 11 TB of JSON files which are both nested and contain dots in field names, meaning that neither elasticsearch (dots) nor solr (nested without the _childDocument_ notation) can digest them as is.
Another solution would be to replace dots in the field names with underscores and push the data to elasticsearch, but I have far better experience with solr, so I prefer the flatten solution (unless solr can digest those nested JSON documents as is?).
I would prefer elasticsearch only if its ingestion takes far less time than solr's, because my priority is ingesting as fast as I can (which is why I chose jq instead of scripting it in Python).
Kindly help.
EDIT:
I think the pair of examples 3&4 solves this for me:
https://lucidworks.com/blog/2014/08/12/indexing-custom-json-data/
I'll try soon.
You can also use the following jq command to flatten nested JSON objects in this manner:
[leaf_paths as $path | {"key": $path | join("."), "value": getpath($path)}] | from_entries
The way it works is: leaf_paths returns a stream of arrays which represent the paths on the given JSON document at which "leaf elements" appear, that is, elements which do not have child elements, such as numbers, strings and booleans. We pipe that stream into objects with key and value properties, where key contains the elements of the path array as a string joined by dots and value contains the element at that path. Finally, we put the entire thing in an array and run from_entries on it, which transforms an array of {key, value} objects into an object containing those key-value pairs.
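The two stages described above can be run separately to see the intermediate stream. This sketch uses paths(scalars), which a later answer describes as equivalent to leaf_paths, because leaf_paths is deprecated and may be missing from newer jq releases; the sample document and shell variable names are made up for illustration:

```shell
doc='{"a":{"b":1,"c":"x"},"d":true}'

# Stage 1: one path array per leaf value.
leaves=$(echo "$doc" | jq -c 'paths(scalars)')
echo "$leaves"
# ["a","b"]
# ["a","c"]
# ["d"]

# Stage 2: turn each path into a {key, value} pair and rebuild an object.
flat=$(echo "$doc" | jq -c \
  '[paths(scalars) as $path | {key: ($path | join(".")), value: getpath($path)}] | from_entries')
echo "$flat"   # {"a.b":1,"a.c":"x","d":true}
```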
This is just a variant of Santiago's jq:
. as $in
| reduce leaf_paths as $path ({};
. + { ($path | map(tostring) | join(".")): $in | getpath($path) })
It avoids the overhead of the key/value construction and destruction.
(If you have access to a version of jq later than jq 1.5, you can omit the "map(tostring)".)
Two important points about both these jq solutions:
Arrays are also flattened.
E.g. given {"a": {"b": [0,1,2]}} as input, the output would be:
{
"a.b.0": 0,
"a.b.1": 1,
"a.b.2": 2
}
If any of the keys in the original JSON contain periods, then key collisions are possible; such collisions will generally result in the loss of a value. This would happen, for example, with the following input:
{"a.b":0, "a": {"b": 1}}
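The collision can be reproduced directly. This sketch runs the reduce-based filter on that input, with paths(scalars) standing in for leaf_paths (an assumption made so it also runs on jq releases that no longer ship leaf_paths):

```shell
collided=$(echo '{"a.b":0, "a": {"b": 1}}' | jq -c '
  . as $in
  | reduce paths(scalars) as $path ({};
      . + { ($path | map(tostring) | join(".")): $in | getpath($path) })')

# Both leaves map to the key "a.b", so only one of the two values survives.
echo "$collided"
```

The result has a single key, which is how the loss can be detected after the fact: the number of keys in the output is smaller than the number of leaf paths in the input.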
Here is a solution that uses tostream, select, join, reduce and setpath
reduce ( tostream | select(length==2) | .[0] |= [join(".")] ) as [$p,$v] (
{}
; setpath($p; $v)
)
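To make the tostream version easier to follow, this sketch prints the raw events first and then the flattened result, using the question's small example. It assumes jq 1.5 or later for tostream and array destructuring; the shell variable names are illustrative:

```shell
doc='{"a":{"b":1}}'

# tostream emits [path, leaf] events for leaf values, plus shorter
# one-element events that mark where an object or array ends.
events=$(echo "$doc" | jq -c 'tostream')
echo "$events"

# Keep only the two-element leaf events, join each path into one key,
# and set that key on the accumulator.
flat=$(echo "$doc" | jq -c '
  reduce ( tostream | select(length==2) | .[0] |= [join(".")] ) as [$p,$v] (
    {}
    ; setpath($p; $v)
  )')
echo "$flat"   # {"a.b":1}
```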
I've recently written a script called jqg that flattens arbitrarily complex JSON and searches the results using a regex; to simply flatten the JSON, your regex would be '.', which matches everything. Unlike the answers above, the script will handle embedded arrays, false and null values, and can optionally treat empty arrays and objects ([] & {}) as leaf nodes.
$ jq . test/odd-values.json
{
"one": {
"start-string": "foo",
"null-value": null,
"integer-number": 101
},
"two": [
{
"two-a": {
"non-integer-number": 101.75,
"number-zero": 0
},
"true-boolean": true,
"two-b": {
"false-boolean": false
}
}
],
"three": {
"empty-string": "",
"empty-object": {},
"empty-array": []
},
"end-string": "bar"
}
$ jqg . test/odd-values.json
{
"one.start-string": "foo",
"one.null-value": null,
"one.integer-number": 101,
"two.0.two-a.non-integer-number": 101.75,
"two.0.two-a.number-zero": 0,
"two.0.true-boolean": true,
"two.0.two-b.false-boolean": false,
"three.empty-string": "",
"three.empty-object": {},
"three.empty-array": [],
"end-string": "bar"
}
jqg was tested using jq 1.6
Note: I am the author of the jqg script.
As it turns out, curl -XPOST 'http://localhost:8983/solr/flat/update/json/docs' -d @json_file does just this:
{
"a.b":[1],
"id":"24e3e780-3a9e-4fa7-9159-fc5294e803cd",
"_version_":1535841499921514496
}
EDIT 1: solr 6.0.1 with bin/solr -e cloud. collection name is flat, all the rest are default (with data-driven-schema which is also default).
EDIT 2: The final script I used: find . -name '*.json' -exec curl -XPOST 'http://localhost:8983/solr/collection1/update/json/docs' -d @{} \;.
EDIT 3: It is also possible to parallelize with xargs and to add the id field with jq: find . -name '*.json' -print0 | xargs -0 -n 1 -P 8 -I {} sh -c "cat {} | jq '. + {id: .a.b}' | curl -XPOST 'http://localhost:8983/solr/collection/update/json/docs' -d @-" where -P is the parallelism factor. I used jq to set an id so that multiple uploads of the same document won't create duplicates in the collection (while I was searching for the optimal value of -P, repeated uploads created duplicates in the collection).
As @hraban mentioned, leaf_paths does not work as expected (furthermore, it is deprecated). leaf_paths is equivalent to paths(scalars): it returns the paths of any values for which scalars produces a truthy value. scalars emits its input if it is a scalar and emits nothing otherwise. The problem is that null and false are scalars but not truthy, so their paths are dropped from the output. The following code does work, by checking the type of the values directly:
. as $in
| reduce paths(type != "object" and type != "array") as $path ({};
. + { ($path | map(tostring) | join(".")): $in | getpath($path) })
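The difference is easy to see on a document that contains null and false leaves. In this sketch the sample document and shell variable names are made up for illustration, and paths(scalars) stands in for leaf_paths:

```shell
doc='{"a":{"b":null,"c":false,"d":1}}'

# paths(scalars): null and false are emitted by scalars but are not truthy,
# so their paths are filtered out.
lossy=$(echo "$doc" | jq -c '
  . as $in
  | reduce paths(scalars) as $path ({};
      . + { ($path | map(tostring) | join(".")): $in | getpath($path) })')

# Type check: every value that is not an object or array counts as a leaf.
full=$(echo "$doc" | jq -c '
  . as $in
  | reduce paths(type != "object" and type != "array") as $path ({};
      . + { ($path | map(tostring) | join(".")): $in | getpath($path) })')

echo "$lossy"   # {"a.d":1}
echo "$full"    # {"a.b":null,"a.c":false,"a.d":1}
```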