jq get values from complex object - json

I have an object that looks like this
{
"my_list": [
{
"name": "an_image",
"image": "an_image.com"
},
{
"name": "another_image",
"image": "another_image.com"
},
...<more objects with image property>
],
"foo": {
"image": "foobar.io"
},
"bar": {
"image": "bar_image.io"
},
...<more objects with image property>
}
I'd like to get all of the image properties from each of the objects, and from each object in my_list and other lists that have objects that include an image property. So in this example I'd like to get
"an_image.com"
"another_image.com"
"foobar.io"
"bar_image.io"
We don't know the keys of any of these objects at runtime, so we can't reference my_list, foo, or bar in this example.
Previously we didn't have my_list in the object and jq '.[].image' worked, but now that results in jq: error (at bar.json:18): Cannot index array with string "image".
The problem is that we don't know the names of the objects that contain image properties, so we can't reference them explicitly. Now that we've added another element that's a list, we're running into type errors that I'm not sure how to solve.
I've tried various combinations of .[].image, but they all seem to run into issues with typing.

If you don't mind the terseness, you could perhaps go with:
jq '..|select(.image?).image'

You could select by objects and items of arrays:
jq '.[] | ., arrays[] | objects.image'
"an_image.com"
"another_image.com"
"foobar.io"
"bar_image.io"

Using recursive descent .. is more elegant:
jq '.. | .image? // empty'
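To see why the recursive-descent form sidesteps the type error, here is a small sketch run against a trimmed copy of the sample input (the shell variable names are only for illustration). The .. operator visits every value in the document, .image? suppresses the error raised on arrays and strings, and // empty drops the null results from objects that have no image key. One caveat: // also discards false, so an image property whose value is false or null would be dropped as well.

```shell
# Trimmed copy of the question's input; the variable names are illustrative.
input='{
  "my_list": [
    {"name": "an_image", "image": "an_image.com"},
    {"name": "another_image", "image": "another_image.com"}
  ],
  "foo": {"image": "foobar.io"},
  "bar": {"image": "bar_image.io"}
}'

# .. emits every value; .image? ignores non-objects; // empty drops nulls.
images=$(printf '%s' "$input" | jq -r '.. | .image? // empty')
printf '%s\n' "$images"
```

Because nothing in the filter names my_list, foo, or bar, it should keep working if more lists or objects are added at any depth.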

If the input is large, you might want to consider streaming the data in:
$ jq --stream -r 'select(.[0][-1] == "image")[1] // empty' input.json
an_image.com
another_image.com
foobar.io
bar_image.io
When streamed, your input will be processed as path/value pairs for the most part. Filter the paths you want, then return the value.
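To make the path/value idea concrete, this sketch prints the raw stream events for a tiny input (a cut-down stand-in for the real file, not the full sample). Leaf values arrive as [path, value] pairs, while the closing of each object arrives as a one-element [path] event, which is why the filter above needs // empty to drop the null it gets from indexing [1] on those.

```shell
# Tiny illustrative input; only one object with an image key.
events=$(printf '%s' '{"foo":{"image":"foobar.io"}}' | jq -c --stream '.')
printf '%s\n' "$events"
# [["foo","image"],"foobar.io"]   <- leaf event: [path, value]
# [["foo","image"]]               <- closing event for the inner object
# [["foo"]]                       <- closing event for the top-level object
```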

Related

How Can I Create a Space-Delimited String from JSON?

I am doing some work with Azure DevOps (ADO) Variable Groups, and I would like to query an existing Variable Group to get all its variables and build a list of parameters to send to an ADO CLI method.
Here is the JSON representation of an existing Variable Group:
{
"authorized": true,
"description": "test",
"name": "TESTGROUP",
"providerData": null,
"type": "Vsts",
"variables": {
"app_container_environment": {
"value": "dev"
},
"aws_region": {
"value": "us-west-2"
}
}
}
Problem:
What I'd like to do is use jq to read each variable definition and extract the variable name and value. Then, I'd build a string prefixed by "--variables" followed by a list of all the key/value pairs, space-delimited as follows:
--variables app_container_environment="dev" aws_region="us-west-2"
Note that the list begins with "--variables" followed by a key/value pair delimited by a space between each key/value pair.
I have tried to use join, which is kind of close. But the main problem I'm having is due to the way the JSON is structured. I'm not sure how to refer to the variable name. The value element is easy to get, but I can't seem to get its parent (i.e. the variable name). For example, the variable name for the value "us-west-2" is "aws_region".
How would I do this?
With your sample, the invocation:
jq -r '
.variables
| [to_entries[]
| "\(.key)=\"\(.value.value|tostring)\""]
| "--variables " + join(" ")
' sample.json
produces:
--variables app_container_environment="dev" aws_region="us-west-2"
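The piece that recovers the "parent" name is to_entries, which turns an object into an array of key/value pairs. As a rough sketch (the shell variable names are illustrative, and the sample is reduced to its variables field), this shows the intermediate array and then the final string built from it:

```shell
# Reduced copy of the Variable Group JSON; only the variables field matters here.
sample='{"variables":{"app_container_environment":{"value":"dev"},"aws_region":{"value":"us-west-2"}}}'

# Step 1: each variable becomes {"key": <name>, "value": <original object>}.
entries=$(printf '%s' "$sample" | jq -c '.variables | to_entries')
printf '%s\n' "$entries"

# Step 2: format each pair as name="value" and join with spaces.
args=$(printf '%s' "$sample" |
  jq -r '.variables | to_entries | map("\(.key)=\"\(.value.value)\"") | "--variables " + join(" ")')
printf '%s\n' "$args"
```

String interpolation with \( ) already converts non-string values to text, so the explicit tostring in the answer is a harmless safeguard rather than a requirement.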

adding json objects from one file to another under single array using jq

I am new here, so sorry if I make any mistakes while asking the question.
I have a JSON file (File_1.json) that is updated every minute with new JSON objects. All I want to do is copy these objects to another file, under a single array, using the jq command.
Samples of files
File_1.json:
{
"Id":"1",
"Name":"Kiran",
"Age":"12"
}
{
"Id":"2",
"Name":"Dileep",
"Age":"22"
}
Expected Output
[
{
"Id":"1",
"Name":"Kiran",
"Age":"12"
},
{
"Id":"2",
"Name":"Dileep",
"Age":"22"
}
]
I have tried using -s (slurp), but since the code runs once every minute, it creates multiple arrays.
If you wanted simply to append the objects in File_1.json to an existing array in (say) output.json, you could write:
jq '. + [inputs]' output.json File_1.json
This presupposes output.json contains exactly one array (or the JSON value null). So to start, you could initialize output.json by running:
echo null > output.json
If you want to take the risk and overwrite output.json, you might like to use sponge:
jq '. + [inputs]' output.json File_1.json | sponge output.json
If you want to remove duplicates and don't mind sorting the objects, you could simply append | unique to the above jq filter. If retaining the order is important, then see
https://github.com/stedolan/jq/wiki/Cookbook#using-bag-to-implement-a-sort-free-version-of-unique
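For a job that runs every minute, the append step might be sketched like this. Everything here is an assumption for illustration: the scratch directory, the helper name append_batch, and the third object are made up, and a temporary file plus mv stands in for sponge in case moreutils is not installed.

```shell
# Work in a scratch directory; all paths here are illustrative.
dir=$(mktemp -d)
echo null > "$dir/output.json"   # initialise once, as described above

# Hypothetical helper: append whatever File_1.json holds to the array.
append_batch() {
  jq '. + [inputs]' "$dir/output.json" "$dir/File_1.json" > "$dir/output.tmp" &&
    mv "$dir/output.tmp" "$dir/output.json"
}

# First run: File_1.json holds a stream of two objects (not an array).
printf '%s\n' '{"Id":"1","Name":"Kiran"}' '{"Id":"2","Name":"Dileep"}' > "$dir/File_1.json"
append_batch

# A later run: File_1.json now holds one new (made-up) object.
printf '%s\n' '{"Id":"3","Name":"Example"}' > "$dir/File_1.json"
append_batch

result=$(jq -c '.' "$dir/output.json")
printf '%s\n' "$result"
rm -rf "$dir"
```

After both runs output.json still contains a single array, now with three objects. If File_1.json is not truncated between runs, the same objects would be appended again, which is where the unique suggestion comes in.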

Is there a way to delete the same key from a list of objects within a nested field?

I'm setting up a devops pipeline so that certain data profiles stored in JSON format can be shifted across different servers. While downloading it from the current server I need to clean up all the protected keys and unique identifiers. I'm looking for the cleanest way to do the following in JQ
Input:
{
"TopKey1":{
"some_key":"some_value"
},
"TopKey2":{
"some_key2":"some_value2"
},
"KeytoSearch":[
{
"_id":"sdf",
"non_relevant_key1":"val"
},
{
"_id":"sdfdsdf",
"non_relevant_key2":"val"
},
{
"_id":"sgf",
"non_relevant_key3":"val"
}
]
}
Output:
{
"TopKey1":{
"some_key":"some_value"
},
"TopKey2":{
"some_key2":"some_value2"
},
"KeytoSearch":[
{
"non_relevant_key1":"val"
},
{
"non_relevant_key2":"val"
},
{
"non_relevant_key3":"val"
}
]
}
In python terms if this were a dictionary
for json_object in dictionary["KeytoSearch"]:
json_object.pop("_id")
I've tried combinations of map and del but can't seem to figure out the nested indexing with this. The error messages I get are along the lines of jq: error (at <stdin>:277): Cannot index string with string "_id", which tells me I haven't fundamentally understood how jq works or is meant to be used. Still, this is the route I need to go, because I'd rather avoid using a Python script to clean up JSON objects.
Going by your input JSON, and assuming the objects in your KeytoSearch array have other properties alongside the _id fields, you could simply do the following:
jq 'del(.KeytoSearch[]._id)'
Quotes around a property key containing _ are not needed. Keys containing certain meta-characters (e.g. a . in the key, which has to be accessed with quotes as ".id") must be quoted properly, but _ is not one of them.
"I've tried combinations of map and del"
Good! You were probably just missing the '|=' magic ingredient:
.KeytoSearch |= map( del(._id) )
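As a quick check that the two styles agree, here is a sketch that runs both against a shortened copy of the input (the shell variable names are illustrative). Note that jq key names are case-sensitive, so the path has to be spelled .KeytoSearch exactly as it appears in the data.

```shell
# Shortened copy of the question's input; variable names are illustrative.
input='{"TopKey1":{"some_key":"some_value"},"KeytoSearch":[{"_id":"sdf","non_relevant_key1":"val"},{"_id":"sdfdsdf","non_relevant_key2":"val"}]}'

# Path-based delete: remove _id from every element of the array.
with_del=$(printf '%s' "$input" | jq -c 'del(.KeytoSearch[]._id)')

# Update-assignment: rewrite the array in place, keeping the rest of the document.
with_update=$(printf '%s' "$input" | jq -c '.KeytoSearch |= map(del(._id))')

printf '%s\n%s\n' "$with_del" "$with_update"
```

Without |=, a filter such as .KeytoSearch | map(del(._id)) would output only the cleaned array and lose TopKey1 and the other top-level keys.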
Alternatively, you could use a walk-path Unix tool for JSON, jtc, and apply the changes directly to the source JSON file (-f):
bash $ jtc -fpw'[KeytoSearch]<_id>l:' file.json
bash $ jtc file.json
{
"KeytoSearch": [
{
"non_relevant_key1": "val"
},
{
"non_relevant_key2": "val"
},
{
"non_relevant_key3": "val"
}
],
"TopKey1": {
"some_key": "some_value"
},
"TopKey2": {
"some_key2": "some_value2"
}
}
If the given JSON snippet is part of a larger JSON document (and [KeytoSearch] is not addressable from the root), replace it with the search lexeme <KeytoSearch>l.
PS. Disclosure: I'm the creator of the jtc tool.

How to write jq script to extract elements that might appear as singleton or list?

How can one write a jq query that will extract a property from an element that may appear as singleton or list?
For example, extract the URL property from the creator in both example JSON strings below.
Example #1:
{
"@type": "example1",
"creator":{
"@type":"Organization",
"url": "https://www.ncei.noaa.gov/"
}
}
Example #2:
{
"@type": "example2",
"creator": [{
"@type":"Person",
"url": "https://www.example.com/homepage"
},
{
"url": "https://www.example.com/another"
}]
}
I have tried using .creator for the first one and .creator[] for the second one, but these two are not compatible. Is there a way to write so that it works for both examples?
One possibility that is very straightforward is simply to test whether .creator is an array or not:
if .creator|type == "array" then .creator[] else .creator end
| .url
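To check that one filter really covers both shapes, here is a sketch that runs it against reduced versions of the two examples (only the creator field is kept, and the shell variable names are illustrative). It also tries a terser variant built from the arrays and objects builtins, which is my own alternative phrasing rather than part of the answer above.

```shell
# Reduced versions of Example #1 and Example #2; only creator is kept.
single='{"creator":{"url":"https://www.ncei.noaa.gov/"}}'
list='{"creator":[{"url":"https://www.example.com/homepage"},{"url":"https://www.example.com/another"}]}'

# The explicit type test from the answer.
filter='if .creator|type == "array" then .creator[] else .creator end | .url'
one=$(printf '%s' "$single" | jq -r "$filter")
many=$(printf '%s' "$list" | jq -r "$filter")

# Terser variant: arrays[] unpacks a list, objects passes a singleton through.
terse='.creator | (arrays[], objects) | .url'
one_terse=$(printf '%s' "$single" | jq -r "$terse")
many_terse=$(printf '%s' "$list" | jq -r "$terse")

printf '%s\n' "$one" "$many"
```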
Streaming parser
At the other end of the spectrum of straightforwardness, here is another possibility that would be relevant if (a) the input JSON document was ginormous, and (b) the goal is to list all .url values that occur as immediate children of .creator, no matter where the keys are located in the input JSON document. The invocation would use the --stream command-line option, e.g.
jq --stream -f filter.jq input.json
The jq filter in this case is:
select(length==2 and
(.[0][-1] == "url") and
last( .[0][:-1][]| strings ) == "creator")
| .[1]
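As a sanity check, the streaming filter can be tried inline on a reduced copy of Example #2 instead of going through a filter.jq file (the shell variable names are illustrative). Each leaf arrives as a [path, value] event; the filter keeps events whose path ends in "url" and whose nearest enclosing key, skipping array indices, is "creator".

```shell
# Reduced copy of Example #2; only creator is kept.
list='{"creator":[{"url":"https://www.example.com/homepage"},{"url":"https://www.example.com/another"}]}'

# Same filter as above, passed on the command line rather than via -f.
filter='select(length==2 and
  (.[0][-1] == "url") and
  last( .[0][:-1][] | strings ) == "creator")
| .[1]'

urls=$(printf '%s' "$list" | jq --stream -r "$filter")
printf '%s\n' "$urls"
```

The length==2 test is what excludes the one-element closing events, so no // empty is needed here.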

JSON path - extract all maps

I am trying to write a JSON path expression to extract all maps and submaps from a JSON structure. Considering the JSON:
{
"k1":"v1",
"arr": ["1","2","3" ,["7","8"] ],
"submap":
{
"a":"b",
"c":"d"
},
"submap_2":
{
"a_2":"b",
"c_2":"d",
"nested": { "x":"y" }
}
}
I would want to extract the elements "submap", "submap_2", "nested".
I've tried JSONPath expressions like:
$..*[?(@.length()>0 && @.*[0] empty true)]
This returns the structures I want, but also returns [ "7","8" ]. Is there any way to do this with JSONPath or is this better done in code?
(A neat JSONPath testing tool is here: http://jsonpath.herokuapp.com/)
(The specific implementation that I'm using is this one: https://github.com/jayway/JsonPath )
jq queries are often very similar to JSONPath queries, and I would strongly recommend that if at all possible you consider using jq.
Assuming the example data is in a file named example.json, the following invocation of jq produces the result you requested:
$ jq 'path(.. | select(type=="object")) | .[-1] | select(.)' example.json
"submap"
"submap_2"
"nested"
The output of the first filter (path(....)) consists of the full path expressions of the paths to all the JSON objects, including the top-level object itself. The remaining filters are needed to produce the exact output you requested. In practice, though, the full path expressions are probably more useful, so it might be helpful for you to see the output produced by the first filter:
$ jq -c 'path(.. | select(type=="object"))' example.json
[]
["submap"]
["submap_2"]
["submap_2","nested"]
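If only the key names are wanted, a slightly shorter spelling might be the paths(f) builtin, which never emits the empty root path and so makes the trailing select(.) unnecessary. This is a sketch of an alternative, not the filter the answer above uses, and the shell variable names are illustrative:

```shell
# The example document from the question, on one line.
example='{"k1":"v1","arr":["1","2","3",["7","8"]],"submap":{"a":"b","c":"d"},"submap_2":{"a_2":"b","c_2":"d","nested":{"x":"y"}}}'

# paths(f) emits the path of every non-root value for which f is true.
names=$(printf '%s' "$example" | jq -r 'paths(type == "object") | .[-1]')
printf '%s\n' "$names"
```

The nested array ["7","8"] is skipped because its type is "array", which is the distinction the JSONPath attempt could not express.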