JSON path - extract all maps - json

I am trying to write a JSON path expression to extract all maps and submaps from a JSON structure. Considering the JSON:
{
"k1":"v1",
"arr": ["1","2","3" ,["7","8"] ],
"submap":
{
"a":"b",
"c":"d"
},
"submap_2":
{
"a_2":"b",
"c_2":"d",
"nested": { "x":"y" }
}
}
I would want to extract the elements "submap", "submap_2", "nested".
I've tried JSONPath expressions like:
$..*[?(@.length()>0 && @.*[0] empty true)]
This returns the structures I want, but also returns [ "7","8" ]. Is there any way to do this with JSONPath or is this better done in code?
(A neat JSONPath testing tool is here: http://jsonpath.herokuapp.com/)
(The specific implementation that I'm using is this one: https://github.com/jayway/JsonPath )

jq queries are often very similar to JSONPath queries, and I would strongly recommend that if at all possible you consider using jq.
Assuming the example data is in a file named example.json, the following invocation of jq produces the result you requested:
$ jq 'path(.. | select(type=="object")) | .[-1] | select(.)' example.json
"submap"
"submap_2"
"nested"
The output of the first filter (path(....)) consists of the full path expressions of the paths to all the JSON objects, including the top-level object itself. The remaining filters are needed to produce the exact output you requested. In practice, though, the full path expressions are probably more useful, so it might be helpful for you to see the output produced by the first filter:
$ jq -c 'path(.. | select(type=="object"))' example.json
[]
["submap"]
["submap_2"]
["submap_2","nested"]
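Since the question also asks whether this is better done in code, here is a minimal Python sketch of the same idea: walk the parsed document recursively and record the path to every object (dict), skipping arrays such as [ "7","8" ]. The function name object_paths is a hypothetical helper chosen for illustration, not part of any library.

```python
import json

def object_paths(node, path=()):
    """Yield the path (a tuple of keys/indices) to every dict inside node."""
    if isinstance(node, dict):
        yield path
        for key, value in node.items():
            yield from object_paths(value, path + (key,))
    elif isinstance(node, list):
        # Lists are not reported themselves, but may contain dicts.
        for index, value in enumerate(node):
            yield from object_paths(value, path + (index,))

doc = json.loads("""
{
  "k1": "v1",
  "arr": ["1", "2", "3", ["7", "8"]],
  "submap": {"a": "b", "c": "d"},
  "submap_2": {"a_2": "b", "c_2": "d", "nested": {"x": "y"}}
}
""")

paths = list(object_paths(doc))
# Drop the empty path (the top-level object) and keep only the last key.
names = [p[-1] for p in paths if p]
print(names)
```

As with the jq version, the full paths are likely more useful than the bare names, because a key such as "nested" could occur in more than one place.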

Related

jq get values from complex object

I have an object that looks like this
{
"my_list": [
{
"name": "an_image",
"image": "an_image.com"
},
{
"name": "another_image",
"image": "another_image.com"
},
...<more objects with image property>
],
"foo": {
"image": "foobar.io"
},
"bar": {
"image": "bar_image.io"
},
...<more objects with image property>
}
I'd like to get all of the image properties from each of the objects, and from each object in my_list and other lists that have objects that include an image property. So in this example I'd like to get
"an_image.com"
"another_image.com"
"foobar.io"
"bar_image.io"
We don't know the keys of any of these objects at runtime, so we can't reference my_list, foo, or bar in this example.
Previously we didn't have my_list in the object and jq '.[].image' worked, but now that results in jq: error (at bar.json:18): Cannot index array with string "image".
The problem is that we don't know the names of the objects that contain image properties, so we can't reference them explicitly, and now that we've added another element that's a list, we're running into type errors that I'm not sure how to solve.
I've tried various combinations of .[].image, but they all seem to run into issues with typing.
If you don't mind the terseness, you could perhaps go with:
jq '..|select(.image?).image'
You could select by objects and items of arrays:
jq '.[] | ., arrays[] | objects.image'
"an_image.com"
"another_image.com"
"foobar.io"
"bar_image.io"
Using recursive descent .. is more elegant:
jq '.. | .image? // empty'
If the input is large, you might want to consider streaming the data in:
$ jq --stream -r 'select(.[0][-1] == "image")[1] // empty' input.json
an_image.com
another_image.com
foobar.io
bar_image.io
When streamed, your input will be processed as path/value pairs for the most part. Filter the paths you want, then return the value.
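For comparison, the recursive-descent approach can be sketched in Python as follows. This is an illustration under assumptions: image_values is a hypothetical helper, the sample data omits the elided "more objects" entries, and unlike the jq filters using // empty it does not drop null or false values.

```python
def image_values(node):
    """Yield the value of every "image" key found anywhere inside node."""
    if isinstance(node, dict):
        if "image" in node:
            yield node["image"]
        for value in node.values():
            yield from image_values(value)
    elif isinstance(node, list):
        for item in node:
            yield from image_values(item)

doc = {
    "my_list": [
        {"name": "an_image", "image": "an_image.com"},
        {"name": "another_image", "image": "another_image.com"},
    ],
    "foo": {"image": "foobar.io"},
    "bar": {"image": "bar_image.io"},
}

images = list(image_values(doc))
for image in images:
    print(image)
```

Because the walk never refers to my_list, foo, or bar by name, it works regardless of which keys hold the objects.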

adding json objects from one file to another under single array using jq

I am new here, so sorry if I make any mistakes while asking the question.
I have a JSON file (File_1.json) that is updated every minute with JSON objects. All I want to do is copy these objects to another file under a single array using the jq command.
Samples of files
File_1.json:
{
"Id":"1",
"Name":"Kiran",
"Age":"12"
}
{
"Id":"2",
"Name":"Dileep",
"Age":"22"
}
Expected Output
[
{
"Id":"1",
"Name":"Kiran",
"Age":"12"
},
{
"Id":"2",
"Name":"Dileep",
"Age":"22"
}
]
I have tried using -s (slurp), but since the code runs once every minute, it's creating multiple arrays.
If you wanted simply to append the objects in File_1.json to an existing array in (say) output.json, you could write:
jq '. + [inputs]' output.json File_1.json
This presupposes output.json contains exactly one array (or the JSON value null). So to start, you could initialize output.json by running:
echo null > output.json
If you want to take the risk and overwrite output.json, you might like to use sponge:
jq '. + [inputs]' output.json File_1.json | sponge output.json
If you want to remove duplicates and don't mind sorting the objects, you could simply append | unique to the above jq filter. If retaining the order is important, then see
https://github.com/stedolan/jq/wiki/Cookbook#using-bag-to-implement-a-sort-free-version-of-unique
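If jq is not an option, the same append step can be sketched in Python. The helper names iter_json_values and append_all are hypothetical; the sketch relies on json.JSONDecoder.raw_decode to read a sequence of concatenated JSON objects, and it treats a missing array (None, like jq's null) as empty. Reading and writing the actual files is left out so the example stays self-contained.

```python
import json

def iter_json_values(text):
    """Yield each JSON value from a string of concatenated JSON documents."""
    decoder = json.JSONDecoder()
    pos = 0
    while True:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            return
        value, pos = decoder.raw_decode(text, pos)
        yield value

def append_all(existing, text):
    """Return existing (a list or None) extended with the values in text."""
    return (existing or []) + list(iter_json_values(text))

file_1 = """
{ "Id":"1", "Name":"Kiran", "Age":"12" }
{ "Id":"2", "Name":"Dileep", "Age":"22" }
"""

output = append_all(None, file_1)      # first run: start from "null"
output = append_all(output, file_1)    # a later run appends to the same array
print(json.dumps(output, indent=2))
```

As with the jq filter, running this repeatedly on an unchanged File_1.json appends the same objects again, so some form of de-duplication may still be needed.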

Extract URL from JSON using jq

I have the following JSON output available and I need to extract the value of href, which is the https URL, using the jq processor.
I have tried using
jq -r .links.urn:vodafoneid:follow.hrefs
However, this does not work.
JSON Output:
{
"links":{
"urn:somedomainid:follow":{
"href":"https://abc.somedomain.com/ula/login?service=IDGW&channel=WEB&usecaseid=a0b51311-d14b-4733-9e6b-ba5f5deec05f&opco=DE&nonce=89e31cde-fecc-41e1-91d6-1f9f84f9c136&acr_values=explicit&scopes=phone_number&returnUrl=https%3A%2F%2Fidgw.somedomain.com%2Fauthorize%23state%3Da0b51311-d14b-4733-9e6b-ba5f5deec05f",
"type":"text/html"
}
},
"context":"FOLLOW"
}
You have a typo in the field name: your filter uses vodafoneid, but the key in the JSON is urn:somedomainid:follow (and the property is href, not hrefs). More generally, to access a field whose name contains special characters such as :, quote the field name as below.
jq --raw-output '.links."urn:somedomainid:follow".href'

How to write jq script to extract elements that might appear as singleton or list?

How can one write a jq query that will extract a property from an element that may appear as singleton or list?
For example, extract the URL property from the creator in both example JSON strings below.
Example #1:
{
"#type": "example1",
"creator":{
"#type":"Organization",
"url": "https://www.ncei.noaa.gov/"
}
}
Example #2:
{
"#type": "example2",
"creator": [{
"#type":"Person",
"url": "https://www.example.com/homepage"
},
{
"url": "https://www.example.com/another"
}]
}
I have tried using .creator for the first one and .creator[] for the second one, but these two are not compatible. Is there a way to write so that it works for both examples?
One possibility that is very straightforward is simply to test whether .creator is an array or not:
if .creator|type == "array" then .creator[] else .creator end
| .url
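The same type test can be sketched in Python: wrap a singleton in a list so that both shapes are handled by one loop. The function name creator_urls is hypothetical, and unlike the jq filter this sketch returns an empty list (rather than null) when creator is missing.

```python
def creator_urls(doc):
    """Return the url of each creator, whether creator is an object or a list."""
    creator = doc.get("creator")
    # Normalize: a single object becomes a one-element list.
    items = creator if isinstance(creator, list) else [creator]
    return [item.get("url") for item in items if isinstance(item, dict)]

example1 = {
    "#type": "example1",
    "creator": {"#type": "Organization", "url": "https://www.ncei.noaa.gov/"},
}
example2 = {
    "#type": "example2",
    "creator": [
        {"#type": "Person", "url": "https://www.example.com/homepage"},
        {"url": "https://www.example.com/another"},
    ],
}

urls1 = creator_urls(example1)
urls2 = creator_urls(example2)
print(urls1)
print(urls2)
```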
Streaming parser
At the other end of the spectrum of straightforwardness, here is another possibility that would be relevant if (a) the input JSON document was ginormous, and (b) the goal is to list all .url values that occur as immediate children of .creator, no matter where the keys are located in the input JSON document. The invocation would use the --stream command-line option, e.g.
jq --stream -f filter.jq input.json
The jq filter in this case is:
select(length==2 and
(.[0][-1] == "url") and
last( .[0][:-1][]| strings ) == "creator")
| .[1]

How do I simplify a JSON object using JQ?

I've got a huge JSON object and I want to filter it down to a small percentage of the available fields. I've searched some similar questions, but the one I found deals with an array of objects. I have a JSON object that looks something like:
{
"timestamp":1455408955250999808,
"client":
{
"ip":"76.72.172.208",
"srcPort":0,
"country":"us",
"deviceType":"desktop"},
"clientRequest":
{
"bytes":410,
"bodyBytes":0}
}
What I'm trying to do is create a new JSON object that looks like:
{
"timestamp":1455408955250999808,
"client":
{
"ip":"76.72.172.208"
},
"clientRequest":
{
"bytes":410
}
}
So effectively filter down the data. I've tried:
| jq 'map({client.ip: .client.ip, timestamp: .timestamp})' and I continue to get:
jq: error (at <stdin>:0): Cannot index number with string "client"
Even the most simple | jq 'map({timestamp: .timestamp})' is showing the same error.
I thought I could access the K,V pairs and use the map function as the person did for his array in the question linked above. Any help much appreciated.
Huzzah. Simple enough really :)
cat LogSample.txt | jq '. | {Id: .Id, client: {ip: .client.ip}}'
Basically define the object yourself :)
It looks like it will be simplest if you construct the object you want. Based on your example, you could do so using the following filter:
{ timestamp,
client: { ip: .client.ip },
clientRequest: {bytes: .clientRequest.bytes }
}
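By contrast, map applies its filter to each element of its input (for an object, to each value), so .timestamp ends up being applied to values such as the timestamp number rather than to the object as a whole, which is why jq reports that it cannot index a number.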
Please also note that jq provides direct ways to remove keys as well, e.g. using del/1.
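The "construct the object you want" approach translates directly to other languages. Here is a minimal Python sketch under the assumption that the record has the shape shown in the question; simplify is a hypothetical helper, and missing fields come out as None rather than raising an error.

```python
import json

def simplify(record):
    """Build a reduced copy of record containing only the wanted fields."""
    return {
        "timestamp": record.get("timestamp"),
        "client": {"ip": record.get("client", {}).get("ip")},
        "clientRequest": {"bytes": record.get("clientRequest", {}).get("bytes")},
    }

record = json.loads("""
{
  "timestamp": 1455408955250999808,
  "client": {"ip": "76.72.172.208", "srcPort": 0,
             "country": "us", "deviceType": "desktop"},
  "clientRequest": {"bytes": 410, "bodyBytes": 0}
}
""")

small = simplify(record)
print(json.dumps(small, indent=2))
```

Listing the wanted fields explicitly keeps the output stable even if new fields are later added to the input.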