Linux CLI - How to get a substring from JSON with jq + grep?

I need to pull a substring from JSON. In the JSON doc below, I need the end of the value returned by jq '.[].networkProfile.networkInterfaces[].id'. In other words, I need just A10NICvw4konls2vfbw-data to pass to another command. I can't seem to figure out how to pull a substring using grep. I've seen regex examples out there but haven't been successful with them.
[
{
"id": "/subscriptions/blah/resourceGroups/IPv6v2/providers/Microsoft.Compute/virtualMachines/A10VNAvw4konls2vfbw",
"instanceView": null,
"licenseType": null,
"location": "centralus",
"name": "A10VNAvw4konls2vfbw",
"networkProfile": {
"networkInterfaces": [
{
"id": "/subscriptions/blah/resourceGroups/IPv6v2/providers/Microsoft.Network/networkInterfaces/A10NICvw4konls2vfbw-data",
"resourceGroup": "IPv6v2"
}
]
}
}
]

In your case, sub(".*/";"") will do the trick as * is greedy:
.[].networkProfile.networkInterfaces[].id | sub(".*/";"")

Try this:
jq -r '.[]|.networkProfile.networkInterfaces[].id | split("/") | last'
The -r tells jq to print the output in "raw" form. In this case, that means no double quotes around the string value.
As for the jq expression, after you access the id you want, piping it (still inside jq) through split("/") turns it into an array of the parts between slashes. Piping that through the last function (thanks, @Thor) returns just the last element of the array.
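A minimal sketch for trying the sub() and split()/last approaches side by side, assuming bash and jq are installed. The JSON is a trimmed copy of the question's sample, and the variable names (json, via_sub, via_split) are made up for illustration.

```shell
#!/usr/bin/env bash
# Trimmed copy of the sample document from the question.
json='[{"networkProfile":{"networkInterfaces":[{"id":"/subscriptions/blah/resourceGroups/IPv6v2/providers/Microsoft.Network/networkInterfaces/A10NICvw4konls2vfbw-data","resourceGroup":"IPv6v2"}]}}]'

# Greedy regex: ".*/" consumes everything up to and including the last slash.
via_sub=$(jq -r '.[].networkProfile.networkInterfaces[].id | sub(".*/"; "")' <<< "$json")

# Split on "/" and keep only the final array element.
via_split=$(jq -r '.[].networkProfile.networkInterfaces[].id | split("/") | last' <<< "$json")

echo "$via_sub"
echo "$via_split"
```

Both variables should hold the same NIC name, so either can be handed to the next command as "$via_sub" or "$via_split".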

If you want to do it with grep here is one way:
jq -r '.[].networkProfile.networkInterfaces[].id' | grep -o '[^/]*$'
Output:
A10NICvw4konls2vfbw-data
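To see what the grep pattern does in isolation, here is a small sketch that skips jq and feeds grep the id string directly. The variable names id and nic_name are illustrative only.

```shell
#!/usr/bin/env bash
# The id string as jq -r would print it (copied from the question's sample).
id='/subscriptions/blah/resourceGroups/IPv6v2/providers/Microsoft.Network/networkInterfaces/A10NICvw4konls2vfbw-data'

# [^/]*$ matches a run of non-slash characters anchored at the end of the line;
# -o prints only the matched part instead of the whole line.
nic_name=$(printf '%s\n' "$id" | grep -o '[^/]*$')

echo "$nic_name"
```

Capturing the result in a variable is one way to pass it on to another command, which is what the question asked for.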

Related

How to iterate a JSON array of objects with jq and grab multiple variables from each object in each loop

I need to grab variables from JSON properties.
The JSON array looks like this (GitHub API for repository tags), which I obtain from a curl request.
[
{
"name": "my-tag-name",
"zipball_url": "https://api.github.com/repos/path-to-my-tag-name",
"tarball_url": "https://api.github.com/repos/path-to-my-tag-name-tarball",
"commit": {
"sha": "commit-sha",
"url": "https://api.github.com/repos/path-to-my-commit-sha"
},
"node_id": "node-id"
},
{
"name": "another-tag-name",
"zipball_url": "https://api.github.com/repos/path-to-my-tag-name",
"tarball_url": "https://api.github.com/repos/path-to-my-tag-name-tarball",
"commit": {
"sha": "commit-sha",
"url": "https://api.github.com/repos/path-to-my-commit-sha"
},
"node_id": "node-id"
}
]
In my actual JSON there are 100s of objects like these.
While I loop each one of these I need to grab the name and the commit URL, then perform more operations with these two variables before I get to the next object and repeat.
I tried (with and without -r)
tags=$(curl -s -u "${GITHUB_USERNAME}:${GITHUB_TOKEN}" -H "Accept: application/vnd.github.v3+json" "https://api.github.com/repos/path-to-my-repository/tags?per_page=100&page=${page}")
for row in $(jq -r '.[]' <<< "$tags"); do
tag=$(jq -r '.name' <<< "$row")
# I have also tried with the syntax:
url=$(echo "${row}" | jq -r '.commit.url')
# do stuff with $tag and $url...
done
But I get errors like:
parse error: Unfinished JSON term at EOF at line 2, column 0 jq: error
(at :1): Cannot index string with string "name" } parse error:
Unmatched '}' at line 1, column 1
And from the terminal output it appears that it is trying to parse $row in a strange way, trying to grab .name from every substring? Not sure.
I am assuming the output from $(jq '.[]' <<< "$tags") could be valid JSON, from which I could again use jq to grab the object properties I need, but maybe that is not the case? If I output ${row} it does look like valid JSON to me, and I tried pasting the results in a JSON validator, everything seems to check out...
How do I grab the ".name" and ".commit.url" for each of these object before I move onto the next one?
Thanks
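The for loop breaks because the unquoted $(jq -r '.[]' <<< "$tags") is split on whitespace, so each iteration receives a fragment such as { or "name": instead of a whole object. It would also be better to avoid calling jq more than once. Consider, for example: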
while read -r name ; do
read -r url
echo "$name" "$url"
done < <( curl .... | jq -r '.[] | .name, .commit.url' )
where curl .... signifies the relevant invocation of curl.
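The loop above can be tried without network access. This sketch assumes bash (for process substitution) and jq, and replaces the curl call with a hard-coded stand-in shaped like the question's sample; the tags variable and the pairs array are only there for the demo.

```shell
#!/usr/bin/env bash
# Stand-in for the curl response: two tags shaped like the question's sample.
tags='[
  {"name": "my-tag-name",      "commit": {"sha": "commit-sha", "url": "https://api.github.com/repos/path-to-my-commit-sha"}},
  {"name": "another-tag-name", "commit": {"sha": "commit-sha", "url": "https://api.github.com/repos/path-to-my-commit-sha"}}
]'

pairs=()
# jq emits the name and the URL on alternating lines;
# each loop iteration consumes one line of each.
while read -r name; do
  read -r url
  # "do stuff" with $name and $url would go here.
  pairs+=("$name $url")
done < <(jq -r '.[] | .name, .commit.url' <<< "$tags")

printf '%s\n' "${pairs[@]}"
```

Because the loop reads from a process substitution rather than a pipe, variables set inside it are still visible afterwards. The approach relies on neither value containing a newline.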

How to find something in a json file using Bash

I would like to search a JSON file for some key or value, and have it print where it was found.
For example, when using jq to print out Firefox's extensions.json, I get something like this (using "..." here to skip long parts):
{
"schemaVersion": 31,
"addons": [
{
"id": "wetransfer#extensions.thunderbird.net",
"syncGUID": "{e6369308-1efc-40fd-aa5f-38da7b20df9b}",
"version": "2.0.0",
...
},
{
...
}
]
}
Say I would like to search for "wetransfer#extensions.thunderbird.net", and would like an output which shows me where it was found with something like this:
{ "addons": [ {"id": "wetransfer#extensions.thunderbird.net"} ] }
Is there a way to get that with jq or with some other json tool?
I also tried to simply list the various ids in that file, and hoped that I would get it with jq '.id', but that just returned null, because it apparently needs the full path.
In other words, I'm looking for a command-line json parser which I could use in a way similar to Xpath tools
The path() function comes in handy:
$ jq -c 'path(.. | select(. == "wetransfer#extensions.thunderbird.net"))' input.json
["addons",0,"id"]
The resulting path is interpreted as "In the addons field of the initial object, the first array element's id field matches". You can use it with getpath(), setpath(), delpaths(), etc. to get or manipulate the value it describes.
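As a rough illustration of feeding the path back into getpath(), the sketch below looks up a sibling field of the match. It assumes bash and jq 1.5 or later (for --argjson); the JSON is a cut-down stand-in, and the second addon entry is invented for the demo.

```shell
#!/usr/bin/env bash
# Cut-down stand-in for extensions.json; the second addon is invented.
json='{"schemaVersion":31,"addons":[{"id":"wetransfer#extensions.thunderbird.net","version":"2.0.0"},{"id":"other#example.invalid","version":"1.0"}]}'

# Step 1: find where the value lives.
found_path=$(jq -c 'path(.. | select(. == "wetransfer#extensions.thunderbird.net"))' <<< "$json")
echo "$found_path"

# Step 2: drop the trailing "id" key to address the enclosing addon object,
# then read its version field.
version=$(jq -r --argjson p "$found_path" 'getpath($p[:-1]) | .version' <<< "$json")
echo "$version"
```

Slicing the path with [:-1] is just one way to move from the matched field to its parent object.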
Using your example with modifications to make it valid JSON:
< input.json jq -c --arg s wetransfer#extensions.thunderbird.net '
paths as $p | select(getpath($p) == $s) | null | setpath($p;$s)'
produces:
{"addons":[{"id":"wetransfer#extensions.thunderbird.net"}]}
Note
If there are N paths to the given value, the above will produce N lines. If you want only the first, you could wrap everything in first(...).
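A small sketch of the difference, assuming bash and jq 1.5 or later (for first/1). The input is a toy document, invented here, in which the value x occurs at two paths.

```shell
#!/usr/bin/env bash
# Toy input (not from the question): the value "x" appears at two paths.
json='{"a":{"id":"x"},"b":[{"id":"x"}]}'

# One output line per matching path.
all=$(jq -c --arg s x \
  'paths as $p | select(getpath($p) == $s) | null | setpath($p; $s)' <<< "$json")

# Same filter wrapped in first(...): stops after the first match.
only_first=$(jq -c --arg s x \
  'first(paths as $p | select(getpath($p) == $s) | null | setpath($p; $s))' <<< "$json")

printf 'all matches:\n%s\n' "$all"
printf 'first only:\n%s\n' "$only_first"
```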
Listing all the "id" values
I also tried to simply list the various ids in that file
Assuming that "id" values of false and null are of no interest, you can print all the "id" values of interest using the jq filter:
.. | .id? // empty
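To see the filter at work, here is a sketch assuming bash and jq. The sample is a stand-in for extensions.json; the second addon and its nested null id are invented to show that null values are skipped.

```shell
#!/usr/bin/env bash
# Stand-in data; the second addon and the nested null id are invented.
json='{"schemaVersion":31,"addons":[{"id":"wetransfer#extensions.thunderbird.net"},{"id":"second#example.invalid","meta":{"id":null}}]}'

# ".." visits every value in the document; ".id?" silently skips values that
# cannot be indexed by key; "// empty" drops null and false results.
ids=$(jq -r '.. | .id? // empty' <<< "$json")

echo "$ids"
```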

Search and extract value using JQ command line processor

I have a JSON file very similar to the following:
[
{
"uuid": "832390ed-58ed-4338-bf97-eb42f123d9f3",
"name": "Nacho"
},
{
"uuid": "5b55ea5e-96f4-48d3-a258-75e152d8236a",
"name": "Taco"
},
{
"uuid": "a68f5249-828c-4265-9317-fc902b0d65b9",
"name": "Burrito"
}
]
I am trying to figure out how to use the JQ command line processor to first find the UUID that I input and based on that output the name of the associated item. So for example, if I input UUID a68f5249-828c-4265-9317-fc902b0d65b9 it should search the JSON file, find the matching UUID and then return the name Burrito. I am doing this in Bash. I realize it may require some outside logic in addition to JQ. I will keep thinking about it and put an update here in a bit. I know I could do it in an overly complicated way, but I know there is probably a really simple JQ method of doing this in one or two lines. Please help me.
https://shapeshed.com/jq-json/#how-to-find-a-key-and-value
You can use select:
jq -r --arg query a68f5249-828c-4265-9317-fc902b0d65b9 '.[] | select(.uuid == $query) | .name' tst.json
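For use from a script, the lookup could be wrapped in a function. This is a sketch assuming bash and jq; name_for_uuid is a hypothetical helper name, and the JSON is inlined here instead of being read from tst.json.

```shell
#!/usr/bin/env bash
# Sample data from the question, inlined so the sketch is self-contained.
json='[
  {"uuid": "832390ed-58ed-4338-bf97-eb42f123d9f3", "name": "Nacho"},
  {"uuid": "5b55ea5e-96f4-48d3-a258-75e152d8236a", "name": "Taco"},
  {"uuid": "a68f5249-828c-4265-9317-fc902b0d65b9", "name": "Burrito"}
]'

# name_for_uuid (hypothetical helper): prints the name whose uuid equals $1,
# or nothing at all if no entry matches.
name_for_uuid() {
  jq -r --arg uuid "$1" '.[] | select(.uuid == $uuid) | .name' <<< "$json"
}

name_for_uuid a68f5249-828c-4265-9317-fc902b0d65b9
```

Passing the value with --arg keeps it out of the filter text, so no shell quoting is needed inside the jq program.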

Bash jq modify json : get and set

I use jq to parse and modify a cURL response, and it works perfectly for all of my requirements except one. I wish to modify a key value in the JSON, like:
A) Input json
[
{
"id": 169,
"path": "dir1/dir2"
}
]
B) Output json
[
{
"id": 169,
"path": "dir1"
}
]
So the last directory is removed from the path. I use the script:
curl --header -X GET -k "${URL}" | jq '[.[] | {id: .id, path: .path_with_namespace}]' | jq '(.[] | .path) = "${.path%/*}"'
The last pipe is of course not correct, and this is where I am stuck. The point is to get the path value and modify it. Any help is appreciated.
One way to do this is to use split and join to process the path, and use |= to bind the correct expression to the .path attribute.
... | jq 'map(.path |= (split("/")[:-1] | join("/")))'
split("/") takes a string and returns an array
x[:-1] returns an array consisting of all but the last element of x
join("/") combines the elements of the incoming array with / to return a single string.
.path|=x takes the value of .path, feeds it through the filter x, and assigns the resulting value to .path again.
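The steps listed above can be checked one at a time. This sketch assumes bash and jq and uses the question's input; the variable names are illustrative, and map(...) is used so the result stays an array like the desired output B.

```shell
#!/usr/bin/env bash
json='[{"id": 169, "path": "dir1/dir2"}]'

# Stage 1: string -> array of path components.
parts=$(jq -c '.[0].path | split("/")' <<< "$json")

# Stage 2: drop the last component.
trimmed=$(jq -c '.[0].path | split("/")[:-1]' <<< "$json")

# Full update: rewrite .path in place for every element of the array.
result=$(jq -c 'map(.path |= (split("/")[:-1] | join("/")))' <<< "$json")

echo "$parts"
echo "$trimmed"
echo "$result"
```

Note that a path without any slash would come out as an empty string with this filter, which may or may not be what you want.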

Exclude column from jq json output

I would like to get rid of the timestamp field here using jq JSON processor.
[
{
"timestamp": 1448369447295,
"group": "employees",
"uid": "elgalu"
},
{
"timestamp": 1448369447296,
"group": "employees",
"uid": "mike"
},
{
"timestamp": 1448369786667,
"group": "services",
"uid": "pacts"
}
]
Whitelisting would also work for me, i.e. select uid, group
Ultimately what I would really like is a list with unique values like this:
employees,elgalu
employees,mike
services,pacts
If you just want to delete the timestamps you can use the del() function:
jq 'del(.[].timestamp)' input.json
However, to achieve the desired output I would not use del(). Since you know which fields should appear in the output, you can simply build an array from group and uid and then use the join() function:
jq -r '.[]|[.group,.uid]|join(",")' input.json
-r stands for raw output: jq will not print quotes around the values.
Output:
employees,elgalu
employees,mike
services,pacts
For the record, an alternative would be:
$ jq -r '.[] | "\(.group),\(.uid)"' input.json
(The white-listing approach makes it easy to rearrange the order, and this variant makes it easy to modify the spacing, etc.)
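A sketch comparing the join() and string-interpolation variants, and one possible way to get the "unique values" the question mentions. It assumes bash and jq; the input is the question's data with one duplicated mike entry added for the demo, and collecting the lines into an array for jq's unique is a suggestion of this sketch, not something taken from the answers.

```shell
#!/usr/bin/env bash
# Question's data plus one invented duplicate (second "mike") for the demo.
json='[
  {"timestamp": 1448369447295, "group": "employees", "uid": "elgalu"},
  {"timestamp": 1448369447296, "group": "employees", "uid": "mike"},
  {"timestamp": 1448369447297, "group": "employees", "uid": "mike"},
  {"timestamp": 1448369786667, "group": "services",  "uid": "pacts"}
]'

# Array + join and string interpolation yield the same lines.
joined=$(jq -r '.[] | [.group, .uid] | join(",")' <<< "$json")
interpolated=$(jq -r '.[] | "\(.group),\(.uid)"' <<< "$json")

# Collect the lines into an array, then sort and de-duplicate with unique.
unique_lines=$(jq -r '[.[] | [.group, .uid] | join(",")] | unique | .[]' <<< "$json")

echo "$unique_lines"
```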
The following example may be of interest to anyone who wants safe CSV (i.e. even if the values have embedded commas or newline characters):
$ jq -r '.[] | [.uid, .group] | @csv' input.json
"elgalu","employees"
"mike","employees"
"pacts","services"
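If the JSON is pretty-printed with one key per line, as above, sed is a simpler alternative. I arrived here with the same problem as the question's author, and this worked for me:
< file sed -e '/timestamp/d'
Note that this deletes every line containing the text timestamp, so it only suits input where the key sits on its own line and is not the last key in its object (otherwise the preceding line keeps a trailing comma).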
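A quick sketch of how much the sed approach depends on layout, assuming any POSIX sed. The two variables hold the same record, once pretty-printed and once compact; both are stand-ins for the file used above.

```shell
#!/usr/bin/env bash
# Same record in two layouts.
pretty='[
  {
    "timestamp": 1448369447295,
    "group": "employees",
    "uid": "elgalu"
  }
]'
compact='[{"timestamp":1448369447295,"group":"employees","uid":"elgalu"}]'

# One key per line: only the timestamp line is removed.
pretty_out=$(printf '%s\n' "$pretty" | sed -e '/timestamp/d')

# Everything on one line: the whole record is deleted along with the key.
compact_out=$(printf '%s\n' "$compact" | sed -e '/timestamp/d')

printf 'pretty-printed input:\n%s\n' "$pretty_out"
printf 'compact input: [%s]\n' "$compact_out"
```

For input whose layout is not under your control, the del() filter shown earlier is the more robust choice.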