Selecting 'name' with highest build number from json list

Here is my sample data, a list of objects in a storage bucket on Oracle Cloud:
{
"objects": [
{
"name": "rhel/"
},
{
"name": "rhel/app-3.9.6.629089.txt"
},
{
"name": "rhel/app-3.11.4.629600.txt"
}
]
}
The part of the value before the '/' is a folder name, after is a filename. The last number in the filename is a build number. The desired output is the name of the object with the highest build number in the rhel folder:
$ jq -r 'some_program' file.json
rhel/app-3.11.4.629600.txt
I can somewhat process the data to exclude the bare "rhel/" folder as follows:
$ jq -r '.objects[] | select(.name|test("rhel/."))' file.json
{
"name": "rhel/app-3.9.6.629089.txt"
}
{
"name": "rhel/app-3.11.4.629600.txt"
}
When I try to split this on the period jq throws an error:
$ jq -r '.objects[] | select(.name|test("rhel/.")) | split(".")' file.json
jq: error (at file.json:1): split input and separator must be strings
I was expecting to use 'map(tonumber)[-2]' on the result of the split and wrap the entirety in 'max_by()'.
How can I get closer to the desired output with jq?

[.objects[]
| select(.name|test("rhel/."))]
| max_by(.name|split(".")[-2]|tonumber)
produces:
{
"name": "rhel/app-3.11.4.629600.txt"
}
If you only want the names you could begin by extracting them:
[.objects[].name|select(test("rhel/."))]
| max_by(split(".")[-2]|tonumber)
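The error in the question comes from applying split to the whole object rather than to its .name string, which is why both filters above descend into .name before splitting. As a minimal sketch, here is the second filter run against the sample data from the question (the shell variable names json and latest are only for illustration):

```shell
# Sample data from the question, held in a shell variable for illustration.
json='{"objects":[{"name":"rhel/"},{"name":"rhel/app-3.9.6.629089.txt"},{"name":"rhel/app-3.11.4.629600.txt"}]}'

# Collect the names under rhel/, then pick the one whose second-to-last
# dot-separated field (the build number) is numerically largest.
latest="$(printf '%s' "$json" | jq -r '
  [.objects[].name | select(test("rhel/."))]
  | max_by(split(".")[-2] | tonumber)
')"

printf '%s\n' "$latest"
```

The tonumber step matters once build numbers differ in length: compared as strings, "99999" would sort above "629600".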

Generate csv files from a JSON

Unfortunately, I am having considerable difficulty generating three csv files from one JSON document. Maybe someone has a good hint on how I could do this. Thanks.
Here is the output. dropped1 and dropped2 can each contain several different addresses.
{
"result": {
"found": 0,
"dropped1": {
"address10": 1140
},
"rates": {
"total": {
"1min": 3579,
"5min": 1593,
"15min": 5312,
"60min": 1328
},
"dropped2": {
"address20": {
"1min": 9139,
"5min": 8355,
"15min": 2785,
"60min": 8196
}
}
},
"connections": 1
},
"id": "whatever",
"jsonrpc": "2.0"
}
The 3 csv files should be displayed in this form.
address10,1140
total,3579,1593,5312,1328
address20,9139,8355,2785,8196
If you decide to use jq, then unless there is some specific reason not to, I'd suggest invoking jq once for each of the three output files. The three invocations would then look like these:
jq -r '.result.dropped1 | [to_entries[][]] | @csv' > 1.csv
jq -r '.result.rates.total | ["total", .["1min"], .["5min"], .["15min"], .["60min"]] | @csv' > 2.csv
jq -r '.result.rates.dropped2
| to_entries[]
| [.key] + ( .value | [ .["1min"], .["5min"], .["15min"], .["60min"]] )
| @csv
' > 3.csv
If you can be sure the ordering of keys within the total and address20 objects is fixed and in the correct order, then the last two invocations can be simplified.
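A sketch of what that simplification might look like, assuming the keys really do arrive in the order 1min, 5min, 15min, 60min. In practice jq emits an object's values in input order, so .[] can stand in for the explicit key list. The inline json variable is only a trimmed stand-in for the real input. Note that @csv wraps string fields in double quotes.

```shell
# Trimmed copy of the sample input, kept in a variable for illustration.
json='{"result":{"dropped1":{"address10":1140},"rates":{"total":{"1min":3579,"5min":1593,"15min":5312,"60min":1328},"dropped2":{"address20":{"1min":9139,"5min":8355,"15min":2785,"60min":8196}}}}}'

# Second file: the label followed by the values in input order.
total_csv="$(printf '%s' "$json" | jq -r '.result.rates.total | ["total"] + [.[]] | @csv')"

# Third file: one row per address, again relying on input key order.
dropped2_csv="$(printf '%s' "$json" | jq -r '
  .result.rates.dropped2 | to_entries[] | [.key] + [.value[]] | @csv
')"

printf '%s\n%s\n' "$total_csv" "$dropped2_csv"
```

If the key order cannot be guaranteed, the explicit .["1min"], .["5min"], ... form above remains the safer choice.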
Did you try using this library?
https://www.npmjs.com/package/json-to-csv-stream
npm i json-to-csv-stream

Bash JSON compare two list and delete id

I have a JSON endpoint whose values I can fetch with curl, and a local yml file. I want to compute the difference between the two and delete the differing entry, using the id that belongs to its name on the JSON endpoint.
JSON's endpoint
[
{
"hosts": [
"server1"
],
"id": "qz9o847b-f07c-49d1-b1fa-e5ed0b2f0519",
"name": "V1_toto_a"
},
{
"hosts": [
"server2"
],
"id": "a6aa847b-f07c-49d1-b1fa-e5ed0b2f0519",
"name": "V1_tata_b"
},
{
"hosts": [
"server3"
],
"id": "a6d9ee7b-f07c-49d1-b1fa-e5ed0b2f0519",
"name": "V1_titi_c"
}
]
files.yml
---
instance:
toto:
name: "toto"
tata:
name: "tata"
Comparing the JSON endpoint with the local file, I want to delete the entry for titi by its id, because it is the difference between the two sources.
declare -a arr=(_a _b _c)
ar=$(cat files.yml | grep name | cut -d '"' -f2 | tr "\n" " ")
fileItemArray=($ar)
ARR_PRE=("${fileItemArray[@]/#/V1_}")
for i in "${arr[@]}"; do local_var+=("${ARR_PRE[@]/%/$i}"); done
remote_var=$(curl -sX GET "XXXX" | jq -r '.[].name | @sh' | tr -d \'\")
diff_=$(echo ${local_var[@]} ${remote_var[@]} | tr ' ' '\n' | sort | uniq -u)
output = titi
The code works, but I want to delete titi dynamically, using its id:
curl -X DELETE "XXXX" $id_titi
I am trying to do the deletion in a bash script, but I have no idea how to continue...
Your endpoint is not proper JSON as it has
commas after the .name field but no following field
no commas between the elements of the top-level array
If this is not just a typo from pasting your example into this question, then you'd need to address this first before proceeding. This is how it should look:
[
{
"hosts": [
"server1"
],
"id": "qz9o847b-f07c-49d1-b1fa-e5ed0b2f0519",
"name": "toto"
},
{
"hosts": [
"server2"
],
"id": "a6aa847b-f07c-49d1-b1fa-e5ed0b2f0519",
"name": "tata"
},
{
"hosts": [
"server3"
],
"id": "a6d9ee7b-f07c-49d1-b1fa-e5ed0b2f0519",
"name": "titi"
}
]
If your endpoint is proper JSON, try the following. It extracts the names from your .yml file just as you do (there are plenty of more efficient and less error-prone ways, but I'm trying to stay as close to your approach as possible), but instead of a Bash array it uses jq to generate a JSON array, which for Bash is a simple string. The curl output is handled the same way: a JSON array of names is extracted into a Bash string. Note that in both cases I use quotes, <var>="$(…)", to capture strings that may include spaces (although I also use jq's -c option to compact its output to a single line). The difference between the two is left entirely to jq, which can easily be fed the JSON arrays as variables, perform the subtraction, and output the result in your preferred format:
fromyml="$(cat files.yml | grep name | cut -d '"' -f2 | jq -Rnc '[inputs]')"
fromcurl="$(curl -sX GET "XXXX" | jq -c 'map(.name)')"
diff="$(jq -nr --argjson fromyml "$fromyml" --argjson fromcurl "$fromcurl" '
$fromcurl - $fromyml | .[]
')"
The Bash variable diff now contains a list of names only present in the curl output ($fromcurl - $fromyml), one per line (if, other than in your example, there happens to be more than one). If the curl output had duplicates, they will still be included (use $fromcurl - $fromyml | unique | .[] to get rid of them):
titi
As you can see, this solution has three calls to jq. I'll leave it to you to further reduce that number as it fits your general workflow (basically, it can be put together into one).
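One way the pieces might be combined, sketched under the assumption that the names in the .yml file match the endpoint's .name values exactly, as in the corrected example above. A single jq call reads the names as raw input lines and emits the id of every remote object whose name is not among them. The inline data stands in for the grep/cut pipeline and for the curl call, and the final loop only prints the command rather than running it.

```shell
# Stand-ins for the real sources: names as produced by the grep/cut pipeline,
# and the JSON as returned by curl (both copied from the example).
yml_names='toto
tata'
remote_json='[
  {"hosts":["server1"],"id":"qz9o847b-f07c-49d1-b1fa-e5ed0b2f0519","name":"toto"},
  {"hosts":["server2"],"id":"a6aa847b-f07c-49d1-b1fa-e5ed0b2f0519","name":"tata"},
  {"hosts":["server3"],"id":"a6d9ee7b-f07c-49d1-b1fa-e5ed0b2f0519","name":"titi"}
]'

# -R reads the names as raw lines, -n lets `inputs` collect them all;
# the remote JSON is passed in as a jq variable.
ids="$(printf '%s\n' "$yml_names" | jq -Rnr --argjson remote "$remote_json" '
  [inputs] as $known
  | $remote[]
  | select(.name as $n | $known | index($n) | not)
  | .id
')"

# Print, rather than run, one DELETE per id; the URL stays elided as in the question.
printf '%s\n' "$ids" | while IFS= read -r id; do
  [ -n "$id" ] || continue
  echo "curl -X DELETE \"XXXX\" $id"
done
```

Emitting .id instead of .name is what makes the result directly usable for the DELETE request.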
Getting the output of a program into a variable can be done using read.
perl -M5.010 -MYAML -MJSON::PP -e'
sub get_next_file { local $/; "".<> }
my %filter = map { $_->{name} => 1 } values %{ Load(get_next_file)->{instance} };
say for grep !$filter{$_}, map $_->{name}, @{ decode_json(get_next_file) };
' b.yaml a.json |
while IFS= read -r id; do
curl -X DELETE ..."$id"...
done
I used Perl here because the grep and cut pipeline you had is no way to parse a YAML file. The snippet requires the YAML Perl module to be installed.

How do I print a specific value of an array given a condition in jq if there is no key specified

I am trying to output the value for .metadata.name followed by the student's name in .spec.template.spec.containers[].students[] array using the regex test() function in jq.
I am having trouble retrieving the individual array value since there is no key specified for the students[] array.
For example, if I check the students[] array if it contains the word "Jeff", I would like the output to display as below:
student-deployment: Jefferson
What I have tried:
I've tried the command below, which somewhat works, but I am not sure how to get only the "Jefferson" value. It prints all of the students[] array values, which is not what I want. I am using PowerShell to run it.
kubectl get deployments -o json | jq -r '.items[] | select(.spec.template.spec.containers[].students[]?|test("\"^Jeff.\"")) | .metadata.name, "\":\t\"", .spec.template.spec.containers[].students'
Is there a way to print a specific value of an array given a condition in jq if there is no key specified? Also, would the solution work if there are multiple deployments?
The deployment template below is in json and I shortened it to only the relevant parts.
{
"apiVersion": "v1",
"items": [
{
"apiVersion": "apps/v1",
"kind": "Deployment",
"metadata": {
"name": "student-deployment",
"namespace": "default"
},
"spec": {
"template": {
"spec": {
"containers": [
{
"students": [
"Alice",
"Bob",
"Peter",
"Sally",
"Jefferson"
]
}
]
}
}
}
}
]
}
For this approach, we introduce a variable $pattern. You can set it with --arg pattern to your regex, e.g. "Jeff" or "^Al" or "e$", to have the student list filtered by test, or leave it empty to see all students.
Now, we iterate over all .items[] elements (i.e. over "all deployments"). For each one, we output the content of .metadata.name followed by a literal colon and a space. Then we iterate over all .spec.template.spec.containers[].students[], perform the pattern test, and concatenate each match to that prefix.
To print out raw strings instead of JSON, we use the -r option when calling jq.
kubectl get deployments -o json \
| jq --arg pattern "Jeff" -r '
.items[]
| .metadata.name + ": " + (
.spec.template.spec.containers[].students[]
| select(test($pattern))
)
'
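To illustrate how this behaves with several deployments, here is a sketch that feeds the same filter a hand-written input instead of kubectl output. The second deployment (other-deployment) and the extra student Jeffrey are hypothetical additions, not part of the question's data. Each matching student yields its own line, and a deployment without a match yields nothing.

```shell
# Hypothetical two-deployment input standing in for `kubectl get deployments -o json`.
deployments='{"items":[
  {"metadata":{"name":"student-deployment"},
   "spec":{"template":{"spec":{"containers":[{"students":["Alice","Jefferson","Jeffrey"]}]}}}},
  {"metadata":{"name":"other-deployment"},
   "spec":{"template":{"spec":{"containers":[{"students":["Bob"]}]}}}}
]}'

# Same filter as above; string addition runs once per matching student.
out="$(printf '%s' "$deployments" | jq --arg pattern "^Jeff" -r '
  .items[]
  | .metadata.name + ": " + (
      .spec.template.spec.containers[].students[]
      | select(test($pattern))
    )
')"

printf '%s\n' "$out"
```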
To retrieve the "students" array(s) in the input, you could use this filter:
.items[]
| paths(objects) as $p
| getpath($p)
| select( objects | has("students") )
| .students
You can then add additional filters to select the particular student(s) of interest, e.g.
| .[]
| select(test("Jeff"))
And then add any postprocessing filters, e.g.
| "student-deployment: \(.)"
Of course you can obtain the students array in numerous other ways.
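Put together, the fragments above might look like the following sketch. It departs from them in one respect, which is an assumption about what is wanted: the deployment name is captured in a variable ($name) rather than hard-coded in the output string. The input is a trimmed copy of the question's sample.

```shell
# Trimmed copy of the deployment JSON from the question.
json='{"items":[{"metadata":{"name":"student-deployment"},"spec":{"template":{"spec":{"containers":[{"students":["Alice","Bob","Peter","Sally","Jefferson"]}]}}}}]}'

found="$(printf '%s' "$json" | jq -r '
  .items[]
  | .metadata.name as $name          # remember the deployment name
  | paths(objects) as $p             # every path that leads to an object
  | getpath($p)
  | select(has("students"))          # keep objects carrying a students key
  | .students[]
  | select(test("Jeff"))
  | "\($name): \(.)"
')"

printf '%s\n' "$found"
```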

Extract a property slug from a json file

I would like to extract the names of the repositories and their size after getting the list of all the repositories in bitbucket using API via shell script. The command I'm using for that is
repo_list=$(cat repo.json | jq '.[] | .slug ' | sed 's/"//g')
repo.json contains:
{
"pagelen":100,
"size":494,
"values":[
{
"scm":"git",
"website":"",
"fork_policy":"no_public_forks",
"full_name":"org_name/ecomm-dist-cache",
"name":"ecomm-dist-cache",
"language":"java",
"created_on":"2014-11-18T19:01:25.741787+00:00",
"mainbranch":{
"type":"branch",
"name":"master"
},
"workspace":{
"slug":"org_name",
"type":"workspace",
"name":"Org Name ",
"uuid":"{xxxxxxxxxxxxx}"
},
"has_issues":true,
"updated_on":"2018-06-06T22:17:02.947496+00:00",
"size":105095621,
"type":"repository",
"slug":"ecomm-dist-cache",
"is_private":true,
"description":"Initial Migration of ecomm-dist-cache"
},
{
"scm":"git",
"website":"",
"full_name":"org_name/mqfte_ecommoutboundtransfertoweddingchannel",
"name":"MQFTE_ECOMMOutboundTransferToWeddingChannel",
"language":"",
"mainbranch":{
"type":"branch",
"name":"master"
},
"workspace":{
"slug":"org_name",
"type":"workspace",
"name":"Org Name ",
"uuid":"{xxxxxxxxxxxxx}"
},
"has_issues":false,
"size":99549,
"type":"repository",
"slug":"mqfte_ecommoutboundtransfertoweddingchannel",
"is_private":true,
"description":""
}
],
"page":1,
"next":"https://api.bitbucket.org/2.0/repositories/org_name?pagelen=100&page=2"
}
The error msg I'm getting is
Cannot index number with string "slug"
Expected result is
ecomm-dist-cache
mqfte_ecommoutboundtransfertoweddingchannel
jq -r '.values[].slug' repo.json
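-r prints raw strings without the quotation marks, so there is no need to pipe through sed. The original filter fails because .[] iterates over the values of the top-level object, and the first of them (pagelen) is a number, which cannot be indexed with "slug"; the repositories live in the values array.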
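Since the question also asks for the size of each repository, a sketch along the same lines could emit both fields per repository. The tab-separated layout via @tsv is an assumption about the desired format, and the inline json variable is a trimmed stand-in for repo.json.

```shell
# Trimmed copy of repo.json from the question, keeping only the fields used here.
json='{"pagelen":100,"size":494,"values":[
  {"slug":"ecomm-dist-cache","size":105095621},
  {"slug":"mqfte_ecommoutboundtransfertoweddingchannel","size":99549}
],"page":1}'

# One tab-separated line per repository: slug, then size.
repo_sizes="$(printf '%s' "$json" | jq -r '.values[] | [.slug, .size] | @tsv')"

printf '%s\n' "$repo_sizes"
```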

Linux CLI - How to get substring from JSON jq + grep?

I need to pull a substring from JSON. In the JSON doc below, I need the end of the value of jq '.[].networkProfile.networkInterfaces[].id'. In other words, I need just A10NICvw4konls2vfbw-data to pass to another command. I can't seem to figure out how to pull a substring using grep. I've seen regex examples out there but haven't been successful with them.
[
{
"id": "/subscriptions/blah/resourceGroups/IPv6v2/providers/Microsoft.Compute/virtualMachines/A10VNAvw4konls2vfbw",
"instanceView": null,
"licenseType": null,
"location": "centralus",
"name": "A10VNAvw4konls2vfbw",
"networkProfile": {
"networkInterfaces": [
{
"id": "/subscriptions/blah/resourceGroups/IPv6v2/providers/Microsoft.Network/networkInterfaces/A10NICvw4konls2vfbw-data",
"resourceGroup": "IPv6v2"
}
]
}
}
]
In your case, sub(".*/";"") will do the trick as * is greedy:
.[].networkProfile.networkInterfaces[].id | sub(".*/";"")
Try this:
jq -r '.[]|.networkProfile.networkInterfaces[].id | split("/") | last'
The -r tells jq to print the output in "raw" form; in this case, that means no double quotes around the string value.
As for the jq expression, after you access the id you want, piping it (still inside jq) through split("/") turns it into an array of the parts between slashes. Piping that through the last function (thanks, @Thor) returns just the last element of the array.
If you want to do it with grep here is one way:
jq -r '.[].networkProfile.networkInterfaces[].id' | grep -o '[^/]*$'
Output:
A10NICvw4konls2vfbw-data
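As a quick check, the two jq-only variants can be run side by side on the id from the question. This is only a sketch; the inline json variable is a trimmed stand-in for the real document.

```shell
# The interface id from the question's sample, wrapped in the same structure.
json='[{"networkProfile":{"networkInterfaces":[{"id":"/subscriptions/blah/resourceGroups/IPv6v2/providers/Microsoft.Network/networkInterfaces/A10NICvw4konls2vfbw-data","resourceGroup":"IPv6v2"}]}}]'

# Variant 1: strip everything up to and including the last slash.
with_sub="$(printf '%s' "$json" | jq -r '.[].networkProfile.networkInterfaces[].id | sub(".*/"; "")')"

# Variant 2: split on slashes and keep the final piece.
with_split="$(printf '%s' "$json" | jq -r '.[].networkProfile.networkInterfaces[].id | split("/") | last')"

printf '%s\n%s\n' "$with_sub" "$with_split"
```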