Extract a property slug from a JSON file

I would like to extract the names of the repositories and their sizes after getting the list of all repositories in Bitbucket through the API in a shell script. The command I'm using for that is:
repo_list=$(cat repo.json | jq '.[] | .slug ' | sed 's/"//g')
repo.json contains:
{
"pagelen":100,
"size":494,
"values":[
{
"scm":"git",
"website":"",
"fork_policy":"no_public_forks",
"full_name":"org_name/ecomm-dist-cache",
"name":"ecomm-dist-cache",
"language":"java",
"created_on":"2014-11-18T19:01:25.741787+00:00",
"mainbranch":{
"type":"branch",
"name":"master"
},
"workspace":{
"slug":"org_name",
"type":"workspace",
"name":"Org Name ",
"uuid":"{xxxxxxxxxxxxx}"
},
"has_issues":true,
"updated_on":"2018-06-06T22:17:02.947496+00:00",
"size":105095621,
"type":"repository",
"slug":"ecomm-dist-cache",
"is_private":true,
"description":"Initial Migration of ecomm-dist-cache"
},
{
"scm":"git",
"website":"",
"full_name":"org_name/mqfte_ecommoutboundtransfertoweddingchannel",
"name":"MQFTE_ECOMMOutboundTransferToWeddingChannel",
"language":"",
"mainbranch":{
"type":"branch",
"name":"master"
},
"workspace":{
"slug":"org_name",
"type":"workspace",
"name":"Org Name ",
"uuid":"{xxxxxxxxxxxxx}"
},
"has_issues":false,
"size":99549,
"type":"repository",
"slug":"mqfte_ecommoutboundtransfertoweddingchannel",
"is_private":true,
"description":""
}
],
"page":1,
"next":"https://api.bitbucket.org/2.0/repositories/org_name? pagelen=100&page=2"
}
The error message I'm getting is
Cannot index number with string "slug"
Expected result is
ecomm-dist-cache
mqfte_ecommoutboundtransfertoweddingchannel

jq -r '.values[].slug' repo.json
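-r prints raw strings without the quotation marks, so there is no need to pipe the output through sed. The original filter failed because the top-level value is an object: .[] iterates over all of its values, including the numbers stored under pagelen and size, and .slug cannot be applied to a number. The repositories are in the values array, so that is what has to be iterated.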
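Since the question also asks for the repository sizes, here is a minimal sketch that prints both fields per repository. It assumes a trimmed-down copy of repo.json held in a shell variable (only the fields used here are kept) and uses jq's @tsv formatter to join slug and size with a tab.

```shell
# Trimmed-down stand-in for repo.json (only the fields used below).
json='{"pagelen":100,"size":494,"values":[
  {"slug":"ecomm-dist-cache","size":105095621},
  {"slug":"mqfte_ecommoutboundtransfertoweddingchannel","size":99549}]}'

# .values[] iterates over the repositories only; @tsv joins slug and size
# with a tab, and -r prints the resulting line without JSON quoting.
result="$(printf '%s' "$json" | jq -r '.values[] | [.slug, .size] | @tsv')"
printf '%s\n' "$result"
```

With the real file, the same filter can be run directly as jq -r '.values[] | [.slug, .size] | @tsv' repo.json.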

Bash JSON compare two list and delete id

I have a JSON endpoint whose content I can fetch with curl, and a local YAML file. I want to find the names that are present at the JSON endpoint but missing from the file, and delete those entries by their id.
JSON endpoint output:
[
{
"hosts": [
"server1"
],
"id": "qz9o847b-f07c-49d1-b1fa-e5ed0b2f0519",
"name": "V1_toto_a"
},
{
"hosts": [
"server2"
],
"id": "a6aa847b-f07c-49d1-b1fa-e5ed0b2f0519",
"name": "V1_tata_b"
},
{
"hosts": [
"server3"
],
"id": "a6d9ee7b-f07c-49d1-b1fa-e5ed0b2f0519",
"name": "V1_titi_c"
}
]
files.yml
---
instance:
toto:
name: "toto"
tata:
name: "tata"
Comparing the JSON endpoint with the local file, I want to delete the entry with the id of titi, because it is the difference between the two sources.
declare -a arr=(_a _b _c)
ar=$(cat files.yml | grep name | cut -d '"' -f2 | tr "\n" " ")
fileItemArray=($ar)
ARR_PRE=("${fileItemArray[@]/#/V1_}")
for i in "${arr[@]}"; do local_var+=("${ARR_PRE[@]/%/$i}"); done
remote_var=$(curl -sX GET "XXXX" | jq -r '.[].name | @sh' | tr -d \'\")
diff_=$(echo ${local_var[@]} ${remote_var[@]} | tr ' ' '\n' | sort | uniq -u)
output = titi
The code works, but I want to delete titi by its id dynamically:
curl -X DELETE "XXXX" $id_titi
I am trying to do the deletion in a Bash script, but I have no idea how to continue...
Your endpoint is not proper JSON as it has
commas after the .name field but no following field
no commas between the elements of the top-level array
If this is not just a typo from pasting your example into this question, you need to address it first before proceeding. This is what it should look like:
[
{
"hosts": [
"server1"
],
"id": "qz9o847b-f07c-49d1-b1fa-e5ed0b2f0519",
"name": "toto"
},
{
"hosts": [
"server2"
],
"id": "a6aa847b-f07c-49d1-b1fa-e5ed0b2f0519",
"name": "tata"
},
{
"hosts": [
"server3"
],
"id": "a6d9ee7b-f07c-49d1-b1fa-e5ed0b2f0519",
"name": "titi"
}
]
If your endpoint is proper JSON, try the following. It extracts the names from your .yml file just as you do (there are plenty of more efficient and less error-prone ways, but I'm trying to stay as close to your approach as possible), but instead of a Bash array it generates a JSON array using jq, which for Bash is a simple string. The curl output is handled in basically the same way: a JSON array of names is extracted into a Bash string. Note that in both cases I use quotes, <var>="$(…)", to capture strings that may include spaces (although I also use jq's -c option to compact its output to a single line). Computing the difference between the two is left entirely to jq, as it can easily be fed the JSON arrays as variables, perform the subtraction, and output the result in your preferred format:
fromyml="$(cat files.yml | grep name | cut -d '"' -f2 | jq -Rnc '[inputs]')"
fromcurl="$(curl -sX GET "XXXX" | jq -c 'map(.name)')"
diff="$(jq -nr --argjson fromyml "$fromyml" --argjson fromcurl "$fromcurl" '
$fromcurl - $fromyml | .[]
')"
The Bash variable diff now contains the names that are present only in the curl output ($fromcurl - $fromyml), one per line (in case, unlike in your example, there is more than one). If the curl output has duplicates, they will still be included (use $fromcurl - $fromyml | unique | .[] to get rid of them):
titi
As you can see, this solution has three calls to jq. I'll leave it to you to further reduce that number as it fits your general workflow (basically, it can be put together into one).
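The question ultimately needs the ids rather than the names, so here is a hedged sketch of one way to get them. The sample data in remote and fromyml is a hypothetical stand-in for the curl response and the YAML name list, the jq variable $keep is a name chosen for illustration, and the DELETE call is only printed because the real URL ("XXXX") is elided in the question.

```shell
# Stand-ins for the name list from files.yml and the raw endpoint response
# (hypothetical sample data mirroring the corrected JSON above).
fromyml='["toto","tata"]'
remote='[{"id":"qz9o847b-f07c-49d1-b1fa-e5ed0b2f0519","name":"toto"},
         {"id":"a6aa847b-f07c-49d1-b1fa-e5ed0b2f0519","name":"tata"},
         {"id":"a6d9ee7b-f07c-49d1-b1fa-e5ed0b2f0519","name":"titi"}]'

# Keep only the entries whose name differs from every name in $keep,
# then print their ids, one per line.
ids="$(printf '%s' "$remote" | jq -r --argjson keep "$fromyml" '
  .[] | select(.name as $n | $keep | all(. != $n)) | .id
')"

# Print the DELETE commands instead of running them.
printf '%s\n' "$ids" | while IFS= read -r id; do
  [ -n "$id" ] || continue
  echo curl -X DELETE "XXXX" "$id"
done
```

Selecting on the whole object instead of subtracting name arrays keeps the id available, which the array subtraction discards.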
Getting the output of a program into a variable can be done using read.
perl -M5.010 -MYAML -MJSON::PP -e'
sub get_next_file { local $/; "".<> }
# Names listed in the YAML file (first file argument).
my %filter = map { $_->{name} => 1 } values %{ Load(get_next_file)->{instance} };
# Print the id of every JSON entry whose name is not in the YAML file.
say $_->{id} for grep !$filter{ $_->{name} }, @{ decode_json(get_next_file) };
' b.yaml a.json |
while IFS= read -r id; do
curl -X DELETE ..."$id"...
done
I used Perl here because what you had is no way to parse a YAML file. The snippet requires the YAML Perl module to be installed.

Selecting 'name' with highest build number from json list

Here is my sample data, it's a list of objects in a storage bucket on Oracle cloud:
{
"objects": [
{
"name": "rhel/"
},
{
"name": "rhel/app-3.9.6.629089.txt"
},
{
"name": "rhel/app-3.11.4.629600.txt"
}
]
}
The part of the value before the '/' is a folder name, after is a filename. The last number in the filename is a build number. The desired output is the name of the object with the highest build number in the rhel folder:
$ jq -r 'some_program' file.json
rhel/app-3.11.4.629600.txt
I can somewhat process the data to exclude the bare "rhel/" folder as follows:
$ jq -r '.objects[] | select(.name|test("rhel/."))' file.json
{
"name": "rhel/app-3.9.6.629089.txt"
}
{
"name": "rhel/app-3.11.4.629600.txt"
}
When I try to split this on the period jq throws an error:
$ jq -r '.objects[] | select(.name|test("rhel/.")) | split(".")' file.json
jq: error (at file.json:1): split input and separator must be strings
I was expecting to use 'map(tonumber)[-2]' on the result of the split and wrap the entirety in 'max_by()'.
How can I get closer to the desired output with jq?
[.objects[]
| select(.name|test("rhel/."))]
| max_by(.name|split(".")[-2]|tonumber)
produces:
{
"name": "rhel/app-3.11.4.629600.txt"
}
If you only want the names you could begin by extracting them:
[.objects[].name|select(test("rhel/."))]
| max_by(split(".")[-2]|tonumber)
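The error in the question arises because split was applied to the whole object rather than to the string in .name. The names-only variant can be tried end to end with the following sketch, which assumes the sample data from the question is held in a shell variable instead of file.json.

```shell
# Sample data from the question, inlined instead of read from file.json.
json='{"objects":[{"name":"rhel/"},
                  {"name":"rhel/app-3.9.6.629089.txt"},
                  {"name":"rhel/app-3.11.4.629600.txt"}]}'

# Collect the file names under rhel/, then pick the one whose
# second-to-last dot-separated field (the build number) is largest.
result="$(printf '%s' "$json" | jq -r '
  [.objects[].name | select(test("rhel/."))]
  | max_by(split(".")[-2] | tonumber)
')"
printf '%s\n' "$result"   # rhel/app-3.11.4.629600.txt
```

The tonumber call matters once build numbers differ in length, since strings would otherwise be compared character by character.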

How do I print a specific value of an array given a condition in jq if there is no key specified

I am trying to output the value for .metadata.name followed by the student's name in .spec.template.spec.containers[].students[] array using the regex test() function in jq.
I am having trouble retrieving an individual array value, since the elements of the students[] array have no keys.
For example, if I check the students[] array if it contains the word "Jeff", I would like the output to display as below:
student-deployment: Jefferson
What I have tried:
I've tried the command below, which somewhat works, but I am not sure how to get only the "Jefferson" value. The command prints all of the students[] array values, which is not what I want. I am using PowerShell to run it.
kubectl get deployments -o json | jq -r '.items[] | select(.spec.template.spec.containers[].students[]?|test("\"^Jeff.\"")) | .metadata.name, "\":\t\"", .spec.template.spec.containers[].students'
Is there a way to print a specific value of an array given a condition in jq if there is no key specified? Also, would the solution work if there are multiple deployments?
The deployment template below is in json and I shortened it to only the relevant parts.
{
"apiVersion": "v1",
"items": [
{
"apiVersion": "apps/v1",
"kind": "Deployment",
"metadata": {
"name": "student-deployment",
"namespace": "default"
},
"spec": {
"template": {
"spec": {
"containers": [
{
"students": [
"Alice",
"Bob",
"Peter",
"Sally",
"Jefferson"
]
}
]
}
}
}
}
]
}
For this approach, we introduce a variable $pattern. You may set it with --arg pattern to your regex, e.g. "Jeff" or "^Al" or "e$", to have the student list filtered by test, or leave it empty to see all students.
Now, we iterate over all .items[] elements (i.e. over all deployments). For each one, we output the content of .metadata.name followed by a literal colon and a space. Then we iterate over all .spec.template.spec.containers[].students[], apply the pattern test, and concatenate each match to that prefix.
To print out raw strings instead of JSON, we use the -r option when calling jq.
kubectl get deployments -o json \
| jq --arg pattern "Jeff" -r '
.items[]
| .metadata.name + ": " + (
.spec.template.spec.containers[].students[]
| select(test($pattern))
)
'
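To try this filter without a cluster, the kubectl output can be replaced by inline JSON. The sketch below assumes a shortened version of the template from the question plus a second, hypothetical deployment (other-deployment) to show that deployments without a matching student produce no output.

```shell
# Shortened stand-in for `kubectl get deployments -o json`;
# "other-deployment" is hypothetical and has no matching student.
json='{"items":[
  {"metadata":{"name":"student-deployment"},
   "spec":{"template":{"spec":{"containers":[
     {"students":["Alice","Bob","Peter","Sally","Jefferson"]}]}}}},
  {"metadata":{"name":"other-deployment"},
   "spec":{"template":{"spec":{"containers":[
     {"students":["Carol","Dave"]}]}}}}]}'

# For each deployment, emit "<name>: <student>" for every matching student.
result="$(printf '%s' "$json" | jq --arg pattern "Jeff" -r '
  .items[]
  | .metadata.name + ": " + (
      .spec.template.spec.containers[].students[]
      | select(test($pattern))
    )
')"
printf '%s\n' "$result"   # student-deployment: Jefferson
```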
To retrieve the "students" array(s) in the input, you could use this filter:
.items[]
| paths(objects) as $p
| getpath($p)
| select( objects | has("students") )
| .students
You can then add additional filters to select the particular student(s) of interest, e.g.
| .[]
| select(test("Jeff"))
And then add any postprocessing filters, e.g.
| "student-deployment: \(.)"
Of course you can obtain the students array in numerous other ways.
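Put together, the three fragments above might look like the following sketch. It assumes a shortened copy of the template from the question, and, as a small variation that is not part of the fragments above, it binds .metadata.name to a variable ($name, chosen here for illustration) so the deployment name does not have to be hard-coded.

```shell
# Shortened stand-in for the deployment list from the question.
json='{"items":[
  {"metadata":{"name":"student-deployment"},
   "spec":{"template":{"spec":{"containers":[
     {"students":["Alice","Bob","Peter","Sally","Jefferson"]}]}}}}]}'

# Walk every object inside each item, keep those with a "students" key,
# filter the students, and prefix each match with the deployment name.
result="$(printf '%s' "$json" | jq -r '
  .items[]
  | .metadata.name as $name
  | paths(objects) as $p
  | getpath($p)
  | select(objects | has("students"))
  | .students[]
  | select(test("Jeff"))
  | "\($name): \(.)"
')"
printf '%s\n' "$result"   # student-deployment: Jefferson
```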

jq select error: "Cannot index string with string <object>"

command:
cat test.json | jq -r '.[] | select(.input[] | .["$link"] | contains("randomtext1")) | .id'
I was expecting both entries (a and b) to show up, since they both contain randomtext1.
Instead, I got the following output message:
a
jq: error (at <stdin>:22): Cannot index string with string "$link"
From some digging I understand that the issue is likely caused by the following object/value pair in the a entry:
"someotherobj": "123"
because it is a plain string rather than an object containing $link. The filter expects every value under input to be an object with a $link key, so it errors out before it gets a chance to search the b entry.
What I really want is to be able to search for any entries that have at least one "$link": "randomtext1" pair under input. Is there a fuzzier search feature allowing me to achieve this?
I tried to use two contains hoping it will just pipe things through:
jq -r '.[] | select(.input[] | contains(["$link"]) | contains("randomtext1")) | .id'
but jq did not like that at all.
the test.json file:
[
{
"input": {
"obj1": {
"$link": "randomtext1"
},
"obj2": {
"$link": "randomtext2"
},
"someotherobj": "123"
},
"id": "a"
},
{
"input": {
"obj3": {
"$link": "randomtext1"
},
"obj4": {
"$link": "randomtext2"
}
},
"id": "b"
}
]
What I really want is to be able to search for any entries that have at least one "$link": "randomtext1" pair under input.
The key word here, both in the question and the following answer, is any:
.[]
| select( any(.input[];
type=="object" and has("$link") and (.["$link"] | index("randomtext1"))))
| .id
Of course if you require the key's value to be "randomtext1", you'd write .["$link"] == "randomtext1".
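A runnable sketch of the exact-match variant, assuming the contents of test.json are held in a shell variable. The JSON is single-quoted so the shell does not try to expand $link.

```shell
# Contents of test.json from the question; single quotes keep $link literal.
json='[
  {"input": {"obj1": {"$link": "randomtext1"},
             "obj2": {"$link": "randomtext2"},
             "someotherobj": "123"},
   "id": "a"},
  {"input": {"obj3": {"$link": "randomtext1"},
             "obj4": {"$link": "randomtext2"}},
   "id": "b"}
]'

# any(...) is true as soon as one value under .input is an object whose
# "$link" equals "randomtext1"; non-objects are skipped by the type check.
result="$(printf '%s' "$json" | jq -r '
  .[]
  | select(any(.input[];
      type == "object" and has("$link") and (.["$link"] == "randomtext1")))
  | .id
')"
printf '%s\n' "$result"
```

Both ids, a and b, are printed, because the type check stops the string "123" from ever being indexed.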

Building new JSON with JQ and bash

I am trying to create JSON from scratch using bash.
The final structure needs to be like:
{
"hosts": {
"a_hostname" : {
"ips" : [
1,
2,
3
]
},
{...}
}
}
First I'm creating an input file with the format:
hostname1 ["1.1.1.1","2.2.2.2"]
host-name2 ["3.3.3.3","4.4.4.4"]
This is being created by:
for host in $( ansible -i hosts all --list-hosts ) ; \
do echo -n "${host} " ; \
ansible -i hosts $host -m setup | sed '1c {' | jq -r -c '.ansible_facts.ansible_all_ipv4_addresses' ; \
done > hosts.txt
The key point here is that the IP list comes from a JSON file and is extracted by jq. This extraction outputs an already valid, quoted JSON array, but as a string in a text file.
Next I'm using jq to parse the whole text file into the desired JSON:
jq -Rn '
{ "hosts": [inputs |
split("\\s+"; "g") |
select(length > 0 and .[0] != "") |
{(.[0]):
{ips:.[1]}
}
] | add }
' < ~/hosts.txt
This is almost correct, everything except for the IPs value which is treated as a string and quoted leading to:
{
"hosts": {
"hostname1": {
"ips": "[\"1.1.1.1\",\"2.2.2.2\"]"
},
"host-name2": {
"ips": "[\"3.3.3.3\",\"4.4.4.4\"]"
}
}
}
I'm now stuck at this final hurdle - how to insert the IPs without causing them to be quoted again.
Edit - quoting solved by using {ips: .[1] | fromjson }} instead of {ips:.[1]}.
However, this was made unnecessary by @CharlesDuffy's suggestion to convert to TSV.
Original Q body:
So far I've got to
jq -n {hosts:{}} | \
for host in $( ansible -i hosts all --list-hosts ) ; \
do jq ".hosts += {$host:{}}" | \
jq ".hosts.$host += {ips:[1,2,3]}" ; \
done ;
([1,2,3] is actually coming from a subshell but including it seemed unnecessary as that part works, and made it harder to read)
This sort of works, but there seem to be two problems.
1) Final output only has a single host in it, containing data from the first host in the list (this persists even if the second problem is bypassed):
{
"hosts": {
"host_1": {
"ips": [
1,
2,
3
]
}
}
}
2) One of the hostnames has a - in it, which causes syntax and compiler errors from jq. I'm stuck going around quote hell trying to get it to be interpreted but also quoted. Help!
Thanks for any input.
Let's say your input format is:
host_1 1 2 3
host_2 2 3 4
host-with-dashes 3 4 5
host-with-no-addresses
...re: edit specifying a different format: Add @tsv onto the jq command producing the existing format to generate this one instead.
If you want to transform that to the format in question, it might look like:
jq -Rn '
{ "hosts": [inputs |
split("\\s+"; "g") |
select(length > 0 and .[0] != "") |
{(.[0]): .[1:]}
] | add
}' <input.txt
Which yields as output:
{
"hosts": {
"host_1": [
"1",
"2",
"3"
],
"host_2": [
"2",
"3",
"4"
],
"host-with-dashes": [
"3",
"4",
"5"
],
"host-with-no-addresses": []
}
}
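To get the nesting requested in the question, with each host mapping to an object that has an ips key, the same filter needs only a small change. This is a minimal sketch assuming whitespace-separated input in the format shown above; the two input lines are hypothetical sample data.

```shell
# Hypothetical input lines: a host name followed by its addresses.
result="$(printf 'host_1 1.1.1.1 2.2.2.2\nhost-with-dashes 3.3.3.3\n' | jq -Rnc '
  { hosts: [ inputs
             | split("\\s+"; "g")
             | select(length > 0 and .[0] != "")
             | {(.[0]): {ips: .[1:]}}   # wrap the addresses in an "ips" key
           ] | add }
')"
printf '%s\n' "$result"
```

Because the host name is used as a computed key, (.[0]), names containing dashes need no special quoting.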