glob patterns difference between {} and +() - glob

I referenced https://facelessuser.github.io/wcmatch/glob/ and it says that
+(pattern_list) : The pattern matches if one or more occurrences of any of the patterns in the pattern_list match the input string.
Requires the EXTGLOB flag.
{} : Bash style brace expansions. This is applied to patterns before
anything else. Requires the BRACE flag.
I can see that the description is different for each of them, but hard to understand with the actual example.
For example, what is the difference between those two below?
"lint-staged": {
"src/**/*.{js,jsx,ts,tsx,json,css}": [
"prettier --write",
"eslint --fix src/",
"tslint --fix --project .",
"git add"
]
},
"lint-staged": {
"src/**/*.+(js|jsx|ts|tsx|json|css)": [
"prettier --write",
"eslint --fix src/",
"tslint --fix --project .",
"git add"
]
},
Any advice would be appreciated!

{} implements something similar to Bash's brace expansion. Essentially src/**/*.{js,jsx,ts,tsx,json,css} will become:
[
'src/**/*.js',
'src/**/*.jsx',
'src/**/*.ts',
'src/**/*.tsx',
'src/**/*.json',
'src/**/*.css'
]
There is a time and place for this, but you can see this might be less efficient as now you are processing multiple patterns.
You can think of +() more like in regular expression where +(js|jsx|ts|tsx|json|css) would be more equivalent to (js|jsx|ts|tsx|json|css)+.
So it would match things like js or jsjsxtxjson Which is not really equivalent to {}.
What you are probably interested in, if looking for a more efficient comparison to {}, is probably #(js|jsx|ts|tsx|json|css) which is equivalent to regular expression patterns like this (js|jsx|ts|tsx|json|css) which would match just one occurrence and would match js but not jsjsxtxjson. The reason why this may be more efficient is simply that you get a single pattern as opposed to multiple patterns.

Related

Non json output from gcloud ai-platform predict. Parsing non-json outputs

I am using gcloud ai-platform predict to call an endpoint and get predictions as below using json-request and not json-response
gcloud ai-platform predict --json-request instances.json
The response is however not json and hense cannot be read further causing other complications. Below is the response.
VAL HS
0.5 {'hs_1': [[-0.134501, -0.307326, -0.151994, -0.065352, -0.14138]], 'hs_2' : [[-0.134501, -0.307326, -0.151994, -0.065352, 0.020759]]}
Can gcloud ai-platform predict return a json instead or may be parse it differently. ?
Thanks for your help.
Apparently, your output is a table with headers and two columns: a score and the (alleged) JSON content. You should extract the second column of any preferred data row (your example only has one but in general you might receive several score-JSON pairs). Maybe your API already offers functionality to extract a certain 'state', e.g. the one with the highest score. If not, a simple awk or sed script can get this job done easily.
Then, the only remaining issue before having proper JSON (which can then be queried by jq) is with the quoting style. Your output encloses field names with ' instead of " ('lstm_1' instead of "lstm_1"). Correcting thin, unfortunately, is a not-so-easy task if you can expect to receive arbitrarily complex JSON data (such as strings containing quotation marks etc.). However, if your JSON will always look as simple as in the example provided, simply substituting the wrong for the right one becomes an easy task again for tools like awk or sed.
For instance, using sed on your example output to select the second line (which is the first data row), drop everything from the beginning until but not including the first opening curly brace (which marks the beginning of the second column), make said substitutions and pipe the result into jq:
... | sed -n "2{s/^[^{]\+//;s/'/\"/g;p;q}" | jq .
{
"lstm_1": [
[
-0.13450142741203308,
-0.3073260486125946,
-0.15199440717697144,
-0.06535257399082184,
-0.1413831114768982
]
],
"lstm_2": [
[
-0.13450142741203308,
-0.3073260486125946,
-0.15199440717697144,
-0.06535257399082184,
0.02075939252972603
]
]
}
[Edited to reflect upon a comment]
If you want to utilize the score as well, let jq handle it. For instance:
... | sed -n "2{s/'/\"/g;p;q}" | jq -s '{score:first,status:last}'
{
"score": 0.548,
"status": {
"lstm_1": [
[
-0.13450142741203308,
-0.3073260486125946,
-0.15199440717697144,
-0.06535257399082184,
-0.1413831114768982
]
],
"lstm_2": [
[
-0.13450142741203308,
-0.3073260486125946,
-0.15199440717697144,
-0.06535257399082184,
0.02075939252972603
]
]
}
}
[Edited to reflect upon changes in the OP]
As changes affected only names and values but no structure, the hitherto valid approach still holds:
... | sed -n "2{s/'/\"/g;p;q}" | jq -s '{val:first,hs:last}'
{
"val": 0.5,
"hs": {
"hs_1": [
[
-0.134501,
-0.307326,
-0.151994,
-0.065352,
-0.14138
]
],
"hs_2": [
[
-0.134501,
-0.307326,
-0.151994,
-0.065352,
0.020759
]
]
}
}

Search and extract value using JQ command line processor

I have a JSON file very similar to the following:
[
{
"uuid": "832390ed-58ed-4338-bf97-eb42f123d9f3",
"name": "Nacho"
},
{
"uuid": "5b55ea5e-96f4-48d3-a258-75e152d8236a",
"name": "Taco"
},
{
"uuid": "a68f5249-828c-4265-9317-fc902b0d65b9",
"name": "Burrito"
}
]
I am trying to figure out how to use the JQ command line processor to first find the UUID that I input and based on that output the name of the associated item. So for example, if I input UUID a68f5249-828c-4265-9317-fc902b0d65b9 it should search the JSON file, find the matching UUID and then return the name Burrito. I am doing this in Bash. I realize it may require some outside logic in addition to JQ. I will keep thinking about it and put an update here in a bit. I know I could do it in an overly complicated way, but I know there is probably a really simple JQ method of doing this in one or two lines. Please help me.
https://shapeshed.com/jq-json/#how-to-find-a-key-and-value
You can use select:
jq -r --arg query Burrito '.[] | select( .name == $query ) | .uuid ' tst.json

Pass bash variable (string) as jq argument to walk JSON object

This is seemingly a very basic question. I am new to JSON, so I am prepared for the incoming facepalm.
I have the following JSON file (test-app-1.json):
"application.name": "test-app-1",
"environments": {
"development": [
"server1"
],
"stage": [
"server2",
"server3"
],
"production": [
"server4",
"server5"
]
}
The intent is to use this as a configuration file and reference for input validation.
I am using bash 3.2 and will be using jq 1.4 (not the latest) to read the JSON.
The problem:
I need to return all values in the specified JSON array based on an argument.
Example: (how the documentation and other resources show that it should work)
APPLICATION_ENVIRONMENT="developement"
jq --arg appenv "$APPLICATION_ENVIRONMENT" '.environments."$env[]"' test-app-1.json
If executed, this returns null. This should return server1.
Obviously, if I specify text explicitly matching the JSON arrays under environment, it works just fine:
jq 'environments.development[]' test-app-1.json returns: server1.
Limitations: I am stuck to jq 1.4 for this project. I have tried the same actions in 1.5 on a different machine with the same null results.
What am I doing wrong here?
jq documentation: https://stedolan.github.io/jq/manual/
You have three issues - two typos and one jq filter usage issue:
set APPLICATION_ENVIRONMENT to development instead of developement
use variable name consistently: if you define appenv, use $appenv, not $env
address with .environments[$appenv]
When fixed, looks like this:
$ APPLICATION_ENVIRONMENT="development"
$ jq --arg appenv "$APPLICATION_ENVIRONMENT" '.environments[$appenv][]' test-app-1.json
"server1"

Store JSON directly in bash script with variables?

I'm going to preface by saying that "no, find a different way to do it" is an acceptable answer here.
Is there a reliable way to store a short bit of JSON in a bash variable for use in a AWS CLI command running from the same script?
I'll be running a job from Jenkins that's updating an AWS Route53 record, which requires UPSERTing a JSON file with the change in records. Because it's running from Jenkins, there's no local storage where I can keep this file, and I'd really like to avoid needing to do a git checkout every time this project will run (which will be once an hour).
Ideally, storing the data in a variable ($foo) and calling it as part of the change-resource-record-sets command would be most convenient given the Jenkins setup, but I'm unfamiliar with exactly how to quote/store JSON inside bash - can it be done safely?
The specific JSON in this case is the following;
{"Comment":"Update DNSName.","Changes":[{"Action":"UPSERT","ResourceRecordSet":{"Name":"alex.","Type":"A","AliasTarget":{"HostedZoneId":"######","DNSName":"$bar","EvaluateTargetHealth":false}}}]}
As an added complication the DNSName value - $bar - needs to be expanded.
You could use a here-doc:
foo=$(cat <<EOF
{"Comment":"Update DNSName.","Changes":[{"Action":"UPSERT","ResourceRecordSet":{"Name":"alex.","Type":"A","AliasTarget":{"HostedZoneId":"######","DNSName":"$bar","EvaluateTargetHealth":false}}}]}
EOF
)
By leaving EOF in the first line unquoted, the contents of the here-doc will be subject to parameter expansion, so your $bar expands to whatever you put in there.
If you can have linebreaks in your JSON, you can make it a little more readable:
foo=$(cat <<EOF
{
"Comment": "Update DNSName.",
"Changes": [
{
"Action": "UPSERT",
"ResourceRecordSet": {
"Name": "alex.",
"Type": "A",
"AliasTarget": {
"HostedZoneId": "######",
"DNSName": "$bar",
"EvaluateTargetHealth": false
}
}
}
]
}
EOF
)
or even (first indent on each line must be a tab)
foo=$(cat <<-EOF
{
"Comment": "Update DNSName.",
"Changes": [
{
"Action": "UPSERT",
"ResourceRecordSet": {
"Name": "alex.",
"Type": "A",
"AliasTarget": {
"HostedZoneId": "######",
"DNSName": "$bar",
"EvaluateTargetHealth": false
}
}
}
]
}
EOF
)
and to show how that is stored, including quoting (assuming that bar=baz):
$ declare -p foo
declare -- foo="{
\"Comment\": \"Update DNSName.\",
\"Changes\": [
{
\"Action\": \"UPSERT\",
\"ResourceRecordSet\": {
\"Name\": \"alex.\",
\"Type\": \"A\",
\"AliasTarget\": {
\"HostedZoneId\": \"######\",
\"DNSName\": \"baz\",
\"EvaluateTargetHealth\": false
}
}
}
]
}"
Because this expands some shell metacharacters, you could run into trouble if your JSON contains something like `, so alternatively, you could assign directly, but be careful about quoting around $bar:
foo='{"Comment":"Update DNSName.","Changes":[{"Action":"UPSERT","ResourceRecordSet":{"Name":"alex.","Type":"A","AliasTarget":{"HostedZoneId":"######","DNSName":"'"$bar"'","EvaluateTargetHealth":false}}}]}'
Notice the quoting for $bar: it's
"'"$bar"'"
│││ │││
│││ ││└ literal double quote
│││ │└ opening syntactical single quote
│││ └ closing syntactical double quote
││└ opening syntactical double quote
│└ closing syntactical single quote
└ literal double quote
It can be stored safely; generating it is a different matter, since the contents of $bar may need to be encoded. Let a tool like jq handle creating the JSON.
var=$(jq -n --arg b "$bar" '{
Comment: "Update DNSName.",
Changes: [
{
Action: "UPSERT",
ResourceRecordSet: {
Name: "alex.",
Type: "A",
AliasTarget: {
HostedZoneId: "######",
DNSName: $b,
EvaluateTargetHealth: false
}
}
}
]
}')

How to extract the value of "name" from this puppet metadata.json without jq in shell?

For some extreme reason, I can't use jq or other cli tool. I need to extract the value of "name" from any json matching this puppet metadata.json. format.
the json might not be properly formatted and indented but will be valid. Meaning, white spaces, and line breaks, carriage backs might be inserted in eligible places.
Note that there could be "name" elements in dependencies array.
So, how to extract the value only using standard unix commands and/or shell script without installing any application like jq or other tools?
Thank you!!
{
"name": "examplecorp-mymodule",
"version": "0.0.1",
"author": "Pat",
"license": "Apache-2.0",
"summary": "A module for a thing",
"source": "https://github.com/examplecorp/examplecorp-mymodule",
"project_page": "https://forge.puppetlabs.com/examplecorp/mymodule",
"issues_url": "https://github.com/examplecorp/examplecorp-mymodule/issues",
"tags": ["things", "stuff"],
"operatingsystem_support": [
{
"operatingsystem":"RedHat",
"operatingsystemrelease":[ "5.0", "6.0" ]
},
{
"operatingsystem": "Ubuntu",
"operatingsystemrelease": [ "12.04", "10.04" ]
}
],
"dependencies": [
{ "name": "puppetlabs/stdlib", "version_requirement": ">=3.2.0 <5.0.0" },
{ "name": "puppetlabs/firewall", "version_requirement": ">= 0.0.4" }
]
}
It's ugly, awful, horrible, not structure-aware and will give you incorrect results if you have extra contents in your input file that look similar to what you're trying to find -- but...
#!/bin/bash
# ^- NOT /bin/sh; shell-native regexes are a bash extension
contents=$(<in.json)
if [[ $contents =~ '"name":'[[:space:]]*'"'([^\"]*)'"' ]]; then
echo "Found name: ${BASH_REMATCH[1]}"
fi
Now, let's talk about some of the ways this answer is broken (and using jq would be better):
It finds the first name, even if it's not one at an outer nesting layer. That is to say, if "dependencies": [ { "name": "puppetlabs/stdlib", "version_requirement": ">=3.2.0 <5.0.0" } ] comes before "name": "examplecorp-mymodule", guess which result is being found? (The easy workarounds to this would involve making assumptions about whitespace/formatting, and are thus not proof against all possible JSON expressions of the same data).
It won't unescape contents inside your name that require, well, unescaping (think about names containing symbols encoded as &foo;).
It isn't multibyte-character aware, and thus isn't guaranteed to emit output that aligns on codepoint boundaries.
If you have a name with an escaped \" subsequence... well, guess what happens there?
Etc. It's not quite as awful as trying to parse XML with regular expressions (JSON is easier!), but it's still quite a mess.
This should work for you:
jq '.name' metadata.json