How to check for a value before '.' in BASH - json

I'm outputting some values in JSON format, and it appears that a value starting with a '.' isn't valid JSON (the API doesn't seem to like these numbers inside " " either). What would be the best way to check whether my value has anything in front of the '.', and if not, put a 0 there?
i.e
value = .53
newvalue = 0.53
I'm not very good at doing anything more than simple functions in BASH at the moment, still trying to learn awk/sed and other useful things such as cut.

There might be a number of possible solutions given the nature of the input. However, given those unknowns an easy workaround would be to say:
[[ $value == \.* ]] && newvalue=0${value}
Example:
$ value=.42
$ [[ $value == \.* ]] && newvalue=0${value}
$ echo $newvalue
0.42
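Note that the one-liner above only sets newvalue when the value starts with a dot. A minimal sketch that also passes other values through unchanged, and handles a leading minus sign, might look like this. The function name add_leading_zero is an assumption made for illustration and is not part of the original answer.

```shell
#!/usr/bin/env bash
# add_leading_zero is a hypothetical helper name (not from the original answer).
# It prints its argument with a 0 inserted before a leading '.'.
add_leading_zero() {
  local value=$1
  case $value in
    .*)  printf '0%s\n' "$value" ;;          # .53  becomes 0.53
    -.*) printf '%s\n' "-0${value#-}" ;;     # -.53 becomes -0.53
    *)   printf '%s\n' "$value" ;;           # anything else is left unchanged
  esac
}

newvalue=$(add_leading_zero .53)
echo "$newvalue"
```

Using case keeps the pattern matching portable and makes it easy to add further input shapes later.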

Related

Bash regex match on JSON structure

I have a JSON structure like this which I've read into a bash variable as a string:
{
"elem1": "val1",
"THEELEM": "THEVAL",
"elem3": "val3"
}
I want to use regex to match on "THEELEM": "THEVAL". It works if I try individual words, where the JSON is stored in result as a string:
[[ $result =~ THEVAL ]] && echo "yes"
But I want to match on the key-pair like this:
[[ $result =~ "THEELEM": "THEVAL" ]] && echo "yes"
That gives me syntax issues. I've tried escaping, single-quotes and triple quotes to no avail. Any help appreciated.
Quoting works for me.
[[ $result =~ '"THEELEM": "THEVAL"' ]] && echo "yes"
Note that quoting the pattern disables recognition of special regular expression characters, and just searches for the literal substring. This is not a problem here, since you don't have any wildcards or other non-literal pattern characters. But if you did, you'd have to put the pattern in a variable, as in @noah's answer.
You can create a variable $expr to hold the string you want to match to and then use that for the regex.
expr='"THEELEM": "THEVAL"'
[[ $result =~ $expr ]] && echo "yes"
Inspired by this stack overflow post
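As a hedged illustration of why the variable form matters: once the pattern contains real regex syntax, it has to stay unquoted on the right-hand side of =~. The sketch below reuses the JSON from the question. The pattern (tolerating any whitespace after the colon and capturing the value) and the variable name matched_value are assumptions added for this example.

```shell
#!/usr/bin/env bash
result='{
"elem1": "val1",
"THEELEM": "THEVAL",
"elem3": "val3"
}'

# Regex kept in a variable so that [[:space:]]* and the capture group
# are treated as regex syntax rather than literal text.
expr='"THEELEM":[[:space:]]*"([^"]*)"'

if [[ $result =~ $expr ]]; then
  matched_value=${BASH_REMATCH[1]}   # first capture group
  echo "yes: $matched_value"
fi
```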

Newbie: unix bash, nested if statement, results from a loop results from sql

Newbie here, please pardon any confusing wording that I use.
A common task I have is to take a list of names and do a MySQL query to look the names up in a table and see if they are "live" on our site.
Doing this one at a time, my SQL query works fine. I then wanted to do the query using a loop from a file listing multiple names. This works fine, too.
I added this query loop to my bash profile so that I can quickly do the task by typing this:
$ validOnSite fileName
This works fine, and I even added a usage statement for my process to remind myself of the syntax. Below is what I have that works fine:
validOnSite() {
if [[ "$1" == "" ]] || [[ "$1" == "-h" ]] || [[ "$1" == "--help" ]]; then
echo "Usage:"
echo " $ validOnSite [filename]"
echo " Where validOnSite uses specified file as variables in sql query:"
echo " SELECT name, active FROM dbDb WHERE name=lines in file"
else
cat $1 | while read line ; do hgsql -h genome-centdb hgcentral -Ne "select name, active from dbDb where name='$line'" ; done
fi
}
Using a file "list.txt" which contains:
nameA
nameB
I would then type:
validOnSite list.txt
and both entries in list.txt meet my query criteria and are found in sql. My results will be:
nameA 1
nameB 1
Note the "1" after each result. I assume this is some sort of "yes" status.
Now, I add a third name to my list.txt, one that I know is not a match in sql. Now list.txt contains:
nameA
nameB
foo
When I again run this command for my list with 3 rows:
validOnSite list.txt
My results are the same as when I used the first version of list.txt, and I cannot see which lines failed; I still only see which lines were a success:
nameA 1
nameB 1
I have been trying all kinds of things to add a nested if statement, something that says, "If $line is a match, echo "pass", else echo "fail."
I do not want to see a "1" in my results. Using list.txt with 2 matches and 1 non-match, I would like my results to be:
nameA pass
nameB pass
foo fail
Or even better, color code a pass with green and a fail with red.
As I said, newbie here... :)
Any pointers in the right direction would help. Here is my latest sad attempt, but I realize I may be going in a wrong direction entirely:
validOnSite() {
if [[ "$1" == "" ]] || [[ "$1" == "-h" ]] || [[ "$1" == "--help" ]]; then
echo "Usage:"
echo " $ validOnSite [filename]"
echo " Where validOnSite uses specified file as variables in sql query:"
echo " SELECT name, active FROM dbDb WHERE name=lines in file"
else
cat $1 | while read line ; do hgsql -h genome-centdb hgcentral -Ne "select name, active from dbDb where name='$line'" > /dev/null ; done
if ( "status") then
echo $line "failed"
echo $line "failed" >> outfile
else
echo $line "ok"
echo $line "ok" >>outfile
clear
cat outfile
fi
fi
If something looks crazy in my last attempt, it's because it is - I am just googling around and trying as many things as I can while trying to learn. Any help appreciated, I feel stuck after working on this for a long time, but I am excited to move forward and find a solution! I think there is something I'm missing about understanding stdout, and also confusion about nested if's.
Note: I do not need an outfile, but it's ok if one is needed to accomplish the goal. stdout result alone would suffice, and is preferred.
Note: hgsql is just the name of our MySQL server. The MySQL part works fine; I am looking for a better way to deal with my bash output, and I think there is something about stderr that I'm missing. I'm looking for a fairly simple answer as I'm a newbie!
I guess that by hgsql you mean some Mercurial extension that lets you run MySQL queries. I don't know how hgsql works, but I do know that MySQL returns only the matching rows. In terms of shell scripting, though, the result is a string that may contain extra information even if the number of matched rows is zero. For example, some MySQL clients may print a header or a string like "No rows found", although that is unlikely.
I'll show how it is done with the official mysql client. I'm sure you will manage to adapt hgsql with the help of its documentation to the following example.
if [ -t 1 ]; then
red_color=$(tput setaf 1)
green_color=$(tput setaf 2)
reset_color=$(tput sgr0)
else
red_color=
green_color=
reset_color=
fi
colorize_flag() {
local color
if [ "$1" = 'fail' ]; then
color="$red_color"
else
color="$green_color"
fi
printf '%s' "${color}${1}${reset_color}"
}
sql_fmt='SELECT IF(active, "pass", "fail") AS flag FROM dbDb WHERE name = "%s"'
while IFS= read -r line; do
sql=$(printf "$sql_fmt" "$line")
flag=$(mysql --skip-column-names dbname -e "$sql")
[ -z "$flag" ] && flag='fail'
printf '%-20s%s\n' "$line" "$(colorize_flag "$flag")"
done < file
The first block detects whether the script is running interactively by checking whether file descriptor 1 (standard output) is open on a terminal (see help test). If it is, standard output is connected directly to the user's terminal rather than, for example, to a pipe, and the script sets the color variables to terminal color codes using the tput command. Otherwise the color variables are left empty.
The colorize_flag function accepts a string ($1) and outputs the string with the color codes applied according to its value.
The last block reads file line by line. For each line it builds an SQL query string (sql) and invokes the mysql command with the column names stripped from the output. The output of the mysql command is assigned to flag by means of command substitution. If "$flag" is empty, it is set to 'fail'. The $line and the colorized flag are then printed to standard output.
You can test the non-interactive mode by chaining the output via pipe, e.g.:
./script | tee -a
I must warn you that it is generally a bad idea to pass shell variables into SQL queries unless the values are properly escaped, and the popular shells do not provide any tools to escape MySQL strings. So consider running the queries in Perl, PHP, or any programming language that is capable of building and running the queries safely.
Also note that in terms of performance it is better to run a single query and then parse the result set in a loop instead of running multiple queries in a loop, with the exception of prepared statements.
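The single-query idea can be sketched as follows. This is only an illustration under stated assumptions: report_status is a hypothetical helper name, the rows are assumed to arrive as tab-separated "name<TAB>active" lines (the mysql client's batch output format), and a hard-coded sample stands in for the real query result so that the block runs without a database. It needs bash 4 or later for associative arrays.

```shell
#!/usr/bin/env bash
# report_status (hypothetical helper): first argument is the result set of
# ONE query as tab-separated "name<TAB>active" lines; remaining arguments
# are the names to check.
report_status() {
  local rows=$1 name active
  shift
  local -A active_by_name=()
  # Index the result set by name.
  while IFS=$'\t' read -r name active; do
    [ -n "$name" ] && active_by_name[$name]=$active
  done <<< "$rows"
  # Names missing from the result set count as failures.
  for name in "$@"; do
    if [ "${active_by_name[$name]:-0}" = 1 ]; then
      printf '%s pass\n' "$name"
    else
      printf '%s fail\n' "$name"
    fi
  done
}

# Sample rows standing in for the output of a single query such as:
#   SELECT name, active FROM dbDb WHERE name IN ('nameA', 'nameB', 'foo')
sample_rows=$'nameA\t1\nnameB\t1'
report_status "$sample_rows" nameA nameB foo
```

Because the lookup happens in the shell, a name that is absent from the result set can be reported as a failure, which the per-name query loop cannot do on its own.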
I found a way to get to my solution by piecing together the few basic things that I know. Not elegant, but it works well enough for now. I created a file "[filename]Results" with the output:
nameA 1
nameB 1
I then cut out the "1"s and made a new file. I then compared "[filename]Results" to list.txt in order to see which lines exist in list.txt but do not exist in the results.
Note: I have the following in my .zshrc file.
validOnSite() {
if [[ "$1" == "" ]] || [[ "$1" == "-h" ]] || [[ "$1" == "--help" ]]; then
echo "Usage:"
echo " $ validOnSite [filename]"
echo " Where validOnSite uses specified file as variables in sql query:"
echo " SELECT name, active FROM dbDb WHERE name=lines in file"
else
cat $1 | while read line ; do hgsql -h genome-centdb hgcentral -Ne "select name from dbDb where name='$line' and active='1'" >> $1"Pass"; done
autoload -U colors
colors
echo $fg_bold[magenta]Assemblies active on site${reset_color}
echo
cat $1"Pass"
echo
echo $fg_bold[red]Not active or not found on site${reset_color}
comm -23 $1 $1"Pass" 2> /dev/null
echo
echo
mv $1"Pass" ~cath/myFiles/validOnSiteResults
echo "Results file containing only active assemblies resides in ~cath/myFiles/validOnSiteResults"
fi
}
list.txt:
nameA
nameB
foo
My input:
validOnSite list.txt
My output:
Assemblies active on site (<--this font is magenta)
nameA
nameB
Not active or not found on site (<--this font is red)
foo
Results file containing only active assemblies resides in ~me/myFiles/validOnSiteResults
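One caveat worth noting about this approach: comm expects both inputs to be sorted, which is probably why its stderr had to be silenced above. A minimal sketch of the same comparison with sorted inputs follows. The temporary directory and the file names inside it are assumptions for illustration; the sample data is taken from the question.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
printf '%s\n' nameA nameB foo > "$workdir/list.txt"   # names that were queried
printf '%s\n' nameA nameB     > "$workdir/listPass"   # names the query returned

# comm -23 keeps lines found only in the first input. Both inputs are
# sorted on the fly with process substitution.
missing=$(comm -23 <(sort "$workdir/list.txt") <(sort "$workdir/listPass"))
echo "$missing"

rm -rf "$workdir"
```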

Insert newline character at index in .bash

I'm taking an introductory course to bash at my university and am working on a little MotD script that uses a json-object grabbed from an API using curl.
I want to make absolutely certain that you understand that this is NOT an assignment, but something I'm playing around with to learn more about how to script with bash.
I've found myself stuck with what could possibly be a very simple issue: I want to insert a new line ('\n') at a specific index if the 'quote' value of my json-object is too long (in this case at index 80).
I've been following a bunch of SO threads and this is my current solution:
#!/bin/bash
json_object=$(curl -s 'http://quotes.stormconsultancy.co.uk/random.json')
quote=$(echo ${json_object} | jq .quote | sed -e 's/^"//' -e 's/"$//')
author=$(echo ${json_object} | jq .author)
count=${#quote}
echo $quote
echo $author
echo "wc: $count"
if((count > 80));
then
quote=${quote:0:80}\n${quote:80:(count - 80)}
else
echo "lower"
fi
printf "$quote"
The current output I receive from the printf is the first word of the quote, whereas if I have an echo before trying to do the string-manipulation I get the entire quote.
I'm sorry if it's not following best practice or anything, but I'm an absolute beginner using both vi and bash.
I'd be very happy with any sort of advice. :)
EDIT:
Sample output:
$ ./json.bash
You should name a variable using the same care with which you name a first-born child.
"James O. Coplien"
86
higher
You should name a variable using the same care with which you name a first-born nchild.
You can just use a single line bash command to achieve this,
string="You should name a variable using the same care with which you name a first-born child."
(( "${#string}" > 80 )) && printf "%s\n" "${string:0:80}"$'\n'"${string:80}" || printf "%s\n" "$string"
You should name a variable using the same care with which you name a first-born
child.
And for an input line shorter than 80 characters:
string="You should name a variable using the same care"
(( "${#string}" > 80 )) && printf "%s\n" "${string:0:80}"$'\n'"${string:80}" || printf "%s\n" "$string"
You should name a variable using the same care
An explanation,
(( "${#string}" > 80 )) && printf "%s\n" "${string:0:80}"$'\n'"${string:80}" || printf "%s\n" "$string"
# The syntax is an indirect implementation of a ternary operator, as bash doesn't
# directly support it.
#
# (( "${#string}" > 80 )) will return a success/fail depending upon the length
# of the string variable and if it is greater than 80, the command after && is
# executed and if it fails the command after || is executed
#
# "${string:0:80}"$'\n'"${string:80}"
# A parameter expansion syntax for sub-string extraction.
#
# ${PARAMETER:OFFSET}
#
# ${PARAMETER:OFFSET:LENGTH}
#
# This one can expand only a part of a parameter's value, given a position
# to start and maybe a length. If LENGTH is omitted, the parameter will be
# expanded up to the end of the string. If LENGTH is negative, it's taken as
# a second offset into the string, counting from the end of the string.
#
# So in our example we basically extract the characters from position 0 to 80
# insert a new-line and append the rest of the string
#
# The $'\n' syntax (ANSI-C quoting) makes the shell interpret backslash escape
# sequences, in this case producing a literal newline character.
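To see why the script in the question printed "nchild", compare an unquoted \n with $'\n' in an assignment. This is a small sketch; the variable names are made up for the example.

```shell
#!/usr/bin/env bash
left="first-born"
right="child."

broken=${left}\n${right}      # unquoted backslash is dropped, leaving a plain "n"
fixed=${left}$'\n'${right}    # $'\n' expands to a real newline character

printf '%s\n' "$broken"
printf '%s\n' "$fixed"
```

Printing with a fixed '%s\n' format, rather than passing the quote itself as the printf format, also avoids surprises if the quote contains % or backslash characters.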
Not really in the original question, but adding some extra code to @Inian's great answer so that the break does not fall in the middle of a word, but rather at the last white space in ${string:0:80}:
#!/usr/bin/env bash
string="You should really name a variable using the same care with which you name a first-born child."
if (( "${#string}" > 80 )); then
maxstring="${string:0:80}"
lastspace="${maxstring##*\ }"
breakat="$((${#maxstring} - ${#lastspace}))"
printf "%s\n" "${string:0:${breakat}}"$'\n'"${string:${breakat}}"
else
printf "%s\n" "$string"
fi
maxstring=${string:0:80}:
Let's get the first 80 characters of the quote.
lastspace=${maxstring##*\ }:
Deletes longest match of *\ (white space is escaped) from front of $maxstring, ${lastspace} will be the remaining string from last white space until end of the string.
breakat="$((${#maxstring} - ${#lastspace}))":
Subtract the length of ${lastspace} from the length of ${maxstring} to get the index just after the last white space in ${maxstring}. This is the index where \n will be inserted.
Example output with "hard" break at character 80:
You should really name a variable using the same care with which you name a firs
t-born child.
Example output with a "soft" break at the closest white space from character 80:
You should really name a variable using the same care with which you name a
first-born child.
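As an alternative sketch, not taken from the answers above, the standard fold utility can do the same soft wrapping, and it keeps working when the text needs more than two lines. Here -w sets the width and -s asks fold to break at blanks. The variable name wrapped is an assumption for the example.

```shell
#!/usr/bin/env bash
string="You should really name a variable using the same care with which you name a first-born child."

# Break at the last blank that fits within 80 columns.
wrapped=$(printf '%s\n' "$string" | fold -s -w 80)
printf '%s\n' "$wrapped"
```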

getting specific filename from bash

So I have a perl module that uses a bash command to obtain the file(s) with certain "table" names. In my specific case, it is looking for tables with the name "event", but I need this to work with all names too.
Currently, I have the following code in my perl script to obtain MYI files with the name table, and I am receiving not only event_* but also event_extra_data_* as well. For my example, I only need the 2nd table that exists in my database for event_. As my test info, I have, currently,
event_1459161160_0
event_1459182760_0
event_extra_data_1459182745_0
event_extra_data_1459182760_0
which are partitioned tables from tables "event" and event_extra_data which is the value that the $table variable sees below.
Anyway, my question is: how do I limit this to only receiving event_1459182760_0.MYI and not event_extra_data_1459182760_0.MYI, which it is currently getting?
elsif ($sql =~ /\{LAST\}/i )
{
$cmd = 'ls -1 /var/lib/mysql/sfsnort/'.$table.'_*MYI | grep -v template | tail -n1 | cut -d"/" -f6 | cut -d"." -f1';
$value = `$cmd`;
print "Search Value: $value\n";
if ($value eq "")
{
$sql = ""; # same as with FIRST
}
else
{
$sql =~ s/\{LAST\}/$value/g;
}
}
Don't parse ls - there's no point, and it's prone to causing problems.
I would point out that the glob function within perl lets you use a limited set of "regex-like" patterns. (Note - they aren't regex, so don't get them mixed up.)
foreach my $filename ( glob "event_[0-9]*" ) {
#do something with $filename
}
If you're just after the last - when sorted numerically:
my ( $last ) = reverse sort glob "event_[0-9]*";
Given you have a single path, then you should be able to:
my ( $last ) = reverse sort glob "/var/lib/mysql/sfsnort/event_[0-9]*.MYI";
Note - that this works, assuming you're working with time() numeric values - it's doing an alphanumeric sort (and on directory names too).
If that isn't a valid assumption, you'll need a custom sort - which is quite easy, you can feed sort a subroutine to sort by.
Either:
sort { my ($a1) = $a =~ /(\d+)/; my ($b1) = $b =~ /(\d+)/; $b1 <=> $a1 }
To extract the first 'string of digits' from the path. (note - also includes directories).
Or use the -M file test:
sort { -M $a <=> -M $b }
Which will read modification time from the file (technically -M is age in days).
You can remove the reverse if you custom sort, just by swapping $a and $b.
Though I think this would be better done all in perl, to answer your specific question about how to get event_* but not event_extra*, you could of course add that to your grep to filter out, or you could use a different glob, like $table_[0-9]* if there's always an _ then a digit after the table name.
In perl you could do it something like the following though:
opendir( DIR, '/var/lib/mysql/sfsnort/' );
my @files = sort grep { /^${table}_\d/ } readdir( DIR );   # braces stop perl reading $table_ as a variable name
closedir( DIR );
$files[$#files] =~ /(^[^.]+)/;
my $value = $1;
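For comparison, the "different glob" idea can be sketched in plain bash. This is only an illustration: a temporary directory with empty files named after the tables in the question stands in for /var/lib/mysql/sfsnort/, and the variable names are assumptions. It also assumes at least one file matches.

```shell
#!/usr/bin/env bash
table=event
dir=$(mktemp -d)   # stand-in for the real database directory
touch "$dir/event_1459161160_0.MYI" "$dir/event_1459182760_0.MYI" \
      "$dir/event_extra_data_1459182745_0.MYI" "$dir/event_extra_data_1459182760_0.MYI"

# Requiring a digit right after "<table>_" excludes event_extra_data_*.
# The shell returns glob matches in sorted order.
matches=( "$dir/${table}"_[0-9]*.MYI )
last=${matches[${#matches[@]}-1]}   # last match after sorting
last=${last##*/}                    # strip the directory part
last=${last%.MYI}                   # strip the extension
echo "$last"

rm -rf "$dir"
```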

Read JSON data in a shell script [duplicate]

In shell I have a requirement wherein I have to read the JSON response which is in the following format:
{ "Messages": [ { "Body": "172.16.1.42|/home/480/1234/5-12-2013/1234.toSort", "ReceiptHandle": "uUk89DYFzt1VAHtMW2iz0VSiDcGHY+H6WtTgcTSgBiFbpFUg5lythf+wQdWluzCoBziie8BiS2GFQVoRjQQfOx3R5jUASxDz7SmoCI5bNPJkWqU8ola+OYBIYNuCP1fYweKl1BOFUF+o2g7xLSIEkrdvLDAhYvHzfPb4QNgOSuN1JGG1GcZehvW3Q/9jq3vjYVIFz3Ho7blCUuWYhGFrpsBn5HWoRYE5VF5Bxc/zO6dPT0n4wRAd3hUEqF3WWeTMlWyTJp1KoMyX7Z8IXH4hKURGjdBQ0PwlSDF2cBYkBUA=", "MD5OfBody": "53e90dc3fa8afa3452c671080569642e", "MessageId": "e93e9238-f9f8-4bf4-bf5b-9a0cae8a0ebc" } ] }
Here I am only concerned with the "Body" property value. I made some unsuccessful attempts like:
jsawk -a 'return this.Body'
or
awk -v k="Body" '{n=split($0,a,","); for (i=1; i<=n; i++) print a[i]}'
But that did not suffice. Can anyone help me with this?
There is jq for parsing json on the command line:
jq -r '.Messages[].Body'
Visit this for jq: https://stedolan.github.io/jq/
tl;dr
$ cat /tmp/so.json | underscore select '.Messages .Body'
["172.16.1.42|/home/480/1234/5-12-2013/1234.toSort"]
Javascript CLI tools
You can use Javascript CLI tools like
underscore-cli:
json:select(): CSS-like selectors for JSON.
Example
Select all name children of addons:
underscore select ".addons > .name"
The underscore-cli documentation provides other real-world examples, as well as the json:select() doc.
Similarly using Bash regexp. Shall be able to snatch any key/value pair.
key="Body"
re="\"($key)\": \"([^\"]*)\""
while read -r l; do
if [[ $l =~ $re ]]; then
name="${BASH_REMATCH[1]}"
value="${BASH_REMATCH[2]}"
echo "$name=$value"
else
echo "No match"
fi
done
The regular expression can be tuned to match multiple spaces/tabs or newline(s). It won't work if the value contains an embedded ". This is only an illustration; it is better to use some "industrial" parser :)
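A hedged sketch of that tuning, applied to a shortened copy of the JSON from the question (the long ReceiptHandle and MD5OfBody fields are left out for brevity). The function name json_string_value is an assumption for illustration, and the same limitation applies: values containing an escaped " are not handled.

```shell
#!/usr/bin/env bash
# json_string_value (hypothetical helper): print the first string value
# stored under the given key, tolerating whitespace around the colon.
json_string_value() {
  local key=$1 json=$2
  local re="\"$key\"[[:space:]]*:[[:space:]]*\"([^\"]*)\""
  [[ $json =~ $re ]] && printf '%s\n' "${BASH_REMATCH[1]}"
}

json='{ "Messages": [ { "Body": "172.16.1.42|/home/480/1234/5-12-2013/1234.toSort", "MessageId": "e93e9238-f9f8-4bf4-bf5b-9a0cae8a0ebc" } ] }'
json_string_value Body "$json"
```

The function returns a non-zero status when the key is not found, so it can be used directly in an if condition.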
Here is a crude way to do it: Transform JSON into bash variables to eval them.
This only works for:
JSON which does not contain nested arrays, and
JSON from trustworthy sources (else it may confuse your shell script, perhaps it may even be able to harm your system, You have been warned)
Well, yes, it uses Perl to do this job, thanks to CPAN, but it is small enough to include directly in a script and hence is quick and easy to debug:
json2bash() {
perl -MJSON -0777 -n -E 'sub J {
my ($p,$v) = @_; my $r = ref $v;
if ($r eq "HASH") { J("${p}_$_", $v->{$_}) for keys %$v; }
elsif ($r eq "ARRAY") { $n = 0; J("$p"."[".$n++."]", $_) foreach @$v; }
else { $v =~ '"s/'/'\\\\''/g"'; $p =~ s/^([^[]*)\[([0-9]*)\](.+)$/$1$3\[$2\]/;
$p =~ tr/-/_/; $p =~ tr/A-Za-z0-9_[]//cd; say "$p='\''$v'\'';"; }
}; J("json", decode_json($_));'
}
use it like eval "$(json2bash <<<'{"a":["b","c"]}')"
Not heavily tested, though. Updates, warnings and more examples see my GIST.
Update
(Unfortunately, following is a link-only-solution, as the C code is far
too long to duplicate here.)
For all those, who do not like the above solution,
there now is a C program json2sh
which (hopefully safely) converts JSON into shell variables.
In contrast to the perl snippet, it is able to process any JSON,
as long as it is well formed.
Caveats:
json2sh was not tested much.
json2sh may create variables, which start with the shellshock pattern () {
I wrote json2sh to be able to post-process .bson with Shell:
bson2json()
{
printf '[';
{ bsondump "$1"; echo "\"END$?\""; } | sed '/^{/s/$/,/';
echo ']';
};
bsons2json()
{
printf '{';
c='';
for a;
do
printf '%s"%q":' "$c" "$a";
c=',';
bson2json "$a";
done;
echo '}';
};
bsons2json */*.bson | json2sh | ..
Explained:
bson2json dumps a .bson file such that the records become a JSON array
If everything works OK, an END0-Marker is applied, else you will see something like END1.
The END-Marker is needed, else empty .bson files would not show up.
bsons2json dumps a bunch of .bson files as an object, where the output of bson2json is indexed by the filename.
This is then post-processed by json2sh, so that you can use grep/source/eval/etc. as needed to bring the values into the shell.
This way you can quickly process the contents of a MongoDB dump on shell level, without need to import it into MongoDB first.