BATCH: grep equivalent - mysql

I need some help with the equivalent code for grep -v Wildcard and grep -o in a batch file.
This is my code in shell.
result=`mysqlshow --user=$dbUser --password=$dbPass sample | grep -v Wildcard | grep -o sample`

The batch equivalent of grep (not counting third-party tools like GnuWin32 grep) is findstr.
grep -v finds lines that don't match the pattern. The findstr version of this is findstr /V.
grep -o shows only the part of the line that matches the pattern. Unfortunately, findstr has no equivalent, but you can run the command and then check the exit code along the lines of
if %errorlevel% equ 0 echo sample
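Putting the pieces together, a minimal batch sketch of the shell pipeline above (the mysqlshow call and the dbUser/dbPass variable names are carried over from the question and assumed to be set the same way on Windows):
@echo off
rem Exclude lines containing "Wildcard", then look for "sample"; discard the output.
mysqlshow --user=%dbUser% --password=%dbPass% sample | findstr /V "Wildcard" | findstr "sample" >nul
rem findstr sets errorlevel 0 on a match, so emulate grep -o's output by hand.
if %errorlevel% equ 0 set "result=sample"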

Related

Grepping a word buried in a <p> on a website

I am having trouble grepping a word on a website. This is the command I'm using
wget -q http://bcbioinformaticsgrad.ca/our-faculty/james-piret/ | grep 'medical'
which is returning nothing, when it should be returning
[name of the website]:Many recent developments in biological and medical
...
The overall goal of what I'm trying to do is to find a certain word within all the links on the website.
My script is written like this:
#!/bin/bash
#$1 is the parent website
#This pipeline obtains all the links located on a website
wget -qO- $1 | grep -Eoi '<a [^>]+>' | grep -Eo 'href="[^\"]+"' | cut -c 7- | rev | cut -c 2- | rev > .linksLocated
#$2 is the word being looked for
#This loop goes through every link and tries to locate a word
while IFS='' read -r line || [[ -n "$line" ]]; do
wget -q $line | grep "$2"
done < .linksLocated
#rm .linksLocated
Wget doesn't write the downloaded page to standard output by default, so grep has nothing to read; the -q flag only suppresses wget's own status messages, it doesn't redirect the download.
Add -O - to print the page to stdout:
wget -q http://bcbioinformaticsgrad.ca/our-faculty/james-piret/ -O - | grep 'medical'
I see you used it with the first wget in your script, so just add it to the second one, too.
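For instance, the loop would then read like this (a minimal sketch, keeping the .linksLocated file and the $2 word from the question):
while IFS='' read -r line || [[ -n "$line" ]]; do
wget -qO- "$line" | grep "$2"
done < .linksLocated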
It's also possible to use curl, which does that by default, without any parameters:
curl http://bcbioinformaticsgrad.ca/our-faculty/james-piret/ | grep 'medical'
Edit: this tool is super useful when you actually need to select certain HTML elements in the downloaded page; it might suit some use cases better than grep: https://github.com/ericchiang/pup

Extract href of a specific anchor text in bash

I am trying to get the href of the most recent production release from the Exiftool page.
curl -s 'http://www.sno.phy.queensu.ca/~phil/exiftool/history.html' | grep -o -E "href=[\"'](.*)[\"'].*Version"
Actual output
href="Image-ExifTool-10.36.tar.gz">Version
I want this as the output:
Image-ExifTool-10.36.tar.gz
Using grep -P you can use a lookahead and \K for match reset:
curl -s 'http://www.sno.phy.queensu.ca/~phil/exiftool/history.html' |
grep -o -P "href=[\"']\K[^'\"]+(?=[\"']>Version)"
Image-ExifTool-10.36.tar.gz
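To see what the pattern does, here is a quick test on a hypothetical sample line (not fetched from the live page): \K discards the href=" prefix from the reported match, and the (?=...) lookahead requires ">Version to follow without including it in the output.
echo 'href="Image-ExifTool-10.36.tar.gz">Version 10.36' |
grep -o -P "href=[\"']\K[^'\"]+(?=[\"']>Version)"
Image-ExifTool-10.36.tar.gz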

Outputting data from 5gb file with awk

I have a csv file with approximately 300 columns.
I'm using awk to create a subset of this file where the 24th column is "CA".
Here's what I am trying:
awk -F "," '{if($24~/CA/)print}' myfile.csv > subset.csv
After approximately 10 minutes the subset file had grown to 400 MB, and then I killed it because it was too slow.
How can I speed this up? Perhaps a combination of sed / awk?
tl;dr:
awk implementations can significantly differ in performance.
In this particular case, see if using gawk (GNU awk) helps.
Ubuntu comes with mawk as the default awk, which is usually considered faster than gawk. However, in the case at hand it seems that gawk is significantly faster (perhaps related to line length?), at least based on the following simplified tests, which I ran in a VM on Ubuntu 14.04 on a 1-GB file with 300 columns of length 2.
The tests also include an equivalent sed and grep command.
Hopefully they provide at least a sense of comparative performance.
Test script:
#!/bin/bash
# Pass in test file
f=$1
# Suppress stdout
exec 1>/dev/null
awkProg='$24=="CA"'
echo $'\n\n\t'" $(mawk -W version 2>&1 | head -1)" >&2
time mawk -F, "$awkProg" "$f"
echo $'\n\n\t'" $(gawk --version 2>&1 | head -1)" >&2
time gawk -F, "$awkProg" "$f"
sedProg='/^([^,]+,){23}CA,/p'
echo $'\n\n\t'" $(sed --version 2>&1 | head -1)" >&2
time sed -En "$sedProg" "$f"
grepProg='^([^,]+,){23}CA,'
echo $'\n\n\t'" $(grep --version 2>&1 | head -1)" >&2
time grep -E "$grepProg" "$f"
Results:
mawk 1.3.3 Nov 1996, Copyright (C) Michael D. Brennan
real 0m11.341s
user 0m4.780s
sys 0m6.464s
GNU Awk 4.0.1
real 0m3.560s
user 0m0.788s
sys 0m2.716s
sed (GNU sed) 4.2.2
real 0m9.579s
user 0m4.016s
sys 0m5.504s
grep (GNU grep) 2.16
real 0m50.009s
user 0m42.040s
sys 0m7.896s
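For anyone who wants to reproduce a comparable setup, here is a rough sketch for generating a test file; the exact makeup of the original 1-GB file isn't stated beyond "300 columns of length 2", so the field values, the match rate, and the line count below are assumptions:
# ~1.2 million lines x 300 two-character fields (~900 bytes/line) comes to roughly 1 GB.
awk 'BEGIN {
  OFS = ","
  for (i = 1; i <= 1200000; i++) {
    # Put "CA" in column 24 on every other line so the filters have something to match.
    for (j = 1; j <= 300; j++) $j = (j == 24 && i % 2 == 0 ? "CA" : "xx")
    print
  }
}' > testfile.csv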

Printing column separated by comma using Awk command line

I have a problem here. I have to print a column from a text file using awk. However, the columns are not separated by spaces but by a single comma. It looks something like this:
column1,column2,column3,column4,column5,column6
How would I print out 3rd column using awk?
Try:
awk -F',' '{print $3}' myfile.txt
Here, -F tells awk to use , as the field separator.
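For example, on the sample line from the question:
echo 'column1,column2,column3,column4,column5,column6' | awk -F',' '{print $3}'
column3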
If your only requirement is to print the third field of every line, with each field delimited by a comma, you can use cut:
cut -d, -f3 file
-d, sets the delimiter to a comma
-f3 specifies that only the third field is to be printed
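The same sample line gives:
echo 'column1,column2,column3,column4,column5,column6' | cut -d, -f3
column3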
Try this awk
awk -F, '{$0=$3}1' file
column3
-F,   Divide fields by ,
$0=$3 Set the line to only field 3
1     Print the line out
This could also be used:
awk -F, '{print $3}' file
A simple, although awk-less solution in bash:
while IFS=, read -r a a a b; do echo "$a"; done <inputfile
It works faster than awk for small files (<100 lines) as it uses fewer resources (it avoids the expensive fork and execve system calls).
EDIT from Ed Morton (sorry for hijacking the answer, I don't know if there's a better way to address this):
To put to rest the myth that shell will run faster than awk for small files:
$ wc -l file
99 file
$ time while IFS=, read -r a a a b; do echo "$a"; done <file >/dev/null
real 0m0.016s
user 0m0.000s
sys 0m0.015s
$ time awk -F, '{print $3}' file >/dev/null
real 0m0.016s
user 0m0.000s
sys 0m0.015s
I expect that if you get a REALLY small enough file then you will see the shell script run a fraction of a blink of an eye faster than the awk script, but who cares?
And if you don't believe that it's harder to write robust shell scripts than awk scripts, look at this bug in the shell script you posted:
$ cat file
a,b,-e,d
$ cut -d, -f3 file
-e
$ awk -F, '{print $3}' file
-e
$ while IFS=, read -r a a a b; do echo "$a"; done <file
$
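For what it's worth, the blank output in the last case comes from echo treating -e as an option; a sketch of a workaround is to print with printf instead (which makes the shell version longer still):
$ while IFS=, read -r a a a b; do printf '%s\n' "$a"; done <file
-e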

Output of shell script in JSON form

I want to get the output of the following small shell script in JSON form.
#!/bin/bash
top -b -d1 -n1 | grep Cpu
Output:
Cpu(s): 6.2%us, 1.6%sy, 0.2%ni, 90.9%id, 1.1%wa, 0.0%hi, 0.0%si, 0.0%st
Required Output:
{"Cpu": "6.3" }
How can I convert the output of every such shell script into JSON form?
You could try this
echo "{\"Cpu\":\"`top -b -d1 -n1 | grep Cpu | cut -f3 -d " " | cut -f1 -d %`\"}"
A brief description: first, take a look at man cut, especially the -f and -d arguments. The \"s are double quotes escaped with a backslash so the shell doesn't treat them as the end of the string, and anything enclosed in back quotation marks `` is executed as a command and replaced by its output.
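The same command written with $() command substitution, which nests and quotes more cleanly than backquotes (still assuming the top output format shown in the question):
echo "{\"Cpu\":\"$(top -b -d1 -n1 | grep Cpu | cut -f3 -d ' ' | cut -f1 -d %)\"}"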
try this line:
your commands ...|awk 'BEGIN{FS="\\(s\\):\\s*";OFS="";q="\x22" }{$1=q$1q;sub(/%.*$/,"%",$2);$2=q$2q; print $1,$2}'
test with your data:
kent$ echo "Cpu(s): 6.2%us, 1.6%sy, 0.2%ni, 90.9%id, 1.1%wa, 0.0%hi, 0.0%si, 0.0%st"|awk 'BEGIN{FS="\\(s\\):\\s*";OFS="";q="\x22" }{$1=q$1q;sub(/%.*$/,"%",$2);$2=q$2q; print $1,$2}'
"Cpu""6.2%"