I am trying to insert a column in front of the first column in a comma-separated value (CSV) file. At first blush, awk seems to be the way to go, but I'm struggling with how to fill in different values for the new column as I move down the rows.
CSV File
A,B,C,D,E,F
1,2,3,4,5,6
2,3,4,5,6,7
3,4,5,6,7,8
4,5,6,7,8,9
Attempted Code
awk 'BEGIN{FS=OFS=","}{$1=$1 OFS (FNR<1 ? $1 "0\nA\n2\nC" : "col")}1'
Result
A,col,B,C,D,E,F
1,col,2,3,4,5,6
2,col,3,4,5,6,7
3,col,4,5,6,7,8
4,col,5,6,7,8,9
Expected Result
col,A,B,C,D,E,F
0,1,2,3,4,5,6
A,2,3,4,5,6,7
2,3,4,5,6,7,8
C,4,5,6,7,8,9
This can be easily done using paste + printf:
paste -d, <(printf "col\n0\nA\n2\nC\n") file
col,A,B,C,D,E,F
0,1,2,3,4,5,6
A,2,3,4,5,6,7
2,3,4,5,6,7,8
C,4,5,6,7,8,9
<(...) is process substitution available in bash. For other shells use a pipeline like this:
printf "col\n0\nA\n2\nC\n" | paste -d, - file
With awk only, you could try the following solution, written and tested with the shown samples.
awk -v value="$(echo -e "col\n0\nA\n2\nC")" '
BEGIN{
  FS=OFS=","
  split(value,arr,ORS)
}
{
  $1=arr[FNR] OFS $1
}
1
' Input_file
Explanation:
First of all, create an awk variable named value whose value is the output of the shell's echo command. NOTE: the -e option makes echo interpret \n as newlines rather than literal characters.
Then, in the BEGIN section of the awk program, set FS and OFS to , for all lines of Input_file.
Use the split function to break the value variable into an array named arr on the delimiter ORS (a newline), so arr[1] is col, arr[2] is 0, and so on.
In the main awk program, prepend the matching arr value and OFS to the first field; the trailing 1 then prints each modified line.
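If the new column's values contain no spaces, the same idea can be written more compactly; this is just a sketch of an alternative, splitting a space-separated list inside awk instead of passing it in with echo:
awk 'BEGIN{FS=OFS=","; split("col 0 A 2 C",arr," ")} {print arr[FNR], $0}' Input_file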
I want to select() an object based on a string containing a jq variable ($ARCH) passed in with jq's --arg option. Here's the use case, looking for "/bin/linux/$ARCH/kubeadm" from Google...
# You may need to install `xml2json`, i.e.:
#   sudo gem install --no-rdoc --no-ri xml2json
# and then run the script I wrote to do the xml2json conversion:
#!/usr/bin/ruby
# Written by Jim Conner
require 'xml2json'
xml = ARGV[0]
begin
if xml == '-'
xdata = ARGF.read.chomp
puts XML2JSON.parse(xdata)
else
puts XML2JSON.parse(File.read(xml).chomp)
end
rescue => e
$stderr.puts 'Unable to comply: %s' % [e.message]
end
Then run the following:
curl -sSL https://storage.googleapis.com/kubernetes-release/ | tee /var/tmp/k8s.xml | \
xml2json - | \
jq --arg ARCH amd64 '[.ListBucketResult.Contents[] | select(.Key | contains("/bin/linux/$ARCH/kubeadm"))]'
...which returns an empty set because jq doesn't expand variables inside quoted strings. I know I can get around this by using multiple select/contains() calls, but I'd prefer not to if possible.
jq simply may not do it, but if someone knows a way to do it, I'd much appreciate it.
jq does support string interpolation, and in your case the string would be:
"/bin/linux/\($ARCH)/kubeadm"
Notice that this is not a JSON string: the occurrence of "\(" signals that the string is subject to interpolation. Very nifty.
(Alternatively, you could of course use string concatenation:
"/bin/linux/" + $ARCH + "/kubeadm")
Btw, you might wish to avoid contains here. Its semantics are quite complex and perhaps counter-intuitive. Consider using startswith, index, or (for regex matches) test.
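Putting that together with the pipeline from the question, only the interpolated string changes (and you could swap contains for endswith or test here if preferred):
jq --arg ARCH amd64 '[.ListBucketResult.Contents[] | select(.Key | contains("/bin/linux/\($ARCH)/kubeadm"))]'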
I have a lot of JSON files whose structure looks like the one below:
{
key1: 'val1'
key2: {
'key21': 'someval1',
'key22': 'someval2',
'key23': 'someval3',
'date': '2018-07-31T01:30:30Z',
'key25': 'someval4'
}
key3: []
... some other objects
}
My goal is to get only those files whose date field falls within some period,
for example from 2018-05-20 to 2018-07-20.
I can't rely on the files' creation dates, because they were all generated on the same day.
Maybe it is possible using sed or a similar program?
Fortunately, the date in this format can be compared as a string. You only need something to parse the JSONs, e.g. Perl:
perl -l -0777 -MJSON::PP -ne '
$date = decode_json($_)->{key2}{date};
print $ARGV if $date gt "2018-07-01T00:00:00Z";
' *.json
-0777 makes perl slurp the whole files instead of reading them line by line
-l adds a newline to print
$ARGV contains the name of the currently processed file
See JSON::PP for details. If you have JSON::XS or Cpanel::JSON::XS, you can switch to them for faster processing.
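For example, a drop-in sketch with JSON::XS, which also exports decode_json by default:
perl -l -0777 -MJSON::XS -ne '
    $date = decode_json($_)->{key2}{date};
    print $ARGV if $date gt "2018-07-01T00:00:00Z";
' *.json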
I had to fix the input (replace ' by ", add commas, etc.) in order to make the parser happy.
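For reference, the rectified version of the shown sample would look something like this (double quotes, commas between members, the remaining objects omitted):
{
  "key1": "val1",
  "key2": {
    "key21": "someval1",
    "key22": "someval2",
    "key23": "someval3",
    "date": "2018-07-31T01:30:30Z",
    "key25": "someval4"
  },
  "key3": []
}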
If your files actually contain valid JSON, the task can be accomplished in a one-liner with jq, e.g.:
jq 'if .key2.date[0:10] | (. >= "2018-05-20" and . <= "2018-07-31") then input_filename else empty end' *.json
This is just an illustration. jq has date-handling functions for dealing with more complex requirements.
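For instance, a sketch using the fromdateiso8601 builtin to compare real timestamps rather than string prefixes, assuming the same layout as the sample:
jq 'if (.key2.date | fromdateiso8601) >= ("2018-05-20T00:00:00Z" | fromdateiso8601)
     and (.key2.date | fromdateiso8601) <= ("2018-07-20T00:00:00Z" | fromdateiso8601)
   then input_filename else empty end' *.json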
Handling quasi-JSON
If your files contain quasi-JSON, then you could use jq in conjunction with a JSON rectifier. If your sample is representative, then hjson
could be used, e.g.
for f in *.qjson
do
  hjson -j "$f" | jq --arg f "$f" '
    if .key2.date[0:7] == "2018-07" then $f else empty end'
done
Try like this:
Find an online converter (for example https://codebeautify.org/json-to-excel-converter#) and convert the JSON to CSV.
Open the CSV file with Excel.
Filter your data.
I am writing a bash script to use with badips.com
This command:
wget https://www.badips.com/get/key -qO -
Will return something like this:
{"err":"","suc":"new key 5f72253b673eb49fc64dd34439531b5cca05327f has been set.","key":"5f72253b673eb49fc64dd34439531b5cca05327f"}
Or like this:
{"err":"","suc":"Your Key was already present! To overwrite, see http:\/\/www.badips.com\/apidoc.","key":"5f72253b673eb49fc64dd34439531b5cca05327f"}
I need to parse the key value out (5f72253b673eb49fc64dd34439531b5cca05327f) into a variable in the script. I would prefer to use grep to do it but can't get it right.
Instead of parsing with some grep, you have the perfect tool for this: jq.
See:
jq '.key' file
or
.... your_commands .... | jq '.key'
will return
"5f72253b673eb49fc64dd34439531b5cca05327f"
Another example, this time getting the suc attribute:
$ cat a
{"err":"","suc":"new key 5f72253b673eb49fc64dd34439531b5cca05327f has been set.","key":"5f72253b673eb49fc64dd34439531b5cca05327f"}
{"err":"","suc":"Your Key was already present! To overwrite, see http:\/\/www.badips.com\/apidoc.","key":"5f72253b673eb49fc64dd34439531b5cca05327f"}
$ jq '.suc' a
"new key 5f72253b673eb49fc64dd34439531b5cca05327f has been set."
"Your Key was already present! To overwrite, see http://www.badips.com/apidoc."
You could try the below grep command:
grep -oP '"key":"\K[^"]*(?=")' file
Using perl :
wget https://www.badips.com/get/key -qO - |
perl -MJSON -MFile::Slurp=slurp -le '
my $s = slurp "/dev/stdin";
my $d = JSON->new->decode($s);
print $d->{key}
'
Not as robust as the preceding one, but it doesn't require installing new modules; a stock perl can do it:
wget https://www.badips.com/get/key -qO - |
perl -lne 'print $& if /"key":"\K[[:xdigit:]]+/'
awk keeps it simple
wget ... - | awk -F: '{split($NF,k,"\"");print k[2]}'
the field separator is :;
the key is always in the last field, which awk accesses as $NF (NF = number of fields);
the split function splits $NF into the array k, using the separator "\"", i.e. a single double-quote character;
the second element of the k array is what you want.
I am trying to retrieve only the last updated value of the variable date2, but it always gives 0 in the output. I think it is because I am reading it outside the block, but I only want the last value stored in date2. How can I fix this problem? Thanks for your help.
Here is my code:
count=0
date1=0
date2=0
mysql -uroot -proot -Dproject_ivr_db -rN --execute "SELECT FeeSubmissionDate FROM
meritlist_date wHERE Discipline='phd' AND AnnounceDate<=now() " | while read value
do
if [[ "$count" == 0 ]]
then
let "date2=$value"
let "count++"
else
let "date1=$value"
let "result=$date1-$date2"
if [[ "$result" -gt 0 ]]
then
let "date2=$date1"
fi
fi
done
echo"V,date2=$date2"
AWK can easily solve this issue. Here's an example of a simple script which lists all files in the current directory and returns the number of those files that have substring "test" as part of the filename.
#!/bin/bash
ls -1 |
awk 'BEGIN { count=0 } { if (index($0,"test") != 0) { count++ }} END { printf("Count=%s\n", count); }'
Now, if you replace ls -1 with your mysql command and get rid of the while loop by using AWK in an analogous way to the above, you should have no problem handling the lines of interest.
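For your specific case, a minimal sketch along the same lines (assuming the dates returned by the query compare numerically, as your let arithmetic implies), keeping the largest value just like your date2 logic does:
mysql -uroot -proot -Dproject_ivr_db -rN --execute \
  "SELECT FeeSubmissionDate FROM meritlist_date WHERE Discipline='phd' AND AnnounceDate<=now()" |
awk 'NR==1 || $1 > max { max=$1 } END { printf("V,date2=%s\n", max) }'
Because awk accumulates everything in a single process, nothing is lost the way it is when a pipeline runs your while read loop in a subshell.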