Using AWK to add DATE to last column of CSV


As a total bash newbie, I'm struggling to construct an AWK statement that prints the output as a DATE. Here is what I've been trying. Any ideas on how to make the $6 = $date?
cat file.json |
jq -r '.pagedEntities._embedded.teamActivityList |
[.[].teamName, .[].rank, .[].average, .[].total] |
@csv' |
awk -F"," 'BEGIN { OFS = "," } ; {$6=$(date) OFS $6; print}'

I know you asked for an Awk command, but since you're already using jq to generate the CSV file, you might as well do it there:
cat file.json |
jq --arg date "$(date)" -r '
.pagedEntities._embedded.teamActivityList |
[.[].teamName, .[].rank, .[].average, .[].total, $date] |
@csv'
This also saves you from the pitfalls of using tools that don't understand the language they're dealing with, as is the case with Awk and CSV; as an example, your script will break if any of the CSV entries is quoted and has a comma in it.

If in this command...
awk -F"," 'BEGIN { OFS = "," } ; {$6=$(date) OFS $6; print}'
...you are trying to assign to $6 the output of the date command, that won't work. $(command) is Bourne shell syntax and won't work in awk. The easiest way to do what you want is probably:
awk -v date="$(date)" -F"," 'BEGIN { OFS = "," } ; {$6=date; print}'
This assigns the output of date to the awk variable date, which can then be used in your awk script.
If that's not what you're trying to do, please update your question to show both some sample input as well as an example of what you would like your output to look like.
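If the goal is simply to append the date as a new last column rather than overwrite $6, a minimal sketch could look like the following. The function name append_date_column and the YYYY-MM-DD format are assumptions for illustration, not something from the question, and the sketch assumes simple CSV with no quoted commas.

```shell
# Hypothetical helper: append one value as a new last column of simple CSV.
# The value is passed in as $1 and handed to awk with -v.
append_date_column() {
  awk -v date="$1" 'BEGIN { FS = OFS = "," } { print $0, date }'
}

# Example run on made-up rows; the date is computed once by the shell.
printf 'teamA,1,9.5,95\nteamB,2,8.0,80\n' | append_date_column "$(date +%Y-%m-%d)"
```

Passing the date as an argument keeps the awk program itself free of shell expansions.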

Here is a solution which builds on Santiago's approach but uses jq's now and strftime functions instead of using the unix date command.
jq -r '
(now|strftime("%c")) as $date
| .pagedEntities._embedded.teamActivityList
| [.[].teamName, .[].rank, .[].average, .[].total, $date]
| @csv
' file.json

Related

How to print only those numbers of a column which are greater than certain number in bash [duplicate]

I found some ways to pass external shell variables to an awk script, but I'm confused about ' and ".
First, I tried with a shell script:
$ v=123test
$ echo $v
123test
$ echo "$v"
123test
Then tried awk:
$ awk 'BEGIN{print "'$v'"}'
$ 123test
$ awk 'BEGIN{print '"$v"'}'
$ 123
Why the difference?
Lastly I tried this:
$ awk 'BEGIN{print " '$v' "}'
$ 123test
$ awk 'BEGIN{print ' "$v" '}'
awk: cmd. line:1: BEGIN{print
awk: cmd. line:1: ^ unexpected newline or end of string
I'm confused about this.

Getting shell variables into awk
There are several ways to do this, and some are better than others. This should cover most of them.
Using -v (The best way, most portable)
Use the -v option: (P.S. use a space after -v or it will be less portable. E.g., awk -v var= not awk -vvar=)
variable="line one\nline two"
awk -v var="$variable" 'BEGIN {print var}'
line one
line two
This should be compatible with most awk implementations, and the variable is available in the BEGIN block as well.
If you have multiple variables:
awk -v a="$var1" -v b="$var2" 'BEGIN {print a,b}'
Warning: as Ed Morton writes, escape sequences will be interpreted, so \t becomes a real tab rather than a literal \t, which matters if that is what you are searching for. This can be avoided by using ENVIRON[] or by accessing the value via ARGV[].
P.S. If your field separator contains a vertical bar or other regexp metacharacters such as |?( etc., they must be double-escaped. For example, three vertical bars ||| become -F'\\|\\|\\|'. You can also use -F"[|][|][|]".
Example of getting data from a program or function into awk (here date is used):
awk -v time="$(date +"%F %H:%M" -d '-1 minute')" 'BEGIN {print time}'
Example of testing the contents of a shell variable as a regexp:
awk -v var="$variable" '$0 ~ var{print "found it"}'
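The escape-sequence behaviour of -v can be seen by comparing it with ENVIRON on the same value. This is only a sketch; the value and the function names via_v and via_environ are made up for illustration.

```shell
# Hypothetical value containing a literal backslash followed by "t".
value='a\tb'

# -v processes escape sequences: awk ends up with a real tab between a and b.
via_v() { awk -v var="$value" 'BEGIN { print var }'; }

# ENVIRON hands the string over unchanged: the backslash and the t survive.
via_environ() { var="$value" awk 'BEGIN { print ENVIRON["var"] }'; }

via_v
via_environ
```

The one-shot assignment var="$value" in front of awk places the variable in awk's environment only, without exporting it in the calling shell.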
Variable after code block
Here we get the variable after the awk code. This will work fine as long as you do not need the variable in the BEGIN block:
variable="line one\nline two"
echo "input data" | awk '{print var}' var="${variable}"
or
awk '{print var}' var="${variable}" file
Adding multiple variables:
awk '{print a,b,$0}' a="$var1" b="$var2" file
In this way we can also set different Field Separator FS for each file.
awk 'some code' FS=',' file1.txt FS=';' file2.ext
Variable after the code block will not work for the BEGIN block:
echo "input data" | awk 'BEGIN {print var}' var="${variable}"
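A small sketch makes the timing visible: the assignment given after the program is processed only after BEGIN has run, so BEGIN sees an empty value while the main block sees the assigned one. The function name show_timing and the value hello are illustrative only.

```shell
# Print the variable from both the BEGIN block and the main block.
show_timing() {
  echo "input data" |
    awk 'BEGIN { print "BEGIN sees: [" var "]" }
         { print "main sees: [" var "]" }' var="hello"
}

show_timing
# BEGIN sees: []
# main sees: [hello]
```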
Here-string
Variable can also be added to awk using a here-string from shells that support them (including Bash):
awk '{print $0}' <<< "$variable"
test
This is the same as:
printf '%s' "$variable" | awk '{print $0}'
P.S. this treats the variable as a file input.
ENVIRON input
As TrueY writes, you can use the ENVIRON to print Environment Variables.
If you export a variable before running awk, you can print it like this:
export X=MyVar
awk 'BEGIN{print ENVIRON["X"],ENVIRON["SHELL"]}'
MyVar /bin/bash
ARGV input
As Steven Penny writes, you can use ARGV to get the data into awk:
v="my data"
awk 'BEGIN {print ARGV[1]}' "$v"
my data
To get the data into the code itself, not just the BEGIN:
v="my data"
echo "test" | awk 'BEGIN{var=ARGV[1];ARGV[1]=""} {print var, $0}' "$v"
my data test
Variable within the code: USE WITH CAUTION
You can use a variable within the awk code, but it's messy and hard to read, and as Charles Duffy points out, this version may also be a victim of code injection. If someone adds bad stuff to the variable, it will be executed as part of the awk code.
This works by extracting the variable within the code, so it becomes a part of it.
If you want to make an awk that changes dynamically with use of variables, you can do it this way, but DO NOT use it for normal variables.
variable="line one\nline two"
awk 'BEGIN {print "'"$variable"'"}'
line one
line two
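To see what awk actually receives with this style of quoting, it can help to print the arguments the shell builds. The sketch below uses the v=123test value from the question; the show_args helper is hypothetical.

```shell
v=123test

# Hypothetical helper: print each argument exactly as a command would receive it.
show_args() { printf '<%s>\n' "$@"; }

# The variable lands inside awk's double quotes: a string literal.
show_args 'BEGIN{print "'$v'"}'     # <BEGIN{print "123test"}>

# The variable lands outside any awk quotes: the number 123 followed by
# an unset awk variable named test, so awk prints just 123.
show_args 'BEGIN{print '"$v"'}'     # <BEGIN{print 123test}>

# Spaces between the quoted pieces split the program into three arguments,
# so awk is handed the incomplete program 'BEGIN{print ' and reports an error.
show_args 'BEGIN{print ' "$v" '}'
```

This is why the placement of each quote character matters: the shell joins adjacent quoted pieces into one argument, and awk then parses whatever text results.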
Here is an example of code injection:
variable='line one\nline two" ; for (i=1;i<=1000;++i) print i"'
awk 'BEGIN {print "'"$variable"'"}'
line one
line two
1
2
3
.
.
1000
You can inject many commands into awk this way, or even make it crash with invalid ones.
One valid use of this approach, though, is when you want to pass a symbol to awk to be applied to some input, e.g. a simple calculator:
$ calc() { awk -v x="$1" -v z="$3" 'BEGIN{ print x '"$2"' z }'; }
$ calc 2.7 '+' 3.4
6.1
$ calc 2.7 '*' 3.4
9.18
There is no way to do that using an awk variable populated with the value of a shell variable, you NEED the shell variable to expand to become part of the text of the awk script before awk interprets it. (see comment below by Ed M.)
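If the operator comes from untrusted input, one possible hardening is to check it against a fixed list before it is spliced into the program. This is a sketch under that assumption; the name safe_calc and the set of allowed operators are illustrative choices, not part of the original answer.

```shell
# Hypothetical variant of calc that only accepts a fixed set of operators.
safe_calc() {
  case "$2" in
    '+'|'-'|'*'|'/') ;;   # allowed: continue
    *) echo "unsupported operator: $2" >&2; return 1 ;;
  esac
  # The operands still travel as data via -v; only the vetted operator
  # becomes part of the awk program text.
  awk -v x="$1" -v z="$3" 'BEGIN { print x '"$2"' z }'
}

safe_calc 2.7 '+' 3.4
safe_calc 2.7 '*' 3.4
```

Anything outside the list is rejected before awk ever sees it, which closes the injection route while keeping the dynamic operator.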
Extra info:
Use of double quote
It's always good practice to double-quote a variable: "$variable"
If you don't, a multi-line value will be collapsed into a single long line.
Example:
var="Line one
This is line two"
echo $var
Line one This is line two
echo "$var"
Line one
This is line two
Other errors you can get without double quote:
variable="line one\nline two"
awk -v var=$variable 'BEGIN {print var}'
awk: cmd. line:1: one\nline
awk: cmd. line:1: ^ backslash not last character on line
awk: cmd. line:1: one\nline
awk: cmd. line:1: ^ syntax error
And with single quote, it does not expand the value of the variable:
awk -v var='$variable' 'BEGIN {print var}'
$variable
More info about AWK and variables
Read this faq.

It seems that the good-old ENVIRON awk built-in hash is not mentioned at all. An example of its usage:
$ X=Solaris awk 'BEGIN{print ENVIRON["X"], ENVIRON["TERM"]}'
Solaris rxvt

You could pass the command-line option -v with an awk variable name (v) set to the value of the shell variable ("${v}"):
% awk -vv="${v}" 'BEGIN { print v }'
123test
Or to make it clearer (with far fewer vs):
% environment_variable=123test
% awk -vawk_variable="${environment_variable}" 'BEGIN { print awk_variable }'
123test

You can utilize ARGV:
v=123test
awk 'BEGIN {print ARGV[1]}' "$v"
Note that if you are going to continue into the body, you will need to adjust ARGC:
awk 'BEGIN {ARGC--} {print ARGV[2], $0}' file "$v"

I just adapted @Jotne's answer for use in a for loop.
for i in `seq 11 20`; do host myserver-$i | awk -v i="$i" '{print "myserver-"i" " $4}'; done

I had to insert date at the beginning of the lines of a log file and it's done like below:
DATE=$(date +"%Y-%m-%d")
awk '{ print "'"$DATE"'", $0; }' /path_to_log_file/log_file.log
The output can be redirected to another file to save it.

Pro Tip
It could come in handy to create a function that handles this so you don't have to type everything every time. Using the selected solution we get...
awk_switch_columns() {
cat < /dev/stdin | awk -v a="$1" -v b="$2" " { t = \$a; \$a = \$b; \$b = t; print; } "
}
And use it as...
echo 'a b c d' | awk_switch_columns 2 4
Output:
a d c b
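Because the column numbers are already passed with -v, the awk program can also be single-quoted, which removes the need to escape every $ and makes the cat unnecessary. A minimal sketch of the same function written that way:

```shell
# Swap two columns; a and b are column numbers supplied via -v.
# awk reads standard input directly, so no cat is needed.
awk_switch_columns() {
  awk -v a="$1" -v b="$2" '{ t = $a; $a = $b; $b = t; print }'
}

echo 'a b c d' | awk_switch_columns 2 4
# a d c b
```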

Print Rows if End of Field Matches a String in AWK

I have a CSV file and I am trying to print rows using awk if a certain field ends with a specific string. For example, I have the CSV file below:
col1,col2,col3
1,abcd,.abcd_efg
2,efgh,.abcd
3,ijkl,.abcd_mno
4,mnop,.abcd
5,qrst,.abcd_uvw
This is the result I am after:
2,efgh,.abcd
4,mnop,.abcd
But I am getting a different result. This is the awk command I am using:
cat file.csv | awk -F"," '{if ($3 ~ ".abcd" ) print $0}'
and this is the result I am getting:
1,abcd,.abcd_efg
2,efgh,.abcd
3,ijkl,.abcd_mno
4,mnop,.abcd
5,qrst,.abcd_uvw
I even tried the command below, but it returned no matches:
cat file.csv | awk -F"," '{if ($3 ~ ".abcd$" ) print $0}'
Any clue what the issue might be? Am I using the wrong expression to get this result?
EDIT: This is another command I tried, based on Kent's solution, but it didn't work:
cat file.csv | awk -F"," '$3 ~ "[.]abcd"'

First of all, the cat in cat file | awk ... is unnecessary; just use awk ... file.
Your input text does not contain a single comma, so why did you set FS=","?
If you want an exact string comparison, use $3 == "whatever" instead of $3 ~ /regex/.
So your code could be changed to:
awk '$3 == ".abcd"' file
If you really love regex and want to do it with a regex match:
awk '$3 ~ "[.]abcd$"' file
or
awk '$3 ~ /^[.]abcd$/' file
depending on what you require.

You may modify your awk command as follows:
$ cat file.csv | awk '$3 ~ /\.abcd$/ {print $0}'
2 efgh .abcd
4 mnop .abcd
Brief explanation:
$3 ~ /\.abcd$/: if $3 matches the regex \.abcd$, print $0
Given your modified question, you may change the awk command to:
cat file.csv | awk -F, '$3 ~ /\.abcd$/ {print $0}'
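The approaches above can be tried against the sample data from the question. The second function covers one possible reason an anchored pattern finds nothing: if the file has Windows (CRLF) line endings, the last field ends in a carriage return. That cause is only an assumption, since the question does not say how the file was produced, and the function names are illustrative.

```shell
# Sample rows from the question.
sample_csv() {
  printf '%s\n' 'col1,col2,col3' '1,abcd,.abcd_efg' '2,efgh,.abcd' \
    '3,ijkl,.abcd_mno' '4,mnop,.abcd' '5,qrst,.abcd_uvw'
}

# Exact string comparison on the third field.
exact_match() { awk -F, '$3 == ".abcd"'; }

# Same comparison, but strip a trailing carriage return first
# (assumed CRLF input); modifying $0 re-splits the fields.
exact_match_crlf() { awk -F, '{ sub(/\r$/, "") } $3 == ".abcd"'; }

sample_csv | exact_match
# Simulate CRLF input by appending \r to every line.
sample_csv | awk '{ printf "%s\r\n", $0 }' | exact_match_crlf
```

Both pipelines print the two rows whose third field is exactly .abcd.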

How do I print line number with the output for JQ in json file

I know there is a -n option and have tried many combinations, but couldn't get it to work. I'd like to print the line number and the length of each line of the JSON file.
cat -n traffictest.json | jq '. |length'
jq -C . | cat -n traffictest.json | jq '. |length'

jq has a built-in filter, input_line_number, which emits the line number of the input being read. For example, given this input:
[1,2]
"abcd"
{"a":1,"b":1,"c":1,"d":1, "e":1}
the invocation:
jq -r '"\(input_line_number): \(length)"'
yields:
1: 2
3: 4
5: 5

If you are just interested in line-number and length, I would use awk instead, e.g.:
awk '{ print NR, length, $0 }' traffictest.json
Or if you want to keep the syntax highlighting:
paste <(jq . traffictest.json | awk '{ print NR, length }' OFS='\t') \
<(jq -C . traffictest.json)

Convert ps aux to json

I am trying to convert the output of ps aux into JSON format without using Perl or Python. For this I have read about jq, but I have had no success converting the command-line output into JSON.
How can I convert a simple ps aux to JSON?

ps aux | awk '
BEGIN { ORS = ""; print " [ "}
{ printf "%s{\"user\": \"%s\", \"pid\": \"%s\", \"cpu\": \"%s\"}",
separator, $1, $2, $3
separator = ", "
}
END { print " ] " }';
Just adjust columns which you need from ps aux output.
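A variation on the same idea, wrapped in a function so it can be tried on fixed input: it skips the ps header line and emits pid and cpu as JSON numbers. The function name ps_rows_to_json is hypothetical, and the sketch assumes the selected fields contain no quotes or backslashes and that %CPU uses a dot as the decimal separator.

```shell
# Hypothetical helper: turn "USER PID %CPU ..." rows into a JSON array.
ps_rows_to_json() {
  awk '
    BEGIN { ORS = ""; print "[" }
    NR > 1 {                          # skip the header line
      printf "%s{\"user\": \"%s\", \"pid\": %d, \"cpu\": %s}", sep, $1, $2, $3
      sep = ", "                      # separator before every later object
    }
    END { print "]\n" }'
}

# Intended use would be: ps aux | ps_rows_to_json
# Shown here on made-up rows so the output is predictable.
printf 'USER PID %%CPU\nroot 1 0.0\nbob 42 1.5\n' | ps_rows_to_json
# [{"user": "root", "pid": 1, "cpu": 0.0}, {"user": "bob", "pid": 42, "cpu": 1.5}]
```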

jq can read non-JSON input. You'll want to pre-process the input with awk first:
ps aux |
awk -v OFS=, '{print $1, $2}' |
jq -R 'split(",") | {user: .[0], pid: .[1]}'
If you want an array instead of a sequence of objects, pipe the output through jq --slurp . (which collects all inputs into a single array). (I swear there's a way to do that without an extra call to jq, but it escapes me at the moment.)

Here's an only-jq solution based on tokenization.
Tokenization can be done using:
def tokens:
def trim: sub("^ +";"") | sub(" +$";"");
trim | splits(" +");
For illustration and brevity, let's consider only the first 9 tokens:
[tokens] | .[0:9]
Invocation:
$ ps aux | jq -c -R -f tokens.jq
Or as a one-liner, you could get away with:
$ ps aux | jq -cR '[splits(" +")] | .[0:9]'
First few lines of output:
["USER","PID","%CPU","%MEM","VSZ","RSS","TT","STAT","STARTED"]
["p","1595","55.9","0.4","2593756","32832","??","R","24Jan17"]
["p","12472","26.6","12.6","4951848","1058864","??","R","Sat01AM"]
["p","13239","10.9","1.5","4073756","128324","??","R","Sun12AM"]
["p","12482","7.8","1.2","3876628","101736","??","R","Sat01AM"]
["p","32039","7.7","1.4","4786968","118424","??","R","12Feb17"]
["_windowserver","425","7.6","0.8","3445536","65052","??","Ss","24Jan17"]
Using the headers as object keys
See e.g.
https://github.com/stedolan/jq/wiki/Cookbook#convert-a-csv-file-with-headers-to-json

I have a gist that converts ps output to JSON. It uses jq under the covers, so you need to install that, but you do not need to know jq.

Dumps the fields specified by the -o flag as an array of PID objects:
ps ax -o "stat,euid,ruid,tty,tpgid,sess,pgrp,ppid,pid,wchan,sz,pcpu,command" \
| jq -sRr ' sub("\n$";"") | split("\n") | ([.[0]|splits(" +")]) as $header | .[1:] | [.[] | [. as $x | range($header|length) | {"key": $header[.], "value": (if .==($header|length-1) then ([$x|splits(" +")][.:]|join(" ")|tojson|.[1:length-1]) else ([$x|splits(" +")][.]) end) } ] | from_entries]'
This builds an array of the header fields, maps an array of {key, value} objects per output object, and then uses the built-in from_entries filter to object-aggregate these into the outputs.

extract 2 values from JSON object and use as variables in loop using jq and bash

I am new to jq. I am trying to write a simple script that loops through a JSON file, gets two values within each object and assigns them to two separate variables I can use with a curl REST call. I see both values as output when I echo $i but how can I get value and addr as separate variables?
for i in `cat /Users/egraham/Downloads/test2 | jq .[] | jq ."value,.addr"`; do

You can do this:
jq -rc '.populator.value + " " + .populator.addr' file.json |
while read -r value addr; do
echo do something with "$value" and "$addr"
done
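If the values may contain spaces, one option is to have jq emit tab-separated rows and split only on tabs in the loop. The sketch below uses a stand-in function in place of the jq call so it runs on its own; the jq filter in the comment, the function names and the sample values are all assumptions, since the question does not show the JSON.

```shell
# Stand-in for something like:
#   jq -r '.[] | [.value, .addr] | @tsv' file.json   (assumed input shape)
emit_rows() { printf 'first value\t10.0.0.1\nsecond value\t10.0.0.2\n'; }

tab=$(printf '\t')

# Split each row on tabs only, so spaces inside a value are preserved.
process_rows() {
  emit_rows | while IFS="$tab" read -r value addr; do
    printf 'value=[%s] addr=[%s]\n' "$value" "$addr"
  done
}

process_rows
# value=[first value] addr=[10.0.0.1]
# value=[second value] addr=[10.0.0.2]
```

Inside the loop, "$value" and "$addr" can be passed to curl or any other command.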

If spaces or tabs or other special characters make using 'read -r' problematic, and if your shell has "readarray", then it could be used:
$ readarray -t v < <(jq -rc '.populator | (.value,.addr)' file.json)
The values would then be available as ${v[0]} and ${v[1]}
This approach is especially useful if there are more than two values of interest, or if the number of values is variable or not known beforehand.
If your shell does not have readarray, then you can still use the array-oriented approach, e.g. along the lines of:
i=-1; while read -r a ; do i=$((i+1)); v[$i]="$a" ; done

First:
for i in $(jq '.[] | .value' /Users/egraham/Downloads/test2); do echo "$i"; done
Second:
for i in $(jq '.[] | .addr' /Users/egraham/Downloads/test2); do echo "$i"; done
I don't know any way to get it without running the commands separately. I don't know AWK, but maybe it's something worth considering.
