system information to workspace - octave

How can I capture, in a variable, the output of a system command? For example,
a = system("cat /proc/cpuinfo | grep 'model name' | uniq")
only returns a = 0, and
b = system("xrandr | grep 'current'")
prints
Screen 0: minimum 320 x 200, current 1366 x 768, maximum 16384 x 16384
ans = 0

As documented in Octave, you can get the command's output in a variable if you request it as the second output argument.
Example: with
[s, r] = system("ls")
you get in r the expected list of file names.
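For comparison, the same split between exit status and captured output exists in the shell itself: $? plays the role of Octave's s, and command substitution plays the role of r. A minimal sketch (the echoed text is just a made-up stand-in for real command output):

```shell
# $? holds the exit status (Octave's s);
# $(...) captures the command's stdout (Octave's r).
r=$(echo "model name : ExampleCPU")   # stand-in for a real command
s=$?
echo "s=$s"    # s=0
echo "r=$r"    # r=model name : ExampleCPU
```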

AWK: statistics operations of multi-column CSV data

With the aim of performing some statistical analysis of multi-column data, I am analyzing a large number of CSV files using the following bash + AWK routine:
#!/bin/bash
home="$PWD"
# folder with the outputs
rescore="${home}"/rescore
# folder with the folders to analyse
storage="${home}"/results
cd "${storage}" || exit 1
csv_pattern='*_filt.csv'
while read -r d; do
awk -v rescore="$rescore" '
    FNR == 1 {
        if (n)
            mean[suffix] = s / n
        prefix = suffix = FILENAME
        sub(/_.*/, "", prefix)
        sub(/\/[^\/]+$/, "", suffix)
        sub(/^.*_/, "", suffix)
        s = n = 0
    }
    FNR > 1 {
        s += $3
        ++n
    }
    END {
        out = rescore "/" prefix ".csv"
        mean[suffix] = s / n
        print prefix ":", "dG(mean)" > out
        for (i in mean)
            printf "%s: %.2f\n", i, mean[i] >> out
        close(out)
    }' "${d}_"*/${csv_pattern} #> "${rescore}/"${d%%_*}".csv"
done < <(find . -maxdepth 1 -type d -name '*_*_*' | awk -F '[_/]' '!seen[$2]++ {print $2}')
Basically, the script takes an ensemble of CSV files that belong to the same prefix (defined as the naming pattern at the beginning of the directory containing the CSVs, for example 10V1 from 10V1_cne_lig1) and calculates for it the mean value of the numbers in the third column:
# input *_filt.csv located in the folder 10V1_cne_lig1001
ID, POP, dG
1, 142, -5.6500
2, 10, -5.5000
3, 2, -4.9500
and adds one line to 10V1.csv, which is organized in a two-column format: i) the suffix of the folder containing the initial CSV; ii) the mean value calculated over all numbers in the third column (dG) of the input CSV:
# this is two column format of output.csv: 10V1.csv
10V1: dG(mean)
lig1001: -5.37
In this way, for 100 CSV files, such an output.csv should contain 100 lines with the mean values, etc.
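The prefix discovery in the final find | awk pipeline can be checked in isolation: with -F '[_/]' a path like ./10V1_cne_lig1 splits so that $2 is the prefix, and !seen[$2]++ keeps only its first occurrence. A small sketch with made-up directory names:

```shell
# -F '[_/]' splits "./10V1_cne_lig1" into ".", "10V1", "cne", "lig1",
# so $2 is the prefix; !seen[$2]++ is true only the first time it appears.
printf './10V1_cne_lig1\n./10V1_cne_lig2\n./10V2_cne_lig1\n' |
awk -F '[_/]' '!seen[$2]++ { print $2 }'
# prints:
# 10V1
# 10V2
```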
I need to introduce a small modification to the AWK part of my routine that would add a third column to the output CSV with the RMSD value (a measure of the spread of the initial dG values that were used to calculate the mean). Using AWK syntax, for a particular mean, the RMSD could be expressed as:
mean=$(awk -F, 'FNR > 1 {sum+=$3; n++} END{printf "%.2f", sum/n}' "$csv")
rmsd=$(awk -F, -v mean="$mean" 'FNR > 1 {++n; sum+=($3-mean)^2} END{if(n) printf "%.2f", sqrt(sum/n)}' "$csv")
Here is the expected output for 5 mean and 5 RMSD values calculated from 5 CSV logs (the first line corresponds to my example above):
10V1: dG(mean): RMSD (error)
lig1001 -5.37 0.30
lig1002 -8.53 0.34
lig1003 -6.57 0.25
lig1004 -9.53 0.00 # rmsd=0 since initial csv has only 1 line: no data variance
lig1005 -8.11 0.39
How could this addition be incorporated into my main bash+AWK code, so that a third RMSD column (one per processed CSV, paired with each calculated mean) is added to output.csv?
You can calculate both the mean and the RMSD within the awk code. Please try the following:
awk -v rescore="$rescore" '
    FNR == 1 {
        if (n) {                    # calculate the results of the previous file
            m = s / n               # mean
            var = s2 / n - m * m    # variance
            if (var < 0) var = 0    # avoid an exception due to round-off error
            mean[suffix] = m        # store the mean in an array
            rmsd[suffix] = sqrt(var)
        }
        prefix = suffix = FILENAME
        sub(/_.*/, "", prefix)
        sub(/\/[^\/]+$/, "", suffix)
        sub(/^.*_/, "", suffix)
        s = 0                       # sum of $3
        s2 = 0                      # sum of $3 ** 2
        n = 0                       # count of samples
    }
    FNR > 1 {
        s += $3
        s2 += $3 * $3
        ++n
    }
    END {
        out = rescore "/" prefix ".csv"
        m = s / n
        var = s2 / n - m * m
        if (var < 0) var = 0
        mean[suffix] = m
        rmsd[suffix] = sqrt(var)
        print prefix ":", "dG(mean)", "dG(rmsd)" > out
        for (i in mean)
            printf "%s: %.2f %.2f\n", i, mean[i], rmsd[i] >> out
        close(out)
    }'
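As a quick sanity check of the one-pass formulas above (var = s2/n - m*m), here is a self-contained run on the sample data from 10V1_cne_lig1001, which reproduces the expected lig1001 line:

```shell
# One-pass mean/RMSD on the sample dG values, using the same
# var = s2/n - m*m formula with the round-off guard.
printf 'ID, POP, dG\n1, 142, -5.6500\n2, 10, -5.5000\n3, 2, -4.9500\n' |
awk -F',' 'FNR > 1 { s += $3; s2 += $3 * $3; ++n }
           END {
               m = s / n
               var = s2 / n - m * m
               if (var < 0) var = 0     # guard against round-off error
               printf "%.2f %.2f\n", m, sqrt(var)
           }'
# prints: -5.37 0.30
```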
Here is a version that also prints the lowest value of dG.
awk -v rescore="$rescore" '
    FNR == 1 {
        if (n) {                    # calculate the results of the previous file
            m = s / n               # mean
            var = s2 / n - m * m    # variance
            if (var < 0) var = 0    # avoid an exception due to round-off error
            mean[suffix] = m        # store the mean in an array
            rmsd[suffix] = sqrt(var)
            lowest[suffix] = min
        }
        prefix = suffix = FILENAME
        sub(/_.*/, "", prefix)
        sub(/\/[^\/]+$/, "", suffix)
        sub(/^.*_/, "", suffix)
        s = 0                       # sum of $3
        s2 = 0                      # sum of $3 ** 2
        n = 0                       # count of samples
        min = 0                     # lowest value of $3
    }
    FNR > 1 {
        s += $3
        s2 += $3 * $3
        ++n
        if ($3 < min) min = $3      # update the lowest value
    }
    END {
        if (n) {                    # just to avoid division by zero
            m = s / n
            var = s2 / n - m * m
            if (var < 0) var = 0
            mean[suffix] = m
            rmsd[suffix] = sqrt(var)
            lowest[suffix] = min
        }
        out = rescore "/" prefix ".csv"
        print prefix ":", "dG(mean)", "dG(rmsd)", "dG(lowest)" > out
        for (i in mean)
            printf "%s: %.2f %.2f %.2f\n", i, mean[i], rmsd[i], lowest[i] > out
    }' file_*.csv
I've assumed all dG values are negative. If there is any chance a value is
greater than zero, change the line min = 0, which initializes the variable,
to a considerably larger value (10,000 or whatever).
Please apply your modifications regarding the filenames, if needed.
The suggestions by Ed Morton are also included although the results will be the same.
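If you prefer not to guess a "considerably big" initializer, a sign-agnostic alternative is to seed min from the first data row (FNR == 2, i.e. the first line after the header) instead of from 0. A sketch on made-up positive data, where min = 0 would wrongly win:

```shell
# Seed the minimum from the first data row rather than 0,
# so the logic also works when all values are positive.
printf 'ID,POP,dG\n1,1,2.5\n2,1,1.5\n3,1,3.0\n' |
awk -F',' 'FNR == 2 { min = $3 }
           FNR > 2  { if ($3 < min) min = $3 }
           END      { print min }'
# prints: 1.5
```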

jq: error: round/0 is not defined at <top-level>

The round function in jq doesn't work:
$ jq '10.01 | round'
jq: error: round/0 is not defined at <top-level>, line 1:
10.01 | round
jq: 1 compile error
$ jq --help
jq - commandline JSON processor [version 1.5-1-a5b5cbe]
What do I need to do?
Seems like round is unavailable in your build. Either upgrade jq or implement round using floor:
def round: . + 0.5 | floor;
Usage example:
$ jq -n 'def round: . + 0.5 | floor; 10.01 | round'
10
We can use the pow function along with . + 0.5 | floor to create our own round function that takes the value to round as input and the number of decimal places as an argument.
def round_whole:
    # Basic round function, returns the closest whole number
    # Usage: 2.6 | round_whole  # => 3
    . + 0.5 | floor
;

def round(num_dec):
    # Round function, takes num_dec as argument
    # Usage: 2.2362 | round(2)  # => 2.24
    num_dec as $num_dec |
    # First multiply by 10^num_dec: 2.2362 becomes 223.62
    . * pow(10; $num_dec) |
    # then round to the nearest whole number: 223.62 becomes 224
    round_whole |
    # then divide back down: 224 becomes 2.24 as expected
    . / pow(10; $num_dec)
;
jq --null-input --raw-output '
    def round_whole:
        # Basic round function, returns the closest whole number
        . + 0.5 | floor
    ;
    def round(num_dec):
        # Round function, takes num_dec as argument
        num_dec as $num_dec |
        . * pow(10; $num_dec) |   # scale up:   2.2362 -> 223.62
        round_whole |             # round:      223.62 -> 224
        . / pow(10; $num_dec)     # scale down: 224    -> 2.24
    ;
    [
        2.2362,
        2.4642,
        10.23423
    ] |
    map(round(2))
'
Yields
[
2.24,
2.46,
10.23
]
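If upgrading jq is not an option and the rounding is only needed inside a shell pipeline anyway, the same scale / add 0.5 / floor / rescale trick can be sketched in awk (note int() truncates toward zero, so like the jq version this is only correct for non-negative inputs):

```shell
# Same multiply / add 0.5 / floor / divide idea as the jq round(2):
# int() truncates toward zero, matching floor for non-negative input.
echo '2.2362 2.4642 10.23423' |
awk '{ for (i = 1; i <= NF; i++)
           printf "%s%s", int($i * 100 + 0.5) / 100, (i < NF ? " " : "\n") }'
# prints: 2.24 2.46 10.23
```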

Display symbolic expression in octave. Matrix multiplication as an expression and not as a result

I am having a hard time finding out how to display matrix multiplication as an expression rather than as the result of evaluating it. The expression must be displayed on the command line, not as a plot.
Let's say I have
syms m00 m01 m10 m11;
M = [m00 m01; m10 m11];
syms x0 x1;
X = [x0; x1];
I want to see the expression M * X as a symbolic expression. Something that will be displayed like:
| m00 m01 | * | x0 |
| m10 m11 | | x1 |
and not as the result of evaluating M * X:
| m00*x0 + m01*x1 |
| m10*x0 + m11*x1 |
I have read the documentation on the Octave symbolic package but cannot find the mechanics there. My current idea revolves around converting the expressions to LaTeX,
m = latex(M)
x = latex(X)
concatenating the results into a LaTeX string, and somehow printing that string on the Octave command line. No luck as of now.

Read in search strings from text file, search for string in second text file and output to CSV

I have a text file named file1.txt that is formatted like this:
001 , ID , 20000
002 , Name , Brandon
003 , Phone_Number , 616-234-1999
004 , SSNumber , 234-23-234
005 , Model , Toyota
007 , Engine ,V8
008 , GPS , OFF
and I have file2.txt formatted like this:
#==============================================
# 005 : Model
#------------------------------------------------------------------------------
[Model] = Honda
option = 0
length = 232
time = 1000
hp = 75.0
k1 = 0.3
k2 = 0.0
k1 = 0.3
k2 = 0.0
#------------------------------------------------------------------------------
[Model] = Toyota
option = 1
length = 223
time = 5000
speed = 50
CCNA = 1
#--------------------------------------------------------------------------
[Model] = Miata
option = 2
CCNA = 1
#==============================================
# 007 : Engine
#------------------------------------------------------------------------------
[Engine_Type] = V8 #1200HP
option = 0
p = 12.0
pp = 12.0
map = 0.4914
k1mat = 100
k2mat = 600
value =12.00
mep = 79.0
cylinders = 8
#------------------------------------------------------------------------------
[Engine_Type] = v6 #800HP
option = 1
active = 1
cylinders = 6
lim = 500
lim = 340
rpm = 330
start = 350
ul = 190.0
ll = 180.0
ul = 185.0
#==============================================
# 008 : GPS
#------------------------------------------------------------------------------
[GPS] = ON
monitor = 0
#------------------------------------------------------------------------------
[GPS] = OFF
monitor = 1
Enable = 1
#------------------------------------------------------------------------------
[GPS] = Only
monitor = 2
Enable = 1
#==============================================
# 014 :Option
#------------------------------------------------------------------------------
[Option] = Disable
monitor = 0
#------------------------------------------------------------------------------
[Option] = Enable
monitor = 1
#==============================================
# 015 : Weight
#------------------------------------------------------------------------------
[lbs] = &1
weight = &1
#==============================================
The expected output is supposed to look like this. Since only options 005-008 appear in file1.txt, the output would be:
Code:
#==============================================
# 005 : Model
#------------------------------------------------------------------------------
[Model] = Toyota
option = 1
length = 223
time = 5000
speed = 50
CCNA = 1
#==============================================
# 007 : Engine
#------------------------------------------------------------------------------
[Engine_Type] = V8 #1200HP
option = 0
p = 12.0
pp = 12.0
map = 0.4914
k1mat = 100
k2mat = 600
value =12.00
mep = 79.0
cylinders = 8
#==============================================
# 008 : GPS
#------------------------------------------------------------------------------
[GPS] = OFF
monitor = 1
Enable = 1
#-----------------------------------------------------------------
Now, using Awk and the values from the 2nd and 3rd columns in file1.txt, I want to search for those strings in file2.txt and output everything in the matching section to a CSV file, i.e., from where the string is found to the #------------- demarcation.
Could someone please help me with this and explain it as well? I am new to Awk.
Thank you!
I wouldn't really use awk for this job as specified, but here's a little snippet to get started:
awk -F'[ ,]+' 'FNR == NR { section["[" $2 "]"] = $3; next }
/^\[/ && section[$1] == $3, /^#/' file1.txt file2.txt
1) The -F'[ ,]+' sets the field separator to one or more of spaces and/or commas (since file1.txt looks like it's not a proper CSV file).
2) FNR == NR (record number in file equals total record number) is only true when reading file1.txt. So for each line in file1.txt, we record [second_field] as the pattern to look for with the third field as value.
3) Then we look for lines that begin with a [ and where the value stored in section for the first field of that line matches the third field of that line (/^\[/ && section[$1] == $3), and print from that line until the next line that begins with a #.
The output for your example input is:
[Model] = Toyota
option = 1
length = 223
time = 5000
speed = 50
CCNA = 1
#--------------------------------------------------------------------------
[GPS] = OFF
monitor = 1
Enable = 1
#------------------------------------------------------------------------------
The matched lines in step 3 were [Model] = Toyota and [GPS] = OFF. The Engine line is missing because file2.txt had Engine_Type instead. Also, I didn't bother with the section headers; it would be easy to add another condition to print them all but it requires lookahead to print only the ones that are going to have matching content in them (because at the time you read the header you don't know if a match is found inside). For that, I would switch to another language (e.g., Ruby).
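A stripped-down, self-contained version of the two-file idiom may make the mechanics clearer; the file contents below are shortened stand-ins for file1.txt and file2.txt:

```shell
# FNR == NR is true only while reading the first file: build the lookup.
# The range pattern then prints from each matching "[key] = value" line
# through the next line starting with '#'.
tmp=$(mktemp -d)
printf '001 , GPS , OFF\n' > "$tmp/f1.txt"
printf '[GPS] = ON\nmonitor = 0\n#---\n[GPS] = OFF\nmonitor = 1\n#---\n' > "$tmp/f2.txt"
awk -F'[ ,]+' 'FNR == NR { section["[" $2 "]"] = $3; next }
               /^\[/ && section[$1] == $3, /^#/' "$tmp/f1.txt" "$tmp/f2.txt"
# prints:
# [GPS] = OFF
# monitor = 1
# #---
```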

how do I convert fractional decimal numbers to fractional binary numbers using dc

So dc is a great tool for converting between bases, handy for those bit-twiddling coding jobs. E.g., to convert 1078 into binary I can do this:
bash> echo "2o1078p" | dc
10000110110
However I can't get it to print fractions between 0 and 1 correctly.
Trying to convert 0.3 into binary:
bash> echo "2o10k 0.3p" | dc
.0100
But 0.0100 (binary) = 0.25, not 0.3.
However if I construct the value manually I get the right answer
bash> echo "2o10k 3 10 / p" | dc
.0100110011001100110011001100110011
Well, it looks like it's giving me more than the 10 significant figures I asked for, but that's OK.
Am I doing something wrong? Or am I trying to make dc do something that it's not able to do?
bash> dc --version
dc (GNU bc 1.06) 1.3
...
Strange. My first thought was that maybe precision only applies to calculations, not conversions. But then it only works for division, not addition, subtraction, or multiplication:
echo "2o10k 0.3 1 / p" | dc
.0100110011001100110011001100110011
echo "2o10k 0.3 0 + p" | dc
.0100
echo "2o10k 0.3 0 - p" | dc
.0100
echo "2o10k 0.3 1 * p" | dc
.0100
As for precision, the man page says "The precision is always measured in decimal digits, regardless of the current input or output radix." That explains why the output (when you get it) is 33 significant bits.
It seems that dc is getting the number of significant figures from the input.
Now 1/log10(2) = 3.32, so each decimal significant digit corresponds to about 3.32 binary digits.
Looking at the output of dc for varying input SF lengths shows:
`dc -e "2o10k 0.3 p"` => .0100
`dc -e "2o10k 0.30 p"` => .0100110
`dc -e "2o10k 0.300 p"` => .0100110011
`dc -e "2o10k 0.3000 p"` => .01001100110011
A table of these values and the expected output length, ceil(SF_input / log10(2)), is as follows:
input : output : expected output
1 : 4 : 4
2 : 7 : 7
3 : 10 : 10
4 : 14 : 14
And dc is behaving exactly as expected.
So the solution is either to use the right number of significant figures in the input, or to use the division form dc -e "2o10k 3 10 / p".
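For the curious, the expansion dc prints can be reproduced by hand with the classic repeated-doubling algorithm: at each step, the integer part of 2x is the next binary digit. A sketch in awk, limited to 10 bits of 0.3, matches the dc -e "2o10k 0.300 p" output above:

```shell
# Repeated doubling: each iteration, the integer part of 2*x
# is the next bit of the binary fraction.
awk 'BEGIN {
    x = 0.3; out = "."
    for (i = 1; i <= 10; i++) {   # 10 bits of 0.3
        x *= 2
        bit = int(x)
        out = out bit
        x -= bit
    }
    print out
}'
# prints: .0100110011
```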