gnuplot output filename from CSV file

I have a CSV file and I am plotting the columns one by one in a do for [] loop. I would like to save each plot as a PNG file, with the filename coming from the column header. What is the best way to go about this, so that test.png is replaced by the ith column header?
#!/bin/bash
set datafile separator ","
set key autotitle columnhead
set xlabel "time/date"
nc = "`awk -F, 'NR == 1{ print NF; exit}' input.csv`"
set term png
do for [i = 2:5] {
    set output "test.png"
    plot 'HiveLongrun.csv' every::0 using i:xticlabels(1) with lines
}

As long as you're using awk, you could use it once more to get the header name from inside a gnuplot macro:
#!/usr/bin/env gnuplot
set datafile separator ","
set key autotitle columnhead
set xlabel "time/date"
nc = "`awk -F, 'NR == 1{ print NF; exit}' input.csv`"
# Define a macro which, when evaluated, assigns the ith column header
# to the variable 'head'
awkhead(i) = "head = \"\`awk -F, 'NR == 1 {print $".i."}' input.csv\`\""
set term png
do for [i = 2:5] {
    eval awkhead(i)         # evaluate the macro
    set output head.".png"  # use the 'head' variable assigned by the macro
    plot 'HiveLongrun.csv' every::0 using i:xticlabels(1) with lines
}
There is almost certainly a cleaner way to do this with another awk-like utility, or even within gnuplot. Gnuplot offers a few ways to run arbitrary internal/external commands, as you can see from my mix of backticks and macro evaluation.
By the way, it is a little strange to me that you have the bash shebang (#!/bin/bash) at the start of the script if it will presumably be interpreted by gnuplot. I assume you call it as gnuplot myscript.plt. In this case the shebang is just a comment (as far as gnuplot is concerned) and doesn't do anything because gnuplot is the interpreter. In my example I use #!/usr/bin/env gnuplot and I run the script as an executable in bash, like ./myscript.plt. The shebang in this case tells bash to make gnuplot the interpreter (or whatever command you would get by typing gnuplot at the command prompt). Of course you could also set the shebang to be #!/usr/bin/gnuplot if you're not worried about the path changing.
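If you would rather keep the loop in the shell, the same awk header extraction can drive gnuplot from a small wrapper script instead. This is only a sketch: the sample input.csv contents are made up, and the gnuplot invocation is echoed rather than executed so you can inspect what would run.

```shell
#!/bin/sh
# Sample data standing in for the real input.csv (an assumption).
cat > input.csv <<'EOF'
date,temp,humidity
2024-01-01,5,80
2024-01-02,7,75
EOF

# Count the columns, as in the answer above.
nc=$(awk -F, 'NR == 1 { print NF; exit }' input.csv)

i=2
while [ "$i" -le "$nc" ]; do
    # Pull the ith header out of row 1.
    head=$(awk -F, -v c="$i" 'NR == 1 { print $c; exit }' input.csv)
    # In a real run you would drop the leading "echo" to execute gnuplot.
    echo gnuplot -e "set datafile separator ','; set term png; set output '$head.png'; plot 'input.csv' using $i:xticlabels(1) with lines"
    i=$((i + 1))
done
```

Each iteration names the output file after the column header (temp.png, humidity.png), without any macro evaluation inside gnuplot.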

Related

Tcl: how to modify a CSV and update another file's data without using tcllib

I have many process files that contain versions. I extracted certain lines from them with regexp and imported them into a txt file. The txt format looks like this:
#process #AA_version #BB_version
a11 Aa/10.10-d87_1 Bb/10.57-d21_1
a15 Aa/10.15-d37_1 Bb/10.57-d28_1
a23 Aa/10.20-d51_1 Bb/10.57-d29_3
I then convert the txt to CSV format, like this:
,#process,#AA_version,#BB_version
,a11,Aa/10.10-d87_1,Bb/10.57-d21_1
,a15,Aa/10.15-d37_1,Bb/10.57-d28_1
,a23,Aa/10.20-d51_1,Bb/10.57-d29_3
Now I want to write a Tcl script (get_version.tcl) that parses the CSV file and reports the corresponding version (in the file from which I extracted the versions).
For example, if I run tclsh get_version.tcl to parse the CSV file and enter the process I want to modify (a11), it should print the current versions, AA_version: Aa/10.10-d87_1 and BB_version: Bb/10.57-d21_1, and let me modify them.
get_version.tcl should then update both the CSV file and the original file (the one I ran regexp over) at the same time.
Is that possible? I cannot install tcllib; can I do all this without it?
This involves a lot of commands and I don't know where to start. Thanks for the help~
For example, to change the AA_version of process a11, I would run:
get_version.tcl a11
AA_version = **Aa/10.10-d87_1**
BB_version = **Bb/10.57-d21_1**
It should read the CSV file and add the new version (e.g. Aa/10.13-d97_1). This action should change the original file in two ways: 1. add a new line AA_version = Aa/10.13-d97_1; 2. comment out the old line (AA_version = Aa/10.10-d87_1 -> #AA_version = Aa/10.10-d87_1).
I'd process each line something like this:
split [regsub -all {\s+} $line "\u0000"] "\u0000"
(The tcllib textutil::splitx command does something similar.)
If you don't have commas or quotes in your input data — and it looks like you might be able to guarantee that easily — then you can just join $record "," to get a line you can write out with puts. If you do have commas or quotes or stuff like that, use the Tcllib csv package because that handles the tricky edge cases correctly.
As a simple stdin→stdout filter, the script would be:
while {[gets stdin line] >= 0} {
    set data [split [regsub -all {\s+} $line "\u0000"] "\u0000"]
    # Any additional column mangling here
    puts [join $data ","]
}

How to remove carriage returns in a txt file

I recently received some data as 99 pipe-delimited txt files. In some of them (I'll use dataaddress.txt as an example) there is a line break inside the address, e.g.
14 MakeUp Road
Hull
HU99 9HU
It comes out on three rows rather than one; bear in mind there is data before and after this address, separated by pipes. It just seems to be this address issue that is causing me problems loading the txt file correctly using SSIS.
Rather than go back to the source, I wondered if there was a way to manipulate the txt file to remove these carriage returns while not affecting the row-ending returns, if that makes sense.
I would use sed or awk. I will show you how to do this with awk, because it is more platform-independent. If you do not have awk, you can download a mawk binary from http://invisible-island.net/mawk/mawk.html.
The idea is as follows: tell awk that your record separator is something different, not a carriage return or line feed. I will use a comma.
Then use a regular expression to replace the string that you do not like.
Here is a test file I created. Save it as test.txt:
1,Line before ...
2,Broken line ... 14 MakeUp Road
Hull
HU99 9HU
3,Line after
And call awk as follows:
awk 'BEGIN { RS = ","; ORS=""; s=""; } $0 != "" { gsub(/MakeUp Road[\n\r]+Hull[\n\r]+HU99 9HU/, "MakeUp Road Hull HU99 9HU"); print s $0; s="," }' test.txt
I suggest that you save the awk code into a file named cleanup.awk. Here is the better formatted code with explanations.
BEGIN {
    # This block is executed at the beginning of the file.
    RS = ","; # Tell awk our records are separated by commas
    ORS = ""; # Tell awk not to use a record separator in the output
    s = "";   # We will print this as the record separator in the output
}
{
    # This block is executed for each record.
    # Remember, our "lines" are separated by commas.
    # For each record, use a regular expression to replace the bad text.
    gsub(/MakeUp Road[\n\r]+Hull[\n\r]+HU99 9HU/, "MakeUp Road Hull HU99 9HU");
    # Print the replaced text - the $0 variable holds the record text.
    print s $0; s = ","
}
Using the awk file, you can execute the replacement as follows:
awk -f cleanup.awk test.txt
To process multiple files, you can create a bash script:
for f in *.txt; do
    # Execute the cleanup.awk program for each file.
    # Save the cleaned output to a file in the directory ../clean
    awk -f cleanup.awk "$f" > "../clean/$f"
done
You can use sed to remove the line feed and carriage return characters:
sed ':a;N;$!ba;s/MakeUp Road[\n\r]\+/MakeUp Road /g' test.txt | sed ':a;N;$!ba;s/Hull[\n\r]\+/Hull /g'
Explanation:
:a creates a label 'a'
N appends the next line to the pattern space
$!ba branches (goes to) label 'a' if this is not the last line
s is the substitute command: \n represents a newline, \r a carriage return; [\n\r]+ matches a sequence of newlines or carriage returns (at least one); /g makes the match global (as many times as possible)
sed loops through steps 1 to 3 until it reaches the last line, so all lines end up in the pattern space, where sed substitutes away the \n characters
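If the address text varies from record to record, hardcoding it in the regular expression will not scale. A more general sketch, assuming every complete record has a known number of pipe-separated fields (here 4) and using made-up sample data, keeps appending physical lines until the field count is reached:

```shell
#!/bin/sh
# Hypothetical sample: record 1 has its address broken across three lines.
cat > addr.txt <<'EOF'
1|John|14 MakeUp Road
Hull
HU99 9HU|GB
2|Jane|1 High Street Leeds LS1 1AA|GB
EOF

awk -v want=4 '
{
    rec = (rec == "" ? $0 : rec " " $0)    # glue continuation lines with a space
    n = split(rec, parts, "|")
    if (n >= want) { print rec; rec = "" } # complete record: emit and reset
}
' addr.txt
```

This repairs any record with embedded line breaks, not just one known address, at the cost of assuming a fixed field count per row.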

TCL - feeding STDIN to exec

I'm trying to implement a retry mechanism when executing an external program using TCL. I'm having some issues when trying to feed STDIN to the external program. I'm now working with a simplified example trying to solve the issue. Take the following python script (simple.py):
x = raw_input()
y = raw_input()
print x + y
It reads two strings from input; the output will be the concatenation of the two strings.
Now the following command works from the TCL interpreter:
% exec python simple.py << 1\n2
12
However when I try to split it in separate commands, or add them to a string before doing this, it fails.
Fail 1:
% set cmd "python simple.py << 1\n2"
% exec $cmd
couldn't execute "python simple.py << 1
2": no such file or directory
Fail 2:
% set cmd1 "python simple.py"
% set cmd2 "1\n2"
% exec $cmd1 << $cmd2
couldn't execute "python simple.py": no such file or directory
Fail 3:
% set fullCommandString "exec python simple.py << 1\n2"
% eval $fullCommandString
Traceback (most recent call last):
File "simple.py", line 2, in <module>
y = raw_input()
EOFError: EOF when reading a line
The third case does seem to start the script, but it interprets both lines of STDIN as one.
Any help is appreciated.
Tcl's commands do not reinterpret whitespace in their arguments by default. exec is one of these, and it follows the same rules. That means that you need to tell Tcl to interpret that list of words as a list of words as otherwise it is just a string. Fortunately, there's {*} for this; the expansion operator syntax interprets the rest of the word as a Tcl list, and uses the words out of that list at the point where you write it. It's very useful I find.
The simplest one to fix is actually your second case:
% set cmd1 "python simple.py"
% set cmd2 "1\n2"
% exec {*}$cmd1 << $cmd2
You can fix the first and third by adding Tcl list quoting to ensure the 1\n2 is still interpreted as a single word (as otherwise newline is a perfectly reasonable list item separator).
% set cmd "python simple.py << {1\n2}"
% exec $cmd
% set fullCommandString "exec python simple.py << {1\n2}"
% eval $fullCommandString
The third can be written more economically though:
% set fullCommandString "exec python simple.py << {1\n2}"
% {*}$fullCommandString
As a rule of thumb, if you see eval in modern Tcl (note: not namespace eval or interp eval or uplevel) then it's usually an indication that some code could be made more efficient and to have fewer bugs by switching to using expansion carefully.
tl;dr: Put {*} before $cmd1 in your second example to get the idiomatic fix.
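The shell has the same trap, which may make the Tcl behavior easier to see. This analogy (with a made-up command string) is mine, not part of the original answer:

```shell
#!/bin/sh
cmd="printf %s hello"

# Quoted: the whole string is looked up as one command name and fails,
# just like `exec $cmd` in Tcl treating the string as a single word.
"$cmd" 2>/dev/null || echo "no such command"

# Unquoted: the shell word-splits the string into printf, %s, hello,
# which is roughly what Tcl's {*}$cmd expansion does explicitly.
$cmd
echo  # printf %s emits no trailing newline
```

The difference is that Tcl never word-splits implicitly, so you must opt in with {*}, whereas the shell splits unless you quote.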

How to zip multiple files through tcl script in Linux box?

I have a set of Tcl code where I'm trying to zip some files, but I'm getting the error below:
zip warning: name not matched: a_1.txt a_2.txt a_3.txt a_4.txt
On the other hand, when I do the same thing from the command prompt, it executes successfully.
#!/usr/local/bin/tclsh
set outdir /usr/test/
set out_files abc.10X
array set g_config { ZIP /usr/bin/zip }
set files "a_1.txt a_2.txt a_3.txt a_4.txt"
foreach inp_file $files {
    append zipfiles "$inp_file "
}
exec $g_config(ZIP) $outdir$out_files zipfiles
Tcl really cares about the boundaries between words, and doesn't split things up unless asked to. This is good as it means that things like filenames with spaces in don't confuse it, but in this case it causes you some problems.
To ask it to split the list up, precede the read of the word from the variable with {*}:
exec $g_config(ZIP) $outdir$out_files {*}$files
This is instead of this:
exec $g_config(ZIP) $outdir$out_files $files
# Won't work; uses "strange" filename
or this:
exec $g_config(ZIP) $outdir$out_files zipfiles
# Won't work; uses filename that is the literal "zipfiles"
# You have to use $ when you want to read from a variable and pass the value to a command.
Got a very old version of Tcl where {*} doesn't work? Upgrade to 8.5 or 8.6! Or at least use this:
eval {exec $g_config(ZIP) $outdir$out_files} $files
(You need the braces there in case you put a space in outdir…)
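To see what the fixed exec line actually hands to zip, here is a shell sketch that prints the arguments one per line instead of running zip (the paths are taken from the question):

```shell
#!/bin/sh
outdir=/usr/test/
out_files=abc.10X
files="a_1.txt a_2.txt a_3.txt a_4.txt"

# Unquoted expansion word-splits $files into four arguments,
# playing the role of Tcl's {*}$files.
set -- $files
printf 'zip arg: %s\n' "$outdir$out_files" "$@"
```

zip sees the archive name plus four separate filenames, which is exactly what the original code failed to produce when it passed the whole list as one "strange" filename.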

Large text substitutions in Tcl

Within my Tcl script I'm building source code in another language; take gnuplot source as an example. I have Tcl code like this:
# `script' variable contains gnuplot source code
set script {
set terminal pdf
set output "chart.pdf"
set title "[makeTitle]"
plot "$dataFile" using 1:2 title ""
}
# Then I write `script' to file for later execution
Notice that the script variable contains a command call (makeTitle) and a variable substitution (dataFile). The source code itself contains newlines and double quotes.
Question: how can I simply "evaluate" this variable to substitute command calls by their results and variables by their values? Expected result should look like this:
set terminal pdf
set output "chart.pdf"
set title "R(S) Dependence"
plot "r_s.txt" using 1:2 title ""
You're looking for the subst command:
set result [subst $script]
One approach I commonly use in this type of situation is using [string map] with special symbols. For example:
set script {
set terminal pdf
set output "chart.pdf"
set title "%MAKETITLE%"
plot "%DATAFILE%" using 1:2 title ""
}
set script [string map [list %MAKETITLE% [makeTitle] %DATAFILE% $datafile] $script]
While glenn's answer of using [subst] is a good one and will work for the sample code you tested, it can run into issues as the original string gets more complex. Specifically, if it winds up containing characters that Tcl would interpret as commands to run or variables to substitute, you wind up needing to escape them, etc. By using string map and very specific character sequences to replace, you can limit the things that are changed to exactly what you need.
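For what it's worth, the same placeholder scheme works outside Tcl too. Here is a sed sketch of it; the template filename (chart.plt.in) and the substituted values are assumptions based on the question's expected output:

```shell
#!/bin/sh
# Template with the same %NAME% placeholders used in the string map answer.
cat > chart.plt.in <<'EOF'
set terminal pdf
set output "chart.pdf"
set title "%MAKETITLE%"
plot "%DATAFILE%" using 1:2 title ""
EOF

# Substitute each placeholder; only these exact tokens are touched, so the
# template's quotes, newlines, and brackets pass through untouched.
sed -e 's/%MAKETITLE%/R(S) Dependence/' \
    -e 's/%DATAFILE%/r_s.txt/' chart.plt.in
```

As with string map, nothing else in the template is interpreted, which is the main advantage over subst for complex generated source.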