Using different colors in Gnuplot based on a CSV file column value

I have a CSV file with the following structure:
X,Y,Z
where X and Y are coordinates on a square plot and Z is either 0 or 1. I want to plot the points in different colors depending on the value in the Z column.
Is that possible?
So far I have a script that just displays all the data on the square chart in a single color:
filename='test.csv'
set datafile separator ","
set title filename
set size square
plot filename using 0:1 linecolor rgb "yellow"

It's all in the documentation; check help rgbcolor variable:
rgb(r,g,b) = 65536 * int(r) + 256 * int(g) + int(b)
color1 = rgb(255,0,0); color2 = rgb(0,255,0)
plot filename using 1:2:($3==0 ? color1 : color2) with points linecolor rgb variable
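As a quick check of the arithmetic, the rgb() helper above packs three 8-bit channels into a single 24-bit integer, which is the form gnuplot reads from the third using column. The Python sketch below only mirrors that formula; pick_color is a hypothetical name used for illustration and is not part of gnuplot.

```python
def rgb(r, g, b):
    """Pack three 0-255 channels into one 24-bit integer (same formula as above)."""
    return 65536 * int(r) + 256 * int(g) + int(b)

color1 = rgb(255, 0, 0)  # red
color2 = rgb(0, 255, 0)  # green

def pick_color(z):
    """Hypothetical helper: mirrors the ternary ($3==0 ? color1 : color2)."""
    return color1 if z == 0 else color2

for z in (0, 1):
    print(z, hex(pick_color(z)))
```

Printing the values in hexadecimal makes the channel layout visible: the red channel occupies the top byte, green the middle byte, and blue the lowest byte.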

Related

Multiple regression model

I have a multiple regression model which looks like this:
wine.lm <- lm(Alc.vol. ~ pc2 + pc1 * factor(pc_parametr$color),
              data = pc_parametr)
I have to extract the coefficients and translate them into one regression equation for each color (white, red).
                                     Estimate Std. Error    t value     Pr(>|t|)
(Intercept)                        12.6803846  0.2511799 50.4832872 1.104478e-26
pc2                                -3.7524814  3.5850681 -1.0466974 3.052541e-01
pc1                                -9.3332435  7.5111530 -1.2425847 2.255503e-01
factor(pc_parametr$color)white      0.4778615  0.2926116  1.6330914 1.149828e-01
pc1:factor(pc_parametr$color)white  6.7281697  8.0999853  0.8306397 4.140399e-01
I tried to do it manually, but I am confused:
Y = 12.68 - 3.75 * pc2 - 9.33 * pc1 + 0.48 * factor(pc_parametr$color)white + 6.73 * pc1:factor(pc_parametr$color)white + e
Is there code that calculates the equation for each color, or what is the correct way to do it manually?
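Here is the correct equation notation:
Red is the reference level, so it is absorbed into the intercept. For white, the factor coefficient shifts the intercept (which moves the regression line along the y-axis), and the pc1:white interaction coefficient additionally changes the slope of pc1.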
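To make the per-color equations concrete, here is a minimal Python sketch that combines the estimates from the question's coefficient table. It assumes the usual treatment (dummy) coding in which red is the reference level and white is coded as 1; the names COEF and equation_for are hypothetical and only serve the illustration.

```python
# Estimates copied from the coefficient table in the question.
COEF = {
    "intercept": 12.6803846,
    "pc2": -3.7524814,
    "pc1": -9.3332435,
    "white": 0.4778615,
    "pc1:white": 6.7281697,
}

def equation_for(color):
    """Return (intercept, pc2 slope, pc1 slope) for 'red' or 'white'."""
    is_white = 1 if color == "white" else 0  # red is the reference level
    intercept = COEF["intercept"] + is_white * COEF["white"]
    pc1_slope = COEF["pc1"] + is_white * COEF["pc1:white"]
    return intercept, COEF["pc2"], pc1_slope

for color in ("red", "white"):
    b0, b_pc2, b_pc1 = equation_for(color)
    print(f"{color}: Alc.vol. = {b0:.2f} + ({b_pc2:.2f}) * pc2 + ({b_pc1:.2f}) * pc1")
```

Under that assumption the red equation uses the intercept and slopes as they stand in the table, while the white equation has an intercept of about 13.16 (12.68 + 0.48) and a pc1 slope of about -2.61 (-9.33 + 6.73); the pc2 slope is the same for both colors because the model has no pc2 interaction.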

Gnuplot one-liner to generate a titled line for each row in a CSV file

I've been trying to figure out gnuplot but haven't been getting anywhere, seemingly for two reasons: my lack of understanding of gnuplot's set commands, and the layout of my data file. I've decided the best option is to ask for help.
The hope is to get this gnuplot command into a one-liner.
Example rows from my CSV data file (MyData.csv):
> _TitleRow1_,15.21,15.21,...could be more, could be less
> _TitleRow2_,16.27,16.27,101,55.12,...could be more, could be less
> _TitleRow3_,16.19,16.19,20.8,...could be more, could be less
...(over 100 rows)
Each row of MyData.csv will always have a string in the first column as the title, followed by an undetermined number of decimal values. (Each row gets appended to periodically, so I need a way to include an open-ended number of columns.)
What I'd like is a line graph with one line per row of the CSV, using the first column as the line title and the following numbers as the data for the line.
This is what I'm trying:
gnuplot -e 'set datafile separator ","; set key autotitle columnhead; plot "MyData.csv"'
Which results in:
set datafile separator ","; set key autotitle columnhead; plot "MyData.csv"
^
line 0: Bad data on line 2 of file MyData.csv
This looks like an amazing tool and I'm looking forward to learning more about it. Thanks in advance for any hints/assistance!
Your data file format is unfortunate for gnuplot, which prefers data in columns.
You can also plot rows (which is not straightforward in gnuplot, but see an example here). This requires a strict matrix, however, and the problem with your data is that the column count varies.
Strictly speaking, your CSV is not a "correct" CSV, because a CSV should have the same number of columns in every row, i.e. if a row has fewer values than the longest row, it should be padded with as many ,,, as needed. That is basically what the script below does.
With this you can plot rows with the option matrix (check help matrix). However, you will get warnings of the form "warning: matrix contains missing or undefined values", which you can ignore.
Alternatively, you could transpose your data (which may not be straightforward with a variable column count). There may be external tools which can do this easily. With gnuplot only it will be a bit cumbersome (and you would first have to pad your shorter rows as in the example below).
Maybe there is a simpler and better gnuplot-only solution which I am currently not aware of.
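As a rough sketch of the external-tool route mentioned above, the following Python snippet pads ragged rows with NaN and transposes them, so that every original row becomes a column with its title in the first line. This is an illustration under assumptions, not part of the gnuplot solution: transpose_ragged is a hypothetical helper name, the NaN filler is a choice made here, and the sample rows simply have the same shape as the question's file.

```python
import csv
import io
from itertools import zip_longest

def transpose_ragged(text, fill="NaN"):
    """Parse ragged CSV text, pad short rows with `fill`, and return the columns."""
    rows = [[cell.strip() for cell in row]
            for row in csv.reader(io.StringIO(text)) if row]
    # zip_longest pads the shorter rows while turning rows into columns.
    return [list(col) for col in zip_longest(*rows, fillvalue=fill)]

sample = """_TitleRow1_, 1.2, 1.3
_TitleRow2_, 2.2, 2.3, 2.4, 2.5
_TitleRow3_, 3.2, 3.3, 3.4
"""

columns = transpose_ragged(sample)
for col in columns:
    print(",".join(col))
```

The printed result has one header line holding the three titles, followed by one line per data position, which is the column-oriented layout gnuplot handles most easily.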
Data: SO73099645.dat
_TitleRow1_, 1.2, 1.3
_TitleRow2_, 2.2, 2.3, 2.4, 2.5
_TitleRow3_, 3.2, 3.3, 3.4
Script:
### plotting rows with variable columns
reset session
FILE = "SO73099645.dat"
getColumns(s) = (sum [i=1:strlen(s)] (s[i:i] eq ',') ? 1 : 0) + 1
set datafile separator "\t"
colCount = 0
myNaNs = myHeaders = ''
stats FILE u (rowCount=$0+1, c=getColumns(strcol(1)), c>colCount ? colCount=c : 0) nooutput
do for [i=1:colCount] {myNaNs=myNaNs.',NaN' }
set table $Data
plot FILE u (s=strcol(1),c=getColumns(s),s.myNaNs[1:(colCount-c)*4]) w table
unset table
set datafile separator ","
stats FILE u (myHeaders=sprintf('%s "%s"',myHeaders,strcol(1))) nooutput
myHeader(n) = word(myHeaders,n)
set key noenhanced
plot for [row=0:rowCount-1] $Data matrix u 1:3 every ::1:row::row w lp pt 7 ti myHeader(row+1)
### end of script
As "one-liner":
FILE = "SO73099645.dat"; getColumns(s) = (sum [i=1:strlen(s)] (s[i:i] eq ',') ? 1 : 0) + 1; set datafile separator "\t"; colCount = 0; myNaNs = myHeaders = ''; stats FILE u (rowCount=$0+1, c=getColumns(strcol(1)), c>colCount ? colCount=c : 0) nooutput; do for [i=1:colCount] {myNaNs=myNaNs.',NaN' }; set table $Data; plot FILE u (s=strcol(1),c=getColumns(s),s.myNaNs[1:(colCount-c)*4]) w table; unset table; set datafile separator ","; stats FILE u (myHeaders=sprintf('%s "%s"',myHeaders,strcol(1))) nooutput; myHeader(n) = word(myHeaders,n); set key noenhanced; plot for [row=0:rowCount-1] $Data matrix u 1:3 every ::1:row::row w lp pt 7 ti myHeader(row+1)

Octave: Plotyy log files from geothermal heat pump (import/plot datetime)

I'm trying to plot values from my geothermal heat pump log files to analyse its performance. I tried with Excel, but it was too slow and I could not get the plot type I wanted, so I'm trying Octave instead. I have absolutely no experience with Octave, so please forgive my incompetence!
I've processed the .log files with OpenOffice Calc to get them into a decent delimited format. The first column is a datetime in the format MM/DD/YYYY HH:MM:SS. In total there are 21 columns (but I only need 5) and one header line with labels; the decimal separator is '.' and the field delimiter is ','. The file can be downloaded here, and the first 7 columns look like this:
02/19/2018 23:07:00,-0.7,47.5,42,47.3,52.1,1.5
I'm currently trying to plot this based on demonstration 3 for plotyy from here. Columns 2, 3, 5 and 8 import correctly, so I figure the problem is the datetime in column 1. How can I get Octave to import column 1 correctly and use it as the x-axis in this plot?
data=csvread('heatpump.csv');
clf;
hold on
t=data(:,1);
x=data(:,3);
y=data(:,5);
z=data(:,2);
o=data(:,8);
[hax, h1, h2] = plotyy (t, x, t, y);
[~, h3, h4] = plotyy (t, z, t, o);
set ([h3, h4], "linestyle", "--");
xlabel (hax(1), "Time");
title (hax(2), 'Heat pump analysis');
ylabel (hax(1), "Radiator and hot water temp");
ylabel (hax(2), "Outdoor temp and brine out");
There are many, many ways. Here I show you how to read the CSV using csv2cell from the io package. I've tried to modify your existing code as little as possible. The first column is used verbatim (well, with a line break inserted) for the tick labels of the plot. There is also a commented-out line which actually does the date conversion, after which you could use datetick. By the way, if you post Google Drive links, it would be nice to post direct links so that someone can easily grab the CSV or insert the URL in the code, as I've done below.
set (0, "defaultlinelinewidth", 2);
url = "https://drive.google.com/uc?export=download&id=1K_czefz-Wz4HPdvc7YqIqIupPwMi8a7r";
fn = "heatpump.csv";
if (! exist (fn, "file"))
  urlwrite (url, fn);
endif
pkg load io
d = csv2cell (fn);
# convert to serial date numbers
# (not needed if you want to keep the original date format)
#t = datenum (d(2:end,1), "mm/dd/yyyy HH:MM:SS");
data = cell2mat (d(2:end,2:end));
clf;
hold on
t = 1:rows (data);
# Attention: the date/time column was removed above, so the indices are shifted by one
x = data(:,2);
y = data(:,4);
z = data(:,1);
o = data(:,7);
[hax, h1, h2] = plotyy (t, x, t, y);
[hax2, h3, h4] = plotyy (t, z, t, o);
grid on
#set ([h3, h4], "linestyle", "--");
xlabel (hax(1), "Time");
title (hax(2), 'Heat pump analysis');
ylabel (hax(1), "Radiator and hot water temp");
ylabel (hax(2), "Outdoor temp and brine out");
# use date as xtick
# extract them
date_time = d (get(hax2(1), "xtick"), 1);
# break them after the date part
date_time = strrep (date_time, " ", "\n");
# feed them back
set (hax, "xticklabel", date_time)
set (hax2, "xticklabel", date_time)
print ("-S1200,1000", "-F:10", "out.png")
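For readers who want to see the date conversion step in isolation, here is a minimal Python sketch that parses one line in the format shown in the question. It is only an illustration of the mm/dd/yyyy HH:MM:SS pattern used in the commented-out datenum call, not a replacement for the Octave code; parse_row is a hypothetical helper name.

```python
from datetime import datetime

# First seven columns of one line, as quoted in the question.
sample = "02/19/2018 23:07:00,-0.7,47.5,42,47.3,52.1,1.5"

def parse_row(line):
    """Split one CSV line into a datetime and a list of float readings."""
    stamp, *values = line.split(",")
    when = datetime.strptime(stamp, "%m/%d/%Y %H:%M:%S")
    return when, [float(v) for v in values]

when, values = parse_row(sample)
print(when.isoformat(), values)
```

The point of the sketch is that the timestamp column has to be parsed as text with an explicit format string, while the remaining columns can be converted to numbers directly.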

Use column from CSV as a category label for plotting column chart using gnuplot

I have a CSV file looking like:
frameNo dataSeg paritySeg frameType
0 17 3 k
1 2 1 d
2 3 1 d
3 3 1 d
4 3 1 d
5 2 1 d
6 3 1 d
7 3 1 d
8 4 1 d
I'm able to plot a stacked column diagram showing the number of data and parity segments per frame.
What I'd like to add, however, is a different color for those columns (both data and parity) that have the "k" marker in the last column; basically, to distinguish between the two categories "d" and "k".
Is that possible using gnuplot?
Here's the script I'm using:
set style histogram rowstacked;
set style data histograms;
set style fill solid;
set datafile separator "\t";
set terminal png size 2500,1500 enhanced font ",30";
set title "";
set tics font ",25";
set xlabel "Frame #" font ",25";
set ylabel "# of segments" font ",25";
set key outside;
set xrange [0:];
plot "segments.csv" using 2 t "Data", "" using 3 t "Parity";
You could impose a custom condition on the columns being plotted and supply an invalid value (signaling to skip the particular data point) if this condition is not met:
set terminal pngcairo size 1200,600 enhanced font ",30";
set output 'test.png'
set style histogram rowstacked;
set style data histograms;
set style fill solid;
#set datafile separator "\t";
set title "";
set tics font ",25";
set xlabel "Frame #" font ",25";
set ylabel "# of segments" font ",25";
set key outside;
set xrange [0:];
fName = 'segments.csv'
plot \
fName using (strcol(4) eq 'd'?$2:1/0) t "Data d" lc rgb '#666666', \
fName using (strcol(4) eq 'd'?$3:1/0) t "Parity d" lc rgb '#ff0000', \
fName using (strcol(4) eq 'k'?$2:1/0) t "Data k" lc rgb '#000000', \
fName using (strcol(4) eq 'k'?$3:1/0) t "Parity k" lc rgb '#990000'
This gives (using the sample data in your question) a stacked histogram in which the "d" frames and the "k" frames are drawn in different colors.
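The filtering idea in the plot command above can be sketched outside gnuplot as well: each of the four series keeps a value only where frameType matches and holds an invalid value everywhere else. In the Python illustration below, NaN stands in for gnuplot's undefined 1/0, and masked is a hypothetical helper name.

```python
import math

# Sample rows from the question: (frameNo, dataSeg, paritySeg, frameType)
rows = [
    (0, 17, 3, "k"), (1, 2, 1, "d"), (2, 3, 1, "d"),
    (3, 3, 1, "d"), (4, 3, 1, "d"), (5, 2, 1, "d"),
    (6, 3, 1, "d"), (7, 3, 1, "d"), (8, 4, 1, "d"),
]

def masked(rows, column, frame_type):
    """Keep the value where frameType matches, otherwise NaN (stand-in for 1/0)."""
    return [row[column] if row[3] == frame_type else math.nan for row in rows]

series = {
    "Data d": masked(rows, 1, "d"),
    "Parity d": masked(rows, 2, "d"),
    "Data k": masked(rows, 1, "k"),
    "Parity k": masked(rows, 2, "k"),
}
for name, values in series.items():
    print(name, values)
```

Because every frame is valid in exactly one of the "d" or "k" pairs, each column of the chart is drawn by only one pair of series, and that pair determines its colors.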

GNUPlot if statement on plot

I have a CSV data file like this:
Sensor1;value;iteration
Sensor2;value;iteration
Sensor2;value;iteration
Sensor1;value;iteration
Sensor2;value;iteration
Can I plot two different lines based on the value in the first column, one for Sensor1 and another for Sensor2, in the same plot?
At the moment I plot all the data as follows:
set terminal jpeg
set output 'testimage.jpeg'
set autoscale # scale axes automatically
unset log # remove any log-scaling
unset label # remove any previous labels
set xtic auto # set xtics automatically
set ytic auto # set ytics automatically
set datafile separator ";"
set xrange [1:10000]
set yrange [3000:5000]
plot "result_test_day_1.csv" using 5:3:(stringcolumn(1) eq "Sensor1"? $2:1/0) title "a" lc rgb "blue" with lines
plot "result_test_day_1.csv" using 5:3:(stringcolumn(1) eq "Sensor2"? $2:1/0) title "b" lc rgb "red" with lines