Plotting using a CSV file

I have a CSV file with 5 entries on every row. Each entry indicates whether a network packet was triggered or not. The last entry in every row is the packet size. Each row corresponds to the time elapsed in ms.
Example row:
1 , 0 , 1 , 2 , 117
How do I plot a graph where, for example, the x axis is the row number and y is the value of the 1st entry in every row?

This should get you started (column 0 is gnuplot's pseudo-column for the row number):
set datafile separator ","
plot 'infile' using 0:1
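To check which values `using 0:1` picks out, here is a minimal Python sketch of the same selection. It assumes rows shaped like the example in the question; the function name `first_column_by_row` and the inline sample data are illustrative only and not part of the original answer.

```python
import csv
import io

def first_column_by_row(csv_text, column=0):
    """Return (row_number, value) pairs, mirroring gnuplot's `using 0:1`.

    Row numbers start at 0, like gnuplot's pseudo-column 0.
    `column` is zero-based here, so column=0 corresponds to gnuplot's column 1.
    """
    reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    points = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        points.append((len(points), float(row[column].strip())))
    return points

# Hypothetical sample in the format described in the question.
sample = "1 , 0 , 1 , 2 , 117\n0 , 1 , 0 , 0 , 64\n"
points = first_column_by_row(sample)
print(points)
```

Passing `column=4` would select the packet size instead, which corresponds to `using 0:5` in gnuplot.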

You can also plot to a PNG file using gnuplot (which is free). Enter the following commands at the gnuplot prompt:
gnuplot> set title '<title>'
gnuplot> set ylabel '<yLabel>'
gnuplot> set xlabel '<xLabel>'
gnuplot> set grid
gnuplot> set term png
gnuplot> set output '<Output file name>.png'
gnuplot> plot '<fromfile.csv>'
Note: you always need to give the right extension (.png here) in set output.
The output may also be drawn as points rather than lines, because your data is not continuous. To fix this, simply change the plot line to:
plot '<fromfile.csv>' with lines lt -1 lw 2
More line styling options (dashes, line color, etc.) at:
http://gnuplot.sourceforge.net/demo_canvas/dashcolor.html
gnuplot is available in most Linux distros via the package manager (e.g. on an apt-based distro, run apt-get install gnuplot).
gnuplot is available on Windows via Cygwin.
gnuplot is available on macOS via Homebrew (run brew install gnuplot).

Related

Issue training tesseract-OCR 4 - Empty shape table

I am trying to train Tesseract 4 on particular pictures (to read multimeters with 7-segment displays).
Please note that I am aware of the already trained data from Arthur Augusto at https://github.com/arturaugusto/display_ocr, but I need to train Tesseract on my own data.
In order to train Tesseract, I followed different tutorials (such as https://robipritrznik.medium.com/recognizing-vehicle-license-plates-on-images-using-tesseract-4-ocr-with-custom-trained-models-4ba9861595e7 or https://pretius.com/how-to-prepare-training-files-for-tesseract-ocr-and-improve-characters-recognition/),
but I always run into a problem when running the shapeclustering command with my own data.
(With example data such as https://github.com/tesseract-ocr/tesseract/issues/1174#issuecomment-338448972, everything works fine.)
When I run the shapeclustering command, it produces this output: [screenshot]
My shape_table is then empty, so the training cannot be effective.
With the example data it works fine and the shape_table is properly filled.
I am guessing that I have an issue with box file generation. Here is my process for creating the box file:
I use the
tesseract imageFileName.tif imageFileName batch.nochop makebox
command to generate the box file and then I edit it with jTessBoxEditor.
So I can't see what is wrong with my .box/.tif data pair.
Have a good day & thanks for helping me
Adrien
Here is my full batch script for training, after having generated and edited the box files.
set name=sev7.exp0
set shortName=sev7
echo Run Tesseract for Training..
tesseract.exe %name%.tif %name% nobatch box.train
echo Compute the Character Set..
unicharset_extractor.exe %name%.box
shapeclustering -F font_properties -U unicharset -O %shortName%.unicharset %name%.tr
mftraining -F font_properties -U unicharset -O %shortName%.unicharset %name%.tr
echo Clustering..
cntraining.exe %name%.tr
echo Rename Files..
rename normproto %shortName%.normproto
rename inttemp %shortName%.inttemp
rename pffmtable %shortName%.pffmtable
rename shapetable %shortName%.shapetable
echo Create Tessdata..
combine_tessdata.exe %shortName%.
echo. & pause

OK, so I finally managed to train Tesseract.
The solution is to add a --psm parameter to the command
tesseract.exe %name%.tif %name% nobatch box.train
so that it becomes
tesseract.exe %name%.%typeFile% %name% --psm %psm% nobatch box.train
where %typeFile% is the image file extension and %psm% is the page segmentation mode. The possible psm values are:
REM pagesegmode values are:
REM 0 = Orientation and script detection (OSD) only.
REM 1 = Automatic page segmentation with OSD.
REM 2 = Automatic page segmentation, but no OSD, or OCR
REM 3 = Fully automatic page segmentation, but no OSD. (Default)
REM 4 = Assume a single column of text of variable sizes.
REM 5 = Assume a single uniform block of vertically aligned text.
REM 6 = Assume a single uniform block of text.
REM 7 = Treat the image as a single text line.
REM 8 = Treat the image as a single word.
REM 9 = Treat the image as a single word in a circle.
REM 10 = Treat the image as a single character.
REM 11 = Sparse text. Find as much text as possible in no particular order.
REM 12 = Sparse text with OSD.
REM 13 = Raw line. Treat the image as a single text line bypassing hacks that are Tesseract-specific.
Found at https://github.com/tesseract-ocr/tesseract/issues/434
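As a rough illustration of the fix, the following Python sketch assembles the same box.train command line with an explicit --psm value. The helper name `box_train_command` is hypothetical, and the value 7 is only an example: the answer does not say which page segmentation mode was actually used.

```python
def box_train_command(name, psm, type_file="tif", exe="tesseract"):
    """Build the argument list for the box.train step with an explicit --psm.

    Mirrors: tesseract %name%.%typeFile% %name% --psm %psm% nobatch box.train
    """
    if not 0 <= psm <= 13:
        raise ValueError("psm must be between 0 and 13")
    return [exe, f"{name}.{type_file}", name,
            "--psm", str(psm), "nobatch", "box.train"]

# Example only: psm=7 ("single text line") is an assumption, not from the source.
cmd = box_train_command("sev7.exp0", psm=7)
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine where Tesseract is installed.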

tesseract 5.0 bazaar + user-words config doesn't work

I tried to force Tesseract to use only my word list when performing OCR.
First, I copied the bazaar file to /usr/share/tesseract-ocr/5/tessdata/configs/. This is my bazaar file:
load_system_dawg F
load_freq_dawg F
user_words_suffix user-words
Then I created eng.user-words in /usr/share/tesseract-ocr/5/tessdata. This is my user-words file:
Items
VAT
included
CASH
Then I perform OCR on this image with the command: tesseract -l eng --oem 2 test_small.jpg stdout bazaar
This is my result:
2 Item(s) (VAT includsd) 36,000
casH 40,000
CHANGE 4. 000
As you can see, 'includsd' is not in my user-words file, and it should be 'included'. Besides, I get the same result even without the bazaar config in the command. It looks like my bazaar and eng.user-words config has no effect on the OCR output. So how can I use the bazaar and user-words config to get the desired result?

All you need to do is up-sample the image.
If you up-sample it by a factor of two, the image now reads:
2 Item(s) (VAT included) 36,000
CASH 40,000
CHANGE 4,000
Code:
import cv2
import pytesseract
# Load the image
img = cv2.imread("4nGXo.jpg")
# Convert to the gray-scale
gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Up-sample
gry = cv2.resize(gry, (0, 0), fx=2, fy=2)
# OCR
print(pytesseract.image_to_string(gry))
# Display
cv2.imshow("", gry)
cv2.waitKey(0)

user_words_suffix does not seem to work with --oem 2.
A workaround is to use user_words_file, set to the path of your user-words file.
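A minimal sketch of what that workaround could look like, written as a small Python helper that generates the text of the bazaar config file. It reuses the two load_* lines from the question and swaps user_words_suffix for user_words_file. The helper name `bazaar_config` is hypothetical, the path is the one mentioned in the question, and the result has not been checked against the receipt image.

```python
def bazaar_config(user_words_path):
    """Return the text of a bazaar-style config that points at a word list.

    Uses user_words_file (a full path) instead of user_words_suffix.
    """
    lines = [
        "load_system_dawg F",
        "load_freq_dawg F",
        f"user_words_file {user_words_path}",
    ]
    return "\n".join(lines) + "\n"

# Path taken from the question; adjust it to your installation.
config_text = bazaar_config("/usr/share/tesseract-ocr/5/tessdata/eng.user-words")
print(config_text, end="")
```

The printed text would replace the contents of the bazaar file in the configs directory.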

GNUplot script that works with tab separated .txt but not with .csv even when the datafile separator is changed

I have a GNUplot script that works perfectly with tab-separated data, but my data comes in .csv files and it would be really handy to read them directly. Any ideas?
Here is a sample of the data in comma-separated .csv format; the other file is exactly the same but in a .txt, with fields separated by a tab rather than a comma.
HEADER
HEADER
Timestamp,Date,Time,Value,Units
1413867843,21/10/2014,05:04:03,0.053,µA
1413867243,21/10/2014,04:54:03,0.091,µA
1413866643,21/10/2014,04:44:03,0.084,µA
1413866043,21/10/2014,04:34:03,20.000,µA
1413846241,20/10/2014,23:04:01,0.041,µA
1413845641,20/10/2014,22:54:01,0.056,µA
1413845041,20/10/2014,22:44:01,0.123,µA
1413844441,20/10/2014,22:34:01,20.000,µA
1413824638,20/10/2014,17:03:58,0.075,µA
1413824038,20/10/2014,16:53:58,0.073,µA
1413823438,20/10/2014,16:43:58,0.103,µA
1413822838,20/10/2014,16:33:58,20.000,µA
Here is the problematic GNUPlot script I use
CSV MIN
#!/gnuplot
set terminal pdf enhanced font "sans,6"
#Filename
set output "BIOSENSE GRAPH TEMPLATE.pdf"
set size ratio 0.71
set pointsize 0.1
set datafile separator ","
#DATA FILES
plot 'ACT.csv' using 1:4 every::6 title 'Active' with points pt 5 lc rgb 'red' axes x1y1
And finally this is the error message I get from GNUplot
line 15: x range is invalid
Any help appreciated!
Thanks

How to open gnuplots in full screen and a particular size?

I am plotting graphs in gnuplot and would like to open them in full screen and a particular size.
Previously, I have been outputting graphs in multiplot mode and updating them using reread; so, when I maximise it manually, the plots fill the screen after a few iterations. Now, I also want to save the output as a file. When I open that file, it is in the same small size as the original multiplot output. However, when I maximise it, the plots don't increase in size to fill the screen. I have 2 questions:
How can I open the multiplot file in full screen?
How can I make the output file a particular size?
Here is my current gnuplot code (in a file called gnuplotCode):
set terminal pngcairo dashed enhanced
set output 'foo.png'
set multiplot layout 3, 3
plot for [iter=1:9] path/to/file using 1:(column(iter)) notitle
unset multiplot
unset output
pause 10
reread
I have tried to type the following:
gnuplot -geometry -3360-1050 gnuplotCode # where my screen size is 3360x1050
and:
resolution=$(xrandr | grep '*') && resolution=${resolution% *}
gnuplot -geometry $resolution gnuplotCode
but neither approach works. Please can you tell me how to open gnuplots in full screen and a particular size? Thank you.

You must distinguish between pixel-based terminals (pngcairo, png, canvas, ...) together with all interactive terminals (wxt, x11, qt, windows, aqua), where the size is given in pixels, and vector-based terminals (postscript, svg, etc.), where the size is given in inches or centimeters.
Using the -geometry flag works only for the x11 terminal:
gnuplot -geometry 800x800 -persist -e 'set terminal x11; plot x'
For all other pixel-based terminals you can use the size option to set the canvas size in pixels:
set terminal pngcairo size 800,800
Of course you can also extract the monitor resolution and use that as size. Here you have two variants:
Extract the monitor size on the shell:
monitorSize=$(xrandr | awk '/\*/{sub(/x/,",");print $1; exit}')
gnuplot -e "monitorSize='$monitorSize'; load 'gnuplotCode'"
The file gnuplotCode must then use the gnuplot variable monitorSize as follows:
set macros
set terminal pngcairo size @monitorSize
set output 'foo.png'
plot x
Note that the content of the string variable monitorSize must be used as a macro, i.e. the value is inserted before the whole line is evaluated.
If you don't want that additional line on the shell, you could also call the xrandr command from within the gnuplot script via the system function. In that case the file gnuplotCode would look as follows:
monitorSize=system("xrandr | awk '/\*/{sub(/x/,\",\");print $1; exit}'")
set macros
set terminal pngcairo size @monitorSize
set output 'foobar.png'
plot x**2
which you then call simply with gnuplot gnuplotCode.
Note that the shell command, as written, always extracts the information for the first monitor only.
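To make the extraction step easier to follow, here is a small Python sketch of what the awk one-liner does: it takes the first xrandr line containing `*` (the active mode) and turns its first field from `WxH` into `W,H`. The function name `monitor_size` and the sample xrandr output are illustrative assumptions; real output differs per machine.

```python
def monitor_size(xrandr_output):
    """Return 'W,H' for the first line marked with '*', or None if absent.

    Mirrors: awk '/\\*/{sub(/x/,","); print $1; exit}'
    """
    for line in xrandr_output.splitlines():
        if "*" in line:
            fields = line.split()
            if fields:
                # Replace only the first 'x', as awk's sub() does.
                return fields[0].replace("x", ",", 1)
    return None

# Hypothetical xrandr output, for illustration only.
sample = """Screen 0: minimum 8 x 8, current 3360 x 1050, maximum 32767 x 32767
DVI-0 connected 1680x1050+0+0 (normal left inverted right x axis y axis) 474mm x 296mm
   1680x1050     59.95*+
   1280x1024     75.02    60.02
"""
size = monitor_size(sample)
print(size)
```

On a real system the input could come from `subprocess.run(["xrandr"], capture_output=True, text=True).stdout`, and the result would be passed to gnuplot as the monitorSize variable.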

Plotting CSV with semi-colons and time formatted data with Gnuplot

I have 2 CSV files with over 80k lines each.
The first file has this structure:
12.11.12 - 00:59:58;428,8;
12.11.12 - 00:59:59;428,9;
...
12.11.12 - 21:53:32;592,7;
12.11.12 - 21:53:35;596,4;
...
14.11.12 - 12:31:41;510,0;
14.11.12 - 12:31:41;510,0;
And the second has a different structure:
1;428.9;
1;428.9;
5;428.9;
...
117109;673.6;
117110;672.8;
117111;672.8;
...
214241;497.2;
214241;497.2;
214258;507.3;
How can I plot both of these CSV files in Gnuplot?
P.S. The first column must be x and the second must be y.

First, apparently you can set the delimiter thus:
set datafile separator ";"
Then set the time format for your first file, and set x to be a time axis:
set timefmt "%d.%m.%y - %H:%M:%S"
set xdata time
Plot the first file:
plot "data1.csv" using 1:2
The second file's x values don't seem to be dates, but perhaps seconds elapsed. For that, just do:
set datafile separator ";"
plot "data2.csv" using 1:2
and don't set xdata time. Then you should have an x axis in seconds. If you need to plot both at the same time, it would be simplest to pre-process one to look like the other.
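One way to do that pre-processing is sketched below in Python: it parses the first file's timestamps using the same format string as the timefmt above and rewrites each row as elapsed seconds, matching the layout of the second file. The first file also writes values with a decimal comma (428,8), which the sketch converts to a decimal point. The function names are illustrative. Whether the second file's first column really counts seconds from the same starting moment is an assumption, as the answer itself only says "perhaps".

```python
from datetime import datetime

# Same layout as the gnuplot timefmt "%d.%m.%y - %H:%M:%S".
TIME_FORMAT = "%d.%m.%y - %H:%M:%S"

def parse_line(line):
    """Parse 'DD.MM.YY - HH:MM:SS;value;' into (datetime, float)."""
    stamp, value = line.strip().split(";")[:2]
    # Convert the decimal comma to a decimal point before parsing.
    return datetime.strptime(stamp, TIME_FORMAT), float(value.replace(",", "."))

def to_elapsed_seconds(lines):
    """Convert timestamped rows to (seconds since first row, value) pairs."""
    rows = [parse_line(line) for line in lines if line.strip()]
    if not rows:
        return []
    start = rows[0][0]
    return [(int((t - start).total_seconds()), v) for t, v in rows]

# Lines copied from the first file's sample in the question.
sample = [
    "12.11.12 - 00:59:58;428,8;",
    "12.11.12 - 00:59:59;428,9;",
    "12.11.12 - 21:53:32;592,7;",
]
for seconds, value in to_elapsed_seconds(sample):
    print(f"{seconds};{value};")
```

The printed rows use the same `x;y;` layout as the second file, so both could then be plotted with the same `using 1:2` command and without xdata time.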
