I have a matrix stored in a .csv file with approximately 0.6 million data points that I would like to visualize in a 3D plot. Since my computer struggled with the amount of data, I changed the command from:
splot "file.csv" matrix w pm3d
to
splot "file.csv" matrix every 5::50::3000 w pm3d
My intention was to plot only rows 50 to 3000, using only every 5th row. Each row contains 100 columns. However, the command cut the first 50 rows and columns, used every 5th row and column, and ended at line 3500.
How do I use the every command on my rows only?
I also tried to combine the using command with the every command in order to define my row with the every command but I couldn't get it to work properly.
Short answer: Use
splot "file.csv" matrix every :5::50::3000 w pm3d
Long answer: The description of the every option is:
plot 'file' every {<point_incr>}
{:{<block_incr>}
{:{<start_point>}
{:{<start_block>}
{:{<end_point>}
{:<end_block>}}}}}
The terms point and block refer to the usual data file structure, in which two data blocks are separated by an empty line.
With the matrix data format, read point as column and block as row. So every 1:1 selects all points, every 2:1 selects every second column and every row, and every 1:2 (or every :2) selects every column and every second row.
To check this, use a simple data file named file:
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
and test:
splot 'file' matrix with lines, '' matrix every :2
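To see which rows and columns a given every spec keeps, the selection can be mimicked outside gnuplot. The following is only a sketch in Python, not gnuplot code: it assumes zero-based indices, an inclusive end index, and increments counted from the start index. The helper name every_indices and the 6000 x 100 shape (about 0.6 million values with 100 columns) are assumptions for illustration.

```python
def every_indices(count, incr=1, start=0, end=None):
    """Indices kept along one axis of an `every` spec (zero-based, inclusive end)."""
    last = count - 1 if end is None else min(end, count - 1)
    return list(range(start, last + 1, incr))


# `every :5::50::3000` on a matrix: all columns, every 5th row from 50 to 3000.
N_ROWS, N_COLS = 6000, 100  # assumed shape, roughly 0.6 million values
rows = every_indices(N_ROWS, incr=5, start=50, end=3000)
cols = every_indices(N_COLS)

print(rows[:3], rows[-1], len(rows), len(cols))
# prints: [50, 55, 60] 3000 591 100
```

Under these assumptions the reduced surface has 591 rows of 100 columns instead of the full matrix.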
I'm working with a csv file from a customer, which holds a large amount of data. The data is extracted from an SQL database and the commas therefore signify the different columns. In one of these columns there are 10 digit numbers. For some reason all 10 digit numbers starting with 0 have been converted to 9 digit numbers with the 0 removed. I need to find all these instances and insert a 0 at the beginning of the 9 digit number.
A complication in the data is that another column also contains 9 digit numbers, and these do not need to be modified. I can assume, however, that all those numbers start with 0 and all the numbers I need to find do not start with 0.
I'm currently using Notepad++ to try to fix the problem and found the regular expression \d{9}, which finds all numbers with 9 digits, but that is not what I'm looking for.
Below is an example of how the data could look. The column that needs all 9 digit numbers converted is on the left, and the other column with 9 digit numbers is on the right.
An example of the data that is causing the trouble could be:
Column 1      Column 2
2323232323    002132413
231985313     004542435
In this example I need to find the second line of column 1 and insert a 0 in front of the number.
Ctrl+H
Find what: \b(?!0)\d{9}\b
Replace with: 0$0
TICK Wrap around
SELECT Regular expression
Replace all
Explanation:
\b # word boundary, make sure there is no digit before
(?!0) # negative lookahead, make sure the next character is not 0
\d{9} # 9 digits
\b # word boundary, make sure there is no digit after
Replacement:
0 # 0 to be inserted
$0 # the whole match (i.e. 9 digits)
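The same find and replace can be tried outside Notepad++ on a small sample before touching the real file. This is a minimal sketch in Python, whose re module accepts the same pattern; the function name restore_leading_zero and the sample string are made up for illustration, and Python writes the whole match as \g<0> where Notepad++ uses $0.

```python
import re

# Same pattern as in the Notepad++ "Find what" field.
NINE_DIGITS_NO_LEADING_ZERO = re.compile(r"\b(?!0)\d{9}\b")


def restore_leading_zero(text):
    """Prefix a 0 to every standalone 9 digit number that does not start with 0."""
    return NINE_DIGITS_NO_LEADING_ZERO.sub(r"0\g<0>", text)


# Two rows from the example table, written as comma separated lines.
sample = "2323232323,002132413\n231985313,004542435"
fixed = restore_leading_zero(sample)
print(fixed)
# prints:
# 2323232323,002132413
# 0231985313,004542435
```

The 10 digit number and both numbers that start with 0 are left alone; only 231985313 gets its leading zero back.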
In Notepad++, press Ctrl+H (the search and replace dialog).
Tick Regular expression
Find what: ([^0-9])(\d{9})([^0-9])
Replace with: \10\2\3
Explanation:
([^0-9])(\d{9})([^0-9]) matches a 9-digit number surrounded by a non-digit on each side (including a line break, a comma, etc.):
Each (....) "captures" a group for later use (in "Replace with").
[^0-9] is a non-digit character
\d{9} is a 9-digit number
\10\2\3 is a 0 right after the first captured group \1 (it was just one character here) followed by the 9-digit number (2nd captured group: \2) and the character that was after that number (3rd captured group: \3).
Limitation:
It won't match a number at the very beginning of the file (before any other character) or at the very end (after every character). Adding a newline at the end of the file is one workaround, or fix the last number manually if there is no newline before EOF.
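The limitation can be reproduced on a small sample. The sketch below uses Python rather than Notepad++, so the replacement is written \g<1>0\g<2>\g<3> (Python would read \10 as group 10). The names pad_grouped and pad_lookaround are hypothetical, and the lookaround pattern is an alternative added here for comparison, not part of the answer above.

```python
import re

# Pattern from the answer: a 9 digit run with one non-digit captured on each side.
GROUPED = re.compile(r"([^0-9])(\d{9})([^0-9])")
# Assumed alternative: lookarounds consume no neighbours, (?!0) skips leading zeros.
LOOKAROUND = re.compile(r"(?<!\d)(?!0)\d{9}(?!\d)")


def pad_grouped(text):
    return GROUPED.sub(r"\g<1>0\g<2>\g<3>", text)


def pad_lookaround(text):
    return LOOKAROUND.sub(r"0\g<0>", text)


# The number to fix sits at the very start; no trailing newline at the end.
sample = "231985313,004542435\n2323232323,002132413"

print(pad_grouped(sample))
print(pad_lookaround(sample))
```

On this sample the grouped pattern skips 231985313 because nothing precedes it, and it pads 004542435 because it has no check on the leading digit. The lookaround version pads only 231985313.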
I want to use mixtools to separate 1, 2 and 3+ year old cohorts in shellfish length-frequency data. I am totally new to R coding. The package example uses the Old Faithful geyser data, but that is merely a list of 272 data points. I have various tables of lengths (size class midpoints) and frequencies, generally about 15 length classes with counts between 0 and 50 in each. I can create a data frame from my MS Excel table, but I am not sure how to pass it to normalmixEM(). Thanks.
I would like to prepare a script file to draw a 3D plot of some kinetic spectroscopy results. In the experiment, the absorption spectrum of a solution is measured sequentially at increasing times from t0 to tf, with a constant time step Δt.
The plot will show the variation of absorbance (Z) with wavelength and time.
The data are recorded using a UV-Vis spectrometer and saved as a CSV text file.
The first column of the table in the file contains the wavelengths of the spectra. After it comes one column for each measured spectrum, so the number of columns depends on the total time and the time interval between measurements. The time for each spectrum appears in the header line.
I wonder if the data can be plotted directly with a minimum of preformatting and without the need to rewrite the data in a more standard XYZ format.
The structure of the data file is something like this:
Title; espectroscopia UV-Vis
Comment;
Date; 23/10/2018 16:41:12
Operator; laboratorios
System Name; Undefined
Wavelength (nm); 0 Min; 0,1 Min; 0,2 Min; 0,3 Min; ... 28,5 Min
400,5551; 1,491613E-03; 1,810312E-03; 2,01891E-03; ... 4,755786E-03
... ... ... ... ... ...
799,2119; -5,509266E-04; 3,26314E-04; -4,319865E-04; ... -5,087912E-04
(EOF)
A copy of sample data is included in the file kinetic_spectroscopy.csv.
Thanks.
Your data is in an acceptable form for gnuplot, but persuading the program to plot this as one line per wavelength rather than a gridded surface is more difficult. First let's establish that the file can be read and plotted. The following commands should read in the x/y coordinates (x = first row, y = first column) and the z values to construct a surface.
DATA = 'espectros cinetica.csv'
set datafile separator ';' # csv file with semicolon
# Your data uses , as a decimal point.
set decimal locale # The program can handle this if your locale is correct.
show decimal # confirm this by inspecting the output from "show".
set title DATA
set ylabel "Wavelength"
set xlabel "Time (min)"
set xyplane 0
set style data lines
splot DATA matrix nonuniform using 1:2:3 lc palette
This actually looks OK with your data. For a smaller number of scans it is probably not what you would want. In order to plot separate lines, one per scan, we could break this up into a sequence of line plots rather than a single surface plot:
DATA = 'espectros cinetica.csv'
set datafile separator ";"
set decimal locale
unset key
set title DATA
set style data lines
set ylabel "Wavelength"
set xlabel "Time (min)"
set xtics offset 0,-1 # move labels away from axis
splot for [row=0:*] DATA matrix nonuniform every :::row::row using 1:2:3
This is what I get for the first 100 rows of your data file. The row data is colored sequentially by gnuplot linetypes. Other coloring schemes are possible.
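If the locale required by set decimal locale is not available on your machine, one possible fallback (not part of the commands above) is to convert the decimal commas before plotting. This is a minimal Python sketch that assumes the table starts at the line beginning with Wavelength and that the metadata lines before it can be dropped. The function name normalize_decimal_commas is made up for illustration.

```python
def normalize_decimal_commas(lines):
    """Keep the table (header line onward) and turn ',' decimals into '.'."""
    converted = []
    in_table = False
    for line in lines:
        if line.startswith("Wavelength"):
            in_table = True  # assumed marker for the header row
        if in_table and line.strip():
            converted.append(line.replace(",", "."))
    return converted


# A few lines in the layout shown in the question.
sample = [
    "Title; espectroscopia UV-Vis",
    "Comment;",
    "Wavelength (nm); 0 Min; 0,1 Min; 0,2 Min",
    "400,5551; 1,491613E-03; 1,810312E-03; 2,01891E-03",
]

for row in normalize_decimal_commas(sample):
    print(row)
```

A file written from the converted lines keeps ; as the separator, so set datafile separator ';' still applies.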
I'm using gdal_rasterize to rasterize a simple shapefile of points. Each point simply consists of an X coordinate, a Y coordinate and an integer data value. Everything about the output file is fine except for a single row containing only NoData, which the process mysteriously inserts about 3/4 of the way down the raster. All rows below it then appear misaligned by 100 m to the south, even though the shapefile contains data for that anomalous row.
I have tried creating other formats instead of TIFF, such as EHdr, but they all turn out the same.
Thinking it was memory related, I tried reducing the extent:
If I reduce the extent to rasterize only below the inserted null row, the resulting output is still offset.
If I reduce the extent to rasterize only above the inserted null row, the output is fine for that portion, as it is when I do the whole extent.
I took a thin sliver of the extent, reducing the number of columns but keeping the rows, and the same thing occurred.
So I no longer think it is memory related.
The output raster is a simple 6256 columns by 12361 rows TIFF on a 100 m x 100 m grid.
The extent is 45080,670080,4355,1240255 and the CRS is EPSG:27700.
These are the gdal_rasterize switches I used:
gdal_rasterize -l !fileOUT! -a OP_DATAFIELD -tr 100.0 100.0 -a_nodata -9999 -te 44780.0 4155.0 670380.0 1240255.0 -ot Int16 C:\WorkingMDT\!SHPfileIN!.shp C:\WorkingMDT\!fileOUT!.tiff
What I need is a raster without the anomalous row 3/4 of the way down that offsets all subsequent rows.
What is causing that anomalous row to be inserted?
I can correct this manually by converting to .asc and editing out the anomalous row, but I'd rather find the programmatic cause.
All help and consideration much appreciated. Here is a picture of the issue: green is the raster created, and the blue crosses are the original data points. The inserted row of NoData and the subsequent downward shift can be clearly seen.
I have a CSV file which is generated by a process that outputs the data in pre-defined bins (say from -100 to +100 in steps of 10). So, each line looks somewhat like this:
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20
i.e. 20 comma separated values, the first representing the frequency in the range -100 to -90, while the last represents the frequency between 90 to 100.
The problem is, Gnuplot seems to require the raw data for it to be able to generate a histogram, whereas I have only the frequency distribution. How do I proceed in this case? I'm looking for the simplest possible histogram, that perhaps displays the data using vertical bars.
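You already have histogram data (the binned frequencies), so you must not use gnuplot's histogram style.
Generate the x-values from the line numbers and draw simple bars with the boxes style. This assumes one frequency value per line in the data file:
plot dataf using (($0-10)*10):1 with boxes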
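The x-values produced by ($0-10)*10 are the left edges of the bins (-100, -90, ..., 90). Because the question's file has all 20 values on one comma-separated line, a small conversion step may be needed first. This is a sketch in Python, assuming the bin layout from the question (20 bins of width 10 starting at -100). The function names are made up for illustration.

```python
def parse_counts(line):
    """Split one comma separated line of frequencies into integers."""
    return [int(value) for value in line.strip().split(",")]


def bin_left_edges(n_bins, low=-100, width=10):
    """Left edge of each bin; equals (i - 10) * 10 for the defaults."""
    return [low + i * width for i in range(n_bins)]


line = "1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20"
counts = parse_counts(line)
edges = bin_left_edges(len(counts))

# One "x count" pair per line; add width / 2 to x if bin centres are preferred.
for x, count in zip(edges, counts):
    print(x, count)
```

The printed pairs form a two-column file that can be plotted with using 1:2 with boxes.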