I have a MATLAB script that I would like to run in Octave. But it turns out that the timeseries and synchronize functions from MATLAB are not yet implemented in Octave. So my question is if there is a way to express or replace these functions in Octave.
For understanding, I have two text files with different row lengths, which I want to synchronize into one text file with the same row length over time. The content of the text files is:
Text file 1:
1st column contains the distance
2nd column contains the time
Text file 2:
1st column contains the angle
2nd column contains the time
Here is the part of my code that I use in MATLAB to synchronize the files.
ts1 = timeseries(distance,timed);
ts2 = timeseries(angle,timea);
[ts1 ts2] = synchronize(ts1,ts2,'union');
distance = ts1.Data;
angle = ts2.Data;
Thanks in advance for your help.
edit:
Here are some example files.
input distance
input roation angle
output
The synchronize function seems to create a common timeseries from two separate ones (here, specifically via their union), and then use interpolation (here 'linear') to find interpolated values for both distance and angle at the common timepoints.
An example of how to achieve this to get the same output in octave as your provided output file is as follows.
Note: I had to preprocess your input files first to replace 'decimal commas' with dots, and then 'tabs' with commas, to make them valid csv files.
Distance_t = csvread('input_distance.txt', 1, 0); % skip header row
Rotation_t = csvread('input_rotation_angle.txt', 1, 0); % skip header row
Common_t = union( Distance_t(:,2), Rotation_t(:,2) );
InterpolatedDistance = interp1( Distance_t(:,2), Distance_t(:,1), Common_t );
InterpolatedRotation = interp1( Rotation_t(:,2), Rotation_t(:,1), Common_t );
Output = [ InterpolatedRotation, InterpolatedDistance ];
Output = sortrows( Output, -1 ); % sort according to column 1, in descending order
Output = Output(~isna(Output(:,2)), :); % remove NA entries
(Note, The step involving removal of NA entries was necessary because we did not specify we wanted extrapolation during the interpolation step, and some of the resulting distance values would be outside the original timerange, which octave labels as NA).
Related
I got a problem reading some data from a txt file. I appreciate any suggestion and thank you in advance!
I have a txt file with text/number on top, followed by two tab-separated columns (additionally, they have commas instead of dots).
I want to extract the two columns without text, and replace the commas with dots in order to plot them.
I tried with importdata to be able to replace the commas, but it separates every single character, so I get 36k elements instead of 2048.
Tried dlmread but it ignores the second column...
I have no idea how to proceed without modifying every single file manually.
here is an example of the file:
Data from FLMS012901__118__10-30-26-589.txt Node
Date: Tue Jul 05 10:30:26 CEST 2022
User: Myself
Number of Pixels in Spectrum: 2048
>>>>>Begin Spectral Data<<<<<
338,147 -2183,94
338,527 -2183,94
338,906 -2183,94
339,286 -2251,25
Any suggestions?
EDIT:
Apparently, there was already a solution, even though a bit slow:
% Read file in as a series of strings
fid = fopen('data.txt', 'rb');
strings = textscan(fid, '%s', 'Delimiter', '');
fclose(fid);
% Replace all commas with decimal points
decimal_strings = regexprep(strings{1}, ',', '.');
% Convert to doubles and join all rows together
data = cellfun(#str2num, decimal_strings, 'uni', 0);
data = cat(1, data{:});
On the sample that you provide, the following works:
>> [a,b,c,d] = textread("SO_73502149.txt","%f,%f %f,%f", "headerlines", 6);
>> format free
>> [a+b/1000, c+sign(c).*d/100]
ans =
338.147 -2183.94
338.527 -2183.94
338.906 -2183.94
339.286 -2251.25
However there are some possible traps, according to the way decimal figures are handled in your file, you should adapt the post-processing: If for 338.10 338,1 is printed in the file instead of 338,10 , the decoding would be a bit harder. Whenever c becomes zero, sign(c) would kill the decimal part. A less trivial post-processing would be required.
So I'm currently trying to use Python to transform large sums of data into a neat and tidy .csv file from a .txt file. The first stage is trying to get the 8-digit company numbers into one column called 'Company numbers'. I've created the header and just need to put each company number from each line into the column. What I want to know is, how do I tell my script to read the first eight characters of each line in the .txt file (which correspond to the company number) and then write them to the .csv file? This is probably very simple but I'm only new to Python!
So far, I have something which looks like this:
with open(r'C:/Users/test1.txt') as rf:
with open(r'C:/Users/test2.csv','w',newline='') as wf:
outputDictWriter = csv.DictWriter(wf,['Company number'])
outputDictWriter.writeheader()
rf = rf.read(8)
for line in rf:
wf.write(line)
My recommendation would be 1) read the file in, 2) make the relevant transformation, and then 3) write the results to file. I don't have sample data, so I can't verify whether my solution exactly addresses your case
with open('input.txt','r') as file_handle:
file_content = file_handle.read()
list_of_IDs = []
for line in file_content.split('\n')
print("line = ",line)
print("first 8 =", line[0:8])
list_of_IDs.append(line[0:8])
with open("output.csv", "w") as file_handle:
file_handle.write("Company\n")
for line in list_of_IDs:
file_handle.write(line+"\n")
The value of separating these steps is to enable debugging.
I have a code to calculate the mean of the first five values of each column of a file, for then use these values as a reference point for all set. The problem is that now I need to do the same but for many files. So I will need to obtain the mean of each file to then use these values again with the originals files. I have tried in this way but I obtain an error. Thanks.
%%% - Loading the file of each experiment
myfiles = dir('*.lvm'); % To load every file of .lvm
for i = 1:length(myfiles) % Loop with the number of files
files=myfiles(i).name;
mydata(i).files = files;
mydata(i).T = fileread(files);
arraymean(i) = mean(mydata(i));
end
The files that I need to compute are more or less like this:
Delta_X 3.000000 3.000000 3.000000
***End_of_Header***
X_Value C_P1N1 C_P1N2 C_P1N3
0.000000 -0.044945 -0.045145 -0.045705
0.000000 -0.044939 -0.045135 -0.045711
3.000000 -0.044939 -0.045132 -0.045706
6.000000 -0.044938 -0.045135 -0.045702
Your first line results in 'myfiles' being a structure array with components that you will find defined when you type 'help dir'. In particular, the names of all the files are contained in the structure element myfiles(i).name. To display all the file names, type myfiles.name. So far so good. In the for loop you use 'fileread', but fileread (see help fileread) returns the character string rather than the actual values. I have named your prototype .lvm file DinaF.lvm and I have written a very, very simple function to read the data in that file, by skipping the first three lines, then storing the following matrix, assumed to have 4 columns, in an array called T inside the function and arrayT in the main program
Here is a modified script, where a function read_lvm has been included to read your 'model' lvm file.
The '1' in the first line tells Octave that there is more to the script than just the following function: the main program has to be interpreted as well.
1;
function T=read_lvm(filename)
fid = fopen (filename, "r");
%% Skip by first three lines
for lhead=1:3
junk=fgetl(fid);
endfor
%% Read nrow lines of data, quit when file is empty
nrow=0;
while (! feof (fid) )
nrow=nrow + 1;
thisline=fscanf(fid,'%f',4);
T(nrow,1:4)=transpose(thisline);
endwhile
fclose (fid);
endfunction
## main program
myfiles = dir('*.lvm'); % To load every file of .lvm
for i = 1:length(myfiles) % Loop with the number of files
files=myfiles(i).name;
arrayT(i,:,:) = read_lvm(files);
columnmean(i,1:4)=mean(arrayT(i,:,:))
end
Now the tabular values associated with each .lvm file are in the array arrayT and the mean for that data set is in columnmean(i,1:4). If i>1 then columnmean would be an array, with each row containing the files for each lvm file. T
This discussion is getting to be too distant from the initial question. I am happy to continue to help. If you want more help, close this discussion by accepting my answer (click the swish), then ask a new question with a heading like 'How to read .lvm files in Octave'. That way you will get the insights from many more people.
I habitually use csvRead in scilab to read my data files however I am now faced with one which contains blocks of 200 rows, preceeded by 3 lines of headers, all of which I would like to take into account.
I've tried specifying a range of data following the example on the scilab help website for csvRead (example is right at the bottom of the page) (https://help.scilab.org/doc/6.0.0/en_US/csvRead.html) but I always come out with the same error messages :
The line and/or colmun indices are outside of the limits
or
Error in the column structure.
My first three lines are headers which I know can cause a problem but even if I omit them from my block-range, I still have the same problem.
Otherwise, my data is ordered such that I have my three lines of headers (two lines containing a header over just one or two columns, one line containing a header over all columns), 200 lines of data, and a blank line - this represents data from one image and I have about 500 images in the file, I would like to be able to read and process all of them and keep track of the headers because they state the image number which I need to reference later. Example:
DTN-dist_Devissage-1_0006_0,,,,,,
L0,,,,,,
X [mm],Y [mm],W [mm],exx [1] - Lagrange,eyy [1] - Lagrange,exy [1] - Lagrange,Von Mises Strain [1] - Lagrange
-1.13307,-15.0362,-0.00137507,7.74679e-05,8.30045e-05,5.68249e-05,0.00012711
-1.10417,-14.9504,-0.00193334,7.66086e-05,8.02914e-05,5.43132e-05,0.000122655
-1.07528,-14.8647,-0.00249155,7.57493e-05,7.75786e-05,5.18017e-05,0.0001182
Does anyone have a solution to this?
My current code, following an adapted version of the Scilab-help example looks like this (I have tried varying the blocksize and iblock values to include/omit headers:
blocksize=200;
C1=1;
C2=14;
iblock=1
while (%t)
R1=(iblock-1)*blocksize+4;
R2=blocksize+R1-1;
irange=[R1 C1 R2 C2];
V=csvRead(filepath+filename,",",".","",[],"",irange);
iblock=iblock+1
end
Errors
The CSV
A lot's of your problem comes from the inconsistency of the number of coma in your csv file. Opening it in LibreOffice Calc and saving it puts the right number of comma, even on empty lines.
R1
Your current code doesn't position R1 at the beginning of the values. The right formula is
R1=(iblock-1)*(blocksize+blanksize+headersize)+1+headersize;
End of file
Currently your code raise an error and the end of the file because R1 becomes greater than the number of lines. To solve this, you can specify the maximum number of block or test the value of R1 against the number of lines.
Improved solution for much bigger file.
When solving your probem with a big file, two problems were raised :
We need to know the number of blocks or the number of lines
Each call of csvRead is really slow because it process the whole file at each call (1s / block !)
My idea was to read the whole file and store it in a string matrix ( since mgetl as been improved since 6.0.0 ), then use csvTextScan on a submatrix. Doing so also removes the manual writing of the number of block/lines.
The code follows :
clear all
clc
s = filesep()
filepath='.'+s;
filename='DTN_full.csv';
// header is important as it as the image name
headersize=3;
blocksize=200;
C1=1;
C2=14;
iblock=1
// let save everything. Good for the example.
bigstruct = struct();
// Read all the value in one pass
// then using csvTextScan is much more efficient
text = mgetl(filepath+filename);
nlines = size(text,'r');
while ( %t )
mprintf("Block #%d",iblock);
// Lets read the header
R1=(iblock-1)*(headersize+blocksize+1)+1;
R2=R1 + headersize-1;
// if R1 or R1 is bigger than the number of lines, stop
if sum([R1,R2] > nlines )
mprintf('; End of file\n')
break
end
// We use csvTextScan ony on the lines that matters
// speed the program, since csvRead read thge whole file
// every time it is used.
H=csvTextScan(text(R1:R2),",",".","string");
mprintf("; %s",H(1,1))
R1 = R1 + headersize;
R2 = R1 + blocksize-1;
if sum([R1,R2]> nlines )
mprintf('; End of file\n')
break
end
mprintf("; rows %d to %d\n",R1,R2)
// Lets read the values
V=csvTextScan(text(R1:R2),",",".","double");
iblock=iblock+1
// Let save theses data
bigstruct(H(1,1)) = V;
end
and returns
Block #1; DTN-dist_0005_0; rows 4 to 203
....
Block #178; DTN-dist_0710_0; rows 36112 to 36311
Block #179; End of file
Time elapsed 1.827092s
---- Update with what I got so far and what's left to resolve can be found in point 3 below ----
Using Octave I want to create 30 horizontal box and whisker plots without spread (x-axis) from 30 different GeoTIFF's. This is a sketch of how I would like the plot to look like:
Ideally the best solution for me would be an Octave code (workflow) that would allow me to place multiple GeoTIFFs in one directory and then with one click create a box and whisker plot for all GeotIFFs at once - just like the sketch above.
A GeoTIFF-sample with 3 GeoTIFF's can be downloaded here. The file looks like this in QGIS:
It holds elevation values on band 1 (the ones that each box and whisker plot should be based on, and no data values (-999), the no-data values should be excluded from the plot.
Right now this is what I got:
Using img = imread ("filname.tif") gets the file into Octave. Using hist (img(:), 200); shows that all cells are concentrated around 65300. imagesc (img, [65100 65600]) follwed by colorbar displays the image extent but's it's clear that this way simply doesn't import the real cell values. I can't find a working solution to import GeoTIFF's with cell values, therefor my current work-around is exporting the GeoTIFF from QGIS with gdal_translate -of aaigrid which creates a .asc-file that I manually edit to remove header rows, rename to .csv and load into Octave. That .csv can be found here.
To load it and create a box plot I'm currently using this code (thanks to #Andy and #Cris Luengo):
pkg load statistics
s = urlread ("https://drive.google.com/uc?export=download&id=1RzJ-EO0OXgfMmMRG8wiCBz-51RcwSM5h");
o = str2double (strsplit (s, ";"));
o(isnan (o)) = [];
boxplot (o)
set(gca,"xtick",[])
view([-90 90])
print out.png
The results is pretty close but I'm still failing to: A) load GeoTIFF's directly from a folder. If this is not possible I'm gonna have to modify the code to load all *.csv in a directory to the same box plot and label each plot by filename (which I'm unsure how to accomplish. B) to get the x-axis reversed (going from 200-450, not the other way around). This is caused by the view([-90 90]) that I use to make the box plot horizontal instead of vertical which is needed for layout reasons.
Anyone with any ideas on how to resolve the last adjustments?
---- Background info ----
I have 30 GeoTIFFs containing results from a viewshed analysis, for every 2x2 meter square there is a value the tells me how high a building can be (in meters) before it's visible from the viewshed point. The results cover the whole city of Stockholm but the above mentioned 30 GeoTIFFs are smaller clips of an area where new development is planned. The results help planners to understand how new development might effect each of the 30 places (that are important for cultural heritage management).
As part of a bigger PDF-report (where these results are visualized with different maps in different scales) I'm trying to produce a box and whisker plot (as a compliment to the maps) the gives the reader an overview over how much space is there is left at the planned development area, based on each of the 30 viewshed (GeoTIFF) results (one box and whisker for each of the 30 locations). Below is an example of how a map in the report can look like:
Does not directly read GeoTIFF but calls gdal_translate under the hood. Just place all your .tif in the same directory. Make sure gdal_translate is in your PATH:
pkg load statistics
clear all;
fns = glob ("*.tif");
for k=1:numel (fns)
ofn = tmpnam;
cmd = sprintf ('gdal_translate -of aaigrid "%s" "%s"', fns{k}, ofn);
[s, out] = system (cmd);
if (s != 0)
error ('calling gdal_translate failed with "%s"', out);
endif
fid = fopen (ofn, "r");
# read 6 headerlines
hdr = [];
for i=1:6
s = strsplit (fgetl (fid), " ");
hdr.(s{1}) = str2double (s{2});
endfor
d = dlmread (fid);
# check size against header
assert (size (d), [hdr.nrows hdr.ncols])
# set nodata to NA
d (d == hdr.NODATA_value) = NA;
raw{k} = d;
# create copy with existing values
raw_v{k} = d(! isna (d));
fclose (fid);
endfor
## generate plot
boxplot (raw_v)
set (gca, "xtick", 1:numel(fns),
"xticklabel", strrep (fns, ".tif", ""));
view ([-90 90])
zoom (0.95)
print ("out.png")
gives