Load csv file with integers in Octave 3.2.4 under Windows - csv

I am trying to import in Octave a file (i.e. data.txt) containing 2 columns of integers, such as:
101448,1077
96906,924
105704,1017
I use the following command:
data = load('data.txt')
However, the "data" matrix that results has a 1 x 1 dimension, with all the content of the data.txt file saved in just one cell. If I adjust the numbers to look like floats:
101448.0,1077.0
96906.0,924.0
105704.0,1017.0
the loading works as expected, and I obtain a matrix with 3 rows and 2 columns.
I looked at the various options that can be set for the load command but none of them seem to help. The data file has no headers, just plain integers, comma separated.
Any suggestions on how to load this type of data? How can I force Octave to cast the data as numeric?

The load function is not to read csv files. It is meant to load files saved from Octave itself which define variables.
To read a csv file use csvread ("data.txt"). Also, 3.2.4 is a very old version no longer supported, you should upgrade.

Related

line feed within a column in csv

I have a csv like below. some of columns have line break like column B below. when I doing wc -l file.csv unix is returning 4 but actually these are 3 records. I don't want to replace line break with space, I am going to load data in database using sql loader and want to load data as it is. what should I do so that unix consider line break as one record?
A,B,C,D
1,"hello
world",sds,sds
2,sdsd,sdds,sdds
Unless you're dealing with trivial cases (No quoted fields, no embedded commas, no embedded newlines, etc.), CSV data is best processed with tools that understand the format. Languages like perl and python have CSV parsing libraries available, there are packages like csvkit that provide useful utilities, and more.
Using csvstat from csvkit on your example:
$ csvstat -H --count foo.csv
Row count: 3

xlsread in octave return zero values

I am trying to read a csv file in octave. The file contains a table with both numeric and text data. It also contains information of date and hour. In addition, the first line is in a different format then the rest of the lines since it contains titles.
The csvread can only read numeric data (according to Octave help), so I tried using xlsread as follows:
[NUMARR, TXTARR, RAWARR, LIMITS] = xlsread ('Line.csv')
I get only a matrix of NUMARR with numeric values. However, all other returned variables are empty- their dimension is 0x0.
How do I get all the text and all other information?
TX!
To solve this issue, open your CSV file in Windows notepad and save it as ANSI format instead of UNICODE.

Convert large CSV file into NetCDF

I'd like to convert the file of float vector data written in CSV format which consists of 3 million rows and 150 columns like the following into NetCDF format.
0.3,0.9,1.3,0.5,...,0.9
-5.1,0.1,1.0,8.4,...,6.7
...
First, I tried something like cache-all-the-data-and-then-convert-it algorithm, but it didn't work because it could not allocate the memory for the cache.
So I need the code written in convert-one-by-one algorithm.
Does any one know such solutions?
The memory capacity of my machine is 8 MiB, and it's OK for any programming language such as C, Java, and Python.
With python you can read the file line by line.
with open("myfile.csv") as infile:
for line in infile:
appendtoNetcdf(line)
So you dont have to load all the file contents into memory.
Check the netCDF4-python library, you can create a netcdf4 or many netcdf4 files easily.

all the columns of a csv file cannot be imported in sas dataset

my data set contains 1300000 observations with 56 columns. it is a .csv file and i'm trying to import it by using proc import. after importing i find that only 44 out of 56 columns are imported.
i tried increasing the guessing rows but it is not helping.
P.S: i'm using sas 9.3
If (and only in that case as far as I am aware) you specify the file to load in a filename statement, you have to set the option lrecl to a value that is large enough.
If you don't, the default is only 256. Ergo, if your csv has lines longer than 256, he will not read the full line.
See this link for more information (just search for lrecl): https://support.sas.com/documentation/cdl/en/proc/61895/HTML/default/viewer.htm#a000308090.htm
If you have SAS Enterprise Guide (I think it's now included with all desktop licenses) try out the import wizard. It's excellent. And it will generate code you can reuse with a little editing.
It will take a while to run because it will read your entire file before writing the import logic.

Natural ordering files in directory into a cell array using Octave

I have files being generated by another program/user that have names such as "jh-1.txt, jh-2.txt, ..., jh-100.txt, ..., jh-1024.txt". I'm extracting a column from these files, manipulating the data, and outputting to a new matrix. The only problem is that Octave is using ASCII ordering and not natural ordering when reading in the files. Thus, the output matrix is not ordered in a natural way. My question is, can Octave sort file names in a natural order? I'm getting file names in the standard method:
fileDirectory = '/path/to/directory';
filePattern = fullfile(fileDirectory, '*.txt'); % Selects only the txt files.
dataFiles = dir(filePattern); % Gets the info from the txt files in the directory.
baseFileName = {dataFiles.name}'; % Gets all the txt file names.
I can't rename the files because this is a script for another user. They are on a Windows machine and already have Octave installed with Cygwin and I don't want to make them use the command line more than they have to because they are unfamiliar with it. Alternatively, it would be nice to have the output with the file names in a column but, I haven't figured that one out either (bit of a noob with Octave myself). That way the user could use Excel (which they are familiar with) to sort the columns.
I don't think there's a built in natural sort in Octave. However, there is a natural sort submission on Mathwork's File Exchange. I've not used it, but the comments imply it works in Octave too.