I'm trying to load a CSV file that includes strings into Octave. I found that the load function doesn't work on files with strings, so I tried csvread. However, when I try to assign the result of the function to a variable, I get an error. For example, for a file named churn.csv,
csvread churn.csv will work without an error, but
M = csvread.churn.csv will output the error "churn is not defined near line 1 column 1".
What could be the problem?
csvread churn.csv
M = csvread churn.csv
I tried converting my .csv file to .dat format and tried to load the file into Octave. It throws an error:
unable to find file filename
I also tried to load the file in .csv format using the syntax
x = csvread(filename)
and it throws the error:
'filename' undefined near line 1 column 13.
I also tried opening the file in the editor and then loading it, and now it shows me
warning: load: 'filepath' found by searching load path
error: load: unable to determine file format of 'Salary_Data.dat'.
How can I load my data?
>> load Salary_Data.dat
error: load: unable to find file Salary_Data.dat
>> Salary_Data
error: 'Salary_Data' undefined near line 1 column 1
>> Salary_Data
error: 'Salary_Data' undefined near line 1 column 1
>> Salary_Data
error: 'Salary_Data' undefined near line 1 column 1
>> x = csvread(Salary_Data)
error: 'Salary_Data' undefined near line 1 column 13
>> x = csvread(Salary_Data.csv)
error: 'Salary_Data' undefined near line 1 column 13
>> load Salary_Data.dat
warning: load: 'C:/Users/vaith/Desktop\Salary_Data.dat' found by searching load path
error: load: unable to determine file format of 'Salary_Data.dat'
>> load Salary_Data.csv
warning: load: 'C:/Users/vaith/Desktop\Salary_Data.csv' found by searching load path
error: load: unable to determine file format of 'Salary_Data.csv'
Salary_Data.csv
YearsExperience,Salary
1.1,39343.00
1.3,46205.00
1.5,37731.00
2.0,43525.00
2.2,39891.00
2.9,56642.00
3.0,60150.00
3.2,54445.00
3.2,64445.00
3.7,57189.00
3.9,63218.00
4.0,55794.00
4.0,56957.00
4.1,57081.00
4.5,61111.00
4.9,67938.00
5.1,66029.00
5.3,83088.00
5.9,81363.00
6.0,93940.00
6.8,91738.00
7.1,98273.00
7.9,101302.00
8.2,113812.00
8.7,109431.00
9.0,105582.00
9.5,116969.00
9.6,112635.00
10.3,122391.00
10.5,121872.00
Ok, you've stumbled through a whole pile of issues here.
It would help if you didn't give us error messages without the commands that produced them.
The first message means you were telling Octave to open something called filename and it couldn't find anything called filename. Did you define the variable filename? Your second command and its error message suggest you didn't.
Do you know what Octave's working directory is? Is it the same as where the file is located? From the response to your load commands, I'd guess not. The file is located at C:/Users/vaith/Desktop. Octave's working directory is probably somewhere else.
(Try the pwd command and see what it tells you. Use the file browser or the cd command to navigate to the same location as the file. help pwd and help cd commands would also provide useful information.)
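As a rough sketch of that check, something like the following could be run before trying to load anything. The file name is just the one from the question, and exist with the 'file' option also searches the load path, so a result of 2 only says Octave can find a file by that name somewhere it looks.

```octave
% File name taken from the question; substitute your own.
fname = 'Salary_Data.csv';

% Show where Octave is currently working.
cwd = pwd();
printf('Working directory: %s\n', cwd);

% exist(..., 'file') returns 2 when Octave can locate a file by that name.
found = (exist(fname, 'file') == 2);
if found
  disp('File found.');
else
  disp('File not found: cd to its folder or pass the full path.');
end
```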
The load command, used in command syntax (load file.txt), treats its argument as a string whether or not you put quotes around it. In function syntax (load('file.txt') or csvread('file.txt')), the file name must be passed as a string, hence the quotes around file.txt. Without the quotes, all of your csvread calls treated Salary_Data as a variable name, not a file name.
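Here is a small sketch of the difference. The sample file name and its contents are made up so the snippet can run anywhere; only the quoting behaviour is the point.

```octave
% Write a tiny numeric CSV into the temp folder (hypothetical sample data).
fname = fullfile(tempdir(), 'sample_numbers.csv');
fid = fopen(fname, 'w');
fprintf(fid, '1.1,39343\n1.3,46205\n');
fclose(fid);

% Function syntax: the argument must evaluate to a string.
% Here fname is a variable that holds the file name as a string.
a = csvread(fname);
disp(a);

% Command syntax passes every word after the command as a string, so
%   load Salary_Data.csv
% is the same call as
%   load('Salary_Data.csv')
% Command syntax cannot be combined with an assignment (x = ...).

% Function syntax without quotes looks for a variable instead:
%   csvread(Salary_Data.csv)   % error: 'Salary_Data' undefined

delete(fname);   % clean up the sample file
```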
Last, the fact that load couldn't read your data isn't overly surprising. Octave is trying to guess what kind of file it is and how to load it. I assume you tried help load to see what the different command options are? You can give it different options to help Octave figure it out. If it actually is a csv file though, and is all numbers not text, then csvread might still be your best option if you use it correctly. help csvread would be good information for you.
It looks from your data like you have a header line that is probably confusing the load command. For data that is this simply formatted, the csvread command can bring in the data. It will replace your header text with zeros.
So, first, navigate to the location of the file:
>> cd C:/Users/vaith/Desktop
then open the file:
>> mydata = csvread('Salary_Data.csv')
mydata =
0.00000 0.00000
1.10000 39343.00000
1.30000 46205.00000
1.50000 37731.00000
2.00000 43525.00000
...
If you plan to reuse the filename, you can assign it to a variable, then open the file:
>> myfile = 'Salary_Data.csv'
myfile = Salary_Data.csv
>> mydata = csvread(myfile)
mydata =
0.00000 0.00000
1.10000 39343.00000
1.30000 46205.00000
1.50000 37731.00000
2.00000 43525.00000
...
Notice how the filename is stored and used as a string with quotation marks, but the variable name is not. Also, csvread converted the non-numeric header data to zeros. The help for csvread and dlmread shows you how to change that to something other than zero, or to skip a certain number of rows. If you want to preserve the text, you'll have to use some other input function.
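For instance, skipping the header row could look roughly like this. The sample file is generated in the temp folder with the first few rows from the question so the snippet is self-contained, and reading the header with fgetl and strsplit is just one possible way to keep the column names, not the only one.

```octave
% Build a small sample file (first rows of the data in the question).
fname = fullfile(tempdir(), 'salary_sample.csv');
fid = fopen(fname, 'w');
fprintf(fid, 'YearsExperience,Salary\n1.1,39343.00\n1.3,46205.00\n');
fclose(fid);

% Start reading at row offset 1, column offset 0 (offsets are zero-based),
% which skips the header line instead of turning it into zeros.
data = csvread(fname, 1, 0);

% dlmread does the same with the separator given explicitly.
data2 = dlmread(fname, ',', 1, 0);

% One way to keep the header text: read the first line yourself.
fid = fopen(fname, 'r');
header = strsplit(fgetl(fid), ',');
fclose(fid);

disp(header);
disp(data);

delete(fname);   % clean up the sample file
```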
I am having issues reading a .dat file into a dataframe. I think the issue is with the delimiter. I have included a screen shot of what the data in the file looks like below. My best guess is that it is tab delimited between columns and then new-line delimited between rows. I have tried reading in the data with the following commands:
df = CSV.File("FORCECHAIN00046.dat"; header=false) |> DataFrame!
df = CSV.File("FORCECHAIN00046.dat"; header=false, delim = ' ') |> DataFrame!
My result either way is just a DataFrame with only one column, containing all the data from each column concatenated into one string. I even tried to specify the types with the following code:
df = CSV.File("FORCECHAIN00046.dat"; types=[Float64,Float64,Float64,Float64,
Float64,Float64,Float64,Float64,Float64,Float64,Float64,Float64]) |> DataFrame!
And I received the following error:
┌ Warning: 2; something went wrong trying to determine row positions for multithreading; it'd be very helpful if you could open an issue at https://github.com/JuliaData/CSV.jl/issues so package authors can investigate
I can work around this by uploading it into google sheets and then downloading a csv, but I would like to find a way to make the original .dat file work.
Part of the issue here is that .dat is not a proper file format—it's just something that seems to be written out in a somewhat human-readable format with columns of numbers separated by variable numbers of spaces so that the numbers line up when you look at them in an editor. Google Sheets has a lot of clever tricks built in to "do what you want" for all kinds of ill-defined data files, so I'm not too surprised that it manages to parse this. The CSV package on the other hand supports using a single character as a delimiter or even a multi-character string, but not a variable number of spaces like this.
Possible solutions:
if the files aren't too big, you could easily roll your own parser that splits each line and then builds a matrix
you can also pre-process the file turning multiple spaces into single spaces
That's probably the easiest way to do this and here's some Julia code (untested since you didn't provide test data) that will open your file and convert it to a more reasonable format:
function dat2csv(dat_path::AbstractString, csv_path::AbstractString)
    open(csv_path, write=true) do io
        for line in eachline(dat_path)
            join(io, split(line), ',')
            println(io)
        end
    end
    return csv_path
end

function dat2csv(dat_path::AbstractString)
    base, ext = splitext(dat_path)
    ext == ".dat" ||
        throw(ArgumentError("file name doesn't end with `.dat`"))
    return dat2csv(dat_path, "$base.csv")
end
You would call this function as dat2csv("FORCECHAIN00046.dat") and it would create the file FORCECHAIN00046.csv, which would be a proper CSV file using commas as delimiters. That won't work well if the files contain any values with commas in them, but it looks like they are just numbers, in which case it should be fine. So you can use this function to convert the files to proper CSV and then load that file with the CSV package.
A little explanation of the code:
the two-argument dat2csv method opens csv_path for writing and then calls eachline on dat_path to read one line from it at a time
eachline strips any trailing newline from each line, so each line will be a bunch of numbers separated by whitespace with some leading and/or trailing whitespace
split(line) does the default splitting of line which splits it on whitespace, dropping any empty values—this leaves just the non-whitespace entries as strings in an array
join(io, split(line), ',') joins the strings in the array together, separated by the , character and writes that to the io write handle for csv_path
println(io) writes a newline after that—otherwise everything would just end up on a single very long line
the one-argument dat2csv method calls splitext to split the file name into a base name and an extension, checking that the extension is the expected .dat and calling the two-argument version with the .dat replaced by .csv
Try using the readdlm function from the DelimitedFiles library, and convert the result to a DataFrame afterwards:
using DelimitedFiles, DataFrames
df = DataFrame(readdlm("FORCECHAIN00046.dat"), :auto)
I am trying to read a CSV file in Octave. The file contains a table with both numeric and text data. It also contains date and hour information. In addition, the first line is in a different format than the rest of the lines, since it contains titles.
The csvread function can only read numeric data (according to the Octave help), so I tried using xlsread as follows:
[NUMARR, TXTARR, RAWARR, LIMITS] = xlsread ('Line.csv')
I only get the NUMARR matrix with numeric values. All the other returned variables are empty; their dimension is 0x0.
How do I get all the text and all other information?
TX!
To solve this issue, open your CSV file in Windows notepad and save it as ANSI format instead of UNICODE.
I'd like to loop over two CSV files in JMeter. I found this, which is close, but I'd like the outer file to give me the CSV filename for the inner CSV.
So the outer file might have
filename
A
B
C
And this would lead to the inner loop looping
A.csv
B.csv
C.csv
When I try the technique referenced above, I get an error that the filename does not exist, and I can see in the error that the problem is that JMeter is not substituting the variable in the filename for the CSV Data Set under the Loop Controller. I suspect JMeter evaluates all the variables at a time when the variables introduced by the outer CSV file are not yet defined.
JMeter is not substituting a variable, but it will substitute a property.
Convert your variable into property and the approach will start working.
See the guide Knit One Pearl Two: How to Use Variables in Different Thread Groups to learn how you can do it, and this SO answer for a working JMeter script implementing a similar scenario.
I am trying to import in Octave a file (i.e. data.txt) containing 2 columns of integers, such as:
101448,1077
96906,924
105704,1017
I use the following command:
data = load('data.txt')
However, the "data" matrix that results has a 1 x 1 dimension, with all the content of the data.txt file saved in just one cell. If I adjust the numbers to look like floats:
101448.0,1077.0
96906.0,924.0
105704.0,1017.0
the loading works as expected, and I obtain a matrix with 3 rows and 2 columns.
I looked at the various options that can be set for the load command but none of them seem to help. The data file has no headers, just plain integers, comma separated.
Any suggestions on how to load this type of data? How can I force Octave to cast the data as numeric?
The load function is not meant for reading CSV files. It is meant to load files saved from Octave itself, which define variables.
To read a CSV file, use csvread ("data.txt"). Also, Octave 3.2.4 is a very old version that is no longer supported, so you should upgrade.
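A quick way to check this on the data from the question, assuming a reasonably recent Octave. The file is written to the temp folder under a made-up name so the snippet runs on its own.

```octave
% Recreate the integer-only sample from the question in a temp file.
fname = fullfile(tempdir(), 'int_sample.txt');
fid = fopen(fname, 'w');
fprintf(fid, '101448,1077\n96906,924\n105704,1017\n');
fclose(fid);

% csvread parses comma-separated numbers whether or not they have decimals.
data = csvread(fname);
disp(size(data));
disp(data);

delete(fname);   % clean up the sample file
```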