I need to convert a CSV into an ARFF file, but when I try to do it through the ArffViewer from Weka I get the following error:
"java.io.IOException: wrong number of values. Read 5, expected 6, read Token[EOL], line 2 encountered line: 2"
I investigated this and found that every line in my CSV ends with commas. The problem is that it is not a single comma: there is a whole run of them, the number of commas differs from line to line, and the file has 10,000 lines. What can I do here?
Example CSV lines:
chicken,tropical fruit,domestic eggs,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
pot plants,domestic eggs,diapers,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
specialty bar,white bread,diapers,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Examples of other trailing comma runs:
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
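Try this in the vi editor: :%s/,,*$// deletes the whole run of commas at the end of every line. Searching for ,, and replacing it with nothing (:%s/,,//g) also shortens the runs, but it leaves one comma behind whenever a run has an odd length and it also removes empty fields in the middle of a line.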
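If you would rather script the cleanup than open a 10,000-line file in an editor, here is a minimal Python sketch of the same idea. The function names and the file paths passed to clean_csv are hypothetical, and it assumes that no field legitimately ends a line with a comma inside quotes.

```python
import re

# One or more commas at the very end of a line.
_TRAILING_COMMAS = re.compile(r",+$")


def strip_trailing_commas(line: str) -> str:
    """Return the line without its line ending and without trailing commas."""
    return _TRAILING_COMMAS.sub("", line.rstrip("\r\n"))


def clean_csv(src_path: str, dst_path: str) -> None:
    """Copy src_path to dst_path, removing the trailing commas of every line."""
    with open(src_path, newline="", encoding="utf-8") as src, \
            open(dst_path, "w", newline="", encoding="utf-8") as dst:
        for line in src:
            dst.write(strip_trailing_commas(line) + "\n")
```

Note that this only removes the padding. If the rows hold different numbers of items once the commas are gone, Weka will most likely still complain about the number of values, so check that every row ends up with the same number of fields.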
I have a query whose output I format and dump onto a CSV file.
This is the code I'm using:
(query.....)
INTO OUTFILE "/tmp/dump.csv"
FIELDS TERMINATED BY ',' ENCLOSED BY '"'
LINES TERMINATED BY '\n';
However, when I open the CSV in Google Sheets or Excel, the columns are broken up into hundreds of smaller ones.
When I open the CSV in a plain text editor, I see that the column values themselves contain quotes (single and double), commas, and line breaks.
Only the double quotes are escaped.
Even though the double quotes are escaped, they are dropped when the file is interpreted by Google Sheets and Excel.
I tried manually editing the CSV entries, escaping the commas and such, but no luck. The commas still break the columns. In a couple of instances they didn't break the column, but I am not able to figure out why.
So my question is: how do I correctly format the output to accommodate these characters and dump it into a CSV, or even an XLSX (in case a CSV is not capable of handling situations like these)?
For context, I'm operating in a WordPress environment. If there is a solution in PHP, that can work too.
EDIT:
Here is a sample line from the CSV:
"1369","Blaze Pannier Mounts for KTM Duke 200 & 390","HTA.04.740.80200/B","<strong>Product Description</strong><span data-sheets-value=\"[null,2,"SW Motech brings you the Blaze Pannier Brackets for the Duke 200 & 390. "]\" data-sheets-userformat=\"[null,null,15293,[null,0],11]\">SW Motech brings you the Blaze Pannier Brackets for the Duke 200 & 390.</span>"," <strong>What's in the box? </strong><span data-sheets-value=\"[null,2,"2 Quick Lock SupportsnMounting materialnMounting Instructions"]\" data-sheets-userformat=\"[null,null,15293,[null,0],null,[null,[[null,2,0,null,null,[null,2,13421772]],[null,0,0,3],[null,1,0,null,1]]],[null,[[null,2,0,null,null,[null,2,13421772]],[null,0,0,3],[null,1,0,null,1]]],[null,[[null,2,0,null,null,[null,2,13421772]],[null,0,0,3],[null,1,0,null,1]]],[null,[[null,2,0,null,null,[null,2,13421772]],[null,0,0,3],[null,1,0,null,1]]],null,0,1,0,null,[null,2,0],"calibri,arial,sans,sans-serif",11]\">2 Quick Lock SupportsMounting materialMounting Instructions</span> ","Installation Instructions"
From RFC 4180:
If double-quotes are used to enclose fields, then a double-quote
appearing inside a field must be escaped by preceding it with
another double quote. For example:
"aaa","b""bb","ccc"
Any double quotes inside fields enclosed with double quotes need to be escaped with another double quote. So given abc,ab"c," the expected formatting would be abc,"ab""c","""".
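As an illustration of that rule, here is a minimal sketch using Python's standard csv module, which doubles embedded double quotes by default. The helper name to_csv_line is hypothetical, and Python is used only for illustration since the question does not fix a language for the export step.

```python
import csv
import io


def to_csv_line(fields, quoting=csv.QUOTE_MINIMAL):
    """Format one record as a CSV line, doubling any embedded double quotes."""
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=quoting, lineterminator="\r\n")
    writer.writerow(fields)
    return buf.getvalue()


# The example from the answer above: abc, ab"c and a lone double quote.
line = to_csv_line(["abc", 'ab"c', '"'])
print(line, end="")  # abc,"ab""c",""""

# Fields with commas and line breaks survive a round trip as well.
record = ["1369", 'He said "hi", then left', "first line\nsecond line"]
parsed = next(csv.reader(io.StringIO(to_csv_line(record), newline="")))
```

Passing quoting=csv.QUOTE_ALL would additionally wrap every field in double quotes, which corresponds to the ENCLOSED BY '"' layout used in the question.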
I want to open a CSV file (saved from OpenOffice Calc) in Weka.
I keep getting an error: "wrong number of values. 140 read, 139 expected on line 3."
The CSV was already fixed by putting quotes around the labels, and I count 140 values on the first lines.
What is wrong here?
Link to the file.
It turns out there was a value somewhere far out of sight in the spreadsheet I was exporting.
I noticed it because all the rows ended with a comma instead of nothing.
I carefully selected only the relevant range, copied it into a new document, and now it works.
Hope this helps somebody else as well.
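To find such rows without counting values by hand, a small script can compare every row against the header. This is only a sketch: the function name mismatched_rows is hypothetical, and it uses Python's csv module, whose parsing rules may differ from Weka's loader in edge cases.

```python
import csv


def mismatched_rows(lines):
    """Return (line_number, field_count) for rows whose length differs from the header.

    `lines` is any iterable of text lines, for example an open file object.
    """
    reader = csv.reader(lines)
    header = next(reader, None)
    if header is None:
        return []  # empty input, nothing to compare
    expected = len(header)
    bad = []
    for row in reader:
        if len(row) != expected:
            # reader.line_num is the number of source lines read so far.
            bad.append((reader.line_num, len(row)))
    return bad


sample = ["a,b,c\n", "1,2,3\n", "4,5,6,\n"]
print(mismatched_rows(sample))  # [(3, 4)]
```

A trailing comma shows up as one extra, empty field, which is exactly the "140 read, 139 expected" pattern. Blank lines are reported too, with a field count of 0.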
I had the same error and found a solution.
Just remove all the double quotes and single quotes from the .csv or .xls file.
For example, if a value under the Name column is "john", it throws an error. Change it to john by removing the quotes.
To remove all the quotes, open the Find and Replace box in Excel:
Find what - "
Replace with - (empty space)
I ran into the same problem when importing a CSV file into Weka.
The problem is incorrect formatting of the file.
In my file, one of the columns contained the word GOV'T. I removed the "'" by writing out the whole word GOVERNMENT, and it worked.
Hope this helps!
I had the same error. The problem was a single-quote character in a string value. The solution for me was to enclose the whole string value in double quotes.
So I had to convert
this: ...,Uncharted 3: Drake's Deception,...
to this: ...,"Uncharted 3: Drake's Deception",...
Using Weka v. 3.8.0.
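For a large file, that quoting step can be automated. The sketch below wraps only those fields that contain a quote character, comma, or line break in double quotes and leaves the others (numbers, for instance) untouched. The function names are hypothetical, and the surrounding values 1 and 9.5 in the example are made up for illustration.

```python
import csv
import io


def quote_if_needed(field: str) -> str:
    """Wrap a field in double quotes if it contains ', ", a comma or a line break."""
    if any(ch in field for ch in "'\",\r\n"):
        # Embedded double quotes are doubled, as in RFC 4180.
        return '"' + field.replace('"', '""') + '"'
    return field


def requote(text: str) -> str:
    """Re-emit CSV text with every risky field enclosed in double quotes."""
    rows = csv.reader(io.StringIO(text, newline=""))
    return "".join(",".join(quote_if_needed(f) for f in row) + "\n" for row in rows)


print(requote("1,Uncharted 3: Drake's Deception,9.5\n"), end="")
# 1,"Uncharted 3: Drake's Deception",9.5
```

Quoting selectively rather than quoting every field is a design choice here, made on the assumption that numeric columns should stay unquoted.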
This happens because an extra column has been added. To get rid of the error, select that whole column and delete it.
That should work fine. :)
I also encountered that error. My CSV file contains floating-point numbers. I solved the problem by replacing "," with ".".
For me all of the above worked. I replaced the characters " ' and , with a space.
I had the same error before. I changed my .xls files so that they no longer had any blank ranges. Sometimes Weka loaded too many ",", but once I cleared the blank ranges Weka worked.
If you copied the data from another file using Control+A, Control+C and Control+V, you copied extra columns. If you open the CSV file in Notepad you will see a comma at the end of each row. You get this error because of that comma at the end of each row.
To avoid this error, hold Control and select the columns one by one, then press Control+C and paste them into a new file, which you then use in Weka.
Alternatively, use any other method that removes the comma at the end of each row.
I encountered the same problem.
Replacing all " and ' characters with a space worked for me!