Weka CSV to ARFF: special characters cause error - csv

I'm new to Weka and having problems converting a CSV file containing Tweets into an Arff file.
The CSV looks like this
Tweet,Class
Conference Update: 50% Off Registration to End .. http://t.co/nZtkSzZnJ6,Yes
When I try to convert to .arff using Explorer, I receive the following error
"...not recognized as an CSV data files Reason: wrong number of values. Read 1 expected 2, read token[EOF], line 2"
Removing the "%" character allows the file to convert to arff without error. I could remove "%" and other characters but I really don't want to alter my Tweet data. Enclosing in single or double quotes does not help either. Any idea what I am doing wrong?
Appreciate any help

Weka may interpret "%" as the beginning of a comment and ignore "%" and the rest of that line.
Please enclose the entire field which contains the "%" character in quotation marks (both single quotes "'" and double quotes '"' work).
For example, a CSV file containing the following two lines should convert to an ARFF file in Weka:
Tweet,Class
"Conference Update: 50% Off Registration to End .. http://t.co/nZtkSzZnJ6",Yes
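If the file is large, the quoting can be applied programmatically rather than by hand. A minimal Python sketch, using in-memory strings for illustration (the data is the example row from the question):

```python
import csv, io

# The tweet line as exported without quoting; Weka misreads the
# line once it sees the "%" character.
raw = ("Tweet,Class\n"
       "Conference Update: 50% Off Registration to End .. "
       "http://t.co/nZtkSzZnJ6,Yes\n")

# Re-write the CSV so every field is enclosed in double quotes,
# which stops "%" from acting as a comment delimiter.
out = io.StringIO()
writer = csv.writer(out, quoting=csv.QUOTE_ALL)
writer.writerows(csv.reader(io.StringIO(raw)))
print(out.getvalue())
```

With real files you would open the source and destination with `open(..., newline="")` instead of the `StringIO` objects used here.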
P.S.
I'm sorry that my previous answer was incorrect.
PREVIOUS ANSWER (incorrect) was:
Try to replace the "%" character with "\%".
"\" works as an escape character, so "\" would make the comment-delimiter character "%" a normal character "%".

Related

How to read data from a .csv file and store it in a table in OpenEdge

Could someone help me find a way to read data from a .csv file and then store it in a table in OpenEdge.
INPUT FROM 'c:\sample.csv'.
REPEAT:
CREATE customer.
IMPORT DELIMITER "," cust-num name sales-rep.
END.
OUTPUT CLOSE.
This is the code that I tried, but it's not getting executed!
The "\" is an "escape" character. Escape the escape by doubling it or (preferably) by using the alternate escape of "~".
Input from 'c:~\sample.csv'.
INPUT FROM c:\sample.csv.
REPEAT:
CREATE customer.
IMPORT DELIMITER "," customer.cust-num customer.name customer.sales-rep.
END.
OUTPUT CLOSE.
Remove the quotes from around your file name. Escape characters aren't needed for the backslash since you are running on Windows and not Unix.
If you need to use a variable for the file name then you would use INPUT FROM VALUE(myvariable).

I can't escape commas using Weka Explorer

I have a .csv file which contains some sentences like this:
hi my name is Lorenzo, i want to solve this problem.
You can notice that there is a comma inside the sentence. I've tried escaping it with
"," \, "\," ',' "'","'"
but none of these worked.
The error that Weka raises when I keep the sentence with the comma is
wrong number of values. Read 2, expected 1
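As in the first answer above, the fix is to quote the field rather than escape the comma, since CSV has no backslash escaping by default. A minimal Python sketch of the round trip:

```python
import csv, io

# The sentence contains a literal comma; instead of escaping it,
# write it as one quoted CSV field so Weka sees a single value.
out = io.StringIO()
csv.writer(out, quoting=csv.QUOTE_ALL).writerow(
    ["hi my name is Lorenzo, i want to solve this problem."])

# Reading the quoted line back yields exactly one value, comma and all.
parsed = next(csv.reader(io.StringIO(out.getvalue())))
print(parsed)
```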

MySQL importing CSV file with phpmyadmin without cell quotes

I have a huge CSV file with the data like this:
ID~Name~Location~Price~Rating
02~Foxtrot~Scotland~~9
08~Alpha~Iceland~9.90~4
32~ForestLane~Germany~14.35~
The issue is that when importing using PHPMyAdmin, it asks for Columns enclosed with: and Columns escaped with:. The trouble is, that this CSV doesn't have quotes for the cells.
If I leave this blank, it gives the error: Invalid parameter for CSV import: Columns escaped with
Is there a way to import without having quotes on the CSV?
I can reproduce this behavior. I'll bring it up on the phpMyAdmin development discussion list, but in the meantime, you can work around it by using some nonsense character for "Columns escaped with" and leaving "Columns enclosed with" blank. Make sure your data doesn't contain, say, a " or £, and use that for "Columns escaped with". For instance, I have a data set where I know £ doesn't exist, so I can use it as the "Columns escaped with" character; if you don't have any escaped characters, you can enter any character there.
I'll update if I can provide any more useful information, but certainly that workaround should allow you to import your data.
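To pick a safe nonsense character, you can scan the data first. A small Python sketch (the candidate set is an arbitrary example, and the sample is the tilde-delimited data from the question):

```python
# Find characters that never appear in the data, so one of them can be
# entered safely in phpMyAdmin's "Columns escaped with" field.
# The candidate set here is an arbitrary example.
def unused_chars(text, candidates="£|^~"):
    return [c for c in candidates if c not in text]

sample = "ID~Name~Location~Price~Rating\n02~Foxtrot~Scotland~~9\n"
print(unused_chars(sample))  # "~" is the delimiter, so it is excluded
```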

Wrong number of values when importing csv in Weka

I want to open a CSV file (saved from OpenOffice Calc) in Weka.
I keep getting an error: "wrong number of values. 140 read, 139 expected on line 3."
The csv was already fixed with quotes around the labels. And I count 140 values on the first lines.
What is wrong here?
Link to the file.
It turns out there was a value somewhere far out of sight in the Excel file I was exporting.
I noticed it because all the rows ended with a comma instead of nothing.
I carefully selected only the right range, copied it into a new document, and it works.
Hope this helps somebody else as well.
I had the same error, and I found the solution.
Just remove all the double quotes and single quotes from the .csv or .xls file.
E.g. under the Name column, if the value is "john", it throws an error. Make it john by removing the quotes.
To remove all the quotes, go to the Excel Find and Replace box:
Find what: "
Replace with: (empty)
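The same find-and-replace can be done in code. A minimal Python sketch (the sample data is a hypothetical example; note that stripping quotes wholesale is only safe if no field relies on quotes to contain a comma):

```python
# Mirror the Excel find-and-replace step: strip all double and single
# quotes from the file contents. Only safe when no field needs quoting
# to protect an embedded comma or newline.
raw = 'Name,Class\n"john",Yes\n'
cleaned = raw.replace('"', '').replace("'", '')
print(cleaned)
```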
I also went through the same problem when I was using Weka and importing a csv file.
The problem is with the formatting of the file.
In my file there was the word GOV'T in one of the columns; I just removed the "'" and wrote the whole word GOVERNMENT, and it worked.
Hope this helps !!
I had the same error. The problem was a single quote character in a string value. The solution for me was to enclose the whole string value in double quotes.
So I have to convert
this: ...,Uncharted 3: Drake's Deception,...
to this: ...,"Uncharted 3: Drake's Deception",...
using Weka v. 3.8.0
This is because of the addition of an extra column. To get rid of that error, select the whole of that column and delete it.
That should work fine. :)
I also encountered that error. My CSV file contains floating-point numbers. I solved the problem by replacing "," with ".".
For me all of the above worked. I replaced " ' , with space.
I had the same error before. I changed my .xls files to remove any blank rows. Sometimes Weka loaded too many ",", but once I cleared the blank rows, Weka worked.
If you have copied data from another file using Ctrl+A, Ctrl+C and Ctrl+V, you copied extra columns. If you open the CSV file in Notepad you will see a comma at the end of each row; you got this error because of that trailing comma.
To avoid this error, hold Ctrl and select the columns one by one, then Ctrl+C, and copy them into a new file which you will use in Weka.
Or you can use another method to remove the comma at the end of each row.
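One such method is to strip the trailing comma in code instead of re-copying the columns. A minimal Python sketch (sample data is a hypothetical example; this is only safe if the last real column is never legitimately empty):

```python
# Strip the trailing comma that an extra empty column leaves at the end
# of every row after a whole-sheet copy (Ctrl+A / Ctrl+C / Ctrl+V).
raw = "a,b,c,\n1,2,3,\n"
fixed = "\n".join(line.rstrip(",") for line in raw.splitlines()) + "\n"
print(fixed)
```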
I encountered the same problem.
Replacing/erasing all " and ' with spaces worked for me!

OpenOffice - CSV export: is there a default escape character?

As far as I can see, when OpenOffice saves a file as CSV, it encloses all strings in quote characters.
So is there any need for an escape character?
and related to this question:
Does OpenOffice have a default escape character?
I'm also wondering if there is a way to choose the escape character when saving from OpenOffice as CSV. phpMyAdmin was not accepting a 9,000-line, 50+ column spreadsheet in .ods format, and there doesn't seem to be a way to choose the escape character when saving as CSV.
So I had to save as csv, open in word, and use some find/replace tricks to change the escape character to \ (back slash). Default is to use double quotes to escape double quotes, and phpmyadmin won't accept that format.
To properly convert the file to use \ (back-slash) to escape double-quotes, you have to do this:
Pick a placeholder character string, e.g. 'abcdefg', that does not occur anywhere in the csv.
Find/replace """ (three double-quotes in a row) with the placeholder. This is to prevent possibly incorrect results in the next step.
Find/replace "" (two quotes in a row, representing one quote that should be escaped), with \" (back-slash double-quote). If you did this without find/replacing """ it's conceivable you could get a result like "\" instead of \"". Better safe than sorry.
Find/replace the placeholder string with \"" (back-slash double-quote double-quote).
That will work, unless you happen to have more than one double-quote in a row in your original text fields, which would result in as many as five double-quotes in a row in the exported CSV file (two double-quotes for each escaped double-quote, plus another double-quote if it's at the end of the field).
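The three find/replace steps can also be applied in code, in the same order. A minimal Python sketch (the placeholder and sample value are hypothetical examples):

```python
# Convert CSV's default ""-style quote escaping to \"-style escaping,
# following the three find/replace steps described above.
PLACEHOLDER = "abcdefg"  # must not occur anywhere in the data

def to_backslash_escapes(text):
    text = text.replace('"""', PLACEHOLDER)   # protect quote-at-field-end
    text = text.replace('""', '\\"')          # "" -> \"
    text = text.replace(PLACEHOLDER, '\\""')  # restore as \""
    return text

sample = '"He said ""hi"""\n'
print(to_backslash_escapes(sample))
```

Doing the `"""` replacement first prevents the middle step from splitting a quote-at-end-of-field pairing the wrong way, exactly as the manual recipe warns.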
Escaping in quotes makes life easier for tools parsing the CSV file.
In a recent version of LibreOffice (3.4.4), the CSV export was not handled correctly by phpMyAdmin. Since LibreOffice doesn't provide an escape character, the phpMyAdmin's default "CSV" import feature "Columns escaped with:" didn't work well. The data was always inconsistent.
However, using the option "CSV using LOAD DATA" did work, but only if the value in the "Columns escaped by" option was removed. I presume phpMyAdmin uses the default MySQL LOAD DATA command, and thus control is passed to MySQL for data processing. In my scenario this resulted in an accurate data import.