I can't escape commas in a CSV file using Weka Explorer

I have a .csv file which contains some sentences like this:
hi my name is Lorenzo, i want to solve this problem.
Notice that there is a comma inside the sentence. I've tried writing it as
"," \, "\," ',' "'","'"
None of these worked.
The error that Weka reports when I keep the sentence with the comma is
wrong number of values. Read 2, expected 1

Related

Exporting data from SSRS to a .csv file adds lots of quotation marks. How do I get just one set?

I have a report which is just a simple SELECT statement which generates a list of columns full of data. I want this data to be exported as a CSV file with each datum being enclosed in " quotation marks. I have created a table and used this as my expression
=""""+Fields!Activity_Code.Value+""""
When I run the report inside ReportBuilder 3.0 I get exactly what I'm looking for
No headers and each datum has quotation marks, perfect.
But when I hit export to csv, and then open with notepad I see this.
The headers are in there where they shouldn't be and each datum has 3 quotation marks on each side. What am I doing wrong?
This is perfectly normal.
When csv fields contain a separator or double quotes, the fields are enclosed in double quotes and the quotes inside the fields are escaped with another quote.
Example - the fields:
123
"27" monitor"
456
become:
123,"""27"" monitor""",456
or:
"123","""27"" monitor""","456"
A csv reader/parser should handle this correctly when reading the data (or you could provide a parameter telling the parser that the fields are quoted).
On the other hand, if you just want your fields to be quoted inside the csv (and not visible after opening the file), you can tell the csv generator to quote the fields (or in this case do nothing since the generator seems to be adding quotes already).
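The escaping rule described in this answer can be reproduced with Python's standard csv module. This is only a sketch to show the behaviour; the helper name to_csv_line is made up for illustration and has nothing to do with SSRS itself.

```python
import csv
import io

def to_csv_line(fields, quote_all=False):
    """Serialize one row; inner double quotes are escaped by doubling them."""
    buf = io.StringIO()
    quoting = csv.QUOTE_ALL if quote_all else csv.QUOTE_MINIMAL
    csv.writer(buf, quoting=quoting, lineterminator="\n").writerow(fields)
    return buf.getvalue().rstrip("\n")

# The three fields from the example above; the middle one contains quotes.
row = ["123", '"27" monitor"', "456"]

print(to_csv_line(row))                  # 123,"""27"" monitor""",456
print(to_csv_line(row, quote_all=True))  # "123","""27"" monitor""","456"
```

So a field that already carries its own quotes ends up with three quote characters on each side once the writer adds the enclosing pair and doubles the inner ones.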

Removing bunch of commas from the end of all lines in CSV

I need to turn a CSV into an ARFF file, but when I try to do it through the ArffViewer from Weka I get the following error:
"java.io.IOException: wrong number of values. Read 5, expected 6, read Token[EOL], line 2 encountered line: 2"
I've investigated this and found that there are commas at the end of each line in my CSV. The problem is that it is not a single comma: there is a whole run of them, the number differs from line to line, and the file has 10,000 lines. What could I do here?
Example of csv line:
chicken,tropical fruit,domestic eggs,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
pot plants,domestic eggs,diapers,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
specialty bar,white bread,diapers,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Examples of other ending commas:
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
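Try this in the vi editor: :%s/,,//g searches for ,, and replaces it with nothing. Because it removes commas in pairs, a run with an odd number of commas leaves one behind, and empty fields in the middle of a line are collapsed too; :%s/,,*$// removes only the run of commas at the end of each line.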
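If editing 10,000 lines by hand in an editor is not practical, the same clean-up can be scripted. Below is a minimal Python sketch; the function names and the file paths passed to clean_file are hypothetical. It strips only the commas at the end of each line, so rows may still end up with different lengths, and whether that matches the header depends on the data.

```python
def strip_trailing_commas(line):
    """Drop the line ending, then the run of commas at the end of the line."""
    return line.rstrip("\r\n").rstrip(",")

def clean_file(src_path, dst_path):
    """Copy src_path to dst_path with trailing commas removed from each line."""
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        for line in src:
            dst.write(strip_trailing_commas(line) + "\n")

sample = "chicken,tropical fruit,domestic eggs,,,,,,,,,\n"
print(strip_trailing_commas(sample))  # chicken,tropical fruit,domestic eggs
```

Empty fields in the middle of a line (for example a,,b) are left alone, since only the end of the line is touched.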

Formatting MySQL output to valid CSV or XLSX

I have a query whose output I format and dump onto a CSV file.
This is the code I'm using,
(query.....)
INTO OUTFILE "/tmp/dump.csv"
FIELDS TERMINATED BY ',' ENCLOSED BY '"'
LINES TERMINATED BY '\n';
However when I open the CSV in Google Sheets or Excel, the columns are broken up into hundreds of smaller ones.
When I open the CSV in a plain text editor, I see that the column values themselves contain quotes (single and double), commas, and line breaks.
Only the double-quotes are escaped.
Even though the double-quotes are escaped, they are omitted when interpreted by Google Sheets and Excel.
I tried manually editing the CSV entries; escaping the commas and such. But no luck. The commas still break the columns. However, in a couple of instances they didn't break the column. I am not able to figure why though.
So my question is: how do I correctly format the output to accommodate these characters and dump it into a CSV, or even an XLSX (in case CSV cannot handle situations like these)?
For context, I'm operating in a WordPress environment. If there is a solution in PHP, that can work too.
EDIT ::
Here is a sample line from the CSV,
"1369","Blaze Pannier Mounts for KTM Duke 200 & 390","HTA.04.740.80200/B","<strong>Product Description</strong><span data-sheets-value=\"[null,2,"SW Motech brings you the Blaze Pannier Brackets for the Duke 200 & 390. "]\" data-sheets-userformat=\"[null,null,15293,[null,0],11]\">SW Motech brings you the Blaze Pannier Brackets for the Duke 200 & 390.</span>"," <strong>What's in the box? </strong><span data-sheets-value=\"[null,2,"2 Quick Lock SupportsnMounting materialnMounting Instructions"]\" data-sheets-userformat=\"[null,null,15293,[null,0],null,[null,[[null,2,0,null,null,[null,2,13421772]],[null,0,0,3],[null,1,0,null,1]]],[null,[[null,2,0,null,null,[null,2,13421772]],[null,0,0,3],[null,1,0,null,1]]],[null,[[null,2,0,null,null,[null,2,13421772]],[null,0,0,3],[null,1,0,null,1]]],[null,[[null,2,0,null,null,[null,2,13421772]],[null,0,0,3],[null,1,0,null,1]]],null,0,1,0,null,[null,2,0],"calibri,arial,sans,sans-serif",11]\">2 Quick Lock SupportsMounting materialMounting Instructions</span> ","Installation Instructions"
From RFC 4180
If double-quotes are used to enclose fields, then a double-quote
appearing inside a field must be escaped by preceding it with
another double quote. For example:
"aaa","b""bb","ccc"
Any double quotes inside fields enclosed with double quotes need to be escaped with another double quote. So given abc,ab"c," the expected formatting would be abc,"ab""c","""".
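To make the RFC 4180 rule concrete, here is a small Python sketch that applies it by hand and then checks the result with the standard csv reader. The helper names rfc4180_field and rfc4180_line are invented for this example. Note that the sample line in the question escapes quotes as \", which is not the doubled-quote form shown here.

```python
import csv

def rfc4180_field(value):
    """Enclose a value in double quotes and double any quote inside it."""
    return '"' + value.replace('"', '""') + '"'

def rfc4180_line(values):
    """Join already-escaped fields with commas into one CSV record."""
    return ",".join(rfc4180_field(v) for v in values)

line = rfc4180_line(["aaa", 'b"bb', "ccc"])
print(line)  # "aaa","b""bb","ccc"

# A standard CSV parser recovers the original values from the escaped line.
parsed = next(csv.reader([line]))
print(parsed)  # ['aaa', 'b"bb', 'ccc']
```

Commas inside a value need no extra escaping once the field is enclosed in double quotes; only the double quote itself has to be doubled.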

Weka CSV to ARFF: special characters cause error

I'm new to Weka and am having problems converting a CSV file containing tweets into an ARFF file.
The CSV looks like this
Tweet,Class
Conference Update: 50% Off Registration to End .. http://t.co/nZtkSzZnJ6,Yes
When I try to convert to .arff using Explorer, I receive the following error
"...not recognized as an CSV data files Reason: wrong number of values. Read 1 expected 2, read token[EOF], line 2"
Removing the "%" character allows the file to convert to arff without error. I could remove "%" and other characters but I really don't want to alter my Tweet data. Enclosing in single or double quotes does not help either. Any idea what I am doing wrong?
Appreciate any help
Weka may interpret "%" as the beginning of a comment and ignore the "%" and the rest of that line.
Enclose the entire field that contains the "%" character in quotation marks (both the single quote "'" and the double quote '"' work).
For example:
A CSV file containing the following two lines can be converted to an ARFF file by Weka.
Tweet,Class
"Conference Update: 50% Off Registration to End .. http://t.co/nZtkSzZnJ6",Yes
P.S.
I'm sorry that my previous answer was incorrect.
PREVIOUS ANSWER (incorrect) was:
Try replacing the "%" character with "\%".
"\" works as an escape character, so "\" turns the comment delimiter "%" into a normal "%" character.

Wrong number of values when importing csv in Weka

I want to open a csv file (saved from openoffice calc) in weka.
I keep getting an error: "wrong number of values. 140 read, 139 expected on line 3."
The csv was already fixed with quotes around the labels. And I count 140 values on the first lines.
What is wrong here?
Link to the file.
It turns out there was a value somewhere far out of sight in the Excel file I was exporting.
I noticed it because all the rows ended with a comma instead of nothing.
I carefully selected only the right range, copied it into a new document, and it works.
Hope this helps somebody else as well.
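A stray value or trailing comma like this can be hard to spot by eye in a file with 140 columns. As a rough diagnostic, the Python sketch below lists every line whose number of fields differs from the header; the function name rows_with_wrong_width is hypothetical, and the check uses Python's csv parsing rules, which may not match Weka's loader exactly.

```python
import csv

def rows_with_wrong_width(lines):
    """Return (line_number, field_count) for rows that differ from the header."""
    reader = csv.reader(lines)
    expected = len(next(reader))  # number of columns in the header row
    return [(reader.line_num, len(row)) for row in reader if len(row) != expected]

sample = [
    "a,b,c",
    "1,2,3",
    "4,5,6,",   # trailing comma adds an empty fourth field
    "7,8",      # one value missing
]
print(rows_with_wrong_width(sample))  # [(3, 4), (4, 2)]
```

For a real file, the same function can be given an open file object, for example one opened with open(path, newline="").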
I had the same error and found a solution.
Just remove all the double quotes and single quotes from the .csv or .xls file.
For example, if a value under the Name column is "john", it throws an error. Change it to john by removing the quotes.
To remove all the quotes, open the Find and Replace box in Excel.
Find what - "
Replace with - (empty space)
I ran into the same problem when importing a CSV file into Weka.
The problem is incorrect formatting in the file.
In my file, one of the columns contained the word GOV'T. I removed the "'" and wrote out the whole word GOVERNMENT, and it worked.
Hope this helps!
I had the same error. The problem was a single quote character in a string value. The solution for me was to enclose the whole string value in double quotes.
So I had to convert
this: ...,Uncharted 3: Drake's Deception,...
to this: ...,"Uncharted 3: Drake's Deception",...
Using Weka v. 3.8.0.
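Doing this by hand gets tedious in a large file, so here is a small Python sketch that encloses a field in double quotes only when it contains a character that has caused trouble in these threads (a comma, "%", a single quote, or a double quote). The names SPECIAL, quote_if_needed and to_line are invented for the example, and doubling an inner double quote is an assumption borrowed from the usual CSV convention; I have not checked how Weka treats it.

```python
# Characters that have caused trouble in these threads: % ' , and "
SPECIAL = set("%',\"")

def quote_if_needed(field):
    """Wrap the field in double quotes if it contains a special character."""
    if any(ch in SPECIAL for ch in field):
        # Assumption: inner double quotes are doubled, as in common CSV usage.
        return '"' + field.replace('"', '""') + '"'
    return field

def to_line(fields):
    """Build one CSV line from a list of field values."""
    return ",".join(quote_if_needed(f) for f in fields)

print(to_line(["Uncharted 3: Drake's Deception", "PS3"]))
# "Uncharted 3: Drake's Deception",PS3
```

The value "PS3" is just filler for a second column; plain fields like it are written unchanged.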
This is caused by an extra column. To get rid of the error, select that whole column and delete it.
That should work fine. :)
I also encountered that error. My CSV file contains floating-point numbers. I solved the problem by replacing "," with ".".
For me, all of the above worked. I replaced ", ' and , with a space.
I had the same error before. I changed my .xls files so that they have no blank rows. Sometimes Weka loaded too many "," characters, but once I cleared the blank rows Weka worked.
If you copied the data from another file using Control+A, Control+C and Control+V, you copied extra columns. If you open the CSV file in Notepad, you will see a comma at the end of each row. The error comes from that trailing comma.
To avoid it, hold Control and select the columns one by one, press Control+C, and paste them into a new file to use in Weka.
Alternatively, use any other method that removes the comma at the end of each row.
I encountered the same problem.
Replacing/erasing all " and ' with a space worked for me!