Why doesn't Excel's CSV export quote single quotes? - mysql

When I export data from Excel in CSV format, it encloses some of the data in double quotes.
E.g.
8" becomes "8""". I believe this is done so that the database can recognize the inner quote later on.
But a value with a single quote, such as 8', is left unchanged, and this causes a problem (see the pic below) when I import the CSV.
Why isn't 8' quoted as "8'" too?
8' becomes ' on import, while "8'" results in 8'. So not quoting the single quote leads to some data loss.
Related questions:
what does quotechar mean in mysql while importing data?
Excel adds extra quotes on CSV export

After many experiments, I finally found a pretty close answer.
Conclusion first:
It is MySQL Workbench's problem. Its import wizard works badly. I ran every test file through Navicat as well, and Navicat got everything right.
A single quote can cause unexpected behavior.
Test:
By default, the MySQL Workbench import wizard takes the first row as the column names, while in Navicat I can configure that.
(All test files are Excel-exported CSV files in UTF-8 encoding.)
test1:
e.g.: 8' (only 1 record), without column names.
MySQL Workbench: An unknown error pops up, and whatever configuration you change, you can't get the original data.
Navicat: Works fine.
test2:
e.g.: 8', only 1 record with column names, or extra records without column names.
MySQL Workbench: Handles the single quote properly.
Navicat: No problem.
test3:
If a single quote exists, in most situations the import wizard can't handle double quotes well.
e.g.: Single-quote data comes before double-quote data.
MySQL Workbench: Fails totally.
Navicat: No problem.
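Since "8'" imports correctly while a bare 8' does not, one possible workaround is to rewrite the Excel export so that every field is enclosed in double quotes before handing it to the wizard. This is only a sketch using Python's standard csv module; the function name requote_all and the sample data are made up for illustration, and I have not verified that it fixes every Workbench case.

```python
import csv
import io


def requote_all(csv_text: str) -> str:
    """Re-emit CSV text with every field enclosed in double quotes."""
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    out = io.StringIO(newline="")
    # QUOTE_ALL encloses every field; inner double quotes are still doubled.
    writer = csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\r\n")
    for row in reader:
        writer.writerow(row)
    return out.getvalue()


# Hypothetical Excel export: a header, an unquoted 8' and a quoted 8"
excel_export = 'size\r\n8\'\r\n"8"""\r\n'
requoted = requote_all(excel_export)
print(requoted)
```

For a real file you would read and write with open(path, newline="", encoding="utf-8") instead of the in-memory strings used here.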

Related

what does quotechar mean in mysql while importing data?

Someone sent me an xlsx Excel file; I opened it with Excel and saved it as CSV with UTF-8 encoding.
I used the MySQL Workbench import wizard to import this Excel-made UTF-8 CSV file into a database table, but the imported result was missing some data (less than it should have).
I think it has something to do with the quotechar.
By default the quotechar is a double quote, but I have some data like this (mixing single quotes and double quotes):
8'10" foo bar
4" x 6" foo foo bar
I've tried to omit the value, but that isn't allowed (see the error from the pic).
So here I want to figure out:
What does quotechar mean here? How does it work? Why does it matter? Can't it just import everything from the CSV file?
How can I import the data correctly when it mixes single and double quotes (later I need to retrieve the values and use them as search keywords, so it'd be better to keep the original form)?
My data looks like this in Excel:
You are going to export your data from Excel as a CSV, I assume, so how this looks in Excel is irrelevant.
When you export the data from Excel in CSV format, it's going to encapsulate your data in double quotes. Any double quote in the data itself is going to be escaped automatically by Excel with a second double quote.
As an example, if your data is:
8"
When you export it will be:
"8"""
You have to tell MySQL that you are enclosing strings in the character ". That is the quotechar it's talking about. It's the second field on that form you are filling out.
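To see what a quotechar does on the reading side, here is a small sketch with Python's csv module (not MySQL Workbench itself, so treat it as an illustration of the convention rather than of the wizard's behavior). The two sample rows are the values from the question, written the way Excel would export them.

```python
import csv
import io

# The fields contain double quotes, so Excel encloses each field in double
# quotes and doubles every double quote inside it.
exported = '"8\'10"" foo bar"\r\n"4"" x 6"" foo foo bar"\r\n'

# quotechar tells the parser which character encloses a field; a doubled
# quotechar inside an enclosed field is read back as one literal character.
rows = list(csv.reader(io.StringIO(exported, newline=""), quotechar='"'))
for row in rows:
    print(row[0])
```

The single quote passes through untouched because it is not the quotechar; only the enclosing character needs special treatment.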
I'm not sure how picky MySQL is going to be here, since I haven't imported CSV into MySQL in forever. The trick with the Excel CSV output is that if you have data like:
8"
8'
It will output it as CSV:
"8"""
8'
The second record/field doesn't gain the double-quote encapsulation since it doesn't contain a character that requires encapsulation (a comma, a double quote, a carriage return, or a line feed).
MySQL might choke on that second record (hopefully its import process is robust enough to handle both encapsulated and non-encapsulated fields, though).
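The same "quote only when needed" rule can be reproduced with Python's csv module, whose default quoting mode (QUOTE_MINIMAL) follows this convention. This is a sketch of the rule, not of Excel's actual code; the third value is an extra example I added to show the comma case.

```python
import csv
import io

buffer = io.StringIO(newline="")
writer = csv.writer(buffer, quoting=csv.QUOTE_MINIMAL)  # the module's default
for value in ['8"', "8'", "a,b"]:
    # One field per record, so each value lands on its own line.
    writer.writerow([value])

minimal_output = buffer.getvalue()
print(minimal_output)
```

Only the values containing a double quote or a comma get enclosed; 8' is written as-is.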

How to have column with character value equal to the enclosing character value in mysql load data in file

I'm using mysqlimport, which uses the LOAD DATA INFILE command. My question is the following: assume I have --fields-enclosed-by='"', and that I have a column whose values contain a double quote, such as "5" object" (which stands for 5 inches). The problem is that when MySQL encounters the double quote after the 5, it treats it as the enclosing character, and things get messed up. How can I use mysqlimport with such values? I don't want to just use another character for enclosing, because that other character may occur in the data as well. So what is a general solution for this?
I guess this approach is different from importing a CSV.
To solve the above issue another way:
Export, get, or convert the old data into SQL format rather than CSV format.
Import that SQL data using the mysql command-line tool.
mysql -hservername -uusername -p'password' dbname < 'path to your sql file.sql'
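If you do want to stay with mysqlimport, the MySQL documentation for LOAD DATA describes doubling the enclosing character inside an enclosed field, so that a doubled occurrence is read back as a single one. A rough sketch of preparing such a file in Python follows; the helper names enclose and to_line are made up, and backslashes are left alone here, which is an assumption you would need to revisit because backslash is the default escape character for LOAD DATA.

```python
def enclose(value: str, quote: str = '"') -> str:
    """Enclose a value, doubling any enclosing character it contains."""
    return quote + value.replace(quote, quote * 2) + quote


def to_line(fields: list[str], sep: str = ",", quote: str = '"') -> str:
    """Build one input line with every field enclosed."""
    # Assumption: no field contains a backslash or a line break.
    return sep.join(enclose(f, quote) for f in fields) + "\n"


line = to_line(['5" object', "plain"])
print(line, end="")
```

Because the rule works for whatever enclosing character you pick, you don't have to hunt for a character that never occurs in the data.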

Cannot import csv into mysql database using phpmyadmin wizard

I am trying to import a CSV file into my MySQL database using phpMyAdmin but keep getting errors.
Here is how the CSV looks:
Then I import like this:
And I get the error: "Invalid parameter for CSV import: Fields enclosed by". I have tried putting the columns in quotes (") or putting a semicolon after each column, but I keep getting errors.
Yeah, you have an extra field in there. For instance, with your example line of:
itemId,date,description,amount
,1,2/13/2013,Fabrics,44
the date maps to "description" because of the leading comma, which basically gives an empty (or null, depending on how the import is handled) value to itemId, which doesn't seem to be what you want. Where'd that extra comma come from -- was this an export from some program?
Also, in this case you don't have anything enclosing the fields so you should just be able to leave that value empty, which seems to have worked for you once you got the column count corrected.
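A quick way to spot this kind of problem before importing is to compare each row's field count with the header's. The following is a minimal sketch with Python's csv module; find_bad_rows is a hypothetical helper, and the sample text is the two lines quoted above.

```python
import csv
import io


def find_bad_rows(csv_text: str):
    """Return the header and (record number, field count) for mismatched rows."""
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    header = next(reader)
    bad = []
    # Record 1 is the header, so data records start at 2.
    for record_no, row in enumerate(reader, start=2):
        if len(row) != len(header):
            bad.append((record_no, len(row)))
    return header, bad


sample = "itemId,date,description,amount\n,1,2/13/2013,Fabrics,44\n"
header, bad = find_bad_rows(sample)
print(len(header), bad)
```

Here the header has 4 fields while the data row has 5, which is the extra field caused by the leading comma.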
I had to remove the first line of the csv (containing the column names) and that solved the issue. Everything got imported properly.
Note, the date field needed reformatting to match SQL's date format yyyy-mm-dd.
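For the date reformatting, something like this could be run over the date column before importing. It is a sketch that assumes the source dates are in month/day/year order, as in the 2/13/2013 example; to_sql_date is a made-up name.

```python
from datetime import datetime


def to_sql_date(us_date: str) -> str:
    """Convert a month/day/year string to yyyy-mm-dd."""
    # Assumption: input looks like 2/13/2013 (month first, 4-digit year).
    return datetime.strptime(us_date, "%m/%d/%Y").strftime("%Y-%m-%d")


converted = to_sql_date("2/13/2013")
print(converted)
```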

Lose data in random fields when importing from file into table using phpmyadmin

I have an Access DB. I exported the tables to xlsx, then saved them as .ods using OpenOffice because I found out that phpMyAdmin no longer supports Excel files. My MySQL database is formatted exactly as it should be to accept the data. I import and everything seems fine except for one little detail.
In some fields, the value is NULL instead of the value it should have according to the .ods file. Some rows show the value for that field correctly, some show NULL.
Also, the "faulty" rows show the value 0 (instead of NULL) for some fields that were empty in the imported file. The default value for those fields in MySQL is NULL. Each row has many fields like that, all of the same data type (tinyint). Some correctly appear as NULL and some have the value 0.
I can't see a pattern in all this.
Any help is appreciated.
Check that imported strings are enclosed in quotes ("") and NULLs are not, and that all fields are separated appropriately, usually by a "," comma, with the record/row delimited by a ";" semicolon. The best way to check what MySQL is looking for is to export some existing data in the same format and compare it against what you are trying to import. One little missed quote and the deal is off. Be consistent in using either double quotes (") or single quotes ('). Also, the ` character is not used, I think. If you are "squishing" your data through an application that applies "smart quotes", like MS Word or possibly OpenOffice, this too can cause issues. Add the word NULL, either inside or without quotes, in your CSV import where appropriate.
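If smart quotes turn out to be the culprit, they can be replaced with plain quotes before importing. This is just a sketch covering the four most common curly quote characters; straighten_quotes is a made-up helper, and whether your file actually contains such characters is something to check first.

```python
# Map typographic ("smart") quotes to their plain ASCII equivalents.
SMART_QUOTES = {
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
}
TABLE = str.maketrans(SMART_QUOTES)


def straighten_quotes(text: str) -> str:
    """Replace curly quotes with straight ones, leaving other text as-is."""
    return text.translate(TABLE)


cleaned = straighten_quotes("\u201cabc\u201d,\u2018x\u2019")
print(cleaned)
```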

Using Excel to create a CSV file with special characters and then Importing it into a db using SSIS

Take this XLS file
I then save this XLS file as CSV and then open it up with a text editor. This is what I see:
Col1,Col2,Col3,Col4,Col5,Col6,Col7
1,ABC,"AB""C","D,E",F,03,"3,2"
I see that the double quote character in column C was stored as AB""C, the column value was enclosed with quotations and the double quote character in the data was replaced with 2 double quote characters to indicate that the quote is occurring within the data and not terminating the column value. I also see that the value for column G, 3,2, is enclosed in quotes so that it is clear that the comma occurs within the data rather than indicating a new column. So far, so good.
I am a little surprised that not all of the column values are enclosed in quotes, but even this seems reasonable if I assume that Excel only adds the enclosing quotes when special characters like a comma or a double quote character exist in the data.
Now I try to use SQL Server to import the CSV file. Note that I specify a double quote character as the Text Qualifier character.
And a comma character as the Column delimiter character. However, note that SSIS imports column 3 incorrectly, i.e., it does not translate the two consecutive double quote characters into a single occurrence of a double quote character.
What do I have to do to get Excel and SSIS to get along?
Generally people avoid the issue by using column delimiter characters that are LESS LIKELY to occur in the data, but this is not a real solution.
I find that if I modify the file from this
Col1,Col2,Col3,Col4,Col5,Col6,Col7
1,ABC,"AB""C","D,E",F,03,"3,2"
...to this:
Col1,Col2,Col3,Col4,Col5,Col6,Col7
1,ABC,"AB"C","D,E",F,03,"3,2"
i.e., removing the two consecutive quotes in column C's value, the data is loaded properly. However, this is a little confusing to me. First of all, how does SSIS determine that the double quote between the B and the C is not terminating that column value? Is it because the following characters are not a comma column delimiter or a row delimiter (CRLF)? And why does Excel export it this way?
According to Wikipedia, here are a couple of traits of a CSV file:
Fields containing line breaks (CRLF), double quotes, and commas
should be enclosed in double-quotes. For example:
"aaa","b CRLF
bb","ccc" CRLF
zzz,yyy,xxx
If double-quotes are used to enclose fields, then a double-quote
appearing inside a field must be escaped by preceding it with
another double quote. For example:
"aaa","b""bb","ccc"
However, it looks like SSIS doesn't like it that way when importing. What can be done to get Excel to create a CSV file that could contain ANY special characters used as column delimiters, text delimiters or row delimiters in the data? There's no reason that it can't work using the approach specified in Wikipedia, which is what I thought the old MS DTS packages used to do...
Update:
If I use Notepad to change the input file to
Col1,Col2,Col3,Col4,Col5,Col6,Col7,Col8
"1","ABC","AB""C","D,E","F","03","3,2","AB""C"
Excel reads it just fine
but SSIS returns
The preview sample contains embedded text qualifiers ("). The flat file parser does not support embedding text qualifiers in data. Parsing columns that contain data with text qualifiers will fail at run time.
Conclusion:
Just like the error message says in your update...
The flat file parser does not support embedding text qualifiers in data. Parsing columns that contain data with text qualifiers will fail at run time.
Confirmed bug in Microsoft Connect. I encourage everyone reading this to click on this aforementioned link and place your vote to have them fix this stinker. This is in the top 10 of the most egregious bugs I have encountered.
Do you need to use a comma delimiter?
I used a pipe delimiter with no Text qualifier and it worked fine. Here is my output from the text file.
1|ABC|AB"C|D,E|F|03|3,2
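If the file has to come out of Excel as a regular comma-separated CSV first, a small script can turn it into the pipe-delimited form. The sketch below uses Python rather than SSIS or C#; csv_to_pipes is a hypothetical helper, and it simply refuses any field containing a pipe or a line break, since those can't be represented without a text qualifier.

```python
import csv
import io


def csv_to_pipes(csv_text: str) -> str:
    """Convert Excel-style CSV text to pipe-delimited text with no qualifier."""
    out_lines = []
    for row in csv.reader(io.StringIO(csv_text, newline="")):
        for field in row:
            # Without a text qualifier these characters would break the layout.
            if "|" in field or "\n" in field or "\r" in field:
                raise ValueError(f"cannot write field without a qualifier: {field!r}")
        out_lines.append("|".join(row))
    return "\n".join(out_lines) + "\n"


source = 'Col1,Col2,Col3,Col4,Col5,Col6,Col7\r\n1,ABC,"AB""C","D,E",F,03,"3,2"\r\n'
piped = csv_to_pipes(source)
print(piped, end="")
```

The doubled quote in column 3 comes out as a single literal quote, and the embedded commas no longer matter because the comma is not the delimiter any more.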
You have 3 options in my opinion.
Read the data into a stage table.
Run any update queries you need on the columns
Now select your data from the stage table and output it to a flat file.
OR
Use pipes as your delimiters.
OR
Do all of this in a C# application and build it in code.
You could send the row to a script in SSIS and parse and build the file you want there as well.
Using text qualifiers and "character" delimited fields is problematic for sure.
Have Fun!