I have a problem similar to this question. That is - I need to export some UTF8 data within a MySQL database to MS Excel.
The gotchas Excel kindly provides:
Excel opens UTF8-formatted CSV files as ANSI, thus breaking non-ASCII characters
Excel will open tab-separated UTF8 files correctly, but there is no support for linebreaks (my data has linebreaks, though in a worst-case scenario I might be able to lose these)
Excel will, apparently, open UTF-16LE (little endian) encoded CSVs OK. However, so far as I know, MySQL INTO OUTFILE does not accept a content encoding argument, and just defaults to the database encoding (UTF8).
My web-app is PHP driven, but unfortunately I cannot use a PHP Excel-file-making library since the database is pretty large. All my exports must be done through MySQL.
If anybody knows how to make MySQL jump through Excel's hoops on this one, that would be great.
Many thanks,
Jack
Edit: This answer describes a solution that works for Excel 2007. It involves adding a 'BOM' to the start of the file, which I may be able to do by serving the output file to the client via a PHP script that prepends the BOM. Ideally I would like to find a solution that works in 2003 also.
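A minimal sketch of the BOM idea, written in Python rather than PHP purely for illustration. The function name prepend_utf8_bom and the file paths are hypothetical; the only assumption is that the export is already valid UTF-8 on disk.

```python
import codecs
import shutil


def prepend_utf8_bom(src_path, dst_path):
    """Copy src_path to dst_path, writing a UTF-8 BOM first unless one is already there."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        head = src.read(len(codecs.BOM_UTF8))
        if head != codecs.BOM_UTF8:
            dst.write(codecs.BOM_UTF8)  # the three bytes EF BB BF
        dst.write(head)
        # Stream the rest so a large export is never held in memory at once.
        shutil.copyfileobj(src, dst)
```

Because the file is streamed, this should cope with large exports; whether a given Excel version honours the BOM is something to check against the versions you need to support.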
For handling line breaks I suggest adding:
FIELDS ENCLOSED BY '"'
A more complete example from the MySQL docs:
SELECT a,b,a+b INTO OUTFILE '/tmp/result.txt'
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
FROM test_table;
I recall running into this issue with Excel. The BOM fix does work for Excel 2007 and 2010. We also wanted to support 2003; however, our solution was to just write XLS files instead of CSV files (using Java). That doesn't sound like an option for you since you're exporting from MySQL. One idea that comes to mind is to convert your UTF8 output to UTF-16LE after your export. This page explains how to do it with Perl.
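The same conversion can be sketched in Python instead of Perl. This is only an illustration: the function name utf8_to_utf16le and the chunk size are made up here, and it assumes the MySQL export is valid UTF-8.

```python
import codecs


def utf8_to_utf16le(src_path, dst_path, chunk_chars=1 << 16):
    """Re-encode a UTF-8 text file as UTF-16LE with a leading BOM."""
    # "utf-8-sig" drops a UTF-8 BOM if the export already has one;
    # newline="" keeps the original line endings untouched.
    with open(src_path, "r", encoding="utf-8-sig", newline="") as src, \
            open(dst_path, "wb") as dst:
        dst.write(codecs.BOM_UTF16_LE)  # the two bytes FF FE
        while True:
            chunk = src.read(chunk_chars)
            if not chunk:
                break
            dst.write(chunk.encode("utf-16-le"))
```

Reading in chunks keeps memory use flat for large exports. Whether the result opens cleanly still depends on the Excel version and on the delimiter used.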
I am running an SQL statement to read in a CSV file (in this case only 1 column with a heading) and import it into my database, where further manipulation of the data will occur through subsequent SQL statements.
However, I cannot load the CSV file into my DB, whether directly in MySQL Workbench, through my PHP website locally, or through the PHP website from another Mac on my network.
The interesting thing is that the query appears to run successfully, as I get no errors on any of the platforms or computers, but no rows are affected.
I have done a lot of digging in trying to solve the problem. Here is my current SQL code and I will then talk through what I have tried.
LOAD DATA INFILE '/Users/Josh/Desktop/testcsv.csv'
INTO TABLE joinTest
FIELDS
TERMINATED BY ','
ENCLOSED BY '"'
LINES
TERMINATED BY '\r\n'
IGNORE 1 LINES
(interestName);
So this is me trying it in MySQL Workbench. In PHP I have an uploader and a variable which stores the location of the tmp file. This works, as I have echoed it out and it all looks fine.
I've tried running it as
LOAD DATA INFILE
But it still doesn't affect any rows (runs successfully). I've also changed the TERMINATED BY in LINES to just \n but still will not affect any rows.
I can't understand why it is not affecting any rows, as my CSV file is readable by all and should be in the correct format (created in Excel and saved in CSV format).
Does anyone know what the potential problem could be?
If any more info is required I will respond with it ASAP. Thanks.
Right, so I discovered that the Mac uses different line endings from Unix and Windows. I opened the CSV in Sublime Text 3 and found an option to change the line endings in the View options.
I set this to Unix, saved the file, and the terminator of \n worked. Unfortunately Sublime Text doesn't show the line endings as visible characters, so I only found this by chance.
I hope this helps anyone else who runs into this issue: make sure the line endings of the CSV match the line endings you specify in your LOAD DATA query.
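For anyone who would rather check the line endings than stumble on them, here is a small Python sketch. The function name count_line_endings is hypothetical, and it reads the whole file into memory, which is fine for a small CSV like this one.

```python
def count_line_endings(path):
    """Count Windows (\\r\\n), old Mac (\\r) and Unix (\\n) line endings in a file."""
    with open(path, "rb") as f:
        data = f.read()
    crlf = data.count(b"\r\n")
    # Every \r\n pair also contains one \r and one \n, so subtract the pairs.
    cr = data.count(b"\r") - crlf
    lf = data.count(b"\n") - crlf
    return {"crlf": crlf, "cr": cr, "lf": lf}
```

Whichever count is non-zero tells you what to put in LINES TERMINATED BY: '\r\n' for crlf, '\r' for cr, '\n' for lf. A file with more than one non-zero count has mixed endings and is worth normalising first.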
Currently, I have a CSV file with data in it. I want to turn it into a SQL table, so I can run SQL queries on it. I want the table to be within a web-based database that others in my organization can also access. What's the easiest way to go from CSV file to this end result? Would appreciate insight on setting up the database and table, giving others access, and getting data inside. Preferably PostgreSQL, but MySQL is fine too.
To create the table it depends on the number of columns you have. If you have only a few then do it manually:
CREATE TABLE <table name> (<column name> <column type, e.g. INT or VARCHAR(100)>, <etc.>);
If you have many columns you can open the CSV file in Excel and get 'SQL Converter for Excel', which will build a CREATE statement for you using your column headings (and autodetect column types too).
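If you'd rather not install an add-in, the CREATE statement can also be derived from the header row with a few lines of Python. This is only a rough sketch: the function name create_table_sql is made up, every column is declared as TEXT (no type detection), and duplicate headings are not handled.

```python
import csv
import re


def create_table_sql(csv_path, table_name):
    """Build a CREATE TABLE statement from the header row of a CSV file."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        header = next(csv.reader(f))
    columns = []
    for i, name in enumerate(header):
        # Turn each heading into a plain identifier: "Count (total)" -> "count_total".
        col = re.sub(r"\W+", "_", name.strip()).strip("_").lower()
        if not col:
            col = f"col_{i + 1}"
        elif col[0].isdigit():
            col = "c_" + col
        # Assumption: TEXT for everything; tighten the types by hand afterwards.
        columns.append(f"    {col} TEXT")
    return f"CREATE TABLE {table_name} (\n" + ",\n".join(columns) + "\n);"
```

Starting with TEXT everywhere means the load cannot fail on a type mismatch; the columns can be altered to narrower types once the data is in.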
Loading data from a csv is also pretty straightforward:
LOAD DATA INFILE <filepath, e.g. 'C:/Users/<username>/Desktop/test.csv'>
INTO TABLE <table name>
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS; -- omit 'IGNORE 1 ROWS' if the CSV has no header row of column names
As for a web-based solution: https://cloud.google.com/products/cloud-sql/
That's a relatively open-ended question. A couple of noteworthy pointers off the top of my head:
MySQL allows you to store your data in different formats, one of them being CSV. That's a very straightforward solution if you're happy with it and don't mind a few limitations (see http://dev.mysql.com/doc/refman/5.0/en/csv-storage-engine.html).
Otherwise you can import your data into a table with a full-featured engine (see other answer(s) for details).
If you're happy with PostgreSQL and are looking for a fully web-based solution, have a look at Heroku.
There are a great many ways to make your data available through web services without accessing the back-end data store directly. Have a look at REST and SOAP for instance.
HTH
I have all my databases, tables and columns set to the utf8_general_ci collation.
Conditions that I Faced :-
When I insert Hindi data manually via phpMyAdmin, I can see the Hindi characters in phpMyAdmin, but question marks on the webpage generated by PHP.
In the same table, when I insert data via HTML/PHP forms I see some unrecognizable strings in English characters, something like cc2faa;, and correct Hindi on the webpage.
For the large data we have a script that reads from txt files and inserts the data into the table. In this case I see characters like à¤œà¤¾à¤¨à¤¾ in phpMyAdmin but Hindi on the webpage.
Now the main problem is :-
The data has since been changed online through the forms, and now I need to export it to Excel and give it to the client, but I am getting à¤œà¤¾à¤ in Excel instead of Hindi characters.
Note :-
All English characters are working fine and as it is everywhere.
My CHARACTER SET is utf8 For all tables.
I tried to change the collation to utf8_bin but that didn't help either.
Encoding on the browser is UTF-8, and I have already sent the headers for UTF-8 encoding.
I have seen many posts about utf8 problems but no one seems to have this weird, inconsistent behavior.
Is there any way out of this, or will I finally have to give the client PHP reports of the data instead?
Please help!!
When I insert Hindi data manually via phpMyAdmin, I can see the Hindi characters in phpMyAdmin, but question marks on the webpage generated by PHP.
PHP probably generates the question marks because the encoding of the database connection is not utf-8. How to fix this depends on the database library you use: with MySQLi, call mysqli_set_charset($link, 'utf8'); with PDO, add charset=utf8 to the DSN...
In the same table, when I insert data via HTML/PHP forms I see some unrecognizable strings in English characters, something like cc2faa;, and correct Hindi on the webpage.
For the large data we have a script that reads from txt files and inserts the data into the table. In this case I see characters like à¤œà¤¾à¤¨à¤¾ in phpMyAdmin but Hindi on the webpage.
These are likely caused by the same problem as above: the PHP forms and the script connect to the database using the default encoding, probably latin1. Then they insert utf-8 encoded text, but since MySQL thinks you are using latin1, it encodes the text into utf-8 again, and inserts this doubly encoded text into the table.
So: PHP sends "जाना" to MySQL, telling it that the text is latin1, and MySQL converts it to utf-8, resulting in "à¤œà¤¾à¤¨à¤¾". Later PHP asks MySQL to return the value, and since the connection is again using latin1, MySQL takes "à¤œà¤¾à¤¨à¤¾" and converts it back to latin1. Then PHP treats this latin1 string as utf-8 and displays "जाना".
Again, the solution is setting the encoding of the connection to utf-8. And this depends on what you use to access the database.
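The double encoding described above can be reproduced outside the database. The Python sketch below is only an illustration: the function names are made up, and it uses cp1252 as a stand-in for MySQL's latin1, which matches for the bytes involved here.

```python
def double_encode(text):
    """UTF-8 bytes misread as cp1252: what ends up stored in the table."""
    return text.encode("utf-8").decode("cp1252")


def repair(mojibake):
    """Undo one round of double encoding: back to the bytes, then read them as UTF-8."""
    return mojibake.encode("cp1252").decode("utf-8")


stored = double_encode("जाना")  # 'à¤œà¤¾à¤¨à¤¾', what phpMyAdmin shows
shown = repair(stored)          # 'जाना', what the latin1 connection hands back to PHP
```

The repair step is also what an export script would need to apply to rows that were already inserted doubly encoded, before writing them out for Excel. Rows inserted correctly would fail or be damaged by it, so the two kinds of rows have to be told apart first.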
If you need to export your data as an Excel file, use the PHP class php-export-data by Eli Dickinson, http://github.com/elidickinson/php-export-data. It is pretty nifty and so far I have had no problems exporting weird character sets with it.
I was importing an SQL dump file using the source command (source sql_file_name) and afterwards there were a bunch of weird characters in the columns' values. For example: â¢. How do I fix it? I'm using mysql 5.0.45. Thanks.
Probably someone used accented characters in the column names, which is always a bad idea.
When importing a file, MySQL needs to know in what encoding the file is. So you need to issue "SET NAMES utf8" before importing the file (or something else than utf8, just choose the right encoding).
If you don't use the correct encoding, your data will be screwed too.
We're often faced with the need to send a data file to one of our clients with data from the database he/she needs to translate. Most of the time this export is CSV or XLS.
Most of the time we create a csv dump with phpmyadmin and get an xls file in return with the translated data. The problem is that most of the time the data is UTF8 and when the file is returned as xls each and every time we load the data into mysql again we end up with utf8 problems, characters not being displayed properly, etc ...
We've already double-checked everything in MySQL, from my.cnf to column character sets, and everything is set correctly to UTF8.
My question is not how to fix the encoding issue since that's been solved but how we would best proceed in the future handling this situation? What export format should we hand over? How should we import (just mysql load data infile or our own processing scripts). What is the general consensus on how to handle this situation?
We would like to continue using excel if possible since that's the format almost everybody expects including our clients' translation agencies. Our clients' ease of use is the most important factor here, without overloading us with major issues each time. The best of both worlds :)
The application I am currently working on includes the functionality of data import as well. The data is mostly encoded in utf-8.
My approach is to preprocess the imported CSV (or tab-delimited) file, whatever its encoding, into a correctly utf-8 encoded temporary CSV file in a client script (Python), and then load the contents of that file using the LOAD DATA INFILE statement.
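A stripped-down sketch of that preprocessing step might look like the following. It is not the actual script: the function name to_utf8 and the cp1252 fallback are assumptions, and real-world files may need a longer list of candidate encodings.

```python
import codecs


def to_utf8(src_path, dst_path, fallback="cp1252"):
    """Rewrite src_path as UTF-8 at dst_path and return the encoding that was assumed."""
    with open(src_path, "rb") as f:
        raw = f.read()
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # Excel's "Unicode text" export: the BOM tells the codec the byte order.
        text, used = raw.decode("utf-16"), "utf-16"
    else:
        try:
            text, used = raw.decode("utf-8-sig"), "utf-8"
        except UnicodeDecodeError:
            # Assumption: anything that is not valid UTF-8 is treated as cp1252.
            text, used = raw.decode(fallback, errors="replace"), fallback
    with open(dst_path, "wb") as out:
        out.write(text.encode("utf-8"))
    return used
```

Trying strict UTF-8 before the fallback matters: almost any byte sequence decodes "successfully" as a single-byte encoding, so the fallback has to come last.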
The encoding of the file is controlled by the character_set_database system variable (the variable should be set at the server level) and, starting from MySQL 5.1.17, can be overridden by the CHARACTER SET clause of LOAD DATA INFILE.
The only thing one should know is that MySQL's utf8 character set stores up to 3 bytes for each character instead of 4 (that might be a problem for some East Asian languages).
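Whether the 3-byte limit affects a given file can be checked before loading it. The helper below is a hypothetical sketch: it lists the characters that need 4 bytes in UTF-8, which are exactly the code points above U+FFFF.

```python
def chars_needing_four_bytes(text):
    """Return the characters in text that do not fit in 3 bytes of UTF-8."""
    # Code points above U+FFFF (outside the Basic Multilingual Plane)
    # are the ones that take 4 bytes in UTF-8.
    return [ch for ch in text if ord(ch) > 0xFFFF]
```

If the list comes back non-empty, those characters would need either to be stripped or to go into a column using MySQL's utf8mb4 character set, where that is available.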
To export lots of data efficiently you can use the SELECT ... INTO OUTFILE statement.