Change charset from UTF-8 to ISO-8859-1 in MySQL

I started working on a legacy MySQL database whose default collation is latin1, but whose tables default to utf8. Even though the tables are declared as utf8 (the universal standard encoding), Swedish characters are not rendered correctly. It seems the application that uses this database expects ISO-8859-1. So I would like to convert this database and the data in it to ISO-8859-1. I tried this command:
iconv -f UTF-8 -t ISO-8859-1 webtest_backu_01.sql > converted-file.sql
It gives the error: illegal input sequence at position
Any help is appreciated. Thanks.

Please take a look at this link: http://dev.mysql.com/doc/refman/5.0/en/charset-conversion.html
You can use the alter table command to make this conversion per-table if it is possible. I used this before successfully.
Example from the link:
ALTER TABLE t MODIFY col1 CHAR(50) CHARACTER SET utf8;
Also, an important detail: conversion may be lossy if the column contains characters that are not in both character sets. I don't think that is a problem going from ISO-8859-1 to UTF-8, but it can be in the other direction.
Give this a try for one of the tables and see if it works.
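To make the "lossy" caveat concrete, here is a minimal Python sketch that lists the characters in a string that have no ISO-8859-1 equivalent. The function name find_unencodable and the sample strings are made up for illustration; this only checks text outside the database and is not a description of what MySQL does internally.

```python
def find_unencodable(text, encoding="iso-8859-1"):
    """Return (index, character) pairs that cannot be encoded in `encoding`."""
    problems = []
    for index, char in enumerate(text):
        try:
            char.encode(encoding)
        except UnicodeEncodeError:
            problems.append((index, char))
    return problems


# Swedish letters all exist in ISO-8859-1, so nothing is reported.
print(find_unencodable("Malmö, Västerås, Åre"))

# The euro sign and curly quotes do not exist in ISO-8859-1.
print(find_unencodable("10 € “rea”"))
```

If a check like this reports anything for real column values, converting that column to a single-byte charset will lose those characters.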

Related

MySQL Exporting Arabic/Persian Characters

I'm new to MySQL and I'm working with it through phpMyAdmin.
My problem is that I imported some tables from .sql files into a database that uses the UTF8_general_ci collation, and they contain some Arabic or Persian characters. However, when I export the data to an Excel file, it appears as follows:
The original value: أحمد الكمالي
The exported value: أحمد  الكمالي
I have searched for this issue and tried to solve it by giving the output and the server connection the same collation, UTF8_general_ci. But for some reason that I don't know, phpMyAdmin doesn't let me pick that collation; it forces me to choose this one: UTF8mb4_general_ci
Anyway, when I export the data, I make sure the format is UTF-8, but it still appears like that.
How can I solve or fix it?
Note: Here are some screenshots, organized by number, if you want to check.
http://www.megafileupload.com/rbt5/Screenshots.rar
I found an easier way: you can rebuild the Excel file with the correct characters.
Export your data from MySQL normally in CSV format.
Open a new Excel workbook and go to the Data tab.
Select "From Text". If you cannot find it, it is under "Get External Data".
Select your file.
Change the file origin to Unicode (UTF-8) and select Next ("Delimited" is checked by default).
Select the Comma delimiter and press Finish.
You will see your language's characters correctly.
Mojibake. Probably...
The bytes you have in the client are correctly encoded in utf8mb4 (good).
You connected with SET NAMES latin1 (or set_charset('latin1') or ...), probably by default. (It should have been utf8mb4.)
The column in the tables may or may not have been CHARACTER SET utf8mb4, but it should have been that.
(utf8 and utf8mb4 work equally well for Arabic/Persian.)
Please provide more details if this explanation does not suffice.
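The mechanism described in this answer can be reproduced outside MySQL. The sketch below is only an illustration: it assumes that MySQL's latin1 behaves like Windows-1252 for these bytes, so Python's cp1252 codec stands in for it, and the variable names are arbitrary.

```python
# "أحمد الكمالي" written with escapes so no invisible characters sneak in.
original = "\u0623\u062d\u0645\u062f \u0627\u0644\u0643\u0645\u0627\u0644\u064a"

# Step 1: the client holds correctly encoded UTF-8 bytes.
utf8_bytes = original.encode("utf-8")

# Step 2: somewhere along the way those bytes are interpreted as latin1,
# one character per byte. This is the Mojibake seen in the export.
mojibake = utf8_bytes.decode("cp1252")
print(mojibake)

# Undoing the wrong interpretation recovers the original text.
repaired = mojibake.encode("cp1252").decode("utf-8")
print(repaired == original)  # True
```

Each Arabic letter takes two bytes in UTF-8, which is why every letter turns into two Western characters in the exported value.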

Using iconv to convert mysqldump-ed databases

Trying to quickly convert a latin1 mysql DB to utf8, I tried the following:
Dump the DB
Run iconv -f latin1 -t utf8 on the resulting file
Import into a fresh DB with UTF8 default encoding
This mostly works, except that some letters get converted wrong (for example, an uppercase accented 'U' becomes a garbled sequence starting with a question mark). Some conversion is taking place (od on a query result shows a two-byte sequence where the latin1 byte was), and the latin1 version is all right. While I have so far been unsystematic in isolating the problem (late night, under deadline, etc.), the weirdness of the issue kills me: why would it fail on some letters and not all? Client connection? Column charset? Why am I not getting any diagnostics? I'm stymied.
Sure, I can work on isolating the issue and its details, but thought that maybe somebody ran into this already and can recognize it by this (admittedly rather poor) description.
Cheers
The data may have been stored as latin1, but it's possible that whatever client you used to dump the data has already exported it as UTF-8.
Open the dump file in a decent text editor (Notepad++, TextWrangler, Atom) and check which encoding allows all characters to be displayed properly.
Then when it comes to import the data back in, ensure your client is set to use UTF-8 on the import.
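If you would rather check the dump programmatically than by eye, a rough Python sketch follows. The function name guess_dump_encoding and the sample statement are invented for illustration. It relies on the fact that latin1 text containing accented letters almost never happens to be valid UTF-8. Note that a dump that was converted twice is still valid UTF-8, so this check cannot detect double encoding.

```python
def guess_dump_encoding(data: bytes) -> str:
    """Classify raw dump bytes as 'ascii', 'utf-8', or 'not utf-8'."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return "not utf-8"
    return "ascii" if text.isascii() else "utf-8"


# The same statement as it would look in two differently encoded dumps.
statement = "INSERT INTO t VALUES ('Ùbeda');"
print(guess_dump_encoding(statement.encode("utf-8")))    # utf-8
print(guess_dump_encoding(statement.encode("latin-1")))  # not utf-8
```

For a real dump, read the file in binary mode and pass the bytes to the function, so that Python does not apply a decoding of its own first.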
Don't use iconv; it only muddies the waters.
Assuming that a table is declared to be latin1 and correctly contains latin1 bytes, but you would like to change it to utf8, do this to the table:
ALTER TABLE tbl CONVERT TO CHARACTER SET utf8;
It is also possible to do it with a dump and reload; it involves some changes to the arguments. Sorry I don't have the details.

Not able to display Chinese characters after loading it to Postgres DB

I have a source file which contains Chinese characters. After loading that file into a table in a Postgres DB, all the characters are garbled and I'm not able to see the Chinese characters. The encoding of the Postgres DB is UTF-8. I'm using the psql utility on my local macOS machine to check the output. The source file was generated from a MySQL database using mysqldump and contains only INSERT statements.
INSERT INTO "trg_tbl" ("col1", "col2", "col3", "col4", "col5", "col6", "col7", "col7",
"col8", "col9", "col10", "col11", "col12", "col13", "col14",
"col15", "col16", "col17", "col18", "col19", "col20", "col21",
"col22", "col23", "col24", "col25", "col26", "col27", "col28",
"col29", "col30", "col31", "col32", "col33")
VALUES ( 1, 1, '与é<U+009D>žç½‘_首页&频é<U+0081>“页顶部广告ä½<U+008D>(946×90)',
'通æ <U+008F>广告(Leaderboard Banner)',
0,3,'',946,90,'','','','',0,'f',0,'',NULL,NULL,NULL,NULL,NULL,
'2011-08-19 07:29:56',0,0,0,'',NULL,0,NULL,'CPM',NULL,NULL,0);
What can I do to resolve this issue?
The text was mangled before producing that SQL statement. You probably wanted the text to start with 与 instead of the "Mojibake" version: 与. I suggest you fix the dump either to produce utf8 characters or hex. Then the load may work, or there may be more places to specify utf8, such as SET NAMES or the equivalent.
Also, for Chinese, CHARACTER SET utf8mb4 is preferred in MySQL.
é<U+009D>ž is so mangled I don't want to figure out the second character.
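For anyone who does want to decode such a sequence, here is a hedged Python sketch. It assumes the text was produced by reading UTF-8 bytes as Windows-1252, and that bytes Windows-1252 leaves undefined (such as 0x9D) were passed through as the control character with the same number, which is what the <U+009D> notation in the dump suggests. The helper names to_original_bytes and repair are made up. Fixing the dump at the source remains the better option.

```python
def to_original_bytes(mangled: str) -> bytes:
    """Map each character back to the single byte it was decoded from."""
    out = bytearray()
    for char in mangled:
        try:
            out += char.encode("cp1252")
        except UnicodeEncodeError:
            # Assumption: bytes undefined in cp1252 (e.g. 0x9D) came through
            # as the control character with the same code point.
            out += char.encode("latin-1")
    return bytes(out)


def repair(mangled: str) -> str:
    """Reinterpret the recovered bytes as UTF-8."""
    return to_original_bytes(mangled).decode("utf-8")


print(repair("ä¸Ž"))       # 与
print(repair("é\u009dž"))  # 非
```

A character that fits neither codec raises UnicodeEncodeError, which is a sign that the text was not mangled in the way assumed here.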

Change default charset to utf-8: mysql

I am developing an app that uses a MySQL database. The database contains certain characters that cannot be encoded on the client side, and those values come through as null.
For example, a string containing a special character is represented as null on the client side.
I found that the default charset for the database was latin1, so I changed it to utf-8, including all tables and the individual columns of those tables. In my PDO constructor I have also set the charset to utf-8:
$db = new PDO('mysql:dbname=$dbname;host=$dbhost;charset=utf8',$dbname,$dbhost);
I also configured the response headers to use the utf-8 charset. But the characters are still not encoded; I still get a null string wherever a special character is present.
I tried changing the my.ini configuration to set the default charset, but that gives me an error in my connection file at the PDO constructor.
It's urgent for me to fix this! Can someone help?

UTF-8: showing correctly in database, however not in HTML despite utf-8 charset

I use MySQL 5.1 and loaded about 2.7 million lines from a UTF-8 encoded text file into a table, using LOAD DATA INFILE... The table itself is declared as utf8_unicode_ci, and all its char fields are declared as utf8_unicode_ci as well.
In the database itself the characters all seem to be correct; everything looks nice. However, when I print them using PHP, the characters show up as ???, although I use a utf-8 declaration in the HTML head:
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
...
In another table (using utf-8), where I inserted text from a submitted form, the characters appear strangely in the database, but are shown correctly again, when I print them using SELECT....
So, I was wondering: what is wrong? Are UTF-8 chars shown correctly in the database or strangely but when you SELECT them again they are OK? Or where is the problem (when loading the file into the db, in the HTML or somewhere in between)??
Thank you very much for any hint or suggestion! :)
Note: MySQL's utf8 charset is limited, it only supports Unicode characters in the BMP that take up no more than three bytes. You should be using utf8mb4 instead.
Make sure you send the SET NAMES utf8mb4 command to MySQL after connecting, before running any MySQL queries.
Make sure your page is actually rendered as utf-8 (if there's an HTTP header Content-Type: text/html;charset=iso-8859-1, browsers disagree about which should win).
Read this article: Handling Unicode Front To Back In A Web App (but remember to replace utf8 with utf8mb4 where MySQL is concerned).
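The three-byte limit mentioned in the note can be checked for any given text with a small Python sketch. The function name needs_utf8mb4 and the sample strings are illustrative only; the check itself is simply whether any character lies outside the BMP, meaning its code point is above U+FFFF.

```python
def needs_utf8mb4(text: str) -> bool:
    """True if any character lies outside the BMP (4 bytes in UTF-8)."""
    return any(ord(char) > 0xFFFF for char in text)


samples = ["Västerås", "أحمد", "与非网", "😀"]
for sample in samples:
    # Widest character in the sample, measured in UTF-8 bytes.
    widest = max(len(char.encode("utf-8")) for char in sample)
    print(sample, widest, needs_utf8mb4(sample))
```

Swedish, Arabic and common Chinese text stays within three bytes per character, while emoji need four, which MySQL's utf8 cannot store.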
If phpMyAdmin displays your entered data as correct Unicode text, then my bet is that you are not doing SET NAMES utf8 after connecting.
Try using code like this after connecting to the database, but before you retrieve data:
$db->query('set character_set_client=utf8');
$db->query('set character_set_connection=utf8');
$db->query('set character_set_results=utf8');
$db->query('set character_set_server=utf8');