mysql unicode support - mysql

I'm trying to store Unicode strings in a MySQL database (MySQL version 5.1.41-3ubuntu12.9 on Ubuntu 10.04). This appears to work fine as long as I'm using the terminal to view the data. But if I use MySQL Query Browser or Ruby on Rails to query the database, all I get are garbage strings.
I've tried adding default-character-set = utf8 and character-set-server = utf8 to my my.cnf file and restarting MySQL, but that doesn't seem to help. My database.yml file has the line encoding: utf8 but I'm guessing this is not the issue considering the fact that I can't view the data properly in MySQL Query Browser either.
Any ideas on what to do?

I was having the exact same problem and found no help on the internet. The data displayed correctly only in the terminal window; both MySQL Query Browser and PHP were showing strange characters.
Finally I realized that the data in the database was NOT saved using the correct character set. So I updated the code that does the inserting and updating by adding $link->set_charset('utf8'); right after connecting, where $link is the mysqli object.
Now the new data inserted is showing fine. I still don't understand why the MySQL Terminal was showing it correctly.
Anyway it works now!
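A possible explanation for why the terminal looked fine, sketched in Python with no database involved. It assumes the connection charset defaulted to latin1 while the client was really sending UTF-8 bytes, and that the terminal itself renders UTF-8. Python's latin-1 codec only approximates MySQL's latin1 (which is closer to cp1252), so treat this as an illustration rather than an exact model of the server.

```python
# Simulation only. Assumption: connection charset is latin1, client sends UTF-8.
text = "naïve café"

# 1. The client sends UTF-8 bytes; the server is told they are latin1,
#    so every byte is taken as one separate character.
sent_bytes = text.encode("utf-8")
server_view = sent_bytes.decode("latin-1")

# 2. The server converts that (wrong) text into the utf8 column.
stored_bytes = server_view.encode("utf-8")

# 3a. Reading through the same latin1 connection reverses the mistake:
#     the server converts back to latin1 and the UTF-8 terminal decodes it.
terminal_bytes = stored_bytes.decode("utf-8").encode("latin-1")
terminal_text = terminal_bytes.decode("utf-8")

# 3b. A client that asks for utf8 receives the stored text unchanged.
gui_text = stored_bytes.decode("utf-8")

print(terminal_text)  # naïve café
print(gui_text)       # naÃ¯ve cafÃ©
```

If that is what happened, the rows written before the set_charset fix are still double-encoded and would need to be repaired separately.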

Related

Saving non-english language in MySQL shows ?????

I am trying to save some Indian language content (read Hindi) from a website into a column in MySQL database. I am using SpringBoot and JPA to scrape the website and write the content.
Everything works fine on my local system (OSX). The same code is deployed on Ubuntu 16.04 LTS. It works fine except that the content of the relevant db column shows '????'. I presumed that it might be a problem with the COLLATION and CHARACTER SET of MySQL of the prod instance.
Indeed, the MySQL version on my local machine was 8.x, and the COLLATION on the local db was utf8mb4_0900_ai_ci. The version on prod was 5.6, the COLLATION being utf8_x_x. Apparently, utf8mb4_0900_ai_ci is not available in versions prior to 8.x.
I tried changing the COLLATION of the columns (the entire schema, for that matter) on my local machine to utf8mb4_general_ci, as well as utf8mb4_unicode_ci. Both work fine on the local DB. But the same collations, namely utf8mb4_general_ci and utf8mb4_unicode_ci, still give '?????' in the prod environment!
Could this be a problem other than 'COLLATION'? I am able to print the Hindi content on terminal, both on local and Ubuntu deployment. The problem is less likely with the client, as the same client (Datagrip) shows local schema content nicely and ubuntu DB content as cryptic question marks.
Have googled for two days. Any help would be appreciated.
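Literal question marks usually mean a lossy conversion happened before the value was stored, which would point at a character set setting somewhere (connection, column, or JVM default) rather than at the collation. The Python sketch below only illustrates that difference. It does not touch MySQL, and latin-1 stands in for whatever single-byte charset the prod connection might be using, which is an assumption.

```python
# The Hindi word "हिंदी", written with escapes so the source stays ASCII.
hindi = "\u0939\u093f\u0902\u0926\u0940"

# Converting to a charset without Devanagari replaces each character with '?'.
# This step is irreversible: the original text cannot be recovered afterwards.
lossy = hindi.encode("latin-1", errors="replace")
print(lossy)  # b'?????'

# UTF-8 keeps every character (three bytes each for Devanagari).
safe = hindi.encode("utf-8")
print(len(safe))                      # 15
print(safe.decode("utf-8") == hindi)  # True
```

Since the question marks are stored as real '?' characters, changing the collation afterwards cannot bring the text back. The rows would have to be written again once the charset along the whole path is fixed.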

I converted my database to utf-8 but my old data is still in windows-1251. How to convert it?

I want to convert my database to utf-8. What I have done up to now is set the server to read utf-8 and the database is converted by using this query:
ALTER DATABASE database_name CHARACTER SET utf8 COLLATE utf8_unicode_ci;
Now all new information is seen and the things that were broken are now fine. The problem is that the old data is seen as �. This, in my opinion, is due to the fact that the old data is written in windows-1251 (I think at least and I am not 100% sure).
I found out that I need to dump the data:
mysqldump -uroot -p database -r utf8.dump
and then import it:
mysql -uroot -p --default-character-set=utf8 database
mysql> SET NAMES 'utf8';
mysql> SOURCE utf8.dump
This is what I saw from here: https://makandracards.com/makandra/595-dumping-and-importing-from-to-mysql-in-an-utf-8-safe-way
The problem is that I have absolutely no idea where and how to do this.
All I have access to is the web hosting control panel, and I have not set up anything on my computer. Therefore, I have no idea how to connect to the database from a command shell and so on. What steps should I take next to convert the data to utf-8? A detailed explanation would be great, since this is my first time doing something like this.
// I have a Mac and a Windows machine, but not a Linux at the moment.
Thank you!
The charset and collation of the database are only the defaults for any subsequently created tables. The table settings are the defaults for columns. Changing the database default does not convert existing tables.
For each table, do this:
ALTER TABLE table_name CONVERT TO CHARACTER SET utf8mb4;
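To see what the � symptom means at the byte level, here is a small Python sketch. It assumes the old rows really are windows-1251 bytes (the question is not certain about that), and the sample word is made up. As far as I know, CONVERT TO only gives the right result when the column's declared character set matches the bytes actually stored in it, so it is worth checking a few values like this first.

```python
# Hypothetical sample value standing in for an old row.
original = "Привет"

# What the old application stored: windows-1251 (cp1251) bytes.
legacy_bytes = original.encode("cp1251")

# Reading those bytes as UTF-8 fails, so each one shows up as U+FFFD.
misread = legacy_bytes.decode("utf-8", errors="replace")
print(misread)

# The correct conversion: decode with the real source charset, then
# encode as UTF-8.
converted = legacy_bytes.decode("cp1251").encode("utf-8")
print(converted.decode("utf-8"))  # Привет
```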

Microsoft SQL DB to MySQL DB

I have seen tons of posts on this, but I am not very familiar with the process and nothing has worked.
My basic problem is that I am trying to get data out of a Microsoft SQL Server database (using SQL Server Management Studio 2008) and convert it for use in a MySQL database (using MySQL Workbench).
I have tried to dump the .sql file, but when I try to import it into MySQL Workbench I get errors about it not being in UTF-8 format. I tried several ways to get it into that encoding (such as converting it in Notepad++, which I saw suggested everywhere), but nothing seemed to work.
If I run it, I get this error:
ERROR: ASCII '\0' appeared in the statement, but this is not
allowed unless option --binary-mode is enabled and mysql is
run in non-interactive mode.
Again, I looked up fixes for this, but could not get any of them to work.
I have tried a few options, just no success, so looking for some ideas or guidance with this area I do not have much experience in.
UPDATE:
The problem now is that the sql file I export from Management Studio is not accepted as a valid query in Workbench.
For example the brackets are not accepted and such.
I bet your encoding is not supported in MySQL WorkBench.
You can change the encoding when saving a .sql file.
Select File|Save.sql As to invoke the save as dialog.
Notice that the Save button on the lower right hand side has a drop down icon to indicate options.
Select the drop down icon and choose the "Save with Encoding" context menu item.
Select an encoding that works in MySQL Workbench.
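If re-saving from the editor is awkward, the file can also be converted outside it. The ASCII '\0' error is what you would typically see when a UTF-16 file is fed to the mysql client, so the sketch below assumes the export is UTF-16 with a byte order mark. The function name and file names are made up for the example.

```python
from pathlib import Path


def convert_utf16_to_utf8(src: Path, dst: Path) -> None:
    """Re-encode a UTF-16 text file as UTF-8 (assumes src has a BOM)."""
    # The "utf-16" codec reads the BOM to pick the byte order and drops it.
    text = src.read_text(encoding="utf-16")
    # Written without a BOM; in UTF-8 only a real NUL character produces
    # a zero byte.
    dst.write_text(text, encoding="utf-8")


if __name__ == "__main__":
    # Hypothetical file names; adjust to the actual export.
    source = Path("export.sql")
    if source.exists():
        convert_utf16_to_utf8(source, Path("export_utf8.sql"))
```

This only fixes the encoding. T-SQL syntax such as [bracketed] identifiers still has to be rewritten before MySQL will accept the statements.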

After migrating db php returns latin1 charset, but in DB - cp1251

I need to transfer my MySQL DB from a Windows server to an Ubuntu server.
So I made an export in phpMyAdmin on Windows and imported the *.sql file on Linux.
In phpMyAdmin on Linux everything looks okay: the tables are healthy, there are no errors, the character set is cp1251, and the Russian data in the tables looks the way it should.
But when I run a SELECT in a PHP script, the result contains only "???????", and echo mysql_client_encoding() shows that the charset is latin1.
Please tell me where latin1 could be set.
Thanks for the help.
UPD: I am now calling mysql_set_charset('cp1251'); after each db connection, but it's not a perfect solution. Maybe someone can offer another idea?

Loading UTF-8 encoded dump into MySQL

I was pulling my hair out over this problem for a few hours yesterday:
I've a database on MySQL 4.1.22 server with encoding set to "UTF-8 Unicode (utf8)" (as reported by phpMyAdmin). Tables in this database have default charset set to latin2. But, the web application (CMS Made Simple written in PHP) using it displays pages in utf8...
However screwed up this may be, it actually works. The web app displays characters correctly (mostly Czech and Polish are used).
I run: "mysqldump -u xxx -p -h yyy dbname > dump.sql". This gives me an SQL script which:
- looks perfect in any editor (like Notepad++) when displayed as UTF-8: all characters display properly
- has the default charset set to latin2 for all tables
- has a "/*!40101 SET NAMES latin2 */;" line at the beginning (among other settings)
Now, I want to export this database to another server running on MySQL 5.0.67, also with server encoding set to "UTF-8 Unicode (utf8)". I copied the whole CMS Made Simple installation over, copied the dump.sql script and ran "mysql -h ddd -u zzz -p dbname < dump.sql". After that, all the characters are scrambled when displaying CMSMS web pages.
I tried setting:
SET character_set_client = utf8;
SET character_set_connection = latin2;
And all combinations (just to be safe, even if it doesn't make any sense to me): latin2/utf8, latin2/latin2, utf8/utf8, etc. - doesn't help. All characters still scrambled, however sometimes in a different way :).
I also tried replacing all latin2 settings with utf8 in the script (set names and default charsets for tables). Nothing.
Are there any MySQL experts here who could explain in just a few words (I'm sure it's simple after all) how this whole encoding stuff really works? I read 9.1.4. Connection Character Sets and Collations but found nothing helpful there.
Thanks,
Matt
Did you try adding the --default-character-set=name option, like this:
mysql --default-character-set=utf8 -h ddd -u zzz -p dbname < dump.sql
I had that problem before and it worked after using that option.
Hope it helps!
Ugh... ok, seems I found a solution.
MySQL isn't the culprit here. I did a simple dump and load now, with no changes to the dump.sql script, meaning I left "set names latin2" and the table charsets as they were. Then I switched my original CMSMS installation over to the new database and... it worked correctly. So the encoding in the database is actually ok, or at least it works fine with the CMSMS installation I had at my old hosting provider (CMSMS apparently does funny things with character encoding).
To make it work on my new hosting provider, I actually had to add this line to lib/adodb/drivers/adodb-mysql.inc.php in CMSMS installation:
mysql_query('set names latin2',$this->_connectionID);
This is a slightly modified solution from this post. You can find the exact line there as well. So it looks like a MySQL client configuration issue.
SOLUTION for me:
Set this option in your PHP file, after mysql_connect (or after mysql_select_db):
mysql_query("SET NAMES 'utf8'");