I recently switched my MariaDB database to UTF-8 from Latin1. Read a bunch of checklists and carefully updated my character set, collation, my.cnf and php.ini. I have php forms for most of my data entry on the site, but sometimes for quick little changes, it's easier to go into a program like Adminer or phpmyadmin.
With the UTF-8 in place, I wanted to change director Alfonso Cuaron's name to Cuarón. I went to his entry in Adminer. Edit. Cuar[alt+0243]n. It showed in the edit box as Cuarón. But when I saved the change, Adminer showed the ó as garbled characters (mojibake) instead. Okay. Looked at page info in Firefox. Says the character encoding of the page is UTF-8. So all should be well, right?
I went to one of my php data entry forms and created a Bob Cuarón. It showed up fine.
I SSH'd into the server, fired up a mysql command line and ran an UPDATE statement with Cuarón. That worked. But trying to change it in Adminer still kept giving me the garbled version. I installed phpMyAdmin (which was giving me some issues with my nginx config) but I was able to edit his name and...sigh...it too gave me the garbled name. I installed SQLbuddy and...success...I was able to make the changes, but the program is lacking some of the things I need, like the ability to edit search results.
I'm sure I've nailed everything down:
nginx.conf:
charset UTF-8;
my.cnf:
[client]
default-character-set=utf8
[mysqld]
character-set-server=utf8
collation-server=utf8_general_ci
init-connect='SET NAMES utf8'
/etc/php5/fpm/php.ini
mbstring.language = Neutral
mbstring.internal_encoding = UTF-8
mbstring.encoding_translation = On
mbstring.http_input = auto
mbstring.http_output = UTF-8
mbstring.detect_order = auto
mbstring.substitute_character = none
default_charset = UTF-8
SHOW VARIABLES LIKE "%character_set%";
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | utf8                       |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
I can't see what I could be missing. Both Adminer and phpmyadmin handle UTF-8 so I don't know why it's not working. It worked right out of the box with SQLBuddy, but as I said it's missing some features.
Any thoughts where I should look?
UPDATE: turns out that an article I had read (forgot to bookmark, sorry) on UTF-8 migrations had me change some of the mbstring settings away from PHP's defaults. Someone at Adminer noticed that, and once I reverted them all was good. See their response here:
https://sourceforge.net/p/adminer/discussion/960418/thread/33595373/#42df
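For anyone chasing the same symptom, here is a minimal Python sketch (not PHP, and not a claim about what Adminer or mbstring does internally) of how a correctly typed ó can turn into two junk characters when its UTF-8 bytes are read as Latin-1/cp1252 somewhere along the way. The helper name misread_as_cp1252 is made up for illustration.

```python
def misread_as_cp1252(text: str) -> str:
    """Encode text as UTF-8, then wrongly decode those bytes as cp1252."""
    return text.encode("utf-8").decode("cp1252")


name = "Cuarón"
print(name.encode("utf-8"))     # b'Cuar\xc3\xb3n' (ó takes two bytes in UTF-8)
print(name.encode("latin-1"))   # b'Cuar\xf3n'     (ó takes one byte in Latin-1)
print(misread_as_cp1252(name))  # CuarÃ³n
```

If the garbled name in your tool looks like that last line, some layer is probably treating UTF-8 bytes as a single-byte encoding, which is worth checking before touching the database itself.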
In MySQL, how to change a variable such as character_set_client?
mysql> show variables like 'character_set%';
-------------------------+-------
character_set_client | latin1
to obtain
character_set_client | utf8
When starting the mysql client, you have to specify --default-character-set=charset_name
From manual:
Use charset_name as the default character set for the client and
connection.
A common issue that can occur when the operating system uses utf8
or another multi-byte character set is that output from the mysql
client is formatted incorrectly, due to the fact that the MySQL
client uses the latin1 character set by default. You can usually
fix such issues by using this option to force the client to use the
system character set instead.
For example:
$>mysql -uUser -pPassword --default-character-set=utf8
For an example of how to set it via connection string see here.
I'm currently attempting to switch from my shared inmotionhosting account (have received AWFUL service lately) to an Amazon EC2 server that I've set up. I'm having trouble getting the encryption function working on the EC2 server.
In my PHP code, all text gets encrypted by mcrypt before being put into the SQL. I have deduced that those mcrypt characters are responsible for all my queries throwing errors. (I know it's because of encoding issues, but Google searches on the subject aren't very clear on where I need to focus my attention.)
A more simplified way of explaining the problem. On my new hosting account this SQL query doesn't work:
UPDATE mydatabase.clients SET firstname='\'å».”é¶Q' WHERE id_client=65
But this does
UPDATE mydatabase.clients SET firstname='Test' WHERE id_client=65
So that tells me the mcrypt function is using characters that the SQL database doesn't understand and thus the queries aren't working.
Some other info for you...
When I run "SHOW VARIABLES LIKE 'character_set_%'" on the working database I get this:
Variable_name Value
character_set_client utf8
character_set_connection utf8
character_set_database latin1
character_set_filesystem binary
character_set_results utf8
character_set_server latin1
character_set_system utf8
When I do that on the nonworking database I get:
Variable_name Value
character_set_client utf8
character_set_connection utf8
character_set_database utf8
character_set_filesystem binary
character_set_results utf8
character_set_server utf8
character_set_system utf8
I saw the difference in character_set_database and ran this line of code:
ALTER DATABASE mydatabase DEFAULT CHARACTER SET latin1
It successfully changed the character_set_database to "latin1" to match the other, but didn't solve the problem.
Finally, all my columns in my tables are using the Collation "latin1_swedish_ci"
Any help you could give would be very very appreciated!
Store your encrypted strings as binary (or a similar) type. Also make sure you are escaping the encrypted string. Both are important parts to doing this right!
I've been working with MySQL and Mcrypt and I store my encrypted data and initialization vectors as binary and I escape all of these strings before they get put in a query. Works like a charm.
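To illustrate why the binary column type matters, here is a rough Python sketch (the thread itself is PHP). The byte values are an arbitrary stand-in for mcrypt output, and is_valid_utf8 and to_hex_literal are hypothetical helper names. It shows that cipher bytes are usually not valid UTF-8 text, and that writing them as a MySQL hexadecimal literal avoids quoting and charset interpretation altogether.

```python
# Stand-in for mcrypt output: arbitrary bytes, not text in any character set.
ciphertext = bytes([0x27, 0xE5, 0xBB, 0x2E, 0x94, 0xE9, 0xB6, 0x51])


def is_valid_utf8(data: bytes) -> bool:
    """Return True only if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def to_hex_literal(data: bytes) -> str:
    """Render bytes as a MySQL hexadecimal literal such as X'27E5'."""
    return "X'" + data.hex().upper() + "'"


print(is_valid_utf8(ciphertext))  # False
print("UPDATE mydatabase.clients SET firstname=" + to_hex_literal(ciphertext)
      + " WHERE id_client=65")
```

In real code a parameterized query or the driver's own escaping is the better route; the literal here is only meant to show the idea, and it assumes firstname has been changed to a VARBINARY or BLOB column.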
I have a MySQL dump generated from phpMyAdmin in a Windows environment.
When I try to import it on OS X (using the mysql command line) there are encoding
problems, even though the databases have the same encoding and collation.
I've also noticed that this problem occurs when I try to import a
different database from a Unix virtual machine.
When I import the same databases on Windows with the same commands, everything is OK.
Does anyone have an idea about what's going on?
Thanks.
Both databases have the following configuration:
character_set_client | latin1
character_set_connection | latin1
character_set_database | utf8
character_set_filesystem | binary
character_set_results | latin1
character_set_server | utf8
character_set_system | utf8
character_sets_dir | /usr/local/mysql-5.1.43-osx10.6-x86_64/share/charsets/
I have a staging Rails site up that's running on MySQL 5.0.32-Debian.
On this particular site, all of my tables are using utf8 / utf8_general_ci encoding.
Inside that database, I have some data that looks like so:
mysql> select * from currency_types limit 1,10;
+------+-----------------+---------+
| code | name | symbol |
+------+-----------------+---------+
| CAD | Canadian Dollar | $ |
| CNY | Chinese Yuan | å…ƒ |
| EUR | Euro | € |
| GBP | Pound | £ |
| INR | Indian Rupees | ₨ |
| JPY | Yen | ¥ |
| MXN | Mexican Peso | $ |
| USD | US Dollar | $ |
| PHP | Philippine Peso | ₱ |
| DKK | Denmark Kroner | kr |
+------+-----------------+---------+
Here's the issue I'm having
On staging (with the db and Rails site running on the debian box), the characters for symbols are appearing correctly when displayed from Rails. For instance, the Chinese Yuan is appearing as 元 in my browser, not å…ƒ as it shows inside the database.
When I download that data to my local OS X development machine and run the db and Rails locally, I see the representation from inside the DB (å…ƒ) on my browser, not the character 元 as I see in staging.
Debugging I've done
I've ensured all headers for Content-Type are coming back as utf8 from each webserver (local, staging).
My local mysql server and the staging server are both setup to use utf8 as the default charset. I'm using "set names 'utf8'" before I make any calls.
I can even connect to my staging db from my OS X Rails host, and I still see the characters å…ƒ representing the yuan. I'm guessing then, perhaps there's an issue with my mysql local client, but I can't figure out what the issue is.
Perhaps this might lend a clue
To make it even more confusing, if I paste the character 元 into the db on my local machine, I see that in the web browser fine. --- YET if I paste that same character into my staging db, I get a ? mark in its place on the page from my staging Rails site.
Also, locally on my OS X rails machine if I use "set names 'latin1'" before my queries, the characters all come back properly. I did have these tables set as latin1 before - could this be the issue?
Someone please help me out here, I'm going crazy trying to figure out what's wrong!
AHA! Seems I had some table information encoded in latin1 before, and stupidly changed the databases to utf8 without converting.
Running the following fixed that currency_types table:
mysqldump -u root -p --opt --default-character-set=latin1 --skip-set-charset DBNAME > DBNAME.sql
mysql -u root -p --default-character-set=utf8 DBNAME < DBNAME.sql
Now I just have to ensure that the other content generated after the latin1 > utf8 switch isn't messed up by that :(
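For what it's worth, the byte-level idea behind that dump/reload trick can be sketched in a few lines of Python. This is an illustration only, assuming the damage is "UTF-8 bytes stored as if they were latin1" (MySQL's latin1 is essentially cp1252); repair_double_encoded is a made-up helper name.

```python
def repair_double_encoded(garbled: str) -> str:
    """Undo 'UTF-8 bytes read as cp1252': re-encode as cp1252, decode as UTF-8."""
    return garbled.encode("cp1252").decode("utf-8")


# How the yuan sign got mangled in the first place ...
print("元".encode("utf-8").decode("cp1252"))  # å…ƒ
# ... and the reverse trip, which is roughly what the dump/reload pair does.
print(repair_double_encoded("å…ƒ"))           # 元
```

Dumping with --default-character-set=latin1 writes the original bytes back out, and loading with --default-character-set=utf8 reads those same bytes as UTF-8. Rows written after the switch were never double-encoded, so running the same round trip over them would damage them, which is why they need checking separately.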
Do you have these two lines in your database.yml under the proper section?
encoding: utf8
collation: utf8_general_ci
The problem could be that your MySQL client in staging does not support UTF-8.
Your local OS X Ruby installation might not have the proper configuration declared.
You should have "encoding: utf8" in "config/database.yml" for the MySQL database.
You should have "$KCODE = 'u'" in "config/environment.rb" for the Ruby environment (Ruby 1.8 only; $KCODE has no effect on Ruby 1.9 and later).
Another simple approach is to set the encoding with an SQL ALTER statement. You can do this using the bash script below (-N suppresses the column header so it is not treated as a table name).
for t in $(mysql -N --user=root --password=admin --database=DBNAME -e "show tables"); do echo "Altering $t"; mysql --user=root --password=admin --database=DBNAME -e "ALTER TABLE $t CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;"; done
prettified
# -N (--skip-column-names) keeps the "Tables_in_DBNAME" header out of the loop
for t in $(mysql -N --user=root --password=admin --database=DBNAME -e "show tables")
do
    echo "Altering $t"
    mysql --user=root --password=admin --database=DBNAME -e "ALTER TABLE $t CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;"
done
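If you would rather look at the statements before running anything, a small Python sketch like the one below prints them for review instead of executing them. The function names and the sample table list are made up for illustration; the only MySQL-specific detail relied on is that identifiers are quoted with backticks and an embedded backtick is doubled.

```python
def quote_identifier(name: str) -> str:
    """Quote a MySQL identifier with backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def convert_statements(tables, charset="utf8", collation="utf8_unicode_ci"):
    """Build one ALTER TABLE ... CONVERT TO statement per table name."""
    return [
        f"ALTER TABLE {quote_identifier(t)} "
        f"CONVERT TO CHARACTER SET {charset} COLLATE {collation};"
        for t in tables
    ]


# Hypothetical table names; in practice they would come from SHOW TABLES.
for stmt in convert_statements(["currency_types", "order items"]):
    print(stmt)
```

One caveat: CONVERT TO assumes the existing column charset is correct, so data that is already double-encoded (as in the latin1 case earlier in this thread) will not be repaired by it.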
My DB was already set by default to utf8, but I encountered the same problem.
Also after adding the following usual meta tag, the problem was still there:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
Then I created a dedicated connection.php to ensure all communication with MySQL is set to charset utf8. Note that there is no - in utf8 in mysqli_set_charset($bd, 'utf8')!
Here is my Connection.php:
<?php
$mysql_hostname = "localhost";
$mysql_user = "username";
$mysql_password = "password";
$mysql_database = "dbname";
$prefix = "";
$bd = mysqli_connect($mysql_hostname, $mysql_user, $mysql_password) or die("Could not connect database");
mysqli_select_db($bd, $mysql_database) or die("Could not select database");
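// Stop with a message instead of exiting silently if the charset cannot be set
if (!mysqli_set_charset($bd, 'utf8')) {
    die("Could not set connection charset to utf8");
}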
?>
Another php file:
<?php
//Include database connection details
require_once('connection.php');
//Enter code here...
//Create query
$qry = "SELECT * FROM subject";
$result = mysqli_query($bd, $qry);
?>
//Other stuff
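A side note on the 'utf8' name used above: in MySQL and MariaDB it means a character set of at most 3 bytes per character, so characters outside the Basic Multilingual Plane (emoji, for example) need utf8mb4 instead. The Python sketch below only illustrates that byte-length rule; mysql_charset_needed is a hypothetical helper, not part of any driver.

```python
def mysql_charset_needed(text: str) -> str:
    """Return 'utf8mb4' if any character needs 4 bytes in UTF-8, else 'utf8'."""
    widest = max((len(ch.encode("utf-8")) for ch in text), default=1)
    return "utf8mb4" if widest > 3 else "utf8"


samples = ["Cuarón", "元", "\U0001F600"]  # the last one is an emoji (U+1F600)
for sample in samples:
    print(sample, mysql_charset_needed(sample))
```

If your data can contain such characters, the same connection code should work with 'utf8mb4' passed to mysqli_set_charset, provided the tables use that charset as well.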
For Rails, run the following snippet in the Rails console. It prints an ALTER statement for every table in db/schema.rb. Then log in to mysql and execute the SQL copied from the console; it will change the encoding of all tables.
schema = File.read('db/schema.rb')
schema.each_line do |row|
  # schema.rb declares each table as: create_table "name", ...
  match = row.match(/create_table "([^"]+)"/)
  next unless match
  puts "ALTER TABLE `#{match[1]}` CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;"
end
You can generate a migration, the Rails way, to change the collation type on your databases:
rails generate migration ChangeDatabaseCollation
Then you can edit the generated file and paste:
def change
  # for each table that will store the new collation execute:
  execute "ALTER TABLE my_table CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci"
end
And run the migration:
rake db:migrate
You can also enforce the new collation on your database.yml:
development:
  adapter: mysql2
  encoding: utf8
  collation: utf8_general_ci
For more information on Rails migrations:
http://edgeguides.rubyonrails.org/active_record_migrations.html
For more information on collation types:
http://collation-charts.org/