In MySQL, how do I change a variable such as character_set_client?
mysql> show variables like 'character_set%';
-------------------------+-------
character_set_client | latin1
to obtain
character_set_client | utf8
When starting the mysql client, you have to specify --default-character-set=charset_name.
From manual:
Use charset_name as the default character set for the client and
connection.
A common issue that can occur when the operating system uses utf8
or another multi-byte character set is that output from the mysql
client is formatted incorrectly, due to the fact that the MySQL
client uses the latin1 character set by default. You can usually
fix such issues by using this option to force the client to use the
system character set instead.
For example:
$ mysql -uUser -pPassword --default-character-set=utf8
For an example of how to set it via connection string see here.
I am running a MySQL database on RDS. I want to change all of my encodings to utf8mb4. I created a parameter group on RDS with all character_set_* parameters as utf8mb4, assigned it to my RDS instance, and then rebooted the instance. However, when I run SHOW VARIABLES LIKE '%char%' on my DB, there are still values of latin1, which I do not want:
character_set_client latin1
character_set_connection latin1
character_set_database utf8mb4
character_set_filesystem binary
character_set_results latin1
character_set_server utf8mb4
character_set_system utf8
character_sets_dir /rdsdbbin/mysql-5.6.22.R1/share/charsets/
Likewise, new columns that I create on the DB are latin1 encoded instead of utf8mb4 encoded. I can change the encoding values manually through the mysql command line, but this doesn't help since the values are also reset to latin1 when I push to production.
I think the issue is the distinction between VARIABLES and GLOBAL VARIABLES.
If you list the GLOBAL VARIABLES this should reflect what you see in your parameter group: (assuming you've rebooted as Naveen suggested in the other answer)
SHOW GLOBAL VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%';
This is opposed to what you see in your regular VARIABLES:
SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%';
These can sometimes be overridden by the options supplied in the connection, e.g. connecting with the --default-character-set option:
mysql -h YOUR_RDS.us-east-1.rds.amazonaws.com -P 3306 --default-character-set=utf8 -u YOUR_USERNAME -p
After changing the parameter group, do you see the warning "Pending Reboot" in the console? If yes, try rebooting the DB instance so that the new character set is applied.
More information - http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_WorkingWithParamGroups.html
What's the difference between character-set-server and default-character-set in my.cnf? I want to set MySQL's connection to UTF8 and both of these seem to work. Is one better than the other?
Always use
character-set-server
The server character set and collation are used as default values if the database character set and collation are not specified in CREATE DATABASE statements. They have no other purpose (source)
character-set-server has replaced the default-character-set setting in the [mysqld] section, where default-character-set is now deprecated and can cause problems.
p.s. I believe the answer by newtover is wrong.
Here is a quote from MySQL docs:
You can force client programs to use specific character set as follows:
[client]
default-character-set=charset_name
This is normally unnecessary. However, when character_set_system differs from character_set_server or character_set_client, and you input characters manually (as database object identifiers, column values, or both), these may be displayed incorrectly in output from the client or the output itself may be formatted incorrectly. In such cases, starting the mysql client with --default-character-set=system_character_set—that is, setting the client character set to match the system character set—should fix the problem.
In other words, character_set_server and character_set_client are settings for mysqld, whereas default-character-set is a setting for mysql and other client programs, and it overrides the character_set_client that mysqld would otherwise assume by default.
You may not see the difference if you connect with mysql to localhost, but default-character-set is used as well when you connect to some other server, which may have other defaults.
UPD from 2018-08-17
As John Smith noticed, my answer is currently outdated, but the essence of it is still correct: character_set_server is a server variable, but when you connect to the mysqld with a client, you should specify the client and connection settings.
These days many computers used as clients and servers have UTF-8 as their default locale encoding, and only because of that might setting character-set-server instead of default-character-set seem to work.
To be clear, to set up mysqld (that is, the server) to use utf-8 and one of its collations as the default for schema names, table names, column names and column values instead of latin1_swedish_ci, you should set character-set-server in the mysqld configuration.
But when you connect a mysql client to the server, your client's character set may be something else. To convert the data you send over the connection into the server character set correctly, and to convert the server's responses back, you should specify the corresponding client settings:
SET character_set_client = charset_name;
SET character_set_results = charset_name;
SET character_set_connection = charset_name;
or the corresponding settings in mysql.ini for your client application. If all of them are the same, you can start your communication with the server with a shorter statement:
SET NAMES 'charset_name' [COLLATE 'collation_name'];
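To make the relationship between the two forms concrete, here is a minimal Python sketch that builds either the single SET NAMES statement or the three separate SET statements it stands for. The function name session_charset_statements and the identifier whitelist are assumptions made for illustration; they are not part of MySQL or of any connector.

```python
import re

# Conservative whitelist for names such as "utf8mb4" or "utf8mb4_unicode_ci".
# This pattern is an assumption of the sketch, not a MySQL rule.
_IDENT = re.compile(r"[A-Za-z0-9_]+")


def session_charset_statements(charset, collation=None, use_set_names=True):
    """Return the SQL statements that configure a session's character sets."""
    for name in (charset, collation):
        if name is not None and not _IDENT.fullmatch(name):
            raise ValueError(f"suspicious identifier: {name!r}")
    if use_set_names:
        stmt = f"SET NAMES '{charset}'"
        if collation is not None:
            stmt += f" COLLATE '{collation}'"
        return [stmt + ";"]
    # Long form: the three variables that SET NAMES assigns in one go.
    # The collation argument is ignored here.
    return [
        f"SET character_set_client = {charset};",
        f"SET character_set_results = {charset};",
        f"SET character_set_connection = {charset};",
    ]


if __name__ == "__main__":
    for line in session_charset_statements("utf8mb4", "utf8mb4_unicode_ci"):
        print(line)
    for line in session_charset_statements("utf8mb4", use_set_names=False):
        print(line)
```

Either list would be sent to the server right after connecting. Most connectors also accept a charset option at connect time, which is usually the simpler route.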
I'm in the process of upgrading an old legacy Rails 2.3 app to something more modern and running into an encoding issue. I've read all the existing answers I can find on this issue but I'm still running into problems.
Rails ver: 2.3.17
Ruby ver: 1.9.3p385
My MySQL tables are default charset: utf8, collation: utf8_general_ci. Prior to 1.9 I was using the original mysql gem without incident. After upgrading to 1.9, retrieving anything with utf8 characters in it produced this well-documented problem:
ActionView::TemplateError (incompatible character encodings: ASCII-8BIT and UTF-8)
I switched to the mysql2 gem for its superior handling and I no longer see exceptions, but things are definitely not encoding correctly. For example, what appears in the DB as the string Repoussé is being rendered by Rails as RepoussÃ©, “Boat” appears as â€œBoatâ€, etc.
A few more details:
I see the same results when I use the ruby-mysql gem as the driver.
I've added encoding: utf8 lines to each entry in my database.yml
I've also added the following to my environment.rb:
Encoding.default_external = Encoding::UTF_8
Encoding.default_internal = Encoding::UTF_8
It has occurred to me that I may have some mismatch where latin1 was being written by the old version of the app into the utf8 fields of the database or something, but all of the characters appear correctly when viewed in the mysql command line client.
Thanks in advance for any advice, much appreciated!
UPDATE: I now believe that the issue is that my utf8 data is being coerced through a binary conversion into latin1 on the way out of the db, I'm just not sure where.
mysql> SELECT CONVERT(CONVERT(name USING BINARY) USING latin1) AS latin1, CONVERT(CONVERT(name USING BINARY) USING utf8) AS utf8 FROM items WHERE id=myid;
+-------------+----------+
| latin1      | utf8     |
+-------------+----------+
| RepoussÃ©   | Repoussé |
+-------------+----------+
I have my encoding set to utf8 in database.yml, any other ideas where this could be coming from?
I finally figured out what my issue was. While my databases were encoded with utf8, the app with the original mysql gem was injecting latin1 text into the utf8 tables.
What threw me off was that the output from the mysql command line client looked correct. It is important to verify that your terminal, the database fields and the MySQL client are all running in utf8.
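To illustrate why the output can look correct in one client and garbled in another, here is a small Python sketch of the round trip. It assumes the old app sent UTF-8 bytes over a connection declared as latin1, and it uses Python's latin-1 codec as a stand-in for MySQL's latin1 (the two agree for the bytes in this example).

```python
original = "Repoussé"

# The client sends UTF-8 bytes...
wire_bytes = original.encode("utf-8")

# ...but the connection is declared latin1, so the server reads each byte
# as one latin1 character before storing the text in a utf8 column.
stored = wire_bytes.decode("latin-1")

# A latin1 connection reverses the same mistake on the way out, so the
# old client sees the text it expects.
seen_by_latin1_client = stored.encode("latin-1").decode("utf-8")

# A utf8 connection returns the stored characters as they really are.
seen_by_utf8_client = stored

print(seen_by_latin1_client)  # Repoussé
print(seen_by_utf8_client)    # RepoussÃ©
```

The two mistakes cancel out only as long as every client keeps using latin1, which is why the problem surfaces when one piece of the stack switches to utf8.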
MySQL's client runs in latin1 by default. You can discover what it is running in by issuing this query:
show variables like 'char%';
If set up properly for utf8 you should see:
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | utf8                       |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
If these don't look correct, make sure the following is set in the [client] section of your my.cnf config file:
default-character-set = utf8
And add the following to the [mysqld] section:
# use utf8 by default
character-set-server=utf8
collation-server=utf8_general_ci
Make sure to restart the mysql daemon before relaunching the client and then verify.
NOTE: This doesn't change the charset or collation of existing databases, just ensures that any new databases created will default into utf8 and that the client will display in utf8.
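As a rough way to double-check a config file against the settings above, the following Python sketch parses my.cnf-style text with the standard configparser module and reports options that are missing or different. The sample text, the EXPECTED table and the function name missing_charset_options are all assumptions for illustration. configparser is not a full my.cnf parser, so directives such as !includedir may need to be stripped first.

```python
import configparser

# Hypothetical config text mirroring the settings described above.
SAMPLE_MY_CNF = """
[client]
default-character-set = utf8

[mysqld]
# use utf8 by default
character-set-server=utf8
collation-server=utf8_general_ci
"""

# (section, option) -> value we expect to find.
EXPECTED = {
    ("client", "default-character-set"): "utf8",
    ("mysqld", "character-set-server"): "utf8",
    ("mysqld", "collation-server"): "utf8_general_ci",
}


def missing_charset_options(cnf_text, expected=EXPECTED):
    """Return (section, option, actual) for each setting that does not match."""
    parser = configparser.ConfigParser(
        allow_no_value=True,   # my.cnf allows bare flags without a value
        interpolation=None,    # treat '%' literally
        strict=False,          # tolerate repeated sections and options
    )
    parser.read_string(cnf_text)
    problems = []
    for (section, option), wanted in expected.items():
        actual = parser.get(section, option, fallback=None)
        if actual != wanted:
            problems.append((section, option, actual))
    return problems


if __name__ == "__main__":
    print(missing_charset_options(SAMPLE_MY_CNF))
```

This only inspects the file. The authoritative check is still the show variables query after restarting the daemon.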
After I did this I saw characters in the mysql client that matched what I was getting from the mysql2 gem. I was also able to verify that this content was latin1 by switching to "encoding: latin1" temporarily in my database.yml.
One extremely handy query to find issues is using char length to find the rows with multi-byte characters:
SELECT id, name FROM items WHERE LENGTH(name) != CHAR_LENGTH(name);
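The idea behind that query is that LENGTH counts bytes while CHAR_LENGTH counts characters, so the two differ only when a value contains multi-byte characters. A minimal Python sketch of the same comparison follows; the sample rows and the helper name has_multibyte are made up for illustration.

```python
def has_multibyte(text, encoding="utf-8"):
    """True when the encoded byte length differs from the character length."""
    return len(text.encode(encoding)) != len(text)


# Hypothetical (id, name) rows standing in for the items table.
rows = [(1, "Boat"), (2, "Repoussé"), (3, "“Boat”")]

flagged = [row_id for row_id, name in rows if has_multibyte(name)]
print(flagged)  # [2, 3]
```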
There are a lot of scripts out there to convert latin1 contents to utf8, but what worked best for me was dumping all of the databases as latin1 and stuffing the contents back in as utf8:
mysqldump -u root -p --opt --default-character-set=latin1 --skip-set-charset DBNAME > DBNAME.sql
mysql -u root -p --default-character-set=utf8 DBNAME < DBNAME.sql
I backed up my primary db first, then dumped into a test database and verified like crazy before rolling over to the corrected DB.
My understanding is that MySQL's translation can leave some things to be desired with certain more complex characters but since most of my multibyte chars are fairly common things (accent marks, quotes, etc), this worked great for me.
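For spot-checking individual values before or after such a conversion, the reversal can be sketched in Python. This assumes the damage is the usual "UTF-8 bytes read as latin1" kind, and that MySQL's latin1 behaves like Windows cp1252 with the five undefined bytes passed through unchanged. The function names are hypothetical, and results should be checked by eye because a value that legitimately contains sequences like Ã© would be altered too.

```python
def mysql_latin1_encode(text):
    """Encode text roughly the way MySQL's latin1 would (cp1252 plus fallbacks)."""
    out = bytearray()
    for ch in text:
        try:
            out += ch.encode("cp1252")
        except UnicodeEncodeError:
            # cp1252 leaves a few bytes undefined; fall back to ISO-8859-1.
            # This raises UnicodeEncodeError for characters above U+00FF.
            out += ch.encode("latin-1")
    return bytes(out)


def repair_double_encoded(text):
    """Undo one round of UTF-8-read-as-latin1; return text unchanged on failure."""
    try:
        return mysql_latin1_encode(text).decode("utf-8")
    except UnicodeError:
        # Not representable in latin1, or the bytes are not valid UTF-8:
        # the value was probably not double-encoded in the first place.
        return text


if __name__ == "__main__":
    print(repair_double_encoded("RepoussÃ©"))  # Repoussé
    print(repair_double_encoded("Repoussé"))   # Repoussé
```

Values that are already correct come back unchanged because their latin1 bytes do not form valid UTF-8.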
Some resources that proved invaluable in sorting all of this out:
Derek Sivers' guide on transforming MySQL data latin1 in utf8 -> utf8
Blue Box article on MySQL character set hell
Simple table conversion instructions on Stack Overflow
You say it all looks OK in the command line client, but perhaps your Terminal's character encoding isn't set to show UTF8? To check in OS X Terminal, click Terminal > Preferences > Settings > Advanced > Character Encoding. Also, check using a graphical tool like MySQL Query Browser at http://dev.mysql.com/downloads/gui-tools/5.0.html.
I am running MySQL 5.5.20 on Windows Vista Business. I am having a bit of a problem with collation_connection. My default charset is utf8 and collation is utf8_unicode_ci. However, when I perform mysql dumps on the database, my functions and procedures have the collation_connection showing utf8_general_ci; for example,
/*!50003 SET collation_connection = utf8_general_ci*/ ;
Is it possible to specify MySQL to default to utf8_unicode_ci for collation_connection? I use MySQL Workbench to perform the mysql dumps.
Nowadays consider using utf8mb4 instead of utf8.
If you want the behavior you're describing, you first would execute:
SET NAMES utf8mb4 COLLATE utf8mb4_unicode_ci;
Then redefine affected functions, procedures and triggers.
Next time you execute mysqldump, the generated file will have lines like this:
/*!50003 SET collation_connection = utf8mb4_unicode_ci*/ ;
More details about SET NAMES in MySQL Reference Manual.
At shell command prompt:
mysqladmin -u"username" -p"password" --default-character-set=utf8 CREATE my_db_schema
--default-character-set=utf8 seems to have no effect and I don't understand why.
Database gets created, but character set is latin1 with collation latin1_swedish_ci.
I found this question, which would seem to be the same issue, but even when I tried a non-root user as the selected answer suggested, I get identical behavior:
MySQL connection character set problems
(I'm using Windows and MariaDB if that makes any difference)
I have tried these mysqladmin.exe clients:
MariaDB 5.3.2 for Win32 (ia32) with default character set latin1 (no .ini)
MySQL 5.0.77 for linux-gnu (i686) with default character set utf8
In both cases, --default-character-set=utf8 or --default-character-set=latin1 do NOT override the MySQL server's .ini/.cnf settings.
As a workaround I'd suggest running:
echo "CREATE DATABASE my_db_schema DEFAULT CHARACTER SET utf8" | mysql -uusername -ppassword
--default-character-set=utf8 seems to have no effect and I don't understand why.
Database gets created, but character set is latin1 with collation latin1_swedish_ci.
This option does not influence the character set of a database, table or column when it is created.
The default-character-set is the character set of the connection to the server -- it ensures values you select from the database come through to the client with the correct encoding for display.
On the surface I'd say this appears to be a mysqladmin bug. I would let the MariaDB devs know about it.
http://kb.askmonty.org/en/reporting-bugs has general instructions about reporting bugs (ignore the bit about using the mysqlbug script, since it is not available on Windows).
P.S. And if the bug exists in MariaDB it likely also exists in MySQL.