How to create a utf8 db with mysqladmin - mysql

I feel like this should be simple, but I can't work out how to set the character set when making a db with "mysqladmin create". I thought this would work:
mysqladmin -u root db_name --character-set=utf8
leveraging this bit of the mysqladmin --help text:
-O, --set-variable=name
Change the value of a variable. Please note that this
option is deprecated; you can set variables directly with
--variable-name=value.
I also tried this:
mysqladmin -u root create db_name --default-character-set=utf8
In both cases, the db was created without complaint, but I don't think it worked:
mysql> SHOW VARIABLES like '%character%';
+--------------------------+----------------------------+
| Variable_name | Value |
+--------------------------+----------------------------+
| character_set_client | latin1 |
| character_set_connection | latin1 |
| character_set_database | latin1 |
| character_set_filesystem | binary |
| character_set_results | latin1 |
| character_set_server | latin1 |
| character_set_system | utf8 |
| character_sets_dir | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
I can see that character_set_system is utf8, but should all of the latin1's above be showing utf8?
Grateful for any advice - max

No. The variables you have displayed are the settings of your connection, not of the database. If you make a database dump, you will see that everything is in place. For more options, see the SET NAMES 'charset' command in the MySQL manual.
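If the aim is a database whose default character set is utf8 regardless of the server defaults, one option is to skip mysqladmin and send an explicit CREATE DATABASE ... CHARACTER SET statement through the mysql client. The following is a minimal Python sketch that only builds the command line. The function name create_db_command and the root user are assumptions for illustration, and nothing is executed here.

```python
import re

def create_db_command(db_name, charset="utf8", collation="utf8_general_ci", user="root"):
    """Build (but do not run) a mysql client command that creates a database
    with an explicit default character set and collation."""
    # Identifiers cannot be passed as bound parameters, so restrict them
    # to a conservative set of characters before building the SQL.
    for ident in (db_name, charset, collation):
        if not re.fullmatch(r"[A-Za-z0-9_]+", ident):
            raise ValueError(f"unsafe identifier: {ident!r}")
    sql = f"CREATE DATABASE `{db_name}` CHARACTER SET {charset} COLLATE {collation};"
    return ["mysql", "-u", user, "-e", sql]

print(create_db_command("db_name"))
```

The returned list could be handed to subprocess.run(cmd, check=True). Afterwards, SHOW CREATE DATABASE db_name; reports the character set of the database itself rather than that of the connection.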

I'm coming back to answer my own question, since I just tried to do this with a more recent install of MySQL and it didn't work: I think the options have changed.
In MySQL 5.5, which I have, the relevant config options (to make databases default to the utf8 character set) are:
[client]
default-character-set = utf8
[mysql]
default-character-set = utf8
[mysqld]
collation-server = utf8_unicode_ci
init-connect = 'SET NAMES utf8'
character-set-server = utf8

Related

Change default character set on MySQL Workbench

I am trying to connect to my MySQL database using the utf8mb4 charset (note that the global setting for the database charset is already utf8mb4).
I can do this quite easily using the CLI like so:
mysql -h myhostname -u myuser -p --default-character-set=utf8mb4
When I do the following query:
SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%';
I get the correct output as expected:
+--------------------------+--------------------+
| Variable_name | Value |
+--------------------------+--------------------+
| character_set_client | utf8mb4 |
| character_set_connection | utf8mb4 |
| character_set_database | utf8mb4 |
| character_set_filesystem | binary |
| character_set_results | utf8mb4 |
| character_set_server | utf8mb4 |
| character_set_system | utf8 |
| collation_connection | utf8mb4_general_ci |
| collation_database | utf8mb4_unicode_ci |
| collation_server | utf8mb4_unicode_ci |
+--------------------------+--------------------+
However, when I connect to my MySQL database using MySQL Workbench, and perform the same query I get the following:
+--------------------------+--------------------+
| Variable_name | Value |
+--------------------------+--------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | latin1 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | utf8mb4 |
| character_set_system | utf8 |
| collation_connection | utf8_general_ci |
| collation_database | latin1_swedish_ci |
| collation_server | utf8mb4_unicode_ci |
+--------------------------+--------------------+
The issue here is that I am struggling to change the default-character-set in MySQL Workbench GUI. I tried appending the following:
default-character-set=utf8mb4
in Manage Server Connections > Connection > Advanced > Others section,
but it does not seem to have any effect.
How can I change the default character set in the MySQL Workbench GUI?
AFAIK you have to execute this command each time you start a new Workbench session:
SET NAMES 'utf8mb4' COLLATE 'utf8mb4_unicode_ci';
UPDATE
The following is useful if you need to use Workbench to do exports (I haven't found a similar way to make all its connections default to utf8mb4):
The default charset used to export data is utf8. To support full Unicode, though, we need utf8mb4. To achieve this, it's possible to modify Workbench manually to use utf8mb4.
Go to C:\Program Files\MySQL\MySQL Workbench 6.3 CE\modules
Open the file wb_admin_export.py.
Create a backup copy.
Replace all occurrences of "default-character-set":"utf8" with "default-character-set":"utf8mb4".
Save the file.
Restart Workbench.
The next time you run the export you will see in the log results like this:
Running: mysqldump.exe --defaults-file="c:\users\jonathan\appdata\local\temp\tmpidlh7a.cnf" --host=localhost --protocol=tcp --user=root --allow-keywords=TRUE --port=3306 --default-character-set=utf8mb4 --routines --skip-triggers "databasename"
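As background for why the export charset matters: MySQL's utf8 stores at most three bytes per character, so anything that needs four bytes in UTF-8 (emoji, for example) only survives with utf8mb4. A small Python sketch to check whether a given string falls into that category (the helper name needs_utf8mb4 is made up for this example):

```python
def needs_utf8mb4(text):
    """Return True if any character lies outside the Basic Multilingual Plane,
    i.e. needs four bytes in UTF-8 and therefore utf8mb4 in MySQL."""
    return any(ord(ch) > 0xFFFF for ch in text)

for sample in ("na\u00efve", "\u65e5\u672c\u8a9e", "\U0001F600"):
    # Widest single character of the sample, in UTF-8 bytes.
    widest = max(len(ch.encode("utf-8")) for ch in sample)
    print(repr(sample), widest, needs_utf8mb4(sample))
```

Only the last sample (an emoji) reports a four-byte character. The accented Latin and the Japanese text fit in three bytes or fewer.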
In MySQL Workbench (8.0), you can click the Administration tab, select Options File under Instance, scroll to the International section and you'll find character-set-server and collation-server, which you can set to your desired charset and collation. Click the Apply button to save the changes.
This will set the values in /etc/mysql/my.cnf, or wherever your config file is.

Force MariaDB clients to use utf8mb4

I'm running into an issue where I'm getting differently ordered results when querying with PHP versus the command line. From my research, it appears that in some cases a bad encoding can cause problems with the order of the results.
That said, all my DB tables are encoded as utf8mb4, with the collation utf8mb4_general_ci. However, it doesn't seem that the MySQL variables are set correctly.
I'm on Mysql 5.5.5-10.1.26-MariaDb.
Here are my CNF settings, but to be honest I don't know what I'm doing here:
[client]
default-character-set=utf8mb4
[mysql]
default-character-set=utf8mb4
[mariadb]
[mysqld]
character-set-server=utf8mb4
character_set_client=utf8mb4
collation-server=utf8mb4_general_ci
The variables output from mysql:
character_set_client utf8
character_set_connection utf8
character_set_database utf8mb4
character_set_filesystem binary
character_set_results utf8
character_set_server utf8mb4
character_set_system utf8
collation_connection utf8_general_ci
collation_database utf8mb4_unicode_ci
collation_server utf8mb4_general_ci
Update: A person has asked for how I'm connecting to the database:
$this->connection = new PDO('mysql:host='.DB_SERVER.';dbname='.DB_NAME.';port='.DB_PORT, DB_USER, DB_PASS, $options);
Update: I've switched to utf8mb4_unicode_ci (as per suggestions in answers below).
You want to have character-set-client-handshake = FALSE as well.
With /etc/my.cnf.d/character-set.cnf
# https://scottlinux.com/2017/03/04/mysql-mariadb-set-character-set-and-collation-to-utf8/
# https://mariadb.com/kb/en/library/setting-character-sets-and-collations/
# https://medium.com/@adamhooper/in-mysql-never-use-utf8-use-utf8mb4-11761243e434
# https://stackoverflow.com/questions/47566730/force-mariadb-clients-to-use-utf8mb4
[client]
default-character-set = utf8mb4
[mysql]
default-character-set = utf8mb4
[mysqld]
character-set-client-handshake = FALSE
collation-server = utf8mb4_unicode_ci
init-connect = 'SET NAMES utf8mb4 COLLATE utf8mb4_unicode_ci'
character-set-server = utf8mb4
I get everything to be utf8mb4 [1]:
MariaDB [(none)]> show variables like 'char%'; show variables like 'collation%';
+--------------------------+----------------------------+
| Variable_name | Value |
+--------------------------+----------------------------+
| character_set_client | utf8mb4 |
| character_set_connection | utf8mb4 |
| character_set_database | utf8mb4 |
| character_set_filesystem | binary |
| character_set_results | utf8mb4 |
| character_set_server | utf8mb4 |
| character_set_system | utf8 |
| character_sets_dir | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)
+----------------------+--------------------+
| Variable_name | Value |
+----------------------+--------------------+
| collation_connection | utf8mb4_unicode_ci |
| collation_database | utf8mb4_unicode_ci |
| collation_server | utf8mb4_unicode_ci |
+----------------------+--------------------+
3 rows in set (0.00 sec)
MariaDB [(none)]>
However, without the character-set-client-handshake line, some are still utf8:
MariaDB [(none)]> show variables like 'char%'; show variables like 'collation%';
+--------------------------+----------------------------+
| Variable_name | Value |
+--------------------------+----------------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | utf8mb4 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | utf8mb4 |
| character_set_system | utf8 |
| character_sets_dir | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)
+----------------------+--------------------+
| Variable_name | Value |
+----------------------+--------------------+
| collation_connection | utf8_general_ci |
| collation_database | utf8mb4_unicode_ci |
| collation_server | utf8mb4_unicode_ci |
+----------------------+--------------------+
3 rows in set (0.01 sec)
MariaDB [(none)]>
[1] character_set_system is always utf8.
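Comparing two SHOW VARIABLES dumps by eye is error-prone. As a rough sketch, the client's table output can be parsed and diffed in Python. The function names are invented for this example, and the two input strings are abridged from the tables above.

```python
def parse_show_variables(output):
    """Parse the ASCII table printed by the mysql client into a dict."""
    values = {}
    for line in output.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip borders, prompts and "rows in set" lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) == 2 and cells[0] != "Variable_name":
            values[cells[0]] = cells[1]
    return values

def diff_variables(a, b):
    """Return {name: (value_in_a, value_in_b)} for variables that differ."""
    names = sorted(set(a) | set(b))
    return {n: (a.get(n), b.get(n)) for n in names if a.get(n) != b.get(n)}

handshake_disabled = """
+--------------------------+---------+
| Variable_name            | Value   |
+--------------------------+---------+
| character_set_client     | utf8mb4 |
| character_set_results    | utf8mb4 |
| character_set_server     | utf8mb4 |
+--------------------------+---------+
"""
handshake_default = """
| character_set_client     | utf8    |
| character_set_results    | utf8    |
| character_set_server     | utf8mb4 |
"""
print(diff_variables(parse_show_variables(handshake_disabled),
                     parse_show_variables(handshake_default)))
```

For the abridged inputs, only character_set_client and character_set_results are reported, which matches the difference visible in the two full tables.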
You should probably use utf8mb4_unicode_ci instead of utf8mb4_general_ci as it's more accurate. Unless you're running MariaDB on a system with an old/limited CPU and performance is a huge concern.
That being said, the solution is to set init_connect in your MariaDB configuration (or --init-connect on the command line):
init_connect = "SET NAMES utf8mb4 COLLATE utf8mb4_unicode_ci"
Either way is fine. I am not recommending one way over the other. Both are equally valid approaches.
Your MariaDB configuration may be in my.cnf or a file included by my.cnf, typically found under /etc/mysql. Check your system documentation for details. Because you are configuring a server variable, as indicated by the MariaDB documentation linked to above, you should set the variable in the server part of the configuration file. The server part of the configuration files is indicated by the INI section names ending in "d". An INI section is denoted by a keyword surrounded by square brackets, e.g. "[section]". The "d" stands for "daemon", which is standard UNIX nomenclature for a server process. You can set the variable in either the [mysqld] section or the [mariadb] section. Because the init_connect server variable is common to both MySQL and MariaDB, I would recommend you put it under [mysqld].
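To see which section of an option file actually sets a given variable, something like the following Python sketch may help. It leans on the standard configparser module, which only approximates the my.cnf format: directives such as !includedir are not followed, and the SAMPLE text and the find_option helper are assumptions made for this example. Option names are normalized because MySQL and MariaDB accept dashes and underscores interchangeably in them.

```python
import configparser

SAMPLE = """
[client]
default-character-set = utf8mb4
[mysqld]
character-set-server = utf8mb4
init_connect = 'SET NAMES utf8mb4 COLLATE utf8mb4_unicode_ci'
"""

def find_option(cnf_text, option):
    """Return [(section, value)] for every section that sets `option`."""
    parser = configparser.ConfigParser(
        allow_no_value=True,   # bare flags such as skip-character-set-client-handshake
        interpolation=None,    # keep '%' characters literal
        strict=False,          # tolerate repeated sections or options
    )
    # Treat init-connect and init_connect as the same option name.
    parser.optionxform = lambda name: name.lower().replace("-", "_")
    parser.read_string(cnf_text)
    wanted = option.lower().replace("-", "_")
    return [(s, parser[s][wanted]) for s in parser.sections() if wanted in parser[s]]

print(find_option(SAMPLE, "init-connect"))
```

With the sample text, init-connect is found only under [mysqld], which is where a server variable belongs.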
I see that you are setting character_set_client=utf8mb4 in your pasted configuration. You don't need to do this. You can delete or comment out the line. Comments are lines starting with pound symbol (#), also known as a hash mark, octothorp, or number sign.
Any and all clients that connect to the server will execute these command(s) before any other commands are processed.
init_connect is not performed by anyone connecting as root, so it is not as universal as you would like.
SET NAMES utf8mb4 sets 3 things; experiment to see that. You need all 3.
If you weren't as far back as 5.5, I would recommend utf8mb4_unicode_520_ci as being a better collation: "Unicode collation names now may include a version number to indicate the Unicode Collation Algorithm (UCA) version on which the collation is based. Initial collations thus created use version UCA 5.2.0. For example, utf8_unicode_520_ci is based on UCA 5.2.0. UCA-based Unicode collation names that do not include a version number are based on version 4.0.0."
Version 8.0 has Unicode 9.0 standard.
Back to the question: There is no perfect solution; the user can override whatever you do -- either through ignorance or through malice.
You could police the tables created, but that won't keep them from connecting incorrectly. Or correctly, but with a different charset. It is valid to do SET NAMES latin1, then provide latin1-encoded bytes. MySQL will convert as it stores/fetches.
But if they have utf8-encoded bytes, but say SET NAMES latin1, you get "double encoding". This "bug" destroys any chance of collating correctly, but is otherwise (usually) transparent. That is, stuff is messed up as it is stored, then un-messed up as it is fetched.
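The double-encoding effect can be imitated outside the database. In this Python sketch the latin-1 codec stands in for MySQL's latin1 (which is closer to cp1252, so bytes in the 0x80-0x9F range would behave differently). The function names are invented for the illustration.

```python
def double_encode(text):
    """Client sends utf8 bytes but claims latin1; the server decodes them as
    latin1 and converts each resulting character to utf8 for storage."""
    sent = text.encode("utf-8")
    return sent.decode("latin-1").encode("utf-8")

def fetch_mislabeled(stored):
    """Reading over the same mislabeled connection reverses the conversion."""
    return stored.decode("utf-8").encode("latin-1").decode("utf-8")

original = "caf\u00e9"
stored = double_encode(original)
print(original.encode("utf-8"))   # bytes the client sent
print(stored)                     # bytes that end up on disk
print(fetch_mislabeled(stored) == original)
```

The stored bytes are longer than what the client sent and no longer sort like the original text, yet the round trip still returns the original string, which is why the problem tends to go unnoticed.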
To fix this warning you should edit
/etc/my.cnf (my.ini on Windows)
Simply add/set in the file
[client]
default-character-set=utf8mb4
[mysql]
default-character-set=utf8mb4
[mysqld]
collation-server=utf8mb4_unicode_ci
init-connect='SET NAMES utf8mb4'
character-set-server=utf8mb4

Set CENTOS 6 mysql character sets to UTF8

I have a website, and I realised that when you copy some characters (* , ' " - _) from specific applications like Microsoft Word into a search box on my website, it returns this error:
ERROR org.hibernate.util.JDBCExceptionReporter - Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation 'like'
So I went to check out my database and I wanted to see if the database used UTF-8.
mysql> SHOW VARIABLES LIKE '%character%';
+--------------------------+----------------------------+
| Variable_name | Value |
+--------------------------+----------------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | latin1 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | latin1 |
| character_set_system | utf8 |
| character_sets_dir | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)
mysql> SHOW VARIABLES LIKE '%collation%';
+----------------------+-------------------+
| Variable_name | Value |
+----------------------+-------------------+
| collation_connection | utf8_general_ci |
| collation_database | latin1_swedish_ci |
| collation_server | latin1_swedish_ci |
+----------------------+-------------------+
As you can see, the database is using latin1 and I wanted to set it to use utf8. I'm on a CentOS 6.2 server, the my.cnf file resides in /etc/my.cnf, and its [mysqld] section is as follows:
[mysqld]
local-infile=0
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
user=mysql
# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0
init_connect ='SET NAMES utf8'
character_set_server = utf8
collation_server = utf8_general_ci
P.S I am not worried about the [client] section since it shows that it uses utf8 under the value of character_set_client.
The issue:
Although I've set the server options in my my.cnf file (then closed the file, shut down Tomcat and restarted it), nothing changes. When I run the first query shown above, character_set_server is still latin1.
Restarting Tomcat didn't have any effect; I actually had to restart MySQL for the changes to take effect. So stop the application server first (in my case Tomcat), then restart MySQL, then start Tomcat again. Afterwards, if you run SHOW VARIABLES LIKE '%character%'; in MySQL, you should see utf8 for the database character set, and SHOW VARIABLES LIKE '%collation%'; should show the database collation as utf8_general_ci.

mysql default charset different when invoked by php

As many others, I'm having some problems with mysql charset. As many others, I want everything to be UTF-8, but mysql was installed with latin-1, and no matter how I try/google/experiment with mysql config there is still latin-1 lurking in client settings.
OK, here is the setup. I have a (non-root) mysql user 'usr' with a password 'pwd'. Whenever I access mysql via the terminal (mysql -uusr -p) and ask him nicely about his charsets, he tells me that he is in love with utf8 (as he ought to be):
mysql> SHOW VARIABLES LIKE 'character%';
+--------------------------+----------------------------+
| Variable_name | Value |
+--------------------------+----------------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | utf8 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | utf8 |
| character_set_system | utf8 |
| character_sets_dir | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)
mysql> SHOW VARIABLES LIKE 'collation%';
+----------------------+-----------------+
| Variable_name | Value |
+----------------------+-----------------+
| collation_connection | utf8_general_ci |
| collation_database | utf8_unicode_ci |
| collation_server | utf8_unicode_ci |
+----------------------+-----------------+
3 rows in set (0.00 sec)
However, if I use PHP to access mysql (via the very same user):
$mysql_link=mysql_connect('localhost','usr','pwd');
$result1=mysql_query("show variables like 'character%'");
$result2=mysql_query("show variables like 'collation%'");
mysql_close($mysql_link);
And when I print_r $result1 and $result2, it magically falls back to latin-1:
character_set_client => latin1
character_set_connection => utf8
character_set_database => utf8
character_set_filesystem => binary
character_set_results => latin1
character_set_server => utf8
character_set_system => utf8
character_sets_dir => /usr/share/mysql/charsets/
collation_connection => utf8_unicode_ci
collation_database => utf8_general_ci
collation_server => utf8_unicode_ci
This happens regardless whether I invoke php via browser (as php-cgi) or via terminal (as php-cli).
Kinda fix for that is to set charset manually at each connection:
mysql_set_charset('utf8',$mysql_link);
That works. But I feel like there should be a way to do that via mysql config.
For reference, Mysql config (my.cfg) includes:
[client]
default_character_set = utf8
[mysqld]
init_connect='SET collation_connection = utf8_unicode_ci'
character-set-server = utf8
collation-server = utf8_unicode_ci
And PHP config (php.ini) includes
default_charset = "utf-8"
Thanks in advance! =)
P.S. I know that mysql_ functions are deprecated and should be replaced with mysqli_ ones. But hopefully that doesn't have anything to do with this exact problem =)
If you're like most people, you use the root account to get to MySQL. This little snippet from the docs might be your smoking gun.
It is still necessary for applications to configure their connection using SET NAMES or equivalent after they connect, as described previously. You might be tempted to start the server with the --init_connect="SET NAMES 'utf8'" option to cause SET NAMES to be executed automatically for each client that connects. However, this will yield inconsistent results because the init_connect value is not executed for users who have the SUPER privilege.

Rails show question marks(????) for my input utf8 data

I have set every encoding set variable I can figure out to utf8.
In database.yml:
development: &development
adapter: mysql2
encoding: utf8
In my.cnf:
[client]
default-character-set = utf8
[mysqld]
default-character-set = utf8
skip-character-set-client-handshake
character-set-server = utf8
collation-server = utf8_general_ci
init-connect = SET NAMES utf8
And if I run mysql client in terminal:
mysql> show variables like 'character%';
+--------------------------+----------------------------+
| Variable_name | Value |
+--------------------------+----------------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | utf8 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | utf8 |
| character_set_system | utf8 |
| character_sets_dir | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
mysql> show variables like 'collation%';
+----------------------+-----------------+
| Variable_name | Value |
+----------------------+-----------------+
| collation_connection | utf8_general_ci |
| collation_database | utf8_general_ci |
| collation_server | utf8_general_ci |
+----------------------+-----------------+
But it's all in vain. When I insert utf8 data from the Rails app, it ends up as ????????????.
What am I missing?
Don't check the global settings; check the settings you get when connected to the specific database your application uses. After changing the settings for MySQL, you also have to change the settings for your app database.
A simple way to check is to log in to MySQL with the app db selected:
mysql app_db_production -u db_user -p
or rails command:
rails dbconsole production
For my app it looks like this:
mysql> show variables like 'character%';
+--------------------------+----------------------------+
| Variable_name | Value |
+--------------------------+----------------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | latin1 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | utf8 |
| character_set_system | utf8 |
| character_sets_dir | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)
mysql> show variables like 'collation%';
+----------------------+-------------------+
| Variable_name | Value |
+----------------------+-------------------+
| collation_connection | utf8_general_ci |
| collation_database | latin1_swedish_ci |
| collation_server | utf8_general_ci |
+----------------------+-------------------+
3 rows in set (0.00 sec)
Command for changing database collation and charset:
mysql> alter database app_db_production CHARACTER SET utf8 COLLATE utf8_general_ci ;
Query OK, 1 row affected (0.00 sec)
And remember to change the charset and collation for all your tables:
ALTER TABLE tablename CHARACTER SET utf8 COLLATE utf8_general_ci; # changes for new records
ALTER TABLE tablename CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci; # migrates old records
Now it should work.
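With more than a handful of tables, the statements above can be generated rather than typed. A minimal Python sketch follows; the helper name and the table names users and posts are hypothetical, and the output is meant to be reviewed before being run in the mysql client.

```python
def conversion_statements(database, tables, charset="utf8", collation="utf8_general_ci"):
    """Build ALTER statements that switch a database and its tables
    to the given character set and collation."""
    statements = [
        f"ALTER DATABASE `{database}` CHARACTER SET {charset} COLLATE {collation};"
    ]
    for table in tables:
        # CONVERT TO also rewrites existing column data, not just the table default.
        statements.append(
            f"ALTER TABLE `{table}` CONVERT TO CHARACTER SET {charset} COLLATE {collation};"
        )
    return statements

for statement in conversion_statements("app_db_production", ["users", "posts"]):
    print(statement)
```

The real table list could come from SHOW TABLES. It is hard-coded here to keep the sketch self-contained.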
I had the same problem. I added characterEncoding to the end of mysql connection string:
use this: jdbc:mysql://localhost/dbname?characterEncoding=utf8
instead of this: jdbc:mysql://localhost/dbname
Okay, for anybody else for whom @Ravbaker's answer does not cut it, some more tips:
MySQL has the encoding specified at multiple levels: server, database, connection, table and even field/column. My problem was that the field/column was forced to latin1 (which overrides all the other encodings). I set the field back to the table encoding (which was utf8) and the world was good again.
Most of these settings can be set at the usual places: my.cnf, alter queries and rails database.yml file.
ALTER TABLE t MODIFY col1 CHAR(50) CHARACTER SET utf8;
was the query which did the trick for me.
For server / connection encodings use my.cnf and database.yml
For database / table / column encodings use queries
(You can also achieve these by other means)
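A run of question marks typically means the text had to be converted into a character set that cannot represent it, for example a latin1 column receiving Cyrillic or CJK input. Python's errors="replace" gives a rough imitation of that lossy conversion. The store_as helper is made up for this illustration and is not how MySQL is implemented.

```python
def store_as(text, charset):
    """Imitate a lossy conversion: characters the target charset
    cannot represent are replaced by '?'."""
    return text.encode(charset, errors="replace").decode(charset)

greeting = "\u041f\u0440\u0438\u0432\u0435\u0442"  # a six-letter Cyrillic word
print(store_as(greeting, "latin-1"))  # every letter is lost
print(store_as(greeting, "utf-8") == greeting)
```

Once the data has been stored as question marks, the original characters are gone. Fixing the column charset only helps for data inserted afterwards.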
Do you have this in the HTML?
<meta http-equiv="content-type" content="text/html;charset=UTF-8" />
or on HTML5 pages with <!doctype html>
<meta charset="utf-8">
You may need this to let the browser send strings in utf8.
I had the same problem today! I solved it by dropping my database and creating a new one, then running db:migrate, and now everything works!
WARNING: THIS WILL DELETE ALL YOUR DATA IN THIS DATABASE
So:
$ mysql -u USER -p
mysql > drop database YOURDB_NAME_development;
mysql > create database YOURDB_NAME_development CHARACTER SET utf8 COLLATE utf8_general_ci;
mysql > \q
$ rake db:migrate
Well done!