How to set utf8mb4 on MySQL 5.7 (Windows)

I have an installation of MySQL 5.7 on a Windows 7 machine.
I need to change the character set of the database in order to persist emoji.
The configuration in my.ini:
[client]
default-character-set = utf8mb4
[mysql]
default-character-set = utf8mb4
[mysqld]
default-character-set = utf8mb4
character-set-client-handshake = FALSE
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci
In the Windows services panel I confirmed that the configuration file path being loaded is correct.
Checking the database properties with the query:
SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%';
I got the following results:
Variable_name Value
character_set_client utf8
character_set_connection utf8
character_set_database utf8mb4
character_set_filesystem binary
character_set_results utf8
character_set_server utf8
character_set_system utf8
collation_connection utf8_general_ci
collation_database utf8mb4_general_ci
collation_server utf8_general_ci
So, the values of collation_server, character_set_system, character_set_server, character_set_results, character_set_connection, character_set_client are wrong.
How can I fix them?
Thanks.

After connecting to MySQL, perform SET NAMES utf8mb4. That will establish that your client is using the full 4-byte encoding for reading/writing.
You can do this in my.cnf/my.ini:
init_connect = 'SET NAMES utf8mb4'
but keep in mind that when connecting as root (or any SUPER user), init_connect is ignored.
Also, the tables/columns must be CHARACTER SET utf8mb4.
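To illustrate why the "full 4-byte encoding" matters, here is a minimal Python sketch (not MySQL code; the helper names `utf8_width` and `fits_in_utf8mb3` are made up for this example). It assumes only that MySQL's legacy utf8 stores at most 3 bytes per character, while utf8mb4 allows 4.

```python
def utf8_width(ch: str) -> int:
    """Return how many bytes a single character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


def fits_in_utf8mb3(text: str) -> bool:
    """True if every character needs at most 3 bytes (MySQL's legacy utf8)."""
    return all(utf8_width(ch) <= 3 for ch in text)


# Escapes are used so the sample data is unambiguous:
# U+0119 is a Polish letter, U+20AC the euro sign, U+1F600 an emoji.
for sample in ("abc", "\u0119", "\u20ac", "\U0001F600"):
    widths = [utf8_width(ch) for ch in sample]
    print(repr(sample), widths, fits_in_utf8mb3(sample))
```

Any string for which `fits_in_utf8mb3` returns False needs utf8mb4 on the connection as well as on the column.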

Related

MySQL 5.7 Not setting as utf8mb4 correctly

I'm trying to import data into my table. The source is a UTF-8 encoded CSV file, and the target is a MySQL utf8mb4 table.
Originally I thought the encoding was wrong for the language (Russian, for this failed row), but it turns out the problem is a slash "/" in the string (or any other similar character); when I remove it, the insert works up to that point.
This data is multilingual and contains emoji too. What is the best way to handle the slash in a double-quote-enclosed string?
For example, the slash in "й\с" is the problem.
When i run
SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%'
I get
Variable_name Value
character_set_client utf8mb4
character_set_connection utf8mb4
character_set_database utf8mb4
character_set_filesystem binary
character_set_results utf8mb4
character_set_server latin1
character_set_system utf8
collation_connection utf8mb4_general_ci
collation_database utf8mb4_unicode_ci
collation_server latin1_swedish_ci

FIREDAC TFDparams - can't send emoji

I'm trying to save text with emoji (like "hello 💋 world") to MYSQL
Everything goes fine when I just use it without FDparams:
FDQuery.SQL.text:='update USER set status="hello 💋 world"'
But if I try to use TFDparams, the troubles begin:
(A) FDParams.CreateParam(ftString,'status',ptInput).AsString:='hello 💋 world';
(B) FDParams.CreateParam(ftWideString,'status',ptInput).AsWideString:='hello 💋 world';
FDQuery.SQL.text:='update USER set status=:status'
FDQuery.Params.Assign(FDParams);
(A) just doesn't save the emoji properly ('hello ?? world' is sent to DB instead) - and I believe the emoji becomes '??' even before sending to DB (the call '.AsString' seems to spoil the unicode)
(B) gives a native MySQL error: Incorrect string value: '\xF0\x9F... for column 'status'
Mysql settings (config files):
[mysql]
default-character-set=utf8mb4
[mysqld]
sql_mode=STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION
init_connect='SET collation_connection = utf8mb4_unicode_ci'
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci
skip-character-set-client-handshake
Table USER show create:
CREATE TABLE `USER` (
`status` varchar(100) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
mysql -V
mysql Ver 14.14 Distrib 5.7.31, for Linux (x86_64) using EditLine wrapper
SHOW VARIABLES (sent from my client program)
character_set_client = utf8mb4
character_set_connection = utf8mb4
character_set_database = utf8mb4
character_set_filesystem = binary
character_set_results = utf8mb4
character_set_server = utf8mb4
character_set_system = utf8
character_sets_dir = /usr/share/mysql/charsets/
collation_connection = utf8mb4_unicode_ci
collation_database = utf8mb4_unicode_ci
collation_server = utf8mb4_unicode_ci
In short:
Query without FDparams - everything is fine, emoji is saved properly and no errors
FDparams[...].AsString - Firedac spoils the unicode emoji and sends just 'hello ?? world' to MySql
FDparams[...].AsWideString - I get MYSQL error "Incorrect string value: '\xF0\x9F..."
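The two symptoms above are consistent with how this emoji is represented in UTF-16 and UTF-8. The Python sketch below is only a simplified model, not FireDAC's actual conversion code: the hypothetical helper `lossy_ansi` replaces every non-ASCII UTF-16 code unit with '?', which is an assumption about how a lossy ANSI conversion might behave.

```python
EMOJI = "\U0001F48B"  # the kiss-mark emoji used in the question
TEXT = f"hello {EMOJI} world"


def lossy_ansi(text: str) -> str:
    """Model a lossy conversion: each non-ASCII UTF-16 code unit becomes '?'."""
    raw = text.encode("utf-16-le")
    out = []
    for i in range(0, len(raw), 2):
        unit = int.from_bytes(raw[i:i + 2], "little")
        out.append(chr(unit) if unit < 128 else "?")
    return "".join(out)


# The emoji is a surrogate pair in UTF-16, i.e. two code units.
utf16_units = len(EMOJI.encode("utf-16-le")) // 2
# In UTF-8 it is four bytes starting with F0 9F, as in the MySQL error.
utf8_bytes = EMOJI.encode("utf-8")

print(utf16_units)
print(utf8_bytes)
print(lossy_ansi(TEXT))
```

Under this model, case (A) loses the character before it reaches the server, while case (B) sends valid 4-byte UTF-8 that the server then rejects, which would point at the character set the connection is actually using for parameters.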

How to change system variables of sql on server

I need to find the file where I can change these variables to UTF-8. I can't find them in my my.cnf files.
collation_connection latin1_swedish_ci
collation_database latin1_swedish_ci
collation_server latin1_swedish_ci
character_set_client latin1
character_set_connection latin1
character_set_database latin1
character_set_results latin1
character_set_server latin1
According to the MySQL documentation, you can change the character set configuration:
[client]
character-sets-dir=/usr/local/mysql/share/mysql/charsets
[client]
character-sets-dir="C:/Program Files/MySQL/MySQL Server 5.6/share/charsets"
http://dev.mysql.com/doc/refman/5.6/en/charset-configuration.html

MySQL only storing some Emojis in text field when using UTF8MB4

We have a project where we're storing Facebook and Twitter posts in a MySQL database. At first almost all emoji were being stored as ?. We've since made some configuration changes to the database server, and more emoji are now saved and displayed correctly; however, some emoji are still showing as ?. Sadly, I'm not sure which ones they are. I know one of them was a basketball.
When I execute the following command on MySQL:
SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%'
OR Variable_name LIKE 'collation%';
I see the following settings;
character_set_client = utf8
character_set_connection = utf8
character_set_database = utf8mb4
character_set_filesystem = binary
character_set_results = utf8
character_set_server = utf8mb4
character_set_system = utf8
collation_connection = utf8_general_ci
collation_database = utf8mb4_unicode_ci
collation_server = utf8mb4_unicode_ci
Our database server is hosted with Rackspace, we've asked them to set up the following configuration;
[client]
default-character-set = utf8mb4
[mysql]
default-character-set = utf8mb4
[mysqld]
character-set-client-handshake = FALSE
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci
init-connect='SET NAMES utf8mb4'
I've tested output from the database using a number of clients, PHP, Java and MySQL Workbench.
I'm at a loss now as to why some Emojis are not saving, and I've followed as much advice as I can find on the web.
character_set_client/connection/results = utf8 -- These three are changed by SET NAMES. What you list seems to be before SET NAMES is executed.
If you are connecting as root, init-connect is not executed; perhaps this is why you don't see it.
Establish a non-SUPER user for all application work; that way the init-connect will be executed.
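One possible explanation for only some emoji being lost is that symbols inside the Basic Multilingual Plane fit in 3 bytes and survive a utf8 connection, while those above U+FFFF need 4 bytes. The Python sketch below lists the characters in a string that fall into the second group; the helper name `supplementary_chars` and the sample post are made up for illustration.

```python
def supplementary_chars(text: str) -> list[str]:
    """Return the code points above U+FFFF, which need 4 bytes in UTF-8."""
    return [f"U+{ord(ch):04X}" for ch in text if ord(ch) > 0xFFFF]


# U+263A (smiling face) and U+2764 (heart) are in the BMP;
# U+1F3C0 (basketball) is not.
post = "Game night \u263a \U0001F3C0 \u2764"
print(supplementary_chars(post))  # ['U+1F3C0']
```

Running a check like this over a sample of stored posts could show whether the characters that come back as ? are exactly the 4-byte ones.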

Encoding error with Polish charset during transfer of database / server setting up

I am trying to transfer one of my databases from one host (home.pl) to another (my newly set up server). The application I am trying to transfer is WordPress. Unfortunately, irrespective of the method used, I am struggling with encoding problems.
New host configuration
In my new server I am using the following directives in my.cnf:
[mysql]
default-character-set=utf8
[mysqld]
collation-server = utf8_general_ci
character-set-server = utf8
init_connect='SET collation_connection = utf8_general_ci'
init_connect='SET NAMES utf8'
[client]
default-character-set=utf8
My MySQL variables:
character_set_client utf8
character_set_connection utf8
character_set_database utf8
character_set_filesystem binary
character_set_results utf8
character_set_server utf8
character_set_system utf8
collation_connection utf8_general_ci
collation_database utf8_general_ci
collation_server utf8_general_ci
Php.ini on new server:
; PHP's default character set is set to UTF-8.
; http://php.net/default-charset
default_charset = "UTF-8"
Old host configuration
I ran SHOW VARIABLES on my old host, from which I am trying to transfer the database, and got the following:
character_set_client utf8
character_set_connection utf8mb4
character_set_database utf8
character_set_results utf8
character_set_server latin2
character_set_system utf8
/usr/local/pssql55/share/charsets/
collation_connection utf8mb4_general_ci
collation_database utf8_polish_ci
collation_server latin2_general_ci
Transfer methods tried out
1) Transfer via phpmyadmin
I have tried using phpMyAdmin export/import. In particular, I selected UTF-8 as the file character set both during export and import via phpMyAdmin.
Strangely, in phpMyAdmin on both the source server and the new host I don't see Polish characters (the output is the same, without Polish characters).
2) Export / Import via mysql dump
I have tried also to use:
mysqldump -h OLD_HOST -u OLD_USER -p DB | mysql -h localhost -u root NEW DATABASE
but the encoding also fails.
I also tried setting the character-set options, but that failed too:
mysqldump --default-character-set=latin1 | mysql --default-character-set=utf8
Dump file
In my dump file, opened in Programmer's Notepad with UTF-8 encoding set, characters look like this:
"Ä" instead of "ę"
Opening them in microsoft word I see
Ä™ instead of "ę"
The encoding converter (gżegżółka) recognises that the file is in:
C:\Users\mkondej001\Desktop\14271425_mk.sql
Kodowanie: Unicode UTF-8
EOL: LF (Unix)
Any clues how to transfer DB / set server variables correctly ?
In the end I found out that the problem was that the data had been written to the database incorrectly on my original server.
I ended up transferring the DB using:
mysqldump --default-character-set=utf8 [ORYGINAL_DB] | mysql [TARGET_DB] --default-character-set=utf8
and then executing:
UPDATE [table name] SET [field] = CONVERT(BINARY CONVERT([field] USING latin2) USING utf8)
as was advised here:
strange character encoding of stored data , old script is showing them fine new one doesn't
Hope that the above solution will be helpful for others too.
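The effect of that UPDATE can be modelled outside MySQL. The Python sketch below is an illustration only and assumes that MySQL's latin2 corresponds to Python's `iso8859_2` codec and that latin1 behaves like `cp1252`; the variable names are made up. It shows how the UTF-8 bytes of "ę" turn into the garbage seen in the dump, and how re-encoding as latin2 and re-reading as UTF-8 recovers the letter.

```python
original = "\u0119"  # the Polish letter e with ogonek

# The letter is two bytes in UTF-8.
stored_bytes = original.encode("utf-8")

# Misreading those bytes as latin2 gives A-umlaut plus an invisible
# control character, which matches the lone A-umlaut seen in the editor.
misread_latin2 = stored_bytes.decode("iso8859_2")

# Misreading them as cp1252 gives A-umlaut plus the trademark sign,
# which matches what Word displayed.
misread_cp1252 = stored_bytes.decode("cp1252")

# Python analogue of CONVERT(BINARY CONVERT(field USING latin2) USING utf8):
# go back to the latin2 bytes, then interpret them as UTF-8.
repaired = misread_latin2.encode("iso8859_2").decode("utf-8")

print(stored_bytes)
print(repr(misread_latin2), repr(misread_cp1252))
print(repaired == original)
```

The repair only works when the text was misinterpreted exactly once and no bytes were dropped along the way, so it is worth testing on a copy of the table first.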
SET NAMES utf8;
(The default is latin1, which leads to Ä™.)
Note: init_connect is not executed for root (or any SUPER) user. So this failed you:
init_connect='SET NAMES utf8'