We have a project where we're storing Facebook and Twitter posts in a MySQL database. At first, almost all emojis were being stored as ?. We've since made some configuration changes to the database server, and more emojis are now saving and appearing correctly. However, some emojis still show as ?, and I'm not sure which ones. I know one of them was a basketball.
When I execute the following command on MySQL:
SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%'
OR Variable_name LIKE 'collation%';
I see the following settings:
character_set_client = utf8
character_set_connection = utf8
character_set_database = utf8mb4
character_set_filesystem = binary
character_set_results = utf8
character_set_server = utf8mb4
character_set_system = utf8
collation_connection = utf8_general_ci
collation_database = utf8mb4_unicode_ci
collation_server = utf8mb4_unicode_ci
Our database server is hosted with Rackspace, and we've asked them to set up the following configuration:
[client]
default-character-set = utf8mb4
[mysql]
default-character-set = utf8mb4
[mysqld]
character-set-client-handshake = FALSE
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci
init-connect='SET NAMES utf8mb4'
I've tested output from the database using a number of clients: PHP, Java and MySQL Workbench.
I'm at a loss now as to why some Emojis are not saving, and I've followed as much advice as I can find on the web.
character_set_client/connection/results = utf8 -- These three are changed by SET NAMES. What you list seems to be before SET NAMES is executed.
If you are connecting as root, init-connect is not executed; perhaps this is why you don't see it.
Establish a non-SUPER user for all application work; that way the init-connect will be executed.
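To check whether SET NAMES has actually taken effect for a given connection, it can help to look only at the three variables it changes. A minimal Python sketch, assuming the SHOW VARIABLES output has already been read into a dict; the function name non_utf8mb4_connection_vars and the dict layout are hypothetical, not part of any MySQL client library:

```python
# The three session variables that SET NAMES changes.
CONNECTION_VARS = (
    "character_set_client",
    "character_set_connection",
    "character_set_results",
)


def non_utf8mb4_connection_vars(variables):
    """Return the connection variables whose value is not utf8mb4."""
    return {
        name: variables.get(name)
        for name in CONNECTION_VARS
        if variables.get(name) != "utf8mb4"
    }


# Values copied from the SHOW VARIABLES output in the question.
reported = {
    "character_set_client": "utf8",
    "character_set_connection": "utf8",
    "character_set_database": "utf8mb4",
    "character_set_results": "utf8",
    "character_set_server": "utf8mb4",
}

print(non_utf8mb4_connection_vars(reported))
```

An empty result would mean the session is already using utf8mb4 on all three; with the values from the question, all three are flagged.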
Related
I'm trying to save text with emoji (like "hello 💋 world") to MySQL.
Everything works when I run the query without FDParams:
FDQuery.SQL.text:='update USER set status="hello 💋 world"'
But if I try to use TFDParams, the trouble begins:
(A) FDParams.CreateParam(ftString,'status',ptInput).AsString:='hello 💋 world';
(B) FDParams.CreateParam(ftWideString,'status',ptInput).AsWideString:='hello 💋 world';
FDQuery.SQL.text:='update USER set status=:status';
FDQuery.Params.Assign(FDParams);
(A) just doesn't save the emoji properly ('hello ?? world' is sent to DB instead) - and I believe the emoji becomes '??' even before sending to DB (the call '.AsString' seems to spoil the unicode)
(B) gives a native MySQL error: Incorrect string value: '\xF0\x9F... for column 'status'
MySQL settings (config files):
[mysql]
default-character-set=utf8mb4
[mysqld]
sql_mode=STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION
init_connect='SET collation_connection = utf8mb4_unicode_ci'
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci
skip-character-set-client-handshake
Table USER show create:
CREATE TABLE `USER` (
`status` varchar(100) COLLATE utf8mb4_unicode_ci DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
mysql -V
mysql Ver 14.14 Distrib 5.7.31, for Linux (x86_64) using EditLine wrapper
SHOW VARIABLES (sent from my client program)
character_set_client = utf8mb4
character_set_connection = utf8mb4
character_set_database = utf8mb4
character_set_filesystem = binary
character_set_results = utf8mb4
character_set_server = utf8mb4
character_set_system = utf8
character_sets_dir = /usr/share/mysql/charsets/
collation_connection = utf8mb4_unicode_ci
collation_database = utf8mb4_unicode_ci
collation_server = utf8mb4_unicode_ci
In short:
Query without FDParams - everything is fine, the emoji is saved properly and there are no errors
FDParams[...].AsString - FireDAC spoils the Unicode emoji and sends just 'hello ?? world' to MySQL
FDParams[...].AsWideString - I get the MySQL error "Incorrect string value: '\xF0\x9F..."
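The '\xF0\x9F...' in the error message is how the UTF-8 encoding of this emoji begins. The sketch below is Python rather than Delphi and is only an illustration: it shows that the kiss-mark emoji takes four bytes in UTF-8 and two code units in UTF-16. The two code units would be consistent with the two question marks seen in case (A), but that link is an inference, not something confirmed in the thread. The helper fits_in_3_byte_utf8 is a hypothetical name.

```python
emoji = "\U0001F48B"  # KISS MARK, the emoji used in the question

utf8_bytes = emoji.encode("utf-8")
utf16_units = len(emoji.encode("utf-16-le")) // 2  # each code unit is 2 bytes

print(utf8_bytes)   # a 4-byte sequence beginning with \xf0\x9f
print(utf16_units)  # number of UTF-16 code units (a surrogate pair)


def fits_in_3_byte_utf8(text):
    """True if every character needs at most 3 bytes in UTF-8."""
    return all(len(ch.encode("utf-8")) <= 3 for ch in text)


print(fits_in_3_byte_utf8("hello world"))
print(fits_in_3_byte_utf8("hello \U0001F48B world"))
```

Text for which fits_in_3_byte_utf8 returns False needs utf8mb4 on the connection as well as on the column.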
I am trying to set the character sets to utf8mb4 and the collations to utf8mb4_unicode_ci. On my website the emojis look fine when I read the data from the table with a PHP/MySQL select. Only in phpMyAdmin (4.8.4) do I see most emojis as a single question mark.
I tried this:
Add to /etc/my.cnf:
[client]
default-character-set = utf8mb4
[mysql]
default-character-set = utf8mb4
[mysqld]
character-set-client-handshake = FALSE
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci
Restart: systemctl restart mariadb
On my website I use mysqli_set_charset($con, "utf8mb4");, <form accept-charset="UTF-8"> and <meta charset="utf-8">.
Result of SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%':
What am I doing wrong?
I use MariaDB version 5.5.60.
Edit:
The problem might be phpMyAdmin (4.8.4). It seems that phpMyAdmin is still using utf8. The global variables are set to utf8mb4 or utf8mb4_unicode_ci, but I can't change the session variables:
I set in config.inc.php:
In the 'normal' table in phpMyAdmin:
And when I try SET NAMES everything looks fine:
Under General settings in phpMyAdmin the server connection collation is set to utf8_unicode_ci. I can select utf8mb4_unicode_ci, but then it switches back to utf8_unicode_ci :(
And on my website, with a select, everything looks OK:
Connecting with different SQL users with the same client (Sequel Pro) on the same MySQL server on the same database results in different collation server variables.
In my.cnf, I included the following:
[mysqld]
init_connect='SET collation_connection = utf8_unicode_ci, NAMES utf8'
character-set-server=utf8
collation-server=utf8_unicode_ci
skip-character-set-client-handshake
Connecting with user A results in:
collation_connection = utf8_unicode_ci
collation_database = utf8_unicode_ci
collation_server = utf8_unicode_ci
init_connect = SET collation_connection = utf8_unicode_ci, NAMES utf8
Connecting with user B results in:
collation_database = utf8_unicode_ci
collation_server = utf8_unicode_ci
init_connect = SET collation_connection = utf8_unicode_ci, NAMES utf8
So the collation_connection variable is missing. Is it possible the init_connect is ignored / user specific for some reason?
Because collation_connection is not set, an "Illegal mix of collations" error occurs.
Possible cause: The init_connect string seems to be wrong. Try:
init_connect='SET collation_connection = utf8_unicode_ci; SET NAMES utf8;'
and then restart the mysql service.
However, this does not explain the odd behaviour with the SUPER privilege.
Please be aware that init_connect is not run when a user with SUPER privileges logs in. This is a security feature, as documented:
The content of init_connect is not executed for users that have the SUPER privilege. This is done so that an erroneous value for init_connect does not prevent all clients from connecting.
This prevents you from locking yourself out of the database completely:
For example, the value might contain a statement that has a syntax error, thus causing client connections to fail. Not executing init_connect for users that have the SUPER privilege enables them to open a connection and fix the init_connect value.
So it is expected behavior for these users not to have this variable set upon connecting.
I have an installation of MySQL 5.7 on a Windows 7 machine.
I need to change the character set of the database in order to persist emoji.
The configuration in my.ini:
[client]
default-character-set = utf8mb4
[mysql]
default-character-set = utf8mb4
[mysqld]
default-character-set = utf8mb4
character-set-client-handshake = FALSE
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci
In the Windows services panel I saw that the correct configuration file path is loaded.
Checking the database settings with the query:
SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%';
I got the following results:
Variable_name Value
character_set_client utf8
character_set_connection utf8
character_set_database utf8mb4
character_set_filesystem binary
character_set_results utf8
character_set_server utf8
character_set_system utf8
collation_connection utf8_general_ci
collation_database utf8mb4_general_ci
collation_server utf8_general_ci
So, the values of collation_server, character_set_system, character_set_server, character_set_results, character_set_connection, character_set_client are wrong.
How can I fix them?
Thanks.
After connecting to MySQL, perform SET NAMES utf8mb4. That will establish that your client is using the full 4-byte encoding for reading/writing.
You can do this in my.cnf/my.ini:
init_connect = 'SET NAMES utf8mb4'
but keep in mind that when connecting as root (or any SUPER user), init_connect is ignored.
Also, the tables/columns must be CHARACTER SET utf8mb4.
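Since init_connect is skipped for SUPER users, one option is to have the application issue SET NAMES itself right after connecting. A minimal Python sketch, assuming a DB-API style cursor whose rows come back as (name, value) pairs; the function name use_utf8mb4 is hypothetical, and a driver configured to return dict rows would need a different last line:

```python
def use_utf8mb4(cursor):
    """Switch the current session to utf8mb4 and return the resulting settings."""
    # SET NAMES sets character_set_client, character_set_connection
    # and character_set_results in one statement.
    cursor.execute("SET NAMES utf8mb4")
    # Read the settings back so the caller can verify them.
    cursor.execute("SHOW VARIABLES LIKE 'character\\_set\\_%'")
    return dict(cursor.fetchall())
```

Calling this once per new connection does not depend on the server configuration or on which user is connecting. Many client libraries also accept a charset option at connect time, which is usually preferable when available.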
I am trying to transfer one of my databases from one host (home.pl) to another (my newly set up server). The site I am trying to transfer runs WordPress. Unfortunately, irrespective of the method used, I am struggling with encoding problems.
New host configuration
In my new server I am using the following directives in my.cnf:
[mysql]
default-character-set=utf8
[mysqld]
collation-server = utf8_general_ci
character-set-server = utf8
init_connect='SET collation_connection = utf8_general_ci'
init_connect='SET NAMES utf8'
[client]
default-character-set=utf8
My MySQL variables:
character_set_client utf8
character_set_connection utf8
character_set_database utf8
character_set_filesystem binary
character_set_results utf8
character_set_server utf8
character_set_system utf8
collation_connection utf8_general_ci
collation_database utf8_general_ci
collation_server utf8_general_ci
Php.ini on new server:
; PHP's default character set is set to UTF-8.
; http://php.net/default-charset
default_charset = "UTF-8"
Old host configuration
I ran SHOW VARIABLES on the old host from which I am trying to transfer the database and got the following:
character_set_client utf8
character_set_connection utf8mb4
character_set_database utf8
character_set_results utf8
character_set_server latin2
character_set_system utf8
character_sets_dir /usr/local/pssql55/share/charsets/
collation_connection utf8mb4_general_ci
collation_database utf8_polish_ci
collation_server latin2_general_ci
Transfer methods tried out
1) Transfer via phpmyadmin
I have tried using phpMyAdmin export/import. In particular, I selected UTF-8 as the file character set during both export and import.
What is strange is that in phpMyAdmin on both the source server and the new host I don't see Polish characters (the output is the same, without Polish characters).
2) Export / Import via mysql dump
I have tried also to use:
mysqldump -h OLD_HOST -u OLD_USER -p DB | mysql -h localhost -u root NEW_DATABASE
but the encoding also fails.
Tried to use also encoding variables but it also failed:
mysqldump --default-character-set=latin1 | mysql --default-character-set=utf8
Dump file
In my dump file, opened in Programmer's Notepad with UTF-8 encoding set, characters look like this:
"Ä" instead of "ę"
Opening the file in Microsoft Word I see
Ä™ instead of "ę"
The encoding converter (gżegżółka) recognises that the file is in:
C:\Users\mkondej001\Desktop\14271425_mk.sql
Kodowanie: Unicode UTF-8
EOL: LF (Unix)
Any clues how to transfer DB / set server variables correctly ?
In the end I found out that the problem was that the data had been written to the database incorrectly on my original server.
I ended up transferring the DB using:
mysqldump --default-character-set=utf8 [ORIGINAL_DB] | mysql [TARGET_DB] --default-character-set=utf8
and then executing:
UPDATE [table name] SET [field] = CONVERT(BINARY CONVERT([field] USING latin2) USING utf8)
as advised here:
strange character encoding of stored data , old script is showing them fine new one doesn't
Hope that the above solution will be helpful for others too.
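For anyone wondering what that UPDATE does: it turns the stored text back into the bytes it was made from (latin2), then reads those bytes as UTF-8. A rough Python equivalent of the same two steps, as an illustration only; it assumes Python's iso8859_2 codec matches MySQL's latin2 for the bytes involved, and the function name repair_double_encoded is hypothetical:

```python
def repair_double_encoded(text, wrong_charset="iso8859_2"):
    """Undo 'UTF-8 bytes were read as wrong_charset' by reversing both steps."""
    # Step 1: recover the raw bytes, like BINARY CONVERT(field USING latin2).
    raw = text.encode(wrong_charset)
    # Step 2: interpret those bytes as UTF-8, like CONVERT(... USING utf8).
    return raw.decode("utf-8")


# Simulate the damage: UTF-8 bytes of a Polish letter read as latin2.
broken = "ę".encode("utf-8").decode("iso8859_2")

print(repair_double_encoded(broken))
```

This only works for rows that were damaged in exactly this way, so it is worth trying on a copy of the table first.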
SET NAMES utf8;
(The default is latin1, which leads to Ä™.)
Note: init_connect is not executed for root (or any SUPER) user. So this failed you:
init_connect='SET NAMES utf8'
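To see where Ä™ comes from, here is a small Python illustration: the letter ę is two bytes in UTF-8, and reading those two bytes as single-byte characters gives two unrelated characters. MySQL's latin1 is based on Windows-1252, so the cp1252 codec is used here as a stand-in; that substitution is an assumption of this sketch.

```python
original = "ę"                         # U+0119
utf8_bytes = original.encode("utf-8")  # two bytes in UTF-8

# Read the same two bytes as if each were one cp1252 character.
misread = utf8_bytes.decode("cp1252")

print(utf8_bytes)
print(misread)
```

The first byte becomes Ä and the second becomes ™, which matches what Word displayed.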