Character encoding - MySQL 5.1.67 on CentOS 6.3

I've spent some hours looking for an answer to this one, and it does not seem to be covered yet. I will try to be as succinct as possible.
I have a client running a web app with some character encoding problems. Specifics:
CentOS 6.3
MySQL Server 5.1.67
ALL tables set to UTF-8
SHOW FULL COLUMNS verifies also that all text columns in each table are UTF-8
Data going into database is UTF-8
Data is served to web clients as UTF-8
No bad characters in output data; forcing browser to Latin1 causes all manner of problems
So, basically, everything is UTF-8, and everything is working perfectly.
EXCEPT... all connections to MySQL must be Latin1, or else the whole system falls apart. I have verified this with the console MySQL client, PHP/Mysql, PHP/Mysqli and the Ruby mysql gem v2.8.1. Issuing a command like "SET NAMES utf8" or using the various API methods to change the connection character set to UTF-8 will cause all multibyte characters to become garbled and unrecognizable.
At the moment there is no major problem here, except of course that using a Latin1 connection to the server does not work at all in my Ubuntu test environment and so my programs keep breaking when I move them into production. But I have a nagging feeling that something cannot possibly be right and it's going to come back and bite my client later on.
MySQL is reporting this upon initial connection to the server:
character_set_client: latin1
character_set_connection: latin1
character_set_database: latin1
character_set_filesystem: binary
character_set_results: latin1
character_set_server: latin1
character_set_system: utf8
And these are the settings that work for UTF-8 data. If I change anything else to UTF-8, multi-byte characters die a miserable death.
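One common cause of exactly this symptom is double-encoded data: the application sends UTF-8 bytes over a connection declared as latin1, so the server treats every byte as a separate Latin-1 character and converts it to UTF-8 a second time before storing it. Whether that is what happened on this server is an assumption. The sketch below only simulates the byte-level effect in Python; it uses Python's latin-1 codec as a stand-in for MySQL's latin1 (which is closer to cp1252), and the function names are hypothetical.

```python
def store_via_latin1_connection(text: str) -> bytes:
    """Simulate writing UTF-8 text over a connection declared as latin1."""
    client_bytes = text.encode("utf-8")        # what the application really sends
    misread = client_bytes.decode("latin-1")   # server believes each byte is one Latin-1 char
    return misread.encode("utf-8")             # server converts that to the utf8 column


def read_via_latin1_connection(stored: bytes) -> bytes:
    """Simulate reading back over latin1: the server undoes its own conversion."""
    return stored.decode("utf-8").encode("latin-1")


def read_via_utf8_connection(stored: bytes) -> str:
    """Simulate reading back over utf8: the column content is returned as stored."""
    return stored.decode("utf-8")


if __name__ == "__main__":
    stored = store_via_latin1_connection("é")
    # Over latin1 the original UTF-8 bytes come back, so the page looks fine.
    print(read_via_latin1_connection(stored).decode("utf-8"))
    # Over utf8 the double-encoded text is exposed as mojibake.
    print(read_via_utf8_connection(stored))
```

If this is the situation, the latin1 connection is hiding the damage rather than causing it, which would explain why SET NAMES utf8 makes every multi-byte character look garbled.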
If anybody has any idea what is going wrong here and where I can find any better documentation on it, that would be enormously appreciated.

Related

Set encoding character set to utf8mb4 in Node.js or MySQL client?

When we want to store a string with emojis to our MySQL database, we get this error:
Conversion from collation utf8_general_ci into utf8mb4_unicode_520_ci impossible for parameter
Can someone explain to me why Node is sending it as utf8 to MySQL and how I can declare that we have a utf8mb4 collation? I tried several different encodings on the database side, but I think a Node.js or driver configuration is missing.
The backend is based on Express.js, running on a Debian system within an lts-alpine docker container and the request is sent by a React Native app.
Thx, Florian

Convert phpMyAdmin exported database to older version. utf8mb4 issues

I want to move my MySQL database to an older version server (5.7 to 5.1).
I get errors because the dump is created using utf8mb4.
If I manually change utf8mb4 to utf8, the data becomes unreadable because it is multilingual.
I have access only to phpMyAdmin in both servers so I can't use mysqldump.
Any ideas?
It seems I've figured out a solution.
On export, use the MYSQL40 compatibility mode, replace utf8mb4 with utf8 in the dump, and then change the collation of the tables to utf8_unicode_ci in phpMyAdmin.
Hope this saves someone some time in the future.
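The search-and-replace step could be scripted roughly as follows. This is a minimal sketch, assuming the exported dump fits in memory as text; the function name is hypothetical. It maps every utf8mb4 collation to utf8_unicode_ci (the collation mentioned above) and then rewrites the remaining utf8mb4 charset names.

```python
import re


def downgrade_dump(sql: str) -> str:
    """Rewrite utf8mb4 charset and collation names in a SQL dump to utf8."""
    # Collations first, so that e.g. utf8mb4_unicode_520_ci becomes utf8_unicode_ci.
    sql = re.sub(r"\butf8mb4_[0-9a-z_]+\b", "utf8_unicode_ci", sql)
    # Then the bare character set name.
    return re.sub(r"\butf8mb4\b", "utf8", sql)


if __name__ == "__main__":
    sample = (
        "CREATE TABLE t (a VARCHAR(10)) "
        "DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_520_ci;"
    )
    print(downgrade_dump(sample))
```

Note that this rewrites every occurrence in the file, including any that happen to appear inside row data, and that characters needing four bytes in UTF-8 will still not fit into a utf8 column on the older server.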

ejabberd error with utf8mb4

I have ejabberd 16.01, which works well with MySQL. The problem is that it stores only the regular emoji, not the 4-byte emoji. If I connect to the database from a terminal, set the charset to utf8mb4 and run an insert query, I can insert all types of icons, so the db is configured correctly!
But when ejabberd puts a message in offline storage, all 4-byte icons become "?????????". Is there a way to set the charset to utf8mb4 for ejabberd's mod_offline?
How can I fix it? Do you have any ideas?
Thanks!
You need two things:
Ensure you have ejabberd 16.02 or newer, which forces UTF8MB4 as the default for emoji retrieval (insertion has used UTF8MB4 for years, provided the table is correctly defined).
Ensure your MySQL schema has been properly created with UTF8MB4 support.
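To see which messages are affected: MySQL's utf8 character set stores at most three bytes per character, while many emoji need four bytes in UTF-8 and therefore require utf8mb4. A small sketch (the helper name is hypothetical) that flags such text:

```python
def needs_utf8mb4(text: str) -> bool:
    """Return True if any character takes four bytes in UTF-8."""
    return any(len(ch.encode("utf-8")) == 4 for ch in text)


if __name__ == "__main__":
    # Accented letters and the euro sign fit in three bytes or fewer.
    print(needs_utf8mb4("h\u00e9llo \u20ac"))
    # U+1F600 (grinning face) lies outside the Basic Multilingual Plane.
    print(needs_utf8mb4("hi \U0001F600"))
```

Running a check like this over a few stored messages can help tell whether the "?" characters come from the column definition or from the connection charset.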

How to change or convert utf8_general_ci to binary in MySQL?

When I tested the new version (1.20wmf4) of MediaWiki I saw (see: screen capture) the following message about database character set:
In binary mode, MediaWiki stores UTF-8 text to the database in binary fields. This is more efficient than MySQL's UTF-8 mode, and allows you to use the full range of Unicode characters.
In UTF-8 mode, MySQL will know what character set your data is in, and can present and convert it appropriately, but it will not let you store characters above the Basic Multilingual Plane.
I have my own wiki on the MediaWiki engine, but my tables use the utf8_general_ci collation. My question is: how can I easily change the collation from utf8_general_ci to binary in an existing database?
My MediaWiki version: 1.19.0
My MySQL info:
Server: Localhost via UNIX socket
Server version: 5.1.52
Protocol version: 10
MySQL charset: UTF-8 Unicode (utf8)
I had to do something similar not too long ago and followed the instructions here: http://www.mediawiki.org/wiki/Manual:Backing_up_a_wiki#Latin-1_to_UTF-8_conversion. You basically have to export the database, replace utf8_general_ci with binary in the exported SQL, and then import the database again. The sed line in those instructions wasn't quite right but you can also manually edit your dumped SQL file and fix any instances of utf8_general_ci.

Telling MySQL connection to use UTF-8 with Django

I've uploaded some data to a MySQL (5.5.15 for osx10.6) database using UTF8 encoding, though for some reason I had to specify its encoding as latin1 when I LOADed it.
I reckon this part is good because when I write to an OUTFILE, my unicode 'nu' characters come out OK in a terminal and in Vim.
However, when I look at them within a MySQL session, and when I try to edit the fields from Django admin, I get mangled characters (latin1?).
So, my question is: how do I tell a MySQL client and (especially) Django to read my database as UTF-8, the way it oughta?
At the command line, I tried
--default_character_set=utf8
and also
'SET NAMES UTF8;'
at the MySQL prompt, but they do not work.
When I look at VARIABLES LIKE 'char%', they're all set to utf8 apart from character_set_server which is latin1. If I set it to utf8.... that doesn't work either.
I'd be grateful if someone could give me some pointers here, especially about how to configure Django to talk to my database properly.
Thanks!
Add in your .cnf:
[mysqld]
character-set-server = utf8
collation-server = utf8_general_ci
skip-character-set-client-handshake
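On the Django side, the connection character set can also be stated explicitly in settings.py. Django's MySQL backend passes the OPTIONS dictionary through to the database driver, and charset is one of the connection parameters the driver accepts. This is only a sketch: the database name, user, password and host below are placeholders, not values from the question.

```python
# settings.py fragment (placeholder credentials, adjust to your setup)
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "mydb",            # placeholder database name
        "USER": "myuser",          # placeholder user
        "PASSWORD": "change-me",   # placeholder password
        "HOST": "localhost",
        "OPTIONS": {
            # Ask the driver to open the connection with this character set.
            "charset": "utf8",
        },
    }
}
```

If the rows were loaded while the connection was declared as latin1, the stored data itself may be double-encoded, in which case changing the connection charset alone will not make it display correctly.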