Ignore character encoding set by MySQL client - mysql

I'm connecting to a MySQL server (v5.7.21) that is configured to use utf8mb4 encoding and to ignore the encoding set by the client. Here are the relevant sections of my my.ini config file (I'm running Windows):
[client]
default-character-set=utf8mb4
[mysql]
default-character-set=utf8mb4
[mysqld]
character-set-client-handshake=FALSE
skip-character-set-client-handshake #I've added this but it has no effect
character-set-server=utf8mb4
collation-server=utf8mb4_unicode_ci
My client is an ETL tool (Talend) which uses the JDBC driver mysql-connector-java-5.1.30-bin.jar.
If I set the driver property characterEncoding=utf8, this overrides the character encoding set by the server (utf8mb4), as shown by this query:
SHOW VARIABLES
WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%'
|=-----------------------+-----------------=|
|Variable_Name |Value |
|=-----------------------+-----------------=|
|character_set_client |utf8 |
|character_set_connection|utf8 |
|character_set_database |utf8mb4 |
|character_set_filesystem|binary |
|character_set_results | |
|character_set_server |utf8mb4 |
|character_set_system |utf8 |
|collation_connection |utf8_general_ci |
|collation_database |utf8mb4_unicode_ci|
|collation_server |utf8mb4_unicode_ci|
'------------------------+------------------'
If I don't set any driver properties on the connection, the utf8mb4 encoding is used (which is what is expected).
It seems setting character-set-client-handshake=FALSE and skip-character-set-client-handshake has no effect.
How can I prevent the client from changing the encoding and force it to always use utf8mb4?
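For context on why the utf8 versus utf8mb4 distinction in this question matters, here is a minimal Python sketch. It relies on the fact that MySQL's legacy utf8 character set stores at most 3 bytes per character, while utf8mb4 allows 4. The function names and sample strings are hypothetical, chosen only for illustration; the sketch does not touch MySQL or the JDBC driver.

```python
def max_utf8_bytes_per_char(text: str) -> int:
    """Return the largest number of UTF-8 bytes any single character needs."""
    return max((len(ch.encode("utf-8")) for ch in text), default=0)


def fits_in_3_byte_utf8(text: str) -> bool:
    """True if every character fits in 3 bytes, the limit of MySQL's legacy utf8."""
    return max_utf8_bytes_per_char(text) <= 3


# Accented Latin and CJK characters fit in 3 bytes or fewer.
assert fits_in_3_byte_utf8("Cuar\u00f3n")
assert fits_in_3_byte_utf8("\u65e5\u672c\u8a9e")

# Characters outside the Basic Multilingual Plane (here U+1F600) need 4 bytes.
assert max_utf8_bytes_per_char("\U0001F600") == 4
assert not fits_in_3_byte_utf8("\U0001F600")
```

If the connection is downgraded to utf8, strings of the last kind are the ones likely to be rejected or truncated.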

Related

MySQL server has gone away error. --max_allowed_packet=2G doesn't work either

I am constantly getting the following error while trying to import a MySQL table.
ERROR 2006 (HY000) at line 15692: MySQL server has gone away
The error occurs when inserting entries of a table with a longblob field. I have tried everything suggested on the internet, like using --max_allowed_packet, exporting and importing explicitly in utf8, exporting with --hex-blob, increasing wait_timeout and interactive_timeout, etc., but nothing works!
I dug a bit deeper and noticed that the value of --max_allowed_packet isn't being set properly. I am using LAMPP, and in the file /opt/lampp/etc/my.cnf, I have the following under the [mysqld] section.
max_allowed_packet = 2G
However, MariaDB still shows that its value is set to only 1 MB. Why is that? I stopped and restarted the LAMPP server, but to no avail. Even setting this parameter from the command line, as follows, doesn't work!
/opt/lampp/bin/mysql -h localhost --max_allowed_packet=2G -u root -p
In both cases, when I query its value, I get the following.
MariaDB [(none)]> SHOW VARIABLES LIKE 'max_allowed_packet';
--------------
SHOW VARIABLES LIKE 'max_allowed_packet'
--------------
+--------------------+---------+
| Variable_name | Value |
+--------------------+---------+
| max_allowed_packet | 1048576 |
+--------------------+---------+
How can I solve this problem? Note that I am logged in as root.
OK, I have solved the problem. Here is how I did it.
Inside a MySQL shell (open one by typing /opt/lampp/bin/mysql -h localhost -u root -p), set the value of max_allowed_packet like this:
SET GLOBAL max_allowed_packet=1073741824;
The new global value applies to connections opened after the change, so exit that MySQL shell, open a new one, and type the following:
SHOW VARIABLES LIKE 'max_allowed_packet';
It displays the correct value, as shown below.
+--------------------+------------+
| Variable_name | Value |
+--------------------+------------+
| max_allowed_packet | 1073741824 |
+--------------------+------------+

font to display utf-8 in cmd console

Hiho,
I'm trying to get some Asian UTF-8 characters, which are read from a mysql database, to display properly in the command line prompt. I'll go through the steps I've gone through in case someone else is after the same thing...
1: First I made sure the database was encoded in UTF-8 and set the global character-set variables to utf8:
mysql> show variables like 'char%';
+--------------------------+---------------------------------------------------------+
| Variable_name | Value |
+--------------------------+---------------------------------------------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | utf8 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | utf8 |
| character_set_system | utf8 |
After that I could compare entries in my tables, but they still weren't displayed properly.
2: In the properties of cmd, I noticed it was using code page 850, so I looked up how to change that and switched to code page 65001 (UTF-8) before running mysql:
cmd> chcp 65001
cmd> "C:\Program Files\MySQL\MySQL Server 5.7\bin\mysql.exe" "--defaults-file=C:\ProgramData\MySQL\MySQL Server 5.7\my.ini" "-uroot" "-p" "--default-character-set=utf8"
3: After checking the fonts available in the command line, I found that neither of them could display what I needed. Unifont is a fixed-size font that seems to be able to, though.
And that's where I am now. After installing it, I just can't seem to add Unifont to the list of fonts in the command prompt. I found this tutorial on registry editing to add it, but no luck. The other available fonts were there, though. Even replacing the current one just caused it to use the default font instead. It seems strange. Any ideas?
The font in cmd needs to be Lucida Console.
More discussion.
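The code-page problem from step 2 of the question can be reproduced outside the console. This is a rough Python sketch, assuming Python's cp850 codec is a fair stand-in for the console's code page 850; the helper names and the sample string are hypothetical.

```python
def can_encode(text: str, codec: str) -> bool:
    """True if every character in text exists in the given code page."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


def misread(text: str, wire_codec: str, console_codec: str) -> str:
    """Encode text as the client sends it, then decode it as the console would."""
    return text.encode(wire_codec).decode(console_codec)


sample = "\u65e5\u672c"  # two CJK characters

# Code page 850 has no CJK characters at all; UTF-8 (code page 65001) does.
assert not can_encode(sample, "cp850")
assert can_encode(sample, "utf-8")

# UTF-8 bytes shown by a cp850 console turn into six unrelated characters.
garbled = misread(sample, "utf-8", "cp850")
assert garbled != sample
assert len(garbled) == 6
```

Switching the code page only fixes the decoding half; the console font still has to contain glyphs for the characters, which is the part the question is stuck on.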

AWS RDS Parameter Group not changing MySQL encoding

I am running a MySQL database on RDS. I want to change all of my encodings to utf8mb4. I created a parameter group on RDS with all character_set_* parameters as utf8mb4, assigned it to my RDS instance, and then rebooted the instance. However, when I run SHOW VARIABLES LIKE '%char%' on my DB, there are still values of latin1, which I do not want:
character_set_client latin1
character_set_connection latin1
character_set_database utf8mb4
character_set_filesystem binary
character_set_results latin1
character_set_server utf8mb4
character_set_system utf8
character_sets_dir /rdsdbbin/mysql-5.6.22.R1/share/charsets/
Likewise, new columns that I create on the DB are latin1 encoded instead of utf8mb4 encoded. I can change the encoding values manually through the mysql command line, but this doesn't help since the values are also reset to latin1 when I push to production.
I think the issue is the distinction between VARIABLES and GLOBAL VARIABLES.
If you list the GLOBAL VARIABLES, they should reflect what you see in your parameter group (assuming you've rebooted as Naveen suggested in the other answer):
SHOW GLOBAL VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%';
This is in contrast to the session values you see in your regular VARIABLES:
SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%';
These can be overridden by options supplied when connecting, e.g. connecting with the --default-character-set option:
mysql -h YOUR_RDS.us-east-1.rds.amazonaws.com -P 3306 --default-character-set=utf8 -u YOUR_USERNAME -p
After changing the parameter group, do you see the warning "Pending Reboot" in the console? If yes, try rebooting the DB instance; the new character set should then be applied.
More information - http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_WorkingWithParamGroups.html
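To make the session versus global distinction from the first answer concrete, here is a small Python sketch that compares two result sets and lists the variables the connection has overridden. The session values are copied from the question; the global values are an assumption (what the parameter group is supposed to apply), and the function name is hypothetical. Nothing here connects to a database.

```python
def overridden_by_session(global_vars: dict, session_vars: dict) -> dict:
    """Map each variable whose session value differs to (global, session)."""
    return {
        name: (global_vars[name], value)
        for name, value in session_vars.items()
        if name in global_vars and global_vars[name] != value
    }


# Assumed output of SHOW GLOBAL VARIABLES after the parameter group is applied.
global_vars = {
    "character_set_client": "utf8mb4",
    "character_set_connection": "utf8mb4",
    "character_set_database": "utf8mb4",
    "character_set_results": "utf8mb4",
    "character_set_server": "utf8mb4",
}

# Output of SHOW VARIABLES as reported in the question.
session_vars = {
    "character_set_client": "latin1",
    "character_set_connection": "latin1",
    "character_set_database": "utf8mb4",
    "character_set_results": "latin1",
    "character_set_server": "utf8mb4",
}

overrides = overridden_by_session(global_vars, session_vars)
for name, (global_value, session_value) in sorted(overrides.items()):
    print(f"{name}: global={global_value} session={session_value}")
```

Under these assumptions, only the three connection-level variables differ, which points at the client's connection options rather than at the parameter group.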

UTF-8 input problems with Adminer and phpmyadmin

I recently switched my MariaDB database to UTF-8 from Latin1. Read a bunch of checklists and carefully updated my character set, collation, my.cnf and php.ini. I have php forms for most of my data entry on the site, but sometimes for quick little changes, it's easier to go into a program like Adminer or phpmyadmin.
With the UTF-8 in place, I wanted to change director Alfonso Cuaron's name to Cuarón. I went to his entry in Adminer. Edit. Cuar[alt+0243]n. It showed in the edit box as Cuarón. But when I saved the change, Adminer showed it as Cuarón. Okay. Looked at page info in Firefox. Says the character encoding of the page is UTF-8. So all should be well, right?
I went to one of my php data entry forms and created a Bob Cuarón. It showed up fine.
I SSH'd into the server, fired up a mysql command line, and ran an UPDATE statement with Cuarón. That worked. But trying to change it in Adminer still kept giving me Cuarón. I installed phpmyadmin (which was giving me some issues with my nginx config) but I was able to edit his name and...sigh...it too gave me Cuarón. I installed SQLbuddy and...success...I was able to make the changes, but the program is lacking some of the things I need, like the ability to edit search results.
I'm sure I've nailed everything down:
nginx.conf:
charset UTF-8;
my.cnf:
[client]
default-character-set=utf8
[mysqld]
character-set-server=utf8
collation-server=utf8_general_ci
init-connect='SET NAMES utf8'
/etc/php5/fpm/php.ini
mbstring.language = Neutral
mbstring.internal_encoding = UTF-8
mbstring.encoding_translation = On
mbstring.http_input = auto
mbstring.http_output = UTF-8
mbstring.detect_order = auto
mbstring.substitute_character = none
default_charset = UTF-8
SHOW VARIABLES LIKE "%character_set%";
+--------------------------+----------------------------+
| Variable_name | Value |
+--------------------------+----------------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | utf8 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | utf8 |
| character_set_system | utf8 |
| character_sets_dir | /usr/share/mysql/charsets/ |
I can't see what I could be missing. Both Adminer and phpmyadmin handle UTF-8 so I don't know why it's not working. It worked right out of the box with SQLBuddy, but as I said it's missing some features.
Any thoughts where I should look?
UPDATE: It turns out that an article I had read (forgot to bookmark, sorry) on UTF-8 migrations had me change some of the mbstring settings away from PHP's defaults. Someone at Adminer noticed that, and once I reverted them all was good. See their response here:
https://sourceforge.net/p/adminer/discussion/960418/thread/33595373/#42df
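The symptom in this question, one accented letter turning into two odd characters, is what appears when UTF-8 bytes are read as a single-byte encoding somewhere along the way. The Python sketch below reproduces only the symptom; it does not identify which layer (PHP's mbstring, the connection, or the browser) did the misreading, and the function name is hypothetical.

```python
def read_utf8_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then misinterpret each byte as one latin-1 character."""
    return text.encode("utf-8").decode("latin-1")


name = "Cuar\u00f3n"  # Cuarón, with o-acute as a single character

# o-acute is two bytes in UTF-8 (0xC3 0xB3); read byte by byte they become
# A-tilde followed by superscript three, the pair seen in Adminer.
garbled = read_utf8_as_latin1(name)
assert garbled == "Cuar\u00c3\u00b3n"
assert len(garbled) == len(name) + 1
```

Plain ASCII text passes through unchanged, which is why such a misconfiguration tends to go unnoticed until the first accented character is saved.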

Encoding error with Rails 2.3 on Ruby 1.9.3

I'm in the process of upgrading an old legacy Rails 2.3 app to something more modern and running into an encoding issue. I've read all the existing answers I can find on this issue but I'm still running into problems.
Rails ver: 2.3.17
Ruby ver: 1.9.3p385
My MySQL tables are default charset: utf8, collation: utf8_general_ci. Prior to 1.9 I was using the original mysql gem without incident. After upgrading to 1.9, retrieving anything containing utf8 characters produced this well-documented error:
ActionView::TemplateError (incompatible character encodings: ASCII-8BIT and UTF-8)
I switched to the mysql2 gem for its superior handling and I no longer see exceptions, but things are definitely not encoding correctly. For example, what appears in the DB as the string Repoussé is being rendered by Rails as Repoussé, “Boat” appears as “Boatâ€, etc.
A few more details:
I see the same results when I use the ruby-mysql gem as the driver.
I've added encoding: utf8 lines to each entry in my database.yml
I've also added the following to my environment.rb:
Encoding.default_external = Encoding::UTF_8
Encoding.default_internal = Encoding::UTF_8
It has occurred to me that I may have some mismatch where latin1 was being written by the old version of the app into the utf8 fields of the database or something, but all of the characters appear correctly when viewed in the mysql command line client.
Thanks in advance for any advice, much appreciated!
UPDATE: I now believe that the issue is that my utf8 data is being coerced through a binary conversion into latin1 on the way out of the db, I'm just not sure where.
mysql> SELECT CONVERT(CONVERT(name USING BINARY) USING latin1) AS latin1, CONVERT(CONVERT(name USING BINARY) USING utf8) AS utf8 FROM items WHERE id=myid;
+-------------+----------+
| latin1 | utf8 |
+-------------+----------+
| Repoussé | Repoussé |
+-------------+----------+
I have my encoding set to utf8 in database.yml, any other ideas where this could be coming from?
I finally figured out what my issue was. While my databases were encoded with utf8, the app with the original mysql gem was injecting latin1 text into the utf8 tables.
What threw me off was that the output from the mysql command line client looked correct. It is important to verify that your terminal, the database fields and the MySQL client are all running in utf8.
MySQL's client runs in latin1 by default. You can discover what it is running in by issuing this query:
show variables like 'char%';
If set up properly for utf8, you should see:
+--------------------------+----------------------------+
| Variable_name | Value |
+--------------------------+----------------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | utf8 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | utf8 |
| character_set_system | utf8 |
| character_sets_dir | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
If these don't look correct, make sure the following is set in the [client] section of your my.cnf config file:
default-character-set = utf8
And add the following to the [mysqld] section:
# use utf8 by default
character-set-server=utf8
collation-server=utf8_general_ci
Make sure to restart the mysql daemon before relaunching the client and then verify.
NOTE: This doesn't change the charset or collation of existing databases, just ensures that any new databases created will default into utf8 and that the client will display in utf8.
After I did this, I saw characters in the mysql client that matched what I was getting from the mysql2 gem. I was also able to verify that this content was latin1 by temporarily switching to "encoding: latin1" in my database.yml.
One extremely handy query to find issues is using char length to find the rows with multi-byte characters:
SELECT id, name FROM items WHERE LENGTH(name) != CHAR_LENGTH(name);
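The query above works because LENGTH() counts bytes while CHAR_LENGTH() counts characters, so the two differ only when a value contains multi-byte characters. The same comparison can be sketched in Python; the rows below are hypothetical stand-ins for the items table and the helper names are made up for illustration.

```python
def byte_length(text: str) -> int:
    """Number of bytes the text occupies in UTF-8 (like LENGTH on a utf8 column)."""
    return len(text.encode("utf-8"))


def has_multibyte(text: str) -> bool:
    """True when byte count and character count differ (LENGTH != CHAR_LENGTH)."""
    return byte_length(text) != len(text)


# Hypothetical (id, name) rows.
rows = [(1, "Repouss\u00e9"), (2, "Boat"), (3, "\u201cBoat\u201d")]

flagged = [row_id for row_id, name in rows if has_multibyte(name)]
assert flagged == [1, 3]

# e-acute takes two bytes in UTF-8, so the name has 8 characters but 9 bytes.
assert len("Repouss\u00e9") == 8
assert byte_length("Repouss\u00e9") == 9
```

Rows flagged this way are the candidates to inspect for double-encoded text; pure ASCII rows cannot be affected.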
There are a lot of scripts out there to convert latin1 contents to utf8, but what worked best for me was dumping all of the databases as latin1 and stuffing the contents back in as utf8:
mysqldump -u root -p --opt --default-character-set=latin1 --skip-set-charset DBNAME > DBNAME.sql
mysql -u root -p --default-character-set=utf8 DBNAME < DBNAME.sql
I backed up my primary db first, then dumped into a test database and verified like crazy before rolling over to the corrected DB.
My understanding is that MySQL's translation can leave some things to be desired with certain more complex characters but since most of my multibyte chars are fairly common things (accent marks, quotes, etc), this worked great for me.
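Why the dump-as-latin1, reload-as-utf8 round trip repairs the data can be sketched in Python. This is a simplified model under stated assumptions: Python's latin-1 codec stands in for MySQL's latin1 (which is closer to Windows-1252, so characters such as curly quotes would behave differently here), and both function names are hypothetical.

```python
def store_via_latin1_connection(text: str) -> str:
    """Simulate an app sending UTF-8 bytes over a connection declared as latin1.

    The server takes each byte for one latin1 character, so the text that
    ends up in the utf8 column is double-encoded.
    """
    return text.encode("utf-8").decode("latin-1")


def dump_as_latin1_reload_as_utf8(stored: str) -> str:
    """Simulate the repair: convert back to latin1 bytes, then read them as UTF-8."""
    return stored.encode("latin-1").decode("utf-8")


original = "Repouss\u00e9"  # Repoussé

# What a utf8 connection reads back: e-acute has become A-tilde plus copyright sign.
stored = store_via_latin1_connection(original)
assert stored == "Repouss\u00c3\u00a9"

# Dumping as latin1 recovers the original UTF-8 bytes; reloading as utf8 reads them correctly.
assert dump_as_latin1_reload_as_utf8(stored) == original
```

The same model explains why the old latin1 client showed correct-looking text: it applied the reverse conversion on the way out, hiding the double encoding.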
Some resources that proved invaluable in sorting all of this out:
Derek Sivers guide on transforming MySQL data latin1 in utf8 -> utf8
Blue Box article on MySQL character set hell
Simple table conversion instructions on Stack Overflow
You say it all looks OK in the command line client, but perhaps your Terminal's character encoding isn't set to show UTF8? To check in OS X Terminal, click Terminal > Preferences > Settings > Advanced > Character Encoding. Also, check using a graphical tool like MySQL Query Browser at http://dev.mysql.com/downloads/gui-tools/5.0.html.