AWS RDS Parameter Group not changing MySQL encoding - mysql

I am running a MySQL database on RDS. I want to change all of my encodings to utf8mb4. I created a parameter group on RDS with all character_set_* parameters as utf8mb4, assigned it to my RDS instance, and then rebooted the instance. However, when I run SHOW VARIABLES LIKE '%char%' on my DB, there are still values of latin1, which I do not want:
character_set_client latin1
character_set_connection latin1
character_set_database utf8mb4
character_set_filesystem binary
character_set_results latin1
character_set_server utf8mb4
character_set_system utf8
character_sets_dir /rdsdbbin/mysql-5.6.22.R1/share/charsets/
Likewise, new columns that I create on the DB are latin1 encoded instead of utf8mb4 encoded. I can change the encoding values manually through the mysql command line, but this doesn't help since the values are also reset to latin1 when I push to production.

I think the issue is the distinction between VARIABLES and GLOBAL VARIABLES.
If you list the GLOBAL VARIABLES this should reflect what you see in your parameter group: (assuming you've rebooted as Naveen suggested in the other answer)
SHOW GLOBAL VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%';
This is opposed to what you see in your regular VARIABLES:
SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%';
The session values can be overridden by options supplied when connecting, e.g. by passing --default-character-set:
mysql -h YOUR_RDS.us-east-1.rds.amazonaws.com -P 3306 --default-character-set=utf8 -u YOUR_USERNAME -p
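The difference between the two listings can also be checked mechanically. Below is a minimal Python sketch, assuming the output of both SHOW statements has already been fetched into dictionaries. The session values are the ones from the question; the global values are an assumption (the question does not show them) and simply mirror the parameter group. diff_variables is a hypothetical helper name.

```python
def diff_variables(global_vars, session_vars):
    """Return {name: (global_value, session_value)} for variables that differ."""
    shared = sorted(global_vars.keys() & session_vars.keys())
    return {
        name: (global_vars[name], session_vars[name])
        for name in shared
        if global_vars[name] != session_vars[name]
    }


# Session values as reported by SHOW VARIABLES in the question.
session_vars = {
    "character_set_client": "latin1",
    "character_set_connection": "latin1",
    "character_set_database": "utf8mb4",
    "character_set_results": "latin1",
    "character_set_server": "utf8mb4",
}

# ASSUMPTION: global values as configured in the parameter group.
global_vars = {name: "utf8mb4" for name in session_vars}

overridden = diff_variables(global_vars, session_vars)
for name, (global_value, session_value) in overridden.items():
    print(f"{name}: global={global_value} session={session_value}")
```

Any variable this prints is being set per connection by the client, not by the parameter group.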

After changing the parameter group, do you see the status "Pending Reboot" in the console? If so, try rebooting the DB instance; the new character set should then be applied.
More information - http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_WorkingWithParamGroups.html

Related

MySQL only storing some Emojis in text field when using UTF8MB4

We have a project where we're storing Facebook and Twitter posts in a MySQL database. At first almost all emojis were being stored as ?. We've since made some configuration changes to the database server, and more emojis now save and display correctly. However, some emojis still show as ?. Sadly I'm not sure exactly which ones, but I know one of them was a basketball.
When I execute the following command on MySQL;
SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%'
OR Variable_name LIKE 'collation%';
I see the following settings;
character_set_client = utf8
character_set_connection = utf8
character_set_database = utf8mb4
character_set_filesystem = binary
character_set_results = utf8
character_set_server = utf8mb4
character_set_system = utf8
collation_connection = utf8_general_ci
collation_database = utf8mb4_unicode_ci
collation_server = utf8mb4_unicode_ci
Our database server is hosted with Rackspace, we've asked them to set up the following configuration;
[client]
default-character-set = utf8mb4
[mysql]
default-character-set = utf8mb4
[mysqld]
character-set-client-handshake = FALSE
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci
init-connect='SET NAMES utf8mb4'
I've tested output from the database using a number of clients, PHP, Java and MySQL Workbench.
I'm at a loss now as to why some Emojis are not saving, and I've followed as much advice as I can find on the web.
character_set_client/connection/results = utf8 -- These three are changed by SET NAMES. What you list seems to be before SET NAMES is executed.
If you are connecting as root, init-connect is not executed; perhaps this is why you don't see it.
Establish a non-SUPER user for all application work; that way the init-connect will be executed.
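A likely reason why only some emojis fail: MySQL's utf8 character set stores at most three bytes per character, so it cannot hold code points above U+FFFF, while utf8mb4 can. With character_set_client, character_set_connection and character_set_results still at utf8, four-byte characters would be expected to get lost on the way in or out even though the columns are utf8mb4. A small Python sketch (the function name is hypothetical) that lists the characters in a string which need utf8mb4:

```python
def chars_needing_utf8mb4(text):
    """Return the characters whose UTF-8 encoding is four bytes long."""
    return [ch for ch in text if len(ch.encode("utf-8")) == 4]


# U+263A WHITE SMILING FACE fits in three bytes;
# U+1F3C0 BASKETBALL AND HOOP needs four.
sample = "\u263a \U0001F3C0"
print(chars_needing_utf8mb4(sample))
```

Running this over a few of the stored posts should show whether the emojis that come back as ? are exactly the four-byte ones.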

Encoding error with polish charset during transfer of database / server seting up

I am trying to transfer one of my databases from one host (home.pl) to another (my newly set up server). The application I am trying to transfer is WordPress. Unfortunately, irrespective of the method used, I am struggling with encoding problems.
New host configuration
In my new server I am using the following directives in my.cnf:
[mysql]
default-character-set=utf8
[mysqld]
collation-server = utf8_general_ci
character-set-server = utf8
init_connect='SET collation_connection = utf8_general_ci'
init_connect='SET NAMES utf8'
[client]
default-character-set=utf8
My MySQL variables:
character_set_client utf8
character_set_connection utf8
character_set_database utf8
character_set_filesystem binary
character_set_results utf8
character_set_server utf8
character_set_system utf8
collation_connection utf8_general_ci
collation_database utf8_general_ci
collation_server utf8_general_ci
Php.ini on new server:
; PHP's default character set is set to UTF-8.
; http://php.net/default-charset
default_charset = "UTF-8"
Old host configuration
I ran SHOW VARIABLES on my old host, from which I am trying to transfer the database, and got the following:
character_set_client utf8
character_set_connection utf8mb4
character_set_database utf8
character_set_results utf8
character_set_server latin2
character_set_system utf8
character_sets_dir /usr/local/pssql55/share/charsets/
collation_connection utf8mb4_general_ci
collation_database utf8_polish_ci
collation_server latin2_general_ci
Transfer methods tried out
1) Transfer via phpmyadmin
I have tried using phpMyAdmin export/import. In particular, I selected UTF-8 as the file character set during both export and import.
Strangely, phpMyAdmin does not display the Polish characters on either the source server or the new host (the output is the same on both, without Polish characters).
2) Export / Import via mysql dump
I have tried also to use:
mysqldump -h OLD_HOST -u OLD_USER -p DB | mysql -h localhost -u root NEW DATABASE
but the encoding also fails.
I also tried specifying the character sets explicitly, but that failed too:
mysqldump --default-character-set=latin1 | mysql --default-character-set=utf8
Dump file
In my dump file, opened in Programmer's Notepad with UTF-8 encoding set, the characters look like this:
"Ä" instead of "ę"
Opening the file in Microsoft Word I see
Ä™ instead of "ę"
The encoding converter (gżegżółka) recognises that the file is in:
C:\Users\mkondej001\Desktop\14271425_mk.sql
Kodowanie: Unicode UTF-8
EOL: LF (Unix)
Any clues how to transfer DB / set server variables correctly ?
In the end I found out that the problem was that the data had been written to the database incorrectly on my original server.
I ended up transferring the DB using:
mysqldump --default-character-set=utf8 [ORIGINAL_DB] | mysql [TARGET_DB] --default-character-set=utf8
and then executing:
UPDATE [table name] SET [field] = CONVERT(BINARY CONVERT([field] USING latin2) USING utf8)
as advised here:
strange character encoding of stored data , old script is showing them fine new one doesn't
Hope that the above solution will be helpful for others too.
SET NAMES utf8;
(The default is latin1, which leads to Ä™.)
Note: init_connect is not executed for root (or any SUPER) user. So this failed you:
init_connect='SET NAMES utf8'
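The "Ä™" symptom and the CONVERT(BINARY CONVERT(... USING latin2) USING utf8) repair can be reproduced outside MySQL. The following Python sketch is an illustration only: it assumes the text was stored as UTF-8 bytes that were misread as latin2 (ISO 8859-2), which is what the UPDATE above implies, and repair_double_encoded is a hypothetical helper name. MySQL's latin1 corresponds to cp1252, which is used here for the display example.

```python
def repair_double_encoded(text, wrong_charset="iso8859_2"):
    """Undo one round of mis-decoding: recover the raw bytes, then read them as UTF-8."""
    return text.encode(wrong_charset).decode("utf-8")


original = "\u0119"                      # the Polish letter e with ogonek
utf8_bytes = original.encode("utf-8")    # two bytes: 0xC4 0x99

# What a reader that assumes latin1 (cp1252) shows for those two bytes.
as_latin1 = utf8_bytes.decode("cp1252")

# ASSUMPTION: on the old server the UTF-8 bytes were taken for latin2 text.
stored = utf8_bytes.decode("iso8859_2")
repaired = repair_double_encoded(stored)

print(repr(as_latin1), repr(stored), repr(repaired))
```

The second byte (0x99) is an invisible control character in latin2, which would explain why the dump shows only "Ä" in a UTF-8 editor.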

character-set-server VS default-character-set in MySQL

What's the difference between character-set-server and default-character-set in my.cnf? I want to set MySQL's connection to UTF8 and both of these seem to work. Is one better than the other?
Always use
character-set-server
The server character set and collation are used as default values if the database character set and collation are not specified in CREATE DATABASE statements. They have no other purpose (source)
character-set-server has replaced default-character-set setting, since default-character-set is now deprecated and can cause problems.
p.s. I believe the answer by newtover is wrong.
Here is a quote from MySQL docs:
You can force client programs to use specific character set as follows:
[client]
default-character-set=charset_name
This is normally unnecessary. However, when character_set_system differs from character_set_server or character_set_client, and you input characters manually (as database object identifiers, column values, or both), these may be displayed incorrectly in output from the client or the output itself may be formatted incorrectly. In such cases, starting the mysql client with --default-character-set=system_character_set—that is, setting the client character set to match the system character set—should fix the problem.
In other words, character_set_server and character_set_client are settings for mysqld, whereas default-character-set is a setting for mysql and other client libraries, which overrides the character_set_client that mysqld would otherwise assume.
You may not see the difference if you connect with mysql to localhost, but default-character-set is used as well when you connect to some other server, which may have other defaults.
UPD from 2018-08-17
As John Smith noticed, my answer is currently outdated, but the essence of it is still correct: character_set_server is a server variable, but when you connect to the mysqld with a client, you should specify the client and connection settings.
These days many computers used as clients and servers have UTF-8 as the default locale encoding, and only because of that, setting character-set-server instead of default-character-set might seem to work.
To be clear, to set up mysqld (that is, the server) to use utf-8 and one of its collations as the default for schema names, table names, column names and column values instead of latin1_swedish_ci, you should set character-set-server in the mysqld configuration.
But when you connect a mysql client to the server, your current charset may be any other, and to correctly convert to the server character set the data you send over a connection and to convert back the data sent from the server as response, you should specify the corresponding client settings:
SET character_set_client = charset_name;
SET character_set_results = charset_name;
SET character_set_connection = charset_name;
or the corresponding settings in mysql.ini for your client application. If all of them are the same, you can start your communication with server with a shorter statement:
SET NAMES 'charset_name' [COLLATE 'collation_name'];
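For application code that issues this statement right after connecting, here is a minimal Python sketch of building it. build_set_names is a hypothetical helper, not part of any MySQL driver, and the identifier check is an assumption about which characters charset and collation names may contain. Many drivers also accept a charset option at connect time, which is usually preferable when available.

```python
import re

# ASSUMPTION: charset and collation names consist of letters, digits and underscores.
_NAME = re.compile(r"^[A-Za-z0-9_]+$")


def build_set_names(charset, collation=None):
    """Build a SET NAMES statement, rejecting anything that is not a plain identifier."""
    for value in (charset, collation):
        if value is not None and not _NAME.match(value):
            raise ValueError(f"unexpected character set or collation name: {value!r}")
    statement = f"SET NAMES '{charset}'"
    if collation is not None:
        statement += f" COLLATE '{collation}'"
    return statement


print(build_set_names("utf8mb4", "utf8mb4_unicode_ci"))
```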

Change variable in MySQL

In MySQL, how do I change a variable such as character_set_client?
mysql> show variables like 'character_set%';
-------------------------+-------
character_set_client | latin1
to obtain
character_set_client | utf8
When starting MySQL client you have to specify --default-character-set=charset_name
From manual:
Use charset_name as the default character set for the client and
connection.
A common issue that can occur when the operating system uses utf8
or another multi-byte character set is that output from the mysql
client is formatted incorrectly, due to the fact that the MySQL
client uses the latin1 character set by default. You can usually
fix such issues by using this option to force the client to use the
system character set instead.
For example:
$>mysql -uUser -pPassword --default-character-set=utf8
For an example of how to set it via connection string see here.

mysql encoding encrypted text

I'm currently attempting to switch from my shared inmotionhosting account (have received AWFUL service lately) to an Amazon EC2 server that I've set up. I'm having trouble getting the encryption function working on the EC2 server.
In my PHP code, all text gets encrypted by mcrypt before being put into the SQL. I have deduced that those mcrypt characters are responsible for all my queries throwing errors. (I know it's because of encoding issues, but Google searches on the subject aren't very clear on where I need to focus my attention.)
A more simplified way of explaining the problem. On my new hosting account this SQL query doesn't work:
UPDATE mydatabase.clients SET firstname='\'å».”é¶Q' WHERE id_client=65
But this does
UPDATE mydatabase.clients SET firstname='Test' WHERE id_client=65
So that tells me the mcrypt function is using characters that the SQL database doesn't understand and thus the queries aren't working.
Some other info for you...
When I run "SHOW VARIABLES LIKE 'character_set_%'" on the working database I get this:
Variable_name Value
character_set_client utf8
character_set_connection utf8
character_set_database latin1
character_set_filesystem binary
character_set_results utf8
character_set_server latin1
character_set_system utf8
When I do that on the nonworking database I get:
Variable_name Value
character_set_client utf8
character_set_connection utf8
character_set_database utf8
character_set_filesystem binary
character_set_results utf8
character_set_server utf8
character_set_system utf8
I saw the difference in character_set_database and ran this line of code:
ALTER DATABASE mydatabase DEFAULT CHARACTER SET latin1
It successfully changed the character_set_database to "latin1" to match the other, but didn't solve the problem.
Finally, all my columns in my tables are using the Collation "latin1_swedish_ci"
Any help you could give would be very very appreciated!
Store your encrypted strings as binary (or a similar) type. Also make sure you are escaping the encrypted string. Both are important parts to doing this right!
I've been working with MySQL and Mcrypt and I store my encrypted data and initialization vectors as binary and I escape all of these strings before they get put in a query. Works like a charm.
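The idea can be sketched in Python. This is an illustration under stated assumptions, not the poster's PHP code: the byte values are made up to stand in for mcrypt output, and the standard-library sqlite3 module is used in place of MySQL so that the example runs on its own. The point carries over: ciphertext is arbitrary bytes, usually not valid text in any character set, so it belongs in a binary column and should be passed as a bound parameter instead of being pasted into the SQL string.

```python
import sqlite3

# ASSUMPTION: made-up bytes standing in for an encrypted value.
ciphertext = bytes([0x27, 0xE5, 0xBB, 0x2E, 0x94, 0xE9, 0xB6, 0x51])


def is_valid_utf8(data):
    """Report whether the bytes could be stored as UTF-8 text at all."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


conn = sqlite3.connect(":memory:")
# Binary column type: the bytes are stored without any character set conversion.
conn.execute("CREATE TABLE clients (id_client INTEGER PRIMARY KEY, firstname BLOB)")
# Bound parameters: the driver handles quoting, so the stray 0x27 (a quote) is harmless.
conn.execute(
    "INSERT INTO clients (id_client, firstname) VALUES (?, ?)", (65, ciphertext)
)
stored = conn.execute(
    "SELECT firstname FROM clients WHERE id_client = ?", (65,)
).fetchone()[0]

print(is_valid_utf8(ciphertext), stored == ciphertext)
```

In MySQL the equivalent column types would be BINARY, VARBINARY or BLOB, with the value bound through a prepared statement.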