character_set_database not updating on mysql - mysql

I followed the "tutorial" here:
http://outofcontrol.ca/blog/comments/change-mysql-5.5-default-character-set-to-utf8
After doing so, the MySQL 5.7 servers on Linux updated properly and everything shows as I would expect.
But when I did the same on my development machine (MacBook Pro, El Capitan), I see the following:
character_set_client utf8
character_set_connection utf8
**character_set_database latin1**
character_set_filesystem binary
character_set_results utf8
character_set_server utf8
character_set_system utf8
character_sets_dir /usr/local/mysql-5.6.10-osx10.7-x86_64/share/charsets/
Why is this still latin1? The other entries updated to utf8 and this worked on linux.

character_set_database is of virtually no use. Don't worry about it.
When creating a table, explicitly state
CREATE TABLE tbl (...) ... CHARACTER SET utf8;
That helps you avoid various changes to defaults that may occur now or in the future.
Actually, the future is likely to be utf8mb4 with collation utf8mb4_unicode_520_ci.
As long as you are going to 5.7, you may as well go to that combo. utf8mb4 lets you get to Emoji and the rest of the Chinese character set; the 520 collation is arguably "more correct".
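The practical difference between MySQL's utf8 (at most 3 bytes per character) and utf8mb4 (up to 4 bytes) comes down to how many bytes a character needs in UTF-8. Here is a minimal Python sketch of that idea. It does not talk to MySQL, and the helper name and sample characters are illustrative choices, not something from the answer:

```python
def utf8_len(ch: str) -> int:
    """Return how many bytes one character occupies in UTF-8."""
    return len(ch.encode("utf-8"))

# Illustrative sample characters (assumed for this sketch).
samples = {
    "e with acute": "\u00e9",      # é, inside the Basic Multilingual Plane
    "common Chinese": "\u4e2d",    # 中, inside the Basic Multilingual Plane
    "rare Chinese": "\U0002070e",  # 𠜎, CJK Extension B, outside the BMP
    "emoji": "\U0001f600",         # 😀, outside the BMP
}

# Characters that need 4 bytes do not fit in MySQL's 3-byte utf8.
needs_utf8mb4 = [name for name, ch in samples.items() if utf8_len(ch) == 4]
print(needs_utf8mb4)
```

Anything that lands in needs_utf8mb4 is the kind of character that requires utf8mb4 on the column and on the connection.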
The link you gave is a bit weak -- it does not explain how to convert any data you have.
Existing tables are not changed by changing the settings you mentioned.
So, what is the whole picture? Do you have existing data? In utf8 or latin1? Are you updating in place? Or dumping and reloading? Etc.

Related

MySQL database - conversion of characterset and collation to utf8mb4 and utf8mb4_unicode_ci?

I have converted the charset and collation of my MySQL database from latin1 to utf8mb4 using the following commands, as advised here:
ALTER DATABASE databasename CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE tablename CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
To check the conversion was done properly, I have run the following command.
SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%'
OR Variable_name LIKE 'collation%'
The output is
While character_set_client, character_set_connection, character_set_database, and character_set_results are now utf8mb4, character_set_filesystem is binary and character_set_server is still latin1. What exactly are these variables, and why are they still not utf8mb4?
Similarly, collation_connection and collation_database are utf8mb4_unicode_ci, but collation_server is still latin1_swedish_ci.
https://dev.mysql.com/doc/refman/5.7/en/server-system-variables.html#sysvar_character_set_filesystem
This variable is used to interpret string literals that refer to file names, such as in the LOAD DATA INFILE and SELECT ... INTO OUTFILE statements and the LOAD_FILE() function. Such file names are converted from character_set_client to character_set_filesystem before the file opening attempt occurs. The default value is binary, which means that no conversion occurs.
You probably don't need to change this value.
https://dev.mysql.com/doc/refman/5.7/en/charset-server.html
The server character set and collation are used as default values if the database character set and collation are not specified in CREATE DATABASE statements. They have no other purpose.
You can change this value in your /etc/my.cnf, but it's redundant if you already specify the character set for each database.

Mysql server's encoding differs from the client's (latin1 vs utf8mb4). How bad is it?

I'm designing a PHP web app and have some difficulty understanding the meaning of the MySQL variables related to encoding and how they interact with each other. The encoding of the server is set to latin1 but the client's is utf8mb4.
Running the mysql query inside a database
SHOW VARIABLES
WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%'
gives the following:
character_set_client = utf8mb4
character_set_connection = utf8mb4
character_set_database = latin1
character_set_filesystem = binary
character_set_results = utf8mb4
character_set_server = latin1
character_set_system = utf8
collation_connection = utf8mb4_unicode_ci
collation_database = latin1_swedish_ci
collation_server = latin1_swedish_ci
I'm afraid of running into issues with the older databases, which are in latin1, if I change the character set of the MySQL server to utf8mb4, but I certainly want to use utf8mb4 for the new databases I create. To correctly store and retrieve data from the database, should the server's and client's encoding and collation always be the same? Any insight would be appreciated.
Some of those VARIABLES must agree with what encoding is used in the client.
CREATE TABLE ... specifies how they are to be stored in the tables.
If those two differ, then MySQL will convert "on the wire" between the client encoding and the table encoding.
If that means converting, say, Korean characters (encoded in utf8 or utf8mb4) to latin1, it will not be possible. On the other hand, all accented letters in Western Europe have encodings in both latin1 and utf8, so there is no problem.
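Whether a given string can be converted is easy to check outside MySQL. The sketch below is an illustration in Python, not MySQL's own conversion code. It uses cp1252 because MySQL's latin1 corresponds to that code page, and the helper name and sample strings are assumptions made for the example:

```python
def fits_latin1(text: str) -> bool:
    """True if every character in text has an encoding in cp1252 (MySQL's latin1)."""
    try:
        text.encode("cp1252")
        return True
    except UnicodeEncodeError:
        return False

# Western European accented letters exist in both encodings, with different bytes.
e_acute_latin1 = "é".encode("cp1252")  # one byte
e_acute_utf8 = "é".encode("utf-8")     # two bytes

# Hangul has no latin1 representation, so the conversion cannot succeed.
print(fits_latin1("café"), fits_latin1("한국어"))
```

When the target encoding has no representation for a character, MySQL typically substitutes a question mark or reports an error, depending on the SQL mode.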
Read this for common screwups:
Trouble with UTF-8 characters; what I see is not what I stored
See ALTER TABLE .. CONVERT TO .. for converting all character columns in one table to a different encoding (assuming it was correctly stored to begin with).

MySQL change database + tables charset & collation from UTF8 to UTF8mb4

I currently have a MySQL database with the following settings:
character_set_client: utf8
character_set_connection: utf8
character_set_database: utf8
character_set_filesystem: binary
character_set_results: utf8
character_set_server: latin1
character_set_system: utf8
collation_connection: utf8_general_ci
collation_database: utf8_general_ci
collation_server: latin1_swedish_ci
I want to support emojis and other languages (like Chinese) in the database. Currently this is not working; those characters are automatically converted to a ?.
I created a test database with charset & collation utf8mb4(_general_ci) and a table with the same settings. Emojis work here. However, when I change the database settings to utf8(_general_ci) and leave the table as utf8mb4(_general_ci), emojis are still working, while this is not the case with my main database.
If I change my main database settings to charset + collation utf8mb4(_general_ci), and the tables as well, would that work?
And for database-access, will anything else have to be changed, such as character_set_connection or collation_connection?
I know that on my JavaScript server the connection is configured as utf8; I assume this has to be utf8mb4.
All current utf8(_general_ci) data, will that be kept intact when changing to utf8mb4(_general_ci)?
Correctly stored utf8 characters will convert correctly to utf8mb4.
You should also specify that the connections are utf8mb4.
See this for discussion of 'question mark'.
To convert all the char/text columns to utf8mb4:
ALTER TABLE tbl CONVERT TO CHARACTER SET utf8mb4;
To convert one column:
ALTER TABLE tbl MODIFY COLUMN col ... CHARACTER SET utf8mb4;

Does MySQL treat Spanish as equal to English?

I run this command
select 'Conceição do Almeida'='Conceicao do Almeida';
The left side of the equals sign is Spanish, and the other side is English.
But it returns 1 as the result!
Obviously, these are two completely different strings!
Does MySQL treat Spanish as equal to English?
By the way:
character_set_client utf8mb4
character_set_connection utf8mb4
character_set_database utf8
character_set_filesystem binary
character_set_results utf8mb4
character_set_server utf8
character_set_system utf8
So I don't think this is an encoding error.
The comparison depends on the collation you use.
See this SQLFiddle example.
So either change your table collation or use a specific collation in your queries, such as utf8_bin (or utf8mb4_bin when the strings are utf8mb4, as the literals on your connection are).
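As a rough analogy for why the two strings compare equal, the Python sketch below imitates an accent- and case-insensitive comparison by stripping combining marks, and contrasts it with a byte-for-byte comparison, which is roughly what a _bin collation does. This is not MySQL's collation algorithm (MySQL uses weight tables), and the function name is made up for the illustration:

```python
import unicodedata

def accent_insensitive_key(text: str) -> str:
    """Approximate an accent- and case-insensitive sort key."""
    decomposed = unicodedata.normalize("NFD", text)  # split letters from their accents
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()

a = "Conceição do Almeida"
b = "Conceicao do Almeida"

# Like a _ci collation: accents and case are ignored.
loose_equal = accent_insensitive_key(a) == accent_insensitive_key(b)
# Like a _bin collation: the encoded bytes must match exactly.
binary_equal = a.encode("utf-8") == b.encode("utf-8")

print(loose_equal, binary_equal)
```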

MySQL UTF8 Data Not Being Displayed Properly

I want all the data in MySQL to be UTF8 encoded. I've set all the character sets and collations to be UTF8 for the database, tables and columns. Before anything is written to the database, I use mb_detect_encoding in PHP to check if it is UTF8. Thus, I believe all the data is UTF8 encoded.
However, here is the problem: take the word Ríkarðsdóttir. It shows up correctly when queried from the database and displayed through PHP on a UTF8 encoded webpage. If I query this same record through phpMyAdmin, I get RÃ­karÃ°sdÃ³ttir. The same is true if I use the MySQL command line.
Running SHOW VARIABLES returns to me:
character_set_client utf8,
character_set_connection utf8,
character_set_database utf8,
character_set_filesystem binary,
character_set_results utf8,
character_set_server latin1,
character_set_system utf8
Only the server is latin1, and I am on a shared hosting site and don't believe I can change that. Could that be the problem?
Here is what I do not understand: why does my UTF8 webpage correctly display Ríkarðsdóttir, but a UTF8 encoded phpMyAdmin webpage display it as RÃ­karÃ°sdÃ³ttir? Is the data not truly UTF8 encoded or does the database not believe it is? What needs to be done to correct this?
Try running this query right after you connect:
SET NAMES UTF8
Your database needs to store the data as UTF8, and your web page header should also have a UTF8 declaration, but your connection to the database also needs to use UTF8. You can run that on the command line and/or through PHPMyAdmin. All communication after that "query" will then be UTF8 encoded.
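One common way this kind of garbling arises is that UTF-8 bytes get interpreted as latin1 somewhere between client and server. Whether that is exactly what happened in the asker's database is an assumption. The Python sketch below only reproduces the symptom and shows that it is reversible as long as the bytes themselves are intact:

```python
original = "Ríkarðsdóttir"

# The client sends UTF-8 bytes...
utf8_bytes = original.encode("utf-8")

# ...but the receiving side reads each byte as one latin1 character.
garbled = utf8_bytes.decode("latin-1")

# Reversing the wrong step recovers the text, because no bytes were lost.
repaired = garbled.encode("latin-1").decode("utf-8")

print(repr(garbled))
print(repaired)
```

Each accented letter turns into two characters in the garbled form, so the string grows from 13 to 16 characters. Declaring the connection encoding (for example with SET NAMES) is what keeps both sides agreeing on how to read the bytes.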