Export database with emoji - mysql

I would like to export my database, which contains emoji, but I have a problem with the export: when I export my table, the emoji are replaced by "?".
After I export and re-import, the "?" characters are still there.
I checked my table, and it is utf-8.
I use Sequel Pro to export and import.
When I try with DataGrip instead, I get the "?" directly and never see the emoji.

Before you run the queries, run
SET NAMES utf8mb4;
Why?
In short:
First, an emoji usually takes four bytes in UTF-8. MySQL's utf8, however, is an alias for utf8mb3, which uses one to three bytes per character (a maximum of three), so it cannot represent an emoji. That is why you see a '?' in your result. utf8mb4 can do the job because it allows up to four bytes per character.
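The byte counts behind this can be checked outside MySQL. The sketch below uses Python's standard UTF-8 codec. The helper name fits_in_utf8mb3 is made up for illustration: it only models the three-byte limit and does not talk to a server.

```python
def utf8_len(ch: str) -> int:
    """Number of bytes one character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


def fits_in_utf8mb3(text: str) -> bool:
    """True if every character needs at most three UTF-8 bytes.

    Hypothetical helper: it mirrors the limit of MySQL's utf8 (utf8mb3),
    it is not part of any MySQL API.
    """
    return all(utf8_len(ch) <= 3 for ch in text)


# One-, two-, three- and four-byte characters.
for ch in ("a", "é", "€", "😀"):
    print(repr(ch), utf8_len(ch))

print(fits_in_utf8mb3("café €5"))   # True: nothing longer than three bytes
print(fits_in_utf8mb3("hello 😀"))  # False: the emoji needs four bytes
```

Any text for which the helper returns False is text that a utf8 (utf8mb3) column or connection cannot carry intact.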
Second, SET NAMES utf8mb4 sets three session variables. It is equivalent to:
SET character_set_client = utf8mb4;
SET character_set_results = utf8mb4;
SET character_set_connection = utf8mb4;
This makes the client, connection, and result character sets agree, so the emoji pass between client and server unchanged and display correctly.
For more information, see the documentation:
https://dev.mysql.com/doc/refman/8.0/en/charset-unicode-sets.html
https://dev.mysql.com/doc/refman/8.0/en/set-names.html

Related

How can I change MySQL character_set_system UTF8 to utf8bin

When I dump MySQL data, some of the data is changed because character_set_system is UTF8. The server, client, and connection character sets are utf8mb4.
I guess the problem is the difference between the system character set and the server character set.
I am trying to change the system character set from UTF8 to utf8mb4 by following "Change MySQL default character set to UTF-8 in my.cnf?", but I cannot.
The title is incorrectly phrased.
"utf8" is a "character set"
"utf8_bin" is a "collation" for the character set utf8.
You cannot set a character_set_% variable to a collation. You may be able to set some of the collation_% entries to utf8_bin.
But none of that is a valid solution for the problem you go on to discuss.
Probably you can find more tips here: Trouble with UTF-8 characters; what I see is not what I stored
To help you further, we need to see the symptoms that got you started down this wrong path.

MySQL database: change existing table to UTF8

I have a weird problem with MySQL and the Cyrillic alphabet. The database was created with utf8_unicode_ci from the start, but the tables were not. Right now, table data supplied in Cyrillic looks like this: ????????. If I create a new table in utf8 there is no problem. However, if I try to change the encoding of an existing table by using
ALTER TABLE <table_name> CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
which is supposed to change existing data, or
ALTER TABLE Strategies
CHARACTER SET utf8,
COLLATE utf8_unicode_ci;
which is supposed to change future data, it doesn't work.
I have also changed the my.cnf file and added:
[mysqld]
#
#default-character-set=utf8 this one breaks mysql restart
character-set-server=utf8
skip-character-set-client-handshake
collation-server=utf8_unicode_ci
init-connect='SET NAMES utf8'
init_connect='SET collation_connection = utf8_general_ci'
If I run SHOW VARIABLES WHERE Variable_name LIKE 'character_set_%' OR Variable_name LIKE 'collation%'; I get:
I also changed the table to utf8 directly in phpMyAdmin, and it shows that the table is in utf8, but nothing happens to the existing ????????? or to future Cyrillic input.
Hopefully someone else has experienced this kind of issue. I would be really grateful for any help or suggestions. Thank you.
If a table starts out as latin1 and has latin1-encoded characters in it, use ALTER TABLE ... CONVERT TO CHARACTER SET utf8 (as you did).
Before converting, do two things to test the old encoding:
SHOW CREATE TABLE ... -- to see that the columns say latin1
SELECT HEX(col) ... -- to see what the encoding looks like: é should show E9
I say "before" because it is possible to cram utf8 into latin1 incorrectly. In that case é would show C3A9 -- this is "double-encoding".
Do likewise after the conversion:
SHOW CREATE TABLE ... -- to see that the columns say utf8
SELECT HEX(col) ... -- to see what the encoding looks like; é should show C3A9
C383C2A9 would indicate double-encoding. A mess.
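The three hex values mentioned above can be reproduced without a database. This is only a sketch in Python: its latin-1 codec stands in for MySQL's latin1 (an assumption that holds for é), and the variable names are illustrative.

```python
def hex_upper(data: bytes) -> str:
    """Format bytes the way MySQL's HEX() prints them: uppercase, no spaces."""
    return data.hex().upper()


# é stored correctly in a latin1 column: one byte.
single_latin1 = "é".encode("latin-1")

# é stored correctly in a utf8 column: two bytes.
correct_utf8 = "é".encode("utf-8")

# Double-encoding: the two UTF-8 bytes are misread as two latin1
# characters and each of those is encoded as UTF-8 again.
double_encoded = correct_utf8.decode("latin-1").encode("utf-8")

print(hex_upper(single_latin1))   # E9
print(hex_upper(correct_utf8))    # C3A9
print(hex_upper(double_encoded))  # C383C2A9
```

Comparing the output of SELECT HEX(col) against these three values tells you which of the situations you are in.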
Do not depend on init-connect='SET NAMES utf8' if you connect as root; init-connect is ignored for root and any other SUPER user.
But... You say you put Cyrillic text into a latin1 column? That is "impossible" since latin1 cannot represent anything other than latin-based Western European characters. So... You probably have "double-encoding".
For more debugging, see "Trouble with utf8". Note especially "question mark".
To repair double-encoding, see "Fixes", and pick the appropriate case. That link also says what you should have done (the 2-step ALTER) instead of what you did.

Store the city name Łódź in MySQL table

I currently have an address table in MySQL, with its character set set to 'utf8' and its collation to 'utf8_unicode_ci'. It has a column named Address, and I am trying to store the city name Łódź in that column. I tried typing it directly into the table in SQLyog Community 64, as well as using the tool MySQL for Excel, but it keeps showing the error 'Incorrect string value'.
I have tried setting the character set to 'utf8mb4' and the collation to 'utf8mb4_unicode_ci', and it still gives me the same error.
How should I set the character set and collation in order to store Łódź? This city name is just one of many examples, and moving forward I may run into other similar characters as well. What can I use as a universal character set?
(utf8 and utf8mb4 work equally for Polish characters.)
You have not provided enough details about the flow of the characters, but the following should provide debugging for MySQL:
Trouble with utf8 characters; what I see is not what I stored
When stored correctly, the utf8 (or utf8mb4) encoding for Łódź is hex C581 C3B3 64 C5BA.
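To check that hex string independently of MySQL, here is a small Python sketch (the function name utf8_hex is illustrative). It prints the UTF-8 bytes of each character in the same grouping as above, plus the unspaced form that SELECT HEX(Address) would return for a correctly stored value.

```python
def utf8_hex(text: str) -> str:
    """UTF-8 bytes of each character as uppercase hex, one group per character."""
    return " ".join(ch.encode("utf-8").hex().upper() for ch in text)


city = "Łódź"
print(utf8_hex(city))                      # C581 C3B3 64 C5BA
print(city.encode("utf-8").hex().upper())  # C581C3B364C5BA
```

If HEX() on the stored value shows anything else, the bytes were altered somewhere between the client and the column.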

mysql unicode text incorrect string warning on insert, despite character set variables set utf8mb4

First, I know, yes, this is yet another mysql unicode question.
Problem: I am unable to insert unicode text into my mysql database
I want to execute the following query:
INSERT INTO usert SET username='田中'
When I do, I get this warning:
Incorrect string value: '\x93c\x92\x86' for column 'username' at row 1
A blank space is inserted into the table instead of the data
I have tried as many answers and forums as I could, and I believe that all appropriate variables, table, and column settings are set to the 'utf8mb4' character set, with collation 'utf8mb4_general_ci' or 'utf8mb4_unicode_ci'.
I will tell you why I believe that by giving you the details, and sql commands used to show them.
First, mysql version:
mysql:> SHOW VARIABLES LIKE 'version'
Confirms that the version is 5.6.23
To show the character set variables in mysql:
mysql:> SHOW VARIABLES LIKE '%char%'
That command shows (in slightly different format):
character_set_client: utf8mb4
character_set_connection: utf8mb4
character_set_database: utf8mb4
...
character_set_results: utf8mb4
character_set_server: utf8mb4
character_set_system: utf8
Collation:
mysql:> SHOW VARIABLES LIKE '%collat%'
RESULTS:
collation_connection: utf8mb4_unicode_ci
collation_database: utf8mb4_unicode_ci
collation_server: utf8mb4_unicode_ci
So far so good?
Now, for the table character set and collation:
Look at table details command:
mysql:> SHOW TABLE STATUS
shows that the collation is utf8mb4_general_ci
Command for looking at column details:
mysql:> SHOW FULL COLUMNS IN usert
Confirms that the collation for column 'username' is utf8mb4_general_ci
In summary, from what I have studied, all relevant variables, database, table, and column settings seem to be set to the relevant utf8mb4 setting. Despite that, I am unable to insert the unicode Japanese text.
(By the way, I don't think the 4-byte Unicode setting utf8mb4 is necessary here, but it is what I am using because it seemed to fix many other Unicode MySQL problems.)
What other settings in mysql or the system are likely causing this problem?
What other settings can I/ should I change to allow inserting japanese text appropriately?
EDIT UPDATE: I am on a Japanese computer
The problem was the default system settings, which also affected the input settings at the command line.
It's a Japanese computer, which apparently uses Shift-JIS encoding, NOT Unicode, by default. The text I was typing was encoded this way, and so were the input files I was trying to use.
Therefore, I set the character set to 'sjis' in the server,
i.e. I set character-set-server=sjis in the my.ini initialization file, and made the MySQL client character set the same by adding skip-character-set-client-handshake to the same file.
The character set for the column must of course also be changed via
ALTER TABLE usert MODIFY username varchar(30) CHARACTER SET sjis COLLATE sjis_japanese_ci
Now you can insert Japanese text from the command line, and from other Japanese files that use Shift-JIS encoding.
Another option for inputting Japanese text seems to be cp932, which is the Windows version of Shift-JIS.
Incidentally, if you DO wish to use unicode via command line, apparently powershell has better support for it, rather than the normal cmd I was using, but I haven't tried it personally.
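The bytes in the warning ('\x93c\x92\x86') match this explanation, and that can be checked offline. The following is a sketch using Python's built-in shift_jis and cp932 codecs. The helper is_valid_utf8 is illustrative and says nothing about how MySQL validates input internally.

```python
def is_valid_utf8(data: bytes) -> bool:
    """True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


text = "田中"
sjis_bytes = text.encode("shift_jis")

print(sjis_bytes)                           # b'\x93c\x92\x86', as in the warning
print(is_valid_utf8(sjis_bytes))            # False: not acceptable as utf8mb4
print(is_valid_utf8(text.encode("utf-8")))  # True
print(text.encode("cp932") == sjis_bytes)   # True: both codecs agree for this text
```

So a connection declared as utf8mb4 was being handed Shift-JIS bytes, which is consistent with the "Incorrect string value" warning.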
Try checking the character set of the database.
Check the character set of your database with the command below:
SELECT @@character_set_database, @@collation_database;
If the result is something other than UTF-8, try the command below:
ALTER DATABASE yourDatabase CHARACTER SET utf8 COLLATE utf8_unicode_ci;
Hope it works for you.

utf8 and utf8_general_ci

I have a problem inserting rows into my DB.
When a row contains characters like: 'è', 'ò', 'ò', '€', '²', '³' .... etc ... it returns an error like this (charset set to utf8):
Incorrect string value: '\xE8 pass...' for column 'descrizione' at row 1 - INSERT INTO materiali.listino (codice,costruttore,descrizione,famiglia) VALUES ('E 251-230','Abb','Relè passo passo','Relè');
But, if I set the charset to latin1 or *utf8_general_ci* it works fine, and no errors are found.
Can somebody explain why this happens? I always thought that utf8 was "larger" than latin1.
EDIT: I also tried to use mysql_real_escape_string, but the error was always the same!!!!
mysql_real_escape_string() is not relevant, as it merely escapes string termination quotes that would otherwise enable an attacker to inject SQL.
utf8 is indeed "larger" than latin1 insofar as it is capable of representing a superset of the latter's characters. However, not every byte sequence represents valid utf8 characters, whereas every possible byte sequence does represent valid latin1 characters.
Therefore, if MySQL receives a byte sequence it expects to be utf8 (but which isn't), some characters could well trigger this "incorrect string value" error; whereas if it expects the bytes to be latin1 (even if they're not), they will be accepted - but incorrect data may be stored in the table.
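This asymmetry is easy to see with the string from the question. The sketch below is in Python and rests on one assumption: Python's latin-1 codec stands in for MySQL's latin1 (which is closer to cp1252). The helper name accepts is illustrative.

```python
def accepts(data: bytes, encoding: str) -> bool:
    """True if the bytes form a valid string in the given encoding."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


# The application sends latin1 bytes: è is the single byte E8.
sent_as_latin1 = "Relè passo passo".encode("latin-1")

print(accepts(sent_as_latin1, "utf-8"))         # False: E8 followed by a space is not valid UTF-8
print(accepts(sent_as_latin1, "latin-1"))       # True
print(accepts(bytes(range(256)), "latin-1"))    # True: every byte value is a latin1 character

# The opposite mismatch is accepted silently but stores the wrong characters.
print("Relè".encode("utf-8").decode("latin-1")) # RelÃ¨
```

The first line corresponds to the "incorrect string value" error, the last to data that is accepted but stored incorrectly.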
Your problem is almost certainly that your connection character set does not match the encoding in which your application is sending its strings. Use the SET NAMES statement to change the current connection's character set, e.g. SET NAMES 'utf8' if your application is sending strings encoded as UTF-8.
Read about connection character sets for more information.
As an aside, utf8_general_ci is not a character set: it's a collation for the utf8 character set. The manual explains:
A character set is a set of symbols and encodings. A collation is a set of rules for comparing characters in a character set.
According to the doc for UTF-8, the default collation is utf8_general_ci.
If you want a specific ordering of your alphabet other than the general_ci one, pick one of the utf8_* collations provided for the utf8 charset, whichever matches your ordering requirements.
Both your table and your connection to the DB should be encoded in utf8, preferably with the same collation. Read more about setting the connection collation.
To be completely safe, check that your table collation is utf8_* and that your connection's is too, using the complete syntax of SET NAMES:
SET NAMES 'utf8' COLLATE 'utf8_general_ci'
You can find information about the different collations here.
mysql_query("SET NAMES 'utf8' COLLATE 'utf8_general_ci'");
Eureka, the above did it :-)