phpMyAdmin export database - mysql

I use MAMP 3.4. I have a small database with 3 tables. When I upload the database file to the server I get this error: #1115 - Unknown character set: 'utf8mb4'
I went back to MAMP and checked: Operations > Collation > utf8_unicode_ci
That is set on each table and on the database itself.
To export I select the database > Export > Custom > Save Output to a file. I leave everything else at the defaults.
Where is the problem? What is that mb4? Is utf8_unicode_ci the right one? How do I export from MAMP and import on my server?

Let's get one thing straight: a character set is not the same as a collation. The two concepts are only closely related.
Character sets tell the programs processing text how to interpret the byte stream that makes up the text and what character to display on the screen.
Collations tell the programs processing text how to order characters for comparison and sorting purposes. So, if you do an order by on a text field in an RDBMS, then the RDBMS can figure out using the collation the order of the records.
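As a rough analogy only (Python's sorting is not MySQL's collation engine, and the word list is made up), the following sketch shows how the same strings come out in a different order depending on the comparison rule, which is the role a collation plays:

```python
# Hypothetical sample data; any mix of upper- and lowercase letters works.
words = ["b", "A", "a", "B"]

# Compare raw code points, similar in spirit to a binary (_bin) collation:
# every uppercase ASCII letter sorts before every lowercase one.
by_code_point = sorted(words)

# Compare case-folded text, similar in spirit to a case-insensitive (_ci) collation.
case_insensitive = sorted(words, key=str.casefold)

print(by_code_point)     # ['A', 'B', 'a', 'b']
print(case_insensitive)  # ['A', 'a', 'b', 'B']
```

The bytes stored for each string are identical in both cases; only the ordering rule differs.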
utf8mb4 is a character set that MySQL uses. MySQL's implementation of utf8 can represent a character in up to 3 bytes, while utf8mb4 can represent a character in up to 4 bytes. The UTF-8 standard uses the up-to-4-bytes definition (utf8, wikipedia), so strictly speaking, utf8mb4 is the true UTF-8 implementation in MySQL.
However, utf8mb4 was only added relatively recently (v5.5.3), so its existence is still not that widely known in the MySQL community (MySql utf8mb4).
If you try to import data using this character set into a database server that does not support it, you get the error message in your question.
The collation should match the encoding, so if you have the utf8mb4 character set, use a utf8mb4 collation as well. You need to convert your data to a character set that is supported by your target system, and you need to align the collation with your encoding.
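To make the 3-byte versus 4-byte distinction concrete, here is a minimal Python sketch. The helper names and sample characters are illustrative and not part of MySQL. It measures how many UTF-8 bytes each character needs and flags text that would not fit into MySQL's 3-byte utf8:

```python
def max_utf8_bytes(text: str) -> int:
    """Largest number of UTF-8 bytes needed by any single character in text."""
    return max((len(ch.encode("utf-8")) for ch in text), default=0)


def needs_utf8mb4(text: str) -> bool:
    """True if text contains a character that takes 4 bytes in UTF-8."""
    return max_utf8_bytes(text) == 4


# Illustrative samples: ASCII, accented Latin, CJK, emoji.
samples = {"ascii": "a", "accented": "é", "cjk": "中", "emoji": "😀"}
for label, ch in samples.items():
    print(label, len(ch.encode("utf-8")), needs_utf8mb4(ch))
```

Only the emoji needs 4 bytes here, so it is the only sample that requires utf8mb4.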

Related

Issue when deploying mysql db (utf8mb4_unicode_520_ci -> utf8mb4_unicode_ci)

I started working on a WordPress site on my dev machine. The MySQL version is 5.6 and WordPress is 4.7, so it's already using the utf8mb4_unicode_520_ci encoding if it detects that this is possible.
My problem is that on my hosting (MySQL 5.5) utf8mb4_unicode_520_ci is not recognized as a valid encoding. So I'm trying to target utf8mb4_unicode_ci instead, as my hosting knows about that one and, if I understand correctly, this would (as opposed to going to utf8) allow me to keep the 4 bytes.
I tried several different combinations of encoding and collation for the db, but nothing worked (from here: How to convert an entire MySQL database characterset and collation to UTF-8?).
I tried several combinations of encoding and collation in wp-config, but nothing worked.
Everything that comes from the database (like post titles and post contents) displays badly encoded characters for all diacritics; anything else is displayed correctly.
Menu labels from the database display incorrectly, whereas the hardcoded/translated labels display correctly.
I think I need to convert the actual content of the database; changing the charset and collation does not seem to be enough.
I found this but it does not address my problem directly, or if it does I missed it.
Any help would be appreciated
————————————————————————————————
UPDATE :
here is the precise procedure I went through:
Initial situation:
I installed a wordpress (4.6.1) locally (on my dev machine, mysql 5.6.28).
I worked on the theme and plugin locally
(at this moment I have, locally, a database that is utf8_general_ci and tables that are utf8mb4_unicode_520_ci)
Problem:
I want to deploy my wordpress on my hosting (mysql: 5.5 - db collation seems to be utf8mb4_unicode_ci).
I mysqldump the db locally, then try to import it on my hostings' phpmyadmin.
This gives error :
Unknown collation: 'utf8mb4_unicode_520_ci'
Solution 1: change the tables to utf8mb4_unicode_ci:
On my hosting's SQL server, utf8mb4_unicode_520_ci is not available and I can't get a more recent version of MySQL.
utf8mb4_unicode_ci seems like the closest match and is available on my hosting's SQL server.
From various SO questions, I adapted a bash script to change the charset and collation of my tables:
for tbl in wp_sij2017_commentmeta wp_sij2017_comments wp_sij2017_cwa wp_sij2017_links wp_sij2017_options wp_sij2017_postmeta wp_sij2017_posts wp_sij2017_term_relationships wp_sij2017_term_taxonomy wp_sij2017_termmeta wp_sij2017_terms wp_sij2017_usermeta wp_sij2017_users wp_sij2017_woocommerce_api_keys wp_sij2017_woocommerce_attribute_taxonomies wp_sij2017_woocommerce_downloadable_product_permissions wp_sij2017_woocommerce_order_itemmeta wp_sij2017_woocommerce_order_items wp_sij2017_woocommerce_payment_tokenmeta wp_sij2017_woocommerce_payment_tokens wp_sij2017_woocommerce_sessions wp_sij2017_woocommerce_shipping_zone_locations wp_sij2017_woocommerce_shipping_zone_methods wp_sij2017_woocommerce_shipping_zones wp_sij2017_woocommerce_tax_rate_locations wp_sij2017_woocommerce_tax_rates; do
mysql --execute="ALTER TABLE wp_sij_2017_original_copy.${tbl} CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;"
done
I run this script on the local db
I now have all my tables set to collation utf8mb4_unicode_ci
My db collation is still utf8
I mysqldump the db, then import it to my hosting and...
Import is successful.
I search and replace siteurl in the db.
I then visit the online website, and SOME diacritics render as a "question mark char"
Any text coming from the db has decoding issue AT SOME POINT
The source/html markup also has those "question mark char"
I have no idea where to look or what to do next
Clarification: CHARACTER SETs utf8 and utf8mb4 specify how characters are encoded into bytes. COLLATIONs *_unicode_*, etc, specify how those characters compare.
The encoding for utf8mb4_unicode_ci and utf8mb4_unicode_520_ci is the same because both belong to the character set utf8mb4.
"database that is utf8_general_ci and tables that are utf8mb4_unicode_520_ci" -- that probably means that new tables in that database, unless specifically stated, will be CHARACTER SET utf8 COLLATE utf8_general_ci. That is, the database setting is just a default for CREATE TABLE. Since your tables are already CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_520_ci, the database default is not relevant to them.
As long as the CHARACTER SET stays utf8mb4, no Emoji, Chinese, etc will be lost or otherwise mangled.
Do not use mysql40; it did not know about any CHARACTER SETs. Do not use CONVERT or CAST. Etc.
I assume the 520 is coming from the output of mysqldump? Do you have an editor that can handle a file that big? If so, simply edit it to change utf8mb4_unicode_520_ci to utf8mb4_unicode_ci throughout. Then load the dump. Problem solved?
Your fix
You did ALTER ... CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci on your local machine. That is probably an even better way -- since it will put your dev and prod machine in line with each other. That should have worked. Don't worry about what the "database" claims.
I found 'utf8mb4_unicode_520_ci' and replaced it with 'utf8mb4_unicode_ci' in the .sql file.
It's the simplest way to solve this.
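If the dump is too big for an editor, the same find-and-replace can be scripted. This is a minimal sketch, not a tested migration tool: the function name and the sample CREATE TABLE line are made up, and it assumes the string utf8mb4_unicode_520_ci appears only as a collation name, never inside your actual row data.

```python
OLD_COLLATION = "utf8mb4_unicode_520_ci"
NEW_COLLATION = "utf8mb4_unicode_ci"


def downgrade_collation(lines):
    """Yield each dump line with the 520 collation swapped for the older one.

    Works on any iterable of strings, so an open file object can be passed
    directly and the dump is processed line by line, not loaded whole.
    """
    for line in lines:
        yield line.replace(OLD_COLLATION, NEW_COLLATION)


# Hypothetical dump fragment for illustration.
sample = [
    "CREATE TABLE `wp_posts` (\n",
    "  `post_title` text COLLATE utf8mb4_unicode_520_ci NOT NULL\n",
    ") ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_520_ci;\n",
]
fixed = list(downgrade_collation(sample))
print("".join(fixed))
```

For a real dump, open the source and destination files with encoding="utf-8" and write each yielded line to the destination. The character set stays utf8mb4 throughout, so no 4-byte characters are affected.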

#1115 - Unknown character set: 'utf8mb4'

I have a local webserver running on my PC, which I use for local development. I'm now at the stage of exporting the database and importing it onto my hosted VPS.
When exporting then importing I get the following error!
1115 - Unknown character set: 'utf8mb4'
Can somebody point me in the right direction?
The error clearly states that utf8mb4 is not supported on your stage db server.
Cause: locally you probably have MySQL version 5.5.3 or greater, and on the stage/hosted VPS you have a MySQL server version lower than 5.5.3.
The utf8mb4 character set was added in MySQL 5.5.3.
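The version condition can be checked mechanically before exporting. The sketch below uses hypothetical helper names, takes the 5.5.3 threshold from the answer above, and assumes a plain MySQL version string such as the one returned by SELECT VERSION():

```python
import re

# Threshold stated in the answer above.
UTF8MB4_MIN_VERSION = (5, 5, 3)


def parse_version(version: str) -> tuple:
    """Extract (major, minor, patch) from strings like '5.6.28-log'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def supports_utf8mb4(version: str) -> bool:
    """True if the server version is at least 5.5.3."""
    return parse_version(version) >= UTF8MB4_MIN_VERSION


for v in ("5.1.73", "5.5.2", "5.5.3", "5.6.28-log"):
    print(v, supports_utf8mb4(v))
```

Comparing tuples of integers avoids the classic mistake of comparing version strings lexically, where "5.10.0" would sort before "5.5.3".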
utf8mb4 was added because of a bug in MySQL's utf8 character set.
MySQL's handling of the utf8 character set only allows a maximum of 3
bytes for a single codepoint, which isn't enough to represent the
entirety of Unicode (Maximum codepoint = 0x10FFFF). Because they
didn't want to potentially break any stuff that relied on this buggy
behaviour, utf8mb4 was added. Documentation here.
Solution 1:
Simply upgrade your MySQL server to 5.5.3 (at least). Next time, be conscious of the versions you use locally, for stage, and for prod; they should all be the same.
A suggestion: nowadays the default character set should be utf8mb4.
Solution 2 (not recommended): Convert the current character set to utf8, and then export the data - it'll load ok.
Sometimes I get similar problems while using HeidiSQL, which by default exports in the utf8mb4 character encoding. Not all MySQL installations support this encoding, and importing such data leads to similar error messages. My workaround then is to export the data using phpMyAdmin, which exports in utf8. There are probably other tools and other possible ways, like manually editing the dump file, converting it from utf8mb4 to utf8 (if needed) and changing SET NAMES utf8mb4 to SET NAMES utf8. utf8mb4 is a superset of utf8, so if you're absolutely sure that your data is just utf8, you can simply change SET NAMES in the dump file to utf8.
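Being "absolutely sure" can be checked rather than assumed. This is a minimal sketch under stated assumptions: the function names are made up, the whole dump fits in memory as one string, and the text utf8mb4 occurs only as a charset or collation name and not inside row data. It refuses to downgrade a dump that contains any character outside the range MySQL's 3-byte utf8 can store:

```python
def has_supplementary_chars(text: str) -> bool:
    """True if any character lies above U+FFFF, i.e. needs 4 bytes in UTF-8."""
    return any(ord(ch) > 0xFFFF for ch in text)


def downgrade_dump_to_utf8(dump_sql: str) -> str:
    """Replace utf8mb4 with utf8, but only when no data would be lost."""
    if has_supplementary_chars(dump_sql):
        raise ValueError("dump contains 4-byte characters; keep utf8mb4")
    return dump_sql.replace("utf8mb4", "utf8")


# Hypothetical dump fragments for illustration.
safe = "SET NAMES utf8mb4;\nINSERT INTO t VALUES ('café');\n"
unsafe = "SET NAMES utf8mb4;\nINSERT INTO t VALUES ('😀');\n"

print(downgrade_dump_to_utf8(safe))
print(has_supplementary_chars(unsafe))  # True
```

The replacement also turns collation names such as utf8mb4_unicode_ci into utf8_unicode_ci; check that the resulting collation exists on the target server.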
Open the SQL file in a text editor, then find and replace all occurrences of
utf8mb4 with utf8
Import again.
This helped me
ALTER TABLE your_table_name CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;

incorrect output 4 bytes symbols in mysql table with utf8mb4 encode

I want to insert a 4-byte character into a table via phpMyAdmin (phpmyadmin version is 5.5.33).
I set Server connection collation to utf8mb4_general_ci;
The database uses the utf8mb4 encoding;
The table and column use the utf8mb4 encoding;
I tried to insert the 𩸽 symbol and it succeeded without any errors! But this symbol is displayed in the table as ????.
Can someone help, please?
So I would recommend checking what encoding the web application uses, because your problem is not the data itself but the program that is printing it. If your PHP administration tool or the web server (most probably Apache) hosting this application doesn't use your character encoding, you won't see your character. Most of these applications use just UTF8 as the encoding, so I suggest you change your database to plain UTF8 and the collation to utf8_general_ci.
Your question is most probably related to this one: How to display UTF-8 characters in phpMyAdmin?

default database collation not respected while importing

In my database, the collation was originally utf8_general_ci. However, I noticed that utf8_unicode_ci is necessary because of better sorting accuracy.
So I exported the whole database using phpMyAdmin and checked that the word "COLLATION" does not appear in the exported SQL file (except for 2 occurrences in one table, where it is set to binary). So in general this script is collation-agnostic: it should not imply any specific collation when importing, but use the database default.
After dropping all tables, the database collation was changed to utf8_unicode_ci and then the import script was run from phpmyadmin. But as a result, all tables and all columns are shown again with utf8_general_ci collation (and sorting is incorrect). Why?? And what to do to change it?
P.S. The export/import script contains commented line at the beginning:
/*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
I don't know if it has any impact while importing, but after opening the mysql console, the command show variables like 'collation_connection' shows COLLATION_CONNECTION as cp852_general_ci.
However, in phpmyadmin->variables the variable 'collation_connection' is set to utf8_general_ci. But there is no way to change it.
That happens because the database export is setting the character set on every table, and such a clause comes with a default collation that depends on the character set, not on the collation of your connection. utf8_general_ci is the default collation for utf8.
You'll have to convert your tables with something like ALTER TABLE tablename CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci; or edit your database export if this is affordable.
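With many tables, the ALTER statements can be generated instead of typed by hand. The sketch below only builds the SQL text and does not connect to a server. The function name and the example table names are hypothetical, and the charset and collation defaults are the ones from the answer above.

```python
def convert_statements(tables, charset="utf8", collation="utf8_unicode_ci"):
    """Yield one ALTER TABLE ... CONVERT TO statement per table name."""
    for table in tables:
        # Backtick-quote the identifier, doubling any embedded backtick.
        quoted = "`" + table.replace("`", "``") + "`"
        yield (
            f"ALTER TABLE {quoted} "
            f"CONVERT TO CHARACTER SET {charset} COLLATE {collation};"
        )


# Hypothetical table names for illustration.
statements = list(convert_statements(["customers", "orders"]))
for stmt in statements:
    print(stmt)
```

The output can be pasted into the phpMyAdmin SQL tab or piped to the mysql client. Naming the collation explicitly is what keeps the server from falling back to the character set's default collation.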
As for the MySQL console: the command-line client is pretty much broken on Windows. It'll never support, display or read Unicode, and you're getting a per-connection collation for that client that matches the Windows so-called OEM character set for your locale. This is a Windows misfeature that's difficult to work around in portable software. phpMyAdmin uses a web server and doesn't suffer from this problem. I advise you to use a UNIX-like operating system like GNU/Linux for any serious work in any case, not just for this reason. As an added benefit, MySQL, Apache and your whole application stack perform better on Linux.

Should I migrate a MySQL database with a latin1_swedish_ci collation to utf-8 and, if so, how?

The MySQL database used by my Rails application currently has the default collation of latin1_swedish_ci. Since the default charset of Rails applications (including mine) is UTF-8, it seems sensible to me to use the utf8_general_ci collation in the database.
Is my thinking correct?
Assuming it is, what would be the best approach to migrate the collation and all the data in the database to the new encoding?
UTF-8, as well as any other Unicode encoding scheme, can store characters in any language, so it is an excellent choice of codepage for your database.
The collation setting, on the other hand, is a completely separate issue from the encoding scheme. It involves sort orders, upper/lowercase conversions, string equality comparisons, and things like that which are language-specific. The collation setting should match the language that is used in the database.
The UTF-8 general collation is (I am assuming here—I'm not familiar with MySQL in particular) used for situations where the language is unknown and some simple default ordering is needed. It probably corresponds to the Unicode code point ordering, which is almost certainly not what you want if you're storing Swedish.
Convert to UTF-8 as the charset.
Collation settings are only used for sorting and stuff like that. Choose the collation that most of your users would expect.
Provided your existing data in the database is CORRECTLY encoded in latin1, converting the tables to utf8 (using ALTER TABLE, as described in the docs) should just work.
Then all your application needs to do is continue doing whatever it did before. If your application wants to use unicode characters, it should set its connection encoding to utf8 and use utf8, but that's its own problem.
The problem is that a large number of crap web apps have historically sent utf8 data to mysql and told it to treat it as latin1. MySQL will honour this perfectly and save junk into the tables, as instructed.
Converting the tables from latin1 to utf8 will NOT repair this mistake, as you genuinely do have total rubbish in there. Repairing them is nontrivial, particularly if during the lifetime of the app it's been talking different types of rubbish to the database.
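The difference between the two cases can be simulated outside the database. This is a minimal Python sketch, using Python's latin-1 codec as a stand-in for MySQL's latin1 (an assumption: MySQL's latin1 is not byte-for-byte the same as ISO 8859-1, but the two agree for this character):

```python
original = "é"

# Case 1: data correctly stored as latin1. Conversion to utf8 just works.
latin1_bytes = original.encode("latin-1")                    # b'\xe9'
converted = latin1_bytes.decode("latin-1").encode("utf-8")   # b'\xc3\xa9'
print(converted.decode("utf-8"))                             # é

# Case 2: the app sent UTF-8 bytes but labelled them latin1.
mislabelled = original.encode("utf-8")                       # b'\xc3\xa9'
seen_as_latin1 = mislabelled.decode("latin-1")               # 'Ã©'
double_encoded = seen_as_latin1.encode("utf-8")              # b'\xc3\x83\xc2\xa9'
print(double_encoded.decode("utf-8"))                        # Ã©

# Undoing one layer works only if ALL the data was mislabelled the same way.
repaired = double_encoded.decode("utf-8").encode("latin-1").decode("utf-8")
print(repaired)                                              # é
```

The last step is why repair is nontrivial in practice: applied to rows that were stored correctly, the same reversal either fails or corrupts them, so mixed data has to be sorted out row by row.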
Use the MySQL query below to convert your column:
ALTER TABLE users MODIFY description VARCHAR(255) CHARACTER SET utf8 COLLATE utf8_unicode_ci;
To see full details about your table:
SHOW FULL COLUMNS FROM users;