BLOB not storing utf-8 (Chinese) in MySQL - mysql

I have a table like this
CREATE TABLE account_data (
id BIGINT NOT NULL,
data BLOB NOT NULL,
PRIMARY KEY (id)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8
However, MySQL is not storing it as Chinese, but as garbage values
like «å«¢ å æææ æ­¶ç·è
Everything I have read says the CHARSET must be utf8, which is already the case here.
The MySQL version I am using is 5.6.14.
I tried
ALTER TABLE account_data MODIFY data BLOB CHARACTER SET utf8 COLLATE utf8_unicode_ci;
but for some reason MySQL reports a syntax error at BLOB.
If I do
INSERT INTO account_data (id, data)
VALUES (5952638508182497, "123456偟 滭滹漇 嶕憱撏 齞齝囃 熤熡");
and check the Text tab of the value viewer in MySQL Workbench, I can see 123456, but the Chinese characters show up as garbage.
Thanks

You can try setting the character set of both the database and the column.
ALTER DATABASE <database_name> CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE <table_name> MODIFY <column_name> VARCHAR(255) CHARACTER SET utf8 COLLATE utf8_unicode_ci;
Hope this can help you!

For Chinese, you want CHARACTER SET utf8mb4, and optionally some COLLATION starting with utf8mb4_; the default is probably fine.
BLOB should work as you described it, but probably the insertion connection and the reading connection were configured differently. You mentioned Workbench; what charset is set up in it? Was that used for both reading and writing?
The sample gibberish you provided (which does not map correctly into any Chinese) looks like Mojibake, which is usually caused by having latin1 established for the connection. (This was the old default.)
You mentioned 5.6.14; was that used for both inserting and selecting?
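The Mojibake mechanism described above can be reproduced outside MySQL. The following is a minimal Python sketch, not the server's actual code path: it assumes Python's "latin-1" codec as a stand-in for MySQL's latin1 (which is closer to cp1252), and the variable names are illustrative only.

```python
# Sketch of how Mojibake arises: UTF-8 bytes are read back as if they were latin1.
original = "偟"                           # one character from the question's sample
utf8_bytes = original.encode("utf-8")    # a CJK character in the BMP takes 3 bytes
garbled = utf8_bytes.decode("latin-1")   # each byte is misread as one latin1 character

# The damage is reversible as long as the bytes themselves were not altered.
repaired = garbled.encode("latin-1").decode("utf-8")

print(len(utf8_bytes), len(garbled))     # one character turned into three
print(repaired == original)
```

If the round trip restores the text, the stored bytes are probably intact and only the connection charset is wrong.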

Related

Incorrect string value for Cangjie character using MySQL utf8 collation (4 byte support)

I have an application that is saving user input to a table in my database. The database was originally set to utf8 (MySQL v5.7), but from what I have read, MySQL's utf8 character set supports only 3 bytes per character, and an upgrade to utf8mb4 is needed for 4-byte support.
I'm currently running some testing by saving text in Cangjie and for the most part it seems fine, however when trying to save the following character '𤍢', I receive the following error:
'Incorrect string value: \'\\xF0\\xA4\\x8D\\xA2\\xE5\\x8F...\' for column \'content\' at row 1'
I upgraded to utf8mb4 on the database, table and column level but still saw the error.
I tried manually inserting the content at the DB level as well (rather than through the application) and got the same error, so I know it's not an implementation problem.
Can anyone suggest a reason why this might be the case? I thought utf8mb4 would have covered this
SOLVED:
I used the following set of commands (modified to meet my needs) to update the default collation: Source: https://mathiasbynens.be/notes/mysql-utf8mb4
# For each database:
ALTER DATABASE database_name CHARACTER SET = utf8mb4 COLLATE = utf8mb4_unicode_ci;
# For each table:
ALTER TABLE table_name CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
# For each column:
ALTER TABLE table_name CHANGE column_name column_name VARCHAR(191) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
# (Don't blindly copy-paste this! The exact statement depends on the column type, maximum length, and other properties. The above line is just an example for a `VARCHAR` column.)
This alone did not work. I also had to run the following to update several session variables, such as character_set_client and character_set_connection:
set names utf8mb4 collate utf8mb4_unicode_ci;
This was just for the connection I had open to the database. The client-side code actually works after the initial charset update.
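As a quick check of why this character needs utf8mb4, the first four bytes quoted in the error message can be decoded on their own. This is a small Python sketch for illustration; the variable names are not from the original post.

```python
# The first four bytes quoted in the "Incorrect string value" error message.
raw = b"\xf0\xa4\x8d\xa2"

char = raw.decode("utf-8")      # the four bytes form a single character
code_point = ord(char)

print(hex(code_point))          # the Unicode code point of that character
print(code_point > 0xFFFF)      # True means it lies outside the Basic Multilingual Plane
print(len(char.encode("utf-8")))
```

A code point above U+FFFF always takes four bytes in UTF-8, which is more than MySQL's 3-byte utf8 character set can hold.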

How to store weird chars in the correct way in a MySQL DB

I need to store some data in a MySQL table, but I have problems with these kinds of characters: "ā" "æ" "ō" "ĕ" (and so on)
Until now I had this data stored in a SQLite database and everything worked well, but now I'm trying to export it to a MySQL DB, and the unusual characters are not stored correctly, even though I tried different character encodings (UTF-8, UTF-16, latin, and so on).
Does anyone know the correct way to do so?
thanks a lot!!
utf8 needs to be established in about 4 places.
The column(s) in the database -- Use SHOW CREATE TABLE to verify that they are explicitly set to utf8, or defaulted from the table definition. (It is not enough to change the database default.)
The connection between the client and the server. See SET NAMES utf8.
The bytes you have -- the data you send must itself be encoded as UTF-8.
If you are displaying the text in a web page, check the <meta> tag.
Either you can switch to BLOB datatype, or if you insist on using TEXT/VARCHAR/CHAR then you need to change charset of your table and database as shown below.
CREATE DATABASE mydbname
CHARACTER SET utf8
COLLATE utf8_general_ci;
USE mydbname;
CREATE TABLE `mytable` (
`data` varchar(200) COLLATE utf8_unicode_ci NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
If you already have a database with utf8 charset and collation set to utf8_general_ci then you can simply alter your table as mentioned below:
ALTER TABLE `mytable` CHANGE `data` `data` VARCHAR(100)
CHARSET utf8 COLLATE utf8_general_ci DEFAULT '' NOT NULL;

Changing character set of MySQL tables

I have a MySQL table CHINESE with DEFAULT CHARSET=latin1, and it has a column NAME with CHARACTER SET latin1. The table holds a huge amount of data, around a million rows. I want to execute the following commands on my database:
ALTER DATABASE <DATABASE_NAME> DEFAULT CHARACTER SET utf8
ALTER TABLE CHINESE DEFAULT CHARACTER SET utf8
ALTER TABLE CHINESE MODIFY NAME VARCHAR(30) CHARACTER SET utf8
Given the huge amount of data stored in this database, should I run these commands? Will they lock the database in any way?
I am using Java to query and insert values in database. Will appending ?useUnicode=yes&characterEncoding=UTF-8 in URI string help me?
It will take a long time.
I think the best approach is to export the table to a .sql file, do the transcoding there (replacing latin1 with utf8), and then import it into a temporary table. Finally, swap the names of the two tables.

phpmyadmin MySQL data stored as ascii values in certain table

My host has just updated phpmyadmin to version 4.0.3. I don't know if it is related to the following problem.
I have a table 'users' which stores user data for the site, and all data is now being stored as numbers. Where I had a username of 'rich', it is now '72696368', which is the hexadecimal form of its ASCII codes.
Any ideas why this might have happened? I have a lot of tables and have checked them all, it is only the users table that has been modified. It is not critical as I can still log in and accept new users etc but I would like to know why this is happening.
Thanks a lot
EDIT: The collation is utf8_general_ci
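The value quoted in the question can be checked by hand. This short Python sketch, added for illustration, shows that the digits are the hexadecimal byte values of the original username; it does not establish why phpMyAdmin displays them this way.

```python
# The value shown in phpMyAdmin, as quoted in the question.
stored = "72696368"

# Interpret each pair of hex digits as one byte, then read the bytes as ASCII.
decoded = bytes.fromhex(stored).decode("ascii")
print(decoded)

# Going the other way reproduces the stored value.
reencoded = "rich".encode("ascii").hex()
print(reencoded == stored)
```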
Try adding this snippet to the script you are running, on the line where you create the schema:
DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci
for example :
CREATE SCHEMA IF NOT EXISTS `my_schema` DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;

How to change the default collation of a table?

create table check2(f1 varchar(20),f2 varchar(20));
creates a table with the default collation latin1_general_ci;
alter table check2 collate latin1_general_cs;
show full columns from check2;
shows the individual collation of the columns as 'latin1_general_ci'.
Then what is the effect of the alter table command?
To change the default character set and collation of a table including those of existing columns (note the convert to clause):
alter table <some_table> convert to character set utf8mb4 collate utf8mb4_unicode_ci;
Edited the answer, thanks to the prompting of some comments:
Should avoid recommending utf8. It's almost never what you want, and often leads to unexpected messes. The utf8 character set is not fully compatible with UTF-8. The utf8mb4 character set is what you want if you want UTF-8. – Rich Remer Mar 28 '18 at 23:41
and
That seems quite important, glad I read the comments and thanks @RichRemer. Nikki, I think you should edit that in your answer considering how many views this gets. See here https://dev.mysql.com/doc/refman/8.0/en/charset-unicode-utf8.html and here What is the difference between utf8mb4 and utf8 charsets in MySQL? – Paulpro Mar 12 at 17:46
MySQL has 4 levels of collation: server, database, table, column.
If you change the collation of the server, database or table, you don't change the setting for each column, but you change the default collations.
E.g if you change the default collation of a database, each new table you create in that database will use that collation, and if you change the default collation of a table, each column you create in that table will get that collation.
It sets the default collation for the table; if you create a new column, that should be collated with latin1_general_cs -- I think. Try specifying the collation for the individual column and see if that works. MySQL has some really bizarre behavior in regards to the way it handles this.
You may need to change the schema, not only the table:
ALTER SCHEMA `<database name>` DEFAULT CHARACTER SET utf8mb4 DEFAULT COLLATE utf8mb4_unicode_ci ;
As Rich said, use utf8mb4.
(MariaDB 10)