I use MySQL 5.7, but I do not know how to configure it to display Vietnamese correctly.
I created the database with:
CREATE DATABASE brt
DEFAULT CHARACTER SET utf8 COLLATE utf8_vietnamese_ci;
After that I used "LOAD DATA LOCAL INFILE" to load data written in Vietnamese into the database.
But Vietnamese characters are often displayed incorrectly in the query results.
The detailed code and files are in my GitHub repository at the following link:
https://github.com/fivermori/mysql
Please show me how to solve this. Thanks.
As @ysth suggests, using utf8mb4 will save you a world of trouble going forward. If you change your CREATE statements to look like this, you should be good:
CREATE DATABASE `brt` DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
USE `brt`;
DROP TABLE IF EXISTS `fixedAssets`;
CREATE TABLE IF NOT EXISTS `fixedAssets` (
`id` int(11) UNSIGNED NOT NULL AUTO_INCREMENT,
`code` varchar(250) UNIQUE NOT NULL DEFAULT '',
`name` varchar(250) NOT NULL DEFAULT '',
`type` varchar(250) NOT NULL DEFAULT '',
`createdDate` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
CREATE INDEX `idx_fa_main` ON `fixedAssets` (`code`);
I've tested this using the data that you provided and get the expected query results:
name
----------------------------------------------------------------
Mould Terminal box cover BN90/112 612536030 39 tháng
Mould W2206-045-9911-VN #3 ( 43 tháng)
Mould Flange BN90/B5 614260271 ( 43 tháng)
Mould 151*1237PH04pC11 ( 10 năm)
Transfer 24221 - 2112 ( sửa chữa nhà xưởng Space T 07-2016 ) BR2
Using the utf8mb4 character set and utf8mb4_unicode_ci collation is usually one of the simpler ways to ensure that your database can correctly display everything from plain ASCII to modern emoji and everything in between.
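To make the difference between the character sets concrete, here is a small Python sketch. Python is used only to inspect bytes outside MySQL; the helper names and sample strings are illustrative assumptions, not part of the original setup. It counts how many UTF-8 bytes each character needs: Vietnamese letters need 2 or 3 bytes, while emoji need 4, which is the case that MySQL's older utf8 character set cannot store.

```python
def max_bytes_per_char(text: str) -> int:
    """Return the largest number of UTF-8 bytes used by any single character."""
    return max((len(ch.encode("utf-8")) for ch in text), default=0)


def fits_in_utf8mb3(text: str) -> bool:
    """True if every character fits in 3 bytes (MySQL's older `utf8` charset)."""
    return max_bytes_per_char(text) <= 3


# Illustrative samples; escapes are used so the code points are unambiguous.
samples = {
    "ascii": "Mould",
    "vietnamese_2_byte": "th\u00e1ng",   # U+00E1, a with acute
    "vietnamese_3_byte": "s\u1eeda",     # U+1EED, u with horn and hook above
    "emoji": "\U0001F44D",               # thumbs up
}

for label, text in samples.items():
    print(label, max_bytes_per_char(text), fits_in_utf8mb3(text))
```

Under these assumptions the Vietnamese samples fit in either character set, so the display problem in the question is more likely a mismatch somewhere between file, connection, and column encodings; utf8mb4 simply removes the 4-byte limitation for the future.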
Disclaimer: YES, I've done my search on Stack Overflow, and NO, I couldn't find an answer for this case.
I'm migrating data from a forum which has some legacy in its MySQL database. One of the issues is the storage of emoji.
Donor database:
-- Server: 5.5.41-MariaDB
CREATE TABLE `forumtopicresponse` (
`id` int(10) UNSIGNED NOT NULL,
`topicid` int(10) UNSIGNED NOT NULL DEFAULT '0',
`userid` int(10) UNSIGNED NOT NULL DEFAULT '0',
`message` text NOT NULL,
`created` int(10) UNSIGNED NOT NULL DEFAULT '0'
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
In the message column I've got a message like this: Success!ðŸ‘ðŸ‘, which displays as "Success!👍👍"
Laravel target database:
-- Server: MySQL 5.7.x
CREATE TABLE `answers` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`topic_id` int(10) unsigned NOT NULL,
`user_id` int(10) unsigned NOT NULL,
`body` text CHARACTER SET utf8mb4,
`created_at` timestamp NULL DEFAULT NULL,
`updated_at` timestamp NULL DEFAULT NULL,
...keys & indexes
) ENGINE=InnoDB AUTO_INCREMENT=1254419 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
In HTML the document has a <meta charset="utf-8"> and to display the field, I'm using
{!! nl2br(e($answer->body)) !!}
And with this it just displays as Success!ðŸ‘ðŸ‘ and not the emoji.
Question
How can I migrate this data CLEAN and UTF-8 valid into my fresh database? I think I need some utf encoding, but can't figure out which.
UPDATE! THE SOLUTION
Got it fixed. The only solution was to alter the table in the Donor database.
ALTER TABLE forumtopicresponse CHANGE message message LONGTEXT CHARACTER SET latin1;
ALTER TABLE forumtopicresponse CHANGE message message LONGBLOB;
Do NOT change the LONGBLOB back to LONGTEXT afterwards: I lost data that way.
When I migrate the LONGBLOB data to the Laravel target database, everything gets migrated correctly: all special characters and emoji are fixed and in UTF-8.
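The two ALTER statements appear to work because converting to latin1 turns each mis-decoded character back into a single byte, and switching to a BLOB then drops the character-set label and leaves the raw UTF-8 bytes. A rough Python sketch of the same idea follows. It assumes the garbling came from UTF-8 bytes being read as Windows-1252 (Python's `cp1252`, used here as a stand-in for MySQL's latin1); the function names are hypothetical.

```python
def make_mojibake(text: str) -> str:
    """Simulate the damage: UTF-8 bytes wrongly decoded as cp1252."""
    return text.encode("utf-8").decode("cp1252")


def repair_mojibake(garbled: str) -> str:
    """Reverse it: back to single bytes (cp1252), then decode as UTF-8."""
    return garbled.encode("cp1252").decode("utf-8")


original = "th\u00e1ng"              # Vietnamese "thang" with a-acute
garbled = make_mojibake(original)    # two characters now stand in for one
restored = repair_mojibake(garbled)

print(garbled)
print(restored == original)
```

One limitation of this sketch: Python's `cp1252` codec rejects five byte values (0x81, 0x8D, 0x8F, 0x90, 0x9D) that MySQL's latin1 tolerates, so text containing those bytes, including the thumbs-up emoji, will raise an error here even though the ALTER route handled it.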
The Emoji 👍 is hex F09F918D. That is, it is a 4-byte string.
MySQL's CHARACTER SET utf8 handles UTF-8 sequences of at most 3 bytes, not 4-byte ones, which excludes most emoji and some Chinese characters.
When interpreted as latin1, those bytes display as ðŸ‘ (plus a 4th, but unprintable, character). Showing gibberish like that is called "Mojibake".
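The byte-level claim above can be checked with a short Python sketch. It assumes MySQL's latin1 behaves like Windows-1252 (`cp1252`) for the printable bytes; the fourth byte, 0x8D, has no mapping in Python's `cp1252` codec, so the sketch drops it with `errors="ignore"`.

```python
thumbs_up = "\U0001F44D"

# The emoji's UTF-8 encoding: four bytes.
raw = thumbs_up.encode("utf-8")
print(raw.hex().upper())
print(len(raw))

# The same bytes misread as cp1252; the unmapped 4th byte is skipped.
as_latin1 = raw.decode("cp1252", errors="ignore")
print(as_latin1)
print(len(as_latin1))
```

The three surviving characters are the familiar garbage seen in the question, which is a useful fingerprint: a 4-byte emoji that shows up as three or four accented or punctuation characters usually means UTF-8 bytes were labeled as latin1 somewhere along the way.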
So, you have 2 problems:
Need to change the storage to utf8mb4 so you can store the Emoji.
Need to announce to MySQL that your client is speaking UTF-8, not latin1.
See "Best Practice" in Trouble with UTF-8 characters; what I see is not what I stored
And also see UTF-8 all the way through
Here's my list of fixes, but you must first correctly identify which case you have. Applying the wrong fix makes things worse.
There may be a 3rd mistake -- in moving the data from 5.5 to 5.7. Please provide those details.
I am running into a bit of an issue.
I built a WordPress website locally using WAMP and everything seemed to be working fine, until I tried to import the MySQL database onto the new live site, where it gave an error:
"#1709 - Index column size too large. The maximum column size is 767 bytes."
Now I have found some answers to what may be causing this here:
MySQL error: The maximum column size is 767 bytes
And here:
mysql change innodb_large_prefix
And although I understand what needs to be implemented code-wise, I am none the wiser as to where the code actually needs to be placed.
Aside from importing, exporting, and editing the database credentials, I have never had to do anything else with MySQL, so it is all a bit foreign to me.
And though I am more than happy to look into it more deeply later, right now I just want my live site to be working.
Well, I figured it out: apparently I had to edit the SQL file itself and add ROW_FORMAT=DYNAMIC to the table options at the end of every CREATE TABLE statement that uses the InnoDB engine.
So I changed this:
CREATE TABLE `xxx` (
`visit_id` bigint(20) NOT NULL AUTO_INCREMENT,
`visitor_cookie` mediumtext NOT NULL,
`user_id` bigint(20) NOT NULL,
`subscriber_id` bigint(20) NOT NULL,
`url` mediumtext NOT NULL,
`ip` tinytext NOT NULL,
`date` datetime NOT NULL,
PRIMARY KEY (`visit_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
/*!40101 SET character_set_client = @saved_cs_client */;
Into
CREATE TABLE `xxx` (
`visit_id` bigint(20) NOT NULL AUTO_INCREMENT,
`visitor_cookie` mediumtext NOT NULL,
`user_id` bigint(20) NOT NULL,
`subscriber_id` bigint(20) NOT NULL,
`url` mediumtext NOT NULL,
`ip` tinytext NOT NULL,
`date` datetime NOT NULL,
PRIMARY KEY (`visit_id`)
) ROW_FORMAT=DYNAMIC ENGINE=InnoDB DEFAULT CHARSET=utf8;
/*!40101 SET character_set_client = @saved_cs_client */;
Then I re-imported the file onto the local server, did a new export to the live server, and the site is finally live.
I still find it a bit strange that MySQL doesn't automatically set the row format to dynamic once an index column exceeds the limit (767 bytes), and that it still works in the existing database even though it shouldn't. Maybe WAMP just has different environment settings than the live server.
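For context on why the limit bites: the 767 in the error message is a byte limit on an indexed column, not a character count, so the number of characters that fit depends on the character set. Below is a small arithmetic sketch in Python. The per-character byte widths are standard, but the 3072-byte figure for DYNAMIC row format (with large index prefixes enabled) is an assumption taken from commonly documented InnoDB limits, not from this post, so check it against your server version.

```python
# Maximum bytes one character can occupy in each MySQL character set.
MAX_BYTES_PER_CHAR = {"latin1": 1, "utf8": 3, "utf8mb4": 4}

# Assumed index key prefix limits in bytes per InnoDB row format.
INDEX_LIMIT_BYTES = {"COMPACT": 767, "DYNAMIC": 3072}


def max_indexable_chars(charset: str, row_format: str) -> int:
    """Longest fully indexable VARCHAR length, in characters."""
    return INDEX_LIMIT_BYTES[row_format] // MAX_BYTES_PER_CHAR[charset]


for row_format in INDEX_LIMIT_BYTES:
    for charset in MAX_BYTES_PER_CHAR:
        print(row_format, charset, max_indexable_chars(charset, row_format))
```

Under these assumptions, an indexed VARCHAR(255) fits within 767 bytes in utf8 but not in utf8mb4, which is one common way the same dump can import on one server and fail on another.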
Anyway thanks all!
I would like to store emoji in MySQL (version 5.7.18).
My table structure looks like this:
CREATE TABLE `message_message` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`message` longtext CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci,
`created_at` datetime(6) NOT NULL,
`is_read` tinyint(1) NOT NULL,
`chat_id` int(11) NOT NULL,
PRIMARY KEY (`id`)) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
I am trying to save emoji in the message field only, and I can see that they get saved as question marks (?☺️???).
Is there a way for me to read these values directly from the table? (I would actually like to see the emoji in the table viewer.) I am using Sequel Pro for viewing the table, if that matters.
The exact MySQL query that I am running:
INSERT INTO message_message(message, created_at, msg_sender_id, chat_id, is_read) VALUES ('💁👍', UTC_TIME(), 110, 164, False)
If I run a SELECT query on this table, the result looks like this:
+---------------------------------------------------------------------+
| message |
+---------------------------------------------------------------------+
| 😁 |
| 😁💁👍 |
| 💁👍 |
| 💁👍 |
| 💁👍 |
| 💁👍
Does this look like the data is stored correctly?
Apparently, your data is stored correctly.
You provided this string F09F9281F09F918D as a result for SELECT hex(message) for the data inserted with
INSERT INTO message_message(message, created_at, msg_sender_id, chat_id, is_read) VALUES ('💁👍', UTC_TIME(), 110, 164, False)
If you check the UTF-8 encoding of both emoji:
F0 9F 92 81 for 💁
F0 9F 91 8D for 👍
you will find that they exactly match what you already have.
This means your code is correct. If the emoji still look wrong in your GUI application, that is a configuration or Unicode-support issue in the GUI application, which is somewhat off topic for Stack Overflow.
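The same check can be reproduced outside MySQL. This Python sketch decodes the hex string returned by SELECT hex(message) and lists the resulting code points, which can be compared with the references below. The variable names are illustrative.

```python
# Hex string as returned by SELECT hex(message) in the question.
hex_from_mysql = "F09F9281F09F918D"

# Turn the hex digits back into bytes and decode them as UTF-8.
decoded = bytes.fromhex(hex_from_mysql).decode("utf-8")

# List each character's Unicode code point for comparison.
code_points = [f"U+{ord(ch):04X}" for ch in decoded]

print(decoded)
print(code_points)
```

If the stored bytes had been damaged (for example, replaced by literal question marks), the hex would contain 3F bytes instead and this decode would not produce the two emoji.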
References:
https://unicode-table.com/en/1F481/
https://unicode-table.com/en/1F44D/
I think your table collation must be properly configured too:
CREATE TABLE `message_message` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`message` longtext CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci,
`created_at` datetime(6) NOT NULL,
`is_read` tinyint(1) NOT NULL,
`chat_id` int(11) NOT NULL,
PRIMARY KEY (`id`)) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin;
Make sure the table uses CHARACTER SET utf8mb4 COLLATE utf8mb4_bin. To update it (in your case), the query would be:
ALTER TABLE message_message CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;
Also make sure your database's default character set is utf8mb4. To check it, the query would be:
SELECT default_character_set_name FROM information_schema.SCHEMATA S WHERE schema_name = 'DBNAME';
I have a MySQL Database (myDB; ~2GB in size) with 4 Tables (tab1, tab2, tab3, tab4). Currently, the data that is stored in the tables was added using the charset ISO-8859-1 (i.e. Latin-1).
I'd like to convert the data in all tables to UTF-8 and use UTF-8 as default charset of the tables/database/columns.
On https://blogs.harvard.edu/djcp/2010/01/convert-mysql-database-from-latin1-to-utf8-the-right-way/ I found an interesting approach:
mysqldump myDB | sed 's/CHARSET=latin1/CHARSET=utf8/g' | iconv -f latin1 -t utf8 | mysql myDB2
I haven't tried it yet, but are there any caveats?
Is there a way to do it directly in the MySQL shell?
[EDIT:]
Result of SHOW CREATE TABLE messages; after running ALTER TABLE messages CONVERT TO CHARACTER SET utf8mb4;
CREATE TABLE `messages` (
`number` int(11) NOT NULL AUTO_INCREMENT,
`status` enum('0','1','2') NOT NULL DEFAULT '1',
`user` varchar(30) NOT NULL DEFAULT '',
`comment` varchar(250) NOT NULL DEFAULT '',
`text` mediumtext NOT NULL,
`date` int(11) NOT NULL DEFAULT '0',
PRIMARY KEY (`number`),
KEY `index_user_status_date` (`user`,`status`,`date`)
) ENGINE=InnoDB AUTO_INCREMENT=3285217 DEFAULT CHARSET=utf8mb4
It is possible to convert the tables. But then you need to convert the application, too.
ALTER TABLE tab1 CONVERT TO CHARACTER SET utf8mb4;
etc.
To check, do SHOW CREATE TABLE tab1; it should show DEFAULT CHARSET=utf8mb4, and no column should still carry a different CHARACTER SET.
Note: There are 3 things going on:
Convert the encoding of the data in any VARCHAR and TEXT columns.
Change the CHARACTER SET for such columns.
Change the DEFAULT CHARACTER SET for the table -- this comes into play if you add any new columns without specifying a charset.
The application...
When you connect from a client to MySQL, you need to tell it, in an app-specific way or via SET NAMES, the encoding of the bytes in the client. This does not have to be the same as the column declarations; conversion will occur during INSERT and SELECT, if necessary.
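Here is a rough Python illustration of that conversion, and of what goes wrong when the declared client encoding does not match the bytes actually sent. The function `server_store` is a hypothetical model of the server's behavior, not a MySQL API, and `cp1252` stands in for MySQL's latin1.

```python
def server_store(client_bytes: bytes, declared_client_charset: str,
                 column_charset: str) -> bytes:
    """Model of an INSERT: interpret the incoming bytes using the charset the
    client declared, then re-encode the text for the column's charset."""
    text = client_bytes.decode(declared_client_charset)
    return text.encode(column_charset)


word = "th\u00e1ng"  # contains U+00E1, a with acute

# Client sends UTF-8 and declares UTF-8: bytes are stored unchanged.
stored_ok = server_store(word.encode("utf-8"), "utf-8", "utf-8")

# Client sends latin1 and declares latin1: converted correctly on the way in.
stored_converted = server_store(word.encode("cp1252"), "cp1252", "utf-8")

# Client sends UTF-8 but declares latin1: each byte of the accented letter is
# treated as its own character, producing "double encoding".
stored_double = server_store(word.encode("utf-8"), "cp1252", "utf-8")

print(stored_ok.hex(), stored_converted.hex(), stored_double.hex())
```

The first two cases end up with identical, correct bytes even though the clients used different encodings; only the mismatched declaration corrupts the data. That is why testing the full path (insert, select, display) on a copy is worth the effort before converting the real tables.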
I recommend you take a backup and/or test a copy of one of the tables. Be sure to go all the way through -- insert, select, display, etc.
I am trying to save a URL in a MySQL database and read it back in my application. It gets saved properly:
http://i.>/00/s/NTAwWDUwMA==/$(KGrHqZHJC4E8fW,EPnUBPN1zoBtIQ~~60_1.JPG?set_id=8800005007
but on retrieval, every '.' in the URL (along with other punctuation such as '(' and ',') is replaced by an unknown-character symbol:
http://i�domain�com/00/s/NTAwWDUwMA==/$�KGrHqZHJC4E8fW�EPnUBPN1zoBtIQ~~60_1�JPG?set_id=8800005007
Is there a way to remove those special characters? I am attaching the CREATE script for the table.
I'm getting the URL from the result set:
rs.getString(image)
delimiter $$
CREATE TABLE `livedeals` (
`ItemID` bigint(20) NOT NULL,
`category` varchar(200) CHARACTER SET latin1 NOT NULL,
`deal_like` int(4) NOT NULL,
`deal_dislike` int(4) NOT NULL,
`image` varchar(200) CHARACTER SET armscii8 COLLATE armscii8_bin NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8$$
Any help would be appreciated.
Thanks.
If for some reason you can't change the character set of the table, then you could get that field the following way:
SELECT CAST(image AS CHAR CHARACTER SET utf8) AS image2 FROM livedeals