I am using Navicat 11.0.8 and I am trying to insert values into a table using a query. The INSERT itself works, but the character encoding is messed up!
As you can see in my code, the table is 'table' and I am inserting an ID, a VNUM, and a NAME. The VNUM is '体字' and the NAME is 'Versão'.
INSERT INTO `table` VALUES ('1', '体字', 'Versão');
Instead of showing '体字' in the VNUM and 'Versão' in the NAME, it shows '体å—' and 'VersÃ£o'.
This is very bad for me because I am trying to insert more than 5000 rows with a lot of information.
I have tried to set the character encoding of the table using these commands:
ALTER TABLE table CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
&
ALTER TABLE table CONVERT TO CHARACTER SET big5 COLLATE big5_chinese_ci;
&
ALTER TABLE table CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
I have also tried to delete my table and create a new one with the character encoding already set to UTF-8, then insert the values:
SET FOREIGN_KEY_CHECKS=0;
DROP TABLE IF EXISTS `table`;
CREATE TABLE `table` (
`vnum` int(11) unsigned NOT NULL default '0',
`name` varbinary(200) NOT NULL default 'Noname',
`locale_name` varbinary(24) NOT NULL default 'Noname'
) ENGINE=MyISAM DEFAULT CHARSET=big5 COLLATE=big5_chinese_ci;
INSERT INTO `table` VALUES ('1', '体字', 'Versão');
It still shows '体å—' and 'VersÃ£o'.
If I edit the table manually, it shows correctly! But I am not going to edit 5000+ rows...
"it shows '体å—' and 'VersÃ£o'" -- Sounds like you had SET NAMES latin1 in effect. Do
SET NAMES utf8;
after connecting and before INSERTing. That tells mysqld what encoding the client is using.
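A minimal sketch of the fix, using the table and values from the question (SET NAMES must run on the same connection, before the INSERT):
SET NAMES utf8; -- tell mysqld this client sends and expects utf8 bytes
INSERT INTO `table` VALUES ('1', '体字', 'Versão');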
Verify the data stored by doing SELECT col, HEX(col) ... The utf8 (or utf8mb4) hex for 体字 is E4BD93E5AD97. Interpreting those same bytes as latin1 gives you '体å—'.
utf8 can handle those, plus ã; I don't know if big5 can.
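For example, a quick check against the question's table (column names taken from the CREATE TABLE above):
SELECT vnum, name, HEX(name) FROM `table`;
-- Expect E4BD93E5AD97 for 体字 if the utf8 bytes were stored correctly.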
Actually, I suggest you use utf8mb4 instead of utf8. This is in case you encounter some of the 4-byte Chinese characters.
If you still have latin1 columns that need changing to utf8mb4, see my blog, which discusses the "2-step ALTER", but using BINARY or BLOB (not big5) as the intermediate.
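A sketch of that 2-step ALTER, using hypothetical table/column names t and c, and assuming the column is a latin1 VARCHAR(100) whose bytes are really utf8:
ALTER TABLE t MODIFY c VARBINARY(100); -- step 1: drop the wrong latin1 label without touching the bytes
ALTER TABLE t MODIFY c VARCHAR(100) CHARACTER SET utf8mb4; -- step 2: relabel the same bytes as utf8mb4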
Related
I tried to insert an Arabic word in MySQL, and it got inserted; when I displayed it, it was there, but there was warning 3720.
For context, here is the table:
create table SIGHTS ( S_no int not null PRIMARY KEY, S_name Nvarchar(30) not null);
and here is the inserted value:
insert into SIGHTS values (1,(N'كهف الهكبة'));
So, is it safe to ignore the warning, or what should I do with it?
It seems like you used the default CHARACTER SET "utf8mb4" and the COLLATE utf8mb4_0900_ai_ci when you created the database. For more information, check this link out.
When you create the database using the default CREATE statement, which is
CREATE DATABASE _dbase;
it will execute the following statement, adding the default CHARACTER SET:
CREATE DATABASE _dbase CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;
To change the CHARACTER SET to "utf8" and the COLLATE to "utf8_general_ci" or "utf8_unicode_ci", you need to use one of the following statements:
CREATE DATABASE _dbase CHARACTER SET utf8 COLLATE utf8_general_ci;
or
CREATE DATABASE _dbase CHARACTER SET utf8 COLLATE utf8_unicode_ci;
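To verify which character set and collation a database actually got (a quick check; _dbase is the placeholder name from above):
SHOW CREATE DATABASE _dbase;
-- or query the data dictionary:
SELECT DEFAULT_CHARACTER_SET_NAME, DEFAULT_COLLATION_NAME
FROM information_schema.SCHEMATA WHERE SCHEMA_NAME = '_dbase';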
After reading articles about how to properly store data inside a MySQL database, I have a fairly good understanding of how character sets work. Yet there are still some instances that I find hard to understand. Let me give a quick example: say I have a table with a varchar column using the latin1 charset (for example purposes):
CREATE TABLE `foobar` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`foo` varchar(100) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=latin1;
and I force-insert a latin1 character using its hex value (á, for instance, which would be E1):
SET NAMES 'latin1';
INSERT INTO foobar SET foo = UNHEX('E1');
Why does a SELECT query against this table under a latin1 connection fail to give me the value I am expecting? I get the replacement character instead. I would have expected it to just give back the value as it is, since the column is latin1, the connection itself is latin1, and the stored value is a valid latin1 character. Is MySQL trying to interpret E1 as a UTF8 value? Or am I missing something here?
Using a UTF8 connection (SET NAMES 'utf8') before selecting gives me the expected value of á, which I guess is correct since, from what I understand, if the connection charset is different from the column charset being selected, then MySQL reads the data as whatever the column charset was (so E1 gets parsed as a latin1 character, or á).
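One way to see what is physically stored, independent of the connection charset (a quick check against the foobar table above):
SELECT foo, HEX(foo) FROM foobar;
-- HEX() sidesteps charset conversion: E1 confirms the raw latin1 byte is stored,
-- regardless of how the client then renders the foo column itself.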
I am trying to convert a specific column in a table on my DB from latin1 character set with collation latin1_swedish_ci to utf8 with collation utf8_unicode_ci.
COLUMN: description, type: longtext, default not null
I tried the following commands on the column:
ALTER TABLE sample MODIFY description LONGBLOB NOT NULL ;
ALTER TABLE sample MODIFY description LONGTEXT CHARACTER SET utf8 NOT NULL COLLATE utf8_unicode_ci;
I also tried to alter the encoding WITHOUT changing to binary first. But the characters ended up being re-encoded incorrectly by the server.
And I keep getting an error regarding some characters:
Error Code: 1366. Incorrect string value: '\x92t hav...' for column 'longDesc' at row 803
It seems like some of the characters in my table aren't converting correctly.
How can I fix this issue?
\x92 implies that you have latin1 in the table now. The second ALTER is claiming that the bytes are in utf8 encoding. Hence, the error message.
Case 1: You need to change the LONGTEXT to utf8 because you plan to add rows with text that cannot be encoded in latin1.
For this case, use one of:
ALTER TABLE sample CONVERT TO CHARACTER SET utf8; -- converts all CHAR/TEXT columns in the table
ALTER TABLE sample MODIFY description ... CHARACTER SET utf8; -- converts just the one column
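For illustration, a fully spelled-out form of that single-column ALTER (the LONGTEXT, NOT NULL, and collation attributes here are assumptions taken from the question's column description):
ALTER TABLE sample MODIFY description LONGTEXT CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL;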
Case 2: The rest of the system is thinking utf8 and is confused by this column.
Well, I don't think it is confused. Conversions happen as needed.
I'm trying to convert some MySQL tables from latin1 to utf8. I'm using the following command, which seems to mostly work:
ALTER TABLE tablename CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
However, on one table I get an error about a duplicate key entry. This is caused by a unique index on a "name" field. It seems that when converting to utf8, any "special" characters are indexed as their plain English equivalents. For example, there is already a record with a name field value of "Dru"; when converting to utf8, a record with "Drü" is considered a duplicate. The same happens with "Patrick" and "Påtrìçk".
Here is how to reproduce the issue:
CREATE TABLE `example` ( `name` char(20) CHARACTER SET latin1 NOT NULL,
PRIMARY KEY (`name`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1;
INSERT INTO example (name) VALUES ('Drü'),('Dru'),('Patrick'),('Påtrìçk');
ALTER TABLE example convert to character set utf8 collate utf8_general_ci;
ERROR 1062 (23000): Duplicate entry 'Dru' for key 1
The reason why the strings 'Drü' and 'Dru' evaluate as the same is that in the utf8_general_ci collation, they count as "the same". The purpose of a collation for a character set is to provide a set of rules as to when strings are the same, when one sorts before the other, and so on.
If you want a different set of comparison rules, you need to choose a different collation. You can see the available collations for the utf8 character set by issuing SHOW COLLATION LIKE 'utf8%'. There are a bunch of collations intended for text that is mostly in a specific language; there is also the utf8_bin collation which compares all strings as binary strings (i.e. compares them as sequences of 0s and 1s).
UTF8_GENERAL_CI is accent-insensitive.
Use UTF8_BIN or a language-specific collation.
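For example, to redo the conversion from the reproduction above with a binary collation (a sketch; utf8_bin compares raw bytes, so 'Drü' and 'Dru' remain distinct and the ALTER succeeds):
SHOW COLLATION LIKE 'utf8%'; -- list the available utf8 collations
ALTER TABLE example CONVERT TO CHARACTER SET utf8 COLLATE utf8_bin;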
I have a table like this, where one column is latin1, the other is UTF-8:
Create Table: CREATE TABLE `names` (
`name_english` varchar(255) character set latin1 NOT NULL,
`name_chinese` varchar(255) character set utf8 default NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1
When I do an insert, I have to type _utf8 before values being inserted into UTF-8 columns:
insert into names set name_english = "hooey", name_chinese = _utf8"鬼佬";
However, since MySQL knows that name_chinese is a UTF-8 column, it should be able to use _utf8 automatically.
Is there any way to tell MySQL to use _utf8 automatically, so that when I'm programmatically making prepared statements, I don't have to worry about including it with the right parameters?
Why not use UTF-8 for the whole table?
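A sketch of that suggestion, reusing the names table from the question; once both columns and the connection are utf8, the _utf8 introducer becomes unnecessary:
ALTER TABLE names CONVERT TO CHARACTER SET utf8; -- make every text column utf8
SET NAMES utf8; -- and declare the client charset to match
INSERT INTO names (name_english, name_chinese) VALUES ('hooey', '鬼佬');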