MySQL - What's utf8_general_mysql500_ci? - mysql

I just saw that MySQL 5.5 offers utf8_general_mysql500_ci as collation.
How does it differ from other collations like utf8_general_ci?
Should I use utf8_general_mysql500_ci instead?

As documented under Changes in MySQL 5.5.21:
New utf8_general_mysql500_ci and ucs2_general_mysql500_ci collations have been added that preserve the behavior of utf8_general_ci and ucs2_general_ci from versions of MySQL previous to 5.1.24. Bug #27877 corrected an error in the original collations but introduced an incompatibility for columns that contain German 'ß' LATIN SMALL LETTER SHARP S. (As a result of the fix, that character compares equal to characters with which it previously compared different.) A symptom of the problem after upgrading to MySQL 5.1.24 or newer from a version older than 5.1.24 is that CHECK TABLE produces this error:
Table upgrade required.
Please do "REPAIR TABLE `t`" or dump/reload to fix it!
Unfortunately, REPAIR TABLE could not fix the problem. The new collations permit older tables created before MySQL 5.1.24 to be upgraded to current versions of MySQL.
To convert an affected table after a binary upgrade that leaves the table files in place, alter the table to use the new collation. Suppose that the table t1 contains one or more problematic utf8 columns. To convert the table at the table level, use a statement like this:
ALTER TABLE t1
CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_mysql500_ci;
To apply the change on a column-specific basis, use a statement like this (be sure to repeat the column definition as originally specified except for the COLLATE clause):
ALTER TABLE t1
MODIFY c1 CHAR(N) CHARACTER SET utf8 COLLATE utf8_general_mysql500_ci;
To upgrade the table using a dump and reload procedure, dump the table using mysqldump, modify the CREATE TABLE statement in the dump file to use the new collation, and reload the table.
After making the appropriate changes, CHECK TABLE should report no error.
For more information, see Checking Whether Tables or Indexes Must Be Rebuilt, and Rebuilding or Repairing Tables or Indexes. (Bug #43593, Bug #11752408)
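The behavioral difference described in the release note can be observed directly by comparing 'ß' with 's' under each collation. This is only a minimal sketch, assuming a MySQL 5.5.21 or later server of that era and a utf8 connection; according to the note, the two result columns are expected to differ.

```sql
-- Assumes the connection character set is utf8 so the literals can take utf8 collations.
SET NAMES utf8;

-- Compare sharp s with plain s under the corrected and the preserved (pre-5.1.24) collation.
SELECT 'ß' = 's' COLLATE utf8_general_ci          AS general_ci_equal,
       'ß' = 's' COLLATE utf8_general_mysql500_ci AS mysql500_ci_equal;
```

Unless you are upgrading tables created before 5.1.24, nothing in the note suggests switching to the mysql500 collation; it exists for compatibility.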

Related

Collation change utf8mb4_unicode_ci to utf8mb4_general_ci

For my databases, I used utf8mb4_unicode_ci with utf8mb4 character set as a default. This was a mistake and the folks who are using the databases I created are complaining about the collation. I need to convert it to utf8mb4_general_ci. Am I able to get away with just changing the DB using an alter statement such as:
ALTER DATABASE `#{database}` CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;
Or will I need to change each individual table and deal with the columns, even though both collations belong to the same character set? I can't seem to find a definitive answer on this. I'm using MySQL 5.7.2x.
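utf8mb4_unicode_520_ci is better than either of the collations you mentioned: it is based on version 5.2.0 of the Unicode Collation Algorithm, whereas utf8mb4_unicode_ci is based on 4.0.0 and utf8mb4_general_ci is a simplified collation that trades accuracy for speed.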
Why are they complaining? Perhaps JOINs are failing to use indexes. I would argue with them that the old tables should be changed.
ALTER DATABASE only sets up defaults for future tables. It will not do what you need.
ALTER TABLE ... CONVERT TO ... for each table is needed. See http://mysql.rjweb.org/doc.php/limits#767_limit_in_innodb_indexes for a similar ALTER. It provides a way to automatically generate all the ALTERs.
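One way to generate the per-table ALTERs is to query information_schema. The following is a sketch rather than the linked page's script; the schema name 'mydb' is a placeholder, and the target collation is the one the question asks for.

```sql
-- Emits one ALTER TABLE statement per base table whose default collation
-- is not yet the target. 'mydb' is a hypothetical schema name.
SELECT CONCAT('ALTER TABLE `', TABLE_SCHEMA, '`.`', TABLE_NAME,
              '` CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;') AS stmt
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'mydb'
  AND TABLE_TYPE = 'BASE TABLE'
  AND TABLE_COLLATION <> 'utf8mb4_general_ci';
```

TABLE_COLLATION reflects only the table default, so a table whose columns were overridden individually may be skipped by this filter; checking information_schema.COLUMNS as well would catch those. Review the generated statements before running them.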

(MySQL 5.7.19 AWS RDS) how to change table column character set without locking

I want to change a table's character set from 'utf8' to 'utf8mb4',
but each column has its own character set setting (utf8),
so I need to change the column character set to 'Table Default'. Locking is the problem:
the table has over 100,000,000 rows.
How can I change the column character set without locking the table?
"Character set" is the encoding of the characters in bytes.
"Collation" is how to sort characters.
An INDEX on a VARCHAR is sorted by its collation, so changing the collation of a column requires rebuilding an index -- a non-trivial operation.
The difference between utf8 and utf8mb4 is relatively minor, but I don't think MySQL (hence RDS) has made a special case of that.
ALTER TABLE t CONVERT TO CHARACTER SET utf8mb4; sounds like the operation that you desire. That requires ALGORITHM=COPY, so it is 'locking'.
Look into pt-online-schema-change and gh-ost as a way of altering a table, even when it needs to "copy". These are essentially non-blocking. However, I do not know if they can be used with RDS. Also, because of JOINs and other cases where one table may need to be consistent with another, those tools may not be practical.
Another approach... Add another column(s); change your code to use both the old and new column(s). Meanwhile, gradually copy the old values to the new column(s); when this is finished, change your code again -- this time to use the new column(s) instead of the old. At some later date, worry about dropping the dead column(s).
Recent versions of MySQL have made significant changes in the speed of ALTER, so be sure to study what version RDS is derived from. In 5.6, ADD COLUMN can use ALGORITHM=INPLACE; in 8.0, ALGORITHM=INSTANT. I think either of those is non-"locking" for your purposes. (DROP COLUMN is not cheap; the issues with JOIN and rebuilding indexes are still up in the air.)
If you try one of these techniques, I strongly recommend you build a table with at least a million rows and try out all the steps (alter add, join, recreate index, alter drop column, etc) to verify what parts are "fast enough" and/or "non-locking".
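The add-a-column approach described above might look roughly like this. It is a sketch under stated assumptions: the table t, its integer primary key id, and the columns name and name_new are hypothetical, and the batch size is arbitrary.

```sql
-- 1. Add the replacement column. Requesting ALGORITHM=INPLACE, LOCK=NONE makes the
--    statement fail with an error instead of silently falling back to a locking copy.
ALTER TABLE t
  ADD COLUMN name_new VARCHAR(255) CHARACTER SET utf8mb4 NULL,
  ALGORITHM=INPLACE, LOCK=NONE;

-- 2. Backfill in small primary-key ranges so each UPDATE holds row locks only briefly.
--    Repeat with the next range until the whole table is covered.
UPDATE t SET name_new = name WHERE id BETWEEN 1 AND 10000;
UPDATE t SET name_new = name WHERE id BETWEEN 10001 AND 20000;

-- 3. Count rows that still need copying (scans the table; run it sparingly).
SELECT COUNT(*) AS remaining
FROM t
WHERE name IS NOT NULL AND name_new IS NULL;
```

While the backfill runs, the application has to write both columns, as the answer notes; otherwise rows changed after their batch was copied will drift.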

How to tailor mySQL commands to MariaDB?

I want to run the following command in MySQL, but the phpMyAdmin console reports a syntax error that it says is related to the MariaDB server version.
Alter table page modify column page_title convert to character set latin1_general_ci
How do I tailor a MySQL script to MariaDB?
The syntax is indeed incorrect, but this is not specific to MariaDB; you would get an error with MySQL as well.
You are mixing up different operations. Either you want to change the whole table (all character columns), and then it is
ALTER TABLE page CONVERT TO CHARACTER SET <character set>
or you want to change the column, and then it is
ALTER TABLE page MODIFY COLUMN page_title <column type> CHARACTER SET <character set>
Please read the documentation carefully to make sure the command you choose does what you want; it is not always obvious.
Also, latin1_general_ci is a collation, not a character set (the character set is latin1), so you will get another error once you fix the syntax.
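Filled in for the question's table, the two forms could look like the following. The column type VARCHAR(255) NOT NULL is an assumption for illustration; in practice, repeat the column's actual definition as shown by SHOW CREATE TABLE page.

```sql
-- Whole table: converts every character column to latin1 with the latin1_general_ci collation.
ALTER TABLE page CONVERT TO CHARACTER SET latin1 COLLATE latin1_general_ci;

-- Single column: the full column definition must be restated (type assumed here).
ALTER TABLE page
  MODIFY COLUMN page_title VARCHAR(255)
  CHARACTER SET latin1 COLLATE latin1_general_ci NOT NULL;
```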

"Convert to character set" doesn't convert tables having only integer columns to specified character set

I am working on two servers with similar configurations, including the MySQL variables specific to character set and collation; both run MySQL server and client 5.6.x. By default all tables are in latin1, including tables with only integer columns. But when I run
ALTER TABLE `table_name` CONVERT TO CHARACTER SET `utf8` COLLATE `utf8_unicode_ci`
on all tables on each server, only one of the servers converts all tables to utf8.
What I already tried:
Changed the default database character set (character_set_database) to utf8 before running the command above.
A solution that worked for me (though I am still unsure why):
ALTER TABLE `table_name` CHARACTER SET = `utf8` COLLATE `utf8_unicode_ci`
Finally, there are two questions:
Why does CONVERT TO CHARACTER SET work on one server and not on the other?
Why does the solution that worked for me behave differently? It is similar to CONVERT TO CHARACTER SET; the only difference I have come across is that it doesn't implicitly convert all the columns to the specified character set.
Can someone please help me understand what is happening?
Thank you in advance.
IIRC, that was a bug that eventually was fixed. See bugs.mysql.com. (The bug probably existed since version 4.1, when CHARACTER SETs were really added.)
I prefer to be explicit in two places, thereby avoiding the issue you raise:
When doing CREATE TABLE, I explicitly say what CHARACTER SET I need. This avoids depending on the default established when the database was created, perhaps years ago.
When adding a column (ALTER TABLE ADD COLUMN ...), I check (via SHOW CREATE TABLE) to see if the table already has the desired charset. Even so, I might explicitly state CHARACTER SET for the column. Again, I don't trust the history of the table.
Note: I am performing these queries from explicit SQL, not from some UI that might be "helping" me.
Follow on
@HBK found http://bugs.mysql.com/bug.php?id=73153. From it, I suspect this is what 'should be' done by the user:
ALTER TABLE ...
CONVERT TO CHARACTER SET ...,
DEFAULT CHARACTER SET ...; -- Do this also
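With the names from the question filled in, the suggestion could be sketched as below. It is written as two statements to keep each step visible; this is an illustration of the idea, not a confirmed fix for the linked bug.

```sql
-- Convert existing character columns (a no-op for integer-only tables).
ALTER TABLE `table_name` CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;

-- Also set the table default explicitly, which covers tables with no character columns.
ALTER TABLE `table_name` DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci;

-- Verify: the DEFAULT CHARSET shown at the end of the output should now be utf8.
SHOW CREATE TABLE `table_name`;
```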

Mysql datetime DEFAULT CURRENT_TIMESTAMP error

1.
When I ran this MySQL statement on Windows, it ran properly:
CREATE TABLE New
(
    id bigint NOT NULL AUTO_INCREMENT,
    timeUp datetime DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (id)
);
But when I tried running this code on Linux I got an error:
#1067 - Invalid default value for 'time'
2.
On Windows, names are not case sensitive, e.g. New and new are considered the same. But on Linux they are case sensitive.
Configuration of Linux:
MySQL 5.5.33, phpMyAdmin: 4.0.5, PHP: 5.2.17
Configuration of Windows:
MySQL: 5.6.11, phpMyAdmin: 4.0.4.1, PHP: 5.5.0
Is there any way to make them common for both systems? Or any alternative approach?
The DEFAULT CURRENT_TIMESTAMP support for a DATETIME (datatype) was added in MySQL 5.6.
In 5.5 and earlier versions, this applied only to TIMESTAMP (datatype) columns.
It is possible to use a BEFORE INSERT trigger in 5.5 to assign a default value to a column.
DELIMITER $$
CREATE TRIGGER ...
BEFORE INSERT ON mytable
FOR EACH ROW
BEGIN
    -- Fill in the current time only when the INSERT did not supply a value
    IF NEW.mycol IS NULL THEN
        SET NEW.mycol = NOW();
    END IF;
END$$
DELIMITER ;
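If the column does not have to be a DATETIME, an alternative that works on both 5.5 and 5.6 is to declare it as TIMESTAMP, which accepts DEFAULT CURRENT_TIMESTAMP in either version. A sketch using the table from the question:

```sql
-- TIMESTAMP supports DEFAULT CURRENT_TIMESTAMP in MySQL 5.5 as well as 5.6.
CREATE TABLE New
(
    id bigint NOT NULL AUTO_INCREMENT,
    timeUp timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (id)
);
```

Keep in mind that TIMESTAMP has a narrower range than DATETIME and is converted to and from the session time zone, so it is not a drop-in replacement for every use.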
Case sensitivity (of queries against values stored in columns) is due to the collation used for the column. Collations ending in _ci are case insensitive. For example latin1_swedish_ci is case insensitive, but latin1_general_cs is case sensitive.
The output from SHOW CREATE TABLE foo will show the character set and collation for the character type columns. This is specified at a per-column level. The "default" specified at the table level applies to new columns added to the table when the new column definition doesn't specify a character set.
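The effect of the collation on comparisons can be checked without a table. This sketch assumes a latin1 connection, so that the string literals can take latin1 collations.

```sql
-- Assumes the connection character set is latin1.
SET NAMES latin1;

-- ci_match is expected to be 1 (case-insensitive), cs_match 0 (case-sensitive).
SELECT 'New' = 'new' COLLATE latin1_swedish_ci AS ci_match,
       'New' = 'new' COLLATE latin1_general_cs AS cs_match;
```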
UPDATE
Kaii pointed out that my answer regarding "case sensitivity" deals with values stored within columns, that is, whether a column containing the value "New" will be matched by a predicate like "t.col = 'new'".
See Kaii's answer regarding identifiers (e.g. table names) being handled differently (by default) on Windows than on Linux.
As the DEFAULT CURRENT_TIMESTAMP question is already answered, I will only respond to the case-sensitivity mismatch in table names between Windows and Linux.
On Windows, file systems are by default case-insensitive.
But on Linux and other *NIX like Operating Systems, they are case-sensitive by default.
The reason why you get a mismatch in behaviour here is the file system, as each table is created as a separate file and the filesystem handles case-sensitivity for you.
MySQL has a parameter to override this behaviour:
For example, on Unix, you can have two different tables named my_table
and MY_TABLE, but on Windows these two names are considered identical.
To avoid data transfer problems arising from lettercase of database or
table names, you have two options:
Use lower_case_table_names=1 on all systems. The main disadvantage with this is that when you use SHOW TABLES or SHOW DATABASES, you do
not see the names in their original lettercase.
Use lower_case_table_names=0 on Unix and lower_case_table_names=2 on Windows. This preserves the lettercase of database and table names.
The disadvantage of this is that you must ensure that your statements
always refer to your database and table names with the correct
lettercase on Windows. If you transfer your statements to Unix, where
lettercase is significant, they do not work if the lettercase is
incorrect.
Exception: If you are using InnoDB tables and you are trying to avoid these data transfer problems, you should set lower_case_table_names=1 on all platforms to force names to be converted to lowercase.
[...]
To avoid problems caused by such differences,
it is best to adopt a consistent convention, such as always creating
and referring to databases and tables using lowercase names. This
convention is recommended for maximum portability and ease of use.
This is an excerpt from the MySQL manual on the case sensitivity of identifiers.
If you want the column to default to the current time, you must change its datatype to TIMESTAMP;
a DATETIME column will only show the value that was inserted into the table.
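To see which mode each server currently uses, the variable can be inspected from any client. The option-file lines in the comment are the usual way to set it and are shown here as an assumption about your setup, not as part of the quoted manual text.

```sql
-- Shows the current setting (0, 1 or 2) on the server you are connected to.
SHOW VARIABLES LIKE 'lower_case_table_names';

-- The variable cannot be changed at runtime; it is set at server startup,
-- typically in the option file (my.cnf / my.ini):
--   [mysqld]
--   lower_case_table_names=1
```

Changing the value on a server that already holds mixed-case table names can make those tables unreachable, so rename them to lowercase first.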
http://dev.mysql.com/doc/refman/5.0/en/timestamp-initialization.html
http://dev.mysql.com/doc/refman/5.6/en/timestamp-initialization.html