I am running MySQL 5.6.
Some of the columns in a schema that I inherited from previous developers have an explicitly specified collate clause.
All explicitly specified collate clauses are the same as the database's default collate.
Is there any way to remove the explicit column collate clauses?
There should be no functional collating differences versus my current collate, but I want the following:
1) to get column definitions sans collate clauses when I request a
create table statement from mysql (I want to be able to compare
table creation scripts from a code repository with create table
statements obtained from different instances of the schema on
different mysql servers; the explicit column collate clauses are
only in some instances, but not others, which would require me to
use a more complex diff than a plain text diff)
2) to have the collate of these columns automatically change to
whatever is the new default database collate if I ever change it
1) is much more important than 2), however, since I will probably never change the collate again.
Thanks.
Instead of using SHOW CREATE TABLE, fetch the equivalent data from information_schema tables TABLES and COLUMNS.
Meanwhile, do you have an example of the COLLATION clause being present in some cases, but not in other cases?
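As a rough sketch of the information_schema approach suggested above, the following queries list each column's definition together with its character set and collation, so that instances can be compared without parsing SHOW CREATE TABLE output. The schema name `mydb` is a placeholder, not something taken from the question.

```sql
-- Column-level metadata for every table in one schema.
-- 'mydb' is a hypothetical schema name; substitute your own.
SELECT c.TABLE_NAME,
       c.ORDINAL_POSITION,
       c.COLUMN_NAME,
       c.COLUMN_TYPE,
       c.IS_NULLABLE,
       c.COLUMN_DEFAULT,
       c.CHARACTER_SET_NAME,   -- NULL for non-character columns
       c.COLLATION_NAME        -- NULL for non-character columns
FROM information_schema.COLUMNS AS c
WHERE c.TABLE_SCHEMA = 'mydb'
ORDER BY c.TABLE_NAME, c.ORDINAL_POSITION;

-- Table-level defaults, for comparison with the column-level values.
SELECT t.TABLE_NAME, t.ENGINE, t.TABLE_COLLATION
FROM information_schema.TABLES AS t
WHERE t.TABLE_SCHEMA = 'mydb'
  AND t.TABLE_TYPE = 'BASE TABLE'
ORDER BY t.TABLE_NAME;
```

Since COLLATION_NAME reports the effective collation of a character column whether or not the clause was written explicitly, the output should be the same on instances that differ only in that respect, which allows a plain text diff of the exported rows.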
Related
I noticed that my MODX database still uses latin1 character set in the database and in its tables. I would like to convert them to utf8mb4 and update collations accordingly.
Not totally sure how I should do this. Is this correct?
I alter every table to use utf8mb4 and utf8_unicode_ci?
I update the default character set and collation of the database.
Are indexes updated automatically? Is there something else I should be aware of?
A bonus question: what would be the most suitable latest utf8_unicode collation? Western languages should work.
Changing the default character set of a table or a schema does not change the data in the columns themselves; it only changes the default that applies the next time you add a table to the schema or add a column to the table.
To convert current data, alter one table at a time:
ALTER TABLE <name> CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;
The collation utf8mb4_0900_ai_ci is faster than earlier collations (at least according to the documentation), and it's the most current and accurate. This collation requires MySQL 8.0.
The most current collation in MySQL 5.7 is utf8mb4_unicode_520_ci.
A table-conversion like this rebuilds the indexes, so there's nothing else you need to do.
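A minimal sketch of the whole sequence, assuming a hypothetical database `modx_db` and table `site_content` (neither name comes from the question) and a pre-8.0 server, where utf8mb4_unicode_520_ci is the newest available choice:

```sql
-- 1. Change the default for tables created in the future.
--    This does not touch existing tables or data.
ALTER DATABASE modx_db
  CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_520_ci;

-- 2. Convert an existing table: its default, its character columns,
--    and the stored data. Repeat for each table.
ALTER TABLE modx_db.site_content
  CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_520_ci;

-- 3. Check the result: list any character column of that table
--    that is still not utf8mb4.
SELECT COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = 'modx_db'
  AND TABLE_NAME = 'site_content'
  AND CHARACTER_SET_NAME IS NOT NULL
  AND CHARACTER_SET_NAME <> 'utf8mb4';
```

If the last query returns no rows, every character column of that table has been converted. On MySQL 8.0 the collation name could be swapped for utf8mb4_0900_ai_ci.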
For my databases, I used utf8mb4_unicode_ci with utf8mb4 character set as a default. This was a mistake and the folks who are using the databases I created are complaining about the collation. I need to convert it to utf8mb4_general_ci. Am I able to get away with just changing the DB using an alter statement such as:
ALTER DATABASE `#{database}` CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;
Or will I need to change each individual table and deal with the columns, even though the charset is the same for the two collations? I can't seem to find definitive answers on this... I'm using MySQL 5.7.2x.
utf8mb4_unicode_520_ci is better than either of the collations you mentioned.
Why are they complaining? Perhaps JOINs are failing to use indexes. I would argue with them that the old tables should be changed.
ALTER DATABASE only sets up defaults for future tables. It will not do what you need.
ALTER TABLE ... CONVERT TO ... for each table is needed. See http://mysql.rjweb.org/doc.php/limits#767_limit_in_innodb_indexes for a similar ALTER. It provides a way to automatically generate all the ALTERs.
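One way to generate the statements, sketched here independently of the linked page (whose exact query may differ), is to build them from information_schema.TABLES. The schema name `mydb` and the target collation are assumptions to be replaced with your own values.

```sql
-- Emit one ALTER TABLE ... CONVERT TO statement per base table.
-- 'mydb' is a hypothetical schema name.
SELECT CONCAT(
         'ALTER TABLE `', TABLE_SCHEMA, '`.`', TABLE_NAME, '` ',
         'CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;'
       ) AS alter_stmt
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'mydb'
  AND TABLE_TYPE = 'BASE TABLE'   -- skip views
ORDER BY TABLE_NAME;
```

The query only prints the statements. Review the output, then run it as a script. It deliberately does not filter on TABLE_COLLATION, because a table whose default already matches may still contain columns with a different collation.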
I understand (a little) what the COLLATE clause does. Admittedly, I have not tested whether tables with different collations (or even different character sets) can coexist inside one database.
But I found that (at least in phpMyAdmin) when I create a database, I set its character set and collation, and if I omit the clause in CREATE TABLE ..., the table automatically gets the collation that was set when the database was created.
So, my question is: what is the point of the COLLATE clause in CREATE TABLE ... if it can be omitted? Is it recommended to include COLLATE in CREATE TABLE ..., or is it irrelevant?
In SQL Server, if you don't specify COLLATE, it defaults to whatever the database is set to. Thus there is no danger in not specifying it.
In MySQL the behavior is the same:
The table character set and collation are used as default values for
column definitions if the column character set and collation are not
specified in individual column definitions.
(MySQL Reference)
COLLATE is only needed when you want to specify a non-default value. If you only store English text, then you have nothing to worry about. If you store data from multiple languages, then you have to choose a suitable character set and collation to ensure that the characters are stored, compared, and sorted correctly.
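To illustrate the default behavior described above, here is a small sketch with a hypothetical table `demo_collate`: column `a` has no COLLATE clause and so takes the table default, while column `b` overrides it.

```sql
-- Hypothetical demo table; names are for illustration only.
CREATE TABLE demo_collate (
  a VARCHAR(10),                        -- inherits the table default
  b VARCHAR(10) COLLATE utf8mb4_bin     -- explicit, non-default collation
) DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;

-- The Collation column of the output shows the effective collation
-- of each character column.
SHOW FULL COLUMNS FROM demo_collate;

DROP TABLE demo_collate;
```

The same inheritance applies one level up: a table created without its own character set or collation options takes the database defaults.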
I'm writing a set of SQL statements in MySQL to create and modify a few tables. I need to get my output to match a document of sample output exactly (this is for school).
When I show my create table statements, all varchar columns need to look like this:
`name` varchar(10) COLLATE utf8_unicode_ci DEFAULT NULL,
but they weren't showing the collation. I tried changing the declaration to
name varchar(10) COLLATE utf8_unicode_ci DEFAULT NULL,
but this caused the output to show both the charset and collation, and I need to be showing just the collation. The sample output document was created on Unix, while I am on Windows, so this could be the source of the difference, but I need to know for sure.
Is there a way I can alter my queries to show collation or is this just a Unix Windows inconsistency?
To be honest, I doubt very much that anyone intends for you to obtain output that is identical verbatim; it's more likely that they require it to be identical semantically. However, you might play around with the table's default charset/collation to see whether that makes a difference to the output obtained from SHOW CREATE TABLE:
ALTER TABLE foo CHARACTER SET utf8 COLLATE utf8_bin;
Failing that, it could be a difference between MySQL versions.
I have a table called users with a column firstname whose collation is utf8_bin.
I want to know what happens under the hood when I execute a query like
SELECT * FROM `users` `a` WHERE `a`.`firstname` = 'sander' COLLATE utf8_general_ci
The column firstname isn't indexed. What happens to performance when the query is executed?
And what if the default collation were utf8_general_ci and the query were executed without COLLATE?
I want to know the impact this has on a big table (8 million+ records).
In this case, since the forced collation is defined over the same character set as the column's encoding, there won't be any performance impact.
However, if one forced a collation that is defined over a different character set, MySQL may have to transcode the column's values (which would have a performance impact); I think MySQL will do this automatically if the forced collation is over the Unicode character set, and any other situation will raise an "illegal mix of collations" error.
Note that the collation recorded against a column's definition is merely a hint to MySQL over which collation is preferred; it may or may not be used in a given expression, depending on the rules detailed under Collation of Expressions.
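The rules mentioned above can be inspected with MySQL's COLLATION() and COERCIBILITY() functions. The sketch below assumes the `users` table from the question with a utf8 `firstname` column. The `_utf8` introducer is an addition here, so that the literal is utf8 regardless of the connection character set.

```sql
-- Collation and coercibility of the column itself
-- (a column value is documented as coercibility 2, "implicit").
SELECT COLLATION(firstname), COERCIBILITY(firstname)
FROM users
LIMIT 1;

-- An explicit COLLATE clause has coercibility 0, the strongest level,
-- so it takes precedence over the column's utf8_bin in the comparison.
SELECT COLLATION(_utf8'sander' COLLATE utf8_general_ci)    AS coll,
       COERCIBILITY(_utf8'sander' COLLATE utf8_general_ci) AS coerc;

-- Inspect the plan. Without an index on firstname this is a full
-- table scan with or without the COLLATE clause.
EXPLAIN
SELECT *
FROM users AS a
WHERE a.firstname = _utf8'sander' COLLATE utf8_general_ci;
```

Lower coercibility values take precedence, which is why the forced utf8_general_ci collation is used for the comparison instead of the collation recorded on the column.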