Can MySQL charset for table and column be different?

Does it make sense to have different charsets for a table and for a single column in that same table? Or will it create problems, especially in the example below?
For example,
Table charset - latin1
Column C1 charset - utf8mb4

Tables don't really have a charset; the only thing they have is a default charset. The only things that have an actual "physical" charset are columns, because they're the only things that actually store data. If you don't set an explicit charset for a column, the table's default is used. If the table doesn't specify a default, the database's default is used. And if the database doesn't specify one, the server's default is used.
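As a minimal sketch of the situation in the question (the table name demo_mixed and the columns id, c1 and c2 are hypothetical, chosen only for illustration), a latin1 table default and a utf8mb4 column can coexist in one table:

```sql
-- Hypothetical table: latin1 as the table default, utf8mb4 on one column.
CREATE TABLE demo_mixed (
  id INT UNSIGNED NOT NULL PRIMARY KEY,
  c1 VARCHAR(100) CHARACTER SET utf8mb4,  -- explicit column charset
  c2 VARCHAR(100)                         -- no charset given: takes the table default
) DEFAULT CHARSET=latin1;

-- Inspect what each column actually ended up with.
SELECT COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'demo_mixed';
```

The query should report utf8mb4 for c1 and latin1 for c2; the integer column id has no character set at all, so it shows NULL.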

Related

Database conversion from latin1 to utf8mb4, what about indexes?

I noticed that my MODX database still uses latin1 character set in the database and in its tables. I would like to convert them to utf8mb4 and update collations accordingly.
Not totally sure how I should do this. Is this correct?
I alter every table to use utf8mb4 and utf8_unicode_ci?
I update the default character set and collation of the database.
Are indexes updated automatically? Is there something else I should be aware of?
A bonus question: what would be the most suitable latest utf8_unicode collation? Western languages should work.
Changing the default character sets of a table or a schema does not change the data in the column itself, it only changes the default to apply the next time you add a table or add a column to a table.
To convert current data, alter one table at a time:
ALTER TABLE <name> CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;
The collation utf8mb4_0900_ai_ci is faster than earlier collations (at least according to the documentation), and it's the most current and accurate. This collation requires MySQL 8.0.
The most current collation in MySQL 5.7 is utf8mb4_unicode_520_ci.
A table-conversion like this rebuilds the indexes, so there's nothing else you need to do.
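The steps described in this answer could be sketched roughly as follows. The names modx_db and some_table are placeholders, not names from the question, and utf8mb4_unicode_ci is used here only because it exists on both MySQL 5.7 and 8.0; on 8.0 you could substitute utf8mb4_0900_ai_ci. Taking a backup before converting is a sensible precaution.

```sql
-- 1. List string columns in the current schema that are not yet utf8mb4.
SELECT TABLE_NAME, COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = DATABASE()
  AND CHARACTER_SET_NAME IS NOT NULL
  AND CHARACTER_SET_NAME <> 'utf8mb4';

-- 2. Change the database default (placeholder name modx_db).
--    This only affects tables created afterwards.
ALTER DATABASE modx_db CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

-- 3. Convert existing data, one table at a time (placeholder name some_table).
--    This rewrites the string columns and rebuilds the indexes.
ALTER TABLE some_table CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
```

Re-running the first query after each conversion gives a simple checklist of what is left to convert.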

Create integer only table with a default charset, bad idea?

When I create a table, even if it's using integers only, I set the default charset to utf8 (because I copy-paste the code, and in case I introduce a string column in the future).
Example:
CREATE TABLE IF NOT EXISTS `articles` (
`id` smallint(6) unsigned NOT NULL,
`disabled` tinyint(1) unsigned NOT NULL DEFAULT 0,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
However, I'm wondering whether having a default charset on a table that does not make use of it affects performance.
Tables have a character set no matter what, so no, there's no performance issue, and UTF-8 is a good default choice (but utf8 is not). But you still shouldn't do that.
It is bad practice to add a default character set unless you need to specify one. Doing so overrides the default character set of the database, which might not be utf8. You risk creating a table whose character set differs from that of every other table, which causes confusion.
Instead, make sure the server and database character set are set correctly. Then let your tables use the default, unless you have a specific reason to do otherwise.
For example, UTF-8 is a good default choice, but MySQL got UTF-8 wrong. utf8 cannot handle all of UTF-8. You should instead be using utf8mb4 (UTF-8 4-byte). The database might correctly use utf8mb4, but you're overriding that with a less capable character set.
See Specifying Character Sets and Collations and Unicode Support.
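One quick way to see the difference between utf8 and utf8mb4 on your own server is to list the UTF-8 character sets and compare their Maxlen values (the maximum number of bytes per character):

```sql
-- Lists the installed character sets whose name starts with "utf8".
-- The Maxlen column is the maximum number of bytes one character can take.
SHOW CHARACTER SET WHERE Charset LIKE 'utf8%';
```

The 3-byte set (listed as utf8 or utf8mb3, depending on the server version) has a Maxlen of 3, while utf8mb4 has a Maxlen of 4. Characters that need four bytes in UTF-8, such as most emoji, only fit in utf8mb4.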
The DEFAULT CHARSET clause at the bottom of your table creation is only metadata. It is only used if you add a CHAR/VARCHAR/TEXT column and don't explicitly define the column's character set. Then the table's default character set is used.
Tables don't have any performance characteristic — they are just storage. Queries have performance.
Since your table has no columns with character sets, there can be no query against this table that is affected by the character set. Therefore the default character set has no effect.
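To follow the advice of relying on the server and database defaults, it helps to check what those defaults are first. The following is a sketch, assuming you are connected with a default database selected; the articles table is the one from the question, created without a DEFAULT CHARSET clause.

```sql
-- Server-level defaults.
SHOW VARIABLES LIKE 'character_set_server';
SHOW VARIABLES LIKE 'collation_server';

-- Defaults of the currently selected database.
SELECT DEFAULT_CHARACTER_SET_NAME, DEFAULT_COLLATION_NAME
FROM information_schema.SCHEMATA
WHERE SCHEMA_NAME = DATABASE();

-- The table from the question, without an explicit DEFAULT CHARSET.
CREATE TABLE IF NOT EXISTS articles (
  id SMALLINT UNSIGNED NOT NULL,
  disabled TINYINT UNSIGNED NOT NULL DEFAULT 0,
  PRIMARY KEY (id)
) ENGINE=InnoDB;

-- Shows which default the table picked up from the database.
SHOW CREATE TABLE articles;
```

If the table did not already exist, the SHOW CREATE TABLE output should carry the database's default character set, which is exactly the inheritance the answers describe.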

What difference in schema VS table VS column CHARSET in MySQL?

What difference in schema CHARSET VS table CHARSET VS column CHARSET in MySQL?
When I change my table's charset to utf8, can I use utf8mb4 charset in my column?
Thanks.
Specifying a character set on database level is in fact defining the default character set for tables.
Doing the same for tables defines the default character set for columns.
Since you can't go further down the road, specifying a character set on a column will definitely use the character set for everything you store in that column.
When you don't specify a character set on column level, the character set of the table is used. And if that is not specified the character set of the database is used.
When creating a table, the fallback for charset and collation is the settings of the schema.
Once you have created the table, it now has a default charset and collation. (This is subtly different than what fancyPants said.)
Similarly, when creating a column (either as part of creating the table, or with ALTER .. ADD COLUMN), you can be explicit about charset and collation, or it can inherit from the defaults given for the table. Again, the column's definition is now frozen.
Doing SHOW CREATE TABLE will show an override or continue to leave the implicit inheritance. SELECT .. FROM information_schema.columns .. makes it clearer that every column has a charset and collation.
That is, there is no "dynamic" inheritance at "run time". The inheritance is only when the table or column is created.
Note that each charset has a default collation. And each collation belongs to a specific charset (see the first part of the collation name). So, specifying either the charset or collation implicitly specifies the other.
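The "frozen at creation time" behaviour can be checked with a small experiment. This is only a sketch; the table demo_inherit and its columns a and b are hypothetical names.

```sql
-- Column a is created while the table default is latin1.
CREATE TABLE demo_inherit (
  a VARCHAR(20)
) DEFAULT CHARSET=latin1;

-- Change only the table default; existing columns are not touched.
ALTER TABLE demo_inherit DEFAULT CHARACTER SET utf8mb4;

-- Column b is created after the change, so it takes the new default.
ALTER TABLE demo_inherit ADD COLUMN b VARCHAR(20);

-- Every string column reports its own charset and collation.
SELECT COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'demo_inherit';
```

Column a should still report latin1 while b reports utf8mb4, which shows that the inheritance happened once, when each column was created.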

Sense of command collate in create table sql

I understand (a little) what the COLLATE clause does. Admittedly, I have not tested whether it is possible to have tables with different collations (or even different charsets) inside one DB.
But I found that (at least in phpMyAdmin) when I create a DB, I set its charset and collation, and if I omit the COLLATE clause in CREATE TABLE ..., the collation chosen when the DB was created is applied automatically.
So, my question is: what is the point of the COLLATE clause in CREATE TABLE ... if it can be omitted? Is it recommended to include COLLATE in CREATE TABLE ..., or is it irrelevant?
In SQL Server, if you don't specify the COLLATE, it defaults to whatever the DB is set to. Thus there is no danger in not specifying it.
In MySQL behavior is the same:
"The table character set and collation are used as default values for column definitions if the column character set and collation are not specified in individual column definitions." (MySQL Reference)
COLLATE is only needed when you want to specify a non-default value. If all you store is English text, then you have little to worry about. If you store data in multiple languages, then you have to specify a suitable character set and collation to ensure that characters are stored and compared correctly.
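As an illustration of when an explicit COLLATE is useful, the sketch below gives one column a non-default collation. The table demo_collate and its column names are hypothetical; utf8mb4_unicode_ci and utf8mb4_bin are standard MySQL collations.

```sql
-- Table default: case-insensitive comparison.
CREATE TABLE demo_collate (
  name_default VARCHAR(50),                    -- takes utf8mb4_unicode_ci from the table
  name_exact   VARCHAR(50) COLLATE utf8mb4_bin -- explicit override: byte-wise comparison
) DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;

INSERT INTO demo_collate VALUES ('Abc', 'Abc');

-- Compare both columns against a lowercase literal.
SELECT name_default = 'abc' AS ci_match,
       name_exact   = 'abc' AS bin_match
FROM demo_collate;
```

The first comparison should match (1) because the inherited collation ignores case, and the second should not (0) because the binary collation compares the exact code points.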

How to change the default collation of a table?

create table check2(f1 varchar(20),f2 varchar(20));
creates a table with the default collation latin1_general_ci;
alter table check2 collate latin1_general_cs;
show full columns from check2;
shows the individual collation of the columns as 'latin1_general_ci'.
Then what is the effect of the alter table command?
To change the default character set and collation of a table including those of existing columns (note the convert to clause):
alter table <some_table> convert to character set utf8mb4 collate utf8mb4_unicode_ci;
Edited the answer, thanks to the prompting of some comments:
Should avoid recommending utf8. It's almost never what you want, and often leads to unexpected messes. The utf8 character set is not fully compatible with UTF-8. The utf8mb4 character set is what you want if you want UTF-8. – Rich Remer Mar 28 '18 at 23:41
and
That seems quite important, glad I read the comments and thanks @RichRemer. Nikki, I think you should edit that in your answer considering how many views this gets. See here https://dev.mysql.com/doc/refman/8.0/en/charset-unicode-utf8.html and here What is the difference between utf8mb4 and utf8 charsets in MySQL? – Paulpro Mar 12 at 17:46
MySQL has 4 levels of collation: server, database, table, column.
If you change the collation of the server, database or table, you don't change the setting for each column, but you change the default collations.
E.g if you change the default collation of a database, each new table you create in that database will use that collation, and if you change the default collation of a table, each column you create in that table will get that collation.
It sets the default collation for the table; if you create a new column, that should be collated with latin1_general_cs -- I think. Try specifying the collation for the individual column and see if that works. MySQL has some really bizarre behavior in regards to the way it handles this.
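The difference between changing the table default, changing a single column and converting the whole table can be tried on the check2 table from the question. This is a sketch; the latin1 default is spelled out explicitly here so it does not depend on server settings, and the column f3 is an addition made only for the demonstration.

```sql
CREATE TABLE check2 (
  f1 VARCHAR(20),
  f2 VARCHAR(20)
) DEFAULT CHARSET=latin1 COLLATE=latin1_general_ci;

-- Changes only the table default; f1 and f2 keep latin1_general_ci.
ALTER TABLE check2 COLLATE latin1_general_cs;

-- A column added now picks up the new default.
ALTER TABLE check2 ADD COLUMN f3 VARCHAR(20);
SHOW FULL COLUMNS FROM check2;

-- Change one existing column explicitly.
ALTER TABLE check2 MODIFY f1 VARCHAR(20) CHARACTER SET latin1 COLLATE latin1_general_cs;

-- Or convert the default and all existing string columns in one step.
ALTER TABLE check2 CONVERT TO CHARACTER SET latin1 COLLATE latin1_general_cs;
SHOW FULL COLUMNS FROM check2;
```

The first SHOW FULL COLUMNS should list latin1_general_ci for f1 and f2 and latin1_general_cs for f3; after the CONVERT TO statement all three columns should report latin1_general_cs.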
You may need to change the schema, not only the table:
ALTER SCHEMA `<database name>` DEFAULT CHARACTER SET utf8mb4 DEFAULT COLLATE utf8mb4_unicode_ci ;
As Rich said, use utf8mb4.
(MariaDB 10)