MySQL default character set - mysql

Does MySQL treat the default character set in a cascading way?
For example, I'm looking at a script that generates the full db and it starts off with a statement like this:
CREATE SCHEMA IF NOT EXISTS `xyz` DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci ;
Does that imply that all tables created as part of this db will use the utf8 charset by default? And if I wanted to use a different charset for a given table, would I simply define it in the CREATE TABLE statement? Is that accurate? Can I also override it on a specific field of a specific table? Thanks!

Yes. If you set the charset and collation at the database (schema) level, it will apply to any newly created tables inside that database, unless those tables specify their own charset/collation in the CREATE TABLE statement.
And yes, you can also specify charset/collation on a per-field basis. Example from the MySQL manual:
CREATE TABLE t1
(
col1 VARCHAR(5)
CHARACTER SET latin1
COLLATE latin1_german1_ci
);
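The cascade described above can be sketched end to end. This is a minimal illustration, not the original poster's script: the schema name demo_cascade and the table names are hypothetical, and utf8mb4 is used in place of utf8 as an assumption about what a current server would prefer.

```sql
-- Hypothetical schema whose default charset/collation new tables inherit.
CREATE SCHEMA IF NOT EXISTS demo_cascade
  DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;

-- No charset given: the table (and its column) take the schema default.
CREATE TABLE demo_cascade.t_inherits (
  name VARCHAR(20)
);

-- Table-level override, plus a column-level override inside that table.
CREATE TABLE demo_cascade.t_overrides (
  name VARCHAR(20),                                               -- takes the table default (latin1)
  code VARCHAR(5) CHARACTER SET latin1 COLLATE latin1_german1_ci  -- column-level override
) DEFAULT CHARACTER SET latin1 COLLATE latin1_swedish_ci;

-- Inspect what each column actually ended up with.
SELECT TABLE_NAME, COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = 'demo_cascade'
ORDER BY TABLE_NAME, ORDINAL_POSITION;
```

The final query reads the effective per-column settings, which is where the charset and collation are ultimately stored.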

Related

MySQL - convert tables in database to different encoding and collation - foreign key constraints are failing

I have a MySQL database with the charset utf8mb4 and the collation utf8mb4_unicode_ci.
Now I noticed that this influences my search queries where I use LIKE '%grün%'.
This would also match 'Grund'.
I found that this behavior is caused by the charset and collation of my tables/columns.
Now I want to switch the tables to the collation utf8mb4_de_pb_0900_ai_ci to avoid wrongly matching German umlauts.
So first I change the default settings for my database, which is accepted:
ALTER DATABASE CHARACTER SET utf8mb4 COLLATE utf8mb4_de_pb_0900_ai_ci;
Setting the default setting for my first table is also accepted
ALTER TABLE tablename CHARACTER SET utf8mb4 COLLATE utf8mb4_de_pb_0900_ai_ci;
But when I want to convert the existing data to the new settings I get an error
ALTER TABLE tablename CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_de_pb_0900_ai_ci;
Referencing column 'column1' and referenced column 'id' in foreign key constraint 'contraintname_fkey' are incompatible.
I can try this with every table and always get the error that the constraint is incompatible, because the referenced table has not been converted yet.
I found clever queries to generate all the ALTER statements, but I cannot execute them because of the error described above.
Is there an easy way to do this?
You can disable checking of foreign keys while you are altering your tables.
SET FOREIGN_KEY_CHECKS=0;
...Your ALTER TABLE queries...
SET FOREIGN_KEY_CHECKS=1;
Remember that AI in the collation means Accent Insensitive, meaning accents are not taken into account when comparing text. For a collation that is sensitive to accents use a collation with _AS_ in its name.
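As a rough sketch of that sequence: the table names parent_table and child_table are hypothetical stand-ins (assume child_table has a foreign key referencing parent_table.id), and the collation is the one from the question. SET FOREIGN_KEY_CHECKS as written here changes the setting for the current session, so all statements should run in the same connection.

```sql
-- Turn off foreign key checks for this session only.
SET FOREIGN_KEY_CHECKS = 0;

-- Convert both sides of the (hypothetical) foreign key relationship.
ALTER TABLE parent_table
  CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_de_pb_0900_ai_ci;
ALTER TABLE child_table
  CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_de_pb_0900_ai_ci;

-- Turn the checks back on once every table has been converted.
SET FOREIGN_KEY_CHECKS = 1;

-- List any string columns in the current database that still use another collation.
SELECT TABLE_NAME, COLUMN_NAME, COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = DATABASE()
  AND COLLATION_NAME IS NOT NULL
  AND COLLATION_NAME <> 'utf8mb4_de_pb_0900_ai_ci';
```

Re-enabling the checks does not re-validate existing rows, so it is worth converting every table involved in a constraint before switching them back on.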

What is the difference between using default charset and using default character set in MySQL while creating database

I tried using these two SQL statements to create a database:
CREATE DATABASE IF NOT EXISTS `dbname` DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;
and
CREATE DATABASE IF NOT EXISTS `dbname` DEFAULT CHARSET utf8mb4 COLLATE utf8mb4_general_ci;
Both of these SQLs got the same result:
mysql> show create database `dbname`;
+----------+--------------------------------------------------------------------+
| Database | Create Database                                                    |
+----------+--------------------------------------------------------------------+
| dbname   | CREATE DATABASE `dbname` /*!40100 DEFAULT CHARACTER SET utf8mb4 */ |
+----------+--------------------------------------------------------------------+
But there isn't an option named CHARSET for creating a database in the MySQL documentation (https://dev.mysql.com/doc/refman/8.0/en/create-database.html). I want to know the difference between these two statements.
I tried to find official documentation about CHARSET too, but all of the search results link to CHARACTER SET. What I did find is MariaDB documentation that has a CREATE TABLE query using CHARSET. So I'm guessing CHARSET is a valid synonym for CHARACTER SET.
FYI: I notice that MySQL has a lot of functions that do the same thing under slightly to moderately different names, such as CURRENT_DATE and CURDATE(). Most of these differ only slightly in spelling. However, two functions that I know of have completely different names and lengths but perform exactly the same operation: CURRENT_TIMESTAMP and NOW().
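One way to check the synonym guess on your own server is to create two throwaway databases, one with each spelling, and compare what the server recorded. The names demo_a and demo_b are hypothetical; this is a sketch for experimentation, not a statement about what the manual documents.

```sql
-- Same definition, spelled two ways (hypothetical database names).
CREATE DATABASE IF NOT EXISTS demo_a DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;
CREATE DATABASE IF NOT EXISTS demo_b DEFAULT CHARSET utf8mb4 COLLATE utf8mb4_general_ci;

-- Compare the stored defaults for both databases.
SELECT SCHEMA_NAME, DEFAULT_CHARACTER_SET_NAME, DEFAULT_COLLATION_NAME
FROM information_schema.SCHEMATA
WHERE SCHEMA_NAME IN ('demo_a', 'demo_b');

-- The function pairs mentioned above can be compared the same way.
SELECT CURRENT_DATE = CURDATE()  AS same_date,
       CURRENT_TIMESTAMP = NOW() AS same_timestamp;

-- Clean up the throwaway databases.
DROP DATABASE demo_a;
DROP DATABASE demo_b;
```

If the two rows from SCHEMATA match, the server treated both spellings identically.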

Is it normal for a MySQL create table statement to include redundant collation declarations for every char, varchar, and text column?

When running SHOW CREATE TABLE `my_table`;, I notice that COLLATE utf8mb4_unicode_ci is shown for every char, varchar, and text column in the table. This seems a bit redundant since the collation is already declared in the table_option portion of the create statement.
mysql> SHOW CREATE TABLE `my_table`;
| Table | Create Table
| my_table | CREATE TABLE `my_table` (
...
`char_col_1` char(15) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
`varchar_col_1` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
`varchar_col_2` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`varchar_col_3` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`text_col_1` text CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci
...
) ENGINE=InnoDB AUTO_INCREMENT=1816178 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
This behavior is noticeable in both MySQL 5.7 and MySQL 8.0 and therefore most likely in other versions as well.
Is this behavior normal and acceptable, or is it a symptom of something that is misconfigured either with the table, database, or MySQL instance?
On the other hand, since collation can be individually set for any specific column, perhaps it is better to explicitly display the collation for every column to avoid any ambiguity or assumptions, even in cases where the collation of the column matches the collation of the table?
You have touched only the tip of the iceberg.
I think the settings on the table are just defaults for columns that are defined without charset or collate.
Ditto for ALTER TABLE ADD COLUMN -- will inherit from the table defaults.
I think that the column settings are stored per column (visible in the information_schema.COLUMNS table) and that they won't change when only the table defaults are altered with ALTER TABLE .. DEFAULT CHARACTER SET ..
Similarly, the table charset and collation inherit from the database definition, and will be frozen as the table is defined.
About defaults:
The old default charset was latin1
The current default is utf8mb4; this is unlikely to ever change in the future.
Every collation applies to exactly one charset, and the charset name is the beginning of the collation name.
Each charset has exactly one "default" collation: latin1_swedish_ci, utf8_general_ci, utf8mb4_0900_ai_ci, etc.
That default collation (for a given charset) has rarely, if ever, changed. Perhaps the only change has been for utf8mb4, from utf8mb4_general_ci in 5.7 to utf8mb4_0900_ai_ci in 8.0.
(The more I experiment, the less certain I am about all this.)
Best practice: Always explicitly set CHARSET and COLLATE for each string column.
Secondary considerations:
Use utf8mb4, if available, for most string (VARCHAR / TEXT).
Use the latest available collation (Unicode keeps improving it); currently utf8mb4_0900_ai_ci.
Use ascii for things that are clearly only ascii -- country-code, postal_code, hex, etc. Mostly these can use CHAR(..)
Use ascii_general_ci or ascii_bin, depending on whether you need case folding.
Yes, it is redundant to have the same CHARACTER SET and COLLATION in both the table definition and a column definition.
Having explicit column definitions means that if anyone changes the table's default CHARACTER SET or COLLATION, the columns will remain unchanged.
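The "defaults are frozen into each column" behavior can be observed with a small experiment. The table name demo_inherit is hypothetical, and the sketch assumes it runs in a scratch database selected with USE.

```sql
-- Column a is created while the table default is latin1.
CREATE TABLE demo_inherit (
  a VARCHAR(10)
) DEFAULT CHARACTER SET latin1 COLLATE latin1_swedish_ci;

-- Change only the table default; existing column a is not converted.
ALTER TABLE demo_inherit DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

-- Column b is added afterwards, so it picks up the new table default.
ALTER TABLE demo_inherit ADD COLUMN b VARCHAR(10);

-- Compare the per-column settings.
SELECT COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'demo_inherit'
ORDER BY ORDINAL_POSITION;
```

After these statements, a and b no longer share a collation, and a now differs from the table default, which is one situation in which SHOW CREATE TABLE has to spell out the column-level setting.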

ALTER DATABASE to change COLLATE not working

I am using Django on Bluehost. I created a form for user-generated input, but Unicode input from this form fails to be stored or displayed correctly. After searching SO and Google, I found that I should change the collation and character set of my database. I ran this SQL
ALTER DATABASE learncon_pywithyou CHARACTER SET utf8 COLLATE utf8_unicode_ci;
from python27 manage.py dbshell, which started a mysql shell. What shows on screen is
Query OK, 1 row affected (0.00 sec).
So I assumed the problem was solved, but it was not. This SQL did not change anything, as I later saw in the phpMyAdmin provided by Bluehost: all the VARCHAR fields of all the tables still use the latin1_swedish_ci collation.
So I assumed that ALTER TABLE should work instead. I ran this in mysql
alter table mytable character set utf8 collate utf8_unicode_ci;
Although the screen shows Query OK, 4 rows affected, it did nothing either; the collation of the fields in mytable did not change at all.
So I finally changed the fields manually in phpMyAdmin for mytable, and this works: I am now able to insert Unicode into this table and it displays correctly. But I have around 20 such tables, and I don't want to change them one by one manually.
Is there a simple and effective way of changing the collation of every field so that Unicode is stored and displayed correctly?
Changing collation at the database level sets the default for new objects - existing collations will not be changed.
Similarly, at the table level, only new columns are affected by this:
alter table mytable character set utf8 collate utf8_unicode_ci;
However, to convert the collation of existing columns, you need to add convert to:
alter table mytable convert to character set utf8 collate utf8_unicode_ci;
In addition to @StuartLC's answer:
To change the charset and collation of all 20 tables, use the query below. Here world is the database name.
SELECT
CONCAT('ALTER TABLE `', TABLE_SCHEMA, '`.`', TABLE_NAME, '` CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;') AS AlterSQL
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'world'
AND TABLE_TYPE = 'BASE TABLE'; -- skip views, which cannot be converted
The above will prepare all ALTER queries which you need to run.

Mysql -change DB,tables to utf8

In /etc/my.cnf the following has been added
character-set-server=utf8
collation-server=utf8_general_ci
But for the database and tables created before adding the above, how do I convert the database and tables to utf8 with these collation settings?
Well, the database character set and table character set are just defaults (they don't affect anything directly). You'd need to modify each column to the proper charset. phpMyAdmin will do this for you (just edit the column, then change the character set). If you want to do it in raw SQL, you'll need to know the column definition (SHOW CREATE TABLE foo will show you the definition). Then you can use ALTER TABLE to change the definition.
To change the default charset for a table:
ALTER TABLE `tablename` DEFAULT CHARACTER SET 'utf8' COLLATE 'utf8_general_ci';
To change the charset of a column with the definition `foo` VARCHAR(128) CHARACTER SET 'foo' COLLATE 'foo':
ALTER TABLE `tablename` MODIFY
`foo` VARCHAR(128) CHARACTER SET 'utf8' COLLATE 'utf8_general_ci';
https://serverfault.com/questions/65043/alter-charset-and-collation-in-all-columns-in-all-tables-in-mysql
And:
http://www.mysqlperformanceblog.com/2009/03/17/converting-character-sets/
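Before modifying columns one by one, it can help to list which columns still need attention. The query below is a sketch under assumptions: mydb is a hypothetical database name, and the target charset is taken to be utf8 as in the question. Some newer MySQL versions may report that charset as utf8mb3, so both spellings are excluded.

```sql
-- List string columns in the hypothetical database mydb that are not yet utf8.
-- Non-string columns have a NULL charset and are skipped.
SELECT TABLE_NAME, COLUMN_NAME, COLUMN_TYPE, CHARACTER_SET_NAME, COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = 'mydb'
  AND CHARACTER_SET_NAME IS NOT NULL
  AND CHARACTER_SET_NAME NOT IN ('utf8', 'utf8mb3')
ORDER BY TABLE_NAME, ORDINAL_POSITION;
```

Each row returned corresponds to one column that would need an ALTER TABLE ... MODIFY (or a table-wide CONVERT TO) to reach the new charset.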