Information Schema showing two character encodings for column - mysql

I'm in the process of migrating a MySQL database from the utf8 character set to utf8mb4, following this guide (https://mathiasbynens.be/notes/mysql-utf8mb4). For one of the tables I updated (table1) I get weird output from the information_schema. table1 has four columns, each listed below:
data_store VARCHAR(24)
data_group VARCHAR(24)
source_count INT
load_count INT
I have validated that only 4 columns appear through SELECT * on the table. However, running the following query on information_schema produces odd output.
SELECT column_name, character_set_name FROM information_schema.COLUMNS
WHERE table_name = "table1";
COLUMN_NAME     CHARACTER_SET_NAME
----------------------------------
data_store      utf8
data_group      utf8
source_count    <null>
load_count      <null>
data_store      utf8mb4
data_group      utf8mb4
source_count    <null>
load_count      <null>
I don't see duplicate rows (with differing character sets) for any other table that I have updated and am at a loss in regards to what is wrong and/or how to fix it. Any help would be much appreciated!
Notes: I believe I could just remove the unwanted columns from information_schema, but I'm not sure if this would break anything.

I think you're seeing the columns from different table1 tables in several different database schemas.
Try this to verify that claim.
SELECT table_schema, column_name, character_set_name
FROM information_schema.COLUMNS
WHERE table_name = 'table1'
Try this query to filter by the current database.
SELECT column_name, character_set_name
FROM information_schema.COLUMNS
WHERE table_name = 'table1'
AND table_schema = DATABASE()
Do not try to alter the INFORMATION_SCHEMA database in any way. Don't delete rows, don't add columns, or anything else. It's supposed to be readonly. But sometimes it isn't, and altering it can trash your MySQL instance. Don't ask me how I know that. :-)

Related

Get a Table's Character Set

In MySQL, I can get a table's name, engine and collation like so:
SELECT TABLE_NAME, TABLE_SCHEMA, ENGINE, TABLE_COLLATION
FROM information_schema.tables
WHERE table_name = 'tbl_name';
But how do I get a table's character set, not just its collation? Is it possible to get it from information_schema.tables?
Each collation is used for only one character set, so it's not necessary to record the character set in the INFORMATION_SCHEMA.TABLES. The table collation is enough to indicate both the collation and the character set unambiguously.
You can check INFORMATION_SCHEMA.COLLATIONS or INFORMATION_SCHEMA.COLLATION_CHARACTER_SET_APPLICABILITY to get the mapping from a given collation to its character set.
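The mapping described above can be applied in a single query by joining the two information_schema tables. This is a minimal sketch: 'tbl_name' is the placeholder from the question, and restricting to the current database with DATABASE() is an assumption added here.

```sql
-- Resolve a table's character set through its collation.
SELECT T.TABLE_SCHEMA,
       T.TABLE_NAME,
       T.TABLE_COLLATION,
       CCSA.CHARACTER_SET_NAME
FROM information_schema.TABLES AS T
JOIN information_schema.COLLATION_CHARACTER_SET_APPLICABILITY AS CCSA
  ON CCSA.COLLATION_NAME = T.TABLE_COLLATION
WHERE T.TABLE_SCHEMA = DATABASE()   -- assumption: the table is in the current database
  AND T.TABLE_NAME = 'tbl_name';
```

Views have a NULL TABLE_COLLATION, so the inner join leaves them out of the result.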
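Try running this to get the default character set of the database (replace DBNAME with your schema name). Note that it returns the schema default, which an individual table may override: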
SELECT default_character_set_name FROM information_schema.SCHEMATA S WHERE schema_name = "DBNAME";

MySql VIEW created with charset latin1 all configs are set to utf8

I'm trying to create a simple view, but I'm getting an error because the view is created with latin1 instead of utf8.
The View looks something like this:
create or replace view
my_view
as
select * from my_table
group by some_field
collate utf8_unicode_ci
;
The error I'm getting is:
COLLATION 'utf8_unicode_ci' is not valid for CHARACTER SET 'latin1'
What I did, was check multiple configuration options:
Global
show variables like "%char%";
character_set_client,utf8
character_set_connection,utf8
character_set_database,utf8
character_set_filesystem,binary
character_set_results,utf8
character_set_server,utf8
character_set_system,utf8
The table I'm using to create the view:
SELECT TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME, TABLE_COLLATION
FROM INFORMATION_SCHEMA.TABLES
where TABLE_NAME in ('my_table');
def,my_database,my_table,utf8_unicode_ci
The Columns of that table:
SELECT TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME, COLLATION_NAME
FROM INFORMATION_SCHEMA.COLUMNS
where TABLE_NAME in ('my_table');
def,my_database,my_table,id,null
def,my_database,my_table,active,null
def,my_database,my_table,title,utf8_unicode_ci
I had the same issue in a MySQL database, where I had umlauts/special characters within a CASE expression like this (reduced sample):
CASE
WHEN cfs.value = "ungeklärt" THEN "" -- the character 'ä' was an issue for me here
ELSE cfs.value
END AS "Category"
The error popped up for me when saving this statement as a view.
Can you check whether it solves the issue for you if you set the names accordingly first? In my case this was latin1 instead of utf8 due to an old legacy database.
set names latin1;
If I understood your case correctly, it would be this for you instead:
set names utf8;
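A view records the client character set and connection collation that were in effect when it was created, which is why the session settings matter here. The following is a hedged sketch for checking them. It assumes the view is named my_view as in the question and lives in the current database, and passing the collation to SET NAMES is an addition not taken from the answer.

```sql
-- Set the connection character set and collation explicitly for this session.
SET NAMES utf8 COLLATE utf8_unicode_ci;

-- Confirm what the session is actually using before running CREATE VIEW.
SELECT @@character_set_client, @@character_set_connection, @@collation_connection;

-- Once the view exists, inspect the settings that were stored with it.
SELECT TABLE_NAME, CHARACTER_SET_CLIENT, COLLATION_CONNECTION
FROM information_schema.VIEWS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'my_view';
```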

MYSQL - set default as NULL to all columns, where default is not set

I have about 12 databases, each with 50 tables, most of them with 30+ columns. The databases were running with strict mode OFF, but we have now had to migrate them to the cleardb service, which has strict mode ON by default.
On every table with a NOT NULL constraint, inserts have stopped working because values for those columns are not being passed. With strict mode OFF, MySQL substitutes the implicit default for the column's data type when no value is provided.
Is there a script I can use to get the metadata for all columns of all tables and generate a script that alters every such column so its default is NULL?
You should consider using the information_schema tables to generate DDL statements to alter the tables. This kind of query will get you the list of offending columns.
SELECT CONCAT_WS('.',TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME) col
FROM information_schema.COLUMNS
WHERE IS_NULLABLE = 'NO'
AND COLUMN_DEFAULT IS NULL
AND TABLE_SCHEMA IN ('db1', 'db2', 'db3')
You can do similar things to generate ALTER statements to change the tables. But beware, MySQL likes to rewrite tables when you alter certain things. It might take time.
DO NOT attempt to UPDATE the information_schema directly!
You could try turning strict mode off through the sql_mode setting when you connect to the SaaS service, so your software keeps working as before.
This is a large project and is probably important business for cleardb. Why not ask them for help in changing the sql_mode setting?
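Strict mode is controlled by the sql_mode system variable rather than by a separate switch. Here is a minimal sketch of turning it off for one connection only, assuming the account is allowed to change its session sql_mode (a hosted service may restrict this):

```sql
-- Inspect the modes active for this connection.
SELECT @@SESSION.sql_mode;

-- Drop the two strict flags for this session; all other modes are left as they are.
SET SESSION sql_mode = REPLACE(
    REPLACE(@@SESSION.sql_mode, 'STRICT_TRANS_TABLES', ''),
    'STRICT_ALL_TABLES', '');
```

The statement has to run on every new connection, so it usually belongs in the application's connection setup rather than in a one-off script.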
This is what I came up with, based on @Ollie-jones's script:
https://gist.github.com/brijrajsingh/efd3c273440dfebcb99a62119af2ecd5
-- Columns with a length (e.g. VARCHAR): repeat the length in the generated type.
SELECT CONCAT_WS('.', TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME) col,
CONCAT('alter table ', TABLE_NAME, ' MODIFY COLUMN ', COLUMN_NAME, ' ', DATA_TYPE, '(', CHARACTER_MAXIMUM_LENGTH, ') NULL DEFAULT NULL') AS script_col
FROM information_schema.COLUMNS
WHERE IS_NULLABLE = 'NO' -- IS_NULLABLE holds 'YES' or 'NO', not 0 or 1
AND COLUMN_DEFAULT IS NULL
AND CHARACTER_MAXIMUM_LENGTH IS NOT NULL
AND table_schema = 'immh'
UNION
-- Columns without a length (e.g. INT, DATETIME): use the bare data type.
SELECT CONCAT_WS('.', TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME) col,
CONCAT('alter table ', TABLE_NAME, ' MODIFY COLUMN ', COLUMN_NAME, ' ', DATA_TYPE, ' NULL DEFAULT NULL') AS script_col
FROM information_schema.COLUMNS
WHERE IS_NULLABLE = 'NO'
AND COLUMN_DEFAULT IS NULL
AND CHARACTER_MAXIMUM_LENGTH IS NULL
AND table_schema = 'immh'
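A possible variation, offered as a sketch rather than a drop-in replacement: COLUMN_TYPE already carries the full type text (length, precision, UNSIGNED), so one SELECT can cover both branches. The exclusion of primary-key and auto-increment columns and the backtick quoting are assumptions added here, because those columns cannot be made nullable.

```sql
-- Generate one ALTER statement per NOT NULL column that has no default.
SELECT CONCAT('ALTER TABLE `', TABLE_SCHEMA, '`.`', TABLE_NAME,
              '` MODIFY COLUMN `', COLUMN_NAME, '` ', COLUMN_TYPE,
              ' NULL DEFAULT NULL;') AS script_col
FROM information_schema.COLUMNS
WHERE IS_NULLABLE = 'NO'
  AND COLUMN_DEFAULT IS NULL
  AND COLUMN_KEY <> 'PRI'                  -- primary key columns must stay NOT NULL
  AND EXTRA NOT LIKE '%auto_increment%'    -- auto-increment columns supply their own value
  AND TABLE_SCHEMA = 'immh';
```

MODIFY COLUMN restates the whole column definition, so attributes not repeated in the generated statement (a column-level character set or collation, a comment) may be reset. Review the output before running it.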

Issue in a query with left join

The following query pulls results from two different tables, but it gives me the error below. Please let me know what I did wrong:
Error: #1267 - Illegal mix of collations (utf8_general_ci,COERCIBLE) and (latin1_swedish_ci,IMPLICIT) for operation '='
SELECT MD5( pre_quiz.qid ),
pre_quiz.quiz_title,
pre_quiz.quiz_desc,
pre_course.cname
FROM pre_quiz
LEFT JOIN pre_course ON
MD5( pre_course.cid ) = pre_quiz.quiz_course_id
WHERE pre_quiz.createdby = 'user'
ORDER BY pre_quiz.quiz_title
Check the collation of each table and make sure they match.
After that, also check the collation of each column used in the comparison.
Here's how to check which columns are the wrong collation:
SELECT table_schema, table_name, column_name, character_set_name, collation_name
FROM information_schema.columns
WHERE collation_name = 'latin1_general_ci'
ORDER BY table_schema, table_name,ordinal_position;
And here's the query to fix it:
ALTER TABLE tbl_name CONVERT TO CHARACTER SET latin1 COLLATE 'latin1_swedish_ci';
Edit
To change the collation of a column:
ALTER TABLE MyTable MODIFY COLUMN Column1 [TYPE] COLLATE [NewCollation]
Source
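If altering the tables is not an option, one alternative is to convert one side of the comparison explicitly so both operands share a character set. This is a minimal sketch using the table and column names from the question. Whether utf8 is the right target depends on the connection character set, which is an assumption here.

```sql
SELECT MD5(pre_quiz.qid),
       pre_quiz.quiz_title,
       pre_quiz.quiz_desc,
       pre_course.cname
FROM pre_quiz
LEFT JOIN pre_course
       -- CONVERT ... USING re-encodes the latin1 column as utf8 for this comparison only.
       ON MD5(pre_course.cid) = CONVERT(pre_quiz.quiz_course_id USING utf8)
WHERE pre_quiz.createdby = 'user'
ORDER BY pre_quiz.quiz_title;
```

The stored data is left untouched, so this is a workaround for the query rather than a fix for the mismatched columns.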

mysql check collation of a table

How can I see what collation a table has? I.E. I want to see:
+--------+------------------+
| table  | collation        |
+--------+------------------+
| t_name | latin_general_ci |
+--------+------------------+
SHOW TABLE STATUS shows information about a table, including the collation.
For example SHOW TABLE STATUS where name like 'TABLE_NAME'
The above answer is great, but it doesn't actually provide an example that saves the user from having to look up the syntax:
show table status like 'test';
Where test is the table name.
Checking the collation of a specific table
You can query INFORMATION_SCHEMA.TABLES and get the collation for a specific table:
SELECT TABLE_SCHEMA
, TABLE_NAME
, TABLE_COLLATION
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_NAME = 't_name';
This gives much more readable output than SHOW TABLE STATUS, which contains a lot of irrelevant information.
Checking the collation of columns
Note that collation can also be applied to columns (which might have a different collation than the table itself). To fetch the columns' collation for a particular table, you can query INFORMATION_SCHEMA.COLUMNS:
SELECT TABLE_SCHEMA
, TABLE_NAME
, COLUMN_NAME
, COLLATION_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 't_name';
For more details you can refer to the article How to Check and Change the Collation of MySQL Tables
Use this query:
SHOW CREATE TABLE tablename
You will get all the information related to the table, including its default character set and collation.
Check collation of the whole database
If someone is looking here also for a way to check collation on the whole database:
use mydatabase; (where mydatabase is the name of the database you're going to check)
SELECT @@character_set_database, @@collation_database;
You should see the result like:
+--------------------------+----------------------+
| @@character_set_database | @@collation_database |
+--------------------------+----------------------+
| utf8mb4                  | utf8mb4_unicode_ci   |
+--------------------------+----------------------+
1 row in set (0.00 sec)
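The same information can be read without switching databases by querying information_schema.SCHEMATA. This is a small sketch, with mydatabase standing in for the schema name as above:

```sql
-- Default character set and collation of one database, without USE.
SELECT SCHEMA_NAME,
       DEFAULT_CHARACTER_SET_NAME,
       DEFAULT_COLLATION_NAME
FROM information_schema.SCHEMATA
WHERE SCHEMA_NAME = 'mydatabase';
```

Dropping the WHERE clause lists the defaults for every database the account can see.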