The value is cut off after the 💀 character.
Why is this happening?
create table tmp2(t1 varchar(100));
insert into tmp2 values('before💀after');
mysql> select * from tmp2;
+--------+
| t1 |
+--------+
| before |
+--------+
1 row in set (0.01 sec)
I ran the following commands, which returned some useful information:
mysql> SHOW FULL COLUMNS FROM tmp2;
+-------+--------------+-----------------+------+-----+---------+-------+---------------------------------+---------+
| Field | Type | Collation | Null | Key | Default | Extra | Privileges | Comment |
+-------+--------------+-----------------+------+-----+---------+-------+---------------------------------+---------+
| t1 | varchar(100) | utf8_general_ci | YES | | NULL | | select,insert,update,references | |
+-------+--------------+-----------------+------+-----+---------+-------+---------------------------------+---------+
1 row in set (0.00 sec)
and this,
mysql> SELECT character_set_name FROM information_schema.`COLUMNS` WHERE table_schema = "test" AND table_name = "tmp2" AND column_name = "t1";
+--------------------+
| character_set_name |
+--------------------+
| utf8 |
+--------------------+
1 row in set (0.00 sec)
I'm testing this on the Ubuntu MySQL command line.
I found the solution here
I learned that some characters are not included in MySQL's utf8 character set, which stores at most three bytes per character.
There is a good article here
I needed to change the column from utf8 to utf8mb4, and it worked:
alter table tmp2 modify t1 varchar(100) character set utf8mb4;
SET NAMES utf8mb4;
insert tmp2 values('before💀after');
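To check in advance whether a string contains characters that MySQL's 3-byte utf8 cannot store, you can look at how many bytes each character needs in UTF-8. Here is a minimal Python sketch of that idea; it is not part of the original solution, and the helper name `chars_needing_utf8mb4` is made up for illustration.

```python
def chars_needing_utf8mb4(text):
    """Return the characters in `text` that take 4 bytes in UTF-8.

    MySQL's legacy utf8 character set stores at most 3 bytes per
    character, so these are the ones that require utf8mb4.
    """
    return [ch for ch in text if len(ch.encode("utf-8")) == 4]


sample = "before💀after"
offending = chars_needing_utf8mb4(sample)
print(offending)                            # characters that need 4 bytes
print(offending[0].encode("utf-8").hex())   # their UTF-8 bytes, in hex
```

Anything this returns would be cut off or rejected by a utf8 column, depending on the SQL mode.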
Related
I have a table with a column, which has cp1251_general_ci collation. I don't want to change column collation, but I want to get data in utf8 encoding.
Is there a way to select any data somehow in a way that it looks just like a data with utf8_general_ci collation?
I.e. I need something like this
SELECT CONVERT_TO_UTF8(weirdColumn) FROM weirdTable
Here's a demo table using the cp1251 encoding. I'll insert some Cyrillic characters into it.
mysql> CREATE TABLE weirdTable (weirdColumn text) ENGINE=InnoDB DEFAULT CHARSET=cp1251;
mysql> insert into weirdTable values ('ЂЃЉЌ');
mysql> select * from weirdTable;
+-------------+
| weirdColumn |
+-------------+
| ЂЃЉЌ |
+-------------+
Use MySQL's CONVERT() function to force the characters to a different encoding:
mysql> select convert(weirdColumn using utf8) as weirdColumnUtf8 from weirdTable;
+-----------------+
| weirdColumnUtf8 |
+-----------------+
| ЂЃЉЌ |
+-----------------+
Here's proof that the result has been converted to utf8. I create a table using metadata from the query result:
mysql> create table w2
as select convert(weirdColumn using utf8) as weirdColumnUtf8 from weirdTable;
Query OK, 1 row affected (0.07 sec)
Records: 1 Duplicates: 0 Warnings: 0
mysql> show create table w2\G
*************************** 1. row ***************************
Table: w2
Create Table: CREATE TABLE `w2` (
`weirdColumnUtf8` longtext CHARACTER SET utf8
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4
1 row in set (0.00 sec)
mysql> select * from w2;
+-----------------+
| weirdColumnUtf8 |
+-----------------+
| ЂЃЉЌ |
+-----------------+
On my MySQL instance, utf8mb4 is the default character encoding. That's okay; it's a superset of utf8, and the utf8 encoding is enough to store these characters. However, my general recommendation is that if you use utf8, there's no reason not to use utf8mb4 instead.
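To see what the conversion does at the byte level, the same transcoding can be sketched outside MySQL. This Python snippet is only an illustration using Python's own cp1251 and UTF-8 codecs, on the assumption that they match MySQL's for these four characters.

```python
text = "ЂЃЉЌ"

# In cp1251 every character is a single byte.
cp1251_bytes = text.encode("cp1251")

# Transcode: interpret the bytes as cp1251, then re-encode as UTF-8.
utf8_bytes = cp1251_bytes.decode("cp1251").encode("utf-8")

print(cp1251_bytes.hex())
print(utf8_bytes.hex())

# Widest character in UTF-8, in bytes. Anything up to 3 fits in MySQL's utf8.
print(max(len(ch.encode("utf-8")) for ch in text))
```

Each of these Cyrillic characters needs two bytes in UTF-8, which is why the 3-byte utf8 character set is enough here.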
If you change the character encoding, you cannot keep the cp1251 collation. Collations are specific to encodings. But you can use one of the collations associated with utf8 or utf8mb4. You can see the available collations for a given character encoding:
mysql> SHOW COLLATION WHERE Charset = 'utf8';
+--------------------------+---------+-----+---------+----------+---------+---------------+
| Collation | Charset | Id | Default | Compiled | Sortlen | Pad_attribute |
+--------------------------+---------+-----+---------+----------+---------+---------------+
...
| utf8_general_ci | utf8 | 33 | Yes | Yes | 1 | PAD SPACE |
| utf8_general_mysql500_ci | utf8 | 223 | | Yes | 1 | PAD SPACE |
...
I have a table that has nullable columns:
+-------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+--------------+------+-----+---------+-------+
| id | int(11) | YES | | NULL | |
| name | varchar(255) | YES | | NULL | |
+-------+--------------+------+-----+---------+-------+
I insert a row with name set to NULL:
INSERT INTO some_table (id, name) VALUES (1, NULL);
Query OK, 1 row affected (0.02 sec)
SELECT * FROM some_table;
+------+------+
| id | name |
+------+------+
| 1 | NULL |
+------+------+
1 row in set (0.01 sec)
If I alter the table's name column to be not-nullable it apparently converts NULL to an empty string:
ALTER TABLE some_table CHANGE COLUMN name name VARCHAR(255) NOT NULL;
Query OK, 1 row affected, 1 warning (0.02 sec)
Records: 1 Duplicates: 0 Warnings: 1
SELECT * FROM some_table;
+------+------+
| id | name |
+------+------+
| 1 | |
+------+------+
1 row in set (0.02 sec)
At this point I would expect an error telling me that my data contains NULL and that I cannot set the name column to NOT NULL.
Is this a configurable option in SQL/MariaDB?
Why is NULL being converted to an empty string?
A warning is raised when altering the table:
SHOW WARNINGS;
+---------+------+-------------------------------------------+
| Level | Code | Message |
+---------+------+-------------------------------------------+
| Warning | 1265 | Data truncated for column 'name' at row 1 |
+---------+------+-------------------------------------------+
1 row in set (0.01 sec)
Version:
SELECT version();
+----------------+
| version() |
+----------------+
| 5.5.62-MariaDB |
+----------------+
1 row in set (0.02 sec)
According to the documentation for ALTER TABLE, enabling strict mode prevents your ALTER statement from succeeding:
This conversion may result in alteration of data. For example, if you shorten a string column, values may be truncated. To prevent the operation from succeeding if conversions to the new data type would result in loss of data, enable strict SQL mode before using ALTER TABLE.
One way to enable strict mode from within MySQL:
SET GLOBAL sql_mode='STRICT_TRANS_TABLES';
See here for other options.
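Note that assigning sql_mode replaces the whole list of modes, so the statement above drops any modes that were already set. If you want to keep them, one option is to build the new value from the current one. This is a small Python sketch of that; the helper name `with_strict_mode` and the sample value of `current` are made up for illustration, and you would read the real value with SELECT @@sql_mode.

```python
def with_strict_mode(current_sql_mode, strict_flag="STRICT_TRANS_TABLES"):
    """Return a sql_mode string that keeps the existing modes and adds strict_flag."""
    modes = [m for m in current_sql_mode.split(",") if m]
    if strict_flag not in modes:
        modes.append(strict_flag)
    return ",".join(modes)


# Hypothetical value, as if it had been read from SELECT @@sql_mode
current = "NO_ENGINE_SUBSTITUTION"
print("SET GLOBAL sql_mode='%s';" % with_strict_mode(current))
```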
Using 10.3.15-MariaDB-1 on Debian Buster, I cannot reproduce the problem:
MariaDB [foo]> CREATE TABLE some_table(id int(11), name varchar(255));
Query OK, 0 rows affected (0.009 sec)
MariaDB [foo]> INSERT INTO some_table (id, name) VALUES (1, NULL);
Query OK, 1 row affected (0.003 sec)
MariaDB [foo]> SELECT * FROM some_table;
+------+------+
| id | name |
+------+------+
| 1 | NULL |
+------+------+
1 row in set (0.000 sec)
MariaDB [foo]> ALTER TABLE some_table CHANGE COLUMN name name VARCHAR(255) NOT NULL;
ERROR 1265 (01000): Data truncated for column 'name' at row 1
MariaDB [foo]> SELECT * FROM some_table;
+------+------+
| id | name |
+------+------+
| 1 | NULL |
+------+------+
1 row in set (0.000 sec)
MariaDB [foo]> SELECT version();
+-------------------+
| version() |
+-------------------+
| 10.3.15-MariaDB-1 |
+-------------------+
1 row in set (0.000 sec)
If possible, I suggest you update your MariaDB version. It seems very old to me.
When attempting to insert 💩 (a 4-byte Unicode character, for example), both MySQL (5.7) and MariaDB (10.2/10.3/10.4) give the same error:
Incorrect string value: '\xF0\x9F\x92\xA9'
The statement:
mysql> insert into bob (test) values ('💩');
Here's my database's charset/collation:
mysql> select @@collation_database;
+----------------------+
| @@collation_database |
+----------------------+
| utf8mb4_unicode_ci |
+----------------------+
1 row in set (0.00 sec)
mysql> SELECT @@character_set_database;
+--------------------------+
| @@character_set_database |
+--------------------------+
| utf8mb4 |
+--------------------------+
1 row in set (0.00 sec)
The server's character set:
mysql> show global variables like '%character_set_server%'\G
*************************** 1. row ***************************
Variable_name: character_set_server
Value: utf8mb4
The table:
create table bob ( `test` TEXT NOT NULL );
mysql> SHOW FULL COLUMNS FROM bob;
+-------+------+--------------------+------+-----+---------+-------+---------------------------------+---------+
| Field | Type | Collation | Null | Key | Default | Extra | Privileges | Comment |
+-------+------+--------------------+------+-----+---------+-------+---------------------------------+---------+
| test | text | utf8mb4_unicode_ci | NO | | NULL | | select,insert,update,references | |
+-------+------+--------------------+------+-----+---------+-------+---------------------------------+---------+
1 row in set (0.00 sec)
Can anyone point me in the right direction?
Yes, as you commented, you need to use SET NAMES utf8mb4.
Your 4-byte character must pass from your client through the database connection and into a table. All of those must support utf8mb4. If any one of them does not support utf8mb4, then 4-byte characters will not be able to get through.
SET NAMES utf8mb4 makes the database session expect clients to send strings using that encoding. The default for character_set_client on MySQL 5.7 is utf8, so you need to set it to utf8mb4.
In MySQL 8.0.1 and later, the default character_set_client is utf8mb4 already, so you won't need to change it.
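The bytes quoted in the error message can be decoded to confirm which character was rejected and that it really needs four bytes. This is a quick Python check, added here as an illustration only:

```python
# Bytes copied from the error message: Incorrect string value: '\xF0\x9F\x92\xA9'
rejected = bytes([0xF0, 0x9F, 0x92, 0xA9])

char = rejected.decode("utf-8")
print(char, hex(ord(char)))

# Code points above U+FFFF lie outside the Basic Multilingual Plane and
# take 4 bytes in UTF-8, which is what utf8mb4 exists to carry.
print(ord(char) > 0xFFFF, len(rejected))
```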
I have stored a value as varchar and as bigint in a MySQL DB:
userID_as_varchar varchar(50) DEFAULT NULL,
userID_as_bigint bigint(20) DEFAULT NULL,
+--------------------+---------------------------+
| userID_as_varchar | userID_as_bigint |
+--------------------+---------------------------+
| 917876131364446205 | 917876131364446200 |
+--------------------+---------------------------+
For some reason, I can't retrieve the userID_as_bigint value at full precision with SQL, but I can with R.
Behaviour in SQL:
Whether I query the data or cast it, I always get the "rounded" value.
Tested in phpMyAdmin and directly with the sql command in the shell.
Behaviour in R:
If I query the field with R (RMySQL package), the value is complete: 917876131364446205
Can anyone explain this behaviour, or does anyone know a way to get the full value with SQL?
Best regards.
I'm not quite sure what you mean. Here's a test:
create table test(t1 varchar(50), t2 bigint);
Query OK, 0 rows affected (0.03 sec)
mysql> desc test;
+-------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+-------+
| t1 | varchar(50) | YES | | NULL | |
| t2 | bigint(20) | YES | | NULL | |
+-------+-------------+------+-----+---------+-------+
2 rows in set (0.02 sec)
mysql> insert into test values('917876131364446205', 917876131364446205);
Query OK, 1 row affected (0.01 sec)
mysql> select * from test;
+--------------------+--------------------+
| t1 | t2 |
+--------------------+--------------------+
| 917876131364446205 | 917876131364446205 |
+--------------------+--------------------+
1 row in set (0.00 sec)
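Since the test above stores and returns the value intact, the rounding may be happening outside the bigint column itself, for instance in a client or an insert path that treats the number as a double-precision float. That is only a guess and has not been confirmed for the setup in the question. This Python sketch shows that a double cannot hold this value exactly and that the "rounded" value from the question maps to the same double:

```python
stored = 917876131364446205        # the value from the varchar column

# What a client ends up with if it parses the number as a double
as_double = float(stored)
print(int(as_double))              # nearest representable double, as an exact integer

# The "rounded" value shown in the question maps to the same double
print(as_double == float("917876131364446200"))

# Integers are only guaranteed to be exact in a double up to 2**53
print(stored > 2**53)
```

If that is the cause, casting the column to a string in the query (for example with CAST(... AS CHAR)) would be one way to check whether the stored value is still exact.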
In MySQL, what is the best way of programmatically retrieving the character set and the collation of the current database?
Is the following:
SELECT
default_character_set_name, default_collation_name
FROM
information_schema.SCHEMATA
WHERE
SCHEMA_NAME = SCHEMA()
equivalent to the example below?
select @@character_set_database, @@collation_database
According to the documentation:
character_set_database
...
The character set used by the default
database. The server sets this variable whenever the default database
changes. If there is no default database, the variable has the same
value as character_set_server.
...
and
collation_database
...
The collation used by the default database. The
server sets this variable whenever the default database changes. If
there is no default database, the variable has the same value as
collation_server.
...
So both statements should return the same result:
SELECT
default_character_set_name, default_collation_name
FROM
information_schema.SCHEMATA
WHERE
SCHEMA_NAME = SCHEMA()
and
select @@character_set_database, @@collation_database
as demonstrated in the following test:
mysql> DROP DATABASE IF EXISTS `my_database`;
Query OK, 0 rows affected (0.01 sec)
mysql> SELECT SCHEMA();
+----------+
| SCHEMA() |
+----------+
| NULL |
+----------+
1 row in set (0.00 sec)
mysql> SELECT
-> @@SESSION.character_set_database,
-> @@SESSION.collation_database,
-> @@SESSION.character_set_server,
-> @@SESSION.collation_server;
+----------------------------------+------------------------------+--------------------------------+----------------------------+
| @@SESSION.character_set_database | @@SESSION.collation_database | @@SESSION.character_set_server | @@SESSION.collation_server |
+----------------------------------+------------------------------+--------------------------------+----------------------------+
| latin1 | latin1_swedish_ci | latin1 | latin1_swedish_ci |
+----------------------------------+------------------------------+--------------------------------+----------------------------+
1 row in set (0.00 sec)
mysql> CREATE DATABASE IF NOT EXISTS `my_database`
-> CHARACTER SET utf8mb4
-> COLLATE utf8mb4_general_ci;
Query OK, 1 row affected (0.00 sec)
mysql> SELECT SCHEMA();
+----------+
| SCHEMA() |
+----------+
| NULL |
+----------+
1 row in set (0.00 sec)
mysql> SELECT
-> @@SESSION.character_set_database,
-> @@SESSION.collation_database,
-> @@SESSION.character_set_server,
-> @@SESSION.collation_server;
+----------------------------------+------------------------------+--------------------------------+----------------------------+
| @@SESSION.character_set_database | @@SESSION.collation_database | @@SESSION.character_set_server | @@SESSION.collation_server |
+----------------------------------+------------------------------+--------------------------------+----------------------------+
| latin1 | latin1_swedish_ci | latin1 | latin1_swedish_ci |
+----------------------------------+------------------------------+--------------------------------+----------------------------+
1 row in set (0.00 sec)
mysql> USE `my_database`;
Database changed
mysql> SELECT SCHEMA();
+-------------+
| SCHEMA() |
+-------------+
| my_database |
+-------------+
1 row in set (0.00 sec)
mysql> SELECT
-> @@SESSION.character_set_database,
-> @@SESSION.collation_database,
-> @@SESSION.character_set_server,
-> @@SESSION.collation_server;
+----------------------------------+------------------------------+--------------------------------+----------------------------+
| @@SESSION.character_set_database | @@SESSION.collation_database | @@SESSION.character_set_server | @@SESSION.collation_server |
+----------------------------------+------------------------------+--------------------------------+----------------------------+
| utf8mb4 | utf8mb4_general_ci | latin1 | latin1_swedish_ci |
+----------------------------------+------------------------------+--------------------------------+----------------------------+
1 row in set (0.00 sec)
mysql> SELECT
-> `DEFAULT_CHARACTER_SET_NAME`,
-> `DEFAULT_COLLATION_NAME`
-> FROM
-> `information_schema`.`SCHEMATA`
-> WHERE
-> SCHEMA_NAME = SCHEMA();
+----------------------------+------------------------+
| DEFAULT_CHARACTER_SET_NAME | DEFAULT_COLLATION_NAME |
+----------------------------+------------------------+
| utf8mb4 | utf8mb4_general_ci |
+----------------------------+------------------------+
1 row in set (0.00 sec)
However, combining both statements in a single query would not be wrong either:
mysql> USE `my_database`;
Database changed
mysql> SELECT
-> `DEFAULT_CHARACTER_SET_NAME`,
-> `DEFAULT_COLLATION_NAME`
-> FROM
-> `information_schema`.`SCHEMATA`
-> WHERE
-> SCHEMA_NAME = SCHEMA() AND
-> `DEFAULT_CHARACTER_SET_NAME` = @@SESSION.character_set_database AND
-> `DEFAULT_COLLATION_NAME` = @@SESSION.collation_database;
+----------------------------+------------------------+
| DEFAULT_CHARACTER_SET_NAME | DEFAULT_COLLATION_NAME |
+----------------------------+------------------------+
| utf8mb4 | utf8mb4_general_ci |
+----------------------------+------------------------+
1 row in set (0.00 sec)