Data too long for column error with national characters - mysql

I have to port some databases to a standalone MySQL server, version 5.0.18, running on 64-bit Windows 7, and I am stuck on a problem. If I try to insert any national/Unicode character into a VARCHAR column, I get this error:
ERROR 1406 (22001): Data too long for column 'nam' at row 1
Here is an MCVE SQL script:
SET NAMES utf8;
DROP TABLE IF EXISTS `tab`;
CREATE TABLE `tab` (`ix` INT default 0,`nam` VARCHAR(1024) default '' ) DEFAULT CHARSET=utf8;
INSERT INTO `tab` VALUES (1,'motorček');
INSERT INTO `tab` VALUES (2,'motorcek');
SELECT * FROM `tab`;
And here is the output:
mysql> SET NAMES utf8;
Query OK, 0 rows affected (0.00 sec)
mysql> DROP TABLE IF EXISTS `tab`;
Query OK, 0 rows affected (0.00 sec)
mysql> CREATE TABLE `tab` (`ix` INT default 0,`nam` VARCHAR(1024) default '' ) DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (0.00 sec)
mysql> INSERT INTO `tab` VALUES (1,'motorček');
ERROR 1406 (22001): Data too long for column 'nam' at row 1
mysql> INSERT INTO `tab` VALUES (2,'motorcek');
Query OK, 1 row affected (0.00 sec)
mysql> SELECT * FROM `tab`;
+------+----------+
| ix | nam |
+------+----------+
| 2 | motorcek |
+------+----------+
1 row in set (0.00 sec)
As you can see, the entry with the national character č (E8h) is missing.
I am aware of these QAs:
How to make MySQL handle UTF-8 properly
“Data too long for column” - why?
Error Code: 1406. Data too long for column - MySQL
but they do not address this problem (none of their solutions works here).
This problem occurs even for single-character strings, no matter the size of the VARCHAR. So the only workaround for now is to change the national characters into ASCII, but that would lose information, which I would rather avoid.
I tried various character sets (utf8, ucs2, latin1) without any effect.
I tried dropping STRICT_TRANS_TABLES, as some of the other answers suggest, but that has no effect either (and the column is many times larger than needed).
Does anyone have any clues? Maybe it has something to do with the fact that this MySQL server is standalone (it is not installed). It is started with this cmd script:
@echo off
bin\mysqld --defaults-file=bin\my.ini --standalone --console --wait_timeout=2147483 --interactive_timeout=2147483
if errorlevel 1 goto error
goto finish
:error
echo.
echo MySQL could not be started
pause
:finish
and queries are run inside a console started with this cmd script:
@echo off
bin\mysql.exe -uroot -h127.0.0.1 -P3306
rem bin\mysql.exe -uroot -proot -h127.0.0.1 -P3306

Well, looking at the code of the character č, E8h (while writing the question), it does not look like UTF-8 but rather like extended ASCII (a code above 7Fh), which finally pointed me to try this MySQL script:
SET NAMES latin1;
DROP TABLE IF EXISTS `tab`;
CREATE TABLE `tab` (`ix` INT default 0,`nam` VARCHAR(1024) default '' );
INSERT INTO `tab` VALUES (1,'motorček');
INSERT INTO `tab` VALUES (2,'motorcek');
SELECT * FROM `tab`;
Which finally works (silly me, I thought I had already tried it before without a correct result). So my error was forcing Unicode (which was set as the default) for non-Unicode strings (which I think should work). Here is the result:
mysql> SET NAMES latin1;
Query OK, 0 rows affected (0.00 sec)
mysql> DROP TABLE IF EXISTS `tab`;
Query OK, 0 rows affected (0.00 sec)
mysql> CREATE TABLE `tab` (`ix` INT default 0,`nam` VARCHAR(1024) default '' );
Query OK, 0 rows affected (0.02 sec)
mysql> INSERT INTO `tab` VALUES (1,'motorček');
Query OK, 1 row affected (0.01 sec)
mysql> INSERT INTO `tab` VALUES (2,'motorcek');
Query OK, 1 row affected (0.00 sec)
mysql> SELECT * FROM `tab`;
+------+----------+
| ix | nam |
+------+----------+
| 1 | motorček |
| 2 | motorcek |
+------+----------+
2 rows in set (0.00 sec)
As you can see, there is some discrepancy in the table formatting, but that does not matter much as the presentation will be done in C++ anyway.
Without writing this question I would probably have been going in circles for hours or even days. Hopefully this helps others too.
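To see the mismatch outside MySQL, it helps to look at the raw bytes. The sketch below is Python rather than SQL, and it assumes the Windows console or editor was using code page 1250 (Central European); the post itself only states that the byte for č is E8h. Under that assumption the client sends a single byte E8h, which is not a valid UTF-8 sequence on its own, so announcing utf8 with SET NAMES does not describe the data that actually arrives.

```python
# Assumption: the client side used Windows code page 1250.
# The post only tells us that 'č' was sent as the single byte E8h.
word = "motorček"

legacy_bytes = word.encode("cp1250")  # what a cp1250 console would send
utf8_bytes = word.encode("utf-8")     # what SET NAMES utf8 tells the server to expect


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(legacy_bytes.hex(" "))        # 6d 6f 74 6f 72 e8 65 6b
print(utf8_bytes.hex(" "))          # 6d 6f 74 6f 72 c4 8d 65 6b
print(is_valid_utf8(legacy_bytes))  # False
print(is_valid_utf8(utf8_bytes))    # True
```

With SET NAMES latin1 every single byte is an acceptable character, which would explain why the second script goes through, although the server then treats E8h as a latin1 character rather than as č.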
[Edit1]
Now I have another problem caused by Windows. If I pass the script via the clipboard or type it myself, all is OK, but if I use the source command with a file, the national characters go wrong (and the -e option does not help either). As I need to use files, I am still looking for a solution. But as this is a different problem, I decided to ask a new question:
Using source command corrupts non Unicode text encoding


Different behavior with MySQL 5.7 and 8.0

I'm trying to use MySQL 8.0 but I'm having some problems. I have installed MySQL 5.7 and 8.0, and they behave differently with CHAR columns.
For MySQL 5.7:
mysql> create table test (id integer, c5 char(5));
Query OK, 0 rows affected (0.00 sec)
mysql> insert into test values(0, 'a');
Query OK, 1 row affected (0.00 sec)
mysql> select * from test where c5 = 'a ';
+------+------+
| id | c5 |
+------+------+
| 0 | a |
+------+------+
1 row in set (0.00 sec)
mysql>
For MySQL 8.0:
mysql> create table test (id integer, c5 char(5));
Query OK, 0 rows affected (0.01 sec)
mysql> insert into test values(0, 'a');
Query OK, 1 row affected (0.01 sec)
mysql> select * from test where c5 = 'a ';
Empty set (0.00 sec)
mysql>
Both servers have the same configuration.
MySQL 5.7:
[mysqld]
port=3357
datadir=/opt/mysql_57/data
sql_mode="STRICT_TRANS_TABLES,NO_ENGINE_SUBSTITUTION"
default_storage_engine=innodb
character-set_server=utf8mb4
socket=/opt/mysql_57/mysql57.sock
max_allowed_packet=4194304
server_id=1
lower_case_table_names=0
MySQL 8.0:
[mysqld]
port=3380
datadir=/opt/mysql_80/data
sql_mode="STRICT_TRANS_TABLES,NO_ENGINE_SUBSTITUTION"
default_storage_engine=innodb
character-set_server=utf8mb4
socket=/opt/mysql_80/mysql80.sock
max_allowed_packet=4194304
server_id=1
lower_case_table_names=0
A brief look through the MySQL 8.0 changelog didn't give me any information. Where is this behavior change described?
Best regards.
How MySQL handles trailing spaces depends on the collation being used. See https://dev.mysql.com/doc/refman/8.0/en/charset-binary-collations.html for details.
What has changed between 5.7 and 8.0 is that the default character set is now utf8mb4, whose default collation (utf8mb4_0900_ai_ci) is a NO PAD collation.
If you want different behavior, change the character set/collation of your column/table/database. Check the INFORMATION_SCHEMA table COLLATIONS for the available PAD SPACE collations. (One warning: the older PAD SPACE collations may be less efficient. Quite some work has gone into improving the performance of the new Unicode collations based on UCA 9.0.0.)
See also the PAD_CHAR_TO_FULL_LENGTH SQL mode in the MySQL documentation.
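The difference between the two pad attributes can be modelled in a few lines. This is only a toy model in Python, not how the server implements collations: it ignores case and accent rules and looks only at trailing spaces. The function names are made up for the illustration.

```python
def equal_pad_space(a: str, b: str) -> bool:
    """Toy model of a PAD SPACE collation: trailing spaces are ignored."""
    return a.rstrip(" ") == b.rstrip(" ")


def equal_no_pad(a: str, b: str) -> bool:
    """Toy model of a NO PAD collation: trailing spaces are significant."""
    return a == b


# A CHAR(5) value 'a' is returned without its pad spaces
# (unless PAD_CHAR_TO_FULL_LENGTH is enabled).
stored = "a"
searched = "a "  # the literal from the WHERE clause, with a trailing space

print(equal_pad_space(stored, searched))  # True, like the 5.7 result
print(equal_no_pad(stored, searched))     # False, like the 8.0 result
```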

MySql update. difference between procedure results and phpmyadmin sql

I am having difficulty getting a procedure to update a table in the way I require. I am using phpMyAdmin on my local computer. In phpMyAdmin I can put the following code into the SQL tab and one row will be updated:
UPDATE `raceresults` SET `adjCost` = 22.05 WHERE `Name` LIKE CONCAT('magic', '%') AND `idKey` = '2016fulham02345';
As expected and wanted, IF the name begins with magic AND the idKey is '2016fulham02345' THEN the adjCost is updated to 22.05.
There will be between 2 and 50 rows with the same idKey. The Name will never be repeated in a set with the same idKey.
I created a procedure with the following parameters:
IN idK VARCHAR 255 Charset
IN aName VARCHAR 255 Charset
IN cost FLOAT 5,2
BEGIN
UPDATE `raceresults` SET `adjCost` = cost WHERE `Name` LIKE CONCAT(aName, '%') AND `idKey` = idK;
END
When I run this procedure, it updates adjCost in ALL rows where idKey = idK and seems to ignore the name parameter.
I have tried concatenating the name string first:
BEGIN
SELECT CONCAT(aName, '%') INTO @str;
UPDATE `raceresults` SET `adjCost` = cost WHERE `Name` = @str AND `idKey` = idK;
END
but to no avail.
I looked through w3schools, stackoverflow and google and have not been able to find the answer.
My question is:
How can I correct my procedure to get it to work as I would like?
UPDATE: as requested.
CREATE DEFINER=`root`@`localhost` PROCEDURE `importAltUpdateAjdCost`(IN `idK` VARCHAR(255), IN `aName` VARCHAR(255), IN `cost` FLOAT(5,2))
NO SQL
BEGIN
UPDATE `costingPP`
SET `adjCost` = cost
WHERE
`Name` LIKE CONCAT(aName, '%')
AND
`idKey` = idK;
END
To get this, I selected export on my list of procedures on phpmyadmin.
I'm not entirely sure what you did or how, but here's what I did, and it worked right away. Since you didn't specify the MySQL version, I used 5.7.
EDIT: Looking back at your procedure creation statement, I noticed that it is declared NO SQL. Since your procedure clearly contains SQL, please remove the NO SQL characteristic and re-create the procedure.
I'm leaving my MySQL 5.7 sample here for reference:
1) Created a simple table:
mysql> CREATE TABLE raceresults (
-> idKey VARCHAR(255),
-> Name VARCHAR(255),
-> adjCost FLOAT(5,2)
-> );
Query OK, 0 rows affected (0.06 sec)
2) Here we insert a sample data row:
mysql> INSERT INTO raceresults VALUES ('2016fulham02345', 'magicFlyingHorse', 0.00);
Query OK, 1 row affected (0.01 sec)
3) To create a stored procedure we have to temporarily set a different delimiter, so that the client does not end the procedure definition at the default semicolon, which is used inside the procedure body. After changing the delimiter we create the procedure and then set the delimiter back to a semicolon:
mysql> DELIMITER //
mysql> CREATE PROCEDURE update_test(IN idK VARCHAR(255), IN aName VARCHAR(255), IN cost FLOAT(5,2))
-> BEGIN
-> UPDATE `raceresults` SET `adjCost` = cost WHERE `Name` LIKE CONCAT(aName, '%') AND `idKey` = idK;
-> END//
Query OK, 0 rows affected (0.00 sec)
mysql> DELIMITER ;
4) Now let's see how it all works. I select the rows from the table before and after the procedure call. You can see the adjCost value change:
mysql> SELECT * FROM raceresults;
+-----------------+------------------+---------+
| idKey | Name | adjCost |
+-----------------+------------------+---------+
| 2016fulham02345 | magicFlyingHorse | 0.00 |
+-----------------+------------------+---------+
1 row in set (0.00 sec)
mysql> CALL update_test('2016fulham02345', 'magic', 1.23);
Query OK, 1 row affected (0.02 sec)
mysql> SELECT * FROM raceresults;
+-----------------+------------------+---------+
| idKey | Name | adjCost |
+-----------------+------------------+---------+
| 2016fulham02345 | magicFlyingHorse | 1.23 |
+-----------------+------------------+---------+
1 row in set (0.00 sec)
And now one piece of advice too:
If possible, use only lower case for the names of tables, columns, indexes, functions, procedures, etc., while always writing SQL keywords in upper case (which you did). This is something of a de facto standard and makes life easier both for you and for others reading your code.
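For anyone debugging a similar procedure, it can help to model what the two WHERE clauses from the question select. The sketch below is Python, not MySQL, and simplifies LIKE to a case-sensitive prefix test with no % or _ inside aName. Apart from magicFlyingHorse, the sample rows are hypothetical. One thing it suggests ruling out (an assumption, not something confirmed in this thread) is that aName reaches the procedure as an empty string, because an empty prefix matches every row with that idKey.

```python
# Hypothetical sample rows; only magicFlyingHorse appears in the thread.
rows = [
    {"idKey": "2016fulham02345", "Name": "magicFlyingHorse"},
    {"idKey": "2016fulham02345", "Name": "slowCoach"},
    {"idKey": "2016epsom00001", "Name": "magicCarpet"},
]


def like_prefix_matches(rows, id_k, a_name):
    """Names hit by: Name LIKE CONCAT(a_name, '%') AND idKey = id_k.

    Simplified: case-sensitive, a_name is assumed to contain no % or _.
    """
    return [r["Name"] for r in rows
            if r["idKey"] == id_k and r["Name"].startswith(a_name)]


def equals_matches(rows, id_k, a_name):
    """Names hit by: Name = CONCAT(a_name, '%') AND idKey = id_k."""
    literal = a_name + "%"  # with =, the % is an ordinary character
    return [r["Name"] for r in rows
            if r["idKey"] == id_k and r["Name"] == literal]


print(like_prefix_matches(rows, "2016fulham02345", "magic"))  # ['magicFlyingHorse']
print(like_prefix_matches(rows, "2016fulham02345", ""))       # ['magicFlyingHorse', 'slowCoach']
print(equals_matches(rows, "2016fulham02345", "magic"))       # []
```

The last call also shows why the second attempt in the question cannot work: with = the pattern 'magic%' is compared as a literal string.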

MySQL query returns 0 rows when searching for value with dot (.) in string

If I try to search for a value in a MySQL database and the string value contains a dot, the query returns 0 rows. Example:
SELECT * FROM table WHERE `username`='marco.polo' --> 0 rows
SELECT * FROM table WHERE `username` LIKE '%.polo%' --> 0 rows
SELECT * FROM table WHERE `username` LIKE 'polo' --> Success
This appeared after moving the server and database to another place. I know that the dot is a special character in extended regular expressions, but that should not apply to the = or LIKE operators, simply because I don't use REGEXP in the query.
I've tested the same query on my local database and it works fine.
Could there be a special setting in mysql that treats dot differently than it usually does?
user1084605, I tried to replicate the problem (using MySQL version 5.1.37), but got exactly the opposite results from yours. See below:
mysql> create table test (username varchar(100));
Query OK, 0 rows affected (0.01 sec)
mysql> insert into test values ('marco.polo');
Query OK, 1 row affected (0.00 sec)
mysql> SELECT * FROM test WHERE `username`='marco.polo';
+------------+
| username |
+------------+
| marco.polo |
+------------+
1 row in set (0.00 sec)
mysql> SELECT * FROM test WHERE `username` LIKE '%.polo%';
+------------+
| username |
+------------+
| marco.polo |
+------------+
1 row in set (0.00 sec)
mysql> SELECT * FROM test WHERE `username` LIKE 'polo';
Empty set (0.00 sec)
According to the MySQL docs, the only special characters when using the LIKE operator are "%" (percent: matches 0, 1, or many characters) and "_" (underscore: matches one and only one character).
http://dev.mysql.com/doc/refman/5.0/en/string-comparison-functions.html
A "." (period) does have special meaning for MySQL's REGEXP operator, but it should still match a literal period in your column.
http://dev.mysql.com/doc/refman/5.0/en/regexp.html
Can you replicate the SQL statements I ran above and paste your results in reply?
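To make the wildcard rules concrete, here is a small Python model of LIKE matching. It is a sketch only: it treats % and _ as the sole wildcards, matches case-insensitively as the common default collations do, and does not handle the backslash escape character. The helper names are made up for the illustration.

```python
import re


def like_to_regex(pattern: str) -> re.Pattern:
    """Translate a LIKE pattern to a regex: % -> .*, _ -> ., rest literal."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))  # a dot becomes \. and stays literal
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)


def like(value: str, pattern: str) -> bool:
    """LIKE must match the whole value, not just a part of it."""
    return like_to_regex(pattern).fullmatch(value) is not None


print(like("marco.polo", "%.polo%"))  # True
print(like("marcoXpolo", "%.polo%"))  # False, the dot is not a wildcard
print(like("marco.polo", "polo"))     # False, no wildcard and not the whole value
```

This agrees with the transcript above: a pattern without wildcards has to equal the whole value, and a dot only ever matches a dot.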
As @cen already mentioned, the character set can cause this problem.
I had this example:
`email` VARCHAR(45) CHARACTER SET 'armscii8' NOT NULL,
This was in the .sql dump that I received.
So when I tried to fetch the object with this email, I couldn't get it.
The query below covers the case where we only need to find values that contain a dot:
SELECT * FROM test WHERE `username` LIKE '%.%';

2 servers, 2 memory tables, different sizes

I have got two servers both running a MySQL instance. The first one, server1, is running MySQL 5.0.22. The other one, server2, is running MySQL 5.1.58.
When I create a memory table on server1 and I add a row its size is instantly 8,190.0 KiB.
When I create a memory table on server2 and I add a row its size is still only some bytes, though.
Is this caused by the difference in MySQL version or (hopefully) is this due to some setting I can change?
EDIT:
I haven't found the reason for this behaviour yet, but I did find a workaround. So, for future reference, this is what fixed it for me:
All my memory tables are created once and are read-only from then on. When you tell MySQL the maximum number of rows your table will have, its size will shrink. The following query will do that for you.
ALTER TABLE table_name MAX_ROWS = N
Factor of 2?
OK, the problem is likely caused by UTF-8 vs latin1:
http://dev.mysql.com/doc/refman/5.0/en/storage-requirements.html
You can check the connection character set and the database default character set on both servers.
Here is the test I have just done:
mysql> create table test ( name varchar(10) ) engine
-> =memory;
Query OK, 0 rows affected (0.03 sec)
mysql> show create table test;
+-------+------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-------+------------------------------------------------------------------------------------------------+
| test | CREATE TABLE `test` (
`name` varchar(10) DEFAULT NULL
) ENGINE=MEMORY DEFAULT CHARSET=latin1 |
+-------+------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
mysql> insert into test values ( 1 );
mysql> set names utf8;
Query OK, 0 rows affected (0.01 sec)
mysql> create table test2 ( name varchar(10) ) engine =memory default charset = utf8;
Query OK, 0 rows affected (0.01 sec)
Query OK, 0 rows affected (0.01 sec)
mysql> insert into test2 values ( convert(1 using utf8) );
Query OK, 1 row affected (0.01 sec)
mysql> select table_name, avg_row_length from information_schema.tables where TABLE_NAME in( 'test2', 'test');
+------------+----------------+
| table_name | avg_row_length |
+------------+----------------+
| test | 12 |
| test2 | 32 |
+------------+----------------+
2 rows in set (0.01 sec)
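The two avg_row_length values can be reproduced with simple arithmetic. MEMORY tables use fixed-length rows, so a VARCHAR reserves its maximum size, and MySQL's utf8 needs up to 3 bytes per character. The 2-byte per-row overhead in this Python sketch is an assumption chosen because it fits the numbers above; it is not taken from the documentation.

```python
# Maximum bytes per character; MySQL's utf8 (utf8mb3) is at most 3 bytes.
MAX_BYTES_PER_CHAR = {"latin1": 1, "utf8": 3}


def estimated_memory_row_length(varchar_len: int, charset: str,
                                overhead: int = 2) -> int:
    """Estimate the fixed row length of a one-column MEMORY table.

    overhead=2 is an assumed per-row cost (for example a length byte
    plus a flag byte), picked to match the transcript above.
    """
    return varchar_len * MAX_BYTES_PER_CHAR[charset] + overhead


print(estimated_memory_row_length(10, "latin1"))  # 12
print(estimated_memory_row_length(10, "utf8"))    # 32
```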

double checking my mysql field lengths

I am creating my first serious project in PHP and I want to make sure I have my database set up correctly. It uses utf8_general_ci and, for example, the maximum length I want for usernames is 20 characters, so the username field in the database would be a varchar(20)? Sorry if this is stupid; I just read something somewhere that is making me question myself.
Yes, you're right:
CREATE DATABASE my_test_db
DEFAULT CHARACTER SET utf8
DEFAULT COLLATE utf8_general_ci;
Query OK, 1 row affected (0.00 sec)
USE my_test_db;
Database changed
CREATE TABLE users (username varchar(20));
Query OK, 0 rows affected (0.04 sec)
INSERT INTO users VALUES ('abcdefghijklmnopqrstuvwxyz');
Query OK, 1 row affected, 1 warning (0.00 sec)
SELECT * FROM users;
+----------------------+
| username |
+----------------------+
| abcdefghijklmnopqrst |
+----------------------+
1 row in set (0.00 sec)
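The point behind the transcript is that the 20 in varchar(20) counts characters, not bytes, so multi-byte UTF-8 names up to 20 characters still fit. The Python sketch below models that, along with the silent truncation seen above; the store_varchar helper and its strict flag are hypothetical stand-ins for the server's non-strict and strict SQL modes.

```python
def store_varchar(value: str, max_chars: int, strict: bool = False) -> str:
    """Model of storing a string in VARCHAR(max_chars).

    Non-strict mode truncates (MySQL adds a warning); strict mode rejects.
    """
    if len(value) <= max_chars:
        return value
    if strict:
        raise ValueError("Data too long for column")
    return value[:max_chars]


print(store_varchar("abcdefghijklmnopqrstuvwxyz", 20))  # abcdefghijklmnopqrst

name = "č" * 20                           # 20 characters, 2 bytes each in UTF-8
print(len(name))                          # 20
print(len(name.encode("utf-8")))          # 40
print(store_varchar(name, 20) == name)    # True, nothing is cut off
```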