MySQL truncates concatenated result of a GROUP_CONCAT function - mysql

I've created a view which uses GROUP_CONCAT to concatenate results from a query on products column with data type of 'varchar(7) utf8_general_ci' in a column named concat_products.
The problem is that MySQL truncates value of "concat_products" column.
phpMyAdmin says the data type of "concat_products" column is varchar(341) utf8_bin
Table products:
CREATE TABLE `products`(
`productId` tinyint(2) unsigned NOT NULL AUTO_INCREMENT,
`product` varchar(7) COLLATE utf8_general_ci NOT NULL,
`price` mediumint(5) unsigned NOT NULL,
PRIMARY KEY (`productId`)
) ENGINE=InnoDB AUTO_INCREMENT=28 DEFAULT CHARSET=utf8 COLLATE=utf8_general_ci
The "concat_products_vw" view:
CREATE VIEW concat_products_vw AS
SELECT
`userId`,
GROUP_CONCAT(CONCAT_WS('_', `product`, `productId`, `price`)
ORDER BY `productId` ASC SEPARATOR '*') AS concat_products
FROM
`users`
LEFT JOIN `products`
ON `users`.`accountBalance` >= `products`.`price`
GROUP BY `productId`
According to MySQL manual:
Values in VARCHAR columns are variable-length strings
Length can be specified as a value from 1 to 255 before MySQL 4.0.2 and 0 to 255 as of MySQL 4.0.2.
EDIT
Values in VARCHAR columns are variable-length strings. The length can be specified as a value from 0 to 65,535.
Why does MySQL specify more than 255 characters for the varchar "concat_products" column? (solved!)
Why utf8_bin instead of utf8_general_ci?
Is it possible to change the data type of a column in a view, for example to text for the "concat_products" column in my case?
If not, what can I do to prevent MySQL from truncating the "concat_products" column?

As I already wrote in an earlier comment, the MySQL manual says:
Values in VARCHAR columns are variable-length strings. The length can
be specified as a value from 0 to 65,535.
So the problem is not with the data type of the field.
The MySQL manual also says:
The result is truncated to the maximum length that is given by the
group_concat_max_len system variable, which has a default value of
1024. The value can be set higher, although the effective maximum length of the return value is constrained by the value of
max_allowed_packet. The syntax to change the value of
group_concat_max_len at runtime is as follows, where val is an
unsigned integer:
SET [GLOBAL | SESSION] group_concat_max_len = val;
Your options for changing the value of group_concat_max_len are:
changing the value at MySQL startup by appending this to the command:
--group_concat_max_len=your_value_here
adding this line to the [mysqld] section of your MySQL configuration file (typically my.cnf, or my.ini on Windows): group_concat_max_len=your_value_here
running this command after MySQL startup:
SET GLOBAL group_concat_max_len=your_value_here;
running this command after opening a MySQL connection:
SET SESSION group_concat_max_len=your_value_here;
Documentation: SET, Server System Variables: group_concat_max_len
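As a minimal sketch of the per-connection option, the statements below check the current limit, raise it for the session, and then read from the view in the question. The value 1000000 is only an illustrative assumption; pick one that fits your data and stays below max_allowed_packet.

```sql
-- Show the current limit (1024 unless it has been changed).
SHOW VARIABLES LIKE 'group_concat_max_len';

-- Raise the limit for this connection only.
-- 1000000 is an arbitrary example value, not a recommendation.
SET SESSION group_concat_max_len = 1000000;

-- The view is evaluated when it is queried, so the new session
-- limit should apply to the GROUP_CONCAT inside it.
SELECT `userId`, `concat_products`
FROM concat_products_vw;
```

Because SET SESSION affects only the current connection, it has to be issued again on every new connection unless the global or configuration-file value is changed as well.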

As Jocelyn mentioned, the size of a GROUP_CONCAT() result is bounded by group_concat_max_len. However, there is an additional interaction with ORDER BY that results in a further truncation to 1/3 of group_concat_max_len. For an example, see this related answer.
The default value for group_concat_max_len is 1024, and 1024 / 3 = 341 probably explains why the type of concat_products shows up as varchar(341) in the original example. If you were to remove the GROUP BY productId clause, concat_products should show up as varchar(1024).
I have not found this interaction between GROUP_CONCAT() and ORDER BY mentioned in the MySQL Manual, but it affects at least MySQL Server 5.1.

As of 2023, on a Linux server you can edit the my.cnf file and set:
[mysqld]
group_concat_max_len = 2048

Related

MySQL: How to add a column as varchar(21844) with UTF8 charset?

If I execute this query:
CREATE TABLE `varchar_test1` (
`id` tinyint(1) NOT NULL,
`cloumn_1` varchar(21844) NOT NULL) ENGINE=InnoDB DEFAULT CHARSET=utf8;
it is ok.
If I then execute this:
ALTER TABLE `varchar_test1` ADD COLUMN `cloumn_2` varchar(21844) NOT NULL;
I get an error:
ERROR 1118 (42000): Row size too large. The maximum row size for the used table type, not counting BLOBs, is 65535. This includes storage overhead, check the manual. You have to change some columns to TEXT or BLOBs
If I execute this:
CREATE TABLE `varchar_test2` (
`id` int NOT NULL AUTO_INCREMENT,
`cloumn_1` varchar(21844) NOT NULL,
PRIMARY KEY (`id`)) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;
I get:
ERROR 1118 (42000): Row size too large. The maximum row size for the used table type, not counting BLOBs, is 65535. This includes storage overhead, check the manual. You have to change some columns to TEXT or BLOBs
Why?
Running mysql --version returns
mysql Ver 14.14 Distrib 5.7.17, for macos10.12 (x86_64) using EditLine wrapper
Your problem is that your columns store multi-byte values and thus exceed the maximum row size.
As explained in the docs, MySQL tables have a maximum row size of 65,535 bytes, regardless of the engine you use. Adding your two varchar(21844) columns means you go over this limit. This happens because the CHARSET on your table is utf8 (currently an alias for utf8mb3), so each character of a varchar in your table takes a maximum of 3 bytes (plus 2 bytes to store the length). That's fine with one column of 21,844 characters, but not with two.
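The arithmetic behind this can be sketched directly in SQL. This is a simplified accounting that assumes 3 bytes per utf8 character, a 2-byte length prefix per VARCHAR, 1 byte for TINYINT and 4 bytes for INT; it ignores any other per-row overhead, so treat the numbers as an approximation of what the server checks against the 65,535-byte limit.

```sql
-- Simplified row-size estimate for the tables in the question.
SELECT
  21844 * 3 + 2           AS one_varchar_bytes,         -- 65534
  1 + (21844 * 3 + 2)     AS tinyint_plus_one_varchar,  -- 65535, fits
  4 + (21844 * 3 + 2)     AS int_plus_one_varchar,      -- 65538, too large
  1 + 2 * (21844 * 3 + 2) AS tinyint_plus_two_varchars; -- 131069, too large
```

Under these assumptions, varchar_test1 lands exactly on the limit, while both adding a second column and swapping TINYINT for INT push the row past it, which matches the errors shown in the question.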
To fix this, use TEXT (or TINYTEXT, MEDIUMTEXT, etc.) instead of VARCHAR columns. This will cause the values to be stored separately, so each of your columns will actually only contribute a few bytes to the row size. (This is also explained in the docs.)
Also, FYI: the spelling is column, not cloumn.
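A hedged sketch of the TEXT-based fix follows. The table name varchar_test3 and the column names are made up for illustration. Note that TEXT holds up to 65,535 bytes, so it is at least as large as the original varchar(21844) columns.

```sql
-- Hypothetical table: TEXT values are stored off-row, so each column
-- adds only a few bytes to the row size instead of up to 65,534.
CREATE TABLE `varchar_test3` (
  `id` int NOT NULL AUTO_INCREMENT,
  `column_1` text NOT NULL,
  `column_2` text NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
```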

Cakephp 3 create i18n table in phpmyadmin issue

I have a problem creating the i18n table for the CakePHP 3 Translate Behavior. My database is managed in phpMyAdmin, and this is the piece of code from the official cookbook that I want to execute:
CREATE TABLE i18n (
id int NOT NULL auto_increment,
locale varchar(6) NOT NULL,
model varchar(255) NOT NULL,
foreign_key int(10) NOT NULL,
field varchar(255) NOT NULL,
content text,
PRIMARY KEY (id),
UNIQUE INDEX I18N_LOCALE_FIELD(locale, model, foreign_key, field),
INDEX I18N_FIELD(model, foreign_key, field)
);
phpMyAdmin says:
1071 - Specified key was too long; max key length is 767 bytes
I'm using utf8_unicode_ci. Should I switch to utf8_general_ci?
Thanks for your help.
There is no difference in size requirements between utf8_unicode_ci and utf8_general_ci; they differ only with regard to sorting.
By default, the index key prefix limit is 767 bytes for InnoDB tables (and 1000 bytes for MyISAM). If applicable, enable the innodb_large_prefix option (it is enabled by default as of MySQL 5.7), which raises the limit to 3072 bytes. Alternatively, make the VARCHAR columns smaller and/or change their character set. The locale column (which holds ISO locale/country codes) surely doesn't need Unicode characters, and chances are that your model and column/field names also use only ASCII characters and are well below 255 characters in length.
With an ASCII character set, the VARCHAR columns require only 1 byte per character, unlike UTF-8, which can require up to 3 bytes (or 4 bytes for the mb4 variants). With UTF-8, the two VARCHAR(255) columns alone already exceed the index size limit (3 * 255 * 2 = 1530).
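As a sketch of the character-set approach, the cookbook table could be declared with ASCII for the three indexed string columns. This assumes that locale codes, model names, and field names never contain non-ASCII characters; the content column keeps the table's default character set.

```sql
-- Indexed string columns use 1 byte per character with ascii,
-- so the unique key needs roughly 6 + 255 + 4 + 255 = 520 bytes.
CREATE TABLE i18n (
  id int NOT NULL auto_increment,
  locale varchar(6) CHARACTER SET ascii NOT NULL,
  model varchar(255) CHARACTER SET ascii NOT NULL,
  foreign_key int(10) NOT NULL,
  field varchar(255) CHARACTER SET ascii NOT NULL,
  content text,
  PRIMARY KEY (id),
  UNIQUE INDEX I18N_LOCALE_FIELD(locale, model, foreign_key, field),
  INDEX I18N_FIELD(model, foreign_key, field)
);
```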
See also
MySQL 5.7 Manual > Character Sets and Collations
MySQL 5.7 Manual > Limits on InnoDB Tables > Maximums and Minimums
MySQL 5.7 Manual > InnoDB Startup Options and System Variables > innodb_large_prefix
I got it working by limiting the column lengths:
model varchar(85) NOT NULL,
field varchar(85) NOT NULL,
With model and field at 85, which I think is enough, MySQL accepts it.
Hope that helps someone.

How to make MySQL handle strings like SQLite does, with regard to Unicode and collation?

I've been researching this question for several hours now, on SO, in MySQL docs, and elsewhere, but still can't find a satisfactory solution. The problem is:
What is the simplest way to make MySQL treat strings just like SQLite does, without any extra "smart" conversions?
For example, the following works perfectly in SQLite:
CREATE TABLE `dummy` (`key` VARCHAR(255) NOT NULL UNIQUE);
INSERT INTO `dummy` (`key`) VALUES ('one');
INSERT INTO `dummy` (`key`) VALUES ('one ');
INSERT INTO `dummy` (`key`) VALUES ('One');
INSERT INTO `dummy` (`key`) VALUES ('öne');
SELECT * FROM `dummy`;
However, in MySQL, with the following settings:
[client]
default-character-set = utf8mb4
[mysql]
default-character-set = utf8mb4
[mysqld]
character-set-client-handshake = FALSE
character-set-server = utf8mb4
collation-server = utf8mb4_bin
and the following CREATE DATABASE statement:
CREATE DATABASE `dummydb` DEFAULT CHARACTER SET utf8mb4 DEFAULT COLLATE utf8mb4_bin;
it still fails on the second INSERT.
I'd rather keep string column declarations as simple as possible, SQLite's TEXT being the ideal. Looks like VARBINARY is the way to go, but I would still like to hear your opinions on any other, potentially better options.
Addendum: The SHOW CREATE TABLE dummy output is
mysql> SHOW CREATE TABLE dummy;
+-------+-----------------------------------------------------
| Table | Create Table
+-------+-----------------------------------------------------
| dummy | CREATE TABLE `dummy` (
`key` varchar(255) COLLATE utf8mb4_bin NOT NULL,
UNIQUE KEY `key` (`key`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin |
+-------+-----------------------------------------------------
1 row in set (0.00 sec)
MySQL wants to convert strings when doing INSERT and SELECT. The conversion is between what you declare the client to have and what the column is declared to be storing.
The only way to avoid that is with VARBINARY and BLOB instead of VARCHAR and TEXT.
The use of COLLATION utf8mb4_bin does not avoid conversion to/from CHARACTER SET utf8mb4; it merely says that WHERE and ORDER BY should compare the bits instead of dealing with accents and case folding.
Keep in mind that CHARACTER SET utf8mb4 is a way to encode text; COLLATION utf8mb4_* is rules for comparing texts in that encoding. _bin is simpleminded.
UNIQUE involves comparing for equality, hence COLLATION. In most utf8mb4 collations, the 3 (without spaces) will compare equal. utf8mb4_bin will treat the 3 as different. utf8mb4_hungarian_ci treats one=One>öne.
The trailing spaces are controlled by the datatype of the column (VARCHAR or other). The latest version even has a setting relating to whether to consider trailing spaces.
The approach shown in the question should (mostly) work just fine in MySQL for the following reasons:
Collation (not to be confused with encoding) is the set of rules that define how to sort and compare characters, typically used to replicate at database level the user expectations from a cultural perspective (if I search for cafe I expect to find café as well).
Collation plays an important role in unique constraints because it establishes the definition of unique.
Binary collations are specifically meant to ignore cultural rules and work at byte level, thus utf8mb4_bin is the right choice here.
MySQL allows setting a combination of encoding and collation with column-level granularity.
If a column definition is missing collation, it'll use the table level one.
If a table definition is missing collation, it'll use the database level one.
If a database definition is missing collation, it'll use the server level one.
It's also worth noting that MySQL will convert between encodings transparently as long as:
Connection encoding is properly set
Conversion is physically possible (e.g. all source characters also belong to target encoding)
For this last reason, VARBINARY is possibly not the best choice for a column that's still text because it opens the door to getting café stored from a connection configured to use ISO-8859-1 and not being able to retrieve it correctly from a connection configured to use UTF-8.
Side note: the table definition shown may trigger the following error:
ERROR 1071 (42000): Specified key was too long; max key length is 767 bytes
Indexes may have a relatively small maximum size. From docs:
If innodb_large_prefix is enabled (the default), the index key prefix
limit is 3072 bytes for InnoDB tables that use DYNAMIC or COMPRESSED
row format. If innodb_large_prefix is disabled, the index key prefix
limit is 767 bytes for tables of any row format.
innodb_large_prefix is deprecated and will be removed in a future
release. innodb_large_prefix was introduced in MySQL 5.5 to disable
large index key prefixes for compatibility with earlier versions of
InnoDB that do not support large index key prefixes.
The index key prefix length limit is 767 bytes for InnoDB tables that
use the REDUNDANT or COMPACT row format. For example, you might hit
this limit with a column prefix index of more than 255 characters on a
TEXT or VARCHAR column, assuming a utf8mb3 character set and the
maximum of 3 bytes for each character.
Attempting to use an index key prefix length that exceeds the limit
returns an error. To avoid such errors in replication configurations,
avoid enabling innodb_large_prefix on the master if it cannot also be
enabled on slaves.
Since utf8mb4 allocates up to 4 bytes per character, the 767-byte limit is exceeded with only 192 characters (192 * 4 = 768).
We have one more problem:
mysql> CREATE TABLE `dummy` (
-> `key` varchar(191) COLLATE utf8mb4_bin NOT NULL,
-> UNIQUE KEY `key` (`key`)
-> )
-> ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin;
Query OK, 0 rows affected (0.01 sec)
mysql> INSERT INTO `dummy` (`key`) VALUES ('one');
Query OK, 1 row affected (0.00 sec)
mysql> INSERT INTO `dummy` (`key`) VALUES ('one ');
ERROR 1062 (23000): Duplicate entry 'one ' for key 'key'
Pardon?
mysql> INSERT INTO `dummy` (`key`) VALUES ('One');
Query OK, 1 row affected (0.00 sec)
mysql> INSERT INTO `dummy` (`key`) VALUES ('öne');
Query OK, 1 row affected (0.00 sec)
mysql> SELECT * FROM `dummy`;
+-----+
| key |
+-----+
| One |
| one |
| öne |
+-----+
3 rows in set (0.00 sec)
This last issue is an interesting subtlety of MySQL collations. From docs:
All MySQL collations are of type PADSPACE. This means that all CHAR,
VARCHAR, and TEXT values in MySQL are compared without regard to any
trailing spaces. “Comparison” in this context does not include the
LIKE pattern-matching operator, for which trailing spaces are
significant
[...]
For those cases where trailing pad characters are stripped or
comparisons ignore them, if a column has an index that requires unique
values, inserting into the column values that differ only in number of
trailing pad characters will result in a duplicate-key error.
I'd dare say then that VARBINARY type is the only way to overcome this...
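A minimal sketch of that VARBINARY variant, reusing the table and values from the question. The length 255 is an assumption, and for VARBINARY it counts bytes rather than characters, so multi-byte characters such as ö use up more of it. Binary strings are compared byte by byte, so the trailing space should be significant and all four inserts should be accepted.

```sql
-- VARBINARY stores raw bytes: no character set, no collation,
-- and no PAD SPACE comparison semantics.
CREATE TABLE `dummy` (
  `key` VARBINARY(255) NOT NULL,
  UNIQUE KEY `key` (`key`)
) ENGINE=InnoDB;

INSERT INTO `dummy` (`key`) VALUES ('one');
INSERT INTO `dummy` (`key`) VALUES ('one ');  -- differs by a trailing space
INSERT INTO `dummy` (`key`) VALUES ('One');
INSERT INTO `dummy` (`key`) VALUES ('öne');

SELECT * FROM `dummy`;
```

The trade-off mentioned earlier still applies: the server no longer converts between connection encodings for this column, so every client has to send and expect the same byte encoding.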

Setting a column as timestamp in MySql workbench?

This might be a really elementary question, but I've never created a table with TIMESTAMP() before, and I'm confused on what to put as the parameters. For example, here:
I just randomly put TIMESTAMP(20), but what does the 20 as a parameter signify here? What should be put in here?
I googled the question, but didn't really come up with anything so... Anyway I'm new to sql, so any help would be greatly appreciated, thank you!!
EDIT
As of MySQL 5.6.4, datatype TIMESTAMP(n) specifies n (0 up to 6) decimal digits of precision for fractional seconds.
Before MySQL 5.6, MySQL did not support fractional seconds stored as part of a TIMESTAMP datatype.
Reference: https://dev.mysql.com/doc/refman/5.6/en/fractional-seconds.html
We don't need to specify a length modifier on a TIMESTAMP. We can just specify TIMESTAMP by itself.
But be aware that the first TIMESTAMP column defined in the table is subject to automatic initialization and update. For example:
create table foo (id int, ts timestamp, val varchar(2));
show create table foo;
CREATE TABLE `foo` (
`id` INT(11) DEFAULT NULL,
`ts` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`val` VARCHAR(2) DEFAULT NULL
)
What goes in parens following a datatype depends on what the datatype is, but for some datatypes, it's a length modifier.
For some datatypes, the length modifier affects the maximum length of values that can be stored. For example, VARCHAR(20) allows up to 20 characters to be stored. And DECIMAL(10,6) allows for numeric values with four digits before the decimal point and six after, and effective range of -9999.999999 to 9999.999999.
For other types, the length modifier doesn't affect the range of values that can be stored. For example, INT(4) and INT(10) are both integers, and both can store the full range of values allowed for the integer datatype.
What that length modifier does in that case is just informational. It essentially specifies a recommended display width. A client can make use of that to determine how much space to reserve on a row for displaying values from the column. A client doesn't have to do that, but that information is available.
EDIT
A length modifier is no longer accepted for the TIMESTAMP datatype. (If you are running a really old version of MySQL and it's accepted, it will be ignored.)
That's the precision, my friend. If you put (2) as the parameter, for example, you get a value with a precision like 2015-12-29 00:00:00.00. By the way, the maximum value is 6.
This syntax seems to be from an old version of MySQL, prior to 4.1. It has been removed completely from 5.5 https://dev.mysql.com/doc/refman/5.0/en/upgrading-from-previous-series.html
So no point in specifying a width here, as it may be ignored. What version are you running?
MySQL 5.7 appears to support this syntax. The argument passed is the precision. TIMESTAMP(3) will allow millisecond precision. 6 is the highest amount of allowed precision.
reference: http://dev.mysql.com/doc/refman/5.7/en/datetime.html
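A short sketch of a column with fractional-seconds precision, assuming MySQL 5.6.4 or later. The table and column names are made up for illustration. When a default is given, its precision has to match the column's precision.

```sql
-- Hypothetical table: TIMESTAMP(3) keeps milliseconds.
CREATE TABLE ts_demo (
  id INT NOT NULL,
  created_at TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3)
);

INSERT INTO ts_demo (id) VALUES (1);

-- created_at is returned with three fractional digits.
SELECT id, created_at FROM ts_demo;
```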
In MySQL Workbench 8.0,
TIMESTAMP
alone doesn't work; you need to add the whole statement (if you don't want the timestamp to be updated later):
TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
Then you get, e.g.:
2020-01-08 19:10:05
But if you want the TIMESTAMP to be modified whenever the record is updated, use:
TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
