MySQL maximum row size

I don't know why I'm seeing this odd behaviour on MySQL 5.6.24. Can you help me? Do you think it's a bug?
mysql -D database --default-character-set utf8 -e "ALTER TABLE abc_folder ADD COLUMN lev10 varchar(5000);"
ERROR 1118 (42000) at line 1: Row size too large. The maximum row size for the used table type, not counting BLOBs, is 65535. This includes storage overhead, check the manual. You have to change some columns to TEXT or BLOBs
Instead, this succeeds:
mysql -D database --default-character-set utf8 -e "ALTER TABLE abc_folder ADD COLUMN lev10 varchar(50000);"
In other words, a bigger varchar() column is accepted and works correctly.
Does anybody know what is going on ?

This error means that in your table abc_folder, a single row would be larger than 65535 bytes, not counting columns of type TEXT or BLOB.
65535 is the largest number that can be represented by an unsigned 16-bit integer. And as stated in the MySQL documentation:
Although InnoDB supports row sizes larger than 65,535 bytes internally, MySQL itself imposes a row-size limit of 65,535 for the combined size of all columns:
mysql> CREATE TABLE t (a VARCHAR(8000), b VARCHAR(10000),
-> c VARCHAR(10000), d VARCHAR(10000), e VARCHAR(10000),
-> f VARCHAR(10000), g VARCHAR(10000)) ENGINE=InnoDB;
ERROR 1118 (42000): Row size too large. The maximum row size for the
used table type, not counting BLOBs, is 65535. You have to change some
columns to TEXT or BLOBs
This means that internally, MySQL uses a 16-bit (2-byte) number to store the size of a row.
It also explains why the VARCHAR(50000) version is accepted: 50000 utf8 characters could never fit in a row anyway, so (in non-strict SQL mode) MySQL silently converts that column to a TEXT type instead, issuing a note; check SHOW WARNINGS after the ALTER.
To correct your error, I would suggest changing VARCHAR(5000) to TEXT, or splitting the data across multiple tables.
If you use TEXT but still want to impose a 5000-character limit on the column, truncate when inserting into the table. You can use SUBSTR for that (note that MySQL string positions are 1-based, so start at 1, not 0):
INSERT INTO abc_folder (lev10) VALUES (SUBSTR('Some Text', 1, 5000))
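As a rough sanity check of that 65535-byte budget, the arithmetic can be sketched in Python. This is an illustration of the accounting only, not of InnoDB's actual on-disk format; it assumes the legacy utf8 charset at up to 3 bytes per character:

```python
def varchar_row_cost(n_chars, bytes_per_char=3):
    """Worst-case bytes a VARCHAR(n) column adds to the 65535-byte row limit."""
    data = n_chars * bytes_per_char
    prefix = 1 if data <= 255 else 2   # length prefix: 1 byte up to 255 bytes
    return data + prefix

# A lone VARCHAR(5000) utf8 column needs about 15002 bytes, well under 65535,
# so the ALTER TABLE fails only because the existing columns of abc_folder
# already consume most of the row budget.
print(varchar_row_cost(5000))           # 15002
print(varchar_row_cost(50000) > 65535)  # True: too big even on its own
```

This also shows why VARCHAR(50000) cannot remain a VARCHAR at all and gets converted to a TEXT type.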

Related

How can I observe the storage difference between VARCHAR(255) and VARCHAR(255 + n)?

Background
The MySQL documentation states the following:
In contrast to CHAR, VARCHAR values are stored as a 1-byte or 2-byte length prefix plus data. The length prefix indicates the number of bytes in the value. A column uses one length byte if values require no more than 255 bytes, two length bytes if values may require more than 255 bytes.
To put this to the test myself, I created two tables:
CREATE TABLE `varchar_length_test_255` (
`characters` varchar(255) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
CREATE TABLE `varchar_length_test_256` (
`characters` varchar(256) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
I then inserted 10,000 rows into each table, each row holding a value of the maximum length for the characters column.
Since I am using a character set with a maximum of one byte per character (latin1), I expected to see a difference of 20,000 bytes in storage size between the two tables, derived from the following:
Each row in the varchar_length_test_256 table contains one more character than the rows in the varchar_length_test_255 table. With the latin1 character set that is one extra byte per row, which adds up to 10,000 bytes across the 10,000 rows in each table.
Based on the MySQL documentation, VARCHAR values exceeding 255 bytes require a second length-prefix byte. Since each row in the varchar_length_test_256 table holds a 256-character value in the characters column, which is 256 bytes under latin1, that adds up to another 10,000 bytes.
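The expected 20,000-byte difference can be written out as a quick sanity check (a sketch of the length-prefix arithmetic the documentation describes, not of InnoDB's actual on-disk layout):

```python
ROWS = 10_000

def row_bytes(value_len, bytes_per_char=1):
    """Data bytes plus the length prefix: 1 byte if <= 255 bytes, else 2.
    latin1 uses 1 byte per character."""
    data = value_len * bytes_per_char
    return data + (1 if data <= 255 else 2)

# One extra data byte plus one extra length byte, per row:
diff = (row_bytes(256) - row_bytes(255)) * ROWS
print(diff)  # 20000
```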
Problem
When issuing a query to retrieve the size of each table, it appears that the tables are the same size! I used the following query (based on this SO post) to determine the size of each table:
SELECT
    table_name AS `Table`,
    (data_length + index_length) AS `Size in Bytes`
FROM
    information_schema.TABLES
WHERE
    table_schema = "test";
which yielded this output:
+-------------------------+---------------+
| Table                   | Size in Bytes |
+-------------------------+---------------+
| varchar_length_test_255 |       4734976 |
| varchar_length_test_256 |       4734976 |
+-------------------------+---------------+
2 rows in set (0.00 sec)
What am I missing here?
Am I correctly understanding the MySQL documentation?
Is there something wrong with my test that is preventing the expected outcome?
Is the query I am using to calculate the size of the tables correct?
How could I correctly observe the information communicated in the MySQL documentation?
Check the data_free column too.
InnoDB stores data on so-called 'pages', which are 16KB in size by default. When a page is almost full and you insert a new record that can't fit on the page, MySQL opens a new page, leaving the leftover space empty.
It is my assumption that MySQL reports the number of pages times the page size as the data/index sizes.
This is the effective size used on the OS to store the table's data, not the actual amount of data stored on those pages.
Update: https://mariadb.com/kb/en/library/information-schema-tables-table/
On this page (even though it is MariaDB, the storage engine is the same) the description of data_length is the following:
For InnoDB/XtraDB, the index size, in pages, multiplied by the page
size. For Aria and MyISAM, length of the data file, in bytes. For
MEMORY, the approximate allocated memory.
Edit (some calculations)
16 KB = 16384 B

                Storage (B)   # of records   # of pages
                                per page
--------------------------------------------------------
varchar(255)        256            64          156.25
varchar(256)        258            63          158.73

As you can see, the raw data (with the length marker) can be stored on almost the same number of pages.
Because a page is not necessarily filled to 100% (although innodb_fill_factor defaults to 100) and there is some overhead in the row structure, this small difference won't necessarily be visible.
The database files are not like a CSV file; they have to handle multiple things such as NULL values, varying row sizes, etc., which takes up additional space.
More about the InnoDB Row Structure: https://dev.mysql.com/doc/refman/5.5/en/innodb-physical-record.html
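The page arithmetic above can be sketched as follows. This is a simplification that ignores InnoDB's real per-row and per-page overhead; it only shows why the two tables land on almost the same number of 16KB pages:

```python
import math

PAGE = 16 * 1024   # default InnoDB page size: 16384 bytes
ROWS = 10_000

def pages_needed(bytes_per_row):
    per_page = PAGE // bytes_per_row       # whole records that fit on a page
    return math.ceil(ROWS / per_page)

print(pages_needed(256))  # varchar(255) + 1 length byte  -> 157 pages
print(pages_needed(258))  # varchar(256) + 2 length bytes -> 159 pages
```

Two pages of difference out of roughly 160 is easily swallowed by page fill factor and row-structure overhead, which is consistent with both tables reporting the same size.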

How can I alter an indexed varchar(255) from utf8 to utf8mb4 and still stay under the 767 max key length?

I have a MySQL column that needs to support emoji, and that means converting a utf8 column into utf8mb4. But my varchar(255) won't fit, as long as the column is indexed (not unique).
How can I keep the index, and get the utf8mb4 collation?
I've tried just reducing the length to 191, but unfortunately some of my rows are longer and I get this error: #1406 - Data too long for column 'column_name' at row 33565 (which isn't terribly helpful, since I don't have an auto-increment column and have no idea how to find row 33565).
I think it is connected with the maximum data length of the row; there is such a limitation, at least for string data types as far as I know. To avoid this, try to separate the table's data, e.g. split the table into two tables using a one-to-one relation.
About the maximum key length: I tried to create a table with an indexed utf8mb4 field; it was successfully created with key length 191, but when I set it to 192 it threw an error: Specified key was too long; max key length is 767 bytes.
I ended up removing the index.
If performance is negatively impacted I may add a second indexed column that only contains the first n characters (up to 191, but likely just 10-20 or so) of the current column.
The 191 character limit is due to the maximum key length of 767 bytes. For a 4 byte character, this means a max of 191 characters (floor(767/4) = 191).
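A quick sketch of that arithmetic (767 bytes is the default InnoDB index key limit for the COMPACT/REDUNDANT row formats; the DYNAMIC/COMPRESSED formats with innodb_large_prefix enabled allow 3072 bytes instead):

```python
MAX_KEY_BYTES = 767   # default InnoDB index key limit
BYTES_PER_CHAR = 4    # utf8mb4 worst case: 4 bytes per character

max_chars = MAX_KEY_BYTES // BYTES_PER_CHAR
print(max_chars)             # 191
print(192 * BYTES_PER_CHAR)  # 768: one byte over the 767-byte limit
```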

How can I find the row error for a table without an auto-increment?

I'm trying to convert an indexed column (not unique) varchar(255) to use utf8mb4_general_ci collation. But I keep running into max key errors.
So I tried limiting my varchar length to lower numbers and received this error:
Data too long for column at 'table_name' at row 122
But my table does not have auto-increment ids, so I'm stuck figuring out where row 122 is.
My hunch is that there are just a few long records that I might be able to truncate to fit the 767-byte key length for utf8mb4. But I need to find the long strings first.
To find the longest strings ('foo' being your column name that is too long):
SELECT *, char_length(foo)
FROM table_name
ORDER BY char_length(foo) DESC
LIMIT 25
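One caveat: char_length() counts characters, while the 767-byte key limit is measured in bytes; in SQL, LENGTH() returns bytes. The distinction can be illustrated in Python (the sample string here is just an illustration of mixed-width characters):

```python
# CHAR_LENGTH() counts characters; LENGTH() counts bytes. The index key
# limit (767 bytes) is measured in bytes, so multi-byte characters matter.
s = "héllo 😀"                      # mixed 1-, 2- and 4-byte UTF-8 characters

n_chars = len(s)                    # analogous to MySQL CHAR_LENGTH(): 7
n_bytes = len(s.encode("utf-8"))    # analogous to MySQL LENGTH(): 11

print(n_chars, n_bytes)  # 7 11
```

So a 191-character utf8mb4 value can occupy anywhere from 191 to 764 bytes depending on its content.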

Workaround to allow a TEXT column in mysql MEMORY/HEAP table

I want to use a temporary MEMORY table to store some intermediate data, but I need/want it to support TEXT columns. I had found a workaround involving casting the TEXT to a VARCHAR or something, but like an idiot I didn't write down the URL anywhere I can find now.
Does anyone know how to, for example, copy a table x into a memory table y where x may have TEXT columns? If anyone knows how to cast columns in a "CREATE TABLE y SELECT * FROM x" sorta format, that would definitely be helpful.
Alternatively, it would help if I could create a table that uses the MEMORY engine by default, and "upgrades" to a different engine (the default maybe) if it can't use the MEMORY table (because of text columns that are too big or whatever).
You can specify a SELECT statement after CREATE TEMPORARY TABLE:
CREATE TEMPORARY TABLE NewTempTable
SELECT
a
, convert(b, char(100)) as b
FROM OtherTable
Re comment: it appears that CHAR is limited to 512 bytes, and you can't cast to VARCHAR. If you use TEXT in a temporary table, the table is stored on disk rather than in memory.
What you can try is defining the table explicitly:
CREATE TEMPORARY TABLE NewTempTable (
a int
, b varchar(1024)
)
INSERT INTO NewTempTable
SELECT a, b
FROM OtherTable
You can use VARCHAR(5000); no need to cast. If you have example data, you can use it as a measure. There is 64KB of row space:
The length can be specified as a value from 0 to 255 before MySQL 5.0.3, and 0 to 65,535 in 5.0.3 and later versions. The effective maximum length of a VARCHAR in MySQL 5.0.3 and later is subject to the maximum row size (65,535 bytes, which is shared among all columns) and the character set used.
Do you mean CAST(text_column AS CHAR)? Note that you shouldn't need it, MySQL will cast it automatically if the target column is VARCHAR(n).

Equivalent of varchar(max) in MySQL?

What is the equivalent of varchar(max) in MySQL?
The max length of a varchar is subject to the max row size in MySQL, which is 64KB (not counting BLOBs):
VARCHAR(65535)
However, note that the limit is lower if you use a multi-byte character set:
VARCHAR(21844) CHARACTER SET utf8
Here are some examples:
The maximum row size is 65535, but a varchar also includes a byte or two to encode the length of a given string. So you actually can't declare a varchar of the maximum row size, even if it's the only column in the table.
mysql> CREATE TABLE foo ( v VARCHAR(65534) );
ERROR 1118 (42000): Row size too large. The maximum row size for the used table type, not counting BLOBs, is 65535. This includes storage overhead, check the manual. You have to change some columns to TEXT or BLOBs
But if we try decreasing lengths, we find the greatest length that works:
mysql> CREATE TABLE foo ( v VARCHAR(65532) );
Query OK, 0 rows affected (0.01 sec)
Now if we try to use a multibyte charset at the table level, we find that MySQL counts each character as multiple bytes. UTF8 strings don't necessarily use multiple bytes per character, but MySQL can't assume you'll restrict all your future inserts to single-byte characters.
mysql> CREATE TABLE foo ( v VARCHAR(65532) ) CHARSET=utf8;
ERROR 1074 (42000): Column length too big for column 'v' (max = 21845); use BLOB or TEXT instead
In spite of what the last error told us, InnoDB still doesn't like a length of 21845.
mysql> CREATE TABLE foo ( v VARCHAR(21845) ) CHARSET=utf8;
ERROR 1118 (42000): Row size too large. The maximum row size for the used table type, not counting BLOBs, is 65535. This includes storage overhead, check the manual. You have to change some columns to TEXT or BLOBs
This makes perfect sense, if you calculate that 21845*3 = 65535, which wouldn't have worked anyway. Whereas 21844*3 = 65532, which does work.
mysql> CREATE TABLE foo ( v VARCHAR(21844) ) CHARSET=utf8;
Query OK, 0 rows affected (0.32 sec)
TL;DR: MySQL does not have an equivalent concept to varchar(max); it is a Microsoft SQL Server feature.
What is VARCHAR(max)?
varchar(max) is a feature of Microsoft SQL Server.
The amount of data that a column could store in Microsoft SQL Server versions prior to 2005 was limited to 8KB. In order to store more than 8KB you had to use TEXT, NTEXT, or IMAGE column types; these column types stored their data as a collection of 8K pages separate from the table data pages, and they supported storing up to 2GB per row.
The big caveat to these column types was that they usually required special functions and statements to access and modify the data (e.g. READTEXT, WRITETEXT, and UPDATETEXT)
In SQL Server 2005, varchar(max) was introduced to unify the data and queries used to retrieve and modify data in large columns. The data for varchar(max) columns is stored inline with the table data pages.
As the data in a MAX column fills an 8KB data page, an overflow page is allocated and the previous page points to it, forming a linked list. Unlike TEXT, NTEXT, and IMAGE, the varchar(max) column type supports all the same query semantics as other column types.
So varchar(MAX) really means varchar(AS_MUCH_AS_I_WANT_TO_STUFF_IN_HERE_JUST_KEEP_GROWING) and not varchar(MAX_SIZE_OF_A_COLUMN).
MySQL does not have an equivalent idiom.
In order to get the same amount of storage as a varchar(max) in MySQL, you would still need to resort to a TEXT or BLOB column type. This article discusses an effective method of storing large amounts of data in MySQL.
The max length of a varchar is:
65535
- divided by the max byte length of a character in the character set the column is set to (e.g. utf8 = 3 bytes, ucs2 = 2, latin1 = 1)
- minus 2 bytes to store the length
- minus the length of all the other columns
- minus 1 byte for every 8 columns that are nullable (whether a column is NULL is stored as one bit per nullable column, packed into a byte or bytes called the null mask)
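The rules above can be sketched as a function. This is an approximation; real InnoDB accounting has additional engine-specific overhead, but it reproduces the limits demonstrated earlier on this page:

```python
import math

def max_varchar_chars(bytes_per_char, other_cols_bytes=0, nullable_cols=1):
    """Approximate max VARCHAR length in characters for a MySQL row.

    65535 bytes, minus 2 length bytes, minus the other columns, minus
    1 byte per 8 nullable columns (rounded up)."""
    budget = 65535 - 2 - other_cols_bytes - math.ceil(nullable_cols / 8)
    return budget // bytes_per_char

print(max_varchar_chars(1))  # latin1: 65532, matching VARCHAR(65532) above
print(max_varchar_chars(3))  # utf8:   21844, matching VARCHAR(21844) above
```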
For Sql Server
alter table prg_ar_report_colors add Text_Color_Code VARCHAR(max);
For MySql
alter table prg_ar_report_colors add Text_Color_Code longtext;
For Oracle
alter table prg_ar_report_colors add Text_Color_Code CLOB;
MySQL silently converts a column from VARCHAR to TEXT when it exceeds the limit:
mysql> CREATE TABLE varchars1(ch3 varchar(6), ch1 varchar(3), ch varchar(4000000));
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> SHOW WARNINGS;
+-------+------+---------------------------------------------+
| Level | Code | Message                                     |
+-------+------+---------------------------------------------+
| Note  | 1246 | Converting column 'ch' from VARCHAR to TEXT |
+-------+------+---------------------------------------------+
1 row in set (0.00 sec)