What does size limit on MySQL index mean?

I have a table created like so:
CREATE TABLE `my_table` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`info` varchar(50) DEFAULT NULL,
`some_more_info` smallint(5) unsigned NOT NULL,
PRIMARY KEY (`id`),
KEY `my_index` (`some_more_info`,`info`(24))
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8
My question is about the second key called my_index. What does the "(24)" size limit mean? The actual size of the column is 50, but the index is only 24 characters.
Does this mean that MySQL indexes only the first 24 characters of the column info?

In short, yes: only the first 24 characters are used to build the B-tree index. Prefix lengths apply to string types such as VARCHAR and TEXT; numeric columns are always indexed in full.

Yes.
The entire description about the index length can be found here:
http://dev.mysql.com/doc/refman/5.0/en/create-index.html
Prefix lengths are given in characters for nonbinary string types and
in bytes for binary string types. That is, index entries consist of
the first length characters of each column value for CHAR, VARCHAR,
and TEXT columns, and the first length bytes of each column value for
BINARY, VARBINARY, and BLOB columns.
Also, your CREATE query has/had some misplaced commas.
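As a rough illustration of what the prefix means, the sketch below mimics in Python how an index entry is derived from a column value. The function name `prefix_key` is made up for this example and is not a MySQL API; it only models the rule quoted above (characters for nonbinary strings, bytes for binary strings).

```python
def prefix_key(value, length):
    """Return the part of a value that a prefix index of this length would keep.

    str values model CHAR/VARCHAR/TEXT (prefix counted in characters);
    bytes values model BINARY/VARBINARY/BLOB (prefix counted in bytes).
    """
    return value[:length]


a = "abcdefghijklmnopqrstuvwx-first"   # first 24 characters are a..x
b = "abcdefghijklmnopqrstuvwx-second"  # same first 24 characters as a

# Both values yield the same 24-character index entry, so the index alone
# cannot tell them apart; the rows must be read to compare the rest.
print(prefix_key(a, 24) == prefix_key(b, 24))

# For a multi-byte character, 24 characters and 24 bytes are not the same thing.
text_value = "é" * 30
binary_value = text_value.encode("utf-8")
print(len(prefix_key(text_value, 24)), len(prefix_key(binary_value, 24).decode("utf-8")))
```

In practice this means a prefix index can narrow a search down, but the server still has to check the full column value for rows whose prefixes match.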

Related

limit length of string, set max length of string column

I want to limit usernames to 16 characters long.
I thought this would work:
create table user(id int unsigned auto_increment not null, username tinytext(16) not null, primary key (id));
but it doesn't
You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '(16) not null, primary key (id))' at line 1
That is what varchar() is for:
create table user(
id int unsigned auto_increment not null,
username varchar(16) not null,
primary key (id)
);
Review syntax of data types here: https://dev.mysql.com/doc/refman/5.7/en/string-type-overview.html
TINYTEXT does not accept a length argument.
TEXT does accept a length argument, but it doesn't do what you think it does. It just changes the data type to one of the flavors of TEXT that is the smallest type that will allow at least the length you request.
As stated in the manual page:
An optional length M can be given for this type. If this is done, MySQL creates the column as the smallest TEXT type large enough to hold values M characters long.
So TEXT(16) will create the column as TINYTEXT, because that is the smallest of the TEXT family that will hold strings of length 16. As another example, if you specify TEXT(2000000), the column is promoted to MEDIUMTEXT.
mysql> create table t ( t1 text(16), t2 text(2000000) );
Query OK, 0 rows affected (0.04 sec)
mysql> show create table t\G
*************************** 1. row ***************************
Table: t
Create Table: CREATE TABLE `t` (
`t1` tinytext,
`t2` mediumtext
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4
Notice the columns have automatically been changed, and they no longer have length specifiers.
This means the TINYTEXT column will still allow up to 255 bytes, and the MEDIUMTEXT column will allow up to 16MB. The text length specified is not a limit, but a guideline for which type is needed.
If you really want to limit inputs to 16 characters, then use VARCHAR(16).
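To make the promotion rule concrete, here is a small Python sketch of it. The function name `text_type_for` is hypothetical, and the `max_bytes_per_char` parameter is an assumption of this sketch (the quoted manual text only speaks of characters); the capacities are the byte limits of the four TEXT types.

```python
# Maximum byte lengths of the TEXT family, smallest first.
TEXT_TYPES = [
    ("TINYTEXT", 2**8 - 1),
    ("TEXT", 2**16 - 1),
    ("MEDIUMTEXT", 2**24 - 1),
    ("LONGTEXT", 2**32 - 1),
]


def text_type_for(m_chars, max_bytes_per_char=1):
    """Pick the smallest TEXT type able to hold m_chars characters.

    max_bytes_per_char is an assumption of this sketch
    (1 for latin1, 3 for utf8, 4 for utf8mb4).
    """
    needed = m_chars * max_bytes_per_char
    for name, capacity in TEXT_TYPES:
        if needed <= capacity:
            return name
    raise ValueError("length too large for any TEXT type")


# The two columns from the example table above, with a 4-byte character set.
print(text_type_for(16, 4))
print(text_type_for(2000000, 4))
```

Under these assumptions the two calls pick TINYTEXT and MEDIUMTEXT, matching the SHOW CREATE TABLE output above.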

What is difference between char and varchar

CREATE TABLE IF NOT EXISTS `test` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`country` varchar(5) NOT NULL,
`state` char(5) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=1 ;
I tried the following query to insert data:
INSERT INTO `test`.`test` (`id` ,`country` ,`state`)
VALUES (NULL , 'south-india', 'Gujarat');
When I execute the above query, it shows the following warnings:
Warning: #1265 Data truncated for column 'country' at row 1
Warning: #1265 Data truncated for column 'state' at row 1
I found a reference saying that VARCHAR is variable-length and CHAR is fixed-length.
So what is meant by
VARCHAR is variable-length.
CHAR is fixed length.
VARCHAR(5) will use at most 5 characters of storage, while CHAR(5) will always use exactly 5.
For a field holding a person's name, for example, you'd want to use a VARCHAR, because while on average someone's name is usually short, you still want to cope with the few people with very long names, without having to have that space wasted for the majority of your database rows.
As you said, VARCHAR is variable-length and CHAR is fixed-length. The main difference is the number of bytes each one uses.
Example:
column: username
type: char(10)
If the username column holds 'test', it will use 10 bytes, and the value is padded with spaces:
'test______'
A varchar column, on the other hand, only uses the bytes you actually store. For 'test' it will use only 4 bytes (plus a one-byte length prefix), and your data will be
'test'
Thanks.
As you mentioned, VARCHAR is variable-length and CHAR is fixed-length.
With VARCHAR(5), if the data you store has length 1, the remaining 4 bytes are not reserved and stay available for other data. Example: "t"
With CHAR(5), on the other hand, if the data you store has length 1, the remaining 4 bytes are still reserved for the value and cannot be used by any other data. Example: "t____", where ____ is the unused space.
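The difference can be sketched in a few lines of Python. This is only a model, assuming a single-byte character set such as latin1 and non-strict mode (where over-long values are truncated with a warning, as in the question); the function names are made up for the illustration.

```python
def char_stored(value, n):
    """CHAR(n) model: truncate to n, then right-pad with spaces to exactly n characters."""
    return value[:n].ljust(n)


def varchar_stored_bytes(value, n):
    """VARCHAR(n) model for n <= 255: the stored characters plus a 1-byte length prefix."""
    return len(value[:n]) + 1


print(repr(char_stored("test", 10)))        # always 10 characters wide
print(varchar_stored_bytes("test", 10))     # 4 characters + 1 length byte
print(repr(char_stored("Gujarat", 5)))      # truncated like the 'state' column
```

Both types cut 'Gujarat' down to 5 characters; the only difference is what happens to values shorter than the declared length.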

SQL key length error makes no sense

The following looks perfectly reasonable to me:
CREATE TABLE `mydb`.`Temp` (
`id` INT NOT NULL AUTO_INCREMENT PRIMARY KEY ,
`x` VARCHAR ( 300 ) NOT NULL ,
`id_foo` INT NULL DEFAULT NULL,
FOREIGN KEY ( `id_foo`) REFERENCES `Foo` (`id`) ON DELETE CASCADE ,
INDEX (`id_foo`),
INDEX (`x`),
UNIQUE (`id_foo`, `x`)
) ENGINE = INNODB;
With MySQL this gives an error
#1071 - Specified key was too long; max key length is 767 bytes
This seems wrong because the whole row is 309 bytes: less than 767, not even half. What's going on?
According to the MySQL documentation: http://dev.mysql.com/doc/refman/5.0/en/create-index.html
MySQL has different limits on the amount of space you can use to define indexes on column(s):
for MyISAM it's 1,000 bytes;
for InnoDB it's 767.
Moreover, the character set of those columns matters: a VARCHAR in utf8 reserves up to 3 bytes per character.
So an index on a VARCHAR(300), as in your table, needs 900 bytes, which is greater than the 767-byte maximum key length.
EDIT: Apparently this is not a bug in MySQL; it follows from MySQL's utf8 using up to 3 bytes per character. Also, with the introduction of the 4-byte utf8mb4 character set (WL#1213), the maximum possible key length changed from 255 to 191 characters (191 * 4 + 2 = 766, where 2 bytes hold the length). The Unicode character sets utf8mb4, utf16 and utf32 are affected by this change in MySQL 5.5 and higher.
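The arithmetic behind these numbers can be checked with a short Python sketch. The names are invented for this example, and the sketch ignores the extra length bytes mentioned above; it only multiplies characters by the maximum bytes per character.

```python
INNODB_MAX_KEY_BYTES = 767  # limit quoted in the error message

# Maximum bytes one character can take in each character set.
MAX_BYTES_PER_CHAR = {"latin1": 1, "utf8": 3, "utf8mb4": 4}


def key_bytes(n_chars, charset):
    """Worst-case index key size in bytes for n_chars characters."""
    return n_chars * MAX_BYTES_PER_CHAR[charset]


def max_indexable_chars(charset, limit=INNODB_MAX_KEY_BYTES):
    """Largest character count whose worst-case size still fits in the limit."""
    return limit // MAX_BYTES_PER_CHAR[charset]


print(key_bytes(300, "utf8"))            # 900, over the 767-byte limit
print(key_bytes(300, "latin1"))          # 300, fits
print(max_indexable_chars("utf8"))       # 255
print(max_indexable_chars("utf8mb4"))    # 191
```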
Try determining how long that index needs to be in order to remain effective:
SELECT COUNT(DISTINCT(`x`)) as n_unique,
COUNT(DISTINCT(LEFT(`x`,200))) as n_200,
COUNT(DISTINCT(LEFT(`x`,150))) as n_150,
COUNT(DISTINCT(LEFT(`x`,100))) as n_100,
COUNT(DISTINCT(LEFT(`x`,50))) as n_50,
COUNT(DISTINCT(LEFT(`x`,25))) as n_25,
COUNT(DISTINCT(LEFT(`x`,10))) as n_10
FROM Temp;
Dividing each n_ result by the n_unique will give you the percent coverage. Once you have that you can likely get decent coverage with a smaller number of characters.
ALTER TABLE Temp ADD INDEX x_improved (`x`(20));
Here 20 stands for whichever prefix length from the query above gives acceptable coverage.
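If it helps to see the coverage calculation outside the database, the same idea can be sketched in Python on a list of values. The function name and the sample data are made up for the illustration; it mirrors the COUNT(DISTINCT LEFT(...)) / n_unique ratio described above.

```python
def prefix_coverage(values, lengths):
    """Return {length: distinct prefixes / distinct full values} for each length."""
    n_unique = len(set(values))
    return {n: len({v[:n] for v in values}) / n_unique for n in lengths}


# Hypothetical sample data standing in for column x.
sample = ["alpha-001", "alpha-002", "beta-001", "beta-002", "gamma-001"]

for n, coverage in prefix_coverage(sample, [9, 4]).items():
    print(n, coverage)
```

A coverage close to 1.0 means the prefix distinguishes almost as many rows as the full column, so a shorter index loses little selectivity.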
It's your INDEX (x) that's the problem. It works if the VARCHAR is shorter.
What character set are you using?
Based on that error, it appears you are using a multi-byte character set, probably utf8, which reserves 3 bytes per character. So a varchar(300) in utf8 results in a 900 byte key length, which exceeds the innodb limit of 767.
In order to create your table with those indexes, you either need to use a different character set or shorten the length of your column. If you just had the index on x, I would recommend simply indexing the first 255 characters of that column, but given your unique index that includes x, that solution is not viable, since it would reject values as duplicates if they match on the first 255 characters, even if they differ in the last 45 characters.
Here are a couple of examples that will work:
-- shorten x to 255 characters
CREATE TABLE `mydb`.`Temp` (
`id` INT NOT NULL AUTO_INCREMENT PRIMARY KEY ,
`x` VARCHAR ( 255 ) NOT NULL ,
`id_foo` INT NULL DEFAULT NULL,
FOREIGN KEY ( `id_foo`) REFERENCES `Foo` (`id`) ON DELETE CASCADE ,
INDEX (`id_foo`),
INDEX (`x`),
UNIQUE (`id_foo`, `x`)
) ENGINE = INNODB;
-- use single-byte character set
CREATE TABLE `mydb`.`Temp` (
`id` INT NOT NULL AUTO_INCREMENT PRIMARY KEY ,
`x` VARCHAR ( 300 ) NOT NULL ,
`id_foo` INT NULL DEFAULT NULL,
FOREIGN KEY ( `id_foo`) REFERENCES `Foo` (`id`) ON DELETE CASCADE ,
INDEX (`id_foo`),
INDEX (`x`),
UNIQUE (`id_foo`, `x`)
) ENGINE = INNODB DEFAULT CHARSET LATIN1;

MySQL index for long strings

I have a MySQL InnoDB table where I want to store long strings (the limit is 20k characters). Is there any way to create an index for this field?
You can put an MD5 of the field into another field and index that. Then, when you do a search, you match against both the full field, which is not indexed, and the MD5 field, which is indexed.
SELECT *
FROM your_table -- placeholder: the original snippet omitted the table name
WHERE large_field = "hello world hello world ..."
AND large_field_md5 = md5("hello world hello world ...")
large_field_md5 is indexed, so we go directly to the record that matches. Once in a blue moon it might need to test 2 records if there is a duplicate md5.
You will need to limit the length of the index, otherwise you are likely to get error 1071 ("Specified key was too long"). The MySQL manual entry on CREATE INDEX describes this:
Indexes can be created that use only the leading part of column values, using col_name(length) syntax to specify an index prefix length:
Prefixes can be specified for CHAR, VARCHAR, BINARY, and VARBINARY columns.
BLOB and TEXT columns also can be indexed, but a prefix length must be given.
Prefix lengths are given in characters for nonbinary string types and in bytes for binary string types. That is, index entries consist of the first length characters of each column value for CHAR, VARCHAR, and TEXT columns, and the first length bytes of each column value for BINARY, VARBINARY, and BLOB columns.
It also adds this:
Prefix support and lengths of prefixes (where supported) are storage engine dependent. For example, a prefix can be up to 1000 bytes long for MyISAM tables, and 767 bytes for InnoDB tables.
Here is an example of how you could do that. As @Gidon Wise mentioned in his answer, you can index the additional field. In this case it will be query_md5.
CREATE TABLE `searches` (
`id` int(10) UNSIGNED NOT NULL,
`query` varchar(10000) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`query_md5` varchar(32) COLLATE utf8mb4_unicode_ci DEFAULT NULL
) ENGINE=InnoDB;
ALTER TABLE `searches`
ADD PRIMARY KEY (`id`),
ADD KEY `searches_query_md5_index` (`query_md5`);
To guard against two different strings sharing the same md5 hash, also compare the full value by adding and `query` = '...' to the condition.
The query will look like this:
select * from `searches` where `query_md5` = "b6d31dc40a78c646af40b82af6166676" and `query` = 'long string ...'
b6d31dc40a78c646af40b82af6166676 is the md5 hash of the string long string .... I think this can improve query performance, and you can be sure that you will get the right results.
Use the SHA2 function with a specific length. Add this to your table (your_table is a placeholder name):
ALTER TABLE `your_table` ADD COLUMN `hash` varbinary(32) GENERATED ALWAYS AS (unhex(sha2(`your_text`,256)));
ALTER TABLE `your_table` ADD UNIQUE KEY `ix_hash` (`hash`);
Read about the SHA2 function
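The varbinary(32) size follows from the digest length: SHA-256 produces 256 bits, which is 32 raw bytes or 64 hex characters, and unhex turns the hex form back into raw bytes. A quick Python check of those sizes (the helper name is made up, and UTF-8 encoding of the text is an assumption):

```python
import hashlib


def sha256_raw(text):
    """Raw SHA-256 digest of the UTF-8 bytes of text."""
    return hashlib.sha256(text.encode("utf-8")).digest()


digest = sha256_raw("some long text ...")
print(len(digest))        # raw bytes, the size stored in varbinary(32)
print(len(digest.hex()))  # hex characters, the form sha2() returns before unhex
```

Storing the raw bytes instead of the hex string halves the size of every index entry.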

Need Help understanding this piece of MySQL Code

CREATE TABLE `users` (
`ID` int(10) unsigned zerofill NOT NULL auto_increment,
`username` varchar(20) NOT NULL,
PRIMARY KEY (`ID`),
KEY `Username` (`username`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
What is
a) unsigned, zerofill in the ID column?
b) What is meant by KEY Username (username)?
Thank you.
zerofill - left pad with 0
For example, for a column declared as INT(4) ZEROFILL, a value of 5 is retrieved as 0005.
unsigned means the number cannot be less than zero
If you specify ZEROFILL for a numeric column, MySQL automatically adds the UNSIGNED attribute to the column.
Unsigned type can be used to permit only nonnegative numbers in a column or when you need a larger upper numeric range for the column
details : http://dev.mysql.com/doc/refman/5.0/en/numeric-types.html
KEY Username (username)
creates an index named Username on the column username
details : http://dev.mysql.com/doc/refman/5.0/en/create-table.html
unsigned = a number without a positive / negative sign, so you couldn't have -1, as the "-" is a sign.
zerofill = pad it with zeros by default. Not necessary, as the column has already got the auto_increment / pk attributes.
key = index this column, i.e. make SELECTs that search on this column faster.
Ta
Here is a partial answer.
a) unsigned means that the value cannot be negative.
b) zerofill means that it will be left-padded with '0'
e.g. without zerofill you have 55, and with it you will have 00000055
Regards.
Zerofill is used to pad a number with zeros instead of spaces. In this case, the number 1337 would be padded with 6 zeros and shown as 0000001337 because of int(10). Specifying unsigned is not needed since zerofill automatically chooses unsigned, see http://dev.mysql.com/doc/refman/5.0/en/numeric-types.html
KEY foo (bar) creates an index on the bar column. The name of the index is foo.
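The padding described in these answers can be mimicked with a tiny Python sketch. The `zerofill` function is only an illustration of the display rule, not anything MySQL exposes; the width argument plays the role of the number in INT(4) or int(10).

```python
def zerofill(value, width):
    """Left-pad a non-negative integer with zeros up to the display width."""
    if value < 0:
        raise ValueError("ZEROFILL columns are UNSIGNED, so negatives are not allowed")
    return str(value).zfill(width)


print(zerofill(5, 4))        # 0005, the INT(4) ZEROFILL example
print(zerofill(1337, 10))    # 0000001337, the int(10) example
print(zerofill(123456, 4))   # 123456: wider values are shown in full, not cut off
```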