how can have my query be able to not differienciate between my lower and upper case column query but without using the lower method of sql?
select * from
User
where
lower(lusername)= 'abdel'
i want to still be able to use my column name with any type of case (Abc, aBc, ABC,aBC..) but without using the lower function in my query sql because the lower disable my index that i put on my column name.
is there a way to do that ? thank you in advance
The default collation used in MySQL is case-insensitive. Therefore you don't need to use LOWER() if you are doing searching.
In other words, the following two queries should return the same results:
SELECT ... FROM mytable WHERE LOWER(name) = 'Abc';
SELECT ... FROM mytable WHERE name = 'Abc'
The latter query is able to optimize the query with an index, but the former can't.
Any expression or function call makes the search not "sargable" which means it can't use an index, because the optimizer can't know if that expression or function outputs have the same sort order as the index on the column.
Based on your comment below, I see you are using a collation utf8_bin, which is case-sensitive. This does improve performance a little bit over using a collation, so if performance is the only thing that's important in this case, I understand why you want to keep using that collation.
Here's a demo of using EXPLAIN to verify that it uses an index when you compare strings directly, but the binary collation does a case-sensitive search:
mysql> create table User (id serial primary key, lusername varchar(75) collate utf8_bin, created datetime, key(lusername));
mysql> explain select * from User where lusername = 'Abc';
+----+-------------+-------+------------+------+---------------+-----------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+-----------+---------+-------+------+----------+-------+
| 1 | SIMPLE | User | NULL | ref | lusername | lusername | 228 | const | 1 | 100.00 | NULL |
+----+-------------+-------+------------+------+---------------+-----------+---------+-------+------+----------+-------+
But when we use a function like LOWER(), it can't use the index, and resorts to a table-scan:
mysql> explain select * from User where LOWER(lusername) = 'Abc';
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | User | NULL | ALL | NULL | NULL | NULL | NULL | 1 | 100.00 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
However, MySQL 5.7 has a feature to add a virtual column based on the expression, and index the virtual column:
mysql> alter table User add lower_lusername varchar(75) as (LOWER(lusername)), add key (lower_lusername);
mysql> explain select * from User where LOWER(lusername) = 'Abc';
+----+-------------+-------+------------+------+-----------------+-----------------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+-----------------+-----------------+---------+-------+------+----------+-------+
| 1 | SIMPLE | User | NULL | ref | lower_lusername | lower_lusername | 303 | const | 1 | 100.00 | NULL |
+----+-------------+-------+------------+------+-----------------+-----------------+---------+-------+------+----------+-------+
Alternatively, in MySQL 8.0, you can now get the index on the expression it without even adding a generated column:
mysql> alter table User add key ((LOWER(lusername)));
mysql> explain select * from User where LOWER(lusername) = 'Abc';
+----+-------------+-------+------------+------+------------------+------------------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+------------------+------------------+---------+-------+------+----------+-------+
| 1 | SIMPLE | User | NULL | ref | functional_index | functional_index | 228 | const | 1 | 100.00 | NULL |
+----+-------------+-------+------------+------+------------------+------------------+---------+-------+------+----------+-------+
Related
Assuming I have a table as:
create table any_table (any_column_1 int, any_column_2 varchar(255));
create index any_table_any_column_1_IDX USING BTREE ON any_table (any_column_1);
(Note: Index type should not matter here)
I was wondering if querying any_column with int or string have any impact on performance, i.e. does
select * from any_table where any_column_1 = 12345;
have any differences in terms of performance with this one?
select * from any_table where any_column_1 = '12345';
I have looked around the web and really have not faced this particular case.
It should be fine to do this either way for an indexed integer column. When you compare an integer column to a constant, the constant value is cast to an integer whether you format it as an integer or a string.
You can confirm this with EXPLAIN. In both cases, the EXPLAIN shows that it will use the index (type: ref indicates an index lookup), and the performance will be the same.
mysql> explain select * from any_table where any_column_1 = 12345;
+----+-------------+-----------+------------+------+----------------------------+----------------------------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------+------------+------+----------------------------+----------------------------+---------+-------+------+----------+-------+
| 1 | SIMPLE | any_table | NULL | ref | any_table_any_column_1_IDX | any_table_any_column_1_IDX | 5 | const | 1 | 100.00 | NULL |
+----+-------------+-----------+------------+------+----------------------------+----------------------------+---------+-------+------+----------+-------+
mysql> explain select * from any_table where any_column_1 = '12345';
+----+-------------+-----------+------------+------+----------------------------+----------------------------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------+------------+------+----------------------------+----------------------------+---------+-------+------+----------+-------+
| 1 | SIMPLE | any_table | NULL | ref | any_table_any_column_1_IDX | any_table_any_column_1_IDX | 5 | const | 1 | 100.00 | NULL |
+----+-------------+-----------+------------+------+----------------------------+----------------------------+---------+-------+------+----------+-------+
If you had indexed the string column in your example, any_column_2, it would make a difference because the collation of a string column must match the collation of the value you compare it to. A string literal will be cast to a compatible collation by default, so it uses the index:
create index any_table_any_column_2_IDX USING BTREE ON any_table (any_column_2);
mysql> explain select * from any_table where any_column_2 = '12345';
+----+-------------+-----------+------------+------+----------------------------+----------------------------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------+------------+------+----------------------------+----------------------------+---------+-------+------+----------+-------+
| 1 | SIMPLE | any_table | NULL | ref | any_table_any_column_2_IDX | any_table_any_column_2_IDX | 768 | const | 1 | 100.00 | NULL |
+----+-------------+-----------+------------+------+----------------------------+----------------------------+---------+-------+------+----------+-------+
But an integer literal has no collation, so you get warnings, and the index cannot be used. The EXPLAIN shows type: ALL so it will do a table-scan and that will have poor performance if you query a table with many rows.
mysql> explain select * from any_table where any_column_2 = 12345;
+----+-------------+-----------+------------+------+----------------------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------+------------+------+----------------------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | any_table | NULL | ALL | any_table_any_column_2_IDX | NULL | NULL | NULL | 1 | 100.00 | Using where |
+----+-------------+-----------+------------+------+----------------------------+------+---------+------+------+----------+-------------+
1 row in set, 3 warnings (0.00 sec)
mysql> show warnings;
+---------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message |
+---------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Warning | 1739 | Cannot use ref access on index 'any_table_any_column_2_IDX' due to type or collation conversion on field 'any_column_2' |
| Warning | 1739 | Cannot use range access on index 'any_table_any_column_2_IDX' due to type or collation conversion on field 'any_column_2' |
| Note | 1003 | /* select#1 */ select `test2`.`any_table`.`any_column_1` AS `any_column_1`,`test2`.`any_table`.`any_column_2` AS `any_column_2` from `test2`.`any_table` where (`test2`.`any_table`.`any_column_2` = 12345) |
+---------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
I now have a table like this:
> DESC userInfo;
+--------+---------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------+---------------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| name | char(32) | NO | MUL | NULL | |
| age | tinyint(3) unsigned | NO | | NULL | |
| gender | tinyint(1) | NO | | 1 | |
+--------+---------------------+------+-----+---------+----------------+
I made (name, age) a joint unique index:
> SHOW INDEX FROM userInfo;
+----------+------------+--------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+--------------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+----------+------------+--------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+--------------------+
| userInfo | 0 | PRIMARY | 1 | id | A | 0 | NULL | NULL | | BTREE | | |
| userInfo | 0 | joint_unique_index | 1 | name | A | 0 | NULL | NULL | | BTREE | | 联合唯一索引 |
| userInfo | 0 | joint_unique_index | 2 | age | A | 0 | NULL | NULL | | BTREE | | 联合唯一索引 |
+----------+------------+--------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+--------------------+
3 rows in set (0.00 sec)
Now, when I use the following query statement, its type is All:
> DESC SELECT * FROM userInfo WHERE age = 18;
+----+-------------+----------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | userInfo | NULL | ALL | NULL | NULL | NULL | NULL | 1 | 100.00 | Using where |
+----+-------------+----------+------------+------+---------------+------+---------+------+------+----------+-------------+
I can understand this behavior, because according to the leftmost prefix matching feature, age will not be used as an index column when querying.
But when I use the following statement to query, its type is Index:
> DESC SELECT name, age FROM userInfo WHERE age = 18;
+----+-------------+----------+------------+-------+---------------+--------------------+---------+------+------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------+------------+-------+---------------+--------------------+---------+------+------+----------+--------------------------+
| 1 | SIMPLE | userInfo | NULL | index | NULL | joint_unique_index | 132 | NULL | 1 | 100.00 | Using where; Using index |
+----+-------------+----------+------------+-------+---------------+--------------------+---------+------+------+----------+--------------------------+
1 row in set, 1 warning (0.00 sec)
I can't understand how this result is produced. According to Example 1, the age as the query condition does not satisfy the leftmost prefix matching feature, but from the results, its type is actually Index! Is this an optimization in MySQL?
When I try to make sure I use indexed columns as query conditions, their type is always ref, as shown below:
> DESC SELECT * FROM userInfo WHERE name = "Jack";
+----+-------------+----------+------------+------+--------------------+--------------------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------+------------+------+--------------------+--------------------+---------+-------+------+----------+-------+
| 1 | SIMPLE | userInfo | NULL | ref | joint_unique_index | joint_unique_index | 128 | const | 1 | 100.00 | NULL |
+----+-------------+----------+------------+------+--------------------+--------------------+---------+-------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)
> DESC SELECT name, age FROM userInfo WHERE name = "Jack";
+----+-------------+----------+------------+------+--------------------+--------------------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------+------------+------+--------------------+--------------------+---------+-------+------+----------+-------------+
| 1 | SIMPLE | userInfo | NULL | ref | joint_unique_index | joint_unique_index | 128 | const | 1 | 100.00 | Using index |
+----+-------------+----------+------------+------+--------------------+--------------------+---------+-------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
Please tell me why when I use age as a query, the first result is ALL, but the second result is INDEX. Is this the result of MySQL optimization?
In other words, when SELECT * is used, index column queries are not applied, but when SELECT joint_col1, joint_col2 FROM joint_col2 are used, index column queries (because type is INDEX) are used. Why does this difference occur?
Simplifying a bit, an index (name, age) is basically the same as if you had another table (name, age, id) with a copy of those values. The primary key is (for InnoDB) included for technical reasons - MySQL uses it to find the full row in the original table.
So you can basically think of it as if you have 2 tables: (id, name, age, gender) and (name, age, id), both with the same amount of rows. And both have the ability to jump to/skip specific rows if you provide the leftmost columns.
If you do
SELECT * FROM userInfo WHERE age = 18;
MySQL has to read, as you expected, every row of the table, as there is no way to find rows with age = 18 faster - just as you concluded, there is no index with age as the leftmost column.
If you do
SELECT name, age FROM userInfo WHERE age = 18;
the situation doesn't change a lot: MySQL will also have to read every row, and still cannot use the index on (name, age) to limit the number of rows it has to read.
But MySQL can use a trick: since you only need the columns name and age, it can read all rows from the index-"table" and still have all information it needs, as the index is a covering index (it covers all required columns).
Why would MySQL do that? Because it has to read less absolute data than reading the complete table: the index stores the information you want in less bytes (as it doesn't include gender). Reading less data to get all the information you need is better/faster than reading more data to get the same information. So MySQL will do just that.
But to emphasize it: your query still has to read all rows, it is still basically a full table scan ("ALL") - just on a "table" (the index) with less columns, to save some bytes. While you won't notice a difference with one tinyint column, if your table has a lot of or large columns, it's actually a relevant speedup.
The "leftmost" rule applies to the WHERE clause versus the INDEX.
INDEX(name, age) is useful for WHERE name = '...' or WHERE name = '...' AND ((anything else)) because name is leftmost in the index.
What you have is WHERE age = ... ((and nothing else)), so you need INDEX(age) (or INDEX(age, ...)).
In particular, SELECT name, age FROM userInfo WHERE age = 18;:
INDEX(age) -- good
INDEX(age, name) -- better because it is "covering".
The order of columns in the WHERE does not matter; the order in the INDEX does matter.
First, I create a simple database with one MyISAM table with an indexed field called feature.
CREATE DATABASE test;
USE test;
CREATE TABLE data(
id INT(11) NOT NULL AUTO_INCREMENT,
feature VARCHAR(64),
PRIMARY KEY (id)
) ENGINE=MyISAM;
INSERT INTO data VALUES (1, 'a'), (2, 'b');
CREATE INDEX data_feature ON data(feature);
Then, when I test a GROUP BY query with a count, it doesn't use the index when the COUNT() is made by id (see Extra column at the end of the EXPLAIN).
mysql> EXPLAIN SELECT feature, COUNT(1) FROM data GROUP BY feature;
+----+-------------+-------+------------+-------+---------------+--------------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+--------------+---------+------+------+----------+-------------+
| 1 | SIMPLE | data | NULL | index | data_feature | data_feature | 259 | NULL | 2 | 100.00 | Using index |
+----+-------------+-------+------------+-------+---------------+--------------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
mysql> EXPLAIN SELECT feature, COUNT(*) FROM data GROUP BY feature;
+----+-------------+-------+------------+-------+---------------+--------------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+--------------+---------+------+------+----------+-------------+
| 1 | SIMPLE | data | NULL | index | data_feature | data_feature | 259 | NULL | 2 | 100.00 | Using index |
+----+-------------+-------+------------+-------+---------------+--------------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
mysql> EXPLAIN SELECT feature, COUNT(id) FROM data GROUP BY feature;
+----+-------------+-------+------------+-------+---------------+--------------+---------+------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+--------------+---------+------+------+----------+-------+
| 1 | SIMPLE | data | NULL | index | data_feature | data_feature | 259 | NULL | 2 | 100.00 | NULL |
+----+-------------+-------+------------+-------+---------------+--------------+---------+------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)
I have tested it on MySQL Community 8.0.21 and MariaDB 10.3.25.
COUNT(id) obligates the Optimizer to check id for being NOT NULL. The standard pattern is to simply say COUNT(*).
InnoDB probably works differently. But this is because the PRIMARY KEY is implicitly tacked onto the end of any secondary index. That makes INDEX(feature) work like INDEX(feature, id), in which case it would be a "covering" index as indicated by "Using index".
(There is virtually no reason to stick with MyISAM in this decade.)
Consider a table with the following fields:
mysql> DESCRIBE my_table;
+-------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+--------------+------+-----+---------+-------+
| pk | int(11) | NO | PRI | NULL | |
| name | varchar(20) | NO | UNI | | |
| value | varchar(255) | NO | | | |
+-------+--------------+------+-----+---------+-------+
3 rows in set (0.01 sec)
Notice that the field name has a unique constraint.
Lets say I want to optimize the following query:
SELECT name, value
FROM my_table
WHERE name = 'my_name'
There already is an index for the name field (due to the unique constraint), but it would be even better to have a covering index for the field value as well.
With just one index for the unique constraint, nothing surprising happens when I run the EXPLAIN command:
mysql> EXPLAIN
-> SELECT name, value
-> FROM my_table
-> WHERE name = "my_name";
+----+-------------+----------+-------+---------------+------+---------+-------+------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+-------+---------------+------+---------+-------+------+-------+
| 1 | SIMPLE | my_table | const | name | name | 62 | const | 1 | |
+----+-------------+----------+-------+---------------+------+---------+-------+------+-------+
Now if I try to add a covering index,
ALTER TABLE my_table ADD INDEX idx_name_value (name, value);
it appears as a candidate for the query, but is not selected!
mysql> EXPLAIN
-> SELECT name, value
-> FROM my_table
-> WHERE name = "my_name";
+----+-------------+----------+-------+---------------------+------+---------+-------+------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+-------+---------------------+------+---------+-------+------+-------+
| 1 | SIMPLE | my_table | const | name,idx_name_value | name | 62 | const | 1 | |
+----+-------------+----------+-------+---------------------+------+---------+-------+------+-------+
Notice that if I remove the unique constraint,
ALTER TABLE my_table DROP INDEX name;
the covering index works as expected:
mysql> EXPLAIN
-> SELECT name, value
-> FROM my_table
-> WHERE name = "my_name";
+----+-------------+----------+------+----------------+----------------+---------+-------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+------+----------------+----------------+---------+-------+------+--------------------------+
| 1 | SIMPLE | my_table | ref | idx_name_value | idx_name_value | 62 | const | 1 | Using where; Using index |
+----+-------------+----------+------+----------------+----------------+---------+-------+------+--------------------------+
So how can I use a covering index and still have a unique constraint?
There already is an index for the name field (due to the unique constraint), but it would be even better to have a covering index for the field value as well.
No. With the unique constraint you get the index for name "for free". And because you are searching only with name, you don't need the covering (combined) index with value.
You only get benefits of an index on value when you use queries with value in the where clause.
A combined index (name, value) would even an overhead when you never search for value (and so never use the index). On insert/update operations the index have to be updated and this could cost performance.
Below is the EXPLAIN for the query in question..
mysql> EXPLAIN SELECT orders_0.id, orders_0.created_at, orders_0.total_amount, orders_0.delivery_date, orders_0.customer_id, orders_0.items_summary, orders_0.site_id, orders_0.depot_id, orders_0.region_id, addresses_0.country, orders_0.payment_status, orders_0.created_by, orders_0.status, orders_0.delivered_by
-> FROM production.addresses addresses_0, production.orders orders_0
-> WHERE orders_0.delivery_address_id = addresses_0.id AND ((orders_0.status Not In ('cancelled','checkout','expired')) AND (orders_0.delivery_date>{d '2011-10-31'}));
+----+-------------+-------------+--------+------------------------------------------+---------+---------+-----------------------------------------------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------+--------+------------------------------------------+---------+---------+-----------------------------------------------+--------+-------------+
| 1 | SIMPLE | orders_0 | ALL | status,delivery_date,delivery_address_id | NULL | NULL | NULL | 929330 | Using where |
| 1 | SIMPLE | addresses_0 | eq_ref | PRIMARY | PRIMARY | 4 | production.orders_0.delivery_address_id | 1 | |
+----+-------------+-------------+--------+------------------------------------------+---------+---------+-----------------------------------------------+--------+-------------+
2 rows in set (0.00 sec)
mysql>
I have indexes on the three columns it's trying to do where conditions on but sadly it's not trying to use an index merge. Is the only option to create a multi-column index or is there something that's putting MYSQL off using those indexes?
You have to create a multi-columns index.
For example, to do that in Phpmyadmin, you have to check this different fields of your table and click the 'index' button below the table structure.