Simple query:
select *
from data.staff AS staff
left join data.contact AS workphones on staff.id = workphones.staff_with_work_phone_id
Mysql run time: 5.3 sec.
MariaDb run time: 0.016 sec.
Contact has ~50000 rows.
Staff has ~600 rows.
What is the reason?
Is it possible to achieve the same result on mysql?
Thank you!
Explain MySql (v5.7.14):
+----+-------------+------------+------------+------+--------------------------------+------+---------+------+-------+----------+---------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------------+------+--------------------------------+------+---------+------+-------+----------+---------------------------------------+
| 1 | SIMPLE | staff | NULL | ALL | NULL | NULL | NULL | NULL | 606 | 100.00 | NULL |
+----+-------------+------------+------------+------+--------------------------------+------+---------+------+-------+----------+---------------------------------------+
| 2 | SIMPLE | workphones | NULL | ALL | FK_2f7824065c2c4b0fbe5c00da271 | NULL | NULL | NULL | 49180 | 100.00 | Using where. |
| | | | | | | | | | | | Using join buffer (Block Nested Loop) |
+----+-------------+------------+------------+------+--------------------------------+------+---------+------+-------+----------+---------------------------------------+
Explain MariaDB (v10.0.28):
+----+-------------+------------+------+--------------------------------+--------------------------------+---------+--------------------+-------+----------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------+--------------------------------+--------------------------------+---------+--------------------+-------+----------+-------+
| 1 | SIMPLE | staff | ALL | | | | | 602 | 100.00 | |
+----+-------------+------------+------+--------------------------------+--------------------------------+---------+--------------------+-------+----------+-------+
| 2 | SIMPLE | workphones | ALL | FK_1249f6bc1d68495090691f3ce02 | FK_1249f6bc1d68495090691f3ce02 | 9 | user_data.staff.id | 25476 | 100.00 | |
+----+-------------+------------+------+--------------------------------+--------------------------------+---------+--------------------+-------+----------+-------+
The rest of the verification conditions are identical.
The test was conducted many times.
Your two query plans show you why MySQL is slower.
Both find the possible keys, which is a foreign key.
MariaDB will USE the FK: FK_1249f6bc1d68495090691f3ce02 is in both columns possible_keys AND keys in row 2.
MySQL does see the FK, but does NOT use it.
MySQL tells you, that it will use a
Using join buffer (Block Nested Loop)
in the EXTRA table.
MySQL does not use your Foreign Key.
Foreign Key Joins
Do you have an index on your foreign key in both database systems?
If only MariaDB has it, then you cannot blame MySQL, because it cannot use, what it does not have.
Related
There is one huge table which is having 25M records and when we try to delete the records by manually passing the value it is using the INDEX and query is executing faster.
Below are details.
MySQL [(none)]> explain DELETE FROM isca51410_octopus_prod_eai.WMSERVICE WHERE contextid in ('1121','1245','5432','12412','1212','7856','2342','1345','5312','2342','3432','5321');
+----+-------------+-----------+------------+-------+---------------+-------------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------+------------+-------+---------------+-------------+---------+-------+------+----------+-------------+
| 1 | DELETE | BIG_TABLE | NULL | range | IDX_BIG_CID | IDX_BIG_CID | 109 | const | 12 | 100.00 | Using where |
+----+-------------+-----------+------------+-------+---------------+-------------+---------+-------+------+----------+-------------+
But when we try to pass the values by using select query it is not using index and query is executing for more time.
Below is the explain plan.
MySQL [(none)]> explain DELETE FROM DATABASE1_1.BIG_TABLE WHERE contextid in (SELECT contextid FROM DATABASE_2.TABLE_2);
+----+--------------------+------------------+------------+------+---------------+------+---------+------+----------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------------+------------------+------------+------+---------------+------+---------+------+----------+----------+-------------+
| 1 | DELETE | BIG_TABLE | NULL | ALL | NULL | NULL | NULL | NULL | 25730673 | 100.00 | Using where |
| 2 | DEPENDENT SUBQUERY | TABLE_2 | NULL | ALL | NULL | NULL | NULL | NULL | 10 | 10.00 | Using where |
+----+--------------------+------------------+------------+------+---------------+------+---------+------+----------+----------+-------------+
Here DATABASE_2.TABLE_2 is a table where the values will change everytime and row count will be less than 100.
How to make use of index IDX_BIG_CID on table DATABASE1_1.BIG_TABLE for the below query
DELETE FROM DATABASE1_1.BIG_TABLE WHERE contextid in (SELECT contextid FROM DATABASE_2.TABLE_2);
Don't use IN ( SELECT ... ). Use a multi-table DELETE. (See the ref manual.)
MySQL version 5.7.17
When I use JOIN for small tables, MySQL don't allow me to use index on the main table.
Compare:
EXPLAIN SELECT `ko_base_client_orders`.*
FROM `ko_base_client_orders`
LEFT JOIN `ko_base_new_perspe` AS `ko_of`
ON (`ko_of`.`id` = `ko_base_client_orders`.`office`)
ORDER BY `order_time` DESC
LIMIT 10 OFFSET 0;
+----+-------------+-----------------------+------------+-------+---------------+---------+---------+------+-------+----------+-----------------------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------------------+------------+-------+---------------+---------+---------+------+-------+----------+-----------------------------------------------------------------+
| 1 | SIMPLE | ko_base_client_orders | NULL | ALL | NULL | NULL | NULL | NULL | 60737 | 100.00 | Using temporary; Using filesort |
| 1 | SIMPLE | ko_of | NULL | index | PRIMARY | PRIMARY | 4 | NULL | 2 | 100.00 | Using where; Using index; Using join buffer (Block Nested Loop) |
+----+-------------+-----------------------+------------+-------+---------------+---------+---------+------+-------+----------+-----------------------------------------------------------------+
2 rows in set, 1 warning (0.00 sec)
and the same query with index hint:
EXPLAIN SELECT `ko_base_client_orders`.*
FROM `ko_base_client_orders`
LEFT JOIN `ko_base_new_perspe` AS `ko_of`
FORCE INDEX FOR JOIN (`PRIMARY`)
ON (`ko_of`.`id` = `ko_base_client_orders`.`office`)
ORDER BY `order_time` DESC
LIMIT 10 OFFSET 0;
+----+-------------+-----------------------+------------+--------+---------------+------------+---------+---------------------------------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------------------+------------+--------+---------------+------------+---------+---------------------------------+------+----------+-------------+
| 1 | SIMPLE | ko_base_client_orders | NULL | index | NULL | order_time | 6 | NULL | 10 | 100.00 | NULL |
| 1 | SIMPLE | ko_of | NULL | eq_ref | PRIMARY | PRIMARY | 4 | e3.ko_base_client_orders.office | 1 | 100.00 | Using index |
+----+-------------+-----------------------+------------+--------+---------------+------------+---------+---------------------------------+------+----------+-------------+
In second case MySQL uses key for order_time, but in first example it ignores index and scans the entire table.
I can not predict which table would be small, and which would not (it differs from project to project), so I don't want to use index hints all the time. Is there another solution for this problem?
We have two tables - the first is relatively big (contact table) 250k rows and the second is small(user table, < 10 rows). On mysql 5.6 version I have next explain result:
EXPLAIN SELECT
o0_.id AS id_0,
o8_.first_name,
o8_.last_name
FROM
contact o0_
LEFT JOIN user o8_ ON o0_.user_owner_id = o8_.id
LIMIT
25 OFFSET 100
+----+-------------+-------+-------+---------------+----------------------+---------+------+--------+----------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+----------------------+---------+------+--------+----------------------------------------------------+
| 1 | SIMPLE | o0_ | index | NULL | IDX_403263ED9EB185F9 | 5 | NULL | 253030 | Using index |
| 1 | SIMPLE | o8_ | ALL | PRIMARY | NULL | NULL | NULL | 5 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-------+-------+---------------+----------------------+---------+------+--------+----------------------------------------------------+
2 rows in set (0,00 sec)
When i use force index for join:
EXPLAIN SELECT
o0_.id AS id_0,
o8_.first_name,
o8_.last_name
FROM
contact o0_
LEFT JOIN user o8_ force index for join(`PRIMARY`) ON o0_.user_owner_id = o8_.id
LIMIT
25 OFFSET 100
or adding indexes on fields which appears in select clause (first_name, last_name) on user table:
alter table user add index(first_name, last_name);
Explain result changes to this:
+----+-------------+-------+--------+---------------+----------------------+---------+-------------------------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+----------------------+---------+-------------------------+--------+-------------+
| 1 | SIMPLE | o0_ | index | NULL | IDX_403263ED9EB185F9 | 5 | NULL | 253030 | Using index |
| 1 | SIMPLE | o8_ | eq_ref | PRIMARY | PRIMARY | 4 | o0_.user_owner_id | 1 | NULL |
+----+-------------+-------+--------+---------------+----------------------+---------+-------------------------+--------+-------------+
2 rows in set (0,00 sec)
On mysql 5.5 version I have same explain result without additional indexes:
+----+-------------+-------+--------+---------------+----------------------+---------+-------------------------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+----------------------+---------+-------------------------+--------+-------------+
| 1 | SIMPLE | o0_ | index | NULL | IDX_403263ED9EB185F9 | 5 | NULL | 255706 | Using index |
| 1 | SIMPLE | o8_ | eq_ref | PRIMARY | PRIMARY | 4 | o0_.user_owner_id | 1 | |
+----+-------------+-------+--------+---------------+----------------------+---------+-------------------------+--------+-------------+
2 rows in set (0.00 sec)
Why i need force use PRIMARY index or add extra indexes on mysql 5.6 version?
Same behavior occurs with other selects, when join small tables.
If you have a table with so few rows, it may actually be faster to do a full table scan, than going to an index, locate the records and then go back to the table. If you have other fields in the user table apart from the 3 in the query, then you may consider adding a covering index, but franly, I do not think that any of this would have significant affect on the speed of the query.
When I add a second level of abstraction to a join, ie joining on a table that was join in the first place, the query time is be multiplied by 1000x
mysql> SELECT xxx.content.id, columns.stories.title, columns.stories.date
-> FROM xxx.content
-> LEFT JOIN columns.stories on xxx.content.contentable_id = columns.stories.id
-> WHERE columns.stories.title IS NOT NULL
-> AND xxx.content.contentable_type = 'PublisherStory';
yields results in 0.01 seconds
mysql> SELECT xxx.content.id, columns.photos.id as pid, columns.stories.title, columns.stories.date
-> FROM xxx.content
-> LEFT JOIN columns.stories on xxx.content.contentable_id = columns.stories.id
-> LEFT JOIN columns.photos on columns.stories.id = columns.photos.story_id
-> WHERE columns.stories.title IS NOT NULL
-> AND xxx.content.contentable_type = 'PublisherStory';
yields results in 14 seconds
this is being performed on tables with records in the 10ks to low 100ks of records
Is this normal or what could be causing such a slowdown?
first query plan:
+----+-------------+---------+--------+---------------+---------+---------+--------------------------------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+--------+---------------+---------+---------+--------------------------------------+------+-------------+
| 1 | SIMPLE | content | ALL | NULL | NULL | NULL | NULL | 7099 | Using where |
| 1 | SIMPLE | stories | eq_ref | PRIMARY | PRIMARY | 8 | xxx.content.contentable_id | 1 | Using where |
+----+-------------+---------+--------+---------------+---------+---------+--------------------------------------+------+-------------+
Second Query
+----+-------------+---------+--------+---------------+---------+---------+--------------------------------------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+--------+---------------+---------+---------+--------------------------------------+-------+-------------+
| 1 | SIMPLE | content | ALL | NULL | NULL | NULL | NULL | 7099 | Using where |
| 1 | SIMPLE | stories | eq_ref | PRIMARY | PRIMARY | 8 | xxx.content.contentable_id | 1 | Using where |
| 1 | SIMPLE | photos | ALL | NULL | NULL | NULL | NULL | 21239 | |
+----+-------------+---------+--------+---------------+---------+---------+--------------------------------------+-------+-------------+
if joining columns.photos slows down so much, it could be because :
photos.story_id is not a foreign key from stories and there is no index on that column.
Not seing your table structure, I could no tell exactly but I suggest you to
verify that photos.story_id is a foreign key and if your mysql version does not
support foreign key ( pretty old) put an index on that column.
I would check indexes on contentable_type and stories.title also as they are
part of the where close.
I have the following two (simplified for the sake of example) tables in my MySQL db:
DESCRIBE appname_item;
-----------------+---------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------------+---------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| name | varchar(200) | NO | | NULL | |
+-----------------+---------------+------+-----+---------+----------------+
DESCRIBE appname_favorite;
+---------------+----------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------+----------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| user_id | int(11) | NO | MUL | NULL | |
| item_id | int(11) | NO | MUL | NULL | |
+---------------+----------+------+-----+---------+----------------+
I'm trying to get a list of items ordered by the number of favorites. The query below works, however there are thousands of records in the Item table, and the query is taking up to a couple of minutes to complete.
SELECT `appname_item`.`id`, `appname_item`.`name`, COUNT(`appname_favorite`.`id`) AS `num_favorites`
FROM `appname_item`
LEFT OUTER JOIN `appname_favorite` ON (`appname_item`.`id` = `appname_favorite`.`item_id`)
GROUP BY `appname_item`.`id`, `appname_item`.`name`
ORDER BY `num_favorites` DESC;
Here are the results of EXPLAIN, which provides some insight as to why the query is so slow (type "ALL", "using temporary", and "using filesort" should all be avoided if possible.)
+----+-------------+--------------------+------+-----------------------------+-----------------------------+---------+-------------------------------+------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------------+------+-----------------------------+-----------------------------+---------+-------------------------------+------+---------------------------------+
| 1 | SIMPLE | appname_item | ALL | NULL | NULL | NULL | NULL | 574 | Using temporary; Using filesort |
| 1 | SIMPLE | appname_favorite | ref | appname_favorite_67b70d25 | appname_favorite_67b70d25 | 4 | appname.appname_item.id | 1 | |
+----+-------------+--------------------+------+-----------------------------+-----------------------------+---------+-------------------------------+------+---------------------------------+
I know that the easiest way to optimize the query is to add an Index, but I can't seem to figure out how to add an Index for a Count() query that involves a JOIN and an order_by. I should also mention that I am running this through the Django ORM, so would prefer to not change the sql query and just work on fixing and fine tuning the db to run the query in the most effective way.
I've been trying to figure this out for a while, so any help would be much appreciated!
UPDATE
Here are the indexes that are already in the db:
+--------------------+------------+-----------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+--------------------+------------+-----------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| appname_favorite | 0 | PRIMARY | 1 | id | A | 594 | NULL | NULL | | BTREE | |
| appname_favorite | 1 | appname_favorite_fbfc09f1 | 1 | user_id | A | 12 | NULL | NULL | | BTREE | |
| appname_favorite | 1 | appname_favorite_67b70d25 | 1 | item_id | A | 594 | NULL | NULL | | BTREE | |
+--------------------+------------+-----------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
Actually you can't avoid filesort because the count is determined at the calculation time and is unknown in the index. The only solution I can imagine is to create a composite index for table appname_item, which may help a little or not, depending on your particular data:
ALTER TABLE appname_item ADD UNIQUE INDEX `item_id_name` (`id` ASC, `name` ASC);
There is nothing wrong with your query - it looks good.
It could be the the optimizer has out-of-date info about the table. Try running this:
ANALYZE TABLE <tableaname>;
for all tables involved.
Firstly, for the count() function, you can check this answer to know more detail:
https://stackoverflow.com/a/2710630/1020600
For example, using MySQL, count(*) will be fast under a MyISAM table
but slow under an InnoDB. Under InnoDB you should use count(1) or
count(pk)
If your storage engines is MYISAM and if you want to count on row (i guess so), use count(*) is enough.
From your EXPLAIN, I found there's no Key for appname_item, if i try to add a condition
where `appname_item`.`id` = `appname_favorite`.`item_id`
then the "key" appears. so funny but it's work.
The final sql like this
explain SELECT `appname_item`.`id`, `appname_item`.`name`, COUNT(*) AS `num_favorites`
FROM `appname_item`
LEFT OUTER JOIN `appname_favorite` ON (`appname_item`.`id` = `appname_favorite`.`item_id`)
where `appname_item`.`id` = `appname_favorite`.`item_id`
GROUP BY `appname_item`.`id`, `appname_item`.`name`
ORDER BY `num_favorites` DESC;
+----+-------------+------------------+--------+---------------+---------+---------+-------------------------------+------+----------------------------------------------+ | id | select_type | table | type | possible_keys | key
| key_len | ref | rows | Extra
|
+----+-------------+------------------+--------+---------------+---------+---------+-------------------------------+------+----------------------------------------------+ | 1 | SIMPLE | appname_favorite | index | item_id |
item_id | 5 | NULL | 2312 | Using
index; Using temporary; Using filesort | | 1 | SIMPLE |
appname_item | eq_ref | PRIMARY | PRIMARY | 4 |
test.appname_favorite.item_id | 1 | Using where
|
+----+-------------+------------------+--------+---------------+---------+---------+-------------------------------+------+----------------------------------------------+
On my computer, table appname_item has 1686 rows and appname_favorite has 2312 rows, old sql takes from 15 to 23ms. new sql takes 3.7 to 5.3ms