Optimize DELETE FROM mdl_grade_items_history - mysql

Last week I installed some additional database monitoring and have since come to discover that a full 30% of our database load is spent on a single query on a single table (which currently has some 6 million rows):
delete FROM mdl_grade_items_history WHERE timemodified < ?
In a testing environment, I tried to make some schema changes:
Running EXPLAIN on this query indicates that every time this query is run, a full table scan is done.
EXPLAIN DELETE FROM mdl_grade_items_history WHERE timemodified < '1490528405';
+----+-------------+-------------------------+------------+------+---------------+------+---------+------+--------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------------------------+------------+------+---------------+------+---------+------+--------+----------+-------------+
| 1 | DELETE | mdl_grade_items_history | NULL | ALL | NULL | NULL | NULL | NULL | 140784 | 100.00 | Using where |
+----+-------------+-------------------------+------------+------+---------------+------+---------+------+--------+----------+-------------+
1 row in set (0.00 sec)
Checking EXPLAIN for a (very similar) SELECT query shows a similar situation.
EXPLAIN SELECT id FROM mdl_grade_items_history WHERE timemodified < '1490528405';
+----+-------------+-------------------------+------------+------+---------------+------+---------+------+--------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------------------------+------------+------+---------------+------+---------+------+--------+----------+-------------+
| 1 | SIMPLE | mdl_grade_items_history | NULL | ALL | NULL | NULL | NULL | NULL | 140784 | 33.33 | Using where |
+----+-------------+-------------------------+------------+------+---------------+------+---------+------+--------+----------+-------------+
1 row in set, 1 warning (0.01 sec)
Checking the table definition, there does not seem to be an index on timemodified
SHOW INDEX FROM mdl_grade_items_history;
+-------------------------+------------+-------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------------------------+------------+-------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| mdl_grade_items_history | 0 | PRIMARY | 1 | id | A | 140784 | NULL | NULL | | BTREE | | |
| mdl_grade_items_history | 1 | mdl_graditemhist_act_ix | 1 | action | A | 2 | NULL | NULL | | BTREE | | |
| mdl_grade_items_history | 1 | mdl_graditemhist_old_ix | 1 | oldid | A | 17170 | NULL | NULL | | BTREE | | |
| mdl_grade_items_history | 1 | mdl_graditemhist_cou_ix | 1 | courseid | A | 1065 | NULL | NULL | YES | BTREE | | |
| mdl_grade_items_history | 1 | mdl_graditemhist_cat_ix | 1 | categoryid | A | 2300 | NULL | NULL | YES | BTREE | | |
| mdl_grade_items_history | 1 | mdl_graditemhist_sca_ix | 1 | scaleid | A | 6 | NULL | NULL | YES | BTREE | | |
| mdl_grade_items_history | 1 | mdl_graditemhist_out_ix | 1 | outcomeid | A | 1 | NULL | NULL | YES | BTREE | | |
| mdl_grade_items_history | 1 | mdl_graditemhist_log_ix | 1 | loggeduser | A | 30 | NULL | NULL | YES | BTREE | | |
+-------------------------+------------+-------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
8 rows in set (0.00 sec)
So I tried to add one (both via CREATE INDEX and ALTER TABLE .. ADD INDEX)
CREATE INDEX `mdl_gradeitemhist_tim_ix` ON `mdl_grade_items_history` (`timemodified`);
ALTER TABLE `mdl_grade_items_history` ADD INDEX `mdl_gradeitemhist_tim_ix` (`timemodified`);
In both instances, the SELECT query was affected (note the change in type)
EXPLAIN SELECT id FROM mdl_grade_items_history WHERE timemodified < '1490528405';
+----+-------------+-------------------------+------------+-------+--------------------------+--------------------------+---------+------+-------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------------------------+------------+-------+--------------------------+--------------------------+---------+------+-------+----------+--------------------------+
| 1 | SIMPLE | mdl_grade_items_history | NULL | range | mdl_gradeitemhist_tim_ix | mdl_gradeitemhist_tim_ix | 9 | NULL | 70206 | 100.00 | Using where; Using index |
+----+-------------+-------------------------+------------+-------+--------------------------+--------------------------+---------+------+-------+----------+--------------------------+
1 row in set, 1 warning (0.00 sec)
But not the DELETE query.
EXPLAIN DELETE FROM mdl_grade_items_history WHERE timemodified < '1490528405';
+----+-------------+-------------------------+------------+------+--------------------------+------+---------+------+--------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------------------------+------------+------+--------------------------+------+---------+------+--------+----------+-------------+
| 1 | DELETE | mdl_grade_items_history | NULL | ALL | mdl_gradeitemhist_tim_ix | NULL | NULL | NULL | 140412 | 100.00 | Using where |
+----+-------------+-------------------------+------------+------+--------------------------+------+---------+------+--------+----------+-------------+
1 row in set (0.00 sec)
What have I done wrong? What else could I try?

Low cardinality indexes (action, scaleid, outcomeid) are almost never used. Get rid of them.
Having a large number of single-column indexes is a red flag. Please learn about the power and benefit of "composite" indexes. (Not relevant for the select/delete mentioned here, but probably relevant for other queries.)
Extra indexes on a table slightly slow down INSERTs and DELETEs since the indexes need to (eventually) be updated.
Extra indexes slow down UPDATEs if an indexed column is modified.
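CREATE INDEX and ALTER TABLE ... ADD INDEX do the same thing, so run one or the other, not both: with the same index name the second statement fails with a duplicate key name error, and with different names you would end up with a redundant index.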
The EXPLAINs are different because (1) SELECT and DELETE do different things and (2) EXPLAIN is not very sophisticated.
Deleting a large number of rows takes a lot of effort -- Keep in mind that the deleted rows are hung onto in case of a ROLLBACK. Only after COMMIT can the rows really be removed. (With autocommit=ON, there is an implicit COMMIT.)
Tips on large deletes:
Deleting in chunks
Using PARTITIONs for very efficient deletion of time series
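To make the "deleting in chunks" tip concrete, here is a minimal sketch, not a definitive implementation. It is written in Python against an in-memory SQLite table so that it can be run as-is; the function names, the chunk size, and the two-column demo table are assumptions for illustration only. The idea is to walk the primary key in fixed-size id ranges and commit after each range, so no single transaction has to hold a huge number of deleted rows.

```python
import sqlite3


def delete_in_chunks(conn, cutoff, chunk_size=1000):
    """Delete rows with timemodified < cutoff, one primary-key range at a time.

    Hypothetical helper: each DELETE touches at most chunk_size ids and is
    committed on its own, so every transaction stays small.
    """
    cur = conn.cursor()
    cur.execute("SELECT MIN(id), MAX(id) FROM mdl_grade_items_history")
    lo, hi = cur.fetchone()
    if lo is None:  # empty table, nothing to delete
        return 0
    deleted = 0
    start = lo
    while start <= hi:
        end = start + chunk_size
        # The id range is resolved through the primary key, so each chunk
        # reads only a bounded slice of the table.
        cur.execute(
            "DELETE FROM mdl_grade_items_history "
            "WHERE id >= ? AND id < ? AND timemodified < ?",
            (start, end, cutoff),
        )
        deleted += cur.rowcount
        conn.commit()  # release this chunk before starting the next one
        start = end
    return deleted


def make_demo_table(n_rows=10):
    """Build a tiny stand-in table (assumed schema: id plus timemodified only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mdl_grade_items_history "
        "(id INTEGER PRIMARY KEY, timemodified INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO mdl_grade_items_history (id, timemodified) VALUES (?, ?)",
        [(i, i * 100) for i in range(1, n_rows + 1)],
    )
    conn.commit()
    return conn
```

The DELETE statement itself is plain SQL and should carry over to MySQL, but a MySQL driver will typically use a different placeholder style than `?`, and a sensible chunk size depends on your workload.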

Related

Query optimizer not using an index

I have two tables CUSTOMER_ORDER_PUBLIC and LINEITEM_PUBLIC which have the following indices:
+-----------------------+------------+---------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-----------------------+------------+---------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| CUSTOMER_ORDER_PUBLIC | 1 | O_ORDERKEY | 1 | O_ORDERKEY | A | 2633457 | NULL | NULL | YES | BTREE | | |
| CUSTOMER_ORDER_PUBLIC | 1 | O_ORDERDATE | 1 | O_ORDERDATE | A | 2350 | NULL | NULL | YES | BTREE | | |
| CUSTOMER_ORDER_PUBLIC | 1 | PUB_C_CUSTKEY | 1 | PUB_C_CUSTKEY | A | 273000 | NULL | NULL | | BTREE | | |
+-----------------------+------------+---------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
and:
+-----------------+------------+----------------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-----------------+------------+----------------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| LINEITEM_PUBLIC | 0 | PRIMARY | 1 | PUB_L_ORDERKEY | A | 16488602 | NULL | NULL | | BTREE | | |
| LINEITEM_PUBLIC | 0 | PRIMARY | 2 | PUB_L_LINENUMBER | A | 44146904 | NULL | NULL | | BTREE | | |
| LINEITEM_PUBLIC | 1 | LINEITEM_PRIVATE_FK2 | 1 | PUB_L_PARTKEY | A | 2083757 | NULL | NULL | | BTREE | | |
| LINEITEM_PUBLIC | 1 | LINEITEM_PRIVATE_FK3 | 1 | PUB_L_SUPPKEY | A | 85599 | NULL | NULL | | BTREE | | |
+-----------------+------------+----------------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
Each time I run an Explain of a specific query I get the following:
mysql> EXPLAIN SELECT *
FROM CUSTOMER_ORDER_PUBLIC
LEFT OUTER JOIN LINEITEM_PUBLIC ON O_ORDERKEY= PUB_L_ORDERKEY;
+----+-------------+-----------------------+------------+------+---------------+---------+---------+---------------------------------------+---------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------------------+------------+------+---------------+---------+---------+---------------------------------------+---------+----------+-------+
| 1 | SIMPLE | CUSTOMER_ORDER_PUBLIC | NULL | ALL | NULL | NULL | NULL | NULL | 2900769 | 100.00 | NULL |
| 1 | SIMPLE | LINEITEM_PUBLIC | NULL | ref | PRIMARY | PRIMARY | 4 | TPCH.CUSTOMER_ORDER_PUBLIC.O_ORDERKEY | 2 | 100.00 | NULL |
+----+-------------+-----------------------+------------+------+---------------+---------+---------+---------------------------------------+---------+----------+-------+
For some reason the query optimizer is not using the index (O_ORDERKEY) even if I use a FORCE INDEX. I know a lot of people posted similar questions but I tried everything and nothing seems to help!
Any other suggestions would be greatly appreciated!
Edit:
The query used is the following:
SELECT * FROM CUSTOMER_ORDER_PUBLIC
LEFT OUTER JOIN LINEITEM_PUBLIC ON O_ORDERKEY= PUB_L_ORDERKEY;
For this query:
SELECT *
FROM CUSTOMER_ORDER_PUBLIC cop LEFT OUTER JOIN
LINEITEM_PUBLIC lp
ON cop.O_ORDERKEY = lp.PUB_L_ORDERKEY;
You want an index on LINEITEM_PUBLIC(PUB_L_ORDERKEY). Of course, you already have this index, because PUB_L_ORDERKEY is the first column of the primary key.
There is no reason to use an index on CUSTOMER_ORDER_PUBLIC, because all rows in the table are going to the result set.
The FORCE INDEX hint tells the optimizer that a full scan of the table is very expensive.
The most likely explanation for the observed behavior is that the optimizer thinks it needs to access every row in the table, and the index suggested in the hint is not a covering index for the query.
Based on the EXPLAIN output, we only see evidence of a single predicate on the JOIN operation. And it looks like the optimizer is choosing CUSTOMER_ORDER_PUBLIC as the driving table for the join, and using an index on the LINEITEM_PUBLIC table.
I'm not sure any of that answers the question you asked. (I'm not sure that there was a question asked.) Absent an actual SQL statement, we are just making guesses.
I have a question: Aside from the FORCE INDEX hint, why would we expect the optimizer to use a particular index? And why would that be a reasonable expectation?

mysql need to run optimize every day after batch jobs

I have a table with about 4M rows. Every night, about 15 batch jobs run on the data, with a few hundred thousand inserts and updates. The problem is, when I run a simple count query such as
select count(*) from items;
I have to wait for about 15 minutes for it to return. After researching on SO, I see that
optimize table items;
does seem to fix the problem: after running it, the above query returns instantly. The catch is that it takes 17 hours to run. Any suggestions on what to look for to figure out why this is happening and how to fix it?
Thanks for any help,
Kevin
UPDATE:
Here's what happens when I optimize:
mysql> optimize table items;
+------------------------+----------+----------+-------------------------------------------------------------------+
| Table | Op | Msg_type | Msg_text |
+------------------------+----------+----------+-------------------------------------------------------------------+
| g_production.items | optimize | note | Table does not support optimize, doing recreate + analyze instead |
| g_production.items | optimize | status | OK |
+------------------------+----------+----------+-------------------------------------------------------------------+
2 rows in set (9 hours 20 min 48.36 sec)
Also, strangely, the select is not using the primary index, ID:
explain select count(id) from items;
+----+-------------+-------+-------+---------------+--------------------------+---------+------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+--------------------------+---------+------+----------+-------------+
| 1 | SIMPLE | items | index | NULL | index_items_on_real_sale | 2 | NULL | 45152757 | Using index |
+----+-------------+-------+-------+---------------+--------------------------+---------+------+----------+-------------+
1 row in set (0.10 sec)
And finally, here are all the indexes on the table:
+-------+------------+---------------------------------------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+---------------------------------------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| items | 0 | PRIMARY | 1 | id | A | 47144790 | NULL | NULL | | BTREE | | |
| items | 1 | index_items_on_affiliate_id | 1 | affiliate_id | A | 47144790 | NULL | NULL | YES | BTREE | | |
| items | 1 | index_items_on_brand_id | 1 | brand_id | A | 1024886 | NULL | NULL | YES | BTREE | | |
| items | 1 | index_items_on_real_sale | 1 | real_sale | A | 18 | NULL | NULL | YES | BTREE | | |
| items | 1 | index_items_on_retailer_id_and_affiliate_id | 1 | retailer_id | A | 18 | NULL | NULL | YES | BTREE | | |
| items | 1 | index_items_on_retailer_id_and_affiliate_id | 2 | affiliate_id | A | 47144790 | NULL | NULL | YES | BTREE | | |
| items | 1 | index_items_on_retailer_id | 1 | retailer_id | A | 40021 | NULL | NULL | YES | BTREE | | |
| items | 1 | index_items_on_shopzilla_id | 1 | shopzilla_id | A | 457716 | NULL | NULL | YES | BTREE | | |
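| items | 1 | index_items_on_updated_at | 1 | updated_at | A | 6734970 | NULL | NULL | | BTREE | | |
+-------+------------+---------------------------------------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+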
Note the cardinality of the index that EXPLAIN reports: I have 4M rows, but EXPLAIN says it's using index_items_on_real_sale, which SHOW INDEXES reveals has a cardinality of 18. Could this be the problem?
It could be quite a few things, but I'm wondering if it's indexed properly. Also, try to run the query with explain, like so:
EXPLAIN SELECT a, b, c FROM items WHERE ....
Look at the output and see how many rows it's reading to process the query, the type of indexes used, etc.
Definitely need more information in order to help out, I'm just guessing based on the limited information you provided.

What's the difference between these two indexed MySQL tables?

I'm working on modifying an old MySQL database which turned out to be designed improperly for the sort of data it was storing. I'm not very familiar with SQL at all, so I used SHOW CREATE TABLE to get the CREATE statement used for the old table ('interaction_old') and copied it almost exactly, with the only changes being a few of the column names and data types, to make a new table ('interaction_new'). Now some queries which were using indexes in the old table no longer use indexes in the new table, and I can't figure out why.
Here are the indexes from both tables:
mysql> SHOW KEYS FROM interaction_old;
+-----------------+------------+---------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-----------------+------------+---------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+
| interaction_old | 0 | PRIMARY | 1 | interactionid | A | 138996006 | NULL | NULL | | BTREE | |
| interaction_old | 1 | Complex_pdbid | 1 | Complex_pdbid | A | 1338 | NULL | NULL | | BTREE | |
| interaction_old | 1 | Protein_id | 1 | Protein_id | A | 13737 | NULL | NULL | | BTREE | |
| interaction_old | 1 | RNA_id | 1 | RNA_id | A | 2806 | NULL | NULL | | BTREE | |
+-----------------+------------+---------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+
mysql> SHOW KEYS FROM interaction_new;
+-----------------+------------+------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-----------------+------------+------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+
| interaction_new | 0 | PRIMARY | 1 | interactionid | A | 152311144 | NULL | NULL | | BTREE | |
| interaction_new | 1 | pdbid | 1 | pdbid | A | 2924 | NULL | NULL | | BTREE | |
| interaction_new | 1 | pchainname | 1 | pchainname | A | 472 | NULL | NULL | | BTREE | |
| interaction_new | 1 | rchainname | 1 | rchainname | A | 487 | NULL | NULL | | BTREE | |
+-----------------+------------+------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+
And an example query that is behaving differently between the two:
mysql> EXPLAIN SELECT DISTINCT Complex_pdbid FROM interaction_old;
+----+-------------+-----------------+-------+---------------+---------------+---------+------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------------+-------+---------------+---------------+---------+------+------+--------------------------+
| 1 | SIMPLE | interaction_old | range | NULL | Complex_pdbid | 6 | NULL | 1339 | Using index for group-by |
+----+-------------+-----------------+-------+---------------+---------------+---------+------+------+--------------------------+
mysql> EXPLAIN SELECT DISTINCT pdbid FROM interaction_new;
+----+-------------+-----------------+------+---------------+------+---------+------+-----------+-----------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------------+------+---------------+------+---------+------+-----------+-----------------+
| 1 | SIMPLE | interaction_new | ALL | NULL | NULL | NULL | NULL | 152311144 | Using temporary |
+----+-------------+-----------------+------+---------------+------+---------+------+-----------+-----------------+
As you might expect, the query on interaction_old finishes in a fraction of a second, whereas I've let the query on interaction_new run for ~20 minutes before killing it. interaction_old.Complex_pdbid and interaction_new.pdbid are the same data type (and are storing almost exactly the same data). USE INDEX and/or FORCE INDEX doesn't seem to have any effect. What's causing the different behavior?
Edit: According to the documentation, the first table uses a loose index scan to increase speed -- nothing from that page makes it clear to me why this doesn't work on the second table, though.

Avoid "Using temporary" and "Using filesort" with ORDER BY

Sorry, but after browsing nearly every post and question about it, I still can't manage to get rid of "Using temporary" and "Using filesort" in a simple query. I know this is a problem of keys, but I can't find the right combination...
I also don't know if the join order chosen by the optimizer is OK; I tested other orders using STRAIGHT_JOIN, but nothing was better... The query is pretty slow with ORDER BY, but really fast without it (and then, of course, without "Using temporary" and "Using filesort")! (There are something like 100,000 rows in the points table.)
The query :
SELECT points.id,
points.id_owner,
points.point_title,
points.point_desc,
users.user_id,
users.username
FROM points
JOIN users ON points.id_owner = users.user_id
JOIN follows ON follows.id_followed = points.id_owner
WHERE points.deleted = 0
AND follows.id_follower = 22
ORDER BY points.id DESC
LIMIT 10
the explain :
+----+-------------+---------+--------+---------------+------------+---------+---------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+--------+---------------+------------+---------+---------------------+------+----------------------------------------------+
| 1 | SIMPLE | follows | ref | FOLLOW_DUO | FOLLOW_DUO | 4 | const | 2 | Using index; Using temporary; Using filesort |
| 1 | SIMPLE | users | eq_ref | PRIMARY | PRIMARY | 4 | follows.id_followed | 1 | |
| 1 | SIMPLE | points | ref | GETPOINT1 | GETPOINT1 | 5 | users.user_id,const | 460 | Using where |
+----+-------------+---------+--------+---------------+------------+---------+---------------------+------+----------------------------------------------+
And here is the SHOW INDEX from the three tables :
SHOW INDEX FROM points
+--------+------------+--------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+--------+------------+--------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+
| points | 0 | PRIMARY | 1 | id | A | 91987 | NULL | NULL | | BTREE | |
| points | 0 | GETPOINT1 | 1 | id_owner | A | NULL | NULL | NULL | | BTREE | |
| points | 0 | GETPOINT1 | 2 | deleted | A | NULL | NULL | NULL | | BTREE | |
| points | 0 | GETPOINT1 | 3 | id | A | 91987 | NULL | NULL | | BTREE | |
+--------+------------+--------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+
SHOW INDEX FROM users
+-------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| users | 0 | PRIMARY | 1 | user_id | A | 4 | NULL | NULL | | BTREE | |
+-------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
SHOW INDEX FROM follows
+---------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+---------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| follows | 0 | PRIMARY | 1 | id | A | 5 | NULL | NULL | | BTREE | |
| follows | 0 | FOLLOW_DUO | 1 | id_follower | A | NULL | NULL | NULL | | BTREE | |
| follows | 0 | FOLLOW_DUO | 2 | id_followed | A | 5 | NULL | NULL | | BTREE | |
| follows | 1 | id_follower | 1 | id_follower | A | NULL | NULL | NULL | | BTREE | |
| follows | 1 | id_followed | 1 | id_followed | A | NULL | NULL | NULL | | BTREE | |
+---------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
From now I don't know what to test to try to avoid the "Using temporary" and "Using filesort"... So if you have an idea for me... Thank you in advance for your help !
It looks like too many rows are being examined from the points table. I used the following trick to avoid a temporary table in my own project. Please try these steps and run EXPLAIN again to see whether anything improves:
Drop the GETPOINT1 index from the points table, keeping only the primary key.
Add an index on the columns (deleted, id_owner). Please keep the columns in that order.
If you still don't see any improvement, remove that index and try indexes on (id, deleted, id_owner) and on (deleted, id_owner, id) instead.
In addition, you may move follows.id_follower = 22 out of the WHERE clause and into the join condition, like JOIN follows ON follows.id_followed = points.id_owner AND follows.id_follower = 22
Please also make sure the follows table has an index on (id_follower, id_followed), in that order; judging by your SHOW INDEX output, FOLLOW_DUO already appears to be exactly that.
I can't guarantee it, but the above should give you some improvement.

MySQL Inconsistencies in index usage on the same query

I have a table of over 9 million rows. I have a SELECT query that I'm using an index for. Here is the query:
SELECT `username`,`id`
FROM `04c1Tg0M`
WHERE `id` > 9259466
AND `tried` = 0
LIMIT 1;
That query executes very fast (0.00 sec). Here is the explain for that query:
+----+-------------+----------+-------+-----------------+---------+---------+------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+-------+-----------------+---------+---------+------+-------+-------------+
| 1 | SIMPLE | 04c1Tg0M | range | PRIMARY,triedex | PRIMARY | 4 | NULL | 10822 | Using where |
+----+-------------+----------+-------+-----------------+---------+---------+------+-------+-------------+
Now here is the same query, except that I'm going to lower the id threshold (to 5986551 in the query below):
SELECT `username`,`id`
FROM `04c1Tg0M`
WHERE `id` > 5986551
AND `tried` = 0
LIMIT 1;
That query took 4.78 seconds to complete. This is the problem. Here is the explain for that query:
+----+-------------+----------+------+-----------------+---------+---------+-------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+------+-----------------+---------+---------+-------+---------+-------------+
| 1 | SIMPLE | 04c1Tg0M | ref | PRIMARY,triedex | triedex | 2 | const | 9275107 | Using where |
+----+-------------+----------+------+-----------------+---------+---------+-------+---------+-------------+
What is happening here and how can I fix it? Here are my indexes:
+----------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+----------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| 04c1Tg0M | 0 | PRIMARY | 1 | id | A | 9275093 | NULL | NULL | | BTREE | |
| 04c1Tg0M | 1 | pdex | 1 | username | A | 9275093 | NULL | NULL | | BTREE | |
| 04c1Tg0M | 1 | pdex | 2 | id | A | 9275093 | NULL | NULL | | BTREE | |
| 04c1Tg0M | 1 | pdex | 3 | tried | A | 9275093 | NULL | NULL | YES | BTREE | |
| 04c1Tg0M | 1 | triedex | 1 | tried | A | 0 | NULL | NULL | YES | BTREE | |
| 04c1Tg0M | 1 | triedex | 2 | id | A | 9275093 | NULL | NULL | | BTREE | |
+----------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
And here is my table structure:
| 04c1Tg0M | CREATE TABLE `04c1Tg0M` (
`id` int(20) NOT NULL AUTO_INCREMENT,
`username` varchar(50) NOT NULL,
`tried` tinyint(1) DEFAULT '0',
PRIMARY KEY (`id`),
KEY `pdex` (`username`,`id`,`tried`),
KEY `triedex` (`tried`,`id`)
) ENGINE=MyISAM AUTO_INCREMENT=9275108 DEFAULT CHARSET=utf8 |
According to the rows column of EXPLAIN, the first query is estimated to examine 10822 rows, while the second is estimated to examine 9275107 rows!
The use of primary key "id" index in the second query isn't so useful because you have to do a full table scan anyway.
MySQL's cost-based optimizer thinks, in the case of the 2nd query, it's better off to use the index on 'tried'.
If you have to do a full table-scan, you're better off not using an index, as index constitutes additional disk reads.
You can add a USE INDEX or FORCE INDEX hint to your query to tell the optimizer which index to prefer.
Also update the statistics by analyzing your table periodically so the cost-based optimizer is working correctly.