How to improve the LIMIT clause in MySQL

I have a posts table with 10k rows and I want to paginate it. I use the following query for that purpose:
SELECT post_id
FROM posts
LIMIT 0, 10;
When I run EXPLAIN on that query I get the following result:
I don't understand why MySQL needs to iterate through 9976 rows to find the first 10 rows. I would be very thankful if somebody could help me optimize this query.
I also know about the topic "MySQL ORDER BY / LIMIT performance: late row lookups", but the problem still exists even if I modify the query as follows:
SELECT t.post_id
FROM (
SELECT post_id
FROM posts
ORDER BY
post_id
LIMIT 0, 10
) q
JOIN posts t
ON q.post_id = t.post_id
Update
@pala_'s solution works great for the simple case above, but now I am testing a more complex query with an inner join. My goal is to join the comments table with the posts table, and unfortunately when I run EXPLAIN on the new query it still iterates through 9976 rows.
Select comm.comment_id
from comments as comm
inner join (
SELECT post_id
FROM posts
ORDER BY post_id
LIMIT 0, 10
) as paged_post on comm.post_id = paged_post.post_id;
Do you have any idea what the reason for this MySQL behavior is?

Try this:
SELECT post_id
FROM posts
ORDER BY post_id DESC
LIMIT 0, 10;
Pagination via LIMIT doesn't make much sense without ordering anyway, and it should fix your problem.
mysql> explain select * from foo;
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
| 1 | SIMPLE | foo | index | NULL | PRIMARY | 4 | NULL | 20 | Using index |
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
1 row in set (0.00 sec)
mysql> explain select * from foo limit 0, 10;
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
| 1 | SIMPLE | foo | index | NULL | PRIMARY | 4 | NULL | 20 | Using index |
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
1 row in set (0.00 sec)
mysql> explain select * from foo order by id desc limit 0, 10;
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
| 1 | SIMPLE | foo | index | NULL | PRIMARY | 4 | NULL | 10 | Using index |
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
1 row in set (0.00 sec)
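To go beyond the first page, the same ordered query can be reused with a different offset. The following is only a sketch using the posts table from the question. The statement name page_stmt and the user variables are illustrative, and the offset is assumed to be computed by the application as (page - 1) * page size. MySQL does not accept expressions in LIMIT, so a prepared statement with placeholders is used here.

```sql
-- Page 1: first 10 rows in descending post_id order.
SELECT post_id FROM posts ORDER BY post_id DESC LIMIT 0, 10;

-- Page 2: skip 10 rows, return the next 10.
SELECT post_id FROM posts ORDER BY post_id DESC LIMIT 10, 10;

-- Parameterized form: LIMIT takes placeholders in a prepared statement.
SET @page_size   = 10;
SET @page_offset = 20;  -- (3 - 1) * 10, i.e. page 3
PREPARE page_stmt FROM
    'SELECT post_id FROM posts ORDER BY post_id DESC LIMIT ?, ?';
EXECUTE page_stmt USING @page_offset, @page_size;
DEALLOCATE PREPARE page_stmt;
```

Keeping the same ORDER BY on every page is what makes the pages consistent with each other.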
Regarding your last comment about the comments join: do you have an index on comments(post_id)? With my test data I'm getting the following results:
mysql> alter table comments add index pi (post_id);
Query OK, 0 rows affected (0.15 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> explain select c.id from comments c inner join (select id from posts o order by id limit 0, 10) p on c.post_id = p.id;
+----+-------------+------------+-------+---------------+---------+---------+------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+-------+---------------+---------+---------+------+------+--------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 10 | |
| 1 | PRIMARY | c | ref | pi | pi | 5 | p.id | 4 | Using where; Using index |
| 2 | DERIVED | o | index | NULL | PRIMARY | 4 | NULL | 10 | Using index |
+----+-------------+------------+-------+---------------+---------+---------+------+------+--------------------------+
and for table size reference:
mysql> select count(*) from posts;
+----------+
| count(*) |
+----------+
| 15021 |
+----------+
1 row in set (0.01 sec)
mysql> select count(*) from comments;
+----------+
| count(*) |
+----------+
| 1000 |
+----------+
1 row in set (0.00 sec)
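Applied to the table and column names from the question, the same check might look like the sketch below. It assumes comments.post_id is not indexed yet; the index name idx_comments_post_id is made up for illustration.

```sql
-- Assumption: no index exists on comments.post_id yet.
-- idx_comments_post_id is an illustrative name.
ALTER TABLE comments ADD INDEX idx_comments_post_id (post_id);

-- Re-check the plan of the paged join from the question.
EXPLAIN
SELECT comm.comment_id
FROM comments AS comm
INNER JOIN (
    SELECT post_id
    FROM posts
    ORDER BY post_id
    LIMIT 0, 10
) AS paged_post ON comm.post_id = paged_post.post_id;
```

If the index is picked up, the comm row should show type ref with the new key, similar to the pi index in the output above. The actual row estimates depend on your data.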

Related

Efficient way to update a table with subquery

Consider the following data in the table of books:
bId serial
1 123
2 234
5 445
9 556
There's another table of missing_books with a latest_known_serial whose values come from the following query:
UPDATE missing_books mb
SET latest_known_serial = (
SELECT serial FROM books b
WHERE b.bId < mb.bId
ORDER BY b.bId DESC LIMIT 1)
The aforementioned query produces the following:
bId latest_known_serial
3 234
4 234
6 445
7 445
8 445
It all works, but I was wondering if there's any more performant way to do this as it actually hits big tables.
You can improve performance by using indexes. I tried to simulate your query:
mysql> EXPLAIN UPDATE missing_books mb
-> SET latest_known_serial = (
-> SELECT serial FROM books b
-> WHERE b.bId < mb.bId
-> ORDER BY b.bId DESC LIMIT 1);
+----+--------------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------------------+
| 1 | UPDATE | mb | NULL | ALL | NULL | NULL | NULL | NULL | 10 | 100.00 | NULL |
| 2 | DEPENDENT SUBQUERY | b | NULL | ALL | bId | NULL | NULL | NULL | 5 | 33.33 | Range checked for each record (index map: 0x1); Using filesort |
+----+--------------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------------------+
2 rows in set, 2 warnings (0.00 sec)
As you can see in the EXPLAIN output above, the subquery uses a full table scan (type: ALL): the optimizer did not choose the unique index defined on the bId column.
Now let's make bId the primary key instead of a unique index, then run EXPLAIN again to see the result.
Drop the unique index first:
mysql> ALTER TABLE books DROP INDEX bId;
Query OK, 0 rows affected (0.00 sec)
Records: 0 Duplicates: 0 Warnings: 0
Then define the primary key on the bId column:
mysql> ALTER TABLE books
ADD PRIMARY KEY (bId);
Now test again:
mysql> EXPLAIN UPDATE missing_books mb SET latest_known_serial = ( SELECT serial FROM books b WHERE b.bId < mb.bId ORDER BY b.bId DESC LIMIT 1);
+----+--------------------+-------+------------+-------+---------------+---------+---------+------+------+----------+----------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------------+-------+------------+-------+---------------+---------+---------+------+------+----------+----------------------------------+
| 1 | UPDATE | mb | NULL | ALL | NULL | NULL | NULL | NULL | 10 | 100.00 | NULL |
| 2 | DEPENDENT SUBQUERY | b | NULL | index | PRIMARY | PRIMARY | 4 | NULL | 1 | 33.33 | Using where; Backward index scan |
+----+--------------------+-------+------------+-------+---------------+---------+---------+------+------+----------+----------------------------------+
2 rows in set, 2 warnings (0.00 sec)
As you can see in the key column, the optimizer now uses the primary key index defined on the books table. You can test the speed by making small adjustments.
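For anyone who wants to try this, here is a minimal reproduction sketch using the sample data from the question. The column types and the primary key on missing_books are assumptions, since the question does not show the DDL. The serial column is quoted with backticks because SERIAL is also a MySQL keyword.

```sql
-- Assumed schema: the question only lists column names and sample rows.
CREATE TABLE books (
    bId      INT NOT NULL,
    `serial` INT NOT NULL,
    PRIMARY KEY (bId)
);

CREATE TABLE missing_books (
    bId                 INT NOT NULL,
    latest_known_serial INT NULL,
    PRIMARY KEY (bId)
);

-- Sample rows copied from the question.
INSERT INTO books (bId, `serial`) VALUES (1, 123), (2, 234), (5, 445), (9, 556);
INSERT INTO missing_books (bId) VALUES (3), (4), (6), (7), (8);

-- For each missing book, take the serial of the closest lower bId.
UPDATE missing_books mb
SET latest_known_serial = (
    SELECT b.`serial`
    FROM books b
    WHERE b.bId < mb.bId
    ORDER BY b.bId DESC
    LIMIT 1
);

SELECT bId, latest_known_serial FROM missing_books ORDER BY bId;
```

The final SELECT should return the same rows as the table in the question (234 for bId 3 and 4, 445 for bId 6, 7 and 8).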

MySQL : Indexes for View based on an aggregated query

I have a working, well-indexed SQL query that aggregates notes (a sum of ints) for all my users, among other things. This is "query A".
I want to use these aggregated notes in other queries, say "query B".
If I create a view based on "query A", will the indexes of the original query be used when needed if I join it in "query B"?
Is that true for MySQL? For other flavors of SQL?
Thanks.
In MySQL, you cannot create an index on a view. MySQL uses indexes of the underlying tables when you query data against the views that use the merge algorithm. For the views that use the temptable algorithm, indexes are not utilized when you query data against the views.
https://www.percona.com/blog/2007/08/12/mysql-view-as-performance-troublemaker/
Here's a demo table. It has a userid attribute column and a note column.
mysql> create table t (id serial primary key, userid int not null, note int, key(userid,note));
If you do an aggregation to get the sum of note per userid, it does an index-scan on (userid, note).
mysql> explain select userid, sum(note) from t group by userid;
+----+-------------+-------+-------+---------------+--------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+--------+---------+------+------+-------------+
| 1 | SIMPLE | t | index | userid | userid | 9 | NULL | 1 | Using index |
+----+-------------+-------+-------+---------------+--------+---------+------+------+-------------+
1 row in set (0.00 sec)
If we create a view for the same query, then we can see that querying the view uses the same index on the underlying table. Views in MySQL are pretty much like macros — they just query the underlying table.
mysql> create view v as select userid, sum(note) from t group by userid;
Query OK, 0 rows affected (0.03 sec)
mysql> explain select * from v;
+----+-------------+------------+-------+---------------+--------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+-------+---------------+--------+---------+------+------+-------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 2 | NULL |
| 2 | DERIVED | t | index | userid | userid | 9 | NULL | 1 | Using index |
+----+-------------+------------+-------+---------------+--------+---------+------+------+-------------+
2 rows in set (0.00 sec)
So far so good.
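The DERIVED row in that EXPLAIN is what a view processed through a temporary table looks like. For comparison, here is a sketch of a view that MySQL can merge into the outer query. The view names v_plain and v_agg are made up for illustration. According to the MySQL manual, a view containing GROUP BY or an aggregate function cannot use the MERGE algorithm, so requesting it for the aggregating query is expected to produce a warning only.

```sql
-- Hypothetical non-aggregating view over the demo table t.
CREATE ALGORITHM = MERGE VIEW v_plain AS
SELECT userid, note FROM t;

-- With a merged view the plan should reference t directly,
-- without a <derived> row.
EXPLAIN SELECT userid, note FROM v_plain WHERE userid = 1;

-- Asking for MERGE on the aggregating query: MySQL is expected to
-- issue a warning and fall back to ALGORITHM = UNDEFINED.
CREATE ALGORITHM = MERGE VIEW v_agg AS
SELECT userid, SUM(note) AS total_note FROM t GROUP BY userid;
SHOW WARNINGS;
```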
Now let's create a table to join with the view, and join to it.
mysql> create table u (userid int primary key, name text);
Query OK, 0 rows affected (0.09 sec)
mysql> explain select * from v join u using (userid);
+----+-------------+------------+-------+---------------+-------------+---------+---------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+-------+---------------+-------------+---------+---------------+------+-------------+
| 1 | PRIMARY | u | ALL | PRIMARY | NULL | NULL | NULL | 1 | NULL |
| 1 | PRIMARY | <derived2> | ref | <auto_key0> | <auto_key0> | 4 | test.u.userid | 2 | NULL |
| 2 | DERIVED | t | index | userid | userid | 9 | NULL | 1 | Using index |
+----+-------------+------------+-------+---------------+-------------+---------+---------------+------+-------------+
3 rows in set (0.01 sec)
I tried to use hints like straight_join to force it to read v then join to u.
mysql> explain select * from v straight_join u on (v.userid=u.userid);
+----+-------------+------------+-------+---------------+--------+---------+------+------+----------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+-------+---------------+--------+---------+------+------+----------------------------------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 7 | NULL |
| 1 | PRIMARY | u | ALL | PRIMARY | NULL | NULL | NULL | 1 | Using where; Using join buffer (Block Nested Loop) |
| 2 | DERIVED | t | index | userid | userid | 9 | NULL | 7 | Using index |
+----+-------------+------------+-------+---------------+--------+---------+------+------+----------------------------------------------------+
"Using join buffer (Block Nested Loop)" is MySQL's terminology for "no index used for the join." It's just looping over the table the hard way -- by reading batches of rows from start to finish of the table.
I tried to use force index to tell MySQL that type=ALL is to be avoided.
mysql> explain select * from v straight_join u force index(PRIMARY) on (v.userid=u.userid);
+----+-------------+------------+--------+---------------+---------+---------+----------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+--------+---------------+---------+---------+----------+------+-------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 7 | NULL |
| 1 | PRIMARY | u | eq_ref | PRIMARY | PRIMARY | 4 | v.userid | 1 | NULL |
| 2 | DERIVED | t | index | userid | userid | 9 | NULL | 7 | Using index |
+----+-------------+------------+--------+---------------+---------+---------+----------+------+-------------+
Maybe this is using an index for the join? But it's weird that table u is before table t in the EXPLAIN. I'm frankly not sure how to understand what it's doing, given the order of rows in this EXPLAIN report. I would expect the joined table should come after the primary table of the query.
I only put a few rows of data into each table. One might get some different EXPLAIN results with a larger representative sample of test data. I'll leave that to you to try.
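One way to get a larger sample is sketched below. It assumes t and u start out empty (otherwise the inserts into u may hit duplicate keys) and uses a hypothetical helper table named digits. The row counts are arbitrary.

```sql
-- Hypothetical helper table holding the digits 0..9.
CREATE TABLE digits (n INT PRIMARY KEY);
INSERT INTO digits VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);

-- 1000 users with userid 1..1000 (assumes u is empty).
INSERT INTO u (userid, name)
SELECT a.n + 10 * b.n + 100 * c.n + 1,
       CONCAT('user', a.n + 10 * b.n + 100 * c.n + 1)
FROM digits a
CROSS JOIN digits b
CROSS JOIN digits c;

-- 10 notes per user, with note values 0..9.
INSERT INTO t (userid, note)
SELECT u.userid, d.n
FROM u
CROSS JOIN digits d;

-- Refresh index statistics before looking at plans again.
ANALYZE TABLE t, u;

EXPLAIN SELECT * FROM v JOIN u USING (userid);
```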

How to improve this sql query performance?

SELECT id, name, detail FROM student WHERE id NOT IN (1,788,103,100) ORDER BY id DESC LIMIT 1000,10
The table is tiny (10,000 rows). I have to consider two points: the "IN query" and the "LIMIT query".
Here are the DDLs and the EXPLAIN. I'm using MySQL 5.6.4.
CREATE TABLE student
( id int(11) NOT NULL AUTO_INCREMENT
, name varchar(45) NOT NULL
, detail varchar(255) NOT NULL
, PRIMARY KEY (id)
) ENGINE = MyISAM;
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
| 1 | SIMPLE | student| ALL | Primary,id | NULL | NULL | NULL | 13 | |
The LIMIT and ORDER BY clauses mean that the query has to build the whole result set, order it, go to record 1000, and then extract the next 10 records.
Why are you looking for 10 records starting at record 1000?
Removing the ORDER BY clause would make it faster as the query would only need to extract 1010 records.
I cannot replicate this finding...
SELECT VERSION();
+-----------+
| VERSION() |
+-----------+
| 5.5.16 |
+-----------+
SELECT COUNT(*) FROM student;
+----------+
| COUNT(*) |
+----------+
| 131072 |
+----------+
SELECT id
FROM student
WHERE id
NOT IN (1,788,103,100)
ORDER
BY id DESC
LIMIT 1000,10;
+--------+
| id |
+--------+
| 195591 |
| 195590 |
| 195589 |
| 195588 |
| 195587 |
| 195586 |
| 195585 |
| 195584 |
| 195583 |
| 195582 |
+--------+
10 rows in set (0.00 sec)
+----+-------------+---------+-------+---------------+---------+---------+------+--------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+-------+---------------+---------+---------+------+--------+--------------------------+
| 1 | SIMPLE | student | range | PRIMARY | PRIMARY | 4 | NULL | 131069 | Using where; Using index |
+----+-------------+---------+-------+---------------+---------+---------+------+--------+--------------------------+
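Note that this test selects only id, which is why the plan shows "Using index", while the original query also needs name and detail. One option is the "late row lookup" pattern: page over the primary key alone in a derived table, then join back for the remaining columns. This is a sketch under that assumption, not a measured improvement. The alias names page and s are illustrative.

```sql
SELECT s.id, s.name, s.detail
FROM (
    -- Only the primary key is read here, so the paging step
    -- can be served from the index.
    SELECT id
    FROM student
    WHERE id NOT IN (1, 788, 103, 100)
    ORDER BY id DESC
    LIMIT 1000, 10
) AS page
JOIN student AS s ON s.id = page.id
-- Re-apply the order: the join does not guarantee it.
ORDER BY s.id DESC;
```

Only the 10 ids that survive the LIMIT are looked up in the full rows. Compare the EXPLAIN output of both forms on your own data before adopting it.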

Mysql not using index on group by and order by

I have a users table with the following columns:
id, name, updated_at
Here is the query with its EXPLAIN plan:
mysql> explain select * from users group by users.id order by users.updated_at desc limit 10;
+----+-------------+-------------+------+---------------+------+---------+------+--------+----------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------+------+---------------+------+---------+------+--------+----------------+
| 1 | SIMPLE | users | ALL | NULL | NULL | NULL | NULL | 190551 | Using filesort |
+----+-------------+-------------+------+---------------+------+---------+------+--------+----------------+
1 row in set (0.00 sec)
I created a new index:
create index test_id_updated_at on users (id, updated_at);
After creating the new index I still get the same result from EXPLAIN.
mysql> explain select * from users group by users.id order by users.updated_at desc limit 10;
+----+-------------+-------------+------+---------------+------+---------+------+--------+----------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------+------+---------------+------+---------+------+--------+----------------+
| 1 | SIMPLE | users | ALL | NULL | NULL | NULL | NULL | 190551 | Using filesort |
+----+-------------+-------------+------+---------------+------+---------+------+--------+----------------+
1 row in set (0.00 sec)
After forcing the new index in the query, I still get the same result.
I don't understand why it says 'Using filesort' after creating the new index.
I replicated your scenario, and MySQL used the index:
mysql> explain SELECT * FROM test_index GROUP BY id ORDER BY updated_at DESC;
+----+-------------+------------+------+---------------+------+---------+------+--------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+------+---------------+------+---------+------+--------+---------------------------------+
| 1 | SIMPLE | test_index | ALL | NULL | NULL | NULL | NULL | 393520 | Using temporary; Using filesort |
+----+-------------+------------+------+---------------+------+---------+------+--------+---------------------------------+
1 row in set (0.00 sec)
mysql> CREATE INDEX index1 ON test_index(id, updated_at);
Query OK, 0 rows affected (1.85 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> explain SELECT * FROM test_index GROUP BY id ORDER BY updated_at DESC;
+----+-------------+------------+-------+---------------+--------+---------+------+--------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+-------+---------------+--------+---------+------+--------+---------------------------------+
| 1 | SIMPLE | test_index | index | NULL | index1 | 12 | NULL | 393520 | Using temporary; Using filesort |
+----+-------------+------------+-------+---------------+--------+---------+------+--------+---------------------------------+
1 row in set (0.00 sec)
Maybe it's your MySQL version? I tested this on 5.5.
It's hard to optimize this query (to remove "Using filesort" and "Using temporary") without knowing the full table structure and what you want to retrieve with it (specify the fields instead of "*").
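One thing that can be said without the full structure: an index on (id, updated_at) is sorted by id first, so it cannot return rows in updated_at order and the filesort remains. The sketch below assumes that users.id is the primary key, in which case every group contains exactly one row and the GROUP BY can be dropped. The index name idx_users_updated_at is illustrative.

```sql
-- Assumption: users.id is the primary key, so GROUP BY users.id
-- does not change the result and can be removed.
-- idx_users_updated_at is an illustrative name.
CREATE INDEX idx_users_updated_at ON users (updated_at);

EXPLAIN
SELECT id, name, updated_at
FROM users
ORDER BY updated_at DESC
LIMIT 10;
```

With an index that leads with updated_at, the optimizer has the option of reading the 10 newest entries straight from the index instead of sorting all rows. Whether it does so depends on your version and data.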

Why is MySQL not using an index when I'm including a subquery?

I have the following query, which is fine, but will get slower as the brands table grows:
mysql> explain select brand_id as id,brands.name from tags
-> INNER JOIN brands on tags.brand_id = brands.id
-> where brand_id in
-> (select brand_id from tags where outfit_id in
-> (1,6,68,265,271))
-> group by brand_id, brands.name
-> ORDER BY count(brand_id)
-> LIMIT 5;
+----+--------------------+--------+----------------+------------------------------------------------+------------------------+---------+-----------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+--------+----------------+------------------------------------------------+------------------------+---------+-----------------+------+----------------------------------------------+
| 1 | PRIMARY | brands | ALL | PRIMARY | NULL | NULL | NULL | 165 | Using where; Using temporary; Using filesort |
| 1 | PRIMARY | tags | ref | index_tags_on_brand_id | index_tags_on_brand_id | 5 | waywn.brands.id | 1 | Using where; Using index |
| 2 | DEPENDENT SUBQUERY | tags | index_subquery | index_tags_on_outfit_id,index_tags_on_brand_id | index_tags_on_brand_id | 5 | func | 1 | Using where |
+----+--------------------+--------+----------------+------------------------------------------------+------------------------+---------+-----------------+------+----------------------------------------------+
3 rows in set (0.00 sec)
I don't understand why it isn't using the primary key as the index here and doing a file sort. If I replace the subquery with the values returned from that subquery, MySQL correctly uses the indices:
mysql> explain select brand_id as id,brands.name from tags
-> INNER JOIN brands on tags.brand_id = brands.id
-> where brand_id in
-> (2, 2, 9, 10, 40, 32, 9, 118)
-> group by brand_id, brands.name
-> ORDER BY count(brand_id)
-> LIMIT 5;
+----+-------------+--------+-------+------------------------+------------------------+---------+-----------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------+-------+------------------------+------------------------+---------+-----------------+------+----------------------------------------------+
| 1 | SIMPLE | brands | range | PRIMARY | PRIMARY | 4 | NULL | 6 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | tags | ref | index_tags_on_brand_id | index_tags_on_brand_id | 5 | waywn.brands.id | 1 | Using where; Using index |
+----+-------------+--------+-------+------------------------+------------------------+---------+-----------------+------+----------------------------------------------+
2 rows in set (0.00 sec)
mysql> explain select brand_id from tags where outfit_id in (1,6,68,265,271);
+----+-------------+-------+-------+-------------------------+-------------------------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+-------------------------+-------------------------+---------+------+------+-------------+
| 1 | SIMPLE | tags | range | index_tags_on_outfit_id | index_tags_on_outfit_id | 5 | NULL | 8 | Using where |
+----+-------------+-------+-------+-------------------------+-------------------------+---------+------+------+-------------+
1 row in set (0.00 sec)
Why would this be? It doesn't really make sense to me. I mean, I can break it up into 2 calls, but that seems poor. I did notice that I can make it slightly more efficient by including a distinct in the subquery, but that didn't change the way it uses keys at all.
Why don't you just write:
SELECT brand_id as id,brands.name
FROM tags
INNER JOIN brands ON tags.brand_id = brands.id
WHERE outfit_id in (1,6,68,265,271)
GROUP BY brand_id, brands.name
ORDER BY count(brand_id)
LIMIT 5;
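One caveat: this shorter query counts only the tags that belong to the listed outfits, while the original counted every tag of each matching brand, so the ordering can differ. If the original counts matter, a possible alternative is to move the subquery into a derived table so it runs once instead of as a dependent subquery. This is a sketch, not a tested plan; the aliases wanted, b and t are illustrative.

```sql
SELECT t.brand_id AS id, b.name
FROM (
    -- Brands that appear in any of the listed outfits.
    SELECT DISTINCT brand_id
    FROM tags
    WHERE outfit_id IN (1, 6, 68, 265, 271)
) AS wanted
INNER JOIN brands AS b ON b.id = wanted.brand_id
-- Join back to all tags of those brands to keep the original counts.
INNER JOIN tags AS t ON t.brand_id = b.id
GROUP BY t.brand_id, b.name
ORDER BY COUNT(t.brand_id)
LIMIT 5;
```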