Slow query with multiple where and order by clauses - mysql

I'm trying to find a way to speed up a slow (filesort) MySQL query.
Tables:
categories (id, lft, rgt)
questions (id, category_id, created_at, votes_up, votes_down)
Example query:
SELECT * FROM questions q
INNER JOIN categories c ON (c.id = q.category_id)
WHERE c.lft > 1 AND c.rgt < 100
ORDER BY q.created_at DESC, q.votes_up DESC, q.votes_down ASC
LIMIT 4000, 20
If I remove the ORDER BY clause, it's fast. I know MySQL doesn't like both DESC and ASC orders in the same clause, so I tried adding a composite (created_at, votes_up) index to the questions table and removed q.votes_down ASC from the ORDER BY clause. That didn't help and it seems that the WHERE clause gets in the way here because it filters by columns from another (categories) table. However, even if it worked, it wouldn't be quite right since I do need the q.votes_down ASC condition.
What are good strategies to improve performance in this case? I'd rather avoid restructuring the tables, if possible.
EDIT:
CREATE TABLE `categories` (
`id` int(11) NOT NULL auto_increment,
`lft` int(11) NOT NULL,
`rgt` int(11) NOT NULL,
PRIMARY KEY (`id`),
KEY `lft_idx` (`lft`),
KEY `rgt_idx` (`rgt`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
CREATE TABLE `questions` (
`id` int(11) NOT NULL auto_increment,
`category_id` int(11) NOT NULL,
`votes_up` int(11) NOT NULL default '0',
`votes_down` int(11) NOT NULL default '0',
`created_at` datetime NOT NULL,
PRIMARY KEY (`id`),
KEY `questions_FI_1` (`category_id`),
KEY `votes_up_idx` (`votes_up`),
KEY `votes_down_idx` (`votes_down`),
KEY `created_at_idx` (`created_at`),
CONSTRAINT `questions_FK_1` FOREIGN KEY (`category_id`) REFERENCES `categories` (`id`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE q ALL questions_FI_1 NULL NULL NULL 31774 Using filesort
1 SIMPLE c eq_ref PRIMARY,lft_idx,rgt_idx PRIMARY 4 ttt.q.category_id 1 Using where

Try a subquery to get the desired categories:
SELECT * FROM questions
WHERE category_id IN ( SELECT id FROM categories WHERE lft > 1 AND rgt < 100 )
ORDER BY created_at DESC, votes_up DESC, votes_down ASC
LIMIT 4000, 20

Try selecting only what you need in your query, instead of the SELECT *
Why not to use SELECT * ( ALL ) in MySQL

Try putting conditions, concerning joined tables into ON clauses:
SELECT * FROM questions q
INNER JOIN categories c ON (c.id = q.category_id AND c.lft > 1 AND c.rgt < 100)
ORDER BY q.created_at DESC, q.votes_up DESC, q.votes_down ASC
LIMIT 4000, 20

Related

MySQL - how to optimize query with order by

I am trying to generate a list of the 5 most recent history items for for a collection of user tasks. If I remove the order by the execution drops from ~2 seconds to < 20msec.
Indexes are on
h.task_id
h.mod_date
i.task_id
i.user_id
This is the query
SELECT h.*
, i.task_id
, i.user_id
, i.name
, i.completed
FROM h
, i
WHERE i.task_id = h.task_id
AND i.user_id = 42
ORDER
BY h.mod_date DESC
LIMIT 5
Here is the explain:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE i ref PRIMARY,UserID UserID 4 const 3091 Using temporary; Using filesort
1 SIMPLE h ref TaskID TaskID 4 myDB.i.task_id 7
Here are the show create tables:
CREATE TABLE `h` (
`history_id` int(6) NOT NULL AUTO_INCREMENT,
`history_code` tinyint(4) NOT NULL DEFAULT '0',
`task_id` int(6) NOT NULL,
`mod_date` datetime NOT NULL,
`description` text NOT NULL,
PRIMARY KEY (`history_id`),
KEY `TaskID` (`task_id`),
KEY `historyCode` (`history_code`),
KEY `modDate` (`mod_date`)
) ENGINE=InnoDB AUTO_INCREMENT=185647 DEFAULT CHARSET=latin1
and
CREATE TABLE `i` (
`task_id` int(6) NOT NULL AUTO_INCREMENT,
`user_id` int(6) NOT NULL,
`name` varchar(60) NOT NULL,
`due_date` date DEFAULT NULL,
`create_date` date NOT NULL,
`completed` tinyint(1) NOT NULL DEFAULT '0',
`task_description` blob,
PRIMARY KEY (`task_id`),
KEY `name_2` (`name`),
KEY `UserID` (`user_id`)
) ENGINE=InnoDB AUTO_INCREMENT=12085 DEFAULT CHARSET=latin1
INDEX(task_id, mod_date, history_id) -- in this order
Will be "covering" and the columns will be in the optimal order
Also, DROP
KEY `TaskID` (`task_id`)
So that the Optimizer won't be tempted to use it.
Try changing the index on h.task_id so it's this compound index.
CREATE OR REPLACE INDEX TaskID ON h(task_id, mod_date DESC);
This may (or may not) allow MySql to shortcut some or all the extra work in your ORDER BY ... LIMIT ... request. It's a notorious performance anti pattern, by the way, but sometimes necessary.
Edit the index didn't help. So let's try a so-called deferred join so we don't have to ORDER and then LIMIT all the data from your h table.
Start with this subquery. It retrieves only the primary key values for the rows involved in your results, and will generate just five rows.
SELECT h.history_id, i.task_id
FROM h
JOIN i ON h.task_id = i.task_id
WHERE i.user_id = 42
ORDER BY h.mod_date
LIMIT 5
Why this subquery? It handles the work-intensive ORDER BY ... LIMIT operation while manipulating only the primary keys and the date. It still must sort tons of rows only to discard all but five, but the rows it has to handle are much shorter. Because this subquery does the heavy work, you focus on optimizing it, rather than the whole query.
Keep the index I suggested above, because it covers the subquery for h.
Then, join it to the rest of your query like this. That way you'll only have to retrieve the expensive h.description column for the five rows you care about.
SELECT h.* , i.task_id, i.user_id , i.name, i.completed
FROM h
JOIN i ON i.task_id = h.task_id
JOIN (
SELECT h.history_id, i.task_id
FROM h
JOIN i ON h.task_id = i.task_id
WHERE i.user_id = 42
ORDER BY h.mod_date
LIMIT 5
) selected ON h.history_id = selected.history_id
AND i.task_id = selected.task_id
ORDER BY h.mod_date DESC
LIMIT 5

Any way to do this query faster with big data

This query takes around 2.23seconds and feels a bit slow ... is there anyway to make it faster.
our member.id, member_id, membership_id, valid_to, valid_from has index as well.
select *
from member
where (member.id in ( select member_id from member_membership mm
INNER JOIN membership m ON mm.membership_id = m.id
where instr(organization_chain, 2513) and m.valid_to > NOW() and m.valid_from < NOW() ) )
order by id desc
limit 10 offset 0
EXPLAIN FOR WHAT QUERY DOING: every member has many a member_memberships and and member_memberships connect with another table called membership there we have the membership details. so query will get all members that has valid memberships and where the organization id 2513 exist on member_membership.
Tables as following:
CREATE TABLE `member` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`first_name` varchar(255) DEFAULT NULL,
`last_name` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`),
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=latin1;
CREATE TABLE `member_membership` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`membership_id` int(11) DEFAULT NULL,
`member_id` int(11) DEFAULT NULL,
`organization_chain` text DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `member_membership_to_membership` (`membership_id`),
KEY `member_membership_to_member` (`member_id`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=latin1;
CREATE TABLE `membership` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(255) DEFAULT NULL,
`valid_to` datetime DEFAULT NULL,
`valid_from` datetime DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `valid_to` (`valid_to`),
KEY `valid_from` (`valid_from`),
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=latin1;
ALTER TABLE `member_membership` ADD CONSTRAINT `member_membership_to_membership` FOREIGN KEY (`membership_id`) REFERENCES `membership` (`id`);
ALTER TABLE `member_membership` ADD CONSTRAINT `member_membership_to_member` FOREIGN KEY (`member_id`) REFERENCES `member` (`id`);
Here with EXPLAIN statement => https://i.ibb.co/xjrcYWR/EXPLAIN.png
Relations
member has many member_membership
membership has manymember_membership
So member_membership is like join for tables member and membership.
Well I found a way to make it less to 800ms ... like this. Is this good way or maybe there is more we can do?
select *
from member
where (member.id in ( select member_id from member_membership mm FORCE INDEX (PRIMARY)
INNER JOIN membership m ON mm.membership_id = m.id
where instr(organization_chain, 2513) and m.valid_to > NOW() and m.valid_from < NOW() ) )
order by id desc
limit 10 offset 0
NEW UPDATE.. and I think this solve the issue.. 15ms :)
I added FORCE INDEX..
The FORCE INDEX hint acts like USE INDEX (index_list), with the addition that a table scan is assumed to be very expensive. In other words, a table scan is used only if there is no way to use one of the named indexes to find rows in the table.
select *
from member
where (member.id in ( select member_id from member_membership mm FORCE INDEX (member_membership_to_member)
INNER JOIN membership m FORCE INDEX (organization_to_membership) ON mm.membership_id = m.id
where instr(organization_chain, 2513) and m.valid_to > NOW() and m.valid_from < NOW() ) )
order by id desc
limit 10 offset 0
How big is organization_chain? If you don't need TEXT, use a reasonably sized VARCHAR so that it could be in an index. Better yet, is there some way to get 2513 in a column by itself?
Don't use id int(11) NOT NULL AUTO_INCREMENT, in a many-to-many table; rather have the two columns in PRIMARY KEY.
Put the ORDER BY and LIMIT in the subquery.
Don't use IN ( SELECT ...), use a JOIN.

Improving query performance with Join, Full Table Scan

I am trying to improve the query performance on a stats reporting website for a Battlefield game, and am having a little bit of trouble with a very specific query. The issue I am having is that EXPLAIN is stating this query is doing a full table scan. This is troublesome because I expect this table to get very large (potentially 1 million rows or more). I am using MySQL 5.7 as my database of choice.
Here is my table and Query: http://pastebin.com/DsiGe2UB
--
-- Table structure for table `player_kit`
--
CREATE TABLE `player_kit` (
`id` TINYINT UNSIGNED NOT NULL,
`pid` INT UNSIGNED NOT NULL,
`time` INT UNSIGNED NOT NULL DEFAULT 0,
`kills` MEDIUMINT UNSIGNED NOT NULL DEFAULT 0,
`deaths` MEDIUMINT UNSIGNED NOT NULL DEFAULT 0,
PRIMARY KEY(`pid`,`id`),
FOREIGN KEY(`pid`) REFERENCES player(`id`) ON DELETE CASCADE ON UPDATE CASCADE,
FOREIGN KEY(`id`) REFERENCES kit(`id`) ON DELETE RESTRICT ON UPDATE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
ALTER TABLE `bf2stats`.`player_kit` ADD INDEX `reverse_ids` (`id`, `pid`);
--
-- My Full Scanning Query
-- SELECTS players, ordering them by kills and time in kit
--
SELECT p.name, p.rank, p.country, k.pid, k.kills, k.deaths, k.time
FROM player_kit AS k
INNER JOIN player AS p ON k.pid = p.id
WHERE k.id = 0 AND k.kills > 0
ORDER BY kills DESC, time DESC
LIMIT 0, 40
--
-- EXPLAIN results by MySQL
--
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE k NULL ref PRIMARY 1 const 75 32.11 Using index condition; Using where; Using filesort
1 SIMPLE p NULL eq_ref PRIMARY PRIMARY 4 bf2stats.k.pid 1 100.00 NULL
--
-- Additional Tables just in case, for reference
--
--
-- Table structure for table `kit`
--
CREATE TABLE `kit` (
`id` TINYINT UNSIGNED,
`name` VARCHAR(32) NOT NULL,
PRIMARY KEY(`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
--
-- Table structure for table `player`
--
CREATE TABLE `player` (
`id` INT UNSIGNED NOT NULL AUTO_INCREMENT,
`name` VARCHAR(32) UNIQUE NOT NULL,
`rank` TINYINT NOT NULL DEFAULT 0,
`country` CHAR(2) NOT NULL DEFAULT 'xx',
PRIMARY KEY(`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Here is the explain from phpMyAdmin:
I am hoping that one of you can help me improve the performance of this query, since any kind of index I have put on it does not seem to help much.
For this query:
SELECT p.name, p.rank, p.country, k.pid, k.kills, k.deaths, k.time
FROM player_kit k INNER JOIN
player p
ON k.pid = p.id
WHERE k.id = 0 AND k.kills > 0
ORDER BY kills DESC, time DESC
LIMIT 0, 40;
The optimal indexes are:
player_kit(id, kills, pid)
player(id) -- if this is not already there
You can also add the other columns in the index to get a covering index for the query.

Sorting result of mysql join by avg of third table?

I have three tables.
One table contains submissions which has about 75,000 rows
One table contains submission ratings and only has < 10 rows
One table contains submission => competition mappings and for my test data also has about 75,000 rows.
What I want to do is
Get the top 50 submissions in a round of a competition.
Top is classified as highest average rating, followed by highest amount of votes
Here is the query I am using which works, but the problem is that it takes over 45 seconds to complete! I profiled the query (results at bottom) and the bottlenecks are copying the data to a tmp table and then sorting it so how can I speed this up?
SELECT `submission_submissions`.*
FROM `submission_submissions`
JOIN `competition_submissions`
ON `competition_submissions`.`submission_id` = `submission_submissions`.`id`
LEFT JOIN `submission_ratings`
ON `submission_submissions`.`id` = `submission_ratings`.`submission_id`
WHERE `top_round` = 1
AND `competition_id` = '2'
AND `submission_submissions`.`date_deleted` IS NULL
GROUP BY submission_submissions.id
ORDER BY AVG(submission_ratings.`stars`) DESC,
COUNT(submission_ratings.`id`) DESC
LIMIT 50
submission_submissions
CREATE TABLE `submission_submissions` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`account_id` int(11) NOT NULL,
`title` varchar(255) NOT NULL,
`description` varchar(255) DEFAULT NULL,
`genre` int(11) NOT NULL,
`goals` text,
`submission` text NOT NULL,
`date_created` datetime DEFAULT NULL,
`date_modified` datetime DEFAULT NULL,
`date_deleted` datetime DEFAULT NULL,
`cover_image` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `genre` (`genre`),
KEY `account_id` (`account_id`),
KEY `date_created` (`date_created`)
) ENGINE=InnoDB AUTO_INCREMENT=115037 DEFAULT CHARSET=latin1;
submission_ratings
CREATE TABLE `submission_ratings` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`account_id` int(11) NOT NULL,
`submission_id` int(11) NOT NULL,
`stars` tinyint(1) NOT NULL,
`date_created` datetime DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `submission_id` (`submission_id`),
KEY `account_id` (`account_id`),
KEY `stars` (`stars`)
) ENGINE=InnoDB AUTO_INCREMENT=7 DEFAULT CHARSET=latin1;
competition_submissions
CREATE TABLE `competition_submissions` (
`competition_id` int(11) NOT NULL,
`submission_id` int(11) NOT NULL,
`top_round` int(11) DEFAULT '1',
PRIMARY KEY (`submission_id`),
KEY `competition_id` (`competition_id`),
KEY `top_round` (`top_round`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
SHOW PROFILE Result (ordered by duration)
state duration (summed) in sec percentage
Copying to tmp table 33.15621 68.46924
Sorting result 11.83148 24.43260
removing tmp table 3.06054 6.32017
Sending data 0.37560 0.77563
... insignificant amounts removed ...
Total 48.42497 100.00000
EXPLAIN
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE competition_submissions index_merge PRIMARY,competition_id,top_round competition_id,top_round 4,5 18596 Using intersect(competition_id,top_round); Using where; Using index; Using temporary; Using filesort
1 SIMPLE submission_submissions eq_ref PRIMARY PRIMARY 4 inkstakes.competition_submissions.submission_id 1 Using where
1 SIMPLE submission_ratings ALL submission_id 5 Using where; Using join buffer (flat, BNL join)
Assuming that in reality you won't be interested in unrated submissions, and that a given submission only has a single competition_submissions entry for a given match and top_round, I suggest:
SELECT s.*
FROM (SELECT `submission_id`,
AVG(`stars`) AvgStars,
COUNT(`id`) CountId
FROM `submission_ratings`
GROUP BY `submission_id`
ORDER BY AVG(`stars`) DESC, COUNT(`id`) DESC
LIMIT 50) r
JOIN `submission_submissions` s
ON r.`submission_id` = s.`id` AND
s.`date_deleted` IS NULL
JOIN `competition_submissions` c
ON c.`submission_id` = s.`id` AND
c.`top_round` = 1 AND
c.`competition_id` = '2'
ORDER BY r.AvgStars DESC,
r.CountId DESC
(If there is more than one competition_submissions entry per submission for a given match and top_round, then you can add the GROUP BY clause back in to the main query.)
If you do want to see unrated submissions, you can union the results of this query to a LEFT JOIN ... WHERE NULL query.
There is a simple trick that works on MySql and helps to avoid copying/sorting huge temp tables in queries like this (with LIMIT X).
Just avoid SELECT *, this copies all columns to the temporary table, then this huge table is sorted, and in the end, the query takes only 50 records from this huge table ( 50 / 70000 = 0,07 % ).
Select only columns that are really necessary to perform sort and limit, and then join missing columns only for selected 50 records by id.
select ss.*
from submission_submissions ss
join (
SELECT `submission_submissions`.id,
AVG(submission_ratings.`stars`) stars,
COUNT(submission_ratings.`id`) cnt
FROM `submission_submissions`
JOIN `competition_submissions`
ON `competition_submissions`.`submission_id` = `submission_submissions`.`id`
LEFT JOIN `submission_ratings`
ON `submission_submissions`.`id` = `submission_ratings`.`submission_id`
WHERE `top_round` = 1
AND `competition_id` = '2'
AND `submission_submissions`.`date_deleted` IS NULL
GROUP BY submission_submissions.id
ORDER BY AVG(submission_ratings.`stars`) DESC,
COUNT(submission_ratings.`id`) DESC
LIMIT 50
) xx
ON ss.id = xx.id
ORDER BY xx.stars DESC,
xx.cnt DESC;

mysql select from table 1 join and order by table 2 , optimization filesort

read mysql post but couldnt find the solution.
Any pointers would be appreciated.
I have 2 tables A,B
Table A has folderid mapped to multiple feedid (1 to many )
Table B has feedid and data associated with it.
The following query gives Using index; Using temporary; Using filesort
SELECT A.feed_id,B.feed_id,B.entry_id
FROM feed_folder A
LEFT JOIN feed_data B on A.feed_id=B.feed_id
WHERE A.folder_id=29
AND B.entry_id <= 123
AND B.entry_created <= '2012-11-01 21:38:54'
ORDER by B.entry_created desc limit 0,20;
Any ideas as to how the temporary filesort can be avoided.
Following is the table structure
CREATE TABLE `feed_folder` (
`folder_id` bigint(20) unsigned NOT NULL,
`feed_id` bigint(20) unsigned NOT NULL,
UNIQUE KEY `folder_id` (`folder_id`,`feed_id`),
KEY `feed_id` (`feed_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
CREATE TABLE `feed_data` (
`entry_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`feed_id` bigint(20) unsigned NOT NULL,
`entry_created` datetime NOT NULL,
`entry_object` text,
`entry_permalink` varchar(150) NOT NULL,
`entry_orig_created` datetime NOT NULL,
PRIMARY KEY (`entry_id`),
UNIQUE KEY `feed_id_2` (`feed_id`,`entry_permalink`),
KEY `entry_created` (`entry_created`),
KEY `feed_id` (`feed_id`,`entry_created`),
KEY `feed_id_entry_id` (`feed_id`,`entry_id`),
KEY `entry_permalink` (`entry_permalink`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8
Try this query:
SELECT * FROM(SELECT
A.feed_id,B.feed_id,B.entry_id
FROM feed_folder as A
INNER JOIN feed_data as B on A.feed_id=B.feed_id
WHERE A.folder_id=29
AND B.entry_id <= 123
AND B.entry_created <= '2012-11-01 21:38:54')joinAndSortTable
ORDER by B.entry_created desc limit 0,20;
The problem is in alias, you are forgot to give as A and as B
i'm also remind you about LEFT JOIN and RIGHT JOIN, INNER JOIN...
To join all records what you selected, Using INNER JOIN is a workaround.
Because INNER JOIN can join all columns with full accessed.
GOOD LUCK