I have a MySQL database with InnoDB tables totaling over 10 GB of data that I want to migrate from MySQL 5.5 to MySQL 5.7. I have a query that looks a bit like:
SELECT dates.date, count(mySub2.myColumn1), sum(mySub2.myColumn2)
FROM (
SELECT date
FROM dates -- just a table containing all possible dates next 5 years
WHERE date BETWEEN '2016-06-01' AND '2016-09-03'
) AS dates
LEFT JOIN (
SELECT o.id, time_start, time_end
FROM `order` AS o -- order is a reserved word, so it must be quoted
INNER JOIN order_items AS oi on oi.order_id = o.id
WHERE time_start BETWEEN '2016-06-01' AND '2016-09-03'
) AS mySub1 ON dates.date >= mySub1.time_start AND dates.date < mySub1.time_end
LEFT JOIN (
SELECT o.id, time_start, time_end
FROM `order` AS o
INNER JOIN order_items AS oi on oi.order_id = o.id
WHERE o.shop_id = 50 AND time_start BETWEEN '2016-06-01' AND '2016-09-03'
) AS mySub2 ON dates.date >= mySub2.time_start AND dates.date < mySub2.time_end
GROUP BY dates.date;
My problem is that this query is performing fast in MySQL 5.5 but extremely slow in MySQL 5.7.
In MySQL 5.5 it is taking over 1 second at first and < 0.001 seconds every recurring execution without restarting MySQL.
In MySQL 5.7 it is taking over 11.5 seconds at first and 1.4 seconds every recurring execution without restarting MySQL.
And the more LEFT JOINs I add to the query, the slower the query becomes in MySQL 5.7.
Both instances now run on the same machine, on the same hard drive and with the same my.ini settings. So it isn't hardware.
The execution plans do differ, though, and I don't know what to make of them.
This is the EXPLAIN EXTENDED on MySQL 5.5:
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | extra |
|----|-------------|------------|-------|---------------|-------------|---------|-----------|-------|----------|---------------------------------|
| 1 | PRIMARY | dates | ALL | | | | | 95 | 100.00 | Using temporary; Using filesort |
| 1 | PRIMARY | <derived2> | ALL | | | | | 281 | 100.00 | '' |
| 1 | PRIMARY | <derived3> | ALL | | | | | 100 | 100.00 | '' |
| 3 | DERIVED | o | ref | xxxxxx | shop_id_fk | 4 | '' | 1736 | 100.00 | '' |
| 3 | DERIVED | oc | ref | xxxxx | order_id_fk | 4 | myDb.o.id | 1 | 100.00 | Using index |
| 2 | DERIVED | o | range | xxxx | date_start | 3 | | 17938 | 100.00 | Using where |
| 2 | DERIVED | oc | ref | xxx | order_id_fk | 4 | myDb.o.id | 1 | 100.00 | Using where |
This is the EXPLAIN EXTENDED on MySQL 5.7:
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | extra |
|----|-------------|-------|--------|---------------|-------------|---------|------------------|------|----------|----------------|
| 1 | SIMPLE | dates | ALL | | | | | 95 | 100.00 | Using filesort |
| 1 | SIMPLE | oi | ref | xxxxxx | order_id_fk | 4 | const | 228 | 100.00 | |
| 1 | SIMPLE | o | eq_ref | xxxxx | PRIMARY | 4 | myDb.oi.order_id | 1 | 100.00 | Using where |
| 1 | SIMPLE | o | ref | xxxx | shop_id_fk | 4 | const | 65 | 100.00 | Using where |
| 1 | SIMPLE | oi | ref | xxx | order_id_fk | 4 | myDb.o.id | 1 | 100.00 | Using where |
I want to understand why the two MySQL versions treat the same query so differently, and how I can tweak MySQL 5.7 to be faster.
I'm not looking for help on rewriting the query to be faster, as that is something I am already doing on my own.
As can be read in the comments, @wchiquito suggested looking at optimizer_switch. There I found that the derived_merge switch can be set to off to fix this new, and in this specific case undesired, behaviour.
set session optimizer_switch='derived_merge=off'; fixes the problem.
(This can also be done with set global ... or be put in the my.cnf / my.ini)
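A minimal sketch of the steps above, using only the derived_merge flag named in this answer. Whether to change it per session or server-wide is a judgment call for your own setup, and the global variant is left commented out for that reason:

```sql
-- Inspect the current optimizer flags; in MySQL 5.7 derived_merge is on by default.
SELECT @@optimizer_switch;

-- Disable merging of derived tables for this connection only.
SET SESSION optimizer_switch = 'derived_merge=off';

-- Server-wide variant (applies to connections opened afterwards).
-- SET GLOBAL optimizer_switch = 'derived_merge=off';

-- Put the flag back to its default for this session when done testing.
SET SESSION optimizer_switch = 'derived_merge=default';
```

Re-running EXPLAIN on the slow query after the SET SESSION should show DERIVED rows again, as in the 5.5 plan.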
Building and maintaining a "Summary Table" would make this query run much faster than even 1 second.
Such a table would probably include shop_id, date, and some count.
More on summary tables.
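A rough sketch of what such a summary table could look like. The table name order_daily_summary, the column item_count and all column types are assumptions made for illustration, and time_start / time_end are assumed to live on the order table; adapt them to the real schema:

```sql
-- Hypothetical summary table: one row per shop and calendar day.
CREATE TABLE order_daily_summary (
  shop_id    INT NOT NULL,
  `date`     DATE NOT NULL,
  item_count INT UNSIGNED NOT NULL,
  PRIMARY KEY (shop_id, `date`)
) ENGINE=InnoDB;

-- Refresh a date range; rerunning it overwrites the counts for existing rows.
INSERT INTO order_daily_summary (shop_id, `date`, item_count)
SELECT o.shop_id, d.`date`, COUNT(*)
FROM dates AS d
JOIN `order` AS o
  ON d.`date` >= o.time_start AND d.`date` < o.time_end
JOIN order_items AS oi ON oi.order_id = o.id
WHERE d.`date` BETWEEN '2016-06-01' AND '2016-09-03'
GROUP BY o.shop_id, d.`date`
ON DUPLICATE KEY UPDATE item_count = VALUES(item_count);

-- The report then reads precomputed rows instead of joining the big tables.
SELECT d.`date`, COALESCE(s.item_count, 0) AS item_count
FROM dates AS d
LEFT JOIN order_daily_summary AS s
  ON s.`date` = d.`date` AND s.shop_id = 50
WHERE d.`date` BETWEEN '2016-06-01' AND '2016-09-03';
```

How often to refresh (nightly job, trigger, or application code) depends on how fresh the report needs to be.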
I too faced slow query execution after migrating to MySQL 5.7, and in my case even setting session optimizer_switch to 'derived_merge=off' didn't help.
Then, I followed this link: https://www.saotn.org/mysql-innodb-performance-improvement/ and the query's speed became normal.
To be specific, my change was just setting these four parameters in my.ini, as described in the link:
innodb_buffer_pool_size
innodb_buffer_pool_instances
innodb_write_io_threads
innodb_read_io_threads
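To see what these four settings currently are before editing my.ini, something like the following may help. The 4 GB figure is purely an illustrative assumption, not a recommendation from the linked article:

```sql
-- Show the current values of the four parameters listed above.
SHOW VARIABLES
WHERE Variable_name IN ('innodb_buffer_pool_size',
                        'innodb_buffer_pool_instances',
                        'innodb_write_io_threads',
                        'innodb_read_io_threads');

-- The buffer pool size can be changed at runtime in MySQL 5.7.5 and later.
-- 4294967296 bytes = 4 GB, an example value only.
SET GLOBAL innodb_buffer_pool_size = 4294967296;

-- The other three are not dynamic: they have to be set in my.ini / my.cnf
-- and only take effect after a server restart.
```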
I'm investigating some long-running queries in my PRODUCTION MySQL 5.7 database. One particular query is taking over 60 seconds.
My usual approach is to take a dump of the data from PROD, import it into a DEV database, reproduce the issue, then analyse and try out some tweaks to the query.
However, the exact same query in DEV is taking less than a second.
Obviously, the MySQL configuration, table structure, record counts, etc. are all the same as in PROD.
The query itself is a select with joins across 3 tables with a where clause on each table; 2 of the tables have approx 15m records in them. My initial suspicion was the lack of indexes on the queried columns, but the fact that in DEV it runs very fast would appear to disprove that.
What can I do to shed some light on this?
EXPLAIN results of my query:
PROD
EXPLAIN select this_.id as y0_ from event this_ inner join member m1_ on this_.member_id=m1_.id inner join event_type et2_ on this_.type_id=et2_.id where m1_.submission_id=40646 and this_.status in ('SUPPRESSED') and et2_.name in ('Salary') order by m1_.ni_number asc, m1_.ident1 asc, m1_.ident2 asc, m1_.ident3 asc, m1_.id asc, et2_.name asc limit 15;
+----+-------------+-------+------------+--------+-------------------------------------+-------------------+---------+--------------------------+------+----------+----------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+--------+-------------------------------------+-------------------+---------+--------------------------+------+----------+----------------------------------------------+
| 1 | SIMPLE | et2_ | NULL | ALL | PRIMARY | NULL | NULL | NULL | 17 | 10.00 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | this_ | NULL | ref | FK5C6729A2434DA80,FK5C6729AE4E22C6E | FK5C6729AE4E22C6E | 8 | iconnect.et2_.id | 4166 | 10.00 | Using where |
| 1 | SIMPLE | m1_ | NULL | eq_ref | PRIMARY,IND_submission_id | PRIMARY | 8 | iconnect.this_.member_id | 1 | 5.00 | Using where |
+----+-------------+-------+------------+--------+-------------------------------------+-------------------+---------+--------------------------+------+----------+----------------------------------------------+
3 rows in set, 1 warning (0.00 sec)
DEV
EXPLAIN select this_.id as y0_ from event this_ inner join member m1_ on this_.member_id=m1_.id inner join event_type et2_ on this_.type_id=et2_.id where m1_.submission_id=40646 and this_.status in ('SUPPRESSED') and et2_.name in ('Salary') order by m1_.ni_number asc, m1_.ident1 asc, m1_.ident2 asc, m1_.ident3 asc, m1_.id asc, et2_.name asc limit 15;
+----+-------------+-------+------------+------+-------------------------------------+-------------------+---------+-----------------+-------+----------+----------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+-------------------------------------+-------------------+---------+-----------------+-------+----------+----------------------------------------------+
| 1 | SIMPLE | et2_ | NULL | ALL | PRIMARY | NULL | NULL | NULL | 17 | 10.00 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | m1_ | NULL | ref | PRIMARY,IND_submission_id | IND_submission_id | 8 | const | 26644 | 100.00 | NULL |
| 1 | SIMPLE | this_ | NULL | ref | FK5C6729A2434DA80,FK5C6729AE4E22C6E | FK5C6729A2434DA80 | 8 | iconnect.m1_.id | 2 | 1.86 | Using where |
+----+-------------+-------+------------+------+-------------------------------------+-------------------+---------+-----------------+-------+----------+----------------------------------------------+
3 rows in set, 1 warning (0.03 sec)
I have also spotted that the cardinality of some of the indexes accessed by this query is massively different between DEV and PROD:
FK5C6729AE4E22C6E: DEV=9, PROD=3792
IND_submission_id: DEV=2490, PROD=74220
Could this be impacting performance in PROD?
The query inefficiency was down to the tables containing more data than the sampled index pages could represent, so the index statistics were off. Increasing innodb_stats_persistent_sample_pages from 20 to 100 and then running ANALYZE TABLE changed the execution plan to the expected one, after which the query took less than 1 second.
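The fix described above can be sketched roughly as follows. The table names come from the query in the question; the per-table ALTER at the end is an alternative added here, not something the answer itself did:

```sql
-- Check the Cardinality column before the change.
SHOW INDEX FROM event;
SHOW INDEX FROM member;

-- Sample more index pages when persistent statistics are recalculated.
SET GLOBAL innodb_stats_persistent_sample_pages = 100;

-- Recalculate statistics with the new sample size.
ANALYZE TABLE event, member, event_type;

-- Cardinality should now be closer to reality; re-run EXPLAIN to confirm the plan.
SHOW INDEX FROM event;

-- Alternative: raise the sample size for a single large table only.
-- ALTER TABLE event STATS_SAMPLE_PAGES = 100;
```

To survive a restart, the global setting would also need to go into my.cnf / my.ini.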
Today I stumbled upon two different forms of the same query (which return the very same result) that execute in very different durations:
ORIGINAL QUERY:
select count(distinct unit.ID)
from UNIT unit
left outer join AUTHORIZATION auth on unit.ID=auth.UNIT_ID
left outer join WORKFLOW_EXECUTION exec on unit.WORKFLOW_EXECUTION_ID=exec.ID
where
(
unit.RESPONSIBLE_ID=2
and
(
(
unit.STATUS<>'CLOSED'
and
unit.EXPECTEDRELEASEDATE is not null
)
or
exec.ACTIVE=1
)
)
or
(
exec.ACTIVE=1
and
auth.INTERVENTION=1
and
auth.SUBJECT_ID=2
);
plan:
+----+-------------+-------+------------+--------+-------------------------------------------------------------------+-------------------------------------+---------+----------------------------------+--------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+--------+-------------------------------------------------------------------+-------------------------------------+---------+----------------------------------+--------+----------+-------------+
| 1 | SIMPLE | unit | NULL | ALL | FK_UNIT_RESPONSIBLE_ID,IX_UNIT_STATUS,IX_UNIT_EXPECTEDRELEASEDATE | NULL | NULL | NULL | 451486 | 100.00 | NULL |
| 1 | SIMPLE | auth | NULL | ref | UK_AUTHORIZATION_UNIT_ID_SUBJECT_ID,FK_AUTHORIZATION_UNIT_ID | UK_AUTHORIZATION_UNIT_ID_SUBJECT_ID | 9 | edea2.unit.ID | 1 | 100.00 | Using where |
| 1 | SIMPLE | exec | NULL | eq_ref | PRIMARY | PRIMARY | 8 | edea2.unit.WORKFLOW_EXECUTION_ID | 1 | 100.00 | Using where |
+----+-------------+-------+------------+--------+-------------------------------------------------------------------+-------------------------------------+---------+----------------------------------+--------+----------+-------------+
duration:
+-------------------------+
| count(distinct unit.ID) |
+-------------------------+
| 538 |
+-------------------------+
1 row in set (2.46 sec)
Then I observed that when executing this query with only one of the top-level WHERE predicates, the duration decreases noticeably.
So I had the idea to rewrite it in a new style:
select count(distinct unit_root.ID)
from UNIT unit_root
where
unit_root.ID in
(
select unit.ID
from UNIT unit
left outer join WORKFLOW_EXECUTION exec on unit.WORKFLOW_EXECUTION_ID=exec.ID
where
(
unit.RESPONSIBLE_ID=2
and
(
(
unit.STATUS<>'CLOSED'
and
unit.EXPECTEDRELEASEDATE is not null
)
or
exec.ACTIVE=1
)
)
)
or
unit_root.ID in
(
select unit.ID
from UNIT unit
left outer join WORKFLOW_EXECUTION exec on unit.WORKFLOW_EXECUTION_ID=exec.ID
left outer join AUTHORIZATION auth on unit.ID=auth.UNIT_ID
where
(
exec.ACTIVE=1
and
auth.INTERVENTION=1
and
auth.SUBJECT_ID=2
)
);
plan:
+----+-------------+-----------+------------+--------+---------------------------------------------------------------------------------------------------------------------------+-----------------------------+---------+----------------------------------+--------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------+------------+--------+---------------------------------------------------------------------------------------------------------------------------+-----------------------------+---------+----------------------------------+--------+----------+--------------------------+
| 1 | PRIMARY | unit_root | NULL | index | PRIMARY,FK_UNIT_RESPONSIBLE_ID,FK_UNIT_WORKFLOW_EXECUTION_ID,IX_UNIT_EXPECTEDRELEASEDATE | IX_UNIT_EXPECTEDRELEASEDATE | 6 | NULL | 451486 | 100.00 | Using where; Using index |
| 3 | SUBQUERY | auth | NULL | ref | UK_AUTHORIZATION_UNIT_ID_SUBJECT_ID,FK_AUTHORIZATION_UNIT_ID,FK_AUTHORIZATION_SUBJECT_ID,IX_AUTHORIZATION_INTERVENTION | FK_AUTHORIZATION_SUBJECT_ID | 8 | const | 1 | 50.00 | Using where |
| 3 | SUBQUERY | unit | NULL | eq_ref | PRIMARY,FK_UNIT_WORKFLOW_EXECUTION_ID | PRIMARY | 8 | edea2.auth.UNIT_ID | 1 | 100.00 | NULL |
| 3 | SUBQUERY | exec | NULL | eq_ref | PRIMARY,IX_WORKFLOW_EXECUTION_ACTIVE | PRIMARY | 8 | edea2.unit.WORKFLOW_EXECUTION_ID | 1 | 26.47 | Using where |
| 2 | SUBQUERY | unit | NULL | ref | PRIMARY,FK_UNIT_RESPONSIBLE_ID,IX_UNIT_STATUS,IX_UNIT_EXPECTEDRELEASEDATE | FK_UNIT_RESPONSIBLE_ID | 8 | const | 225743 | 100.00 | NULL |
| 2 | SUBQUERY | exec | NULL | eq_ref | PRIMARY | PRIMARY | 8 | edea2.unit.WORKFLOW_EXECUTION_ID | 1 | 100.00 | Using where |
+----+-------------+-----------+------------+--------+---------------------------------------------------------------------------------------------------------------------------+-----------------------------+---------+----------------------------------+--------+----------+--------------------------+
duration:
+------------------------------+
| count(distinct unit_root.ID) |
+------------------------------+
| 538 |
+------------------------------+
1 row in set (0.51 sec)
Finally, the questions:
Why is there such a difference? Shouldn't the optimizer be able to optimize this kind of query?
Is there a tip to quickly identify this kind of query without having to measure execution time or investigate the query plan?
Any tips on how to rewrite to the faster style?
Note that I'm using MySQL 5.7.10 and these queries are generated by Hibernate.
Thank you
You have too many questions, but here is some background.
First, you are asking way too much of the optimizer. I haven't looked through the details of your queries, but they may not be exactly the same logically. For instance, NULL values or duplicate values in a table might cause differences.
Second, it is often said (by me and many others) that SQL is a descriptive language and not a procedural language. However, this doesn't extend to query rewrites. It usually means that the optimizer can rearrange joins, use indexes, choose appropriate algorithms for joins and aggregations, decide the optimal place to do filtering, and a few other things. However, the basic structure of the processing is put in place by the query as written.
In terms of hints: the use of GROUP BY and DISTINCT slows queries down. Don't avoid them! They are necessary parts of the language. But as you discovered with these two queries, there can be savings. Another danger is OR, because it can prevent the use of relevant indexes.
Judging by the complexity of logic in the subqueries, it might be possible to further optimize the queries.
Finally, if you want to write efficient queries, you cannot depend on the query optimizer. You will need to put the work in to learn how the query optimizer works, as well as something about the algorithms used for different components of the query, and fundamental optimization techniques such as indexing and partitioning.
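One common way to sidestep the OR problem mentioned above is to split the two branches into a UNION, so each branch can pick its own index. The following is a sketch based on the query in the question, not something benchmarked against that data; the derived-table alias matched is made up:

```sql
SELECT COUNT(*)
FROM (
    -- Branch 1: units the user is responsible for.
    SELECT unit.ID
    FROM UNIT unit
    LEFT OUTER JOIN WORKFLOW_EXECUTION exec
      ON unit.WORKFLOW_EXECUTION_ID = exec.ID
    WHERE unit.RESPONSIBLE_ID = 2
      AND ((unit.STATUS <> 'CLOSED' AND unit.EXPECTEDRELEASEDATE IS NOT NULL)
           OR exec.ACTIVE = 1)

    UNION  -- UNION (not UNION ALL) removes duplicate IDs across branches

    -- Branch 2: units the user may intervene on. The WHERE conditions
    -- reject NULL-extended rows, so inner joins are sufficient here.
    SELECT unit.ID
    FROM UNIT unit
    JOIN WORKFLOW_EXECUTION exec ON unit.WORKFLOW_EXECUTION_ID = exec.ID
    JOIN AUTHORIZATION auth ON unit.ID = auth.UNIT_ID
    WHERE exec.ACTIVE = 1
      AND auth.INTERVENTION = 1
      AND auth.SUBJECT_ID = 2
) AS matched;
```

Because UNION already deduplicates, COUNT(*) over the derived table plays the role of count(distinct unit.ID). Since the queries are generated by Hibernate, this form would probably have to be issued as a native query.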
I don't have a lot of experience yet with MySQL and with databases in general, though I'm going head on into the development of a large-scale web app anyway. The following is the search query for my app that allows users to search for other users. Now that the primary table for this query dev_Profile has about 14K rows, the query is considerably slow (about 5 secs when running a query that returns the largest set possible). I'm sure there are many optimization tweaks that could be made here, but would creating an index be the most fundamental first step to do here? I've been first trying to learn about indexes on my own, and how to make an index for a query with multiple joins, but I'm just not quite grasping it. I'm hoping that seeing things in the context of my actual query could be more educational.
Here's the basic query:
SELECT
dev_Profile.ID AS pid,
dev_Profile.Name AS username,
IF(TIMESTAMPDIFF(SECOND, st1.lastActivityTime, UTC_TIMESTAMP()) > 300 OR ISNULL(TIMESTAMPDIFF(SECOND, st1.lastActivityTime, UTC_TIMESTAMP())), 0, 1) AS online,
FLOOR(DATEDIFF(CURRENT_DATE, dev_Profile.DOB) / 365) AS age,
IF(dev_Profile.GenderID=1, 'M', 'F') AS sex,
IF(ISNULL(st2.Description), 0, st2.Description) AS relStatus,
st3.Name AS country,
IF(dev_Profile.RegionID > 0, st4.Name, 0) AS region,
IF(dev_Profile.CityID > 0, st5.Name, 0) AS city,
IF(ISNULL(st6.filename), 0, IF(st6.isApproved=1 AND st6.isDiscarded=0 AND st6.isModerated=1 AND st6.isRejected=0 AND isSizeAvatar=1, 1, 0)) AS hasPhoto,
IF(ISNULL(st6.filename), IF(dev_Profile.GenderID=1, 'http://www.mysite.com/lib/images/avatar-male-small.png', 'http://www.mysite.com/lib/images/avatar-female-small.png'), IF(st6.isApproved=1 AND st6.isDiscarded=0 AND st6.isModerated=1 AND st6.isRejected=0 AND isSizeAvatar=1, CONCAT('http://www.mysite.com/uploads/', st6.filename), IF(dev_Profile.GenderID=1, 'http://www.mysite.com/lib/images/avatar-male-small.png', 'http://www.mysite.com/lib/images/avatar-female-small.png'))) AS photo,
IF(ISNULL(dev_Profile.StatusMessage), IF(ISNULL(dev_Profile.AboutMe), IF(ISNULL(st7.AboutMyMatch), 0, st7.AboutMyMatch), dev_Profile.AboutMe), dev_Profile.StatusMessage) AS text
FROM
dev_Profile
LEFT JOIN dev_User AS st1 ON st1.ID = dev_Profile.UserID
LEFT JOIN dev_ProfileRelationshipStatus AS st2 ON st2.ID = dev_Profile.ProfileRelationshipStatusID
LEFT JOIN Country AS st3 ON st3.ID = dev_Profile.CountryID
LEFT JOIN Region AS st4 ON st4.ID = dev_Profile.RegionID
LEFT JOIN City AS st5 ON st5.ID = dev_Profile.CityID
LEFT JOIN dev_Photos AS st6 ON st6.ID = dev_Profile.PhotoAvatarID
LEFT JOIN dev_DesiredMatch AS st7 ON st7.ProfileID = dev_Profile.ID
WHERE
dev_Profile.ID != 11222 /* $_SESSION['ProfileID'] */
AND st1.EmailVerified = 'true'
AND st1.accountIsActive=1
ORDER BY st1.lastActivityTime DESC LIMIT 900;
The speed of this query (too slow, as you can see):
900 rows in set (5.20 sec)
The EXPLAIN for this query:
+----+-------------+-------------+--------+---------------+---------+---------+---------------------------------------------+-------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------+--------+---------------+---------+---------+---------------------------------------------+-------+----------------------------------------------+
| 1 | SIMPLE | dev_Profile | range | PRIMARY | PRIMARY | 4 | NULL | 13503 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | st2 | eq_ref | PRIMARY | PRIMARY | 1 | syk.dev_Profile.ProfileRelationshipStatusID | 1 | |
| 1 | SIMPLE | st3 | eq_ref | PRIMARY | PRIMARY | 4 | syk.dev_Profile.CountryID | 1 | |
| 1 | SIMPLE | st4 | eq_ref | PRIMARY | PRIMARY | 4 | syk.dev_Profile.RegionID | 1 | |
| 1 | SIMPLE | st5 | eq_ref | PRIMARY | PRIMARY | 4 | syk.dev_Profile.CityID | 1 | |
| 1 | SIMPLE | st1 | eq_ref | PRIMARY | PRIMARY | 4 | syk.dev_Profile.UserID | 1 | Using where |
| 1 | SIMPLE | st6 | eq_ref | PRIMARY | PRIMARY | 4 | syk.dev_Profile.PhotoAvatarID | 1 | |
| 1 | SIMPLE | st7 | ALL | NULL | NULL | NULL | NULL | 442 | |
+----+-------------+-------------+--------+---------------+---------+---------+---------------------------------------------+-------+----------------------------------------------+
It's also possible that the query can have more WHERE and HAVING clauses, if a user's search contains additional criteria. The additional clauses are (set with example values):
AND dev_Profile.GenderID = 1
AND dev_Profile.CountryID=127
AND dev_Profile.RegionID=36
AND dev_Profile.CityID=601
HAVING (age >= 18 AND age <= 50)
AND online=1
AND hasPhoto=1
This is the EXPLAIN for the query using all possible WHERE and HAVING clauses:
+----+-------------+-------------+--------+---------------+---------+---------+---------------------------------------------+-------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------+--------+---------------+---------+---------+---------------------------------------------+-------+----------------------------------------------+
| 1 | SIMPLE | dev_Profile | range | PRIMARY | PRIMARY | 4 | NULL | 13503 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | st2 | eq_ref | PRIMARY | PRIMARY | 1 | syk.dev_Profile.ProfileRelationshipStatusID | 1 | |
| 1 | SIMPLE | st3 | const | PRIMARY | PRIMARY | 4 | const | 1 | |
| 1 | SIMPLE | st4 | const | PRIMARY | PRIMARY | 4 | const | 1 | |
| 1 | SIMPLE | st5 | const | PRIMARY | PRIMARY | 4 | const | 1 | |
| 1 | SIMPLE | st1 | eq_ref | PRIMARY | PRIMARY | 4 | syk.dev_Profile.UserID | 1 | Using where |
| 1 | SIMPLE | st6 | eq_ref | PRIMARY | PRIMARY | 4 | syk.dev_Profile.PhotoAvatarID | 1 | |
| 1 | SIMPLE | st7 | ALL | NULL | NULL | NULL | NULL | 442 | |
+----+-------------+-------------+--------+---------------+---------+---------+---------------------------------------------+-------+----------------------------------------------+
I'm not even sure if this is TMI or not enough.
Is an index the right step to take here? If so, could someone get me going in the right direction?
The right step is whatever speeds up your query!
With your original query, I would say that you end up doing a table scan on the dev_Profile table as there are no indexable conditions on it. With your modified query, it depends on the number of different values allowed in the column: if there are many duplicates then the index may not get used, as it has to fetch the table anyway in order to complete the rest of the query.
If I have read your plan correctly, then you are joining all of your other tables on an indexed non-nullable column already (except for st7, which doesn't seem to be using an index for some reason). It therefore looks as if you should not be using left joins. This would then allow the use of an index on (EmailVerified, accountIsActive, lastActivityTime) on table st1.
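As a sketch of what that could look like: the first index is the one suggested above, while the index on dev_DesiredMatch.ProfileID is an additional assumption to address the ALL scan on st7 in the plan. Both index names are made up:

```sql
-- Composite index on dev_User (alias st1): the two equality filters first,
-- then the ORDER BY column.
ALTER TABLE dev_User
  ADD INDEX idx_user_verified_active_activity
      (EmailVerified, accountIsActive, lastActivityTime);

-- st7 is scanned in full (type ALL, 442 rows) for every profile row;
-- an index on its join column should turn that into a ref lookup.
ALTER TABLE dev_DesiredMatch
  ADD INDEX idx_desiredmatch_profile (ProfileID);
```

After adding them, re-run EXPLAIN with dev_User joined via INNER JOIN and check whether st1 becomes the first table in the plan.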
One should use indexes that are relevant for frequent queries. An index only slightly degrades write performance while immensely speeding up searches. As a rule of thumb, an object's own ID should be indexed as the PRIMARY key, and it's a good idea to have an index on column groups that always appear together in a query. I figure you should index GenderID, CountryID, RegionID and CityID; age, online and hasPhoto are computed in the SELECT, so they could only be indexed if they were stored as real columns. You should provide the schema of at least dev_Profile if you think that the right indexes are not used.
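For the optional search filters on dev_Profile, one possible composite index is sketched below. The index name and the column order are assumptions; the best order depends on which filters are actually used most often, since MySQL can only use a leftmost prefix of the index:

```sql
-- Covers searches that filter on country, then region, then city, then gender.
ALTER TABLE dev_Profile
  ADD INDEX idx_profile_location_gender
      (CountryID, RegionID, CityID, GenderID);

-- List the indexes that now exist on the table.
SHOW INDEX FROM dev_Profile;
```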
Note that country/region/city IDs might represent redundant information. Your design may be suboptimal.
Note 2: you're doing an awful lot of application logic in the SELECT. SQL is not designed for these nested IF-in-IF-in-IF clauses, and because of the URLs the query returns a much larger result than it would if you just requested the relevant fields (i.e. filename, GenderID, and so on). There might be times when those exact interpreted values have to be returned by the query, but in general you are better off (in terms of speed and readability) coding these processing steps in your application code.
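A trimmed-down version of the query along those lines might look like this. It is only a sketch: the short aliases are made up, only a subset of the joins is shown, and the avatar URL, age and online flag would then be derived in application code from the raw columns:

```sql
SELECT
    p.ID   AS pid,
    p.Name AS username,
    p.GenderID,
    p.DOB,
    p.StatusMessage,
    u.lastActivityTime,
    ph.filename,
    ph.isApproved,
    ph.isDiscarded,
    ph.isModerated,
    ph.isRejected,
    isSizeAvatar          -- left unqualified, as in the original query
FROM dev_Profile AS p
-- INNER JOIN: the WHERE clause filters on dev_User columns anyway.
JOIN dev_User AS u ON u.ID = p.UserID
LEFT JOIN dev_Photos AS ph ON ph.ID = p.PhotoAvatarID
WHERE p.ID != 11222
  AND u.EmailVerified = 'true'
  AND u.accountIsActive = 1
ORDER BY u.lastActivityTime DESC
LIMIT 900;
```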
I have a query that is taking way too long to execute (4 seconds) even though all the fields I am querying against are indexed. Below are the query and the EXPLAIN results. Any ideas what the problem is? (MySQL CPU usage shoots up to 100% when executing the query.)
EXPLAIN SELECT count(hd.did) as NumPo, `hd`.`sid`, `src`.`Name`
FROM (`hd`)
JOIN `result` ON `result`.`did` = `hd`.`did`
JOIN `sf` ON `sf`.`fid` = `hd`.`fid`
JOIN `src` ON `src`.`sid` = `hd`.`sid`
WHERE `sf`.`tid` = 2
AND `result`.`set` = 'xxxxxxx'
GROUP BY `hd`.`sid`
ORDER BY `NumPo` DESC
LIMIT 10;
+----+-------------+--------------+--------+-------------------------+---------+---------+--------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------+--------+-------------------------+---------+---------+--------------------------+------+----------------------------------------------+
| 1 | SIMPLE | sf | ref | PRIMARY,type | type | 2 | const | 4 | Using index; Using temporary; Using filesort |
| 1 | SIMPLE | hd | ref | PRIMARY,sid,fid | FeedID | 4 | f2.sf.fid | 3 | |
| 1 | SIMPLE | result | ALL | resultset | NULL | NULL | NULL | 5322 | Using where; Using join buffer |
| 1 | SIMPLE | src | eq_ref | PRIMARY | PRIMARY | 4 | f2.hd.sid | 1 | |
+----+-------------+--------------+--------+-------------------------+---------+---------+--------------------------+------+----------------------------------------------+
| 1 | SIMPLE | result | ALL | resultset | NULL | NULL | NULL | 5322 | Using where; Using join buffer |
It looks like it's not using an index on the biggest table. I'm having trouble guessing what this query is supposed to do, but it looks like you have an index on result.set, so I'd try adding one to result.did and see if it helps.
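That suggestion could be tried as follows. The index names are made up, and the composite variant is an extra assumption that may or may not beat the single-column index on your data:

```sql
-- Index the join column so `result` can be looked up per hd row
-- instead of being scanned in full (type ALL, 5322 rows).
ALTER TABLE `result` ADD INDEX idx_result_did (`did`);

-- Possible alternative: one index serving both the filter and the join.
-- ALTER TABLE `result` ADD INDEX idx_result_set_did (`set`, `did`);
```

Re-run the EXPLAIN afterwards and check that the result row no longer shows "Using join buffer".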
I've recently noticed that a query I have is running quite slowly, at almost 1 second per query.
The query looks like this
SELECT eventdate.id,
eventdate.eid,
eventdate.date,
eventdate.time,
eventdate.title,
eventdate.address,
eventdate.rank,
eventdate.city,
eventdate.state,
eventdate.name,
source.link,
type,
eventdate.img
FROM source
RIGHT OUTER JOIN
(
SELECT event.id,
event.date,
users.name,
users.rank,
users.eid,
event.address,
event.city,
event.state,
event.lat,
event.`long`,
GROUP_CONCAT(types.type SEPARATOR ' | ') AS type
FROM event FORCE INDEX (latlong_idx)
JOIN users ON event.uid = users.id
JOIN types ON users.tid=types.id
WHERE `long` BETWEEN -74.36829174058 AND -73.64365405942
AND lat BETWEEN 40.35195025942 AND 41.07658794058
AND event.date >= '2009-10-15'
GROUP BY event.id, event.date
ORDER BY event.date, users.rank DESC
LIMIT 0, 20
)eventdate
ON eventdate.uid = source.uid
AND eventdate.date = source.date;
and the explain is
+----+-------------+------------+--------+---------------+-------------+---------+------------------------------+-------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+--------+---------------+-------------+---------+------------------------------+-------+---------------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 20 | |
| 1 | PRIMARY | source | ref | iddate_idx | iddate_idx | 7 | eventdate.id,eventdate.date | 156 | |
| 2 | DERIVED | event | ALL | latlong_idx | NULL | NULL | NULL | 19500 | Using temporary; Using filesort |
| 2 | DERIVED | types | ref | eid_idx | eid_idx | 4 | active.event.id | 10674 | Using index |
| 2 | DERIVED | users | eq_ref | id_idx | id_idx | 4 | active.types.id | 1 | Using where |
+----+-------------+------------+--------+---------------+-------------+---------+------------------------------+-------+---------------------------------+
I've tried using 'force index' on latlong, but that doesn't seem to speed things up at all.
Is it the derived table that is causing the slow responses? If so, is there a way to improve the performance of this?
--------EDIT-------------
I've attempted to improve the formatting to make it more readable. I also ran the same query, changing only the WHERE clause to:
WHERE users.id = (
SELECT users.id
FROM users
WHERE uidname = 'frankt1'
ORDER BY users.approved DESC , users.rank DESC
LIMIT 1 )
AND date >= '2009-10-15'
GROUP BY date
ORDER BY date)
That query runs in 0.006 seconds
the explain looks like
+----+-------------+------------+-------+---------------+---------------+---------+------------------------------+------+----------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+-------+---------------+---------------+---------+------------------------------+------+----------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 42 | |
| 1 | PRIMARY | source | ref | iddate_idx | iddate_idx | 7 | eventdate.id,eventdate.date | 156 | |
| 2 | DERIVED | users | const | id_idx | id_idx | 4 | | 1 | |
| 2 | DERIVED | event | range | eiddate_idx | eiddate_idx | 7 | NULL | 24 | Using where |
| 2 | DERIVED | types | ref | eid_idx | eid_idx | 4 | active.event.bid | 3 | Using index |
| 3 | SUBQUERY | users | ALL | idname_idx | idname_idx | 767 | | 5 | Using filesort |
+----+-------------+------------+-------+---------------+---------------+---------+------------------------------+------+----------------+
The only way to clean up that mammoth SQL statement is to go back to the drawing board and carefully work through your database design and requirements. As soon as you start joining 6 tables and using an inner select you should expect incredible execution times.
As a start, ensure that all your id fields are indexed, but better to ensure that your design is valid. I don't know where to START looking at your SQL - even after I reformatted it for you.
Note that 'using indexes' means you need to issue the correct instructions when you CREATE or ALTER the tables you are using. See for instance MySql 5.0 create indexes
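For reference, adding an index to an existing table looks roughly like this. The columns are the join columns from the question's query and the index names are invented; skip any column that SHOW INDEX reports as already indexed:

```sql
-- See which indexes already exist.
SHOW INDEX FROM event;
SHOW INDEX FROM users;

-- Two equivalent ways of adding an index to an existing table.
ALTER TABLE event ADD INDEX uid_idx (uid);
CREATE INDEX tid_idx ON users (tid);
```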