MySQL view taking too much time to select data

On the web page I'm working on, I need to show some statistics based on various user details that are stored in three tables. So I have the following query, which I then join to several other tables:
SELECT *
FROM `user` `u`
LEFT JOIN `subscriptions` `s` ON `u`.`user_id` = `s`.`user_id`
LEFT JOIN `devices` `ud` ON `u`.`user_id` = `ud`.`user_id`
GROUP BY `u`.`user_id`
When I execute the query with LIMIT 1000 it takes about 0.05 seconds and since I'm using the data from all the three tables in a lot of queries I've decided to put it inside a VIEW:
CREATE VIEW `user_details` AS ( the same query from above )
And now when I run:
SELECT * FROM user_details LIMIT 1000
it takes about 7-10 seconds.
So my question is: can I do something to optimize the view, since the query itself seems to be pretty quick, or should I use the whole query instead of the view?
Edit: this is what EXPLAIN SELECT * FROM user_details returns
+----+-------------+------------+--------+----------------+----------------+---------+------------------------+--------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+--------+----------------+----------------+---------+------------------------+--------+-------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 322666 | |
| 2 | DERIVED | u | index | NULL | PRIMARY | 4 | NULL | 372587 | |
| 2 | DERIVED | s | eq_ref | PRIMARY | PRIMARY | 4 | db_users.u.user_id | 1 | |
| 2 | DERIVED | ud | ref | device_id_name | device_id_name | 4 | db_users.u.user_id | 1 | |
+----+-------------+------------+--------+----------------+----------------+---------+------------------------+--------+-------+
4 rows in set (8.67 sec)
This is what EXPLAIN returns for the query:
+----+-------------+-------+--------+----------------+----------------+---------+------------------------+--------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+----------------+----------------+---------+------------------------+--------+-------+
| 1 | SIMPLE | u | index | NULL | PRIMARY | 4 | NULL | 372587 | |
| 1 | SIMPLE | s | eq_ref | PRIMARY | PRIMARY | 4 | db_users.u.user_id | 1 | |
| 1 | SIMPLE | ud | ref | device_id_name | device_id_name | 4 | db_users.u.user_id | 1 | |
+----+-------------+-------+--------+----------------+----------------+---------+------------------------+--------+-------+
3 rows in set (0.00 sec)

Views and joins can be very bad for performance, and this is more or less true for all relational database management systems. That sounds strange, since joins are what those systems are designed for, but it is true nevertheless. In your case the EXPLAIN output shows why: the view appears as <derived2>, which means MySQL first materializes the whole grouped result (roughly 370,000 users) into a temporary table and only then applies LIMIT 1000, whereas the plain query can stop after the first 1000 groups.
Try to avoid the joins if this is a heavily used query on your page: instead, create a real table (not a view) that is filled from the three tables. You can automate that process using triggers, so that each time an entry is inserted into one of the original tables, the triggers take care of propagating the data to the physical user_details table.
This strategy certainly means a one-time investment for the setup, but you will definitely get much better performance.
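A minimal sketch of the trigger approach described above, in MySQL syntax. The table name user_details_cache, the trigger names, and the columns plan_name and device_name are hypothetical; only the three source tables and their user_id columns come from the question.

```sql
-- Hypothetical physical table that takes the place of the view.
-- plan_name and device_name are assumed columns, not taken from the question.
CREATE TABLE user_details_cache (
    user_id     INT NOT NULL PRIMARY KEY,
    plan_name   VARCHAR(50)  NULL,
    device_name VARCHAR(100) NULL
);

-- One-time fill: one row per user, like the GROUP BY in the original query.
INSERT INTO user_details_cache (user_id, plan_name, device_name)
SELECT u.user_id, MIN(s.plan_name), MIN(ud.device_name)
FROM `user` u
LEFT JOIN `subscriptions` s ON u.user_id = s.user_id
LEFT JOIN `devices` ud ON u.user_id = ud.user_id
GROUP BY u.user_id;

-- Keep the table current when a new user is inserted.
CREATE TRIGGER user_after_insert AFTER INSERT ON `user`
FOR EACH ROW
    INSERT INTO user_details_cache (user_id) VALUES (NEW.user_id);

-- Propagate a new subscription to the matching user row.
CREATE TRIGGER subscriptions_after_insert AFTER INSERT ON `subscriptions`
FOR EACH ROW
    UPDATE user_details_cache
    SET plan_name = NEW.plan_name
    WHERE user_id = NEW.user_id;
```

A complete setup would also need a similar trigger on devices, plus UPDATE and DELETE triggers on all three tables. Selecting from user_details_cache with LIMIT 1000 then reads a plain indexed table instead of a materialized view.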

Related

MySQL query performance: individuate and rewrite faster query

Today I stumbled upon two different forms of the same query (which return the very same result) that execute in very different durations:
ORIGINAL QUERY:
select count(distinct unit.ID)
from UNIT unit
left outer join AUTHORIZATION auth on unit.ID=auth.UNIT_ID
left outer join WORKFLOW_EXECUTION exec on unit.WORKFLOW_EXECUTION_ID=exec.ID
where
(
unit.RESPONSIBLE_ID=2
and
(
(
unit.STATUS<>'CLOSED'
and
unit.EXPECTEDRELEASEDATE is not null
)
or
exec.ACTIVE=1
)
)
or
(
exec.ACTIVE=1
and
auth.INTERVENTION=1
and
auth.SUBJECT_ID=2
);
plan:
+----+-------------+-------+------------+--------+-------------------------------------------------------------------+-------------------------------------+---------+----------------------------------+--------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+--------+-------------------------------------------------------------------+-------------------------------------+---------+----------------------------------+--------+----------+-------------+
| 1 | SIMPLE | unit | NULL | ALL | FK_UNIT_RESPONSIBLE_ID,IX_UNIT_STATUS,IX_UNIT_EXPECTEDRELEASEDATE | NULL | NULL | NULL | 451486 | 100.00 | NULL |
| 1 | SIMPLE | auth | NULL | ref | UK_AUTHORIZATION_UNIT_ID_SUBJECT_ID,FK_AUTHORIZATION_UNIT_ID | UK_AUTHORIZATION_UNIT_ID_SUBJECT_ID | 9 | edea2.unit.ID | 1 | 100.00 | Using where |
| 1 | SIMPLE | exec | NULL | eq_ref | PRIMARY | PRIMARY | 8 | edea2.unit.WORKFLOW_EXECUTION_ID | 1 | 100.00 | Using where |
+----+-------------+-------+------------+--------+-------------------------------------------------------------------+-------------------------------------+---------+----------------------------------+--------+----------+-------------+
duration:
+-------------------------+
| count(distinct unit.ID) |
+-------------------------+
| 538 |
+-------------------------+
1 row in set (2.46 sec)
Then I observed that when executing this query with only one of the WHERE predicates, the duration decreases noticeably.
So I had the idea to rewrite it in a new style:
select count(distinct unit_root.ID)
from UNIT unit_root
where
unit_root.ID in
(
select unit.ID
from UNIT unit
left outer join WORKFLOW_EXECUTION exec on unit.WORKFLOW_EXECUTION_ID=exec.ID
where
(
unit.RESPONSIBLE_ID=2
and
(
(
unit.STATUS<>'CLOSED'
and
unit.EXPECTEDRELEASEDATE is not null
)
or
exec.ACTIVE=1
)
)
)
or
unit_root.ID in
(
select unit.ID
from UNIT unit
left outer join WORKFLOW_EXECUTION exec on unit.WORKFLOW_EXECUTION_ID=exec.ID
left outer join AUTHORIZATION auth on unit.ID=auth.UNIT_ID
where
(
exec.ACTIVE=1
and
auth.INTERVENTION=1
and
auth.SUBJECT_ID=2
)
);
plan:
+----+-------------+-----------+------------+--------+---------------------------------------------------------------------------------------------------------------------------+-----------------------------+---------+----------------------------------+--------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------+------------+--------+---------------------------------------------------------------------------------------------------------------------------+-----------------------------+---------+----------------------------------+--------+----------+--------------------------+
| 1 | PRIMARY | unit_root | NULL | index | PRIMARY,FK_UNIT_RESPONSIBLE_ID,FK_UNIT_WORKFLOW_EXECUTION_ID,IX_UNIT_EXPECTEDRELEASEDATE | IX_UNIT_EXPECTEDRELEASEDATE | 6 | NULL | 451486 | 100.00 | Using where; Using index |
| 3 | SUBQUERY | auth | NULL | ref | UK_AUTHORIZATION_UNIT_ID_SUBJECT_ID,FK_AUTHORIZATION_UNIT_ID,FK_AUTHORIZATION_SUBJECT_ID,IX_AUTHORIZATION_INTERVENTION | FK_AUTHORIZATION_SUBJECT_ID | 8 | const | 1 | 50.00 | Using where |
| 3 | SUBQUERY | unit | NULL | eq_ref | PRIMARY,FK_UNIT_WORKFLOW_EXECUTION_ID | PRIMARY | 8 | edea2.auth.UNIT_ID | 1 | 100.00 | NULL |
| 3 | SUBQUERY | exec | NULL | eq_ref | PRIMARY,IX_WORKFLOW_EXECUTION_ACTIVE | PRIMARY | 8 | edea2.unit.WORKFLOW_EXECUTION_ID | 1 | 26.47 | Using where |
| 2 | SUBQUERY | unit | NULL | ref | PRIMARY,FK_UNIT_RESPONSIBLE_ID,IX_UNIT_STATUS,IX_UNIT_EXPECTEDRELEASEDATE | FK_UNIT_RESPONSIBLE_ID | 8 | const | 225743 | 100.00 | NULL |
| 2 | SUBQUERY | exec | NULL | eq_ref | PRIMARY | PRIMARY | 8 | edea2.unit.WORKFLOW_EXECUTION_ID | 1 | 100.00 | Using where |
+----+-------------+-----------+------------+--------+---------------------------------------------------------------------------------------------------------------------------+-----------------------------+---------+----------------------------------+--------+----------+--------------------------+
duration:
+------------------------------+
| count(distinct unit_root.ID) |
+------------------------------+
| 538 |
+------------------------------+
1 row in set (0.51 sec)
Finally, the questions:
Why is there such a difference? Shouldn't the optimizer be able to optimize this kind of query?
Is there a tip to quickly identify this kind of query without having to measure execution time or investigate the query plan?
Any tips on how to rewrite queries in the faster style?
Note that I'm using MySQL 5.7.10 and these queries are generated by Hibernate.
Thank you
You have too many questions, but here is some background.
First, you are asking way too much of the optimizer. I haven't looked through the details of your queries, but they may not be exactly the same logically. For instance, NULL values or duplicate values in a table might cause differences.
Second, it is often said (by me and many others) that SQL is a descriptive language and not a procedural language. However, this doesn't extend to query rewrites. It usually means that the optimizer can rearrange joins, use indexes, choose appropriate algorithms for joins and aggregations, decide the optimal place to do filtering, and a few other things. The basic structure of the processing, however, remains as written.
In terms of hints: the use of GROUP BY and DISTINCT slows queries down. Don't avoid them! They are necessary parts of the language. But as you discovered with these two queries, there can be savings. Another danger is OR, because it can prevent the use of relevant indexes.
Judging by the complexity of logic in the subqueries, it might be possible to further optimize the queries.
Finally, if you want to write efficient queries, you cannot depend on the query optimizer. You will need to put the work in to learn how the query optimizer works, as well as something about the algorithms used for different components of the query, and fundamental optimization techniques such as indexing and partitioning.
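One way to act on the OR advice above is to split the two OR branches into a UNION, so that each branch can be planned on its own. This is a sketch, not a tested rewrite: it assumes UNIT.ID is the primary key (as the plans suggest), and the alias matched is made up for illustration. UNION removes duplicate IDs, which takes over the role of COUNT(DISTINCT ...).

```sql
SELECT COUNT(*)
FROM (
    -- Branch 1: units the user is responsible for.
    SELECT unit.ID
    FROM UNIT unit
    LEFT OUTER JOIN WORKFLOW_EXECUTION exec
        ON unit.WORKFLOW_EXECUTION_ID = exec.ID
    WHERE unit.RESPONSIBLE_ID = 2
      AND (
            (unit.STATUS <> 'CLOSED' AND unit.EXPECTEDRELEASEDATE IS NOT NULL)
            OR exec.ACTIVE = 1
          )

    UNION  -- plain UNION (not UNION ALL) so each ID is counted once

    -- Branch 2: units the user may intervene on.
    -- Inner joins are enough here, because the WHERE conditions
    -- already reject rows where exec or auth is NULL.
    SELECT unit.ID
    FROM UNIT unit
    JOIN WORKFLOW_EXECUTION exec
        ON unit.WORKFLOW_EXECUTION_ID = exec.ID
    JOIN AUTHORIZATION auth
        ON unit.ID = auth.UNIT_ID
    WHERE exec.ACTIVE = 1
      AND auth.INTERVENTION = 1
      AND auth.SUBJECT_ID = 2
) AS matched;
```

Whether this beats the IN-subquery version on a given data set is something to confirm with EXPLAIN and timing, since the query is generated by Hibernate and the rewrite would have to be expressed there as well.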

Mysql 5.6 optimizer doesn't use indexes in small tables joins

We have two tables: the first is relatively big (contact table, 250k rows) and the second is small (user table, < 10 rows). On MySQL 5.6 I get the following EXPLAIN result:
EXPLAIN SELECT
o0_.id AS id_0,
o8_.first_name,
o8_.last_name
FROM
contact o0_
LEFT JOIN user o8_ ON o0_.user_owner_id = o8_.id
LIMIT
25 OFFSET 100
+----+-------------+-------+-------+---------------+----------------------+---------+------+--------+----------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+----------------------+---------+------+--------+----------------------------------------------------+
| 1 | SIMPLE | o0_ | index | NULL | IDX_403263ED9EB185F9 | 5 | NULL | 253030 | Using index |
| 1 | SIMPLE | o8_ | ALL | PRIMARY | NULL | NULL | NULL | 5 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-------+-------+---------------+----------------------+---------+------+--------+----------------------------------------------------+
2 rows in set (0,00 sec)
When I use FORCE INDEX FOR JOIN:
EXPLAIN SELECT
o0_.id AS id_0,
o8_.first_name,
o8_.last_name
FROM
contact o0_
LEFT JOIN user o8_ force index for join(`PRIMARY`) ON o0_.user_owner_id = o8_.id
LIMIT
25 OFFSET 100
or add an index on the fields that appear in the SELECT clause (first_name, last_name) of the user table:
alter table user add index(first_name, last_name);
Explain result changes to this:
+----+-------------+-------+--------+---------------+----------------------+---------+-------------------------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+----------------------+---------+-------------------------+--------+-------------+
| 1 | SIMPLE | o0_ | index | NULL | IDX_403263ED9EB185F9 | 5 | NULL | 253030 | Using index |
| 1 | SIMPLE | o8_ | eq_ref | PRIMARY | PRIMARY | 4 | o0_.user_owner_id | 1 | NULL |
+----+-------------+-------+--------+---------------+----------------------+---------+-------------------------+--------+-------------+
2 rows in set (0,00 sec)
On MySQL 5.5 I get the same EXPLAIN result without additional indexes:
+----+-------------+-------+--------+---------------+----------------------+---------+-------------------------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+----------------------+---------+-------------------------+--------+-------------+
| 1 | SIMPLE | o0_ | index | NULL | IDX_403263ED9EB185F9 | 5 | NULL | 255706 | Using index |
| 1 | SIMPLE | o8_ | eq_ref | PRIMARY | PRIMARY | 4 | o0_.user_owner_id | 1 | |
+----+-------------+-------+--------+---------------+----------------------+---------+-------------------------+--------+-------------+
2 rows in set (0.00 sec)
Why do I need to force the use of the PRIMARY index or add extra indexes on MySQL 5.6?
The same behavior occurs with other selects that join small tables.
If you have a table with so few rows, it may actually be faster to do a full table scan than to go to an index, locate the records, and then go back to the table. If you have other fields in the user table apart from the 3 in the query, then you may consider adding a covering index, but frankly, I do not think that any of this would have a significant effect on the speed of the query.
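For reference, a covering index for this query could look like the sketch below. The index name is hypothetical; the columns are the three that the query reads from the user table. Whether the optimizer actually picks it, and whether it makes any measurable difference on a table this small, has to be checked with EXPLAIN and timing.

```sql
-- Hypothetical index name; it contains every user column the query touches,
-- so the join could be answered from the index alone.
ALTER TABLE user ADD INDEX idx_user_id_names (id, first_name, last_name);

-- Re-check the plan afterwards and compare timings before keeping the index.
EXPLAIN SELECT
    o0_.id AS id_0,
    o8_.first_name,
    o8_.last_name
FROM contact o0_
LEFT JOIN user o8_ ON o0_.user_owner_id = o8_.id
LIMIT 25 OFFSET 100;
```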

Attempting to create index for MySQL query

I don't have a lot of experience yet with MySQL and with databases in general, though I'm going head on into the development of a large-scale web app anyway. The following is the search query for my app that allows users to search for other users. Now that the primary table for this query dev_Profile has about 14K rows, the query is considerably slow (about 5 secs when running a query that returns the largest set possible). I'm sure there are many optimization tweaks that could be made here, but would creating an index be the most fundamental first step to do here? I've been first trying to learn about indexes on my own, and how to make an index for a query with multiple joins, but I'm just not quite grasping it. I'm hoping that seeing things in the context of my actual query could be more educational.
Here's the basic query:
SELECT
dev_Profile.ID AS pid,
dev_Profile.Name AS username,
IF(TIMESTAMPDIFF(SECOND, st1.lastActivityTime, UTC_TIMESTAMP()) > 300 OR ISNULL(TIMESTAMPDIFF(SECOND, st1.lastActivityTime, UTC_TIMESTAMP())), 0, 1) AS online,
FLOOR(DATEDIFF(CURRENT_DATE, dev_Profile.DOB) / 365) AS age,
IF(dev_Profile.GenderID=1, 'M', 'F') AS sex,
IF(ISNULL(st2.Description), 0, st2.Description) AS relStatus,
st3.Name AS country,
IF(dev_Profile.RegionID > 0, st4.Name, 0) AS region,
IF(dev_Profile.CityID > 0, st5.Name, 0) AS city,
IF(ISNULL(st6.filename), 0, IF(st6.isApproved=1 AND st6.isDiscarded=0 AND st6.isModerated=1 AND st6.isRejected=0 AND isSizeAvatar=1, 1, 0)) AS hasPhoto,
IF(ISNULL(st6.filename), IF(dev_Profile.GenderID=1, 'http://www.mysite.com/lib/images/avatar-male-small.png', 'http://www.mysite.com/lib/images/avatar-female-small.png'), IF(st6.isApproved=1 AND st6.isDiscarded=0 AND st6.isModerated=1 AND st6.isRejected=0 AND isSizeAvatar=1, CONCAT('http://www.mysite.com/uploads/', st6.filename), IF(dev_Profile.GenderID=1, 'http://www.mysite.com/lib/images/avatar-male-small.png', 'http://www.mysite.com/lib/images/avatar-female-small.png'))) AS photo,
IF(ISNULL(dev_Profile.StatusMessage), IF(ISNULL(dev_Profile.AboutMe), IF(ISNULL(st7.AboutMyMatch), 0, st7.AboutMyMatch), dev_Profile.AboutMe), dev_Profile.StatusMessage) AS text
FROM
dev_Profile
LEFT JOIN dev_User AS st1 ON st1.ID = dev_Profile.UserID
LEFT JOIN dev_ProfileRelationshipStatus AS st2 ON st2.ID = dev_Profile.ProfileRelationshipStatusID
LEFT JOIN Country AS st3 ON st3.ID = dev_Profile.CountryID
LEFT JOIN Region AS st4 ON st4.ID = dev_Profile.RegionID
LEFT JOIN City AS st5 ON st5.ID = dev_Profile.CityID
LEFT JOIN dev_Photos AS st6 ON st6.ID = dev_Profile.PhotoAvatarID
LEFT JOIN dev_DesiredMatch AS st7 ON st7.ProfileID = dev_Profile.ID
WHERE
dev_Profile.ID != 11222 /* $_SESSION['ProfileID'] */
AND st1.EmailVerified = 'true'
AND st1.accountIsActive=1
ORDER BY st1.lastActivityTime DESC LIMIT 900;
The speed of this query (too slow, as you can see):
900 rows in set (5.20 sec)
The EXPLAIN for this query:
+----+-------------+-------------+--------+---------------+---------+---------+---------------------------------------------+-------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------+--------+---------------+---------+---------+---------------------------------------------+-------+----------------------------------------------+
| 1 | SIMPLE | dev_Profile | range | PRIMARY | PRIMARY | 4 | NULL | 13503 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | st2 | eq_ref | PRIMARY | PRIMARY | 1 | syk.dev_Profile.ProfileRelationshipStatusID | 1 | |
| 1 | SIMPLE | st3 | eq_ref | PRIMARY | PRIMARY | 4 | syk.dev_Profile.CountryID | 1 | |
| 1 | SIMPLE | st4 | eq_ref | PRIMARY | PRIMARY | 4 | syk.dev_Profile.RegionID | 1 | |
| 1 | SIMPLE | st5 | eq_ref | PRIMARY | PRIMARY | 4 | syk.dev_Profile.CityID | 1 | |
| 1 | SIMPLE | st1 | eq_ref | PRIMARY | PRIMARY | 4 | syk.dev_Profile.UserID | 1 | Using where |
| 1 | SIMPLE | st6 | eq_ref | PRIMARY | PRIMARY | 4 | syk.dev_Profile.PhotoAvatarID | 1 | |
| 1 | SIMPLE | st7 | ALL | NULL | NULL | NULL | NULL | 442 | |
+----+-------------+-------------+--------+---------------+---------+---------+---------------------------------------------+-------+----------------------------------------------+
It's also possible that the query can have more WHERE and HAVING clauses, if a user's search contains additional criteria. The additional clauses are (set with example values):
AND dev_Profile.GenderID = 1
AND dev_Profile.CountryID=127
AND dev_Profile.RegionID=36
AND dev_Profile.CityID=601
HAVING (age >= 18 AND age <= 50)
AND online=1
AND hasPhoto=1
This is the EXPLAIN for the query using all possible WHERE and HAVING clauses:
+----+-------------+-------------+--------+---------------+---------+---------+---------------------------------------------+-------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------+--------+---------------+---------+---------+---------------------------------------------+-------+----------------------------------------------+
| 1 | SIMPLE | dev_Profile | range | PRIMARY | PRIMARY | 4 | NULL | 13503 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | st2 | eq_ref | PRIMARY | PRIMARY | 1 | syk.dev_Profile.ProfileRelationshipStatusID | 1 | |
| 1 | SIMPLE | st3 | const | PRIMARY | PRIMARY | 4 | const | 1 | |
| 1 | SIMPLE | st4 | const | PRIMARY | PRIMARY | 4 | const | 1 | |
| 1 | SIMPLE | st5 | const | PRIMARY | PRIMARY | 4 | const | 1 | |
| 1 | SIMPLE | st1 | eq_ref | PRIMARY | PRIMARY | 4 | syk.dev_Profile.UserID | 1 | Using where |
| 1 | SIMPLE | st6 | eq_ref | PRIMARY | PRIMARY | 4 | syk.dev_Profile.PhotoAvatarID | 1 | |
| 1 | SIMPLE | st7 | ALL | NULL | NULL | NULL | NULL | 442 | |
+----+-------------+-------------+--------+---------------+---------+---------+---------------------------------------------+-------+----------------------------------------------+
I'm not even sure if this is TMI or not enough.
Is an index the right step to take here? If so, could someone get me going in the right direction?
The right step is whatever speeds up your query!
With your original query, I would say that you end up doing a table scan on the dev_Profile table as there are no indexable conditions on it. With your modified query, it depends on the number of different values allowed in the column: if there are many duplicates then the index may not get used, as it has to fetch the table anyway in order to complete the rest of the query.
If I have read your plan correctly, then you are joining all of your other tables on an indexed non-nullable column already (except for st7, which doesn't seem to be using an index for some reason). It therefore looks as if you should not be using left joins. This would then allow the use of an index on (EmailVerified, accountIsActive, lastActivityTime) on table st1.
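The two index changes implied above could be sketched as follows. The index names are made up; the table and column names come from the query in the question. The EXPLAIN shows st7 (dev_DesiredMatch) with type ALL and no possible keys, so the join column ProfileID is assumed to have no index yet.

```sql
-- Lets the st7 join look up rows by ProfileID instead of scanning
-- all ~442 rows of dev_DesiredMatch for every profile.
ALTER TABLE dev_DesiredMatch
    ADD INDEX idx_desiredmatch_profileid (ProfileID);

-- Composite index for the st1 filter and sort suggested above:
-- equality columns first, then the ORDER BY column.
ALTER TABLE dev_User
    ADD INDEX idx_user_verified_active_activity
        (EmailVerified, accountIsActive, lastActivityTime);

-- For the second index to drive the query, the join to dev_User would be
-- written as an inner join, for example:
--   JOIN dev_User AS st1 ON st1.ID = dev_Profile.UserID
```

Because the WHERE clause already requires st1.EmailVerified and st1.accountIsActive to be non-NULL, switching that one join from LEFT JOIN to JOIN should not change the result.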
One should use indexes that are relevant for frequent queries. An index only slightly degrades write performance while immensely speeding up searches. As a rule of thumb, an object's own ID should be indexed as the PRIMARY key, and it's a good idea to have an index on column groups that always appear together in a query. I figure you should index GenderID, CountryID, RegionID and CityID; age, online and hasPhoto are computed in the SELECT and cannot be indexed directly, so index the columns they are derived from (such as DOB and lastActivityTime) instead. You should provide the schema of at least dev_Profile if you think that the right indexes are not used.
Notice that country/region/city IDs might represent redundant information. Your design may be suboptimal.
Notice 2: you're doing an awful lot of application logic in the SELECT. SQL is not designed for this many nested IF-in-IF-in-IF clauses, and because of the URLs the query returns a much larger table than it would if you just requested the relevant fields (i.e. filename, GenderID, and so on). There might be times when those exact interpreted values have to be returned by the query, but in general you are better off (in terms of speed and readability) coding these processing steps into your application code.
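As an illustration of the point above, here is a trimmed-down sketch of the search query that returns raw columns and moves the age and online conditions from HAVING into WHERE, so they are expressed on the stored columns. It assumes DOB is a DATE column; the aliases p, u and ph are made up, and only a subset of the original columns is shown.

```sql
SELECT
    p.ID AS pid,
    p.Name AS username,
    p.GenderID,            -- map to 'M' / 'F' in application code
    p.DOB,                 -- compute the displayed age in application code
    u.lastActivityTime,
    ph.filename            -- build the photo URL in application code
FROM dev_Profile AS p
JOIN dev_User AS u ON u.ID = p.UserID
LEFT JOIN dev_Photos AS ph ON ph.ID = p.PhotoAvatarID
WHERE p.ID != 11222
  AND u.EmailVerified = 'true'
  AND u.accountIsActive = 1
  -- online = 1: last activity at most 300 seconds ago
  AND u.lastActivityTime >= UTC_TIMESTAMP() - INTERVAL 300 SECOND
  -- age >= 18 under the original FLOOR(DATEDIFF / 365) rule: 18 * 365 = 6570 days
  AND p.DOB <= CURRENT_DATE - INTERVAL 6570 DAY
  -- age <= 50 under the same rule: fewer than 51 * 365 = 18615 days
  AND p.DOB >= CURRENT_DATE - INTERVAL 18614 DAY
ORDER BY u.lastActivityTime DESC
LIMIT 900;
```

The day counts reproduce the original 365-day approximation of age rather than calendar years. Conditions written this way compare a bare column against a constant, which is what allows an index on DOB or lastActivityTime to be considered at all.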

same query, dramatic different performances on different data. MySQL

So, I have this huge database with 40 million entries.
The query is a simple one (a_view is a view!):
select * from a_view where id > x LIMIT 10000
This is the behavior I get: if x is a small number (int), the query is super fast.
When x > 29 million, the query starts to take minutes. If it is closer to 30 million, it takes hours, and so on.
Why is that? What can I do to avoid this?
I am using InnoDB as the engine, and the tables have indexes.
The value of the limit is critical: it affects performance. If it is small, the query is always fast. But if x is close to 30 million, then I need to be very careful not to set it too big (less than 300 hundreds), and still it is quite slow, but doesn't take forever.
If you need more details, feel free to ask.
EDIT: here is the explain
+----+-------------+-------+--------+-----------------+---------+---------+---------------------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+-----------------+---------+---------+---------------------+---------+-------------+
| 1 | SIMPLE | aH | index | PRIMARY | PRIMARY | 39 | NULL | 3028439 | Using index |
| 1 | SIMPLE | a | eq_ref | PRIMARY | PRIMARY | 4 | odb.aH.albumID | 1 | Using where |
| 1 | SIMPLE | aHT | ref | PRIMARY,albumID | albumID | 4 | odb.a.albumID | 4 | |
| 1 | SIMPLE | t | eq_ref | PRIMARY | PRIMARY | 4 | odb.aHT.id | 1 | Using where |
| 1 | SIMPLE | g | eq_ref | PRIMARY | PRIMARY | 4 | odb.t.genre | 1 | |
| 1 | SIMPLE | ar | eq_ref | PRIMARY | PRIMARY | 4 | odb.t.artist | 1 | |
+----+-------------+-------+--------+-----------------+---------+---------+---------------------+---------+-------------+
Here is a guess. Basically, your view is a select on some tables. The "id" could be a row number. The larger your "x" is, the more select rows need to be created (and discarded) before you can get whatever data you want. That is why your query slows down when your "x" increases.
If this is true, one solution could be to create a table that contains the rownum and a primary key sorted by whatever "order by" you are using. Once you have that table, you can join it with the rest of your data and select your data window by a rownum range.
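A rough sketch of that row-number table, under the same guess as above. The table name a_view_window and the index name are hypothetical, and the view is assumed to expose an integer column called id; none of this is confirmed by the question.

```sql
-- Hypothetical lookup table: one row number per id exposed by the view.
CREATE TABLE a_view_window (
    rownum INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    id     INT NOT NULL,
    KEY idx_a_view_window_id (id)
);

-- One-time (or periodic) fill in the desired order. The generated rownum
-- values are expected to follow the ORDER BY, but that should be verified.
INSERT INTO a_view_window (id)
SELECT id FROM a_view ORDER BY id;

-- Fetch one window of 10000 rows by row-number range, not by id > x.
SELECT v.*
FROM a_view_window AS w
JOIN a_view AS v ON v.id = w.id
WHERE w.rownum BETWEEN 29000001 AND 29010000
ORDER BY w.rownum;
```

The fill itself has to read the whole view once, so this only pays off if the same windows are read repeatedly and the underlying data changes rarely.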

Currently using View, Should I use a hard table instead?

I am currently debating whether mapping_uGroups_uProducts, which is a view defined as follows, should become a hard table instead:
CREATE ALGORITHM=UNDEFINED DEFINER=`root`@`localhost`
SQL SECURITY DEFINER VIEW `db`.`mapping_uGroups_uProducts`
AS select distinct `X`.`upID` AS `upID`,`Z`.`ugID` AS `ugID` from
((`db`.`mapping_uProducts_Products` `X` join `db`.`productsInfo` `Y`
on((`X`.`pID` = `Y`.`pID`))) join `db`.`mapping_uGroups_Groups` `Z`
on((`Y`.`gID` = `Z`.`gID`)));
My current query is:
SELECT upID FROM uProductsInfo
JOIN fs_uProducts USING (upID)
-- the next join could be faster if we use a hard table and index
JOIN mapping_uGroups_uProducts USING (upID)
JOIN mapping_fs_key USING (fsKeyID)
WHERE fsName="OVERALL"
AND ugID=1
ORDER BY score DESC
LIMIT 0,30;
which is pretty slow (30 results take about 10 seconds). I think the query is slow because it relies on a VIEW, which has no index to speed things up.
+----+-------------+----------------+--------+----------------+---------+---------+---------------------------------------+-------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------------+--------+----------------+---------+---------+---------------------------------------+-------+---------------------------------+
| 1 | PRIMARY | mapping_fs_key | const | PRIMARY,fsName | fsName | 386 | const | 1 | Using temporary; Using filesort |
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 19706 | Using where |
| 1 | PRIMARY | uProductsInfo | eq_ref | PRIMARY | PRIMARY | 4 | mapping_uGroups_uProducts.upID | 1 | Using index |
| 1 | PRIMARY | fs_uProducts | ref | upID | upID | 4 | db.uProductsInfo.upID | 221 | Using where |
| 2 | DERIVED | X | ALL | PRIMARY | NULL | NULL | NULL | 40772 | Using temporary |
| 2 | DERIVED | Y | eq_ref | PRIMARY | PRIMARY | 4 | db.X.pID | 1 | Distinct |
| 2 | DERIVED | Z | ref | PRIMARY | PRIMARY | 4 | db.Y.gID | 2 | Using index; Distinct |
+----+-------------+----------------+--------+----------------+---------+---------+---------------------------------------+-------+---------------------------------+
7 rows in set (0.48 sec)
The explain here looks pretty cryptic, and I don't know whether I should drop view and write a script to just insert everything in the view to a hard table. ( obviously, it will lose the flexibility of the view since the mapping changes quite frequently).
Does anyone have any idea to how I can optimize my schema better?
Your current plan uses the view as a driven table: it is scanned for each record in mapping_fs_key with fsName = 'OVERALL'.
You could replace the view with this query:
SELECT upID FROM uProductsInfo
JOIN fs_uProducts USING (upID)
JOIN mapping_fs_key USING (fsKeyID)
WHERE fsName='OVERALL'
AND upID IN
(
SELECT X.upID
FROM mapping_uGroups_Groups Z
JOIN productsInfo Y
ON Y.gID = Z.gID
JOIN mapping_uProducts_Products X
ON X.pID = Y.pID
WHERE Z.ugID = 1
)
ORDER BY
score DESC
LIMIT 0,30
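If the hard-table route from the question is preferred instead, a sketch could look like this. The table name mapping_uGroups_uProducts_hard is hypothetical, and the INT column types are an assumption based on the key_len of 4 in the EXPLAIN output; the SELECT is the view definition from the question.

```sql
-- Hypothetical physical copy of the view, keyed so that
-- "ugID = 1" followed by a join on upID can use the primary key.
CREATE TABLE mapping_uGroups_uProducts_hard (
    ugID INT NOT NULL,
    upID INT NOT NULL,
    PRIMARY KEY (ugID, upID)
);

-- Rebuild step; rerun it whenever the underlying mappings change.
TRUNCATE TABLE mapping_uGroups_uProducts_hard;

INSERT INTO mapping_uGroups_uProducts_hard (ugID, upID)
SELECT DISTINCT Z.ugID, X.upID
FROM mapping_uProducts_Products X
JOIN productsInfo Y ON X.pID = Y.pID
JOIN mapping_uGroups_Groups Z ON Y.gID = Z.gID;
```

The original query can then join mapping_uGroups_uProducts_hard in place of the view. Since the mapping changes frequently, the rebuild needs to be scheduled or triggered, which is the flexibility trade-off mentioned in the question.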