What is slowing down this product search MySQL query? - mysql

We are using the OpenCart ecommerce platform running on PHP 7.2 with MySQL 5.7.27, with about 5000 products.
We use an extension to search through products in the admin panel and it takes about 70-80 seconds on average to execute the search query.
Raw query:
SELECT
SQL_CALC_FOUND_ROWS pd.*,
p.*,
(
SELECT
price
FROM
product_special
WHERE
product_id = p.product_id
AND
(
date_start = '0000-00-00'
OR date_start < NOW()
AND
(
date_end = '0000-00-00'
OR date_end > NOW()
)
)
ORDER BY
priority,
price LIMIT 1
)
AS special_price,
IF(p.image IS NOT NULL
AND p.image <> ''
AND p.image <> 'no_image.png', 'Igen', 'Nem') AS image_text,
IF(p.status, 'Engedélyezett', 'Letiltott') AS status_text,
GROUP_CONCAT(DISTINCT CONCAT_WS(' > ', fgd.name, fd.name)
ORDER BY
CONCAT_WS(' > ', fgd.name, fd.name) ASC SEPARATOR '
') AS filter_text,
GROUP_CONCAT(DISTINCT fd.filter_id
ORDER BY
CONCAT_WS(' > ', fgd.name, fd.name) ASC SEPARATOR '_') AS filter,
GROUP_CONCAT(DISTINCT cat.name
ORDER BY
cat.name ASC SEPARATOR ' ') AS category_text,
GROUP_CONCAT(DISTINCT cat.category_id
ORDER BY
cat.name ASC SEPARATOR '_') AS category,
GROUP_CONCAT(DISTINCT IF(p2s.store_id = 0, 'Butopêa HU', s.name) SEPARATOR ' ') AS store_text,
GROUP_CONCAT(DISTINCT p2s.store_id SEPARATOR '_') AS store
FROM
product p
LEFT JOIN
product_description pd
ON (p.product_id = pd.product_id
AND pd.language_id = '2')
LEFT JOIN
product_to_category p2c
ON (p.product_id = p2c.product_id)
LEFT JOIN
(
SELECT
cp.category_id AS category_id,
GROUP_CONCAT(cd1.name
ORDER BY
cp.level SEPARATOR ' > ') AS name
FROM
category_path cp
LEFT JOIN
category c
ON (cp.path_id = c.category_id)
LEFT JOIN
category_description cd1
ON (c.category_id = cd1.category_id)
LEFT JOIN
category_description cd2
ON (cp.category_id = cd2.category_id)
WHERE
cd1.language_id = '2'
AND cd2.language_id = '2'
GROUP BY
cp.category_id
ORDER BY
name
)
AS cat
ON (p2c.category_id = cat.category_id)
LEFT JOIN
product_to_category p2c2
ON (p.product_id = p2c2.product_id)
LEFT JOIN
product_filter p2f
ON (p.product_id = p2f.product_id)
LEFT JOIN
filter f
ON (f.filter_id = p2f.filter_id)
LEFT JOIN
filter_description fd
ON (fd.filter_id = p2f.filter_id
AND fd.language_id = '2')
LEFT JOIN
filter_group_description fgd
ON (f.filter_group_id = fgd.filter_group_id
AND fgd.language_id = '2')
LEFT JOIN
product_filter p2f2
ON (p.product_id = p2f2.product_id)
LEFT JOIN
product_to_store p2s
ON (p.product_id = p2s.product_id)
LEFT JOIN
store s
ON (s.store_id = p2s.store_id)
LEFT JOIN
product_to_store p2s2
ON (p.product_id = p2s2.product_id)
GROUP BY
p.product_id
ORDER BY
pd.name ASC LIMIT 0,
190
I tried using MySQL's EXPLAIN to see what's going on, but nothing in its output catches my attention right away.
My test environment runs on an Intel NVMe SSD, 2666 MHz DDR4 RAM, and an 8th-gen i7 CPU, and yet the query is still very slow.
I appreciate any hints as to what is slowing this query down.

It looks like some many-to-many mappings are being used in that SELECT. WooCommerce (and the underlying WordPress) implement such mappings inefficiently.
Here is my discussion of how to change the schema to improve performance of wp_postmeta and similar tables (product_to_category and product_to_store): http://mysql.rjweb.org/doc.php/index_cookbook_mysql#speeding_up_wp_postmeta
This may be a bug:
( date_start = '0000-00-00'
OR date_start < NOW()
AND ( date_end = '0000-00-00'
OR date_end > NOW() )
)
You probably wanted extra parentheses:
( ( date_start = '0000-00-00'
OR date_start < NOW() )
AND ( date_end = '0000-00-00'
OR date_end > NOW() )
)
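The precedence issue behind that suspected bug can be checked with a tiny experiment. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made so the snippet is runnable anywhere), relying on the fact that both engines bind AND more tightly than OR. The helper name where_result is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def where_result(expr: str) -> int:
    """Evaluate a boolean SQL expression and return it as 0 or 1 (hypothetical helper)."""
    return conn.execute(f"SELECT {expr}").fetchone()[0]

# Think of: a = (date_start = '0000-00-00'), b = (date_start < NOW()),
#           c = (the date_end check). Here a is true, b and c are false.
unparenthesized = where_result("1 OR 0 AND 0")   # parsed as 1 OR (0 AND 0)
parenthesized = where_result("(1 OR 0) AND 0")   # the grouping that was probably intended

print(unparenthesized, parenthesized)
```

Without the extra parentheses, a special with a zero start date passes the filter even when its end date has already passed.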
Also,
( date_start = '0000-00-00' OR date_start < NOW() )
can be simplified to just
( date_start < NOW() )
And the query seems to have the explode-implode syndrome, where the JOINs expand to generate a large temp table, only to have the GROUP BY collapse it back down to the original size. The workaround is to turn
GROUP_CONCAT(... foo.x ...) AS blah,
...
LEFT JOIN foo ... ON ...
into
( SELECT GROUP_CONCAT(... foo.x ...) FROM foo WHERE ... ) AS blah,
If that eliminates all the LEFT JOINs, then the GROUP BY p.product_id can also be eliminated.
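To make the explode-implode effect concrete, here is a minimal sketch using Python's sqlite3 module as a stand-in for MySQL (an assumption; SQLite's GROUP_CONCAT takes the separator as a second argument instead of the SEPARATOR keyword). The table names mirror the question, but the three-column toy schema and the data are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE product_to_category (product_id INTEGER, category_id INTEGER);
CREATE TABLE product_to_store (product_id INTEGER, store_id INTEGER);
INSERT INTO product VALUES (1, 'Chair');
INSERT INTO product_to_category VALUES (1, 10), (1, 11), (1, 12);
INSERT INTO product_to_store VALUES (1, 0), (1, 1);
""")

# Explode: the two one-to-many JOINs multiply, 3 categories x 2 stores = 6 rows
# for a single product, all of which GROUP BY would have to collapse again.
exploded_rows = conn.execute("""
    SELECT COUNT(*)
    FROM product p
    LEFT JOIN product_to_category p2c ON p2c.product_id = p.product_id
    LEFT JOIN product_to_store p2s ON p2s.product_id = p.product_id
""").fetchone()[0]

# Workaround: one correlated subquery per list, with no JOIN and no GROUP BY.
row = conn.execute("""
    SELECT p.product_id,
           (SELECT GROUP_CONCAT(category_id, '_')
              FROM product_to_category
             WHERE product_id = p.product_id) AS category,
           (SELECT GROUP_CONCAT(store_id, '_')
              FROM product_to_store
             WHERE product_id = p.product_id) AS store
    FROM product p
""").fetchone()

# Sort after splitting, because the concatenation order is not guaranteed here.
categories = sorted(row[1].split("_"))
stores = sorted(row[2].split("_"))
print(exploded_rows, categories, stores)
```

With thousands of products and several such mappings joined twice each, as in the original query, the intermediate row count grows multiplicatively, which is the likely source of the 70-80 seconds.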
Do not say LEFT JOIN when the 'right' table is not optional. (Instead, say JOIN.)
LEFT JOIN category_description cd1 ON (cp.category_id = cd1.category_id)
WHERE cd1.language_id = '2' -- this invalidates the `LEFT`
cd2 seems not to be used except for checking that cd2.language_id = '2'. Consider removing references to it.
This requires two temp tables since they are different:
GROUP BY p.product_id
ORDER BY pd.name ASC
Am I correct in saying that pd.name is simply the name for p.product_id in 'language' 2? If so, this may be semantically the same, but faster (because it eliminates a temp table and a sort):
GROUP BY pd.name
ORDER BY pd.name
Once that is done, it may be better to have INDEX(language_id, name) on pd.
The speedup from LIMIT 0, 190 is mostly eliminated by SQL_CALC_FOUND_ROWS.
Over-normalization led to having the two components of this in separate tables?
CONCAT_WS(' > ', fgd.name, fd.name)

Related

How to Make This SQL Query More Efficient?

I'm not sure how to make the following SQL query more efficient. Right now, the query is taking 8 - 12 seconds on a pretty fast server, but that's not close to fast enough for a Website when users are trying to load a page with this code on it. It's looking through tables with many rows, for instance the "Post" table has 717,873 rows. Basically, the query lists all Posts related to what the user is following (newest to oldest).
Is there a way to make it faster by only getting the last 20 results total based on PostTimeOrder?
Any help would be much appreciated or insight on anything that can be done to improve this situation. Thank you.
Here's the full SQL query (lots of nesting):
SELECT DISTINCT p.Id, UNIX_TIMESTAMP(p.PostCreationTime) AS PostCreationTime, p.Content AS Content, p.Bu AS Bu, p.Se AS Se, UNIX_TIMESTAMP(p.PostCreationTime) AS PostTimeOrder
FROM Post p
WHERE (p.Id IN (SELECT pc.PostId
FROM PostCreator pc
WHERE (pc.UserId IN (SELECT uf.FollowedId
FROM UserFollowing uf
WHERE uf.FollowingId = '100')
OR pc.UserId = '100')
))
OR (p.Id IN (SELECT pum.PostId
FROM PostUserMentions pum
WHERE (pum.UserId IN (SELECT uf.FollowedId
FROM UserFollowing uf
WHERE uf.FollowingId = '100')
OR pum.UserId = '100')
))
OR (p.Id IN (SELECT ssp.PostId
FROM SStreamPost ssp
WHERE (ssp.SStreamId IN (SELECT ssf.SStreamId
FROM SStreamFollowing ssf
WHERE ssf.UserId = '100'))
))
OR (p.Id IN (SELECT psm.PostId
FROM PostSMentions psm
WHERE (psm.StockId IN (SELECT sf.StockId
FROM StockFollowing sf
WHERE sf.UserId = '100' ))
))
UNION ALL
SELECT DISTINCT p.Id AS Id, UNIX_TIMESTAMP(p.PostCreationTime) AS PostCreationTime, p.Content AS Content, p.Bu AS Bu, p.Se AS Se, UNIX_TIMESTAMP(upe.PostEchoTime) AS PostTimeOrder
FROM Post p
INNER JOIN UserPostE upe
on p.Id = upe.PostId
INNER JOIN UserFollowing uf
on (upe.UserId = uf.FollowedId AND (uf.FollowingId = '100' OR upe.UserId = '100'))
ORDER BY PostTimeOrder DESC;
Changing your p.Id IN (...) predicates to existence predicates with correlated subqueries may help. Also, since both halves of your UNION ALL query pull from the Post table and possibly return nearly identical records, you might be able to combine the two into one query by left outer joining to UserPostE and adding upe.PostId IS NOT NULL as an OR condition in the WHERE clause. UserFollowing will still inner join to UserPostE. If you want the same Post record twice, once with upe.PostEchoTime and once with p.PostCreationTime as the PostTimeOrder, you'll need to keep the UNION ALL.
SELECT
DISTINCT -- <<=- May not be needed
p.Id
, UNIX_TIMESTAMP(p.PostCreationTime) AS PostCreationTime
, p.Content AS Content
, p.Bu AS Bu
, p.Se AS Se
, UNIX_TIMESTAMP(coalesce( upe.PostEchoTime
, p.PostCreationTime)) AS PostTimeOrder
FROM Post p
LEFT JOIN UserPostE upe
INNER JOIN UserFollowing uf
on (upe.UserId = uf.FollowedId AND
(uf.FollowingId = '100' OR
upe.UserId = '100'))
on p.Id = upe.PostId
WHERE upe.PostID is not null
or exists (SELECT 1
FROM PostCreator pc
WHERE pc.PostId = p.ID
and ( pc.UserId = '100' -- parentheses keep the OR tied to this post
or exists (SELECT 1
FROM UserFollowing uf
WHERE uf.FollowedId = pc.UserID
and uf.FollowingId = '100') )
)
OR exists (SELECT 1
FROM PostUserMentions pum
WHERE pum.PostId = p.ID
and ( pum.UserId = '100' -- same precedence fix as above
or exists (SELECT 1
FROM UserFollowing uf
WHERE uf.FollowedId = pum.UserId
and uf.FollowingId = '100') )
)
OR exists (SELECT 1
FROM SStreamPost ssp
WHERE ssp.PostId = p.ID
and exists (SELECT 1
FROM SStreamFollowing ssf
WHERE ssf.SStreamId = ssp.SStreamId
and ssf.UserId = '100')
)
OR exists (SELECT 1
FROM PostSMentions psm
WHERE psm.PostId = p.ID
and exists (SELECT 1
FROM StockFollowing sf
WHERE sf.StockId = psm.StockId
and sf.UserId = '100' )
)
ORDER BY PostTimeOrder DESC
The FROM clause could alternatively be rewritten to also use an existence predicate with a correlated subquery:
FROM Post p
LEFT JOIN UserPostE upe
on p.Id = upe.PostId
and ( upe.UserId = '100'
or exists (select 1
from UserFollowing uf
where uf.FollowedId = upe.UserID
and uf.FollowingId = '100'))
Turn IN ( SELECT ... ) into a JOIN .. ON ... (see below)
Turn OR into UNION (see below)
Are some of the tables many:many mappings, such as SStreamFollowing? If so, follow the tips in http://mysql.rjweb.org/doc.php/index_cookbook_mysql#many_to_many_mapping_table
Example of IN:
SELECT ssp.PostId
FROM SStreamPost ssp
WHERE (ssp.SStreamId IN (
SELECT ssf.SStreamId
FROM SStreamFollowing ssf
WHERE ssf.UserId = '100' ))
-->
SELECT ssp.PostId
FROM SStreamPost ssp
JOIN SStreamFollowing ssf ON ssp.SStreamId = ssf.SStreamId
WHERE ssf.UserId = '100'
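As a quick sanity check of that rewrite, the sketch below runs both forms on toy data with Python's sqlite3 module (a stand-in for MySQL, chosen only so the example is runnable). The two-column schemas and the primary key on (UserId, SStreamId) are assumptions; without that uniqueness the JOIN form could return duplicate PostIds.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SStreamPost (SStreamId INTEGER, PostId INTEGER);
CREATE TABLE SStreamFollowing (
    UserId INTEGER,
    SStreamId INTEGER,
    PRIMARY KEY (UserId, SStreamId)  -- assumed: one row per user/stream pair
);
INSERT INTO SStreamPost VALUES (1, 101), (1, 102), (2, 201), (3, 301);
INSERT INTO SStreamFollowing VALUES (100, 1), (100, 3), (200, 2);
""")

# Original shape: filter posts through an IN ( SELECT ... ) subquery.
in_form = conn.execute("""
    SELECT ssp.PostId
    FROM SStreamPost ssp
    WHERE ssp.SStreamId IN (SELECT ssf.SStreamId
                            FROM SStreamFollowing ssf
                            WHERE ssf.UserId = 100)
    ORDER BY ssp.PostId
""").fetchall()

# Rewritten shape: the same filter expressed as a JOIN.
join_form = conn.execute("""
    SELECT ssp.PostId
    FROM SStreamPost ssp
    JOIN SStreamFollowing ssf ON ssp.SStreamId = ssf.SStreamId
    WHERE ssf.UserId = 100
    ORDER BY ssp.PostId
""").fetchall()

print(in_form, join_form)
```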
The big WHERE with all the INs becomes something like
JOIN ( ( SELECT pc.PostId AS id ... )
UNION ( SELECT pum.PostId ... )
UNION ( SELECT ssp.PostId ... )
UNION ( SELECT psm.PostId ... ) )
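The OR-to-UNION idea can be sketched the same way. The example uses Python's sqlite3 module as a stand-in for MySQL and only two of the four id sources; the toy schemas and data are invented. SQLite does not accept parentheses around the individual SELECTs of a UNION, so they are dropped here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Post (Id INTEGER PRIMARY KEY, Content TEXT);
CREATE TABLE PostCreator (PostId INTEGER, UserId INTEGER);
CREATE TABLE PostUserMentions (PostId INTEGER, UserId INTEGER);
INSERT INTO Post VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd');
INSERT INTO PostCreator VALUES (1, 100), (2, 200), (3, 100);
INSERT INTO PostUserMentions VALUES (3, 100), (4, 100), (2, 300);
""")

# Original shape: several IN ( SELECT ... ) predicates glued together with OR.
or_form = conn.execute("""
    SELECT p.Id
    FROM Post p
    WHERE p.Id IN (SELECT PostId FROM PostCreator WHERE UserId = 100)
       OR p.Id IN (SELECT PostId FROM PostUserMentions WHERE UserId = 100)
    ORDER BY p.Id
""").fetchall()

# Rewritten shape: collect the candidate ids with UNION (which also removes
# duplicates such as post 3), then JOIN that small id list to Post.
union_form = conn.execute("""
    SELECT p.Id
    FROM Post p
    JOIN (SELECT PostId AS id FROM PostCreator WHERE UserId = 100
          UNION
          SELECT PostId FROM PostUserMentions WHERE UserId = 100) AS ids
      ON ids.id = p.Id
    ORDER BY p.Id
""").fetchall()

print(or_form, union_form)
```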
Apply as many of those suggestions as you can, then come back for more advice if you still need it. And bring SHOW CREATE TABLE with you.

Combine query that relies on resultset of another

I run this query to get 20 random items from my WordPress database based on things like rating, category, etc.:
SELECT (A.user_votes/A.user_voters) as site_rating, B.ID as post_id, B.post_author, B.post_date,E.name as category
FROM `wp_gdsr_data_article` as A
INNER JOIN `wp_posts` as B ON (A.post_id = B.id)
INNER JOIN wp_term_relationships C ON (B.ID = C.object_id)
INNER JOIN wp_term_taxonomy D ON (C.term_taxonomy_id = D.term_taxonomy_id)
INNER JOIN wp_terms E ON (D.term_id = E.term_id)
WHERE
B.post_type = 'post' AND
B.post_status = 'publish' AND
D.taxonomy='category' AND
E.name NOT IN ('Satire', 'Declined', 'Outfits','Unorganized', 'AP')
ORDER BY RAND()
LIMIT 20
Then, for each result of the random items, I want to find a corresponding item that is very similar to the random item (around the same rating) but not identical and also one the user has not seen:
SELECT ABS($site_rating-(A.user_votes/A.user_voters)) as diff, (A.user_votes/A.user_voters) as site_rating, B.ID as post_id, B.post_author, B.post_date,E.name as category ,IFNULL(F.count,0) as count
FROM `wp_gdsr_data_article` as A
INNER JOIN `wp_posts` as B ON (A.post_id = B.id)
INNER JOIN wp_term_relationships C ON (B.ID = C.object_id)
INNER JOIN wp_term_taxonomy D ON (C.term_taxonomy_id = D.term_taxonomy_id)
INNER JOIN wp_terms E ON (D.term_id = E.term_id)
LEFT JOIN (
SELECT *,COUNT(*) as count FROM `verus` WHERE ip = '{$_SERVER['REMOTE_ADDR']}'
) as F ON (A.post_id = F.post_id_winner OR A.post_id = F.post_id_loser)
WHERE
E.name = '$category' AND
B.ID <> '$post_id' AND
B.post_type = 'post' AND
B.post_status = 'publish' AND
D.taxonomy='category' AND
E.name NOT IN ('Satire', 'Declined', 'Outfits','Unorganized', 'AP')
ORDER BY count ASC, diff ASC
LIMIT 1
where the following PHP variables refer to the result of the previous query:
$post_id = $result['post_id'];
$category = $result['category'];
$site_rating = $result['site_rating'];
and $_SERVER['REMOTE_ADDR'] refers to the user's IP.
Is there a way to combine the first query with the 20 additional queries that need to be called to find corresponding items, so that I need just 1 or 2 queries?
Edit: Here is the view that simplifies the joins
CREATE VIEW `versus_random` AS
SELECT (A.user_votes/A.user_voters) as site_rating, B.ID as post_id, B.post_author, B.post_date,E.name as category
FROM `wp_gdsr_data_article` as A
INNER JOIN `wp_posts` as B ON (A.post_id = B.id)
INNER JOIN wp_term_relationships C ON (B.ID = C.object_id)
INNER JOIN wp_term_taxonomy D ON (C.term_taxonomy_id = D.term_taxonomy_id)
INNER JOIN wp_terms E ON (D.term_id = E.term_id)
WHERE
B.post_type = 'post' AND
B.post_status = 'publish' AND
D.taxonomy='category' AND
E.name NOT IN ('Satire', 'Declined', 'Outfits','Unorganized', 'AP')
My attempt now with the view:
SELECT post_id,
(
SELECT INNER_TABLE.post_id
FROM `versus_random` as INNER_TABLE
WHERE
INNER_TABLE.post_id <> OUTER_TABLE.post_id
ORDER BY (SELECT COUNT(*) FROM `versus` WHERE ip = '54' AND (INNER_TABLE.post_id = post_id_winner OR INNER_TABLE.post_id = post_id_loser)) ASC
LIMIT 1
) as innerquery
FROM `versus_random` as OUTER_TABLE
ORDER BY RAND()
LIMIT 20
However, the query just times out and freezes my MySQL server.
I think it should work like this, but I don't have a WordPress installation at hand to test it. The second query, which finds the related post, is embedded in the first query, where it returns just the related_post_id. The whole query is then turned into a subquery itself, given the alias 'X' (although you are free to use 'G' if you want to continue your alphabet).
In the outer query, the tables for posts and data-article are joined again (RA and RP) to query the relevant fields of the related post, based on the related_post_id from the inner query. These two tables are left joined (and in reverse order), so you still get the main post if no related post was found.
SELECT
X.site_rating,
X.post_id,
X.post_author,
X.post_date,
X.category,
RA.user_votes / RA.user_voters as related_post_site_rating,
RP.ID as related_post_id,
RP.post_author as related_post_author,
RP.post_date as related_post_date,
X.category as related_category -- the related post is picked from the same category; wp_posts has no name column
FROM
( SELECT
(A.user_votes/A.user_voters) as site_rating,
B.ID as post_id, B.post_author, B.post_date,E.name as category,
( SELECT
RB.ID as post_id
FROM `wp_gdsr_data_article` as RA
INNER JOIN `wp_posts` as RB ON (RA.post_id = RB.id)
INNER JOIN wp_term_relationships RC ON (RB.ID = RC.object_id)
INNER JOIN wp_term_taxonomy RD ON (RC.term_taxonomy_id = RD.term_taxonomy_id)
INNER JOIN wp_terms RE ON (RD.term_id = RE.term_id)
LEFT JOIN (
SELECT *,COUNT(*) as count FROM `verus` WHERE ip = '{$_SERVER['REMOTE_ADDR']}'
) as RF ON (RA.post_id = RF.post_id_winner OR RA.post_id = RF.post_id_loser)
WHERE
RE.name = E.name AND
RB.ID <> B.ID AND
RB.post_type = 'post' AND
RB.post_status = 'publish' AND
RD.taxonomy='category' AND
RE.name NOT IN ('Satire', 'Declined', 'Outfits','Unorganized', 'AP')
ORDER BY IFNULL(RF.count, 0) ASC, ABS((A.user_votes/A.user_voters) - (RA.user_votes/RA.user_voters)) ASC
LIMIT 1) as related_post_id
FROM `wp_gdsr_data_article` as A
INNER JOIN `wp_posts` as B ON (A.post_id = B.id)
INNER JOIN wp_term_relationships C ON (B.ID = C.object_id)
INNER JOIN wp_term_taxonomy D ON (C.term_taxonomy_id = D.term_taxonomy_id)
INNER JOIN wp_terms E ON (D.term_id = E.term_id)
WHERE
B.post_type = 'post' AND
B.post_status = 'publish' AND
D.taxonomy='category' AND
E.name NOT IN ('Satire', 'Declined', 'Outfits','Unorganized', 'AP')
ORDER BY RAND()
LIMIT 20
) X
LEFT JOIN `wp_posts` as RP ON RP.id = X.related_post_id
LEFT JOIN `wp_gdsr_data_article` as RA ON RA.post_id = RP.id
I can't test my proposal, so take it with a grain of salt. Anyway, I hope it can be a valid starting point for some of the issues faced.
I cannot imagine a solution that does not go through a temporary table that stores the expensive computations present in your queries. You may also want to avoid interfering with the randomization of the first phase. I try to clarify below.
I'll start with these rewritings:
-- first query
SELECT site_rating, post_id, post_author, post_date, category
FROM POSTS_COMMON
ORDER BY RAND()
LIMIT 20
-- second query
SELECT ABS(R.site_rating_A - R.site_rating_B) as diff, R.site_rating_B as site_rating, P.post_id, P.post_author, P.post_date, P.category, F.count
FROM POSTS_COMMON AS P
INNER JOIN POSTS_RATING_DIFFS AS R ON (P.post_id = R.post_id_B)
LEFT JOIN (
/* post_id_winner, post_id_loser explicited; COUNT(*) NULL treatment anticipated */
SELECT post_id_winner, post_id_loser, IFNULL(COUNT(*), 0) as count FROM `verus` WHERE ip = '{$_SERVER['REMOTE_ADDR']}'
) as F ON (P.post_id = F.post_id_winner OR P.post_id = F.post_id_loser)
WHERE
P.category = '$category'
AND R.post_id_A = '$post_id'
ORDER BY count ASC, diff ASC
LIMIT 1
with:
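-- MySQL has no SELECT ... INTO <table>; CREATE TABLE ... AS SELECT does the same job
CREATE TABLE POSTS_RATING_DIFFS AS
SELECT A.post_id as post_id_A, B.post_id as post_id_B, A.site_rating as site_rating_A, B.site_rating as site_rating_B
FROM POSTS_COMMON as A, POSTS_COMMON as B
WHERE A.post_id <> B.post_id AND A.category = B.category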
CREATE VIEW POSTS_COMMON AS
SELECT A.post_id as post_id, A.user_votes, A.user_voters, (A.user_votes / A.user_voters) as site_rating, B.post_author, B.post_date, E.name as category
FROM `wp_gdsr_data_article` as A
INNER JOIN `wp_posts` as B ON (A.post_id = B.ID)
INNER JOIN wp_term_relationships C ON (B.ID = C.object_id)
INNER JOIN wp_term_taxonomy D ON (C.term_taxonomy_id = D.term_taxonomy_id)
INNER JOIN wp_terms E ON (D.term_id = E.term_id)
WHERE
B.post_type = 'post' AND
B.post_status = 'publish' AND
D.taxonomy='category' AND
E.name NOT IN ('Satire', 'Declined', 'Outfits','Unorganized', 'AP')
POSTS_COMMON isolates the part that is common to the two queries.
POSTS_RATING_DIFFS is a temporary table populated with the rating combinations and diffs. With it we get "the trick" of transforming the inequality join criterion on post_id into an equality one (see R.post_id_A = '$post_id' in the second query).
The temporary table also gives us precomputed ratings for the combinatorial explosion of A.post_id <> B.post_id (with post category equality), and it can be reused by other sessions.
Extracting the RAND() ordering into a temporary table could also be advantageous. In that case we could limit the rating combinations and diffs to the 20 randomly chosen posts.
In the original, the dependent second-level query is limited to a single row by means of ORDER BY and LIMIT.
The proposed solution avoids evaluating a LIMIT 1 on an ORDER BY result set in the second-level query, which becomes a subquery.
The single-row selection in the subquery is done by means of a WHERE criterion on the maximum of a single value computed from the columns used in the ORDER BY clause.
The combination into a single value must preserve the correct ordering. I'll leave it in pseudo-code as:
'<combination of count and diff>'
For example, using combination of the two values into a string type, we could have:
CONCAT(LPAD(CAST(count AS CHAR), 10, '0'), LPAD(CAST(ABS(diff) AS CHAR), 20, '0'))
The structure of the single query would be:
SELECT (Q_LVL_1.user_votes/Q_LVL_1.user_voters) as site_rating_LVL_1, Q_LVL_1.post_id as post_id_LVL_1
, Q_LVL_1.post_author as post_author_LVL_1, Q_LVL_1.post_date as post_date_LVL_1
, Q_LVL_1.category as category_LVL_1, Q_LVL_2.post_id as post_id_LVL_2
, Q_LVL_2.diff as diff_LVL_2, Q_LVL_2.site_rating as site_rating_LVL_2
, Q_LVL_2.post_author as post_author_LVL_2, Q_LVL_2.post_date as post_date_LVL_2
, Q_LVL_2.count
FROM POSTS_COMMON AS Q_LVL_1
, /* 1-row-selection query placed side by side for each Q_LVL_1's row */
(
SELECT CORE_P.post_id, CORE_P.ABS_diff as diff, P.site_rating, P.post_author, P.post_date, CORE_P.count
FROM POSTS_COMMON AS P
INNER JOIN (
SELECT FIRST(CORE_P.post_id) as post_id, ABS(CORE_P.diff) as ABS_diff, CORE_P.count
FROM (
/*
selection of posts with post_id(s) different from first level query,
not already taken and with the topmost value of
'<combination of count and diff>'
*/
) AS CORE_P
GROUP BY CORE_P.count, ABS(CORE_P.diff)
/* the one row selector */
) AS CORE_ONE_LINER ON P.post_id = CORE_ONE_LINER.post_id
) AS Q_LVL_2
ORDER BY RAND()
LIMIT 20
The CORE_P selection could return more than one post_id for the topmost value of '<combination of count and diff>', hence the GROUP BY and FIRST to reach a single row. (FIRST is not a MySQL aggregate; MIN would serve the same purpose there.)
This leads to a possible final implementation:
SELECT (Q_LVL_1.user_votes/Q_LVL_1.user_voters) as site_rating_LVL_1, Q_LVL_1.post_id as post_id_LVL_1
, Q_LVL_1.post_author as post_author_LVL_1, Q_LVL_1.post_date as post_date_LVL_1
, Q_LVL_1.category as category_LVL_1, Q_LVL_2.post_id as post_id_LVL_2
, Q_LVL_2.diff as diff_LVL_2, Q_LVL_2.site_rating as site_rating_LVL_2
, Q_LVL_2.post_author as post_author_LVL_2, Q_LVL_2.post_date as post_date_LVL_2
, Q_LVL_2.count
FROM POSTS_COMMON AS Q_LVL_1
, (
SELECT CORE_P.post_id, CORE_P.ABS_diff as diff, P.site_rating, P.post_author, P.post_date, CORE_P.count
FROM POSTS_COMMON AS P
INNER JOIN
(
SELECT FIRST(CORE_P.post_id) as post_id, ABS(CORE_P.diff) as ABS_diff, CORE_F.count
FROM (
SELECT CORE_RATING.post_id as post_id, ABS(CORE_RATING.diff) as ABS_diff, CORE_F.count
FROM (
SELECT post_id_B as post_id, site_rating_A - site_rating_B as diff
FROM POSTS_RATING_DIFFS
WHERE POSTS_RATING_DIFFS.post_id_A = Q_LVL_1.post_id
) as CORE_RATING
LEFT JOIN (
SELECT post_id_winner, post_id_loser, IFNULL(COUNT(*), 0) as count
FROM `verus`
WHERE ip = '{$_SERVER['REMOTE_ADDR']}'
) as CORE_F ON (CORE_RATING.post_id = CORE_F.post_id_winner OR CORE_RATING.post_id = CORE_F.post_id_loser)
WHERE
POSTS_RATING_DIFFS.post_id_A = Q_LVL_1.post_id
AND '<combination of CORE_F.count and CORE_RATING.diff>'
= MAX (
SELECT '<combination of CORE_F_2.count and CORE_RATING_2.diff>'
FROM (
SELECT site_rating_A - site_rating_B as diff
FROM POSTS_RATING_DIFFS
WHERE POSTS_RATING_DIFFS.post_id_A = Q_LVL_1.post_id
) as CORE_RATING_2
LEFT JOIN (
SELECT post_id_winner, post_id_loser, IFNULL(COUNT(*), 0) as count
FROM `verus`
WHERE ip = '{$_SERVER['REMOTE_ADDR']}'
) as CORE_F_2 ON (CORE_RATING_2.post_id = CORE_F_2.post_id_winner OR CORE_RATING_2.post_id = CORE_F_2.post_id_loser)
) /* END MAX */
) AS CORE_P
GROUP BY CORE_P.count, ABS(CORE_P.diff)
) AS CORE_ONE_LINER ON P.post_id = CORE_ONE_LINER.post_id
) AS Q_LVL_2
ORDER BY RAND()
LIMIT 20

MySQL query taking a long time to execute

This is my query; it takes 3 seconds to execute:
SELECT I.itemname,
I.overdue,
D.value,
I.itemid,
icd.ecwstatus AS status,
C.inactiveflag AS inactive,
icd.validfrom,
icd.validto
FROM items I
JOIN itemdetail D
ON I.itemtype = 'I'
AND I.itemid = D.itemid
AND D.propid = 13
LEFT OUTER JOIN icd
ON icd.code = D.value
LEFT OUTER JOIN edi_icdcodes C
ON I.itemid = C.itemid
WHERE I.deleteflag = 0
AND ( icd.validfrom <= '2012-12-06'
OR icd.validfrom IS NULL )
AND ( icd.validto >= '2012-12-06'
OR icd.validto IS NULL )
AND I.itemname LIKE 'A%'
AND ( I.keyname = 'Assessments' )
ORDER BY I.itemname ASC limit 0,6;
I have an index IX_items_itemType_deleteFlag_keyName_itemName on the columns itemType, deleteFlag, keyName, itemName of the items table, and I also have indexes on the columns of the other tables that are used in the JOIN and WHERE clauses.
So how can I improve the performance of this query?
Thanks
I would have an index on your items table based on the key columns used in your WHERE clause and ORDER BY, with the column that yields the smallest result set in the front position. For example, you are specifically looking for "Assessments". If your table has 1 million records and 600k of them are of item type "I", but only 5k are "Assessments", then putting the most selective column up front might be better for your query to process.
I would have your:
items table indexed on ( keyname, itemtype, deleteflag, itemname )
ItemDetail table, indexed ON ( itemid, propid )
icd table indexed ON ( code, validfrom, validto, ecwstatus )
edi_icdcodes table index ON (itemid)
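As a rough illustration of the first of those indexes, the sketch below creates it and asks the engine for its plan. It uses Python's sqlite3 module and SQLite's EXPLAIN QUERY PLAN, not MySQL's EXPLAIN, so it only shows the general idea of equality columns first and the range column last. The index name ix_items_lookup is made up, the table is reduced to the columns involved, and the prefix match LIKE 'A%' is written as an explicit range because SQLite only turns LIKE into an index range under extra conditions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (
    itemid INTEGER PRIMARY KEY,
    itemname TEXT,
    itemtype TEXT,
    deleteflag INTEGER,
    keyname TEXT
);
-- Equality-tested columns first, the range/sort column (itemname) last.
CREATE INDEX ix_items_lookup
    ON items (keyname, itemtype, deleteflag, itemname);
""")

plan_rows = conn.execute("""
    EXPLAIN QUERY PLAN
    SELECT itemid, itemname
    FROM items
    WHERE keyname = 'Assessments'
      AND itemtype = 'I'
      AND deleteflag = 0
      AND itemname >= 'A' AND itemname < 'B'   -- stand-in for LIKE 'A%'
    ORDER BY itemname
""").fetchall()

# The last column of each plan row is the human-readable detail text.
plan_text = " | ".join(row[-1] for row in plan_rows)
print(plan_text)
```

On MySQL the equivalent check would be to run EXPLAIN on the real query and look at the key and rows columns for the items table.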
SELECT
I.itemname,
I.overdue,
D.value,
I.itemid,
icd.ecwstatus AS status,
C.inactiveflag AS inactive,
icd.validfrom,
icd.validto
FROM
items I
JOIN itemdetail D
ON I.itemid = D.itemid
AND D.propid = 13
LEFT OUTER JOIN icd
ON D.value = icd.code
AND ( icd.validfrom <= '2012-12-06'
OR icd.validfrom IS NULL )
AND ( icd.validto >= '2012-12-06'
OR icd.validto IS NULL )
LEFT OUTER JOIN edi_icdcodes C
ON I.itemid = C.itemid
WHERE
I.itemtype = 'I'
AND I.deleteflag = 0
AND I.keyname = 'Assessments'
AND I.itemname LIKE 'A%'
ORDER BY
I.itemname ASC
LIMIT
0,6;
Note: if the icd table will always have a value for both the from and to dates when records are created, you won't need to test for NULL. I do understand why you had the NULL test, given the LEFT JOIN combined with conditions in the WHERE clause. So, that part might be simplified to
LEFT OUTER JOIN icd
ON D.value = icd.code
AND icd.validfrom <= '2012-12-06'
AND icd.validto >= '2012-12-06'
What you are doing there is starting from a base table, then doing the joins, and only afterwards applying the WHERE conditions. To speed things up, try to put your conditions in the joins. That way the rows are filtered while joining and not afterwards.
SELECT I.itemname,
I.overdue,
D.value,
I.itemid,
icd.ecwstatus AS status,
C.inactiveflag AS inactive,
icd.validfrom,
icd.validto
FROM items I
JOIN itemdetail D
ON I.itemtype = 'I'
AND I.itemid = D.itemid
AND D.propid = 13
AND I.deleteflag = 0
AND I.itemname LIKE 'A%' -- filters on items go on the inner JOIN;
AND ( I.keyname = 'Assessments' ) -- in a LEFT JOIN's ON they would not remove any items rows
LEFT OUTER JOIN icd
ON icd.code = D.value
AND ( icd.validfrom <= '2012-12-06'
OR icd.validfrom IS NULL )
AND ( icd.validto >= '2012-12-06'
OR icd.validto IS NULL )
LEFT OUTER JOIN edi_icdcodes C
ON I.itemid = C.itemid
ORDER BY I.itemname ASC limit 0,6;
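One caveat when moving conditions into joins: in a LEFT JOIN, a condition in ON only decides which right-hand rows match; it never removes left-hand rows. The sketch below shows the difference on toy data, using Python's sqlite3 module as a stand-in for MySQL. The table codes is a hypothetical, simplified stand-in for edi_icdcodes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (itemid INTEGER PRIMARY KEY, itemname TEXT);
CREATE TABLE codes (itemid INTEGER, inactiveflag INTEGER);
INSERT INTO items VALUES (1, 'Asthma'), (2, 'Bronchitis');
INSERT INTO codes VALUES (1, 0), (2, 1);
""")

# Filter on the LEFT table placed in the LEFT JOIN's ON clause:
# 'Bronchitis' is still returned, only its match is suppressed (NULL).
filter_in_on = conn.execute("""
    SELECT I.itemname, C.inactiveflag
    FROM items I
    LEFT JOIN codes C
      ON I.itemid = C.itemid
     AND I.itemname LIKE 'A%'
    ORDER BY I.itemid
""").fetchall()

# Same filter in WHERE: 'Bronchitis' is removed from the result.
filter_in_where = conn.execute("""
    SELECT I.itemname, C.inactiveflag
    FROM items I
    LEFT JOIN codes C
      ON I.itemid = C.itemid
    WHERE I.itemname LIKE 'A%'
    ORDER BY I.itemid
""").fetchall()

print(filter_in_on, filter_in_where)
```

So filters on the driving table belong in WHERE or in the ON clause of an inner JOIN, while filters on the optional table belong in the LEFT JOIN's ON clause.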

Sorting by virtual key takes very long time in mysql

SELECT p . * , (
SELECT (
SELECT COUNT( * )
FROM sales s
WHERE s.affiliate != ''
AND s.pid = p.pid
AND s.saletype = 'sale' )
) AS popular
FROM products p
INNER JOIN members m ON m.uname = p.vendor
WHERE (m.mpid = p.pid OR p.marketavail = 'yes')
AND p.showinmarket = 'yes'
AND p.pname != ''
AND p.pdesc != ''
AND p.active = 'yes'
ORDER BY popular DESC
Here, if I use ORDER BY popular, it takes 17 seconds to load. Without this ordering, the query executes in 4 seconds.
Please tell me why ordering by a virtual column takes so much time.
All tables have indexes on the required columns, so I guess indexing is not the issue. And if I run SELECT COUNT(*) for a single product, it executes in milliseconds.
One more oddity I saw: if I remove the second SELECT keyword in my SQL, it takes 105 seconds to execute.
Please tell me if I need to give any more information.
Because of this delay in sorting, I am sorting in PHP instead of MySQL. Please help me make it better.
Thank you in advance.
Please try this query:
SELECT p.column1,
p.column2,
p.column3,
COUNT(s.pid) as popular
FROM products p
INNER JOIN members m ON m.uname = p.vendor
LEFT JOIN sales s ON s.pid = p.pid AND s.affiliate != '' AND s.saletype = 'sale'
WHERE (m.mpid = p.pid OR p.marketavail = 'yes')
AND p.showinmarket = 'yes'
AND p.pname != ''
AND p.pdesc != ''
AND p.active = 'yes'
GROUP BY p.column1,p.column2,p.column3
ORDER BY popular DESC
column1, column2, column3 are just placeholders for the columns you want; because you select p.*, I don't know what the column names in products are, so change them to your actual column names.
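To check that the LEFT JOIN plus COUNT(s.pid) form counts the same thing as the correlated subquery from the question, here is a small sketch on invented data, using Python's sqlite3 module as a stand-in for MySQL. The tables are cut down to the columns the count depends on, and the members join is left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (pid INTEGER PRIMARY KEY, pname TEXT);
CREATE TABLE sales (pid INTEGER, affiliate TEXT, saletype TEXT);
INSERT INTO products VALUES (1, 'alpha'), (2, 'beta'), (3, 'gamma');
INSERT INTO sales VALUES
    (1, 'aff1', 'sale'), (1, 'aff2', 'sale'), (1, '', 'sale'),
    (2, 'aff1', 'refund'), (2, 'aff3', 'sale');
""")

# Question's shape: one correlated COUNT(*) per product row.
correlated = conn.execute("""
    SELECT p.pid,
           (SELECT COUNT(*) FROM sales s
             WHERE s.affiliate != '' AND s.pid = p.pid
               AND s.saletype = 'sale') AS popular
    FROM products p
    ORDER BY popular DESC, p.pid
""").fetchall()

# Answer's shape: LEFT JOIN once, then GROUP BY. COUNT(s.pid) skips the NULLs
# produced for products without matching sales, so those products count as 0.
joined = conn.execute("""
    SELECT p.pid, COUNT(s.pid) AS popular
    FROM products p
    LEFT JOIN sales s
      ON s.pid = p.pid AND s.affiliate != '' AND s.saletype = 'sale'
    GROUP BY p.pid
    ORDER BY popular DESC, p.pid
""").fetchall()

print(correlated, joined)
```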
Edit: try this query and see if it's any faster:
SELECT p.pname, p.vendor, p.pid,
COUNT( s.pid ) AS popular
FROM products p INNER JOIN members m ON m.uname = p.vendor
LEFT JOIN
(SELECT pid FROM sales
WHERE affiliate != ''
AND saletype = 'sale'
)s
ON (s.pid = p.pid)
WHERE ( m.mpid = p.pid OR p.marketavail = 'yes' )
AND p.showinmarket = 'yes' AND p.pname != ''
AND p.pdesc != ''
AND p.active = 'yes'
GROUP BY p.pid, p.pname
ORDER BY popular DESC
If it runs faster, you can pre-filter products too, as in this query, and see if it runs even faster:
SELECT p.pname, p.vendor, p.pid,
COUNT( s.pid ) AS popular
FROM (SELECT pname,vendor,pid,marketavail
FROM products
WHERE showinmarket = 'yes'
AND pname != ''
AND pdesc != ''
AND active = 'yes'
)p
INNER JOIN members m ON m.uname = p.vendor
LEFT JOIN
(SELECT pid FROM sales
WHERE affiliate != ''
AND saletype = 'sale'
)s
ON (s.pid = p.pid)
WHERE ( m.mpid = p.pid OR p.marketavail = 'yes' )
GROUP BY p.pid, p.pname
ORDER BY popular DESC

Slow MySQL query with subquery from table

I am trying to bring back a string based on an IF statement but it is extremely slow.
It has something to do with the first subquery but I am unsure of how to rearrange this as to bring back the same results but faster.
Here is my SQL:
SELECT IF
(
(
SELECT COUNT(*)
FROM
(
SELECT DISTINCT enquiryId, type
FROM parts_enquiries, parts_service_types AS pst
WHERE parts_enquiries.serviceTypeId = pst.id
) AS parts
WHERE parts.enquiryId = enquiries.id
) > 1, 'Mixed',
(
SELECT DISTINCT type
FROM parts_enquiries, parts_service_types AS pst
WHERE parts_enquiries.serviceTypeId = pst.id AND enquiryId = enquiries.id
)
) AS partTypes
FROM enquiries,
entities
WHERE enquiries.entityId = entities.id
How can I make it faster?
I have modified my original query below, but I am getting the error that subquery returns more than one row:
SELECT
(SELECT
CASE WHEN COUNT(DISTINCT type) > 1 THEN 'Mixed' ELSE `type` END AS type
FROM parts_enquiries
INNER JOIN parts_service_types AS pst ON parts_enquiries.serviceTypeId = pst.id
INNER JOIN enquiries ON parts_enquiries.enquiryId = enquiries.id
INNER JOIN entities ON enquiries.entityId = entities.id
GROUP BY enquiryId) AS partTypes
FROM enquiries,
entities
WHERE enquiries.entityId = entities.id
Please have a look if this query yields the same results:
SELECT
enquiryId,
CASE WHEN COUNT(DISTINCT type) > 1 THEN 'Mixed' ELSE `type` END AS type
FROM parts_enquiries
INNER JOIN parts_service_types AS pst ON parts_enquiries.serviceTypeId = pst.id
INNER JOIN enquiries ON parts_enquiries.enquiryId = enquiries.id
INNER JOIN entities ON enquiries.entityId = entities.id
GROUP BY enquiryId
But N.B.'s comment is still valid. To see whether an index is used, and other information, we need to see the EXPLAIN output and the table definitions.
This should get you what you want.
I would first pre-query your parts enquiries and parts service types, looking for both the distinct count and the MINIMUM of the part 'type', grouped by the enquiry ID.
Then, run your IF() against that result. If the distinct count is > 1, then 'Mixed'. If there is only one, MIN() simply returns the description of that one value, which is what you want anyhow.
SELECT
E.ID,
IF ( PreQuery.DistTypes > 1, 'Mixed', PreQuery.FirstType ) as PartType
from
Enquiries E
JOIN ( SELECT
PE.EnquiryID,
COUNT( DISTINCT PE.ServiceTypeID ) as DistTypes,
MIN( PST.Type ) as FirstType
from
Parts_Enquiries PE
JOIN Parts_Service_Types PST
ON PE.ServiceTypeID = PST.ID
group by
PE.EnquiryID ) as PreQuery
ON E.ID = PreQuery.EnquiryID
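The pre-query approach can be tried on toy data as follows. This is a sketch with Python's sqlite3 module standing in for MySQL; since SQLite has no IF() function, the same test is written as a CASE expression. The sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE enquiries (id INTEGER PRIMARY KEY);
CREATE TABLE parts_service_types (id INTEGER PRIMARY KEY, type TEXT);
CREATE TABLE parts_enquiries (enquiryId INTEGER, serviceTypeId INTEGER);
INSERT INTO enquiries VALUES (1), (2);
INSERT INTO parts_service_types VALUES (1, 'Repair'), (2, 'Install');
-- Enquiry 1 has two different types, enquiry 2 has the same type twice.
INSERT INTO parts_enquiries VALUES (1, 1), (1, 2), (2, 2), (2, 2);
""")

part_types = conn.execute("""
    SELECT E.id,
           CASE WHEN PreQuery.DistTypes > 1 THEN 'Mixed'
                ELSE PreQuery.FirstType END AS PartType
    FROM enquiries E
    JOIN (SELECT PE.enquiryId,
                 COUNT(DISTINCT PE.serviceTypeId) AS DistTypes,
                 MIN(PST.type) AS FirstType
          FROM parts_enquiries PE
          JOIN parts_service_types PST ON PE.serviceTypeId = PST.id
          GROUP BY PE.enquiryId) AS PreQuery
      ON E.id = PreQuery.enquiryId
    ORDER BY E.id
""").fetchall()

print(part_types)
```

The grouped derived table is computed once, instead of re-running a DISTINCT subquery for every enquiry row as in the original query.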