How to improve performance with an ORDER BY clause - mysql

I have a query that is reading through approximately 2.4m rows of data.
The query itself is running well but the ORDER BY clause is causing performance issues. If I remove the ORDER BY the query takes 0.03 seconds to execute. With the ORDER BY it can take 4.5 to 5 seconds.
Is there any way I can optimise this query further? Indexes have already been added, so that isn't a solution.
EDIT 1 -
This query is a shortened version of a much bigger PDO query so I think the join is necessary. You can see the main query at the bottom of this post.
SELECT t.processing_time, t.paymentType, t.status, t.merchantTransactionId, t.paymentBrand, t.amount, t.currency, t.code, t.holder, t.bin, t.last4Digits, t.recurringType, m.name AS merchant, c.name AS channel, concat(UPPER(SUBSTRING(trim(sp.status_description),1,1)), lower(SUBSTRING(trim(sp.status_description),2))) as status_description
FROM transactionsV2 t
JOIN channels c
ON t.entityId = c.uuid
JOIN merchants m
ON m.uuid = c.sender
JOIN status_payments sp
ON t.code = sp.status_code
JOIN (
SELECT t.id, t.processing_time FROM transactionsV2 t
JOIN channels c ON t.entityId = c.uuid
JOIN merchants m ON m.uuid = c.sender
WHERE (t.processing_time >= "2018-11-08 00:00:00")
AND (t.processing_time <= "2018-11-12 23:59:59")
ORDER BY t.processing_time DESC
LIMIT 1000
) t2
ON t.id = t2.id
WHERE t.status = 1
$transactions = DB::connection('mysql2')->select(DB::raw("SELECT t.processing_time, t.paymentType, t.status, t.merchantTransactionId, t.paymentBrand, t.amount, t.currency, t.code, t.holder, t.bin, t.last4Digits, t.recurringType, m.name AS merchant, c.name AS channel, concat(UPPER(SUBSTRING(trim(sp.status_description),1,1)), lower(SUBSTRING(trim(sp.status_description),2))) as status_description
FROM transactionsV2 t
JOIN channels c
ON t.entityId = c.uuid
JOIN merchants m
ON m.uuid = c.sender
JOIN status_payments sp
ON t.code = sp.status_code
JOIN (
SELECT t.id, t.processing_time FROM transactionsV2 t
JOIN channels c ON t.entityId = c.uuid
JOIN merchants m ON m.uuid = c.sender
WHERE (t.processing_time >= :insTs1)
AND (t.processing_time <= :insTs2)
AND (:merchant1 IS NULL OR m.name LIKE :merchant2)
AND (:channel1 IS NULL OR c.name LIKE :channel2)
ORDER BY t.processing_time DESC
LIMIT 1000
) t2
ON t.id = t2.id
WHERE (:status1 IS NULL OR t.status = :status2)
AND (:holder1 IS NULL OR holder LIKE :holder2)
AND (:paymentType1 IS NULL OR t.paymentType IN (".$paymentType."))
AND (:merchantTransactionId1 IS NULL OR merchantTransactionId LIKE :merchantTransactionId2)
AND (:paymentBrand1 IS NULL OR paymentBrand LIKE :paymentBrand2)
AND (:amount1 IS NULL OR amount = :amount2)
AND (:recurringType1 IS NULL OR t.recurringType = :recurringType2)"),
['status1' => $search->searchCriteria['status'],
'status2' => $search->searchCriteria['status'],
'holder1' => $search->searchCriteria['holder'],
'holder2' => '%'.$search->searchCriteria['holder'].'%',
'paymentType1' => $paymentType,
'merchantTransactionId1' => $search->searchCriteria['merchantTransactionId'],
'merchantTransactionId2' => '%'.$search->searchCriteria['merchantTransactionId'].'%',
'paymentBrand1' => $search->searchCriteria['paymentBrand'],
'paymentBrand2' => '%'.$search->searchCriteria['paymentBrand'].'%',
'amount1' => $search->searchCriteria['amount'],
'amount2' => $search->searchCriteria['amount'],
'recurringType1' => $search->searchCriteria['recurringType'],
'recurringType2' => $search->searchCriteria['recurringType'],
'merchant1' => $search->searchCriteria['merchant'],
'merchant2' => '%'.$search->searchCriteria['merchant'].'%',
'channel1' => $search->searchCriteria['channel'],
'channel2' => '%'.$search->searchCriteria['channel'].'%',
'insTs1' => $search->searchCriteria['fromDate'] . ' 00:00:00',
'insTs2' => $search->searchCriteria['toDate'] . ' 23:59:59']);

Perhaps I'm missing something, but I don't see that the subquery requires the joins. Does this suffice?
SELECT t.id, t.processing_time
FROM transactionsV2 t
WHERE t.processing_time >= '2018-11-08' AND
t.processing_time < '2018-11-13'
ORDER BY t.processing_time DESC
LIMIT 1000
If so, an index on transactionsV2(processing_time) would help (assuming that it is not a view).
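As a rough illustration of why such an index matters for ORDER BY ... LIMIT, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example is runnable; the table contents and the index name idx_processing_time are made up). SQLite reports a sort step as "USE TEMP B-TREE FOR ORDER BY", which plays roughly the role of "Using filesort" in MySQL's EXPLAIN output.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactionsV2 (id INTEGER PRIMARY KEY, processing_time TEXT)"
)
conn.executemany(
    "INSERT INTO transactionsV2 (processing_time) VALUES (?)",
    [(f"2018-11-{day:02d} 12:00:00",) for day in range(1, 29)],
)

QUERY = """
    SELECT t.id, t.processing_time
    FROM transactionsV2 t
    WHERE t.processing_time >= '2018-11-08'
      AND t.processing_time < '2018-11-13'   -- half-open upper bound
    ORDER BY t.processing_time DESC
    LIMIT 1000
"""


def plan(connection):
    """Return the EXPLAIN QUERY PLAN detail strings joined into one line."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + QUERY).fetchall()
    return " | ".join(row[3] for row in rows)


plan_without_index = plan(conn)

# Hypothetical index name; the indexed column is the one used by ORDER BY.
conn.execute("CREATE INDEX idx_processing_time ON transactionsV2 (processing_time)")
plan_with_index = plan(conn)

print("without index:", plan_without_index)
print("with index:   ", plan_with_index)
```

Without the index the plan contains a separate sort step; with it, the rows are read from the index already in the required order, so the engine can stop after the first 1000 rows. MySQL's optimizer makes its own decisions, so check the real query with EXPLAIN.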

I believe the subquery is redundant, since it is an independent subquery and you are joining on a primary key (transactionsV2.id). You can simply use:
SELECT t.processing_time,
t.paymentType,
t.status,
t.merchantTransactionId,
t.paymentBrand,
t.amount,
t.currency,
t.code,
t.holder,
t.bin,
t.last4Digits,
t.recurringType,
m.name AS merchant,
c.name AS channel,
concat(UPPER(SUBSTRING(trim(sp.status_description),1,1)),
lower(SUBSTRING(trim(sp.status_description),2))) as status_description,
row_number() over ()
FROM transactionsV2 t
JOIN channels c ON t.entityId = c.uuid
JOIN merchants m ON m.uuid = c.sender
JOIN status_payments sp ON t.code = sp.status_code
WHERE (t.processing_time >= "2018-11-08 00:00:00") AND (t.processing_time <= "2018-11-12 23:59:59") and t.status = 1
ORDER BY t.processing_time DESC
LIMIT 1000

Related

MySql query taking long time to execute when called from VBA

I am trying to retrieve some data from MySql database using Excel Vba. Everything is working fine...but the MySql query is taking too much time to execute.
Here is my code:
SELECT
d.DATE,
c.name,
c.address,
c.state_name,
c.contact_no,
d.AMOUNT,
d.BY_NAME,
d.NARATION,
t.REMARK
FROM
database1.data d
JOIN
(
SELECT DISTINCT
cust_id,
OR_NO
FROM
database1.ordbill
) o ON SUBSTRING_INDEX(database1.d.NARATION,
':',
-1) = o.OR_NO
JOIN
database1.contact c ON o.cust_id = c.id
JOIN
database1.total t ON t.VCH_NO = d.VCH_NO
WHERE
d.PARTY_NAME = 'advance' AND(
d.`BY_NAME` = 'Bank1' OR d.`BY_NAME` = 'CASH' OR d.`BY_NAME` = 'Bank2'
) AND d.DATE BETWEEN '2019-09-01' AND '2019-09-30'
ORDER BY
d.DATE ASC
Assuming you already have an index on the primary key id of table contact:
SELECT *
FROM `loans`
WHERE `date` >= '2019-11-25'
AND `date`<='2019-11-28'
AND `designation` LIKE '%sdf%'
why does this happen ?
SELECT d.DATE
,c.name
,c.address
,c.state_name
,c.contact_no
, d.AMOUNT
, d.BY_NAME
, d.NARATION
,t.REMARK
FROM database1.data d JOIN (
SELECT DISTINCT cust_id, OR_NO FROM database1.ordbill
) o ON SUBSTRING_INDEX(database1.d.NARATION,':',-1)=o.OR_NO
JOIN database1.contact c on o.cust_id=c.id
JOIN database1.total t on t.VCH_NO=d.VCH_NO
WHERE d.PARTY_NAME = 'advance'
AND (d.`BY_NAME` = 'Bank1' OR d.`BY_NAME` = 'CASH' OR d.`BY_NAME` = 'Bank2')
AND d.DATE BETWEEN '2019-09-01' AND '2019-09-30'
ORDER BY d.DATE ASC
Be sure you also have a proper composite index on
table data, columns (PARTY_NAME, BY_NAME, DATE, VCH_NO)
and also an index on
table total, column (VCH_NO)
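To make the composite-index suggestion concrete, here is a small sketch using Python's sqlite3 module as a stand-in for MySQL (an assumption for runnability; the index name idx_data_party_by_date and the reduced column list are made up). It only checks that the planner can use the index for the equality and range conditions on data; it says nothing about the join on SUBSTRING_INDEX(...), which cannot use a plain index on NARATION.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE data (
           VCH_NO INTEGER, PARTY_NAME TEXT, BY_NAME TEXT,
           DATE TEXT, AMOUNT REAL, NARATION TEXT)"""
)

# Equality columns first, then the range column, as suggested above.
conn.execute(
    "CREATE INDEX idx_data_party_by_date "
    "ON data (PARTY_NAME, BY_NAME, DATE, VCH_NO)"
)

rows = conn.execute(
    """EXPLAIN QUERY PLAN
       SELECT d.DATE, d.AMOUNT, d.BY_NAME, d.NARATION
       FROM data d
       WHERE d.PARTY_NAME = 'advance'
         AND d.BY_NAME IN ('Bank1', 'CASH', 'Bank2')
         AND d.DATE BETWEEN '2019-09-01' AND '2019-09-30'"""
).fetchall()

# The fourth column of each plan row is the human-readable detail.
plan_detail = " | ".join(row[3] for row in rows)
print(plan_detail)
```

In MySQL the equivalent check is to run EXPLAIN on the full query and look at the key column for table d.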

SQL - GROUP BY - HAVING - MISSING ROWS

The situation is as follows: I need to join an order table with a message table, but I'm only interested in the first message (the one with the lowest message id). The tables are connected by the order id.
$result = $this->db->executeS('
SELECT o.*, c.iso_code AS currency, s.name AS shippingMethod, m.message AS note
FROM '._DB_PREFIX_.'orders o
LEFT JOIN '._DB_PREFIX_.'currency c ON c.id_currency = o.id_currency
LEFT JOIN '._DB_PREFIX_.'message m ON m.id_order = o.id_order
LEFT JOIN '._DB_PREFIX_.'carrier s ON s.id_carrier = o.id_carrier
LEFT JOIN jtl_connector_link l ON o.id_order = l.endpointId AND l.type = 4
WHERE l.hostId IS NULL AND o.date_add BETWEEN DATE_SUB(NOW(), INTERVAL 1 WEEK) AND NOW()
GROUP BY o.id_order
HAVING MIN(m.id_message)
LIMIT '.$limit
);
This query works so far. But now orders without a message are missing.
Thank you for your help!
Markus
You want to select several orders and, per order, the first message. This is generally difficult in MySQL versions before 8.0 for lack of window functions (e.g. ROW_NUMBER() OVER). But as you are interested in just one column from the message table, you can use a subquery in the SELECT clause.
SELECT
o.*,
c.iso_code AS currency,
s.name AS shippingMethod,
(
SELECT m.message
FROM message m
WHERE m.id_order = o.id_order
ORDER BY m.id_message
LIMIT 1
) AS note
FROM orders o
JOIN currency c ON c.id_currency = o.id_currency
JOIN carrier s ON s.id_carrier = o.id_carrier
WHERE o.date_add BETWEEN DATE_SUB(NOW(), INTERVAL 1 WEEK) AND NOW()
AND NOT EXISTS
(
SELECT *
FROM jtl_connector_link l
WHERE l.endpointId = o.id_order
AND l.type = 4
);
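The behaviour of the correlated subquery can be checked on toy data. The sketch below uses Python's sqlite3 module in place of MySQL (an assumption for runnability) with made-up rows and only the columns needed for the point: every order is returned once, with the message that has the lowest id_message, and orders without any message keep a NULL note instead of disappearing.

```python
import sqlite3

# In-memory SQLite database with invented sample rows (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (id_order INTEGER PRIMARY KEY);
    CREATE TABLE message (
        id_message INTEGER PRIMARY KEY,
        id_order   INTEGER,
        message    TEXT
    );
    INSERT INTO orders (id_order) VALUES (1), (2), (3);
    INSERT INTO message (id_message, id_order, message) VALUES
        (10, 1, 'first note for order 1'),
        (11, 1, 'second note for order 1'),
        (12, 3, 'only note for order 3');
    """
)

# Scalar subquery in the SELECT list: at most one message per order.
first_notes = conn.execute(
    """
    SELECT o.id_order,
           (SELECT m.message
            FROM message m
            WHERE m.id_order = o.id_order
            ORDER BY m.id_message
            LIMIT 1) AS note
    FROM orders o
    ORDER BY o.id_order
    """
).fetchall()

for id_order, note in first_notes:
    print(id_order, note)
```

Order 2 has no message and still appears, with None (NULL) as its note.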

Buyers structure by registration date query optimisation

I would like to show buyers structure by their registration date e.g.:
H12016 10.000 buyers
from which
2.000 registered in H12014
4.000 registered in H22014
etc.
I have two queries for that:
Number 1 (buyers from H12016 (about 50k records)):
SELECT DISTINCT
r.idUsera as id_usera
FROM
rezerwacje r
WHERE
r.dataZalozenia between '2016-01-01' and '2016-07-01'
and r.`status` = 'zabookowana'
ORDER BY
id_usera
Number 2 (users_ids and their registration (insert) date (about 3,8M users)):
SELECT
m.user_id,
date(m.action_date) as data_insert
FROM
mwids m
WHERE
m.`type` = 'insert'
Both queries separately run fine, but when I try to combine them like so:
SELECT DISTINCT
r.idUsera as id_usera,
t1.data_insert
FROM
rezerwacje r
LEFT JOIN
(
SELECT
m.user_id,
date(m.action_date) as data_insert
FROM
mwids m
WHERE
m.`type` = 'insert'
) t1 ON t1.user_id = r.idUsera
WHERE
r.dataZalozenia between '2016-01-01' and '2016-07-01'
and r.`status` = 'zabookowana'
ORDER BY
id_usera
this query runs "indefinitely" and I have to kill it after some time.
I do not believe it should run that long. If query Number 2 were smaller, i.e. about 1M users, I could combine the results in Excel in a matter of seconds. So why is it not possible inside the database? What am I doing wrong?
SELECT DISTINCT
r.idUsera as id_usera,
t1.data_insert
FROM
rezerwacje r
INNER JOIN
(
SELECT
m.user_id,
date(m.action_date) as data_insert
FROM
mwids m
WHERE
m.`type` = 'insert'
) t1 ON t1.user_id = r.idUsera
WHERE
r.dataZalozenia between '2016-01-01' and '2016-07-01'
and r.`status` = 'zabookowana'
ORDER BY
id_usera
Try with INNER JOIN.
Query 1 needs
INDEX(status, dataZalozenia, idUsera)
Query 3: Rewrite thus:
If there is only one row in mwids for 'insert' per user:
SELECT r.idUsera as id_usera, DATE(m.action_date) AS data_insert
FROM rezerwacje r
LEFT JOIN mwids m ON m.user_id = r.idUsera
AND m.`type` = 'insert'
WHERE r.dataZalozenia >= '2016-01-01'
AND r.dataZalozenia < '2016-01-01' + INTERVAL 6 MONTH
and r.`status` = 'zabookowana'
ORDER BY r.idUsera
with
INDEX(status, dataZalozenia, idUsera) -- on r
INDEX(type, user_id, action_date) -- on m
If there can be multiple rows, do this:
SELECT r.idUsera as id_usera,
( SELECT DATE(m.action_date)
FROM mwids m
WHERE m.user_id = r.idUsera
AND m.`type` = 'insert'
LIMIT 1
) AS data_insert
FROM rezerwacje r
WHERE r.dataZalozenia >= '2016-01-01'
AND r.dataZalozenia < '2016-01-01' + INTERVAL 6 MONTH
and r.`status` = 'zabookowana'
ORDER BY r.idUsera
But you will be getting a random action_date. So maybe you want MIN() or MAX()?
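To illustrate the "multiple rows" case on toy data, here is a sketch using Python's sqlite3 module as a stand-in for MySQL (an assumption; the rows are invented). A plain join returns one output row per matching mwids row, while a scalar subquery with MIN() returns exactly one, deterministic, registration date per reservation.

```python
import sqlite3

# In-memory SQLite database with invented sample rows (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE rezerwacje (idUsera INTEGER, dataZalozenia TEXT, status TEXT);
    CREATE TABLE mwids (user_id INTEGER, action_date TEXT, type TEXT);
    INSERT INTO rezerwacje VALUES (1, '2016-02-01', 'zabookowana');
    -- The same user has two 'insert' rows.
    INSERT INTO mwids VALUES
        (1, '2014-03-05 10:00:00', 'insert'),
        (1, '2015-06-01 09:00:00', 'insert');
    """
)

# Join: the reservation is duplicated, once per matching mwids row.
joined = conn.execute(
    """
    SELECT r.idUsera, DATE(m.action_date) AS data_insert
    FROM rezerwacje r
    LEFT JOIN mwids m ON m.user_id = r.idUsera AND m.type = 'insert'
    ORDER BY data_insert
    """
).fetchall()

# Scalar subquery with MIN(): one row per reservation, earliest date.
earliest = conn.execute(
    """
    SELECT r.idUsera,
           (SELECT MIN(DATE(m.action_date))
            FROM mwids m
            WHERE m.user_id = r.idUsera AND m.type = 'insert') AS data_insert
    FROM rezerwacje r
    """
).fetchall()

print(joined)
print(earliest)
```

Swapping MIN() for MAX() would pick the latest 'insert' date instead; which one is right depends on what the data means.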

Combine query that relies on resultset of another

I run this query to get 20 random items from my WordPress database based on things like rating, category, etc.
SELECT (A.user_votes/A.user_voters) as site_rating, B.ID as post_id, B.post_author, B.post_date,E.name as category
FROM `wp_gdsr_data_article` as A
INNER JOIN `wp_posts` as B ON (A.post_id = B.id)
INNER JOIN wp_term_relationships C ON (B.ID = C.object_id)
INNER JOIN wp_term_taxonomy D ON (C.term_taxonomy_id = D.term_taxonomy_id)
INNER JOIN wp_terms E ON (D.term_id = E.term_id)
WHERE
B.post_type = 'post' AND
B.post_status = 'publish' AND
D.taxonomy='category' AND
E.name NOT IN ('Satire', 'Declined', 'Outfits','Unorganized', 'AP')
ORDER BY RAND()
LIMIT 20
Then, for each result of the random items, I want to find a corresponding item that is very similar to the random item (around the same rating) but not identical and also one the user has not seen:
SELECT ABS($site_rating-(A.user_votes/A.user_voters)) as diff, (A.user_votes/A.user_voters) as site_rating, B.ID as post_id, B.post_author, B.post_date,E.name as category ,IFNULL(F.count,0) as count
FROM `wp_gdsr_data_article` as A
INNER JOIN `wp_posts` as B ON (A.post_id = B.id)
INNER JOIN wp_term_relationships C ON (B.ID = C.object_id)
INNER JOIN wp_term_taxonomy D ON (C.term_taxonomy_id = D.term_taxonomy_id)
INNER JOIN wp_terms E ON (D.term_id = E.term_id)
LEFT JOIN (
SELECT *,COUNT(*) as count FROM `verus` WHERE ip = '{$_SERVER['REMOTE_ADDR']}'
) as F ON (A.post_id = F.post_id_winner OR A.post_id = F.post_id_loser)
WHERE
E.name = '$category' AND
B.ID <> '$post_id' AND
B.post_type = 'post' AND
B.post_status = 'publish' AND
D.taxonomy='category' AND
E.name NOT IN ('Satire', 'Declined', 'Outfits','Unorganized', 'AP')
ORDER BY count ASC, diff ASC
LIMIT 1
Where the following php variables refer to the result of the previous query
$post_id = $result['post_id'];
$category = $result['category'];
$site_rating = $result['site_rating'];
and $_SERVER['REMOTE_ADDR'] refers to the user's IP.
Is there a way to combine the first query with the 20 additional queries that need to be called to find corresponding items, so that I need just 1 or 2 queries?
Edit: Here is the view that simplifies the joins
CREATE VIEW `versus_random` AS
SELECT (A.user_votes/A.user_voters) as site_rating, B.ID as post_id, B.post_author, B.post_date,E.name as category
FROM `wp_gdsr_data_article` as A
INNER JOIN `wp_posts` as B ON (A.post_id = B.id)
INNER JOIN wp_term_relationships C ON (B.ID = C.object_id)
INNER JOIN wp_term_taxonomy D ON (C.term_taxonomy_id = D.term_taxonomy_id)
INNER JOIN wp_terms E ON (D.term_id = E.term_id)
WHERE
B.post_type = 'post' AND
B.post_status = 'publish' AND
D.taxonomy='category' AND
E.name NOT IN ('Satire', 'Declined', 'Outfits','Unorganized', 'AP')
My attempt now with the view:
SELECT post_id,
(
SELECT INNER_TABLE.post_id
FROM `versus_random` as INNER_TABLE
WHERE
INNER_TABLE.post_id <> OUTER_TABLE.post_id
ORDER BY (SELECT COUNT(*) FROM `versus` WHERE ip = '54' AND (INNER_TABLE.post_id = post_id_winner OR INNER_TABLE.post_id = post_id_loser)) ASC
LIMIT 1
) as innerquery
FROM `versus_random` as OUTER_TABLE
ORDER BY RAND()
LIMIT 20
However, the query just times out and freezes my MySQL server.
I think it should work like this, but I don't have a WordPress installation at hand to test it. The second query, which gets the related post, is embedded in the first query, where it returns just the related_post_id. The whole query is then turned into a subquery itself, given the alias 'X' (although you are free to use 'G' if you want to continue your alphabet).
In the outer query, the tables for posts and data-article are joined again (RA and RP) to query the relevant fields of the related post, based on the related_post_id from the inner query. These two tables are left joined (and in reverse order), so you still get the main post if no related post was found.
SELECT
X.site_rating,
X.post_id,
X.post_author,
X.post_date,
X.category,
RA.user_votes / RA.user_voters as related_post_site_rating,
RP.ID as related_post_id,
RP.post_author as related_post_author,
RP.post_date as related_post_date,
X.category as related_category -- the related post shares the main post's category (RE.name = E.name)
FROM
( SELECT
(A.user_votes/A.user_voters) as site_rating,
B.ID as post_id, B.post_author, B.post_date,E.name as category,
( SELECT
RB.ID as post_id
FROM `wp_gdsr_data_article` as RA
INNER JOIN `wp_posts` as RB ON (RA.post_id = RB.id)
INNER JOIN wp_term_relationships RC ON (RB.ID = RC.object_id)
INNER JOIN wp_term_taxonomy RD ON (RC.term_taxonomy_id = RD.term_taxonomy_id)
INNER JOIN wp_terms RE ON (RD.term_id = RE.term_id)
LEFT JOIN (
SELECT *,COUNT(*) as count FROM `verus` WHERE ip = '{$_SERVER['REMOTE_ADDR']}'
) as RF ON (RA.post_id = RF.post_id_winner OR RA.post_id = RF.post_id_loser)
WHERE
RE.name = E.name AND
RB.ID <> B.ID AND
RB.post_type = 'post' AND
RB.post_status = 'publish' AND
RD.taxonomy='category' AND
RE.name NOT IN ('Satire', 'Declined', 'Outfits','Unorganized', 'AP')
ORDER BY RF.count ASC, ABS((A.user_votes/A.user_voters) - (RA.user_votes/RA.user_voters)) ASC
LIMIT 1) as related_post_id
FROM `wp_gdsr_data_article` as A
INNER JOIN `wp_posts` as B ON (A.post_id = B.id)
INNER JOIN wp_term_relationships C ON (B.ID = C.object_id)
INNER JOIN wp_term_taxonomy D ON (C.term_taxonomy_id = D.term_taxonomy_id)
INNER JOIN wp_terms E ON (D.term_id = E.term_id)
WHERE
B.post_type = 'post' AND
B.post_status = 'publish' AND
D.taxonomy='category' AND
E.name NOT IN ('Satire', 'Declined', 'Outfits','Unorganized', 'AP')
ORDER BY RAND()
LIMIT 20
) X
LEFT JOIN `wp_posts` as RP ON RP.id = X.related_post_id
LEFT JOIN `wp_gdsr_data_article` as RA ON RA.post_id = RP.id
I can't test my proposal, so take it with a grain of salt. Anyway, I hope it can be a valid starting point for some of the issues faced.
I cannot imagine a solution that does not go through a temporary table to cache the expensive computations present in your queries. You may also want to avoid interfering with the randomization of the first phase. I try to clarify below.
I'll start with these rewritings:
-- first query
SELECT site_rating, post_id, post_author, post_date, category
FROM POSTS_COMMON
ORDER BY RAND()
LIMIT 20
-- second query
SELECT ABS(R.site_rating_A - R.site_rating_B) as diff, R.site_rating_B as site_rating, P.post_id, P.post_author, P.post_date, P.category, F.count
FROM POSTS_COMMON AS P
INNER JOIN POSTS_RATING_DIFFS AS R ON (P.post_id = R.post_id_B)
LEFT JOIN (
/* post_id_winner, post_id_loser explicited; COUNT(*) NULL treatment anticipated */
SELECT post_id_winner, post_id_loser, IFNULL(COUNT(*), 0) as count FROM `verus` WHERE ip = '{$_SERVER['REMOTE_ADDR']}'
) as F ON (P.post_id = F.post_id_winner OR P.post_id = F.post_id_loser)
WHERE
P.category = '$category'
AND R.post_id_A = '$post_id'
ORDER BY count ASC, diff ASC
LIMIT 1
with:
CREATE TABLE POSTS_RATING_DIFFS AS
SELECT A.post_id as post_id_A, B.post_id as post_id_B, A.site_rating as site_rating_A, B.site_rating as site_rating_B
FROM POSTS_COMMON as A, POSTS_COMMON as B
WHERE A.post_id <> B.post_id AND A.category = B.category
CREATE VIEW POSTS_COMMON AS
SELECT A.post_id as post_id, A.user_votes, A.user_voters, (A.user_votes / A.user_voters) as site_rating, B.post_author, B.post_date, E.name as category
FROM `wp_gdsr_data_article` as A
INNER JOIN `wp_posts` as B ON (A.post_id = B.ID)
INNER JOIN wp_term_relationships C ON (B.ID = C.object_id)
INNER JOIN wp_term_taxonomy D ON (C.term_taxonomy_id = D.term_taxonomy_id)
INNER JOIN wp_terms E ON (D.term_id = E.term_id)
WHERE
B.post_type = 'post' AND
B.post_status = 'publish' AND
D.taxonomy='category' AND
E.name NOT IN ('Satire', 'Declined', 'Outfits','Unorganized', 'AP')
POSTS_COMMON isolates a common view between the two queries.
With POSTS_RATING_DIFFS, a temporary table populated with the rating combinations and diffs, we have "the trick" of transforming the inequality join criterion on the post_id(s) into an equality one (see R.post_id_A = '$post_id' in the second query).
The temporary table also gives us precomputed ratings for the combinatorial explosion of A.post_id <> B.post_id (with equal post category), and it can be reused by other sessions.
Extracting the RAND() ordering into a temporary table could also be advantageous. In that case we could limit the rating combinations and diffs to the 20 randomly chosen posts.
In the original, limiting the dependent second-level query to a single row is done by means of the ORDER BY and LIMIT clauses.
The proposed solution avoids evaluating a LIMIT 1 on an ORDER BY result set in the second-level query, which becomes a subquery.
The single-row selection in the subquery is done by means of a WHERE criterion on the maximum of a single value computed from the columns used in the ORDER BY clause.
The combination into a single value must preserve the correct ordering. I'll leave it in pseudo-code as:
'<combination of count and diff>'
For example, combining the two values into a string, we could have:
CONCAT(LPAD(CAST(count AS CHAR), 10, '0'), LPAD(CAST(ABS(diff) AS CHAR), 20, '0'))
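Whether such a padded string really preserves the ordering depends on the shape of the values. The Python sketch below mimics the expression on made-up data, under the assumption (not stated in the original) that diff has first been scaled to a non-negative integer, e.g. thousandths of a rating point. It also shows how padding raw decimal strings of different lengths can break the ordering.

```python
def lpad(text: str, width: int, fill: str = "0") -> str:
    """Mimic MySQL LPAD for inputs shorter than width (no truncation here)."""
    return text.rjust(width, fill)


def combined_key(count: int, diff_thousandths: int) -> str:
    """Python analogue of CONCAT(LPAD(count, 10, '0'), LPAD(ABS(diff), 20, '0')).

    Assumes diff was scaled to an integer number of thousandths beforehand.
    """
    return lpad(str(count), 10) + lpad(str(abs(diff_thousandths)), 20)


# Hypothetical (count, diff in thousandths) pairs.
candidates = [(2, 150), (0, 900), (0, 75), (1, 5)]

# Equivalent of ORDER BY count ASC, diff ASC.
by_order_by = sorted(candidates)

# Ordering by the single combined string key.
by_key = sorted(candidates, key=lambda row: combined_key(*row))

# The row that ORDER BY count ASC, diff ASC LIMIT 1 would return.
best = min(candidates, key=lambda row: combined_key(*row))

# Pitfall: padding raw decimal strings of unequal length misorders them.
pitfall = lpad("0.75", 20) < lpad("0.5", 20)

print(by_key, best, pitfall)
```

With integer inputs of bounded width the string order matches the two-column order; with unscaled decimals it may not, so a fixed number of decimal places (or integer scaling) is needed before padding.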
The structure of the single query would be:
SELECT (Q_LVL_1.user_votes/Q_LVL_1.user_voters) as site_rating_LVL_1, Q_LVL_1.post_id as post_id_LVL_1
, Q_LVL_1.post_author as post_author_LVL_1, Q_LVL_1.post_date as post_date_LVL_1
, Q_LVL_1.category as category_LVL_1, Q_LVL_2.post_id as post_id_LVL_2
, Q_LVL_2.diff as diff_LVL_2, Q_LVL_2.site_rating as site_rating_LVL_2
, Q_LVL_2.post_author as post_author_LVL_2, Q_LVL_2.post_date as post_date_LVL_2
, Q_LVL_2.count
FROM POSTS_COMMON AS Q_LVL_1
, /* 1-row-selection query placed side by side for each Q_LVL_1's row */
(
SELECT CORE_P.post_id, CORE_P.ABS_diff as diff, P.site_rating, P.post_author, P.post_date, CORE_P.count
FROM POSTS_COMMON AS P
INNER JOIN (
SELECT FIRST(CORE_P.post_id) as post_id, ABS(CORE_P.diff) as ABS_diff, CORE_P.count
FROM (
/*
selection of posts with post_id(s) different from first level query,
not already taken and with the topmost value of
'<combination of count and diff>'
*/
) AS CORE_P
GROUP BY CORE_P.count, ABS(CORE_P.diff)
/* the one row selector */
) AS CORE_ONE_LINER ON P.post_id = CORE_ONE_LINER.post_id
) AS Q_LVL_2
ORDER BY RAND()
LIMIT 20
The CORE_P selection could return more than one post_id for the topmost value of '<combination of count and diff>', hence the use of GROUP BY and FIRST to reduce it to a single row. (MySQL has no FIRST() aggregate; MIN() is one possible substitute.)
This leads to a possible final implementation:
SELECT (Q_LVL_1.user_votes/Q_LVL_1.user_voters) as site_rating_LVL_1, Q_LVL_1.post_id as post_id_LVL_1
, Q_LVL_1.post_author as post_author_LVL_1, Q_LVL_1.post_date as post_date_LVL_1
, Q_LVL_1.category as category_LVL_1, Q_LVL_2.post_id as post_id_LVL_2
, Q_LVL_2.diff as diff_LVL_2, Q_LVL_2.site_rating as site_rating_LVL_2
, Q_LVL_2.post_author as post_author_LVL_2, Q_LVL_2.post_date as post_date_LVL_2
, Q_LVL_2.count
FROM POSTS_COMMON AS Q_LVL_1
, (
SELECT CORE_P.post_id, CORE_P.ABS_diff as diff, P.site_rating, P.post_author, P.post_date, CORE_P.count
FROM POSTS_COMMON AS P
INNER JOIN
(
SELECT FIRST(CORE_P.post_id) as post_id, ABS(CORE_P.diff) as ABS_diff, CORE_F.count
FROM (
SELECT CORE_RATING.post_id as post_id, ABS(CORE_RATING.diff) as ABS_diff, CORE_F.count
FROM (
SELECT post_id_B as post_id, site_rating_A - site_rating_B as diff
FROM POSTS_RATING_DIFFS
WHERE POSTS_RATING_DIFFS.post_id_A = Q_LVL_1.post_id
) as CORE_RATING
LEFT JOIN (
SELECT post_id_winner, post_id_loser, IFNULL(COUNT(*), 0) as count
FROM `verus`
WHERE ip = '{$_SERVER['REMOTE_ADDR']}'
) as CORE_F ON (CORE_RATING.post_id = CORE_F.post_id_winner OR CORE_RATING.post_id = CORE_F.post_id_loser)
WHERE
POSTS_RATING_DIFFS.post_id_A = Q_LVL_1.post_id
AND '<combination of CORE_F.count and CORE_RATING.diff>'
= MAX (
SELECT '<combination of CORE_F_2.count and CORE_RATING_2.diff>'
FROM (
SELECT site_rating_A - site_rating_B as diff
FROM POSTS_RATING_DIFFS
WHERE POSTS_RATING_DIFFS.post_id_A = Q_LVL_1.post_id
) as CORE_RATING_2
LEFT JOIN (
SELECT post_id_winner, post_id_loser, IFNULL(COUNT(*), 0) as count
FROM `verus`
WHERE ip = '{$_SERVER['REMOTE_ADDR']}'
) as CORE_F_2 ON (CORE_RATING_2.post_id = CORE_F_2.post_id_winner OR CORE_RATING_2.post_id = CORE_F_2.post_id_loser)
) /* END MAX */
) AS CORE_P
GROUP BY CORE_P.count, ABS(CORE_P.diff)
) AS CORE_ONE_LINER ON P.post_id = CORE_ONE_LINER.post_id
) AS Q_LVL_2
ORDER BY RAND()
LIMIT 20

mysql query taking long time to execute

This is my query, which takes 3 seconds to execute:
SELECT I.itemname,
I.overdue,
D.value,
I.itemid,
icd.ecwstatus AS status,
C.inactiveflag AS inactive,
icd.validfrom,
icd.validto
FROM items I
JOIN itemdetail D
ON I.itemtype = 'I'
AND I.itemid = D.itemid
AND D.propid = 13
LEFT OUTER JOIN icd
ON icd.code = D.value
LEFT OUTER JOIN edi_icdcodes C
ON I.itemid = C.itemid
WHERE I.deleteflag = 0
AND ( icd.validfrom <= '2012-12-06'
OR icd.validfrom IS NULL )
AND ( icd.validto >= '2012-12-06'
OR icd.validto IS NULL )
AND I.itemname LIKE 'A%'
AND ( I.keyname = 'Assessments' )
ORDER BY I.itemname ASC limit 0,6;
I have an index IX_items_itemType_deleteFlag_keyName_itemName on the columns itemType, deleteFlag, keyName, itemName in the items table, and I also have indexes on the other tables' columns that are used in the join and WHERE clauses.
So how can I improve the performance of this query?
Thanks
I would have an index on your items table based on the multiple key columns used in your WHERE clause and ORDER BY. I would put the column that yields the smallest result set in the front position. For example, you are specifically looking for "Assessments". If your table has 1 million records, and 600k of them are of item type "I", but only 5k are "Assessments", then having the most selective part up front might be better for your query to process.
I would have your:
items table indexed on ( keyname, itemtype, deleteflag, itemname )
ItemDetail table, indexed ON ( itemid, propid )
icd table indexed ON ( code, validfrom, validto, ecwstatus )
edi_icdcodes table index ON (itemid)
SELECT
I.itemname,
I.overdue,
D.value,
I.itemid,
icd.ecwstatus AS status,
C.inactiveflag AS inactive,
icd.validfrom,
icd.validto
FROM
items I
JOIN itemdetail D
ON I.itemid = D.itemid
AND D.propid = 13
LEFT OUTER JOIN icd
ON D.value = icd.code
AND ( icd.validfrom <= '2012-12-06'
OR icd.validfrom IS NULL )
AND ( icd.validto >= '2012-12-06'
OR icd.validto IS NULL )
LEFT OUTER JOIN edi_icdcodes C
ON I.itemid = C.itemid
WHERE
I.itemtype = 'I'
AND I.deleteflag = 0
AND I.keyname = 'Assessments'
AND I.itemname LIKE 'A%'
ORDER BY
I.itemname ASC
LIMIT
0,6;
Note: if the icd table always has a value for both the from and to dates when records are created, you won't need to test for NULL. I do understand why you had that test, given the LEFT JOIN combined with conditions in the WHERE clause. So that part might be simplified to
LEFT OUTER JOIN icd
ON D.value = icd.code
AND icd.validfrom <= '2012-12-06'
AND icd.validto >= '2012-12-06'
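Note that moving the validity test from the WHERE clause into the LEFT JOIN's ON clause is not purely cosmetic: the two forms can return different rows. The sketch below uses Python's sqlite3 module as a stand-in for MySQL (an assumption) with invented rows and only the columns needed: one code valid on the date, one expired, and one with no icd row at all.

```python
import sqlite3

# In-memory SQLite database with invented sample rows (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE itemdetail (itemid INTEGER, value TEXT);
    CREATE TABLE icd (code TEXT, validfrom TEXT, validto TEXT);
    INSERT INTO itemdetail VALUES (1, 'A00'), (2, 'B00'), (3, 'C00');
    -- A00 is valid on 2012-12-06, B00 has expired, C00 has no icd row.
    INSERT INTO icd VALUES
        ('A00', '2012-01-01', '2012-12-31'),
        ('B00', '2011-01-01', '2011-12-31');
    """
)

# Validity test in the ON clause: every itemdetail row survives.
filter_in_on = conn.execute(
    """
    SELECT D.itemid, icd.code
    FROM itemdetail D
    LEFT OUTER JOIN icd
      ON D.value = icd.code
     AND icd.validfrom <= '2012-12-06'
     AND icd.validto >= '2012-12-06'
    ORDER BY D.itemid
    """
).fetchall()

# Validity test in the WHERE clause (with the IS NULL escape hatch):
# the row whose icd entry exists but has expired is dropped.
filter_in_where = conn.execute(
    """
    SELECT D.itemid, icd.code
    FROM itemdetail D
    LEFT OUTER JOIN icd ON D.value = icd.code
    WHERE (icd.validfrom <= '2012-12-06' OR icd.validfrom IS NULL)
      AND (icd.validto >= '2012-12-06' OR icd.validto IS NULL)
    ORDER BY D.itemid
    """
).fetchall()

print(filter_in_on)
print(filter_in_where)
```

Item 2 appears with a NULL code in the first result and is missing from the second, so which form to use depends on whether items with an expired code should still be listed.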
What you are doing there is starting from a base table, doing the joins, and only afterwards applying the WHERE conditions. To go quicker, try to include your conditions in the joins. That way, the rows are filtered as part of the join rather than afterwards.
SELECT I.itemname,
I.overdue,
D.value,
I.itemid,
icd.ecwstatus AS status,
C.inactiveflag AS inactive,
icd.validfrom,
icd.validto
FROM items I
JOIN itemdetail D
ON I.itemtype = 'I'
AND I.itemid = D.itemid
AND D.propid = 13
AND I.deleteflag = 0
AND I.itemname LIKE 'A%'
AND I.keyname = 'Assessments'
LEFT OUTER JOIN icd
ON icd.code = D.value
AND ( icd.validfrom <= '2012-12-06'
OR icd.validfrom IS NULL )
AND ( icd.validto >= '2012-12-06'
OR icd.validto IS NULL )
LEFT OUTER JOIN edi_icdcodes C
ON I.itemid = C.itemid
ORDER BY I.itemname ASC limit 0,6;