I have written a query and it works. But now that all tables have about 100K rows, it returns too slowly. Can you please suggest how I can optimize it?
select *
from tbl_xray_information X
WHERE locationCode = (SELECT t.id
from tbl_location t
where CODE = '202')
AND ( communicate_with_pt is NULL || communicate_with_pt='')
AND x.patientID NOT IN (SELECT patientID
FROM tbl_gxp_information
WHERE center_id = '202')
order by insertedON desc LIMIT 2000
Please note here 'patientID' is varchar.
This may run faster:
SELECT *
FROM tbl_xray_information AS x   -- same alias case throughout; aliases are case-sensitive on Unix
WHERE x.locationCode =
      ( SELECT t.id
        FROM tbl_location t
        WHERE t.CODE = '202'
      )
  AND ( x.communicate_with_pt IS NULL
        OR x.communicate_with_pt = '' )
  AND NOT EXISTS ( SELECT 1
                   FROM tbl_gxp_information g
                   WHERE g.patientID = x.patientID
                     AND g.center_id = '202' )
ORDER BY x.insertedON DESC
LIMIT 2000
These indexes may help:
tbl_location: INDEX(CODE)
tbl_gxp_information: INDEX(center_id, patientID) -- (either order)
Since OR is poorly optimized, it may be better to pick either NULL or empty-string for communicate_with_pt (to avoid testing for both).
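One thing worth checking before swapping NOT IN for NOT EXISTS: the two forms only return the same rows when the subquery never yields a NULL patientID. The sketch below uses Python's built-in sqlite3 module with toy data (the rows and the helper name compare_anti_joins are made up for illustration; the question does not say whether patientID is nullable). The same NULL semantics apply in MySQL.

```python
import sqlite3

NOT_IN_SQL = """
    SELECT x.patientID
    FROM tbl_xray_information AS x
    WHERE x.patientID NOT IN (SELECT g.patientID
                              FROM tbl_gxp_information AS g
                              WHERE g.center_id = '202')
    ORDER BY x.patientID
"""

NOT_EXISTS_SQL = """
    SELECT x.patientID
    FROM tbl_xray_information AS x
    WHERE NOT EXISTS (SELECT 1
                      FROM tbl_gxp_information AS g
                      WHERE g.patientID = x.patientID
                        AND g.center_id = '202')
    ORDER BY x.patientID
"""


def compare_anti_joins(gxp_patient_ids):
    """Return (NOT IN result, NOT EXISTS result) for a toy data set."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE tbl_xray_information (patientID TEXT);
        CREATE TABLE tbl_gxp_information (patientID TEXT, center_id TEXT);
        INSERT INTO tbl_xray_information VALUES ('P1'), ('P2'), ('P3');
    """)
    conn.executemany(
        "INSERT INTO tbl_gxp_information (patientID, center_id) VALUES (?, '202')",
        [(pid,) for pid in gxp_patient_ids],
    )
    not_in = [row[0] for row in conn.execute(NOT_IN_SQL)]
    not_exists = [row[0] for row in conn.execute(NOT_EXISTS_SQL)]
    conn.close()
    return not_in, not_exists


# Without NULLs both forms agree; a single NULL empties the NOT IN result.
print(compare_anti_joins(["P1"]))
print(compare_anti_joins(["P1", None]))
```

If patientID can be NULL in tbl_gxp_information, the NOT EXISTS version is usually the one that matches the intent.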
I am trying to solve a problem where a customer id has multiple accounts under the same scheme. For a given transaction date I want to retrieve the total sanctioned limit and the total utilized amount across these accounts. Below is the SQL query I have constructed.
SELECT
cust_id,
tran_date,
rollover_date,
next_rollover,
(
SELECT
acc_num as kcc_ac
FROM
dbzsubvention.acc_disb_amt a
WHERE
(a.tran_date <= AB.tran_date)
AND a.sch_code = 'xxx'
AND a.cust_id = AB.cust_id
ORDER BY
a.tran_date desc
LIMIT
1
) KCC_ACC,
(
SELECT
SUM(kcc_prod)
FROM
(
SELECT
prod_limit as kcc_prod,
acc_num,
s.acc_status
FROM
dbzsubvention.acc_disb_amt a
inner join dbzsubvention.acc_rollover_all_sub_status s using (acc_num)
left join dbzsubvention.acc_close_date c using (acc_num)
WHERE
a.cust_id = AB.cust_id
AND a.tran_date <= AB.tran_date
AND (
ac_close > AB.tran_date || ac_close is null
)
AND a.sch_code = 'xxx'
AND s.acc_status = 'R'
AND s.rollover_date <= AB.tran_date
AND (
AB.tran_date < s.next_rollover || s.next_rollover is null
)
GROUP BY
acc_num
order by
a.tran_date
) t
) kcc_prod,
(
SELECT
sum(disb_amt)
FROM
(
SELECT
disb_amt,
acc_num,
tran_date
FROM
(
SELECT
disb_amt,
a.acc_num,
a.tran_date
FROM
dbzsubvention.acc_disb_amt a
inner join dbzsubvention.acc_rollover_all_sub_status s using (acc_num)
left join dbzsubvention.acc_close_date c using (acc_num)
WHERE
a.tran_date <= AB.tran_date
AND (
c.ac_close > AB.tran_date || c.ac_close is null
)
AND a.sch_code = 'xxx'
AND a.cust_id = AB.cust_id
AND s.acc_status = 'R'
AND s.rollover_date <= AB.tran_date
AND (
AB.tran_date < s.next_rollover || s.next_rollover is null
)
GROUP BY
acc_num,
a.tran_date
order by
a.tran_date desc
) t
GROUP BY
acc_num
) tt
) kcc_disb
FROM
dbzsubvention.acc_disb_amt AB
WHERE
AB.cust_id = 'abcdef'
group by
cust_id,
tran_date
order by
tran_date asc;
This query doesn't work. From what I have read, a correlated subquery can only reference the outer query one nesting level down, and I couldn't find a workaround.
Using SUM directly in the inner query does not give the desired results, because:
In the second subquery, it would sum all the values in the column before the GROUP BY is applied.
In the third subquery, the sorting has to happen first, then the grouping, and finally the sum.
Can anyone suggest a workaround?
You're correct: an outer column cannot be passed directly through more than one nesting level.
Try this workaround:
SELECT ... -- outer query
( -- correlated subquery nesting level 1
SELECT ...
( -- correlated subquery nesting level 2
SELECT ...
...
WHERE table0_level1.column0_1 ... -- moved value
)
FROM table1
-- move through nesting level making it a source of current level
CROSS JOIN ( SELECT table0.column0 AS column0_1 ) AS table0_level1
) AS ...,
...
FROM table0
...
WITH t as (
SELECT *
FROM scd p
WHERE p.modified_date > FROM_UNIXTIME(1593060230)
AND ( p.main_id = 1
OR FIND_IN_SET(1, p.mult_ids) <> 0 )
ORDER BY modified_date DESC
LIMIT 2 OFFSET 0
),
del as (
SELECT
*
FROM t WHERE (status <> 1 AND status <> 2)
),
w_del as (
SELECT
*
FROM t WHERE (status = 1 OR status = 2)
)
SELECT w_del.*, del.* FROM w_del,del;
How do I achieve this with ordinary subqueries? I am using MySQL 5.7, so I can't use CTEs. I'm getting a "can't reuse table" error if I use UNION or subqueries. Is there a way to achieve this without temporary tables?
Please help.
You can just plug in the code for each alias . . . and keep doing that until you are at the base tables:
SELECT w_del.*, del.*
FROM (SELECT t.*
      FROM (SELECT *
            FROM scd p
            WHERE p.modified_date > FROM_UNIXTIME(1593060230) AND
                  ( p.main_id = 1 OR FIND_IN_SET(1, p.mult_ids) <> 0 )
            ORDER BY modified_date DESC
            LIMIT 2 OFFSET 0
           ) t
      WHERE (status = 1 OR status = 2)       -- same filter as the w_del CTE
     ) w_del CROSS JOIN
     (SELECT t.*
      FROM (SELECT *
            FROM scd p
            WHERE p.modified_date > FROM_UNIXTIME(1593060230) AND
                  ( p.main_id = 1 OR FIND_IN_SET(1, p.mult_ids) <> 0 )
            ORDER BY modified_date DESC
            LIMIT 2 OFFSET 0
           ) t
      WHERE (status <> 1 AND status <> 2)    -- same filter as the del CTE
     ) del;
One critical point, though: The definition of t is using ORDER BY and LIMIT. If there are ties in the modified_date column, then the two subqueries could return different result sets. You have two choices to avoid a problem here:
Add additional keys to the ORDER BY to ensure that the sorting is stable (i.e. returns the same results each time because the combination of keys is unique).
Materialize the subquery using a temporary table.
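The first option can be sketched as follows, using Python's built-in sqlite3 module and a toy scd table. The id column is an assumption (any unique column would do), and the main_id / FIND_IN_SET filter is left out because SQLite has no FIND_IN_SET. With a unique tie-breaker in the ORDER BY, both copies of the inlined subquery pick the same two rows even though three rows share the newest modified_date.

```python
import sqlite3

# Inlined definition of "t"; "id DESC" is the assumed unique tie-breaker.
T_SQL = "SELECT * FROM scd ORDER BY modified_date DESC, id DESC LIMIT 2"


def split_latest_rows(rows):
    """Return (ids matching status 1 or 2, ids matching neither) among the rows of t."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE scd (id INTEGER PRIMARY KEY, modified_date TEXT, status INTEGER)"
    )
    conn.executemany("INSERT INTO scd VALUES (?, ?, ?)", rows)
    w_del = [r[0] for r in conn.execute(
        f"SELECT t.id FROM ({T_SQL}) AS t "
        "WHERE t.status = 1 OR t.status = 2 ORDER BY t.id")]
    del_ = [r[0] for r in conn.execute(
        f"SELECT t.id FROM ({T_SQL}) AS t "
        "WHERE t.status <> 1 AND t.status <> 2 ORDER BY t.id")]
    conn.close()
    return w_del, del_


sample_rows = [
    (1, "2020-06-26 10:00:00", 1),
    (2, "2020-06-26 10:00:00", 3),
    (3, "2020-06-26 10:00:00", 2),  # three-way tie on modified_date
    (4, "2020-06-25 09:00:00", 1),
]
print(split_latest_rows(sample_rows))
```

Because t always resolves to ids 3 and 2 here, the two filters partition the same pair of rows rather than two different pairs.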
Here is my query
SELECT file_id, file_name, file_date, file_email
FROM (SELECT *
FROM `file`
ORDER BY file_date DESC
) AS t
WHERE file_domains = ''
GROUP BY file_name
ORDER BY file_date DESC
LIMIT 0 , 100
The primary key is file_id and there is an index on file_name. The table has about 900k records.
The query takes about 2 seconds on my local computer.
Is there any way to optimize it?
Thanks in advance.
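Your query relies on non-standard MySQL behaviour (strictly, one non-standard and one semi-standard feature): it selects non-aggregated columns that are not in the GROUP BY, and it expects the ORDER BY inside the derived table to decide which row each group keeps. There is no guarantee that it will keep working in future versions of MySQL, once the optimizer is clever enough to see that the subquery is redundant.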
Add an index on (file_domains, file_name, file_date) and try this version:
SELECT f.file_id, f.file_name, f.file_date, f.file_email
FROM
`file` AS f
JOIN
( SELECT file_name
, MAX(file_date) AS max_file_date
FROM `file`
WHERE file_domains = ''
GROUP BY file_name
ORDER BY max_file_date DESC
LIMIT 0 , 100
) AS fm
ON fm.file_name = f.file_name
AND fm.max_file_date = f.file_date
WHERE f.file_domains = ''   -- keep the original filter on the joined rows too
ORDER BY f.file_date DESC ;
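To see what the join-against-MAX pattern returns, here is a small sketch using Python's built-in sqlite3 module. The rows and the helper name latest_per_name are invented for illustration, and the outer file_domains filter is included so that a row with the same name and date but a non-empty file_domains is not picked up by the join.

```python
import sqlite3

LATEST_PER_NAME_SQL = """
    SELECT f.file_id, f.file_name, f.file_date, f.file_email
    FROM file AS f
    JOIN ( SELECT file_name, MAX(file_date) AS max_file_date
           FROM file
           WHERE file_domains = ''
           GROUP BY file_name
           ORDER BY max_file_date DESC
           LIMIT 100
         ) AS fm
      ON fm.file_name = f.file_name
     AND fm.max_file_date = f.file_date
    WHERE f.file_domains = ''
    ORDER BY f.file_date DESC
"""


def latest_per_name(rows):
    """Return the newest row per file_name among rows with an empty file_domains."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE file (
            file_id INTEGER PRIMARY KEY,
            file_name TEXT, file_date TEXT, file_email TEXT, file_domains TEXT
        )
    """)
    conn.executemany("INSERT INTO file VALUES (?, ?, ?, ?, ?)", rows)
    result = conn.execute(LATEST_PER_NAME_SQL).fetchall()
    conn.close()
    return result


sample_files = [
    (1, "a.txt", "2020-01-01", "x@example.com", ""),
    (2, "a.txt", "2020-03-01", "y@example.com", ""),
    (3, "b.txt", "2020-02-01", "z@example.com", ""),
    (4, "b.txt", "2020-04-01", "w@example.com", "example.org"),  # filtered out
    (5, "b.txt", "2020-02-01", "v@example.com", "example.org"),  # same date, filtered out
]
print(latest_per_name(sample_files))
```

If two rows of the same file_name share the maximum file_date and both have an empty file_domains, this pattern returns both of them; whether that matters depends on the data.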
This intermediate query:
SELECT *
FROM `file`
ORDER BY file_date DESC
It fetches all 900k records and sorts them by date, which is likely to be slow.
Suppose equity has a column called TickerID. I would like to replace the 111's with equity.TickerID. MySQL can't seem to resolve the scope and returns an unknown column when I try that. This SQL statement works but I need to run it for each ticker. Would be nice if I could get a full table.
SELECT Ticker,
IF(tbl_m200.MA200_Count = 200,tbl_m200.MA200,-1) AS MA200,
IF(tbl_m50.MA50_Count = 50,tbl_m50.MA50,-1) AS MA50,
IF(tbl_m20.MA20_Count = 20,tbl_m20.MA20,-1) AS MA20
FROM equity
INNER JOIN
(SELECT TickerID,AVG(Y.Close) AS MA200,COUNT(Y.Close) AS MA200_Count FROM
(
SELECT Close,TickerID FROM equity_pricehistory_daily
WHERE TickerID = 111
ORDER BY Timestamp DESC LIMIT 0,200
) AS Y
) AS tbl_m200
USING(TickerID)
INNER JOIN
(SELECT TickerID,AVG(Y.Close) AS MA50,COUNT(Y.Close) AS MA50_Count FROM
(
SELECT Close,TickerID FROM equity_pricehistory_daily
WHERE TickerID = 111
ORDER BY Timestamp DESC LIMIT 50
) AS Y
) AS tbl_m50
USING(TickerID)
INNER JOIN
(SELECT TickerID,AVG(Y.Close) AS MA20,COUNT(Y.Close) AS MA20_Count FROM
(
SELECT Close,TickerID FROM equity_pricehistory_daily
WHERE TickerID = 111
ORDER BY Timestamp DESC LIMIT 0,20
) AS Y
) AS tbl_m20
USING(TickerID)
This seems to be a limitation of MySQL rather than a mistake in your query: a derived table (a subquery in the FROM clause) cannot reference columns of the outer tables. Many people seem to run into the same problem.
Anyway... You could create functions that retrieve the information you want:
DROP FUNCTION IF EXISTS AveragePriceHistory_20;
CREATE FUNCTION AveragePriceHistory_20(MyTickerID INT)
RETURNS DECIMAL(9,2) READS SQL DATA
RETURN (
SELECT AVG(Y.Close)
FROM (
SELECT Z.Close
FROM equity_pricehistory_daily Z
WHERE Z.TickerID = MyTickerID
ORDER BY Timestamp DESC
LIMIT 20
) Y
HAVING COUNT(*) = 20
);
SELECT
E.TickerID,
E.Ticker,
AveragePriceHistory_20(E.TickerID) AS MA20
FROM equity E;
You would get NULL instead of -1. If this is undesirable, you could wrap the function-call with IFNULL(...,-1).
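Another way of solving this would be to select by time frame instead of using LIMIT. Note that this is not quite the same thing: the last 20 calendar days will typically contain fewer than 20 price rows if the table has no rows for weekends and holidays.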
SELECT
E.TickerID,
E.Ticker,
(
SELECT AVG(Y.Close)
FROM equity_pricehistory_daily Y
WHERE Y.TickerID = E.TickerID
AND Y.Timestamp > ADDDATE(CURRENT_TIMESTAMP, INTERVAL -20 DAY)
) AS MA20
FROM equity E;
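A further option that keeps the "last N rows" meaning and needs neither a function nor an outer reference inside a derived table is to keep only the rows that have fewer than N newer rows for the same ticker. The sketch below runs the idea with Python's built-in sqlite3 module on toy data. It assumes Timestamp is unique per ticker, uses a named parameter :n where MySQL would take a literal, and returns NULL when a ticker has fewer than N rows; the correlated COUNT(*) can be slow on large tables, so treat it as a starting point only.

```python
import sqlite3

MOVING_AVG_SQL = """
    SELECT e.TickerID, e.Ticker,
           CASE WHEN COUNT(r.Close) = :n THEN AVG(r.Close) END AS MA
    FROM equity AS e
    LEFT JOIN ( SELECT p.TickerID, p.Close
                FROM equity_pricehistory_daily AS p
                -- keep a row only if fewer than :n rows of the same ticker are newer
                WHERE ( SELECT COUNT(*)
                        FROM equity_pricehistory_daily AS newer
                        WHERE newer.TickerID = p.TickerID
                          AND newer.Timestamp > p.Timestamp ) < :n
              ) AS r ON r.TickerID = e.TickerID
    GROUP BY e.TickerID, e.Ticker
    ORDER BY e.TickerID
"""


def moving_averages(n):
    """Return (TickerID, Ticker, average of the last n closes or None) per ticker."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE equity (TickerID INTEGER PRIMARY KEY, Ticker TEXT);
        CREATE TABLE equity_pricehistory_daily (
            TickerID INTEGER, Timestamp TEXT, Close REAL
        );
        INSERT INTO equity VALUES (1, 'AAA'), (2, 'BBB');
        INSERT INTO equity_pricehistory_daily VALUES
            (1, '2020-01-01', 10.0), (1, '2020-01-02', 20.0),
            (1, '2020-01-03', 30.0), (1, '2020-01-04', 40.0),
            (2, '2020-01-03', 5.0),  (2, '2020-01-04', 7.0);
    """)
    result = conn.execute(MOVING_AVG_SQL, {"n": n}).fetchall()
    conn.close()
    return result


print(moving_averages(3))
```

With n = 3, ticker AAA averages its three newest closes and BBB, which has only two rows, comes back as NULL (None in Python); wrapping the CASE in IFNULL(..., -1) would give the -1 from the question.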
I have the following tables in my game's database:
rankedUp (image_id, user_id, created_at)
globalRank (image_id, rank )
matchups (user_id, image_id1, image_id2)
All image_ids in globalRank table are assigned a rank which is a float from 0 to 1
Assuming I have the current logged in user's "user_id" value, I'm looking for a query that will return a pair of image ids (imageid1, imageid2) such that:
imageid1 has lower rank than imageid2 but is also the next highest rank less than imageid2
matchups table doesn't have (userid,imageid1,imageid2) or (userid,imageid2,imageid1)
rankedup table doesn't have (userid,imageid1) or if it does, the createdat column is older than X hours
What I have so far for requirement 1 is this:
SELECT lowerImages.image_id AS lower_image, higherImages.image_id AS higher_image
FROM global_rank AS lowerImages, global_rank AS higherImages
WHERE lowerImages.rank < higherImages.rank
AND lowerImages.image_id = (
SELECT image_id
FROM (
SELECT image_id
FROM global_rank
WHERE rank < higherImages.rank
ORDER BY rank DESC
LIMIT 1 , 1
) AS tmp
)
but it doesn't work because I can't reference higherImages.rank in the subquery.
Does anyone know how I could satisfy all of those requirements in one query?
Thanks for your help
EDIT:
I now have this query but I don't know about the efficiency and I need to test it for correctness:
SELECT lowerImages.image_id AS lower_image,
max(higherImages.image_id) AS higher_image
FROM global_rank AS lowerImages, global_rank AS higherImages
WHERE lowerImages.rank < higherImages.rank
AND 1 NOT IN (select 1 from ranked_up where
lowerImages.image_id = ranked_up.image_id
AND ranked_up.user_id = $user_id
AND ranked_up.created_at > DATE_SUB(NOW(), INTERVAL 1 DAY))
AND 1 NOT IN (
SELECT 1 from matchups where user_id = $user_id
AND lower_image_id = lowerImages.image_id
AND higher_image_id = higherImages.image_id
UNION
SELECT 1 from matchups where user_id = $user_id
AND lower_image_id = higherImages.image_id
AND higher_image_id = lowerImages.image_id
)
GROUP BY 1
The NOT IN subqueries I'm using all hit indexes, so they should run fast. The efficiency problem I have is with the GROUP BY and the self-join of the global_rank table.
This question is a revision of Pretty Complex SQL Query, which should no longer be answered.
select
(
select globalRank.image_id from
rankedup inner join globalRank
on rankedup.image_id = globalRank.image_id
where user_id = XXX
order by globalRank.rank desc   -- a scalar subquery may return one column only
limit 0, 1
) as highest,
(
select globalRank.image_id from
rankedup inner join globalRank
on rankedup.image_id = globalRank.image_id
where user_id = XXX
order by globalRank.rank desc
limit 1, 1
) as secondhighest
I normally use SQL Server, but I think this is the translation for MySQL :)
This should do the trick:
SELECT lowerImages.*, higherImages.*
FROM globalrank AS lowerImages, globalrank AS higherImages
WHERE lowerImages.rank < higherImages.rank
  AND lowerImages.image_id = (
        -- single-level correlated subquery: it can see higherImages,
        -- which a derived table nested inside it could not
        SELECT image_id
        FROM globalrank
        WHERE rank < higherImages.rank
        ORDER BY rank DESC
        LIMIT 1                       -- the next lower rank, no offset
      )
  AND NOT EXISTS (
        SELECT * FROM matchups
        WHERE user_id = $user_id
          AND ((image_id1 = lowerImages.image_id AND image_id2 = higherImages.image_id)
            OR (image_id2 = lowerImages.image_id AND image_id1 = higherImages.image_id))
      )
  AND NOT EXISTS (
        -- skip images this user ranked up recently (adjust the interval to X hours)
        SELECT * FROM rankedup
        WHERE user_id = $user_id
          AND image_id = lowerImages.image_id
          AND created_at > DATE_SUB(NOW(), INTERVAL 1 DAY)
      )
ORDER BY higherImages.rank
I'm assuming the PKs of matchups and rankedup include all columns in those tables. This would allow the last two subqueries to use the PK indexes. You would probably want an ordered index on globalrank.rank to speed up the first subquery.
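Requirement 1 on its own (pair each image with the image of the next lower rank) can be checked with a small sketch using Python's built-in sqlite3 module. The data and the helper name adjacent_pairs are invented, and the matchups and rankedUp filters are left out to keep it short. The column name rank is double-quoted here for SQLite; in MySQL it would be quoted with backticks, since newer MySQL versions reserve RANK.

```python
import sqlite3

PAIRS_SQL = """
    SELECT lo.image_id AS lower_image, hi.image_id AS higher_image
    FROM globalRank AS hi
    JOIN globalRank AS lo
      ON lo.image_id = ( SELECT g.image_id
                         FROM globalRank AS g
                         WHERE g."rank" < hi."rank"
                         ORDER BY g."rank" DESC
                         LIMIT 1 )
    ORDER BY hi."rank"
"""


def adjacent_pairs(ranks):
    """Return (lower_image, higher_image) pairs of neighbouring ranks."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE globalRank (image_id INTEGER PRIMARY KEY, "rank" REAL)')
    conn.executemany("INSERT INTO globalRank VALUES (?, ?)", ranks)
    pairs = conn.execute(PAIRS_SQL).fetchall()
    conn.close()
    return pairs


sample_ranks = [(10, 0.1), (20, 0.5), (30, 0.9), (40, 0.7)]
print(adjacent_pairs(sample_ranks))
```

The image with the lowest rank has no lower neighbour, so it never appears as higher_image.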