Convert NOT IN query to better performance - mysql

I'm using MySQL 5.0, and I need to fine-tune this query. Can anyone tell me what tuning I can do on it?
SELECT DISTINCT(alert_master_id) FROM alert_appln_header
WHERE created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY)
AND alert_master_id NOT IN (
SELECT DISTINCT(alert_master_id) FROM alert_details
WHERE end_date IS NULL AND created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY)
UNION
SELECT DISTINCT(alert_master_id) FROM alert_sara_header
WHERE sara_master_id IN
(SELECT alert_sara_master_id FROM alert_sara_lines
WHERE end_date IS NULL) AND created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY)
) LIMIT 5000;

The first thing that I'd do is rewrite the subqueries as joins:
SELECT h.alert_master_id
FROM alert_appln_header h
JOIN schedule_config c
ON c.schedule_name = 'Purging_Config'
LEFT JOIN alert_details d
ON d.alert_master_id = h.alert_master_id
AND d.end_date IS NULL
AND d.created_date < CURRENT_DATE - INTERVAL c.parameters DAY
LEFT JOIN (
alert_sara_header s
JOIN alert_sara_lines l
ON l.alert_sara_master_id = s.sara_master_id
-- in the original query end_date is filtered on alert_sara_lines
AND l.end_date IS NULL
)
ON s.alert_master_id = h.alert_master_id
AND s.created_date < CURRENT_DATE - INTERVAL c.parameters DAY
WHERE h.created_date < CURRENT_DATE - INTERVAL c.parameters DAY
AND d.alert_master_id IS NULL
AND s.alert_master_id IS NULL
GROUP BY h.alert_master_id
LIMIT 5000
If it's still slow after that, re-examine your indexing strategy. I'd suggest indexes over:
alert_appln_header(alert_master_id,created_date)
schedule_config(schedule_name)
alert_details(alert_master_id,end_date,created_date)
alert_sara_header(alert_master_id,created_date,sara_master_id)
alert_sara_lines(alert_sara_master_id,end_date)
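As a rough sketch of how two of these could be created and checked: the index names (idx_...) are made up for illustration, and the EXPLAIN probe is a cut-down version of the join above, not the full purge query. The remaining entries in the list follow the same pattern.

```sql
-- Hypothetical index names; the column lists come from the suggestions above.
CREATE INDEX idx_aah_master_created
    ON alert_appln_header (alert_master_id, created_date);
CREATE INDEX idx_ad_master_end_created
    ON alert_details (alert_master_id, end_date, created_date);

-- Simplified anti-join probe: inspect the "key" column of the output
-- to see whether the new indexes are actually chosen.
EXPLAIN
SELECT h.alert_master_id
FROM alert_appln_header h
LEFT JOIN alert_details d
       ON d.alert_master_id = h.alert_master_id
      AND d.end_date IS NULL
WHERE d.alert_master_id IS NULL
LIMIT 5000;
```

If EXPLAIN still reports a full scan on one of the joined tables, that table's index is the one to revisit first.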

OK, this may be just a shot in the dark, but I think you don't need as many DISTINCTs here.
SELECT DISTINCT(alert_master_id) FROM alert_appln_header
WHERE created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY)
AND alert_master_id NOT IN (
-- removed distinct here --
SELECT alert_master_id FROM alert_details
WHERE end_date IS NULL AND created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY)
UNION
-- removed distinct here --
SELECT alert_master_id FROM alert_sara_header
WHERE sara_master_id IN
(SELECT alert_sara_master_id FROM alert_sara_lines
WHERE end_date IS NULL)
AND created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY)
) LIMIT 5000;
DISTINCT can be costly, so avoid it where it changes nothing. The outer WHERE clause only checks that an id is NOT in the subquery result, so it doesn't matter if some ids appear in that result more than once. Note that a plain UNION removes duplicates as well, so UNION ALL may be worth trying for the same reason.

Related

MySQL - Using COALESCE with DATE_ADD and DATE_SUB to get next/previous record

I am trying to query MySQL to select the previous and next record. I need help in using COALESCE and DATE_ADD/DATE_SUB together.
SELECT * from `Historical` where `DeltaH` = 'ALTF' and `Date`=
COALESCE(DATE_SUB('2019-01-21', INTERVAL 1 DAY),
DATE_SUB('2019-01-21',INTERVAL 2 DAY),
DATE_SUB('2019-01-21', INTERVAL 3 DAY));
I cannot use the primary key because rows in the table are/will be deleted. The date column also does not necessarily have fixed dates; what I want to find is the next earlier/later date.
SELECT * from `Historical` where `DeltaH` = 'ALTF' and `Date`=
DATE_SUB('2019-01-21', INTERVAL 3 DAY);
The above query seems to work. However, I need to query for INTERVAL 1 DAY first and, in case that date does not exist, move on to INTERVAL 2 DAY, and so on.
select * from `Historical` where `DeltaH` = 'ALTF' and `Date`=
DATE_SUB('2019-01-21', INTERVAL COALESCE(1,2,3,4,5) DAY);
This one does not work either. I understand that the COALESCE() function returns the first non-null value, however I am not able to get it to work using the above query. I have confirmed that data exists for 2019-01-18 but is not being selected. Can you please advise?
I am OK with using an alternate solution.
You can use a subquery to find the most recent date in the table that is less than 2019-01-21 e.g.
SELECT *
FROM `Historical`
WHERE `DeltaH` = 'ALTF' AND `Date`= (SELECT MAX(`Date`)
FROM `Historical`
WHERE `DeltaH` = 'ALTF' AND `Date` < '2019-01-21')
To find the closest date that is later, we just adapt the query slightly, using MIN and >:
SELECT *
FROM `Historical`
WHERE `DeltaH` = 'ALTF' AND `Date`= (SELECT MIN(`Date`)
FROM `Historical`
WHERE `DeltaH` = 'ALTF' AND `Date` > '2019-01-21')
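As for why the COALESCE attempts in the question find nothing: COALESCE returns its first non-NULL argument, and DATE_SUB on a valid date is never NULL, so the first argument always wins whether or not a row exists for that date. A quick check that needs no table at all (the column aliases are just for illustration):

```sql
-- COALESCE never looks at the table; it only inspects its own arguments.
SELECT COALESCE(DATE_SUB('2019-01-21', INTERVAL 1 DAY),
                DATE_SUB('2019-01-21', INTERVAL 2 DAY),
                DATE_SUB('2019-01-21', INTERVAL 3 DAY)) AS picked_date;
-- picked_date is 2019-01-20

SELECT COALESCE(1, 2, 3, 4, 5) AS picked_interval;
-- picked_interval is 1
```

Both original attempts therefore compare `Date` to 2019-01-20 only, which is why the 2019-01-18 row is never selected.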
FWIW, I'd write this differently...
SELECT x.*
FROM Historical x
JOIN
( SELECT deltah
, MAX(date) date
FROM Historical
WHERE date < '2019-01-21'
GROUP
BY deltah
) y
ON y.deltah = x.deltah
AND y.date = x.date
WHERE x.deltah = 'ALTF';
This seems like the simplest method:
select h.*
from historical h
where h.DeltaH = 'ALTF' and
h.Date < '2019-01-21'
order by h.Date DESC
limit 1
For best performance, you want an index on (DeltaH, Date).
If you want both the date before and after:
(select h.*
from historical h
where h.DeltaH = 'ALTF' and
h.Date < '2019-01-21'
order by h.Date desc
limit 1
) union all
(select h.*
from historical h
where h.DeltaH = 'ALTF' and
h.Date > '2019-01-21'
order by h.Date asc
limit 1
);
I'm not sure whether one or both comparisons should include =, so that you can also get results on that date itself.

How to change format of the MySQL result?

I have a complex MySQL query with several subqueries, and its final result is as below. What I can't solve is the way the result is presented: how can I change its structure so that everything appears in a single row, without the NULL fields? I mean something like below.
This is the MySQL query:
select count(*) as userRetentionSameDay, null as 'userRetentionDiffDay' from (SELECT date(`timestamp`), `user_xmpp_login`
FROM table1
WHERE DATE(`timestamp` ) = CURDATE() - INTERVAL 1 DAY) as res1
right join (select date(ts), user
from table2
WHERE DATE(ts ) = CURDATE() - INTERVAL 1 DAY
and product_id REGEXP ("^(europe+$" )) as lej1
on lej1.user = res1.`user_xmpp_login`
where res1.`user_xmpp_login` IS not NULL
union all
select null as 'userRetentionSameDay', count(*) as userRetentionDiffDay from (SELECT date(`timestamp`), `user_xmpp_login`
FROM table1
WHERE DATE(`timestamp` ) = CURDATE() - INTERVAL 1 DAY) as res1
right join (select date(ts), user
from table2
WHERE DATE(ts ) = CURDATE() - INTERVAL 1 DAY
and product_id REGEXP ("^(europe+$" )) as lej2
on lej2.user = res1.`user_xmpp_login`
where res1.`user_xmpp_login` IS NULL;
What are the recommended solutions to doing that?
Try this:
SELECT A.userRetentionSameDay,B.userRetentionDiffDay FROM (
SELECT COUNT(*) AS userRetentionSameDay FROM
(
SELECT DATE(timestamp), user_xmpp_login
FROM table1
WHERE DATE(timestamp ) = CURDATE() - INTERVAL 1 DAY) AS res1
RIGHT JOIN (SELECT DATE(ts), USER
FROM table2
WHERE DATE(ts ) = CURDATE() - INTERVAL 1 DAY
AND product_id REGEXP ("^(europe+$" )) AS lej1
ON lej1.user = res1.user_xmpp_login
WHERE res1.user_xmpp_login IS NOT NULL
) A,
(
SELECT COUNT(*) AS userRetentionDiffDay FROM (
SELECT DATE(timestamp), user_xmpp_login
FROM table1
WHERE DATE(timestamp ) = CURDATE() - INTERVAL 1 DAY
) AS res1
RIGHT JOIN (SELECT DATE(ts), USER
FROM table2
WHERE DATE(ts ) = CURDATE() - INTERVAL 1 DAY
AND product_id REGEXP ("^(europe+$" )
) AS lej2
ON lej2.user = res1.user_xmpp_login
WHERE res1.user_xmpp_login IS NULL
) B;
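Another option, offered only as a sketch, is to join once and count both cases with conditional sums. The inline rows below are made-up stand-ins for the two derived tables (lej for the table2 side, res1 for the table1 side), so the statement runs without any real tables:

```sql
-- In MySQL a comparison evaluates to 1 or 0, so SUM() counts the matches.
SELECT SUM(res1.user_xmpp_login IS NOT NULL) AS userRetentionSameDay,
       SUM(res1.user_xmpp_login IS NULL)     AS userRetentionDiffDay
FROM (SELECT 'a' AS user UNION ALL SELECT 'b' UNION ALL SELECT 'c') AS lej
LEFT JOIN (SELECT 'a' AS user_xmpp_login UNION ALL SELECT 'c') AS res1
       ON lej.user = res1.user_xmpp_login;
-- one row: userRetentionSameDay = 2, userRetentionDiffDay = 1
```

With the real derived tables in place of the toy rows, each subquery is evaluated once instead of twice. If the table2 side can be empty, the sums come back as NULL, so wrapping them in COALESCE(..., 0) may be needed.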

MYSQL - How do I show 0 if there are no results?

I have the below query, which is working fine and shows any missed calls on a dashboard. However, if there are no results it returns no data. How do I show '0' if there are no results? Thanks
SELECT DialledNumber, COUNT(*) As Missed
FROM CALLS
WHERE destination like '%!%' AND datetime BETWEEN CURDATE() AND DATE_ADD(CURDATE(), INTERVAL 1 day)
GROUP By DialledNumber
HAVING (DialledNumber = '500') OR (DialledNumber = '580') OR (DialledNumber = '515') OR (DialledNumber = '513') OR (DialledNumber = '514')
You can use the IFNULL function in MySQL:
SELECT DialledNumber, IFNULL(COUNT(DialledNumber),0) As Missed
FROM CALLS
WHERE destination like '%!%' AND datetime BETWEEN CURDATE() AND DATE_ADD(CURDATE(), INTERVAL 1 day)
GROUP By DialledNumber
HAVING (DialledNumber = '500') OR (DialledNumber = '580') OR (DialledNumber = '515') OR (DialledNumber = '513') OR (DialledNumber = '514')
You want to use left join rather than in:
SELECT dn.dn, COUNT(c.diallednumber) As Missed
FROM (select '500' as dn union all select '580' union all select '515' union all
select '513' union all select '514'
) dn left join
CALLS c
ON c.diallednumber = dn.dn and
c.destination like '%!%' AND
c.datetime BETWEEN CURDATE() AND DATE_ADD(CURDATE(), INTERVAL 1 day)
GROUP By dn.dn;
As a small note, in American English the word "dialed" has only one "l"; "dialled" is the British spelling.
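To see why the left join version counts c.diallednumber rather than using COUNT(*), here is a minimal sketch on made-up rows: two missed calls for '500' and none for '580'. The inline derived table c merely stands in for the filtered CALLS table.

```sql
SELECT dn.dn,
       COUNT(c.DialledNumber) AS Missed,      -- ignores the NULLs of unmatched rows
       COUNT(*)               AS RowsInGroup  -- counts the NULL-extended row too
FROM (SELECT '500' AS dn UNION ALL SELECT '580') AS dn
LEFT JOIN (SELECT '500' AS DialledNumber UNION ALL SELECT '500') AS c
       ON c.DialledNumber = dn.dn
GROUP BY dn.dn;
-- 500: Missed = 2, RowsInGroup = 2
-- 580: Missed = 0, RowsInGroup = 1
```

The list of numbers drives the result, so every number gets a row, and counting a column from the right-hand table turns "no match" into 0.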
This could work for you (untested):
SELECT CALLS.DialledNumber, IFNULL(MISSEDCALLS.Missed, 0) AS Missed FROM CALLS
LEFT JOIN (
SELECT DialledNumber, COUNT(*) As Missed FROM CALLS
WHERE destination like '%!%' AND datetime BETWEEN CURDATE() AND DATE_ADD(CURDATE(), INTERVAL 1 day)
GROUP By DialledNumber
) AS MISSEDCALLS ON MISSEDCALLS.DialledNumber = CALLS.DialledNumber
WHERE CALLS.DialledNumber IN ('500', '580', '515', '513', '514')

Error in SQL CASE in where statement during select

I have a query that selects 3 random items from a database table, but I need to apply some more logic to it based on the value of a field in the query.
This is what I have so far; I hope it makes sense. I have not fully tested it yet, and figured I would run it past you first to see if anything jumps out.
SELECT ord.id, keyword, url, daily_max
FROM orders AS ord
LEFT JOIN product_tasks AS tsk ON tsk.id = ord.task_id
LEFT JOIN product_groups AS grp ON grp.id = tsk.product_group
WHERE (
status = 'approved' AND
ord.total_actions_today < tsk.daily_max AND
grp.id = 1 AND
country_code = '$country' AND
(
CASE WHEN daily_max >= 5 THEN last_displayed < (NOW() - INTERVAL 30 MINUTE)
CASE WHEN daily_max >10 THEN last_displayed < (NOW() - INTERAVAL 5 MINUTE)
ELSE last_displayed < (NOW() - INTERAVAL 60 MINUTE)
)
)
GROUP BY ord.id
ORDER BY RAND()
LIMIT 3
Just tested the query and as I suspected I have my syntax wrong so any help would be appreciated:
#1064 - You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'CASE WHEN daily_max >10 THEN last_displayed < (NOW() - INTERAVAL 5 MINUTE) ' at line 14
Edit - Fixed
(After several attempts + edits):
SELECT ord.id, keyword, url, daily_max
FROM orders AS ord
LEFT JOIN product_tasks AS tsk ON tsk.id = ord.task_id
LEFT JOIN product_groups AS grp ON grp.id = tsk.product_group
WHERE (
status = 'approved' AND
ord.total_actions_today < tsk.daily_max AND
grp.id = 1 AND
country_code = 'us'
AND last_displayed <
-- the > 10 test comes first: CASE stops at the first matching branch
CASE WHEN (daily_max > 10) THEN (NOW() - INTERVAL 5 MINUTE)
WHEN (daily_max >= 5) THEN (NOW() - INTERVAL 30 MINUTE)
ELSE (NOW() - INTERVAL 60 MINUTE)
END
)
GROUP BY ord.id
ORDER BY RAND()
LIMIT 3
http://sqlfiddle.com/#!2/ac4499/1
Solved. D'oh, I was missing a ) in the WHERE clause.
Move last_displayed out of the CASE, so that it is compared against the single value that the CASE expression returns:
WHERE...
AND last_displayed <
CASE WHEN (daily_max > 10) THEN (NOW() - INTERVAL 5 MINUTE)
WHEN (daily_max >= 5) THEN (NOW() - INTERVAL 30 MINUTE)
ELSE (NOW() - INTERVAL 60 MINUTE)
END;
Also note a couple of typos: it is INTERVAL, not INTERAVAL, and only one CASE keyword is required. The > 10 test has to come before the >= 5 test, because CASE returns the first branch that matches and any value above 10 also satisfies >= 5.
SqlFiddle here
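One thing to watch with a searched CASE like this: it returns the first branch whose condition is true, so the order of the WHEN clauses matters. A minimal sketch on made-up daily_max values (3, 7 and 12), returning the number of minutes instead of a timestamp so the result is easy to read:

```sql
SELECT t.daily_max,
       -- ">= 5" first: it also catches 12, so the "> 10" branch is never reached
       CASE WHEN t.daily_max >= 5 THEN 30
            WHEN t.daily_max > 10 THEN 5
            ELSE 60
       END AS minutes_wide_first,
       -- "> 10" first: each range gets its own interval
       CASE WHEN t.daily_max > 10 THEN 5
            WHEN t.daily_max >= 5 THEN 30
            ELSE 60
       END AS minutes_narrow_first
FROM (SELECT 3 AS daily_max UNION ALL SELECT 7 UNION ALL SELECT 12) AS t;
-- daily_max  3: 60 and 60
-- daily_max  7: 30 and 30
-- daily_max 12: 30 and 5
```

So if rows with daily_max above 10 are meant to use the 5-minute window, that test needs to be listed before the >= 5 test.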
The structure of your WHERE clause was not correct. Try the following:
SELECT ord.id, keyword, url, daily_max
FROM orders AS ord
LEFT JOIN product_tasks AS tsk ON tsk.id = ord.task_id
LEFT JOIN product_groups AS grp ON grp.id = tsk.product_group
WHERE (
status = 'approved' AND
ord.total_actions_today < tsk.daily_max AND
grp.id = 1 AND
country_code = '$country' AND
(
CASE WHEN daily_max >10 THEN (NOW() - INTERVAL 5 MINUTE)
WHEN daily_max >= 5 THEN (NOW() - INTERVAL 30 MINUTE)
ELSE (NOW() - INTERVAL 60 MINUTE)
END > last_displayed
)
)
GROUP BY ord.id
ORDER BY RAND()
LIMIT 3

mysql join same table different result set

I would like to combine different results from the same table as one big result.
SELECT host_name,stats_avgcpu,stats_avgmem,stats_avgswap,stats_avgiowait
FROM sar_stats,sar_hosts,sar_appgroups,sar_environments
WHERE stats_host = host_id
AND host_environment = env_id
AND env_name = 'Staging 2'
AND host_appgroup = group_id
AND group_name = 'Pervasive'
AND DATE(stats_report_time) = DATE_SUB(curdate(), INTERVAL 1 DAY)
SELECT AVG(stats_avgcpu),AVG(stats_avgmem),AVG(stats_avgswap),AVG(stats_avgiowait)
FROM sar_stats
WHERE stats_id = "stat_id of the first query" and DATE(stats_report_time)
BETWEEN DATE_SUB(curdate(), INTERVAL 8 DAY) and DATE_SUB(curdate(), INTERVAL 1 DAY)
SELECT AVG(stats_avgcpu),AVG(stats_avgmem),AVG(stats_avgswap),AVG(stats_avgiowait)
FROM sar_stats
WHERE stats_id = "stat_id of the first query" and DATE(stats_report_time)
BETWEEN DATE_SUB(curdate(), INTERVAL 31 DAY) and DATE_SUB(curdate(), INTERVAL 1 DAY)
Desired output would be something like ...
host_name|stats_avgcpu|stats_avgmem|stats_avgswap|stats_avgiowait|7daycpuavg|7daymemavg|7dayswapavg|7dayiowaitavg|30daycpuavg|30daymemavg|....etc
SQL Fiddle
http://sqlfiddle.com/#!8/4930b/3
This seems to be what you want. I rewrote the first query with explicit ANSI JOIN syntax, then attached each of the two averaging queries as a derived table, grouped by stats_host and LEFT JOINed on that field:
SELECT s.stats_host,
h.host_name,
s.stats_avgcpu,
s.stats_avgmem,
s.stats_avgswap,
s.stats_avgiowait,
s7.7dayavgcpu,
s7.7dayavgmem,
s7.7dayavgswap,
s7.7dayavgiowait,
s30.30dayavgcpu,
s30.30dayavgmem,
s30.30dayavgswap,
s30.30dayavgiowait
FROM sar_stats s
INNER JOIN sar_hosts h
on s.stats_host = h.host_id
INNER JOIN sar_appgroups a
on h.host_appgroup = a.group_id
and a.group_name = 'Pervasive'
INNER JOIN sar_environments e
on h.host_environment = e.env_id
and e.env_name = 'Staging 2'
LEFT JOIN
(
SELECT s.stats_host,
AVG(s.stats_avgcpu) AS '7dayavgcpu',
AVG(s.stats_avgmem) AS '7dayavgmem',
AVG(s.stats_avgswap) AS '7dayavgswap',
AVG(s.stats_avgiowait) AS '7dayavgiowait'
FROM sar_stats s
WHERE DATE(stats_report_time) BETWEEN DATE_SUB(curdate(), INTERVAL 8 DAY) AND DATE_SUB(curdate(), INTERVAL 1 DAY)
GROUP BY s.stats_host
) s7
on s.stats_host = s7.stats_host
LEFT JOIN
(
SELECT s.stats_host,
AVG(s.stats_avgcpu) AS '30dayavgcpu',
AVG(s.stats_avgmem) AS '30dayavgmem',
AVG(s.stats_avgswap) AS '30dayavgswap',
AVG(s.stats_avgiowait) AS '30dayavgiowait'
FROM sar_stats s
WHERE DATE(s.stats_report_time) BETWEEN DATE_SUB(curdate(), INTERVAL 31 DAY) AND DATE_SUB(curdate(), INTERVAL 1 DAY)
GROUP BY s.stats_host
) s30
on s.stats_host = s30.stats_host
WHERE DATE(s.stats_report_time) = DATE_SUB(curdate(), INTERVAL 1 DAY);
see SQL Fiddle with Demo