I am trying to optimize this query. It currently takes 28 seconds.
The AS keywords used to be missing from my query. After adding them, query time dropped by 20%.
SELECT
g.id,
g.adresid,
g.senaryoid,
g.olayid,
g.gonderilecegitarih
FROM
(
SELECT
adresid
FROM
expose2.800_emsenaryolar_emgidenbulten
WHERE
olayid = '3320'
) AS s
RIGHT JOIN expose2.800_emsenaryolar_emgidenbulten AS g ON s.adresid = g.adresid
WHERE
s.adresid IS NULL
AND g.olayid = '2784'
AND g.durum = '1'
AND g.gonderilecegitarih < DATE_SUB(
'2015-05-13 15:40:15',
INTERVAL 1 DAY
)
If you move the olayid = '3320' condition out of the derived table and into the join's ON clause, the server may no longer have to materialize the subquery before joining. The s.adresid IS NULL test has to stay in the outer WHERE clause, because the alias s does not exist inside the subquery:
SELECT
g.id,
g.adresid,
g.senaryoid,
g.olayid,
g.gonderilecegitarih
FROM expose2.800_emsenaryolar_emgidenbulten AS g
LEFT JOIN expose2.800_emsenaryolar_emgidenbulten AS s ON s.adresid = g.adresid
AND s.olayid = '3320'
WHERE
s.adresid IS NULL
AND g.olayid = '2784'
AND g.durum = '1'
AND g.gonderilecegitarih < DATE_SUB(
'2015-05-13 15:40:15',
INTERVAL 1 DAY
)
This is still a self join, just without the derived table.
For added speed, add this composite index to g:
INDEX(olayid, durum, gonderilecegitarih)
Please provide SHOW CREATE TABLE 800_emsenaryolar_emgidenbulten; I want to verify that you also have an index on adresid.
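As a rough sketch, the indexes discussed above could be added as follows. The index names are made up for illustration, and the second statement is only worth running if SHOW CREATE TABLE shows no existing index that starts with adresid:

```sql
-- Index names (idx_...) are hypothetical; pick names that fit your schema.
-- Serves the filters on g: equality on olayid and durum, then a range on the date.
ALTER TABLE expose2.`800_emsenaryolar_emgidenbulten`
    ADD INDEX idx_olay_durum_tarih (olayid, durum, gonderilecegitarih);

-- Lets the anti-join look up candidate rows in s by adresid.
ALTER TABLE expose2.`800_emsenaryolar_emgidenbulten`
    ADD INDEX idx_adresid (adresid);
```

Running EXPLAIN on the query afterwards should show whether the optimizer actually picks these indexes.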
I have a query where I am basically doing a left outer join and checking if the joined value is null
select count(T1.code)
from ( select code
from asset
where type = 'meter'
and creation_time <= '2022-04-29 00:00:00'
and (deactivation_time > '2022-04-28 00:00:00' or deactivation_time is null )
group by code
) as T1
left join ( select asset_code
from amr_midnight_data
where server_time between '2022-04-28 00:00:00' and '2022-04-29 00:00:00'
group by asset_code
) as T2 on T1.code = T2.asset_code
Where T2.asset_code is null;
This query takes 3 seconds to execute, but if I replace the is null at the end with is not null, it takes less than a second. Why is there a performance difference here, and what alternatives do I have to make my original query faster?
Look at the EXPLAIN. A guess... Changing to IS NOT NULL lets the Optimizer change LEFT JOIN to JOIN, which lets it start with amr_midnight_data which might optimize better.
I think that the LEFT JOIN ( SELECT ... ) .. IS [NOT] NULL can be replaced with
WHERE [NOT] EXISTS ( SELECT 1 FROM amr_midnight_data
WHERE asset_code = T1.code
AND server_time >= '2022-04-28'
AND server_time < '2022-04-28' + INTERVAL 1 DAY )
That would benefit from INDEX(asset_code, server_time) on amr_midnight_data.
EXISTS is faster than SELECT .. GROUP BY because it can stop as soon as one matching row is found.
asset would probably benefit from INDEX(type, creation_time) or (to make it "covering"):
INDEX(type, creation_time, deactivation_time, code)
If you wish to discuss further, please provide SHOW CREATE TABLE for both tables and EXPLAIN for each SELECT.
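Put together, the NOT EXISTS variant of the original query might look like the sketch below. It reuses the table and column names from the question; the alias d is introduced here only for readability. Note that the half-open range excludes rows stamped exactly '2022-04-29 00:00:00', which the original BETWEEN included:

```sql
SELECT COUNT(T1.code)
FROM (
    SELECT code
    FROM asset
    WHERE type = 'meter'
      AND creation_time <= '2022-04-29 00:00:00'
      AND (deactivation_time > '2022-04-28 00:00:00' OR deactivation_time IS NULL)
    GROUP BY code
) AS T1
WHERE NOT EXISTS (
    -- The probe can stop at the first row found for this asset and day.
    SELECT 1
    FROM amr_midnight_data AS d
    WHERE d.asset_code = T1.code
      AND d.server_time >= '2022-04-28'
      AND d.server_time <  '2022-04-28' + INTERVAL 1 DAY
);
```

Whether this beats the LEFT JOIN form depends on the data and the indexes, so compare the EXPLAIN output of both.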
When I run this query, it takes 1.2421 seconds on average, which I think is slow. I have added an index on every column used in the WHERE clause. Is there any further improvement I can make to speed up this query? The largest table is eav, which has around 111276 rows.
SELECT SQL_CALC_FOUND_ROWS eav.entid,
ent.entname
FROM eav,
ent,
catatt ca
WHERE eav.entid = ent.entid
AND ent.status = 'active'
AND eav.status = 'active'
AND eav.attid = ca.attid
AND ca.catid = 1
AND eav.catid = 1
AND ( ca.canviewby <= 6
|| ( ent.addedby = 87
AND canviewby <= 6 ) )
AND ( ( eav.attid = 13
AND ( `char` = '693fafba093bfa35118995860e340dce' ) )
OR ( eav.attid = 3
AND `double` = 6 )
OR ( eav.attid = 45
AND ( `int` = 191 ) ) )
GROUP BY eav.entid
HAVING Count(*) >= 3
(Screenshots attached to the question: EXPLAIN output; catatt table indexes; eav table indexes; ent table indexes.)
I have simplified your query to understand it better: I removed a redundant condition from the WHERE clause and rewrote the implicit joins as explicit ones.
Please run this query, post the results in a comment, and we can debug it under my answer:
SELECT
SQL_CALC_FOUND_ROWS
eav.entid,
ent.entname
FROM
eav
INNER JOIN ent ON (eav.entid = ent.entid AND ent.status = 'active')
INNER JOIN catatt ON (eav.attid = catatt.attid AND catatt.catid = 1)
WHERE
eav.catid = 1 AND eav.status = 'active'
AND catatt.canviewby <= 6 -- the original OR branch (addedby = 87 AND canviewby <= 6) is already implied by this
AND
(
(eav.attid = 13 AND eav.`char` = '693fafba093bfa35118995860e340dce')
OR
(eav.attid = 3 AND eav.`double` = 6)
OR
(eav.attid = 45 AND eav.`int` = 191)
)
GROUP BY eav.entid
HAVING COUNT(eav.entid) > 2
+ I also see that you rarely UPDATE these tables (data is mostly inserted), so you could try switching the tables' engine to MyISAM
+ create compound indexes on eav for these column pairs: (attid, char), (attid, double), (attid, int)
+ take a look at MySQL's configuration and tune it for better memory usage and query caching (note that the query cache was removed in MySQL 8.0)
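A minimal sketch of the compound indexes suggested above. The index names are hypothetical, and the statement assumes `char` is a CHAR or VARCHAR column (a TEXT column would need a prefix length):

```sql
-- Index names are made up for illustration.
-- Each index matches one branch of the OR: equality on attid plus equality on a value column.
ALTER TABLE eav
    ADD INDEX idx_attid_char   (attid, `char`),
    ADD INDEX idx_attid_double (attid, `double`),
    ADD INDEX idx_attid_int    (attid, `int`);
```

With these in place, EXPLAIN may show an index_merge or a range scan on eav instead of a scan over a single-column index.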
I have a 3rd party plugin that displays events, however for some reason whenever there is an event with multiple days, it stops showing the event when the current day is past the start date, even though the end date is still in the future.
The MySQL query appears to be trying to return these with the BETWEEN part after the OR at the end but it never does. I'm not familiar enough with MySQL to see what's wrong I guess.
For instance the row I'm expecting it to return here contains:
published=1
dates=2014-04-17
enddates=2014-04-19
SQL:
SELECT a.*
FROM eventlist_events AS a
WHERE a.published = 1
AND a.dates >= '2014-04-18'
AND (DATE(a.dates) = DATE_ADD('2014-04-18',INTERVAL 0 DAY)
OR ( a.enddates IS NOT NULL
AND (DATE_ADD('2014-04-18',INTERVAL 0 DAY)
BETWEEN DATE(a.dates) AND DATE(a.enddates))) )
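The condition a.dates >= '2014-04-18' excludes every event that started before the current day, so the BETWEEN branch can never match a multi-day event that is already in progress. To return the events whose date range contains 2014-04-18: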
SELECT a.* FROM eventlist_events AS a
WHERE a.published = 1
AND a.enddates IS NOT NULL
AND '2014-04-18' BETWEEN DATE(a.dates) AND DATE(a.enddates)
If you also want to fetch events whose a.enddates is NULL (single-day events), then
SELECT a.* FROM eventlist_events AS a
WHERE a.published = 1
-- AND a.enddates IS NOT NULL
AND '2014-04-18'
BETWEEN DATE( a.dates )
AND DATE( IFNULL( a.enddates, a.dates ) )
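The same range test can also be written without wrapping a.dates in DATE(), which leaves that column bare so an index on it can be considered. This is a sketch that assumes dates and enddates are DATE or DATETIME columns:

```sql
SELECT a.*
FROM eventlist_events AS a
WHERE a.published = 1
  -- The event started on or before the target day.
  AND a.dates < '2014-04-18' + INTERVAL 1 DAY
  -- The event ends on or after the target day; single-day events fall back to dates.
  AND COALESCE(a.enddates, a.dates) >= '2014-04-18';
```

For the example row (dates = 2014-04-17, enddates = 2014-04-19) both conditions hold, so the event is returned.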
I'm trying to select values from one table, but I need to check a value in another table.
I created this code, but only get 1 result and I don't know why:
SELECT `id` FROM `mc_region`
WHERE `is_subregion` = 'false'
AND lastseen < CURDATE() - INTERVAL 20 DAY
AND (SELECT id_region FROM mc_region_flags
WHERE flag <> 'expire'
AND id_region = mc_region.id
)
LIMIT 0, 30
What have I done wrong?
#Edit
I think I know why this code is not working. Not every record in the primary table has a flag in the mc_region_flags table.
I would like to do the following:
1. Select all records from the first table that are not subregions and whose lastseen is more than 20 days ago.
2. Check whether any of those records has the flag 'expire'; if so, exclude it from the result.
Can't I do this in a single SQL statement?
#Edit2
I created this code to simulate a FULL JOIN, but it seems that the WHERE clause is not working:
SELECT *
FROM mc_region AS r RIGHT OUTER JOIN
mc_region_flags AS f ON r.id = f.id_region
UNION ALL
SELECT * from
mc_region AS r LEFT OUTER JOIN
mc_region_flags AS f
ON r.id = f.id_region
WHERE r.is_subregion = 'false'
AND f.flag = 'exipre'
AND r.lastseen < CURDATE() - INTERVAL 20 DAY
Problems: the WHERE clause does not work:
f.flag is not 'expire'
f.lastseen is not > 20 days
UPDATED
SELECT *
FROM `mc_region` AS r LEFT JOIN
`mc_region_flags` AS f ON r.`id` = f.`id_region`
WHERE r.`is_subregion` = 'false' AND
r.`lastseen` < CURDATE() - INTERVAL 20 DAY AND
COALESCE(f.`flag`, '-') <> 'expire'
LIMIT 0, 30;
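If a region can have several rows in mc_region_flags (an assumption, since the schema is not shown), the LEFT JOIN produces one row per flag, so a region that has 'expire' plus some other flag would still come back through its other flag. A NOT EXISTS sketch that excludes such regions entirely:

```sql
SELECT r.`id`
FROM `mc_region` AS r
WHERE r.`is_subregion` = 'false'
  AND r.`lastseen` < CURDATE() - INTERVAL 20 DAY
  -- Keep the region only if none of its flags is 'expire';
  -- regions with no flag rows at all pass this test too.
  AND NOT EXISTS (
      SELECT 1
      FROM `mc_region_flags` AS f
      WHERE f.`id_region` = r.`id`
        AND f.`flag` = 'expire'
  )
LIMIT 0, 30;
```

This also returns each region at most once, without needing DISTINCT.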
Before the nested SELECT, add id IN, so that the subquery is used as a list of ids to match:
id IN (SELECT ...)
My problem is that we run one SELECT and then, for each row it returns, we run 4 different SQL queries (which is madness). As you can guess, this adds up to a lot of queries, and the system that uses this is very slow.
SELECT
deal_source.id,
deal_source.source_name,
deal_source.spider_status,
spider.last_success_date
FROM deal_source
JOIN spider
ON deal_source.id = spider.deal_source_id
Then for each row of this query we make:
$total_query = "SELECT count(id) as total
FROM spider_log
WHERE deal_source_id = '$deal_source_id'
AND date_format(date_created, '%Y-%m-%d') = '$lastdate' ";
$added_query = "SELECT count(id) as added
FROM spider_log
WHERE deal_source_id = '$deal_source_id'
AND action = 'added'
AND date_format(date_created, '%Y-%m-%d') = '$lastdate' ";
$extended_query = "SELECT count(id) as extended
FROM spider_log
WHERE deal_source_id = '$deal_source_id'
AND action = 'extended'
AND date_format(date_created, '%Y-%m-%d') = '$lastdate' ";
$duplicate_query = "SELECT count(id) as duplicate
FROM spider_log
WHERE deal_source_id = '$deal_source_id'
AND action = 'duplicate'
AND date_format(date_created, '%Y-%m-%d') = '$lastdate' ";
All four counts can be computed in the same query with conditional aggregation:
SELECT d.id,
d.source_name,
d.spider_status,
s.last_success_date,
COUNT(l.id) AS total,
SUM(l.id IS NOT NULL AND l.action='added' ) AS added,
SUM(l.id IS NOT NULL AND l.action='extended' ) AS extended,
SUM(l.id IS NOT NULL AND l.action='duplicate') AS duplicate
FROM deal_source d
JOIN spider s
ON s.deal_source_id = d.id
LEFT JOIN spider_log l
ON l.deal_source_id = d.id
AND l.date_created >= s.last_success_date
AND l.date_created < s.last_success_date + INTERVAL 1 DAY
GROUP BY d.id
Some points:
You can optimize performance of each query, using EXPLAIN and the careful adding of indexes.
You can combine all the queries to a big one, so you don't have to hit the database with a lot of queries.
Besides the large number of queries, the date_format(date_created, '%Y-%m-%d') = '$lastdate' condition is a performance killer, because it applies a function (DATE_FORMAT()) to a column (date_created). No index on that column can be used, and the function is called thousands or millions of times (once for every row examined). Change such conditions, wherever they are in your code, to:
( date_created >= DATE('$lastdate')
AND date_created < DATE('$lastdate') + INTERVAL 1 DAY
)
or even better, if that $lastdate is a date, to:
( date_created >= '$lastdate'
AND date_created < '$lastdate' + INTERVAL 1 DAY
)
and better still, if date_created is a DATE column, to:
date_created = '$lastdate'
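As an illustration of how the range condition and an index can work together, here is a sketch for a single source and day. The index name, the id 42, and the date are placeholders, and the column order assumes equality on deal_source_id followed by a range on date_created:

```sql
-- Hypothetical index: equality column first, range column second.
ALTER TABLE spider_log
    ADD INDEX idx_source_created (deal_source_id, date_created);

-- One pass over the matching rows replaces the four per-row COUNT queries.
SELECT COUNT(*)                                   AS total,
       COALESCE(SUM(action = 'added'), 0)         AS added,
       COALESCE(SUM(action = 'extended'), 0)      AS extended,
       COALESCE(SUM(action = 'duplicate'), 0)     AS duplicate
FROM spider_log
WHERE deal_source_id = 42                         -- placeholder id
  AND date_created >= '2015-01-01'                -- placeholder day
  AND date_created <  '2015-01-01' + INTERVAL 1 DAY;
```

The COALESCE calls are there because SUM returns NULL when no rows match, whereas the original COUNT queries would have returned 0.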