How to speed up the SQL query execution time in MySQL database?

Below is my SQL query.
SELECT `left_table`.`right_table_id`, MAX(left_table.add_time) AS max_add_time
FROM `left_table`
LEFT JOIN `right_table` ON `left_table`.`right_table_id` = `right_table`.`id`
WHERE left_table.add_time <= NOW()
AND (
(right_table.some_id = 1 AND right_table.category != -2)
OR
(right_table.another_id = 1 AND right_table.category != -1)
) AND NOT(right_table.category = -3)
AND NOT(right_table.category = -4)
GROUP BY `right_table_id`
ORDER BY `max_add_time` DESC, `left_table`.`id` DESC
LIMIT 12
It takes 5356.6ms to execute this query, which is too long for me. I have tried repeatedly to speed it up, without success. How can I improve the execution time of the above query?

Hmmm . . . I would start by writing the logic like this:
SELECT COUNT(DISTINCT lt.`right_table_id`)
FROM `left_table` lt LEFT JOIN
`right_table` rt
ON lt.`right_table_id` = rt.`id`
WHERE lt.add_time <= NOW() AND
((rt.some_id = 1 AND rt.category <> -2) OR
(rt.another_id = 1 AND rt.category <> -1)
) AND
rt.category NOT IN (-3, -4);
There might be additional simplifications, depending on whether lt.right_table_id always matches a row in the right table (or is NULL). And various other considerations.
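One such simplification can be checked on toy data. Because the WHERE clause tests rt columns with ordinary comparisons, any left_table row without a match (all rt columns NULL) is filtered out anyway, so the LEFT JOIN behaves like an INNER JOIN. The sketch below uses SQLite through Python as a stand-in for MySQL; the sample rows are invented, and the add_time <= NOW() condition is left out because SQLite has no NOW().

```python
import sqlite3

# SQLite stands in for MySQL; table and column names follow the question,
# the sample rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE right_table (id INTEGER PRIMARY KEY, some_id INT, another_id INT, category INT);
CREATE TABLE left_table  (id INTEGER PRIMARY KEY, right_table_id INT, add_time TEXT);
INSERT INTO right_table VALUES (1, 1, 0, 0), (2, 0, 1, -3), (3, 1, 0, -2);
INSERT INTO left_table VALUES
  (10, 1,    '2020-01-01'),
  (11, 2,    '2020-01-02'),
  (12, 3,    '2020-01-03'),
  (13, 99,   '2020-01-04'),   -- no matching right_table row
  (14, NULL, '2020-01-05');   -- NULL foreign key
""")

query = """
SELECT lt.right_table_id, MAX(lt.add_time) AS max_add_time
FROM left_table lt {join} right_table rt ON lt.right_table_id = rt.id
WHERE ((rt.some_id = 1 AND rt.category <> -2) OR
       (rt.another_id = 1 AND rt.category <> -1))
  AND rt.category NOT IN (-3, -4)
GROUP BY lt.right_table_id
ORDER BY max_add_time DESC
"""

# Same statement, once with each join type.
left_rows = conn.execute(query.format(join="LEFT JOIN")).fetchall()
inner_rows = conn.execute(query.format(join="INNER JOIN")).fetchall()
print(left_rows)
print(inner_rows)
```

Both statements return the same rows here, which suggests writing the join as INNER JOIN so the optimizer is free to start from either table.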

Related

SQL query with a major NOT IN not working

Does anyone know what's wrong with this query?
This works perfectly on its own:
SELECT * FROM
(SELECT * FROM data WHERE site = '".$id."'
AND disabled = '0'
AND carvotes NOT LIKE '0'
AND (time > ( now( ) - INTERVAL 14 DAY ))
GROUP BY car ORDER BY carvotes DESC LIMIT 0 , 10)
X order by time DESC
So does this:
SELECT * FROM data WHERE site = '".$id."' AND disabled = '0' GROUP BY car DESC ORDER BY time desc LIMIT 0 , 30
But combining them like this:
SELECT * FROM data WHERE site = '".$id."' AND disabled = '0' AND car NOT IN (SELECT * FROM
(SELECT * FROM data WHERE site = '".$id."'
AND disabled = '0'
AND carvotes NOT LIKE '0'
AND (time > ( now( ) - INTERVAL 14 DAY ))
GROUP BY car ORDER BY carvotes DESC LIMIT 0 , 10)
X order by time DESC) GROUP BY car DESC ORDER BY time desc LIMIT 0 , 30
Gives errors. Any ideas?
Please try the following...
$result = mysqli_query( $con,
"SELECT *
FROM data
WHERE site = '" . $id .
"' AND disabled = '0'
AND car NOT IN ( SELECT car
FROM ( SELECT car,
carvotes
FROM data
WHERE site = '" . $id .
"' AND disabled = '0'
AND carvotes NOT LIKE '0'
AND ( time > ( NOW( ) - INTERVAL 14 DAY ) )
GROUP BY car
ORDER BY carvotes DESC
LIMIT 10 ) X
)
GROUP BY car
ORDER BY time DESC
LIMIT 30" );
The main cause of your problem is that with car NOT IN ( SELECT * FROM ( SELECT *... you are trying to compare each record's value of car with each row returned by your subquery. IN requires you to have the same number of fields on both sides of the comparison. By using SELECT * at both levels of the subquery you were ensuring that the right side of the comparison had however many fields are in data versus your single field on the left, which confused MySQL.
Since you are aiming to compare to a single field, namely car, our subquery has to select just the car field from its dataset. Since the sort order of the subquery's results has no effect upon the IN comparison, and since our innermost query will be returning just car, I have removed the outer level of the subquery.
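The column-count rule can be reproduced in isolation. The sketch below uses SQLite through Python as a stand-in, with a cut-down, made-up version of the data table; MySQL reports the same mistake as "Operand should contain 1 column(s)", while SQLite words its error differently.

```python
import sqlite3

# SQLite stands in for MySQL; the three-column "data" table is a simplified,
# hypothetical version of the one in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE data (car TEXT, carvotes INT, site INT);
INSERT INTO data VALUES ('audi', 5, 1), ('bmw', 0, 1), ('fiat', 2, 1);
""")

def run(sql):
    """Return the rows, or the error message if the statement is rejected."""
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.Error as exc:
        return f"error: {exc}"

# One column on the left of NOT IN, three columns on the right: rejected.
multi_column = run(
    "SELECT car FROM data WHERE car NOT IN (SELECT * FROM data WHERE carvotes > 0)"
)
# One column on each side: accepted.
single_column = run(
    "SELECT car FROM data WHERE car NOT IN (SELECT car FROM data WHERE carvotes > 0)"
)
print(multi_column)
print(single_column)
```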
Beyond changing the first part of the subquery to SELECT car, the only other change that I have made to the subquery is to change LIMIT 0, 10 to LIMIT 10. The former means limit to the 10 records that are offset by 0 from the first record. An offset is useful if you want records 6 to 15, but redundant for 1 to 10, as LIMIT 10 has the same effect and is slightly simpler. Ditto for LIMIT 0, 30 at the end of your overall statement.
As for the main body of the statement, I have not made any attempt to specify what fields (or aggregate functions of those fields) should be returned, since you have made no statement indicating what your requirements / preferences are. If you are satisfied that GROUP BY has left you with a still valid set of values, then all is well, but if not then I recommend that you rewrite your Question to be specific about that detail.
In MySQL 5.7 and earlier, GROUP BY sorts its output into ascending order by default (MySQL 8.0 dropped this implicit sort), but if an ORDER BY clause is also present then it overrides the GROUP BY's sort pattern. As such, there is no benefit to specifying DESC after either of your GROUP BY car clauses, so I have removed it where it occurs.
Interesting Sidenote : You can override a GROUP BY's sort by specifying ORDER BY NULL.
If you have any questions or comments, then please feel free to post a Comment accordingly.
Further Reading
https://dev.mysql.com/doc/refman/5.7/en/order-by-optimization.html - on optimising your ORDER BY sorting
https://dev.mysql.com/doc/refman/5.7/en/select.html - on the SELECT statement's syntax - specifically the parts to do with LIMIT.
https://www.w3schools.com/php/php_mysql_select_limit.asp - a simpler explanation of LIMIT
This is your query:
SELECT *
FROM data
WHERE site = '".$id."' AND disabled = '0' AND
car NOT IN (SELECT *
FROM (SELECT *
FROM data
WHERE site = '".$id."' AND
disabled = '0' AND
carvotes NOT LIKE '0' AND
(time > ( now( ) - INTERVAL 14 DAY ))
GROUP BY car
ORDER BY carvotes DESC
LIMIT 0 , 10
) x
ORDER BY time DESC
)
GROUP BY car DESC
ORDER BY time desc
LIMIT 0 , 30 ;
Several comments:
Do not wrap integer constants in single quotes. This can mislead people. This can mislead optimizers.
Do not use string functions on integers (such as like). Same reason.
NOT IN with subqueries is dangerous. The construct does not handle NULL values the way you expect. Use NOT EXISTS or LEFT JOIN instead.
When using subqueries, ORDER BY is almost never appropriate.
Never use SELECT * with GROUP BY. It is just wrong. Happily, MySQL 5.7 has changed its defaults to reject this anti-pattern.
So, a better way to write this query is something like this:
SELECT d.car, MAX(d.time) AS time
FROM data d LEFT JOIN
(SELECT d2.car
FROM data d2
WHERE d2.site = '".$id."' AND
d2.disabled = 0 AND
d2.carvotes <> 0 AND
d2.time > NOW() - INTERVAL 14 DAY
GROUP BY d2.car
ORDER BY MAX(d2.carvotes) DESC
LIMIT 10
) car10
ON d.car = car10.car
WHERE d.site = '".$id."' AND d.disabled = 0 AND
car10.car IS NULL -- anti-join: keep only cars that are not in the top 10
GROUP BY d.car
ORDER BY MAX(d.time) DESC
LIMIT 30;
Alternatively, use SELECT * and remove the GROUP BY in the outer query.
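The NULL caveat about NOT IN listed above is easy to reproduce. The sketch below uses SQLite through Python as a stand-in for MySQL (the three-valued logic involved is standard SQL); top_cars is a hypothetical table that plays the role of the subquery result and happens to contain a NULL.

```python
import sqlite3

# SQLite stands in for MySQL; "top_cars" is a hypothetical table that plays
# the role of the subquery result and contains one NULL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE data (car TEXT);
CREATE TABLE top_cars (car TEXT);
INSERT INTO data VALUES ('audi'), ('bmw'), ('fiat');
INSERT INTO top_cars VALUES ('audi'), (NULL);
""")

# car NOT IN ('audi', NULL) is never TRUE: it is FALSE for 'audi' and
# UNKNOWN for every other value, so no row passes the filter.
not_in = conn.execute(
    "SELECT car FROM data WHERE car NOT IN (SELECT car FROM top_cars) ORDER BY car"
).fetchall()

# NOT EXISTS only asks whether a matching row exists; the NULL never matches.
not_exists = conn.execute("""
    SELECT d.car FROM data d
    WHERE NOT EXISTS (SELECT 1 FROM top_cars t WHERE t.car = d.car)
    ORDER BY d.car
""").fetchall()

# LEFT JOIN ... IS NULL (an anti-join) gives the same answer as NOT EXISTS.
anti_join = conn.execute("""
    SELECT d.car FROM data d
    LEFT JOIN top_cars t ON t.car = d.car
    WHERE t.car IS NULL
    ORDER BY d.car
""").fetchall()

print(not_in)      # []
print(not_exists)  # [('bmw',), ('fiat',)]
print(anti_join)   # [('bmw',), ('fiat',)]
```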

how to use user defined variables in where clause

I have this query, and I'm trying to use the user-defined variable @noVar in my WHERE clause to show only the records where that variable has the value 'Yes'.
But when I use HAVING @noVar = 'Yes' as in the query below, it returns 0 results.
SELECT svcreqdetail.id, svcreqcheckin.stime as checkin, @etime:= time(timestampadd(minute, svcreqdetail.hours*60 , concat(caredate,' ', caretime))) as endtime, svcreqcheckout.stime as checkout, time_to_sec( if(svcreqcheckout.stime > svcreqcheckin.stime,
timediff(svcreqcheckout.stime, svcreqcheckin.stime),
addtime(timediff(svcreqcheckout.stime, svcreqcheckin.stime), '24:00:00.000000')))/3600 AS wrkHrs, svcreqdetail.hours,
svcreqstatus.status, @checkoutvar:= time_to_sec(timediff(svcreqcheckout.stime, @etime))/60 as checkoutvar,@noVar:= if (@checkoutvar <= 15,'Yes', 'No') as noVar, qualif
FROM svcreqdetail
LEFT JOIN svcreqcheckin ON svcreqcheckin.reqid = svcreqdetail.id
LEFT JOIN svcreqcheckout ON svcreqcheckout.reqid = svcreqdetail.id
JOIN svcreqstatus ON svcreqstatus.reqdid = svcreqdetail.id
WHERE (yearweek( caredate ) = yearweek( date_sub( CURRENT_DATE, INTERVAL 1 week ) )
AND svcreqstatus.status != 'Incompleted'
AND svcreqstatus.status != 'Deleted')
having @noVar = 'Yes'
Is there any way I can test against that variable in my WHERE clause? Thank you.
I don't think you can use user-defined variables in the HAVING clause like that.
One option would be to put your existing query (without the HAVING clause) into a sub-query and filter the results using a WHERE clause outside the sub-query, like so:
SELECT *
FROM
(
<your existing query goes here>
) AS sub_query
WHERE noVar = 'Yes'
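A minimal sketch of that wrapping pattern, run in SQLite through Python as a stand-in for MySQL. The visits table and its minutes_over column are hypothetical placeholders for the original joins, and CASE replaces MySQL's IF(), which SQLite does not have.

```python
import sqlite3

# SQLite stands in for MySQL; "visits" and "minutes_over" are hypothetical
# placeholders for the joins and time arithmetic in the original query.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE visits (id INTEGER PRIMARY KEY, minutes_over INT);
INSERT INTO visits VALUES (1, 5), (2, 30), (3, 15);
""")

rows = conn.execute("""
SELECT *
FROM (
    -- the inner query computes the flag as an ordinary column alias
    SELECT id,
           CASE WHEN minutes_over <= 15 THEN 'Yes' ELSE 'No' END AS noVar
    FROM visits
) AS sub_query
WHERE noVar = 'Yes'   -- the outer query can filter on that alias
ORDER BY id
""").fetchall()
print(rows)  # [(1, 'Yes'), (3, 'Yes')]
```

Written this way the flag is a plain expression, so no user-defined variable is needed for the filter.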

Assistance with a complex MySQL SQL Query

I hope this is the appropriate forum to ask for assistance. I have an SQL Query (MySQL) that is not returning the correct records in a Date Range (between two dates). I am happy to answer questions in relation to the query, however if anyone can make suggestions or correct the SQL Query that would be an excellent learning exercise. Thank you.
$raw_query = sprintf("SELECT
swtickets.ticketid AS `Ticket ID`,
swtickettimetracks.tickettimetrackid AS `Track ID`,
swtickets.ticketmaskid AS `TicketMASK`,
(
SELECT
swcustomfieldvalues.fieldvalue
FROM
swcustomfieldvalues,
swcustomfields
WHERE
swcustomfieldvalues.customfieldid = swcustomfields.customfieldid
AND swtickets.ticketid = swcustomfieldvalues.typeid
AND swcustomfields.title = 'Member Company'
ORDER BY
swcustomfieldvalues.customfieldvalueid DESC
LIMIT 1
) AS MemberCompany,
(
SELECT
swcustomfieldvalues.fieldvalue
FROM
swcustomfieldvalues,
swcustomfields
WHERE
swcustomfieldvalues.customfieldid = swcustomfields.customfieldid
AND swtickets.ticketid = swcustomfieldvalues.typeid
AND swcustomfields.title = 'Member Name'
ORDER BY
swcustomfieldvalues.customfieldvalueid DESC
LIMIT 1
) AS MemberName,
(
SELECT
swcustomfieldvalues.fieldvalue
FROM
swcustomfieldvalues,
swcustomfields
WHERE
swcustomfieldvalues.customfieldid = swcustomfields.customfieldid
AND swtickets.ticketid = swcustomfieldvalues.typeid
AND swcustomfields.title = 'Chargeable'
AND
swcustomfieldvalues.fieldvalue = '40'
ORDER BY
swcustomfieldvalues.customfieldvalueid ASC
LIMIT 1
) AS `Chg`,
swtickets.`subject` AS `Subject`,
swtickets.departmenttitle AS Category,
FROM_UNIXTIME(
swtickettimetracks.workdateline
) AS `workDateline`,
FROM_UNIXTIME(
swtickettimetracks.dateline
) AS `dateline`,
swtickettimetracks.timespent AS `Time Spent`,
swtickets.timeworked AS `Time Worked`
FROM
swtickets
INNER JOIN swusers ON swtickets.userid = swusers.userid
INNER JOIN swuserorganizations ON swuserorganizations.userorganizationid = swusers.userorganizationid
INNER JOIN swtickettimetracks ON swtickettimetracks.ticketid = swtickets.ticketid
WHERE
swuserorganizations.organizationname = '%s'
AND (
swtickets.ticketstatustitle = 'Closed'
OR swtickets.ticketstatustitle = 'Completed'
)
AND FROM_UNIXTIME(`workDateline`) >= '%s' AND FROM_UNIXTIME(`workDateline`) <= '%s'
ORDER BY `Ticket ID`,`Track ID`",
$userOrganization,
$startDate,
$endDate
);
As I mentioned, the Query works - however it does not return the records correctly between the two dates.
However, IF I run this simple query against the database :
SELECT swtickettimetracks.tickettimetrackid,
swtickettimetracks.ticketid,
swtickettimetracks.dateline,
swtickettimetracks.timespent,
swtickettimetracks.timebillable,
FROM_UNIXTIME(swtickettimetracks.workdateline)
FROM swtickettimetracks
WHERE FROM_UNIXTIME(swtickettimetracks.workdateline) >= '2013-04-16' AND FROM_UNIXTIME(swtickettimetracks.workdateline) <= '2013-04-18'
I get the correct date range returned. Help? Thank you in anticipation.
Edward.
Unless you are overthinking it, it's all in your different query WHERE clauses...
Your complex query returning the wrong results has
(join conditions between other tables)
AND swuserorganizations.organizationname = '%s'
AND ( swtickets.ticketstatustitle = 'Closed'
OR swtickets.ticketstatustitle = 'Completed' )
AND FROM_UNIXTIME(`workDateline`) >= '%s'
AND FROM_UNIXTIME(`workDateline`) <= '%s'
Your other query has
FROM swtickettimetracks
WHERE FROM_UNIXTIME(swtickettimetracks.workdateline) >= '2013-04-16'
AND FROM_UNIXTIME(swtickettimetracks.workdateline) <= '2013-04-18'
So I would consider a few things. The first where has
FROM_UNIXTIME >= '%s' and FROM_UNIXTIME <= '%s'
Are you sure the '%s' values are properly formatted to match the '2013-04-16' and '2013-04-18' format sample?
But more importantly, your first query is using the same date range (if correct), but is also only getting those for a specific organization name AND (Closed or Completed) records. So, if the second query is returning 100 records but the main query only 70, are the other 30 in some status other than Closed/Completed, or from a different organization? In addition, because the main query uses INNER JOINs, any ticket whose IDs have no match in the joined tables is dropped from the results. The only way to confirm that is to change to LEFT JOIN syntax on those tables and see the results.

Select value in mysql but check another database at clause WHERE

I'm trying to select values from one table, but I need to check a value in another table.
I created this code, but only get 1 result and I don't know why:
SELECT `id` FROM `mc_region`
WHERE `is_subregion` = 'false'
AND lastseen < CURDATE() - INTERVAL 20 DAY
AND (SELECT id_region FROM mc_region_flags
WHERE flag <> 'expire'
AND id_region = mc_region.id
)
LIMIT 0, 30
What have I done wrong?
#Edit
I think I know why this code is not working. In the table mc_region_flags, not every record from the primary table has a flag.
I would like to do the following:
1º Select all records from the first table that are not subregions and whose lastseen is more than 20 days ago
2º Check whether any result from the first table has the flag 'expire'; if so, it is not included in the result.
Can I not do this in a single SQL statement?
#Edit2
I created this code that simulates a FULL JOIN, but it seems that the WHERE does not work
SELECT *
FROM mc_region AS r RIGHT OUTER JOIN
mc_region_flags AS f ON r.id = f.id_region
UNION ALL
SELECT * from
mc_region AS r LEFT OUTER JOIN
mc_region_flags AS f
ON r.id = f.id_region
WHERE r.is_subregion = 'false'
AND f.flag = 'exipre'
AND r.lastseen < CURDATE() - INTERVAL 20 DAY
Problems (the WHERE does not work):
f.flag is not 'expire'
f.lastseen is not > 20 days
UPDATED
SELECT *
FROM `mc_region` AS r LEFT JOIN
`mc_region_flags` AS f ON r.`id` = f.`id_region`
WHERE r.`is_subregion` = 'false' AND
r.`lastseen` < CURDATE() - INTERVAL 20 DAY AND
COALESCE(f.`flag`, '-') <> 'expire'
LIMIT 0, 30;
Before the inner nested SELECT, add:
id in (select...)
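If a region can carry more than one flag row (an assumption about the schema), the LEFT JOIN with COALESCE still returns a region that has an 'expire' flag alongside another flag. A NOT EXISTS filter is one way to express step 2 of the question directly. The sketch below runs both in SQLite through Python as a stand-in for MySQL; the sample rows and the 'pvp' flag are invented, and date('now', '-20 day') replaces CURDATE() - INTERVAL 20 DAY.

```python
import sqlite3

# SQLite stands in for MySQL; table and column names follow the question,
# the rows and the 'pvp' flag are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mc_region (id INTEGER PRIMARY KEY, is_subregion TEXT, lastseen TEXT);
CREATE TABLE mc_region_flags (id_region INT, flag TEXT);
INSERT INTO mc_region VALUES
  (1, 'false', '2000-01-01'),   -- no flags at all
  (2, 'false', '2000-01-01'),   -- flagged 'expire' plus another flag
  (3, 'false', '2000-01-01'),   -- only a non-expire flag
  (4, 'true',  '2000-01-01'),   -- subregion
  (5, 'false', '2999-01-01');   -- seen recently
INSERT INTO mc_region_flags VALUES (2, 'expire'), (2, 'pvp'), (3, 'pvp');
""")

# LEFT JOIN + COALESCE: region 2 slips through via its 'pvp' row.
left_join_ids = conn.execute("""
SELECT DISTINCT r.id
FROM mc_region r LEFT JOIN mc_region_flags f ON r.id = f.id_region
WHERE r.is_subregion = 'false'
  AND r.lastseen < date('now', '-20 day')
  AND COALESCE(f.flag, '-') <> 'expire'
ORDER BY r.id
""").fetchall()

# NOT EXISTS: a region is dropped as soon as any of its flags is 'expire'.
not_exists_ids = conn.execute("""
SELECT r.id
FROM mc_region r
WHERE r.is_subregion = 'false'
  AND r.lastseen < date('now', '-20 day')
  AND NOT EXISTS (SELECT 1 FROM mc_region_flags f
                  WHERE f.id_region = r.id AND f.flag = 'expire')
ORDER BY r.id
""").fetchall()

print(left_join_ids)   # [(1,), (2,), (3,)]
print(not_exists_ids)  # [(1,), (3,)]
```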

How can optimize these instruction with MYSQL (subqueries for each row)?

My problem is that we run a SELECT and then, for each row, run 4 different SQL queries (it is madness). As you can guess, we make a lot of requests, and the system using this is very slow.
SELECT
deal_source.id,
deal_source.source_name,
deal_source.spider_status,
spider.last_success_date
FROM deal_source
JOIN spider
ON deal_source.id = spider.deal_source_id
Then for each row of this query we make:
$total_query = "SELECT count(id) as total
FROM spider_log
WHERE deal_source_id = '$deal_source_id'
AND date_format(date_created, '%Y-%m-%d') = '$lastdate' ";
$added_query = "SELECT count(id) as added
FROM spider_log
WHERE deal_source_id = '$deal_source_id'
AND action = 'added'
AND date_format(date_created, '%Y-%m-%d') = '$lastdate' ";
$extended_query = "SELECT count(id) as extended
FROM spider_log
WHERE deal_source_id = '$deal_source_id'
AND action = 'extended'
AND date_format(date_created, '%Y-%m-%d') = '$lastdate' ";
$duplicate_query = "SELECT count(id) as duplicate
FROM spider_log
WHERE deal_source_id = '$deal_source_id'
AND action = 'duplicate'
AND date_format(date_created, '%Y-%m-%d') = '$lastdate' ";
SELECT d.id,
d.source_name,
d.spider_status,
s.last_success_date,
COUNT(l.id) AS total,
SUM(l.id IS NOT NULL AND l.action='added' ) AS added,
SUM(l.id IS NOT NULL AND l.action='extended' ) AS extended,
SUM(l.id IS NOT NULL AND l.action='duplicate') AS duplicate
FROM deal_source d
JOIN spider s
ON s.deal_source_id = d.id
JOIN spider_log l
ON l.deal_source_id = d.id
AND l.date_created >= s.last_success_date
AND l.date_created < s.last_success_date + INTERVAL 1 DAY
GROUP BY d.id
Some points:
You can optimize performance of each query, using EXPLAIN and the careful adding of indexes.
You can combine all the queries to a big one, so you don't have to hit the database with a lot of queries.
Besides the sheer number of queries, the date_format(date_created, '%Y-%m-%d') = '$lastdate' condition is a performance killer because it applies a function (DATE_FORMAT()) to a column (date_created), so no index can be used and the function is called thousands or millions of times (once for every row examined). Change such conditions, wherever they are in your code, to:
( date_created >= DATE('$lastdate')
AND date_created < DATE('$lastdate') + INTERVAL 1 DAY
)
or even better, if that $lastdate is a date, to:
( date_created >= '$lastdate'
AND date_created < '$lastdate' + INTERVAL 1 DAY
)
and better still, if date_created is a DATE column, to:
date_created = '$lastdate'
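The two ideas above, a half-open date range instead of DATE_FORMAT() and a single grouped query instead of four counts per row, can be tried together on toy data. The sketch below uses SQLite through Python as a stand-in for MySQL: date(:day, '+1 day') replaces '$lastdate' + INTERVAL 1 DAY, the sample rows are invented, and the date is passed as a bound parameter rather than interpolated into the SQL string.

```python
import sqlite3

# SQLite stands in for MySQL; the spider_log columns follow the question,
# the rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE spider_log (
    id INTEGER PRIMARY KEY, deal_source_id INT, action TEXT, date_created TEXT);
INSERT INTO spider_log (deal_source_id, action, date_created) VALUES
  (1, 'added',     '2024-03-05 00:00:00'),
  (1, 'added',     '2024-03-05 23:59:59'),
  (1, 'extended',  '2024-03-05 12:00:00'),
  (1, 'duplicate', '2024-03-06 00:00:00'),  -- next day, outside the range
  (2, 'duplicate', '2024-03-05 08:30:00');
""")

lastdate = "2024-03-05"
rows = conn.execute("""
SELECT deal_source_id,
       COUNT(*)                  AS total,
       SUM(action = 'added')     AS added,      -- a comparison yields 1 or 0
       SUM(action = 'extended')  AS extended,
       SUM(action = 'duplicate') AS duplicate
FROM spider_log
WHERE date_created >= :day                      -- bare column: an index can be used
  AND date_created <  date(:day, '+1 day')      -- half-open range covers the whole day
GROUP BY deal_source_id
ORDER BY deal_source_id
""", {"day": lastdate}).fetchall()
print(rows)  # [(1, 3, 2, 1, 0), (2, 1, 0, 0, 1)]
```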