MySQL Query taking over 6 seconds - mysql

A while back I got some help with a specific query. Here's the link: SQL Group BY using strings in new columns
My query looks similar to this:
SELECT event_data, class_40_winner, class_30_winner
FROM events e
LEFT JOIN (SELECT result_event, name AS class_40_winner
FROM results
WHERE class = 40 AND position = 1) c40 ON e.id = c40.result_event
LEFT JOIN (SELECT result_event, name AS class_30_winner
FROM results
WHERE class = 30 AND position = 1) c30 ON e.id = c30.result_event
I have now entered enough data in my database (22,000 rows) that this query is taking over 6 seconds to complete. (My actual query is bigger than the above, in that it now has 4 joins in it.)
I used the "Explain" function on my query to take a look. Each of the queries from the "results" table is pulling in the 22,000 rows, so this seems to be the problem.
I have done some research and it sounds like I should be able to INDEX the relevant column on the "results" table to help speed things up. But when I did that, it actually slowed my query down to about 10 seconds.
Any suggestions for what I can do to improve this query?

AFAIK, you are pivoting your data, and I think max(case ...) ... group by performs well for pivoting data.
I suggest using this query instead:
select event_date
, max(case when r.class = 40 then name end) `Class 40 Winner`
, max(case when r.class = 30 then name end) `Class 30 Winner`
from events e
left join results r on e.id = r.result_event and r.position = 1
group by event_date;
[SQL Fiddle Demo]
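To see the conditional-aggregation pivot on a few rows, here is a minimal sketch. It is an illustration only: SQLite (through Python's sqlite3 module) stands in for MySQL, the table layout is a cut-down guess at the schema in the question, and the sample names and dates are made up. It groups by e.id as well as the date, on the assumption that two events could share a date.

```python
import sqlite3

# Hypothetical miniature of the schema from the question; SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (id INTEGER PRIMARY KEY, event_date TEXT);
    CREATE TABLE results (result_event INTEGER, class INTEGER, position INTEGER, name TEXT);
    INSERT INTO events VALUES (1, '2015-05-01'), (2, '2015-06-01');
    INSERT INTO results VALUES
        (1, 40, 1, 'Alice'), (1, 40, 2, 'Bob'), (1, 30, 1, 'Carol'),
        (2, 40, 1, 'Dave');
""")

# One pass over results: each CASE picks the winner's name for its class,
# and MAX collapses the joined rows of an event into a single row.
PIVOT_SQL = """
    SELECT e.event_date,
           MAX(CASE WHEN r.class = 40 THEN r.name END) AS class_40_winner,
           MAX(CASE WHEN r.class = 30 THEN r.name END) AS class_30_winner
    FROM events e
    LEFT JOIN results r ON r.result_event = e.id AND r.position = 1
    GROUP BY e.id, e.event_date
    ORDER BY e.event_date
"""
rows = conn.execute(PIVOT_SQL).fetchall()
for row in rows:
    print(row)
```

An event with no class 30 winner still appears, with NULL (None in Python) in that column, which matches what the LEFT JOIN version in the question returns.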

Try this query:
SELECT
e.event_date,
r1.name as class_40_winner,
r2.name as class_30_winner
FROM
events e,
results r1,
results r2
WHERE
r1.class = 40 AND
r2.class = 30 AND
r1.position = 1 AND
r2.position = 1 AND
r1.result_event = e.id AND
r2.result_event = e.id

SELECT e.event_data
, r.class
, r.name winner
FROM events e
JOIN results r
ON r.result_event = e.id
WHERE class IN (30,40)
AND position = 1
The rest of this problem is a simple display issue, best resolved in application code.
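As a rough sketch of what "resolved in application code" could look like, the rows from the single-join query can be folded into one entry per event. The sample rows and the helper name pivot_winners are made up for illustration; the row shape (event_data, class, winner) follows the query above.

```python
from collections import defaultdict

# Hypothetical rows as the single-join query might return them:
# (event_data, class, winner)
rows = [
    ("2015-05-01", 40, "Alice"),
    ("2015-05-01", 30, "Carol"),
    ("2015-06-01", 40, "Dave"),
]


def pivot_winners(rows):
    """Collapse (event, class, winner) rows into one dict of winners per event."""
    winners = defaultdict(dict)
    for event, cls, name in rows:
        winners[event][cls] = name
    return dict(winners)


table = pivot_winners(rows)
for event, by_class in table.items():
    # .get() yields None when an event has no winner recorded for that class
    print(event, by_class.get(40), by_class.get(30))
```

Note that the inner JOIN drops events that have no winner in either class, so if those must be listed, the query would need a LEFT JOIN instead.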

Related

MySQL Explain plan doesn't show index used when it should be

I'm trying to run a query in MySQL that's timing out after a couple of minutes on a QA system with 8 million+ rows. It runs fine for me locally, but obviously with less data.
Here's the query:
SELECT
system_name as systemName,
systemLabel,
feature_vector as featureVector,
code as norcaCode,
count(1) as sum
FROM (
SELECT a.id,
a.object_id,
a.system_name,
d.label as systemLabel,
b.norca_type AS norcaType,
b.feature_vector,
a.seqnb,
a.object_index,
c.code
FROM
system_objectdata a
JOIN
sick_il_dacq.system_barcode_norca b
ON
a.id = b.system_objectdata_id
AND
a.partition_key = b.partition_key
LEFT JOIN
system_feature_vector c
ON
b.feature_vector = c.value
JOIN
sick_il_services.system_config d
ON
a.system_name = d.name
WHERE LEFT(FROM_UNIXTIME(object_scan_time/1000),10) >= SUBDATE(CURRENT_DATE, 100)
AND
norca_type = 'BARCODE'
AND
a.is_duplicate = 0
) detail
GROUP BY
system_name, feature_vector, norcaCode;
Here's the explain plan:
It looks like the join to table d, system_config, has no possible keys.
However, there is an index for name on the table:
Any idea why it's not using the name index? And in general, any ideas on how to improve the query speed?

Understanding why this query is slow

The query below is very slow (it takes around 1 second), even though it only searches approximately 2,500 records (plus the inner-joined tables).
If I remove the ORDER BY, the query runs in much less time (0.05 seconds or less).
It also runs fast if I remove the nested select below "# used to select where no ProfilePhoto specified", but I need both of these included.
I have indexes (or primary keys) on: tPhoto_PhotoID, PhotoID, p.Enabled, CustomerID, tCustomer_CustomerID, ProfilePhoto (bool), u.UserName, e.PrivateEmail, m.tUser_UserID, Enabled, Active, m.tMemberStatuses_MemberStatusID, e.tCustomerMembership_MembershipID, e.DateCreated
(Do I have too many indexes? My understanding is that I should add them anywhere I use WHERE or ON.)
The query:
SELECT e.CustomerID,
e.CustomerName,
e.Location,
SUBSTRING_INDEX(e.CustomerProfile,' ', 25) AS Description,
IFNULL(p.PhotoURL, PhotoTable.PhotoURL) AS PhotoURL
FROM tCustomer e
LEFT JOIN (tCustomerPhoto ep INNER JOIN tPhoto p ON (ep.tPhoto_PhotoID = p.PhotoID AND p.Enabled=1))
ON e.CustomerID = ep.tCustomer_CustomerID AND ep.ProfilePhoto = 1
# used to select where no ProfilePhoto specified
LEFT JOIN ((SELECT pp.PhotoURL, epp.tCustomer_CustomerID
FROM tPhoto pp
LEFT JOIN tCustomerPhoto epp ON epp.tPhoto_PhotoID = pp.PhotoID
GROUP BY epp.tCustomer_CustomerID) AS PhotoTable) ON e.CustomerID = PhotoTable.tCustomer_CustomerID
INNER JOIN tUser u ON u.UserName = e.PrivateEmail
INNER JOIN tmembers m ON m.tUser_UserID = u.UserID
WHERE e.Enabled=1
AND e.Active=1
AND m.tMemberStatuses_MemberStatusID = 2
AND e.tCustomerMembership_MembershipID != 6
ORDER BY e.DateCreated DESC
LIMIT 12
I have similar queries, but they run much faster.
Any opinions would be appreciated.
Until we get more clarity on how this query differs from the other ones that work, try EXPLAIN {YourSelectQuery} in the MySQL client and look at the plan for ways to improve the performance.

Increase speed of slow sql-code

I'm trying to increase the speed of the SQL code below. Load time right now is around 0.662 sec. The problem is that I need to loop this code for each day of the selected month, and 31 * 0.662 sec (roughly 20 sec) is far too long for loading.
select fname,lname,(TIME_TO_SEC(TIMEDIFF(r.edate,r.sdate))-r.break) as TotalDiff from tbluser u LEFT JOIN
tblregtime r
on (r.userid = u.id and
r.projectid = 21
and sdate='2013-11-27'
)
INNER JOIN tblgroup_users gU ON gU.userID = u.id
INNER JOIN tblgroup_brukare gB on gB.tblGroupID=gU.tblGroupID where (gB.tblprojectID = 21 AND (gU.status=0 OR gU.status=2))
order by u.fname ASC,u.lname ASC
Instead of running your SQL query 31 times, once for each day, you could try running a single query for all days and handle them appropriately in your code (PHP or whatever).
Here's a suggested alternative to your query that runs only once (you may need to adjust it a bit). Can you try it and let us know how long it takes? Also, to optimize further, it would help to post your query plan, and maybe create an SQL Fiddle.
select fname,lname,(TIME_TO_SEC(TIMEDIFF(r.edate,r.sdate))-r.break) as TotalDiff, sdate
from tbluser u
LEFT JOIN tblregtime r on (r.userid = u.id and r.projectid = 21 and sdate between '2013-11-01' and '2013-11-30')
INNER JOIN tblgroup_users gU ON gU.userID = u.id
INNER JOIN tblgroup_brukare gB on gB.tblGroupID=gU.tblGroupID where (gB.tblprojectID = 21 AND (gU.status=0 OR gU.status=2))
order by sdate ASC, u.fname ASC,u.lname ASC
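For the "handle them appropriately in your code" part, here is a minimal sketch in Python (the same idea carries over to PHP). The sample rows, the names, and the helper rows_by_day are all made up for illustration; the row shape follows the select list above, with sdate coming back as NULL for users who have no registration in the month.

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical rows as the month-wide query might return them:
# (fname, lname, TotalDiff in seconds, sdate). sdate is None for users
# with no registration in the month (the LEFT JOIN found nothing).
rows = [
    ("Anna", "Berg", 27000, date(2013, 11, 1)),
    ("Carl", "Dahl", 25200, date(2013, 11, 1)),
    ("Anna", "Berg", 28800, date(2013, 11, 27)),
    ("Eva", "Falk", None, None),
]


def rows_by_day(rows, year, month):
    """Return {day: [(fname, lname, total), ...]} with an entry for every day of the month."""
    first = date(year, month, 1)
    next_month = date(year + month // 12, month % 12 + 1, 1)
    grouped = defaultdict(list)
    for fname, lname, total, sdate in rows:
        if sdate is not None:  # skip users without any registration
            grouped[sdate].append((fname, lname, total))
    all_days = (first + timedelta(days=n) for n in range((next_month - first).days))
    return {day: grouped.get(day, []) for day in all_days}


by_day = rows_by_day(rows, 2013, 11)
print(len(by_day), by_day[date(2013, 11, 27)])
```

One difference from the per-day loop: a user with no registration on a given day is simply absent from that day's list, so if the page must show every user on every day, the user list has to be merged in separately.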

Converting subquery to joins for performance

I have taken over a big project, and as the database has grown, some of the code has stopped working.
Here is the query that finds the rendering_requests whose last rendering_log is pending. Sometimes there are log entries that have no status change and are recorded as noaction; we don't need to count them. That is what I understood from the query.
SELECT
COUNT(rr.rendering_id) AS recordCount
FROM
rendering_request rr, rendering_log rl
WHERE
rl.rendering_id = rr.rendering_id
AND rl.status = 'pending' AND
rl.log_id = (
SELECT rl1.log_id
FROM rendering_log rl1
WHERE
rl.rendering_id = rl1.rendering_id AND
rl1.status = 'pending'
AND rl1.log_id = (
SELECT rl2.log_id
FROM rendering_log rl2
WHERE rl1.rendering_id = rl2.rendering_id AND rl2.status!='noaction'
ORDER BY rl2.log_id DESC LIMIT 1
)
ORDER BY rl1.log_id DESC
LIMIT 1
)
For example,
rendering_id=1 has multiple logs:
status=noaction
status=noaction
status=pending
and
rendering_id=2 has multiple logs:
status=noaction
status=assigned
status=noaction
status=pending
When we run this query, it should return count=1, as only rendering_id=1 is our desired record.
Right now this query has stopped working, and it hangs the MySQL server.
I'm not 100% sure I have got this right, but try something like this. I think you still need a couple of subselects, but (depending on the version of MySQL) doing it this way with JOINs should be a lot faster:
SELECT COUNT(rr.rendering_id) AS recordCount
FROM rendering_request rr
INNER JOIN rendering_log rl
ON rl.rendering_id = rr.rendering_id
INNER JOIN (SELECT rendering_id, MAX(log_id) AS log_id FROM rendering_log WHERE status = 'pending' GROUP BY rendering_id) rl1
ON rl1.rendering_id = rl.rendering_id
AND rl1.log_id = rl.log_id
INNER JOIN (SELECT rendering_id, MAX(log_id) AS log_id FROM rendering_log WHERE status!='noaction' GROUP BY rendering_id) rl2
ON rl2.rendering_id = rl1.rendering_id
AND rl2.log_id = rl1.log_id
WHERE rl.status = 'pending'
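Here is a small, hedged check of the JOIN approach against the example from the question. Two assumptions: SQLite (via Python's sqlite3) stands in for MySQL, and the example logs are read as listed newest first, so they are inserted oldest first to get ascending log_id values. The derived tables alias MAX(log_id) as log_id so that the outer ON clauses can refer to it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE rendering_request (rendering_id INTEGER PRIMARY KEY);
    CREATE TABLE rendering_log (
        log_id INTEGER PRIMARY KEY, rendering_id INTEGER, status TEXT);
    INSERT INTO rendering_request VALUES (1), (2);
    -- Assumption: the question lists logs newest first, so insert oldest first.
    INSERT INTO rendering_log (rendering_id, status) VALUES
        (1, 'pending'), (1, 'noaction'), (1, 'noaction'),
        (2, 'pending'), (2, 'noaction'), (2, 'assigned'), (2, 'noaction');
""")

# rl1 = latest pending log per request, rl2 = latest log that is not 'noaction'.
# A request counts only when both point at the same log row.
COUNT_SQL = """
    SELECT COUNT(rr.rendering_id) AS recordCount
    FROM rendering_request rr
    INNER JOIN rendering_log rl ON rl.rendering_id = rr.rendering_id
    INNER JOIN (SELECT rendering_id, MAX(log_id) AS log_id
                FROM rendering_log WHERE status = 'pending'
                GROUP BY rendering_id) rl1
            ON rl1.rendering_id = rl.rendering_id AND rl1.log_id = rl.log_id
    INNER JOIN (SELECT rendering_id, MAX(log_id) AS log_id
                FROM rendering_log WHERE status != 'noaction'
                GROUP BY rendering_id) rl2
            ON rl2.rendering_id = rl1.rendering_id AND rl2.log_id = rl1.log_id
    WHERE rl.status = 'pending'
"""
record_count = conn.execute(COUNT_SQL).fetchone()[0]
print(record_count)
```

With this data only rendering_id=1 qualifies: for rendering_id=2 the latest log that is not noaction is the assigned one, so it does not match the latest pending log.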

optimize Mysql: get latest status of the sale

In the following query, I show the latest status of the sale (by stage, in this case the number 3). The query is based on a subquery in the status history of the sale:
SELECT v.id_sale,
IFNULL((
SELECT (CASE WHEN IFNULL( vec.description, '' ) = ''
THEN ve.name
ELSE vec.description
END)
FROM t_record veh
INNER JOIN t_state_campaign vec ON vec.id_state_campaign = veh.id_state_campaign
INNER JOIN t_state ve ON ve.id_state = vec.id_state
WHERE veh.id_sale = v.id_sale
AND vec.id_stage = 3
ORDER BY veh.id_record DESC
LIMIT 1
), 'x') sale_state_3
FROM t_sale v
INNER JOIN t_quarters sd ON v.id_quarters = sd.id_quarters
WHERE 1 =1
AND v.flag =1
AND v.id_quarters =4
AND EXISTS (
SELECT '1'
FROM t_record
WHERE id_sale = v.id_sale
LIMIT 1
)
The query takes 0.0057 sec and returns 1011 records.
Because I have to filter the sales by the name of the state, I would have to repeat the subquery in a WHERE clause, so I decided to rewrite the same query using joins. In this case, I'm using the MAX function to obtain the latest status:
SELECT
v.id_sale,
IFNULL(veh3.State3,'x') AS sale_state_3
FROM t_sale v
INNER JOIN t_quarters sd ON v.id_quarters = sd.id_quarters
LEFT JOIN (
SELECT veh.id_sale,
(CASE WHEN IFNULL(vec.description,'') = ''
THEN ve.name
ELSE vec.description END) AS State3
FROM t_record veh
INNER JOIN (
SELECT id_sale, MAX(id_record) AS max_rating
FROM(
SELECT veh.id_sale, id_record
FROM t_record veh
INNER JOIN t_state_campaign vec ON vec.id_state_campaign = veh.id_state_campaign AND vec.id_stage = 3
) m
GROUP BY id_sale
) x ON x.max_rating = veh.id_record
INNER JOIN t_state_campaign vec ON vec.id_state_campaign = veh.id_state_campaign
INNER JOIN t_state ve ON ve.id_state = vec.id_state
) veh3 ON veh3.id_sale = v.id_sale
WHERE v.flag = 1
AND v.id_quarters = 4
This query returns the same results (1011), but the problem is that it takes 0.0753 sec.
Reviewing the possibilities, I found the factor that makes the difference in the speed of the query:
AND EXISTS (
SELECT '1'
FROM t_record
WHERE id_sale = v.id_sale
LIMIT 1
)
If I remove this clause, both queries take the same time. Why does it work better with the clause? Is there any way to use this clause with the joins? I'd appreciate your help.
EDIT
I will show the results of EXPLAIN for each query respectively:
q1:
q2:
Interesting, so that little clause basically determines whether there is a match between t_record.id_sale and t_sale.id_sale.
Why does this make your query run faster? Because WHERE conditions are applied before the subselects in the SELECT list, so if there is no record to go with the sale, it doesn't bother processing the subselect. That saves you some time, which is why it works better.
Is it going to work in your join syntax? I don't really know without having your tables to test against, but you can always just append it to the end and find out. Add the keyword EXPLAIN to the beginning of your query and you will get an execution plan that will help you optimize things. Probably the best way to get better results with your join syntax is to add some indexes to your tables.
But I ask you, is this even necessary? You have a query returning in under 8 hundredths of a second. Unless this query is being run thousands of times an hour, it is not really taxing your DB at all, and your time is probably better spent making improvements elsewhere in your application.
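To illustrate "append it to the end and find out", here is a cut-down sketch of the join version with the EXISTS clause added to its WHERE. Everything here is an assumption made for illustration: SQLite (via Python's sqlite3) stands in for MySQL, the tables are reduced to a handful of columns, the state text is stored directly on t_record, and the sample rows are invented. It only shows that the clause is accepted alongside the joins and which rows it filters; it says nothing about timing on the real tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t_sale (id_sale INTEGER PRIMARY KEY, flag INTEGER);
    CREATE TABLE t_record (
        id_record INTEGER PRIMARY KEY, id_sale INTEGER, id_stage INTEGER, state TEXT);
    INSERT INTO t_sale VALUES (1, 1), (2, 1), (3, 1);
    -- Sale 1 has two stage-3 records, sale 2 only a stage-2 record, sale 3 none.
    INSERT INTO t_record VALUES
        (1, 1, 3, 'open'), (2, 1, 3, 'closed'), (3, 2, 2, 'draft');
""")

# Join version: latest stage-3 record per sale via MAX(id_record).
JOIN_SQL = """
    SELECT v.id_sale, IFNULL(latest.state, 'x') AS sale_state_3
    FROM t_sale v
    LEFT JOIN (
        SELECT r.id_sale, r.state
        FROM t_record r
        INNER JOIN (
            SELECT id_sale, MAX(id_record) AS max_record
            FROM t_record
            WHERE id_stage = 3
            GROUP BY id_sale
        ) m ON m.max_record = r.id_record
    ) latest ON latest.id_sale = v.id_sale
    WHERE v.flag = 1
"""
# The clause from the first query, appended unchanged to the WHERE.
EXISTS_SQL = " AND EXISTS (SELECT 1 FROM t_record WHERE id_sale = v.id_sale)"
ORDER_SQL = " ORDER BY v.id_sale"

without_exists = conn.execute(JOIN_SQL + ORDER_SQL).fetchall()
with_exists = conn.execute(JOIN_SQL + EXISTS_SQL + ORDER_SQL).fetchall()
print(without_exists)
print(with_exists)
```

In this toy data the clause drops sale 3 (no records at all) but keeps sale 2, which has a record outside stage 3 and therefore shows 'x'. Whether the clause changes the result set or the timing on the real tables is something only EXPLAIN and a test run can tell.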