How to optimize this complected query? - mysql

While working with following query on mysql, Its getting locked,
SELECT event_list.*
FROM event_list
INNER JOIN members
ON members.profilenam=event_list.even_loc
WHERE (even_own IN (SELECT frd_id
FROM network
WHERE mem_id='911'
GROUP BY frd_id)
OR even_own = '911' )
AND event_list.even_active = 'y'
GROUP BY event_list.even_id
ORDER BY event_list.even_stat ASC
The Inner query inside IN constraint has many frd_id, So because of that above query is slooow..., So please help.
Thanks.

Try this:
SELECT el.*
FROM event_list el
INNER JOIN members m ON m.profilenam = el.even_loc
WHERE el.even_active = 'y' AND
(el.even_own = 911 OR EXISTS (SELECT 1 FROM network n WHERE n.mem_id=911 AND n.frd_id = el.even_own))
GROUP BY el.even_id
ORDER BY el.even_stat ASC

You don't need the GROUP BY on the inner query, that will be making the database engine do a lot of unneeded work.

If you put even_own = '911' before the select from network, then if even_own IS 911 then it will not have to do the subquery.
Also why do you have a group by on the subquery?
Also run explain plan top find out what is taking the time.

This might work better:
( SELECT e.*
FROM event_list AS e
INNER JOIN members AS m ON m.profilenam = e.even_loc
JOIN network AS n ON e.even_own = n.frd_id
WHERE n.mem_id = '911'
AND e.even_active = 'y'
ORDER BY e.even_stat ASC )
UNION DISTINCT
( SELECT e.*
FROM event_list AS e
INNER JOIN members AS m ON m.profilenam = e.even_loc
WHERE e.even_own = '911'
AND e.even_active = 'y' )
ORDER BY e.even_stat ASC
Since I don't know whether the JOINs one-to-many (or what), I threw in DISTINCT to avoid dups. There may be a better way, or it may be unnecessary (that is, UNION ALL).
Notice how I avoid two things that are performance killers:
OR -- turned into UNION
IN (SELECT...) -- turned into JOIN.
I made aliases to cut down on the clutter. I moved the ORDER BY outside the UNION (and added parens to make it work right).

Related

How can I merge these two left joins into a single one?

How can I merge these two left joins: http://sqlfiddle.com/#!9/1d2954/69/0
SELECT d.`id`, (adcount + bdcount)
FROM `docs` d
LEFT JOIN
(
SELECT da.`doc_id`, COUNT(da.`doc_id`) AS adcount FROM `docs_scod_a` da
INNER JOIN `scod_a` a ON a.`id` = da.`scod_a_id`
WHERE a.`ver_a` IN ('AA', 'AB')
GROUP BY da.`doc_id`
) ad ON ad.`doc_id` = d.`id`
LEFT JOIN
(
SELECT db.`doc_id`, COUNT(db.`doc_id`) AS bdcount FROM `docs_scod_b` db
INNER JOIN `scod_b` b ON b.`id` = db.`scod_b_id`
WHERE b.`ver_b` IN ('BA', 'BB')
GROUP BY db.`doc_id`
) bd ON bd.`doc_id` = d.`id`
to be a Single left join just to ease its use in my code, while making it no less slower?
Let me first emphasize that your method of doing the calculation is the better method. You have two separate dimensions and aggregating them separately is often the most efficient method for doing the calculation. It is also the most scalable method.
That said, your query should be equivalent to this version:
SELECT d.id,
count(distinct a.id),
count(distinct b.id)
FROM docs d left join
docs_scod_a da
ON da.doc_id = d.id LEFT JOIN
scod_a a
ON a.id = da.scod_a_id AND a.ver_a IN ('AA', 'AB') LEFT JOIN
docs_scod_b db
ON db.doc_id = d.id LEFT JOIN
scod_b b
ON b.id = db.scod_b_id AND b.ver_b IN ('BA', 'BB')
GROUP BY d.id
ORDER BY d.id;
This query is more expensive than it looks, because the COUNT(DISTINCT) incurs additional overhead compared to COUNT().
And here is the SQL Fiddle.
And, because LEFT JOIN can return NULL values, your query is more correctly written as:
SELECT d.`id`, COALESCE(adcount, 0) + COALESCE(bdcount, 0)
If you were having problems with the results, this small change might fix those problems.
Performance may be a big problem, depending on sizes of each table. It appears to be an "inflate-deflate" situation since it first "inflates" the number of rows via JOIN, then "deflates" via GROUP BY. The formulation below avoids inflation-deflation.
But first, if I understand this subquery correctly, this
SELECT da.`doc_id`, COUNT(da.`doc_id`) AS adcount
FROM `docs_scod_a` da
INNER JOIN `scod_a` a ON a.`id` = da.`scod_a_id`
WHERE a.`ver_a` IN ('AA', 'AB')
GROUP BY da.`doc_id`
can be rewritten as
SELECT `doc_id`,
( SELECT COUNT(*)
FROM `scod_a`
WHERE `id` = da.`scod_a_id`
AND `ver_a` IN ('AA', 'AB')
) AS adcount
FROM `docs_scod_a` AS da
If that is correct, then the entire query becomes
SELECT d.id,
( SELECT COUNT(*)
FROM docs_scod_a ds
JOIN scod_a s ON s.id = ds.scod_a_id
WHERE ds.doc_id = d.id
AND s.ver_a IN ('AA', 'AB')
) +
( SELECT COUNT(*)
FROM docs_scod_b ds
JOIN scod_b s ON s.id = ds.scod_b_id
WHERE ds.doc_id = d.id
AND s.ver_b IN ('BA', 'BB')
)
FROM docs AS d
Which needs these indexes:
docs_scod_a: (doc_id, scod_a_id), (scod_a_id, doc_id)
docs_scod_b: (doc_id, scod_b_id), (scod_b_id, doc_id)
scod_a: (ver_a, id)
scod_b: (ver_b, id)
docs: -- presumably has PRIMARY KEY(id)
Note the lack of GROUP BY.
docs_scod_a smells like a many-to-many mapping table. I recommend you follow the tips here.
(No COALESCE is needed since COUNT will simply return zero.)
(I don't know whether my version is better (faster or whatever) than Gordon's, nor whether my indexes will help his formulation.)

Understanding why this query is slow

The below query is very slow (takes around 1 second), but is only searching approx 2500 records (+ inner joined tables).
if i remove the ORDER BY, the query runs in much less time (0.05 or less)
OR if i remove the part nested select below "# used to select where no ProfilePhoto specified" it also runs fast, but i need both of these included.
I have indexes (or primary key) on :tPhoto_PhotoID, PhotoID, p.Enabled, CustomerID, tCustomer_CustomerID, ProfilePhoto (bool), u.UserName, e.PrivateEmail, m.tUser_UserID, Enabled, Active, m.tMemberStatuses_MemberStatusID, e.tCustomerMembership_MembershipID, e.DateCreated
(do i have too many indexes? my understanding is add them anywhere i use WHERE or ON)
The Query :
SELECT e.CustomerID,
e.CustomerName,
e.Location,
SUBSTRING_INDEX(e.CustomerProfile,' ', 25) AS Description,
IFNULL(p.PhotoURL, PhotoTable.PhotoURL) AS PhotoURL
FROM tCustomer e
LEFT JOIN (tCustomerPhoto ep INNER JOIN tPhoto p ON (ep.tPhoto_PhotoID = p.PhotoID AND p.Enabled=1))
ON e.CustomerID = ep.tCustomer_CustomerID AND ep.ProfilePhoto = 1
# used to select where no ProfilePhoto specified
LEFT JOIN ((SELECT pp.PhotoURL, epp.tCustomer_CustomerID
FROM tPhoto pp
LEFT JOIN tCustomerPhoto epp ON epp.tPhoto_PhotoID = pp.PhotoID
GROUP BY epp.tCustomer_CustomerID) AS PhotoTable) ON e.CustomerID = PhotoTable.tCustomer_CustomerID
INNER JOIN tUser u ON u.UserName = e.PrivateEmail
INNER JOIN tmembers m ON m.tUser_UserID = u.UserID
WHERE e.Enabled=1
AND e.Active=1
AND m.tMemberStatuses_MemberStatusID = 2
AND e.tCustomerMembership_MembershipID != 6
ORDER BY e.DateCreated DESC
LIMIT 12
i have similar queries that but they run much faster.
any opinions would be grateful:
Until we get more clarity on your question between working in other query etc..Try EXPLAIN {YourSelectQuery} in MySQL client and see the suggestions to improve the performance.

MAX() Function not working as expected

I've created sqlfiddle to try and get my head around this http://sqlfiddle.com/#!2/21e72/1
In the query, I have put a max() on the compiled_date column but the recommendation column is still coming through incorrect - I'm assuming that a select statement will need to be inserted on line 3 somehow?
I've tried the examples provided by the commenters below but I think I just need to understand this from a basic query to begin with.
As others have pointed out, the issue is that some of the select columns are neither aggregated nor used in the group by clause. Most DBMSs won't allow this at all, but MySQL is a little relaxed on some of the standards...
So, you need to first find the max(compiled_date) for each case, then find the recommendation that goes with it.
select r.case_number, r.compiled_date, r.recommendation
from reporting r
join (
SELECT case_number, max(compiled_date) as lastDate
from reporting
group by case_number
) s on r.case_number=s.case_number
and r.compiled_date=s.lastDate
Thank you for providing sqlFiddle. But only reporting data is given. we highly appreciate if you give us sample data of whole tables.
Anyway, Could you try this?
SELECT
`case`.number,
staff.staff_name AS ``case` owner`,
client.client_name,
`case`.address,
x.mx_date,
report.recommendation
FROM
`case` INNER JOIN (
SELECT case_number, MAX(compiled_date) as mx_date
FROM report
GROUP BY case_number
) x ON x.case_number = `case`.number
INNER JOIN report ON x.case_number = report.case_number AND report.compiled_date = x.mx_date
INNER JOIN client ON `case`.client_number = client.client_number
INNER JOIN staff ON `case`.staff_number = staff.staff_number
WHERE
`case`.active = 1
AND staff.staff_name = 'bob'
ORDER BY
`case`.number ASC;
Check below query:
SELECT c.number, s.staff_name AS `case owner`, cl.client_name,
c.address, MAX(r.compiled_date), r.recommendation
FROM case c
INNER JOIN (SELECT r.case_number, r.compiled_date, r.recommendation
FROM report r ORDER BY r.case_number, r.compiled_date DESC
) r ON r.case_number = c.number
INNER JOIN client cl ON c.client_number = cl.client_number
INNER JOIN staff s ON c.staff_number = s.staff_number
WHERE c.active = 1 AND s.staff_name = 'bob'
GROUP BY c.number
ORDER BY c.number ASC
SELECT
case.number,
staff.staff_name AS `case owner`,
client.client_name,
case.address,
(select MAX(compiled_date)from report where case_number=case.number),
report.recommendation
FROM
case
INNER JOIN report ON report.case_number = case.number
INNER JOIN client ON case.client_number = client.client_number
INNER JOIN staff ON case.staff_number = staff.staff_number
WHERE
case.active = 1 AND
staff.staff_name = 'bob'
GROUP BY
case.number
ORDER BY
case.number ASC
try this

Replacing Subqueries with Joins in MySQL

I have the following query:
SELECT PKID, QuestionText, Type
FROM Questions
WHERE PKID IN (
SELECT FirstQuestion
FROM Batch
WHERE BatchNumber IN (
SELECT BatchNumber
FROM User
WHERE RandomString = '$key'
)
)
I've heard that sub-queries are inefficient and that joins are preferred. I can't find anything explaining how to convert a 3+ tier sub-query to join notation, however, and can't get my head around it.
Can anyone explain how to do it?
SELECT DISTINCT a.*
FROM Questions a
INNER JOIN Batch b
ON a.PKID = b.FirstQuestion
INNER JOIN User c
ON b.BatchNumber = c.BatchNumber
WHERE c.RandomString = '$key'
The reason why DISTINCT was specified is because there might be rows that matches to multiple rows on the other tables causing duplicate record on the result. But since you are only interested on records on table Questions, a DISTINCT keyword will suffice.
To further gain more knowledge about joins, kindly visit the link below:
Visual Representation of SQL Joins
Try :
SELECT q.PKID, q.QuestionText, q.Type
FROM Questions q
INNER JOIN Batch b ON q.PKID = b.FirstQuestion
INNER JOIN User u ON u.BatchNumber = q.BatchNumber
WHERE u.RandomString = '$key'
select
q.pkid,
q.questiontext,
q.type
from user u
join batch b
on u.batchnumber = b.batchnumber
join questions q
on b.firstquestion = q.pkid
where u.randomstring = '$key'
Since your WHERE clause filters on the USER table, start with that in the FROM clause. Next, apply your joins backwards.
In order to do this correctly, you need distinct in the subquery. Otherwise, you might multiply rows in the join version:
SELECT q.PKID, q.QuestionText, q.Type
FROM Questions q join
(select distinct FirstQuestion
from Batch b join user u
on b.batchnumber = u.batchnumber and
u.RandomString = '$key'
) fq
on q.pkid = fq.FirstQuestion
As to whether the in or join version is better . . . that depends. In some cases, particularly if the fields are indexed, the in version might be fine.

Why Does My MySQL Query Using a Subselect Hang?

The following query hangs: (although subqueries perfomed separately are fine)
I don't know how to make the explain table look ok. If someone tells me, I'll clean it up.
select
sum(grades.points)) as p,
from assignments
left join grades using (assignmentID)
where gradeID IN
(select grades.gradeID
from assignments
left join grades using (assignmentID)
where ... grades.date <= '1255503600' AND grades.date >= '984902400'
group by assignmentID order by grades.date DESC);
I think the problem is with the first grades table... the type ALL with that many rows seems to be the cause.. Everything is indexed.
I uploaded the table as an image. Couldn't get the formatting right:
http://imgur.com/AjX34.png
A commenter wanted the full where clause:
explain extended select count(assignments.assignmentID) as asscount, sum(TRIM(TRAILING '-' FROM grades.points)) as p, sum(assignments.points) as t
from assignments left join grades using (assignmentID)
where gradeID IN
(select grades.gradeID from assignments left join grades using (assignmentID) left join as_types on as_types.ID = assignments.type
where assignments.classID = '7815'
and (assignments.type = 30170 )
and grades.contactID = 7141
and grades.points REGEXP '^[-]?[0-9]+[-]?'
and grades.points != '-'
and grades.points != ''
and (grades.pointsposs IS NULL or grades.pointsposs = '')
and grades.date <= '1255503600'
AND grades.date >= '984902400'
group by assignmentID
order by grades.date DESC);
See "The unbearable slowness of IN":
http://www.artfulsoftware.com/infotree/queries.php#568
Super messy, but: (thanks for everyone's help)
SELECT *
FROM grades
LEFT JOIN assignments ON grades.assignmentID = assignments.assignmentID
RIGHT JOIN (
SELECT g.gradeID
FROM assignments a
LEFT JOIN grades g
USING ( assignmentID )
WHERE a.classID = '7815'
AND (
a.type =30170
)
AND g.contactID =7141
g.points
REGEXP '^[-]?[0-9]+[-]?'
AND g.points != '-'
AND g.points != ''
AND (
g.pointsposs IS NULL
OR g.pointsposs = ''
)
AND g.date <= '1255503600'
AND g.date >= '984902400'
GROUP BY assignmentID
ORDER BY g.date DESC
) AS t1 ON t1.gradeID = grades.gradeID
Suppose you use a Real Database (ie, any database except MySQL, but I'll use Postgres as an example) to do this query :
SELECT * FROM ta WHERE aid IN (SELECT subquery)
a Real Database would look at the subquery and estimate its rowcount :
If the rowcount is small (say, less than a few millions)
It would run the subquery, then build an in-memory hash of ids, which also makes them unique, which is a feature of IN().
Then, if the number of rows pulled from ta is a small part of ta, it would use a suitable index to pull the rows. Or, if a major part of the table is selected, it would just scan it entirely, and lookup each id in the hash, which is very fast.
If however the subquery rowcount is quite large
The database would probably rewrite it as a merge JOIN, adding a Sort+Unique to the subquery.
However, you are using MySQL. In this case, it will not do any of this (it is gonna re-execute the subquery for each row of your table) so it will take 1000 years. Sorry.
If your subquery performs fine when it is executed separately, then try using a JOIN rather than IN, like this:
select count(assignments.assignmentID) as asscount, sum(TRIM(TRAILING '-' FROM grades.points)) as p, sum(assignments.points) as t
from assignments left join grades using (assignmentID)
join
(select grades.gradeID from assignments left join grades using (assignmentID) left join as_types on as_types.ID = assignments.type
where assignments.classID = '7815'
and (assignments.type = 30170 )
and grades.contactID = 7141
and grades.points REGEXP '^[-]?[0-9]+[-]?'
and grades.points != '-'
and grades.points != ''
and (grades.pointsposs IS NULL or grades.pointsposs = '')
and grades.date <= '1255503600'
AND grades.date >= '984902400'
group by assignmentID
order by grades.date DESC) using (gradeID);
There really isn't enough information to answer your question, and you've put a ... in the middle of the where clause which is weird. How big are the tables involved and what are the indexes?
Having said that, if there are too many terms in an in clause, you can see seriously degraded performance. Replace the use of in with a right join.
For starters, the table as_types in the in clause is not used. Left joining it serves no purpose so get rid of it.
That leaves the in clause having only the assignments and grades table from the outer query. Clearly the wheres the modify assignments belong in the where clause for the outer query. You should move all of the where grades=whatever into the on clause of the left join to grades.
The query is a little tough to follow, but I suspect that the subquery isn't necessary at all.
It seems like your query is basically thus:
SELECT FOO()
FROM assignments LEFT JOIN grades USING (assignmentID)
WHERE gradeID IN
(
SELECT grades.gradeID
FROM assignments LEFT JOIN grades USING (assignmentID)
WHERE your_conditions = TRUE
);
But, you're not doing anything really fancy in the where clause in the subquery.
I suspect something more like
SELECT FOO()
FROM assignments LEFT JOIN grades USING (assignmentID)
GROUP BY groupings
WHERE your_conditions_with_some_tweaks = TRUE;
would work just as well.
If I'm missing some key logic here please comment back and I'll edit/delete this post.