MySQL COUNT of multiple left joins - optimization - mysql

I have a query that gets counts from multiple tables by using LEFT JOINs and subqueries. The idea is to get a count of the various activities a member has participated in.
The schema looks like this:
member
PK member_id
table1
PK tbl1_id
FK member_id
table2
PK tbl2_id
FK member_id
table3
PK tbl3_id
FK member_id
My query looks like this:
SELECT t1.num1,t2.num2,t3.num3
FROM member m
LEFT JOIN
(
SELECT member_id,COUNT(*) as num1
FROM table1
GROUP BY member_id
) t1 ON t1.member_id = m.member_id
LEFT JOIN
(
SELECT member_id,COUNT(*) as num2
FROM table2
GROUP BY member_id
) t2 ON t2.member_id = m.member_id
LEFT JOIN
(
SELECT member_id,COUNT(*) as num3
FROM table3
GROUP BY member_id
) t3 ON t3.member_id = m.member_id
WHERE m.member_id = 27
Here 27 is a test id. The actual query joins more than three tables and is run many times with a different member_id each time. The problem is that this query runs quite slowly. I get the info I need, but I am wondering if anyone could suggest a way to optimize it. Any advice is very much appreciated. Thanks much.

You should refactor your query. You can do this by reordering the way the query collects the data. How?
Apply the WHERE clause first
Apply JOINs last
Here is your original query:
SELECT t1.num1,t2.num2,t3.num3
FROM member m
LEFT JOIN
(
SELECT member_id,COUNT(*) as num1
FROM table1
GROUP BY member_id
) t1 ON t1.member_id = m.member_id
LEFT JOIN
(
SELECT member_id,COUNT(*) as num2
FROM table2
GROUP BY member_id
) t2 ON t2.member_id = m.member_id
LEFT JOIN
(
SELECT member_id,COUNT(*) as num3
FROM table3
GROUP BY member_id
) t3 ON t3.member_id = m.member_id
WHERE m.member_id = 27
Here is your new query:
SELECT
IFNULL(t1.num1,0) num1,
IFNULL(t2.num2,0) num2,
IFNULL(t3.num3,0) num3
FROM
(
SELECT * FROM member m
WHERE member_id = 27
) m
LEFT JOIN
(
SELECT member_id,COUNT(*) as num1
FROM table1
WHERE member_id = 27
GROUP BY member_id
) t1 ON t1.member_id = m.member_id
LEFT JOIN
(
SELECT member_id,COUNT(*) as num2
FROM table2
WHERE member_id = 27
GROUP BY member_id
) t2 ON t2.member_id = m.member_id
LEFT JOIN
(
SELECT member_id,COUNT(*) as num3
FROM table3
WHERE member_id = 27
GROUP BY member_id
) t3 ON t3.member_id = m.member_id
;
BTW I changed member m into SELECT * FROM member m WHERE member_id = 27 in case you need any information about member 27. I also added the IFNULL function to each result to produce 0 in case count is NULL.
You need to make absolutely sure
member_id is the primary key of the member table
member_id is indexed in table1, table2, and table3
Give it a Try !!!
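The idea of filtering before grouping can be tried out on a toy copy of the schema. The sketch below is only an illustration: it uses Python's built-in sqlite3 module in place of MySQL, and the sample rows, index names, and the helper name activity_counts are assumptions made for the example, not part of the original question.

```python
import sqlite3

# Hypothetical miniature of the schema from the question (SQLite stands in for MySQL).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE member (member_id INTEGER PRIMARY KEY);
    CREATE TABLE table1 (tbl1_id INTEGER PRIMARY KEY, member_id INTEGER);
    CREATE TABLE table2 (tbl2_id INTEGER PRIMARY KEY, member_id INTEGER);
    CREATE TABLE table3 (tbl3_id INTEGER PRIMARY KEY, member_id INTEGER);
    CREATE INDEX idx_t1_member ON table1 (member_id);
    CREATE INDEX idx_t2_member ON table2 (member_id);
    CREATE INDEX idx_t3_member ON table3 (member_id);
    INSERT INTO member VALUES (27), (28);
    INSERT INTO table1 (member_id) VALUES (27), (27), (28);
    INSERT INTO table2 (member_id) VALUES (27), (28), (28);
    -- table3 deliberately has no rows for member 27
    INSERT INTO table3 (member_id) VALUES (28);
""")


def activity_counts(member_id):
    """Filter every table by member_id first, then join the one-row results."""
    sql = """
        SELECT IFNULL(t1.num1, 0), IFNULL(t2.num2, 0), IFNULL(t3.num3, 0)
        FROM (SELECT member_id FROM member WHERE member_id = :id) m
        LEFT JOIN (SELECT member_id, COUNT(*) AS num1 FROM table1
                   WHERE member_id = :id GROUP BY member_id) t1
               ON t1.member_id = m.member_id
        LEFT JOIN (SELECT member_id, COUNT(*) AS num2 FROM table2
                   WHERE member_id = :id GROUP BY member_id) t2
               ON t2.member_id = m.member_id
        LEFT JOIN (SELECT member_id, COUNT(*) AS num3 FROM table3
                   WHERE member_id = :id GROUP BY member_id) t3
               ON t3.member_id = m.member_id
    """
    # Returns None when the member does not exist at all.
    return conn.execute(sql, {"id": member_id}).fetchone()


print(activity_counts(27))  # (2, 1, 0)
```

Because each derived table is restricted to one member before GROUP BY runs, it can contain at most one row, and the missing table3 entry shows up as 0 thanks to IFNULL.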

Without knowing your schema and what you've done for indexes, one POSSIBLE way to make this faster is:
SELECT (select ifnull(count(*),0) from table1 where table1.member_id = m.member_id) as num1,
(select ifnull(count(*),0) from table2 where table2.member_id = m.member_id) as num2,
(select ifnull(count(*),0) from table3 where table3.member_id = m.member_id) as num3
from member m
WHERE m.member_id = 27
Now, this is a slightly risky recommendation, simply because I don't know anything about your DB or what else is running, or where the bottlenecks are.
In general, it would be a good idea to post an explain plan with your query to get a better answer.
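As a rough illustration of the correlated-subquery form, here is a small sketch that runs it against an in-memory SQLite database from Python. SQLite is only a stand-in for MySQL, and the sample data and the function name counts_via_subqueries are made up for the example. It says nothing about which form is faster on a real MySQL server; that still needs an explain plan.

```python
import sqlite3

# Toy data; the table and column names follow the question, the rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE member (member_id INTEGER PRIMARY KEY);
    CREATE TABLE table1 (tbl1_id INTEGER PRIMARY KEY, member_id INTEGER);
    CREATE TABLE table2 (tbl2_id INTEGER PRIMARY KEY, member_id INTEGER);
    CREATE TABLE table3 (tbl3_id INTEGER PRIMARY KEY, member_id INTEGER);
    INSERT INTO member VALUES (27), (28);
    INSERT INTO table1 (member_id) VALUES (27), (27), (27), (28);
    INSERT INTO table2 (member_id) VALUES (28);
    INSERT INTO table3 (member_id) VALUES (27), (27);
""")


def counts_via_subqueries(member_id):
    """One scalar subquery per table, each correlated on the outer member row."""
    sql = """
        SELECT (SELECT COUNT(*) FROM table1 t WHERE t.member_id = m.member_id) AS num1,
               (SELECT COUNT(*) FROM table2 t WHERE t.member_id = m.member_id) AS num2,
               (SELECT COUNT(*) FROM table3 t WHERE t.member_id = m.member_id) AS num3
        FROM member m
        WHERE m.member_id = ?
    """
    return conn.execute(sql, (member_id,)).fetchone()


print(counts_via_subqueries(27))  # (3, 0, 2)
```

COUNT(*) over an empty set already yields 0, so the zero for table2 appears without any NULL handling.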

SELECT num1, num2, count(*) as num3
FROM (
SELECT member_id, num1, count(*) as num2
FROM (
SELECT member_id, count(*) as num1
FROM member
LEFT JOIN table1 USING (member_id)
WHERE member_id = 27) as m1
LEFT JOIN table2 USING (member_id)) as m2
LEFT JOIN table3 USING (member_id);

Related

Wrong count result in mysql when joining two tables

I am trying to join two tables and get counts grouped by a specific field. However, it outputs the same count values even if the other table consists of only two rows. How should I fix this?
Here's my code:
SELECT tbl1.preferredDay, COUNT(tbl1.preferredDay) as count_1, COUNT(tbl2.preferredDay) as count_2
FROM tblschedule as tbl1
LEFT JOIN tblappointments as tbl2 ON (tbl1.preferredDay = tbl2.preferredDay)
WHERE tbl1.preferredDay = tbl2.preferredDay
GROUP BY preferredDay;
Here is the output but it should be [15, 0][3, 3]
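Your query joins the raw rows before counting, so each schedule row is repeated once per matching appointment and both COUNT() expressions end up counting the same joined rows.
This is a working query for MySQL 8: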
with tbl1 as (
SELECT preferredDay, count(1) as count_1
FROM tblschedule
GROUP BY preferredDay
),
tbl2 as (
SELECT preferredDay, count(1) as count_2
FROM tblappointments
GROUP BY preferredDay
)
select t1.preferredDay, t1.count_1, t2.count_2
from tbl1 t1
inner join tbl2 t2 on t1.preferredDay = t2.preferredDay
The two WITH clauses compute each count separately, and an INNER JOIN then combines the results.
For MySQL 5.7 and lower:
select t1.preferredDay, t1.count_1, t2.count_2
from (
SELECT preferredDay, count(1) as count_1
FROM tblschedule
GROUP BY preferredDay
) as t1
inner join (
SELECT preferredDay, count(1) as count_2
FROM tblappointments
GROUP BY preferredDay
) as t2 on t1.preferredDay = t2.preferredDay
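To see why counting after the join goes wrong and why aggregating first helps, here is a small sketch using Python's sqlite3 module as a stand-in for MySQL. The day names and row counts are invented to match the [15, 0] and [3, 3] figures from the question. Note one deliberate variation from the answer above: the sketch joins the two aggregates with a LEFT JOIN plus COALESCE rather than an INNER JOIN, so that a day with no appointments is kept with a count of 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblschedule (preferredDay TEXT)")
conn.execute("CREATE TABLE tblappointments (preferredDay TEXT)")
# Invented sample: 15 schedule rows with no appointments, 3 schedule rows with 3 appointments.
conn.executemany("INSERT INTO tblschedule VALUES (?)",
                 [("Monday",)] * 15 + [("Tuesday",)] * 3)
conn.executemany("INSERT INTO tblappointments VALUES (?)", [("Tuesday",)] * 3)

# Counting after the join: every Tuesday schedule row pairs with every Tuesday appointment.
naive = conn.execute("""
    SELECT s.preferredDay, COUNT(s.preferredDay), COUNT(a.preferredDay)
    FROM tblschedule s
    LEFT JOIN tblappointments a ON s.preferredDay = a.preferredDay
    GROUP BY s.preferredDay
    ORDER BY s.preferredDay
""").fetchall()

# Counting before the join: each side is reduced to one row per day first.
fixed = conn.execute("""
    SELECT s.preferredDay, s.count_1, COALESCE(a.count_2, 0)
    FROM (SELECT preferredDay, COUNT(*) AS count_1
          FROM tblschedule GROUP BY preferredDay) s
    LEFT JOIN (SELECT preferredDay, COUNT(*) AS count_2
               FROM tblappointments GROUP BY preferredDay) a
           ON s.preferredDay = a.preferredDay
    ORDER BY s.preferredDay
""").fetchall()

print(naive)  # [('Monday', 15, 0), ('Tuesday', 9, 9)]
print(fixed)  # [('Monday', 15, 0), ('Tuesday', 3, 3)]
```

The 9 in the naive result is 3 schedule rows times 3 appointment rows, which is the inflated, identical count described in the question.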

Only return if main query has all the rows that subquery returns

I want to write a query that would return data if and only if it has all the ids that the subquery would return
What I've tried:
SELECT * FROM main_table WHERE id IN ( SELECT id FROM sub_table WHERE SOMECONDITION)
but it returns rows as soon as it finds at least one matching row in the subquery, which I do not want.
Mainly I want to be able to check whether main_table has all the ids that sub_table returns
EDIT:
the SOMECONDITION has nothing to do with main query
You can achieve that with a LEFT JOIN as well. As a small example, suppose main_table has ids 1,2,3 and sub_table has ids 1,2,5,6 (all of which match SOMECONDITION). Then the following query,
select s.id sid,m.id mid from
sub_table s left join main_table m on s.id = m.id where s.SOMECONDITION
would result in:
sid mid
1 1
2 2
5 null
6 null
In this case main table does not have all the IDs that the sub query returns.
You can use COUNT to return results only if count(sid) equals count(mid), since COUNT ignores NULL values. Your query then becomes:
SELECT * FROM main_table
WHERE id IN ( SELECT id FROM sub_table WHERE SOMECONDITION)
AND
(select count(Distinct sid)
from (select s.id sid,m.id mid from sub_table s left join main_table m on
s.id = m.id where s.SOMECONDITION) t1 )
= (select count(Distinct mid)
from (select s.id sid,m.id mid from sub_table s left join main_table m on
s.id = m.id where s.SOMECONDITION) t2 )
This is rather painful in MySQL. The following gets the number of matches:
SELECT COUNT(DISTINCT id)
FROM main_table
WHERE id IN ( SELECT id FROM sub_table WHERE SOMECONDITION);
If you only want to select the rows, I would suggest that you get the count in PHP and use conditional logic there. But you can also do that in SQL:
SELECT mt.*
FROM main_table mt
WHERE mt.id IN ( SELECT st.id FROM sub_table st WHERE SOMECONDITION) AND
(SELECT COUNT(DISTINCT mt.id)
FROM main_table mt
WHERE mt.id IN (SELECT st.id FROM sub_table st WHERE SOMECONDITION)
) =
(SELECT COUNT(DISTINCT st.id) FROM sub_table st WHERE SOMECONDITION);
You need an inner join in this case.
SELECT m.* FROM main_table m INNER JOIN sub_table s ON m.id = s.id WHERE SOMECONDITION
I assume SOMECONDITION contains only conditions against sub_table. With SOMECONDITION, it's guaranteed that the result from sub_table must meet the condition, and with INNER JOIN, all ids from sub_table must be found in main_table in order to return something from main_table.
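The count-comparison approach from the answers above can be checked on the 1,2,3 versus 1,2,5,6 example. The following is a minimal sketch using Python's sqlite3 module as a stand-in for MySQL; the flag column playing the role of SOMECONDITION and the helper name rows_if_all_present are assumptions made for the example.

```python
import sqlite3


def rows_if_all_present(main_ids, sub_rows):
    """Return matching main_table ids only if every qualifying sub_table id exists.

    sub_rows holds (id, flag) pairs; flag = 1 is a hypothetical SOMECONDITION.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE main_table (id INTEGER PRIMARY KEY)")
    conn.execute("CREATE TABLE sub_table (id INTEGER, flag INTEGER)")
    conn.executemany("INSERT INTO main_table VALUES (?)", [(i,) for i in main_ids])
    conn.executemany("INSERT INTO sub_table VALUES (?, ?)", sub_rows)
    sql = """
        SELECT mt.id
        FROM main_table mt
        WHERE mt.id IN (SELECT st.id FROM sub_table st WHERE st.flag = 1)
          -- number of qualifying ids found in main_table ...
          AND (SELECT COUNT(DISTINCT m2.id) FROM main_table m2
               WHERE m2.id IN (SELECT st.id FROM sub_table st WHERE st.flag = 1))
          -- ... must equal the number of qualifying ids overall
            = (SELECT COUNT(DISTINCT st.id) FROM sub_table st WHERE st.flag = 1)
        ORDER BY mt.id
    """
    rows = [r[0] for r in conn.execute(sql)]
    conn.close()
    return rows


# ids 5 and 6 qualify but are missing from main_table, so nothing is returned
print(rows_if_all_present([1, 2, 3], [(1, 1), (2, 1), (5, 1), (6, 1)]))  # []
# id 5 no longer qualifies, so every qualifying id is present
print(rows_if_all_present([1, 2, 3], [(1, 1), (2, 1), (5, 0)]))  # [1, 2]
```

The two COUNT(DISTINCT ...) subqueries do not reference the outer row, so the comparison acts as an all-or-nothing switch for the whole result.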

Why this sql query keeps showing me a syntax error

The syntax error showed at "as t3" in the following code.
I'm trying to full outer join 2 tables, but since MySQL does not have FULL JOIN, I'm using UNION to combine a left-joined and a right-joined table.
I cannot find any syntax error whatsoever, but it just won't work...
SELECT
name, f.author_nameauthor_id, c1, c2
FROM
(
SELECT
author_id, c1, c2
FROM
(
(SELECT
author_id, amount AS c1
FROM Author_Keyword_Count
WHERE keyword_id=19478) AS t1
LEFT OUTER JOIN
(SELECT
author_id, amount AS c2
FROM Author_Keyword_Count
WHERE keyword_id=33944) AS t2
ON author_id=author_id
)
UNION
(
(SELECT author_id, amount AS c1 FROM Author_Keyword_Count WHERE keyword_id=19478) AS t3
RIGHT OUTER JOIN
(SELECT author_id, amount AS c2 FROM Author_Keyword_Count WHERE keyword_id=33944) AS t4
ON author_id=author_id
)
) AS f
LEFT OUTER JOIN Author ON author_id=id;
Some observations included below...
SELECT name
, f.author_nameauthor_id
, c1
, c2
FROM
( SELECT author_id
, c1
, c2
FROM
( -- <-- something missing here!?!
( SELECT author_id
, amount c1
FROM Author_Keyword_Count
WHERE keyword_id = 19478
) t1
LEFT
JOIN
( SELECT author_id
, amount c2
FROM Author_Keyword_Count
WHERE keyword_id = 33944
) t2
ON author_id = author_id -- <-- which author id equals which other author id!?!?
)
UNION
( -- <-- something missing here!?!?
( SELECT author_id
, amount c1
FROM Author_Keyword_Count
WHERE keyword_id = 19478
) t3
RIGHT
JOIN -- for ease of conceptualising, consider restructuring your logic to use a LEFT JOIN instead of a RIGHT JOIN
( SELECT author_id
, amount c2
FROM Author_Keyword_Count
WHERE keyword_id = 33944
) t4
ON author_id = author_id -- <-- which author id equals which other author id!?!?
)
) f
LEFT
JOIN Author
ON author_id = id; -- <-- which author id equals which id!?!?
Conceptually, you've written (SELECT 'x') a JOIN (SELECT 'y') b which is (I think) invalid. You could instead write SELECT * FROM (SELECT 'x') a JOIN (SELECT 'y') b, but perhaps there's a more elegant way of structuring this query - if only we knew what you were actually trying to do.
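For reference, one common way to emulate a full outer join is a UNION of two complete SELECT statements, each with its own FROM clause and with every join column qualified by its table alias. The sketch below is a guess at the intent of the question, not a confirmed fix: it uses Python's sqlite3 module as a stand-in for MySQL, the sample amounts are invented, and it writes the second half as a LEFT JOIN with the tables swapped instead of a RIGHT JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Author_Keyword_Count (author_id INTEGER, keyword_id INTEGER, amount INTEGER);
    -- Invented rows: author 1 only has keyword 19478, author 3 only has 33944.
    INSERT INTO Author_Keyword_Count VALUES
        (1, 19478, 5), (2, 19478, 7),
        (2, 33944, 3), (3, 33944, 4);
""")

full_outer = conn.execute("""
    SELECT t1.author_id AS author_id, t1.amount AS c1, t2.amount AS c2
    FROM (SELECT author_id, amount FROM Author_Keyword_Count
          WHERE keyword_id = 19478) t1
    LEFT JOIN (SELECT author_id, amount FROM Author_Keyword_Count
               WHERE keyword_id = 33944) t2
           ON t1.author_id = t2.author_id
    UNION
    SELECT t2.author_id, t1.amount, t2.amount
    FROM (SELECT author_id, amount FROM Author_Keyword_Count
          WHERE keyword_id = 33944) t2
    LEFT JOIN (SELECT author_id, amount FROM Author_Keyword_Count
               WHERE keyword_id = 19478) t1
           ON t1.author_id = t2.author_id
    ORDER BY 1
""").fetchall()

print(full_outer)  # [(1, 5, None), (2, 7, 3), (3, None, 4)]
```

UNION (rather than UNION ALL) removes the row for author 2, which both halves produce identically.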

Passing the result from one subquery to an IN clause in another subquery in MySQL

Not sure if this is possible, but if it is it would make my query much faster.
Basically I have a query like this:
SELECT *
FROM (SELECT bar.id
FROM pivot_table
WHERE foo.id = x) t1
JOIN (SELECT count(*) c1, bar.id
FROM table
GROUP BY bar.id) t2 ON t1.id = t2.id
JOIN (SELECT count(*) c2, bar.id
FROM another_table
GROUP BY bar.id) t3 ON t1.id = t3.id
But this is quite slow because table and another_table are huge. But really I am only interested in those values resulting from the query in t1. So if I could somehow get those results into an IN clause for t2 and t3 the query ought to speed up significantly.
Is this possible?
Not too clear I guess. OK what I was thinking is that changing the query to something like:
SELECT *
FROM (GROUP_CONCAT (bar.id) as results
FROM pivot_table
WHERE foo.id = x) t1
JOIN (SELECT count(*) c1, bar.id
FROM table
WHERE bar.id IN (*results from t1*)
GROUP BY bar.id) t2 ON t1.id = t2.id
JOIN (SELECT count(*) c2, bar.id
FROM another_table
WHERE bar.id IN (*results from t1*)
GROUP BY bar.id) t3 ON t1.id = t3.id
Might be quicker because it narrows down the number of rows scanned in t2 and t3. Would that not be the case?
Everyone wants to see it, so here is the full query:
SELECT (k_group.count/jk_group.count) * (s_group.count/jk_group.count) AS ratio,
jk_group.k_id ,
jk_group.s_id
FROM
-- find the keywords for the job
(SELECT jk.keyword_id AS k_id
FROM jobs_keywords jk
WHERE job_id = 50100
)
extracted_keywords
-- calculate the necessary values using group_by functions
INNER JOIN
(SELECT COUNT(*) count,
skill_id AS s_id ,
keyword_id AS k_id
FROM jobs_keywords jk
JOIN jobs_skills js
ON js.job_id = jk.job_id
JOIN job_feed_details d
ON d.job_id = js.job_id
WHERE d.moderated = 1
GROUP BY skill_id,
keyword_id
)
jk_group
ON extracted_keywords.k_id = jk_group.k_id
INNER JOIN
(SELECT COUNT(*) count,
keyword_id AS k_id
FROM jobs_keywords jk
JOIN job_feed_details d
ON d.job_id = jk.job_id
WHERE d.moderated = 1
GROUP BY keyword_id
)
k_group
ON jk_group.k_id = k_group.k_id
INNER JOIN
(SELECT COUNT(*) count,
skill_id AS s_id
FROM jobs_skills js
JOIN job_feed_details d
ON d.job_id = js.job_id
WHERE d.moderated = 1
GROUP BY skill_id
)
s_group
ON jk_group.s_id = s_group.s_id
ORDER BY ratio DESC
LIMIT 25
SELECT COUNT(t1.id) c1, COUNT(t2.id) c2, COUNT(t3.id) c3, t1.id
FROM pivot_table t1
JOIN table t2 ON t1.id=t2.id
JOIN another_table t3 ON t3.id=t1.id where t1.id=x group by t1.id
Please make sure pivot_table.id, table.id and another_table.id are indexed.
About your query:
The problem with your query is that the driven tables use the join buffer. To make the query faster, you should increase your join buffer size.
I was able to accomplish what I was trying to do like so:
SELECT *
FROM (SELECT @var:=GROUP_CONCAT(bar.id) as results
FROM pivot_table
WHERE foo.id = x) t1
JOIN (SELECT count(*) c1, bar.id
FROM table
WHERE bar.id IN (@var)
GROUP BY bar.id) t2 ON t1.id = t2.id
JOIN (SELECT count(*) c2, bar.id
FROM another_table
WHERE bar.id IN (@var)
GROUP BY bar.id) t3 ON t1.id = t3.id
But the benefits in terms of speed were not too significant. I have now abandoned the one query approach in favor of many smaller queries, and that is much better.
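The idea from the question of restricting t2 and t3 to the ids found by t1 can also be written with a plain IN (SELECT ...) inside each aggregated subquery, without a user variable. The sketch below shows only that the shape is valid and returns the expected counts. It runs on SQLite via Python's sqlite3 module, the names big_a, big_b, foo_id and bar_id are hypothetical stand-ins for the elided schema, and whether this helps on a real MySQL server depends on the version and indexes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-ins for pivot_table, table and another_table.
    CREATE TABLE pivot_table (foo_id INTEGER, bar_id INTEGER);
    CREATE TABLE big_a (bar_id INTEGER);
    CREATE TABLE big_b (bar_id INTEGER);
    INSERT INTO pivot_table VALUES (1, 10), (1, 11), (2, 12);
    INSERT INTO big_a VALUES (10), (10), (11), (12), (12);
    INSERT INTO big_b VALUES (10), (11), (11), (11), (12);
""")


def counts_for_foo(x):
    """Aggregate only the rows whose bar_id belongs to the given foo_id."""
    sql = """
        SELECT t1.bar_id, t2.c1, t3.c2
        FROM (SELECT bar_id FROM pivot_table WHERE foo_id = :x) t1
        JOIN (SELECT bar_id, COUNT(*) AS c1 FROM big_a
              WHERE bar_id IN (SELECT bar_id FROM pivot_table WHERE foo_id = :x)
              GROUP BY bar_id) t2 ON t2.bar_id = t1.bar_id
        JOIN (SELECT bar_id, COUNT(*) AS c2 FROM big_b
              WHERE bar_id IN (SELECT bar_id FROM pivot_table WHERE foo_id = :x)
              GROUP BY bar_id) t3 ON t3.bar_id = t1.bar_id
        ORDER BY t1.bar_id
    """
    return conn.execute(sql, {"x": x}).fetchall()


print(counts_for_foo(1))  # [(10, 2, 1), (11, 1, 3)]
```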
Revision given actual query
I think you can whittle your query down to:
Select Count( Distinct jk.keyword_id )
* Count( Distinct js.skill_id )
/ Power( Count(*), 2 )
As ratio
, js.skill_id
, jk.keyword_id
From jobs_keywords As jk
Join jobs_skills As js
On js.job_id = jk.job_id
Where jk.job_id =50100
Group By js.skill_id, jk.keyword_id
Order By ratio Desc
Limit 25

MySQL: Multi-column join on several tables part II

I am adding a 5th table to an existing join. The original query will always return a single row because the where clause specifies a unique ID. Here are the tables we are using:
Table 1
carid, catid, makeid, modelid, caryear
Table 2
makeid, makename
Table 3
modelid, modelname
Table 4
catid, catname
Table 5
id, caryear, makename, modelname
Here is the existing query I am using:
SELECT a.*, e.citympg, e.hwympg
FROM table1 a
JOIN table2 b on a.makeid=b.makeid
JOIN table3 c on a.modelid=c.modelid
JOIN table4 d on a.catid=d.catid
JOIN table5 e on b.makename = e.make
and c.modelname = e.model
and a.caryear = e.year
WHERE a.carid = $carid;
There are 2 issues that I need to solve -
When there is no match on table 5, it does not return any results. It would seem that I need to do some sort of left join or split the query and do a union.
When there is a match on table 5, it returns multiple rows. Since the criteria that would return a single row is not being used, I would settle for an average of citympg and hwympg.
Can both objectives be achieved with a single query? How?
Assuming I understand what you want correctly... This query will constrain the results from table5 to one row per combination of the join criteria, returning average city/hwy mpg.
SELECT a.*, e.citympg, e.hwympg
FROM table1 a
JOIN table2 b on a.makeid=b.makeid
JOIN table3 c on a.modelid=c.modelid
JOIN table4 d on a.catid=d.catid
LEFT JOIN (SELECT year, make, model,
AVG(citympg) as citympg,
AVG(hwympg) as hwympg
FROM table5
GROUP BY year, make, model) e on b.makename = e.make
and c.modelname = e.model
and a.caryear = e.year
WHERE a.carid = $carid;
Note that it will return NULL mpg values when no record in table5 exists.
The usual approach is to use correlated subqueries like this:
SELECT a.*
, (SELECT avg(e.citympg)
FROM table5 e
WHERE e.make = b.makename
AND e.model = c.modelname
AND e.year = a.caryear
) as citympg
, (SELECT avg(e.hwympg)
FROM table5 e
WHERE e.make = b.makename
AND e.model = c.modelname
AND e.year = a.caryear
) as hwympg
FROM table1 a
JOIN table2 b on a.makeid=b.makeid
JOIN table3 c on a.modelid=c.modelid
JOIN table4 d on a.catid=d.catid
WHERE a.carid = $carid
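Both goals from the question (still getting a row when table 5 has no match, and collapsing multiple matches into averages) can be checked on toy data. The sketch below runs the LEFT JOIN against a pre-averaged derived table using Python's sqlite3 module as a stand-in for MySQL. The table5 column names follow the queries above (year, make, model, citympg, hwympg), the sample cars are invented, and the helper name mpg_for_car is hypothetical. It binds the car id as a parameter instead of interpolating $carid into the SQL string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (carid INTEGER, catid INTEGER, makeid INTEGER,
                         modelid INTEGER, caryear INTEGER);
    CREATE TABLE table2 (makeid INTEGER, makename TEXT);
    CREATE TABLE table3 (modelid INTEGER, modelname TEXT);
    CREATE TABLE table4 (catid INTEGER, catname TEXT);
    CREATE TABLE table5 (year INTEGER, make TEXT, model TEXT,
                         citympg REAL, hwympg REAL);
    -- Invented sample: car 1 has two mpg records, car 2 has none.
    INSERT INTO table1 VALUES (1, 1, 1, 1, 2010), (2, 1, 1, 2, 2012);
    INSERT INTO table2 VALUES (1, 'Honda');
    INSERT INTO table3 VALUES (1, 'Civic'), (2, 'Accord');
    INSERT INTO table4 VALUES (1, 'Sedan');
    INSERT INTO table5 VALUES (2010, 'Honda', 'Civic', 30, 40),
                              (2010, 'Honda', 'Civic', 32, 42);
""")


def mpg_for_car(carid):
    """Return (carid, avg city mpg, avg hwy mpg); mpg values are None without a match."""
    sql = """
        SELECT a.carid, e.citympg, e.hwympg
        FROM table1 a
        JOIN table2 b ON a.makeid = b.makeid
        JOIN table3 c ON a.modelid = c.modelid
        JOIN table4 d ON a.catid = d.catid
        LEFT JOIN (SELECT year, make, model,
                          AVG(citympg) AS citympg,
                          AVG(hwympg) AS hwympg
                   FROM table5
                   GROUP BY year, make, model) e
               ON b.makename = e.make
              AND c.modelname = e.model
              AND a.caryear = e.year
        WHERE a.carid = ?
    """
    return conn.execute(sql, (carid,)).fetchone()


print(mpg_for_car(1))  # (1, 31.0, 41.0)
print(mpg_for_car(2))  # (2, None, None)
```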