Odd behavior combining multiple tables and using COALESCE - mysql

I have a big query that I have been struggling with and tweaking for a while.
SELECT
tastingNotes.userID, tastingNotes.beerID, tastingNotes.noteID,
tastingNotes.note, user.userName,
COALESCE(sum(tasteNoteRate.score),0) as `score`
FROM
tastingNotes
INNER JOIN `user` on tastingNotes.userID = `user`.userID
LEFT JOIN tasteNoteRate on tastingNotes.noteID = tasteNoteRate.noteID
WHERE tastingNotes.beerID = 'C5RJc0'
GROUP BY tastingNotes.noteID
ORDER BY score DESC
LIMIT 0,50;
I am using COALESCE(sum(tasteNoteRate.score),0) so that notes without a score yet are returned with a score of zero.
The odd behavior is that when I should have gotten two results, the query returned only one note, with a score of zero.
When I then gave one of the notes a score, both showed up: one with its score and the other with zero.

Try
SELECT q.noteID, q.userID, q.beerID, q.note, q.score, u.userName
FROM (
SELECT n.noteID, n.userID, n.beerID, n.note, COALESCE(SUM(r.score), 0) score
FROM tastingNotes n LEFT JOIN tasteNoteRate r
ON n.noteID = r.noteID
WHERE n.beerID = 'C5RJc0'
GROUP BY n.noteID, n.userID, n.beerID, n.note
) q JOIN `user` u ON q.userID = u.userID
ORDER BY score DESC
LIMIT 50
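To check the behaviour on a small data set, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL. That is an assumption: SQLite is not MySQL, although LEFT JOIN, GROUP BY, and COALESCE behave the same way for this query. The sample rows and user names are invented for illustration.

```python
import sqlite3

# SQLite stands in for MySQL here (assumption); the sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user (userID INTEGER PRIMARY KEY, userName TEXT);
CREATE TABLE tastingNotes (noteID INTEGER PRIMARY KEY, userID INTEGER,
                           beerID TEXT, note TEXT);
CREATE TABLE tasteNoteRate (noteID INTEGER, score INTEGER);
INSERT INTO user VALUES (1, 'alice'), (2, 'bob');
INSERT INTO tastingNotes VALUES (10, 1, 'C5RJc0', 'hoppy'),
                                (11, 2, 'C5RJc0', 'malty');
INSERT INTO tasteNoteRate VALUES (10, 1), (10, 1);
""")

# Aggregate the ratings per note first, then join to user.
QUERY = """
SELECT q.noteID, u.userName, q.score
FROM (
    SELECT n.noteID, n.userID, COALESCE(SUM(r.score), 0) AS score
    FROM tastingNotes n
    LEFT JOIN tasteNoteRate r ON n.noteID = r.noteID
    WHERE n.beerID = 'C5RJc0'
    GROUP BY n.noteID, n.userID
) q
JOIN user u ON q.userID = u.userID
ORDER BY q.score DESC, q.noteID
"""

# One note has two ratings, the other has none.
rated = conn.execute(QUERY).fetchall()

# Remove every rating: both notes should still come back, each with 0.
conn.execute("DELETE FROM tasteNoteRate")
unrated = conn.execute(QUERY).fetchall()

print(rated)
print(unrated)
```

With this data, both notes are returned whether or not any rating exists, which is the result the question expects.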

Related

Mysql identify which table a row came from in double LEFT OUTER JOIN with UNION ALL

The query below works, but there are two things I want to get from it that I don't know how to do:
1. How can I tell which LEFT JOIN the final returned row comes from?
2. Is it possible to also return the total count from each LEFT JOIN?
SELECT * FROM (
(SELECT ch.user_ID, ch.clID FROM clubHistory AS ch
LEFT OUTER JOIN clubRaffleWinners AS cr ON
ch.user_ID = cr.user_ID
AND cr.cID=1157
AND cr.rafID=18
AND cr.chDate1 = '2022-06-04'
WHERE ch.cID=1157
AND ch.crID=1001
AND ch.ceID=1167
AND ch.chDate = '2022-06-04'
AND cr.user_ID IS NULL
GROUP BY ch.user_ID )
UNION ALL
(SELECT cu.user_ID, cu.clID FROM clubUsers AS cu
LEFT OUTER JOIN clubRaffleWinners AS cr1 ON
cu.user_ID = cr1.user_ID
AND cr1.cID=1157
AND cr1.rafID=18
AND cr1.chDate1 = '2022-06-04'
WHERE cu.cID=1157
AND cu.crID=1001
AND cu.ceID=1167
AND cu.calDate = '2022-06-04'
AND cr1.user_ID IS NULL
GROUP BY cu.user_ID )
) as winner ORDER BY RAND() LIMIT 1 ;
In my two left join select statements I tried:
(SELECT ch.user_ID as chUserID, ch.clID FROM clubHistory AS ch
and
(SELECT cu.user_ID as cuUserID, cu.clID FROM clubUsers AS cu
But every single result, after dozens and dozens of tries, comes back as user_ID or chUserID. When I remove the ORDER BY RAND() LIMIT 1, the only two columns that come back are user_ID, clID or chUserID, clID, even though the combined result is the full list from both tables. Is this even possible?
As for #2 above, is it possible to extract the total counts from each LEFT JOIN, with and without the final ORDER BY RAND() LIMIT 1?
For #1, note that a UNION takes its column names from the first SELECT, so an alias in the second branch is ignored. Add an extra column containing a value that identifies which subquery of the UNION the row came from:
SELECT * FROM (
(SELECT 'history' AS which, ch.user_ID, ch.clID FROM clubHistory AS ch
LEFT OUTER JOIN clubRaffleWinners AS cr ON
ch.user_ID = cr.user_ID
AND cr.cID=1157
AND cr.rafID=18
AND cr.chDate1 = '2022-06-04'
WHERE ch.cID=1157
AND ch.crID=1001
AND ch.ceID=1167
AND ch.chDate = '2022-06-04'
AND cr.user_ID IS NULL
GROUP BY ch.user_ID )
UNION ALL
(SELECT 'users' AS which, cu.user_ID, cu.clID FROM clubUsers AS cu
LEFT OUTER JOIN clubRaffleWinners AS cr1 ON
cu.user_ID = cr1.user_ID
AND cr1.cID=1157
AND cr1.rafID=18
AND cr1.chDate1 = '2022-06-04'
WHERE cu.cID=1157
AND cu.crID=1001
AND cu.ceID=1167
AND cu.calDate = '2022-06-04'
AND cr1.user_ID IS NULL
GROUP BY cu.user_ID )
) as winner ORDER BY RAND() LIMIT 1 ;
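For the second part of the question (the totals per branch), one possible approach is to group the tagged rows by the extra column. The sketch below runs on Python's built-in sqlite3 module as a stand-in for MySQL (an assumption), with a simplified, hypothetical schema that keeps only user_ID and drops the cID, rafID, and date filters.

```python
import sqlite3

# Simplified, hypothetical schema: only user_ID is kept, and the cID / rafID /
# date filters from the original query are left out for brevity.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE clubHistory (user_ID INTEGER);
CREATE TABLE clubUsers (user_ID INTEGER);
CREATE TABLE clubRaffleWinners (user_ID INTEGER);
INSERT INTO clubHistory VALUES (1), (2), (3);
INSERT INTO clubUsers VALUES (4), (5), (6), (7);
INSERT INTO clubRaffleWinners VALUES (2), (5);
""")

# Each branch carries a literal tag; previous winners are filtered out by the
# LEFT JOIN ... IS NULL anti-join, as in the original query.
CANDIDATES = """
SELECT 'history' AS which, ch.user_ID
FROM clubHistory ch
LEFT JOIN clubRaffleWinners cr ON ch.user_ID = cr.user_ID
WHERE cr.user_ID IS NULL
UNION ALL
SELECT 'users' AS which, cu.user_ID
FROM clubUsers cu
LEFT JOIN clubRaffleWinners cr1 ON cu.user_ID = cr1.user_ID
WHERE cr1.user_ID IS NULL
"""

# Total number of candidate rows contributed by each branch.
counts = conn.execute(
    f"SELECT which, COUNT(*) FROM ({CANDIDATES}) AS c GROUP BY which ORDER BY which"
).fetchall()

# Random pick; RANDOM() is SQLite's spelling, MySQL uses RAND().
winner = conn.execute(
    f"SELECT which, user_ID FROM ({CANDIDATES}) AS c ORDER BY RANDOM() LIMIT 1"
).fetchone()

print(counts)
print(winner)
```

The counts come from a separate statement here because the final LIMIT 1 discards every row but one; the tag on the winning row still says which branch it came from.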
Please only ask one question at a time.

SELECT MAX in GROUP BY but LIMIT results to 1 in MYSQL

I have the following tables:
Task (id,....)
TaskPlan (id, task_id,.......,end_at)
Note that end_at is a timestamp and that one Task has many TaskPlans. I need to query for the MAX end_at for each Task.
This query works fine, except when different TaskPlans of the same Task have exactly the same timestamp. In that case, I get multiple TaskPlans with the MAX end_at for the same Task.
I know this is an unlikely situation, but is there any way I can limit the number of results for each task_id to 1?
My current code is:
SELECT * FROM Task AS t
INNER JOIN (
SELECT * FROM TaskPlan WHERE end_at in (SELECT MAX(end_at) FROM TaskPlan GROUP BY task_id )
) AS pt
ON pt.task_id = t.id
WHERE status = 'plan';
This works, except in the situation above. How can this be achieved?
Also, in the subquery, instead of SELECT MAX(end_at) FROM TaskPlan GROUP BY task_id, is it possible to do something like this, so that I can use TaskPlan.id in the WHERE ... IN clause?
SELECT id, MAX(end_at) FROM TaskPlan GROUP BY task_id
When I try, it gives the following error:
SQL Error [1055] [42000]: Expression #1 of SELECT list is not in GROUP
BY clause and contains nonaggregated column 'TaskPlan.id' which is not
functionally dependent on columns in GROUP BY clause; this is
incompatible with sql_mode=only_full_group_by
Any explanation and suggestion would be very welcome!
Note on the duplicate label (now reopened):
I already studied the linked question, but it does not provide an answer for my situation, where there are multiple max values in the result and they need to be filtered so that only one row per group is returned.
Use the id rather than the timestamp:
SELECT *
FROM Task AS t INNER JOIN
(SELECT tp.*
FROM TaskPlan tp
WHERE tp.id = (SELECT tp2.id FROM TaskPlan tp2 WHERE tp2.task_id = tp.task_id ORDER BY tp2.end_at DESC LIMIT 1)
) tp
ON tp.task_id = t.id
WHERE status = 'plan';
Or use IN with tuples (note that this variant still returns every row that ties for the maximum end_at):
SELECT *
FROM Task AS t INNER JOIN
(SELECT tp.*
FROM TaskPlan tp
WHERE (tp.task_id, tp.end_at) in (SELECT tp2.task_id, MAX(tp2.end_at)
FROM TaskPlan tp2
GROUP BY tp2.task_id
)
) tp
ON tp.task_id = t.id
WHERE status = 'plan';
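The difference between the two queries on tied timestamps can be seen on a small data set. This is a sketch on Python's built-in sqlite3 module rather than MySQL (an assumption), with invented rows. It assumes status is a column of Task, flattens the derived table into a direct join for brevity, and adds tp2.id DESC as a tiebreaker, which is not in the original, so that the chosen row is deterministic.

```python
import sqlite3

# SQLite stands in for MySQL (assumption); rows are invented.
# TaskPlan 2 and 3 belong to the same task and share the same end_at.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Task (id INTEGER PRIMARY KEY, status TEXT);
CREATE TABLE TaskPlan (id INTEGER PRIMARY KEY, task_id INTEGER, end_at TEXT);
INSERT INTO Task VALUES (1, 'plan'), (2, 'plan');
INSERT INTO TaskPlan VALUES
    (1, 1, '2020-01-01 00:00:00'),
    (2, 1, '2020-02-01 00:00:00'),
    (3, 1, '2020-02-01 00:00:00'),
    (4, 2, '2020-03-01 00:00:00');
""")

# Tuple IN: matches every plan whose end_at equals the per-task maximum.
tuple_rows = conn.execute("""
SELECT t.id, tp.id
FROM Task t
JOIN TaskPlan tp ON tp.task_id = t.id
WHERE t.status = 'plan'
  AND (tp.task_id, tp.end_at) IN (SELECT tp2.task_id, MAX(tp2.end_at)
                                  FROM TaskPlan tp2
                                  GROUP BY tp2.task_id)
ORDER BY tp.id
""").fetchall()

# Correlated subquery on id: exactly one plan per task, ties broken by id.
single_rows = conn.execute("""
SELECT t.id, tp.id
FROM Task t
JOIN TaskPlan tp ON tp.task_id = t.id
WHERE t.status = 'plan'
  AND tp.id = (SELECT tp2.id
               FROM TaskPlan tp2
               WHERE tp2.task_id = tp.task_id
               ORDER BY tp2.end_at DESC, tp2.id DESC
               LIMIT 1)
ORDER BY tp.id
""").fetchall()

print(tuple_rows)   # both tied plans of task 1 appear
print(single_rows)  # one plan per task
```

Without the extra tp2.id DESC, the id-based query still returns one row per task, but which of the tied rows it picks is up to the database.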
If you want to get a list of task IDs with the MAX end_at for each, run the query below:
SELECT t.id, MAX(tp.end_at) FROM Task t JOIN TaskPlan tp on t.id = tp.task_id GROUP BY t.id;
EDIT:
Now I understand what you are trying to do.
If the TaskPlan table is very large, you can avoid the GROUP BY with user variables; the query below may be more efficient:
SET @first_row := 0;
SET @task_id := 0;
SELECT * FROM Task t JOIN (
SELECT tp.*
, IF(@task_id = tp.`task_id`, @first_row := 0, @first_row := 1) AS temp
, @first_row AS latest_record
, @task_id := tp.`task_id`
FROM TaskPlan tp ORDER BY task_id, end_at DESC) a ON t.id = a.task_id AND a.latest_record = 1;
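On servers that support window functions (MySQL 8.0 or later), ROW_NUMBER() is a commonly used alternative to the user-variable query above. It is a different technique, not the one this answer proposes. The sketch below runs it on Python's built-in sqlite3 module (an assumption; it needs SQLite 3.25 or later), with invented rows, status assumed to live on Task, and id used as a tiebreaker.

```python
import sqlite3

# SQLite (3.25+) stands in for MySQL 8.0+ (assumption); rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Task (id INTEGER PRIMARY KEY, status TEXT);
CREATE TABLE TaskPlan (id INTEGER PRIMARY KEY, task_id INTEGER, end_at TEXT);
INSERT INTO Task VALUES (1, 'plan'), (2, 'plan');
INSERT INTO TaskPlan VALUES
    (1, 1, '2020-01-01 00:00:00'),
    (2, 1, '2020-02-01 00:00:00'),
    (3, 1, '2020-02-01 00:00:00'),
    (4, 2, '2020-03-01 00:00:00');
""")

# Number the plans of each task from latest to earliest, then keep number 1.
latest = conn.execute("""
SELECT t.id, a.id, a.end_at
FROM Task t
JOIN (
    SELECT tp.*,
           ROW_NUMBER() OVER (PARTITION BY tp.task_id
                              ORDER BY tp.end_at DESC, tp.id DESC) AS rn
    FROM TaskPlan tp
) a ON a.task_id = t.id AND a.rn = 1
WHERE t.status = 'plan'
ORDER BY t.id
""").fetchall()

print(latest)
```

Because ROW_NUMBER() assigns distinct numbers even to tied rows, each task gets exactly one plan.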
Try this query:
select t.id, tp1.end_at
from Task t
left join TaskPlan tp1 on t.id = tp1.task_id
left join TaskPlan tp2 on t.id = tp2.task_id and tp1.end_at < tp2.end_at
where tp2.end_at is null;

Optimize Query with JOINS and Subqueries

I want to speed up one of my slower queries.
The problem is that I can't access the outer column value within a subquery.
What I have:
SELECT r.id AS room_id, r.room_name, coalesce(d.score,0) AS total_messages, d.latest
FROM cf_rooms_time_frames tf
INNER JOIN cf_rooms r on r.id = tf.room_id
INNER JOIN(
SELECT cf.room_id, count(*) as score, max(cf.id) as latest
FROM cf_rooms_messages cf
WHERE EXISTS(
SELECT NULL FROM cf_rooms_time_frames tf
WHERE tf.start <= cf.id AND ( tf.end IS NULL OR tf.end >= cf.id )
AND tf.room_id = cf.room_id AND tf.uid = 8
)
GROUP BY cf.room_id
ORDER BY latest
DESC ) d on d.room_id = r.id
WHERE tf.uid = 8
ORDER BY coalesce(latest, score) DESC LIMIT 0, 20
What I want:
SELECT r.id AS room_id, r.room_name, coalesce(d.score,0) AS total_messages, d.latest
FROM cf_rooms_time_frames tf
INNER JOIN cf_rooms r on r.id = tf.room_id
INNER JOIN(
SELECT cf.room_id, count(*) as score, max(cf.id) as latest
FROM cf_rooms_messages cf
/* line added here */
WHERE cf.room_id = tf.room_id
/* */
AND EXISTS(
SELECT NULL FROM cf_rooms_time_frames tf
WHERE tf.start <= cf.id AND ( tf.end IS NULL OR tf.end >= cf.id )
AND tf.room_id = cf.room_id AND tf.uid = 8
)
GROUP BY cf.room_id
ORDER BY latest
DESC ) d on d.room_id = r.id
WHERE tf.uid = 8
ORDER BY coalesce(latest, score) DESC LIMIT 0, 20
I think the markup explains what the query does.
It searches for the "chatrooms" of a given user and orders them by their last message. For each room it gets the last message id and the total number of messages whose ids fall within given ranges (time frames).
I don't know why, but according to EXPLAIN the first query reads every row of the chat message table (cf). It delivers the correct results but is slow on a huge table.
I tested the second one with a hardcoded room_id; it was very fast and didn't touch the whole table.
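A derived table in the FROM clause cannot see columns of the outer query, which is why the added line is rejected. One possible rewrite, offered as a sketch rather than a verified fix, is to join the messages directly to the user's time frames and aggregate afterwards. The example runs on Python's built-in sqlite3 module instead of MySQL (an assumption) with invented rows. The column names start and end are double-quoted only for SQLite's sake, and COUNT(DISTINCT ...) is an addition that guards against overlapping time frames.

```python
import sqlite3

# SQLite stands in for MySQL (assumption); rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cf_rooms (id INTEGER PRIMARY KEY, room_name TEXT);
CREATE TABLE cf_rooms_time_frames (room_id INTEGER, uid INTEGER,
                                   "start" INTEGER, "end" INTEGER);
CREATE TABLE cf_rooms_messages (id INTEGER PRIMARY KEY, room_id INTEGER);
INSERT INTO cf_rooms VALUES (1, 'general'), (2, 'random');
INSERT INTO cf_rooms_time_frames VALUES
    (1, 8, 1, NULL),   -- user 8, room 1, open-ended
    (1, 8, 4, NULL),   -- overlapping frame for the same user and room
    (2, 8, 3, 5),      -- user 8, room 2, closed range
    (2, 9, 1, NULL);   -- another user, ignored by the filter
INSERT INTO cf_rooms_messages VALUES
    (1, 1), (2, 2), (3, 2), (4, 1), (5, 2), (6, 2), (7, 1);
""")

# Join messages to the user's time frames, then aggregate per room.
rooms = conn.execute("""
SELECT r.id AS room_id, r.room_name,
       COUNT(DISTINCT cf.id) AS total_messages,
       MAX(cf.id) AS latest
FROM cf_rooms_time_frames tf
JOIN cf_rooms r ON r.id = tf.room_id
JOIN cf_rooms_messages cf
  ON cf.room_id = tf.room_id
 AND cf.id >= tf."start"
 AND (tf."end" IS NULL OR cf.id <= tf."end")
WHERE tf.uid = 8
GROUP BY r.id, r.room_name
ORDER BY latest DESC
LIMIT 20
""").fetchall()

print(rooms)
```

Unlike the original, this version returns one row per room even when the user has several time frames in it. Whether it is actually faster depends on the available indexes (for example one on the messages' room_id and id), so it is worth comparing both plans with EXPLAIN.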

Join between sub-queries in SQLAlchemy

In relation to the answer I accepted for this post, SQL Group By and Limit issue, I need to figure out how to create that query using SQLAlchemy. For reference, the query I need to run is:
SELECT t.id, t.creation_time, c.id, c.creation_time
FROM (SELECT id, creation_time
FROM thread
ORDER BY creation_time DESC
LIMIT 5
) t
LEFT OUTER JOIN comment c ON c.thread_id = t.id
WHERE 3 >= (SELECT COUNT(1)
FROM comment c2
WHERE c.thread_id = c2.thread_id
AND c.creation_time <= c2.creation_time
)
I have the first half of the query, but I am struggling with the syntax for the WHERE clause and how to combine it with the JOIN. Does anyone have any suggestions?
Thanks!
EDIT: First attempt seems to mess up around the .filter() call:
c = aliased(Comment)
c2 = aliased(Comment)
subq = db.session.query(Thread.id).filter_by(topic_id=122098).order_by(Thread.creation_time.desc()).limit(2).offset(2).subquery('t')
subq2 = db.session.query(func.count(1).label("count")).filter(c.id==c2.id).subquery('z')
q = db.session.query(subq.c.id, c.id).outerjoin(c, c.thread_id==subq.c.id).filter(3 >= subq2.c.count)
This generates the following SQL:
SELECT t.id AS t_id, comment_1.id AS comment_1_id
FROM (SELECT count(1) AS count
FROM comment AS comment_1, comment AS comment_2
WHERE comment_1.id = comment_2.id) AS z, (SELECT thread.id AS id
FROM thread
WHERE thread.topic_id = :topic_id ORDER BY thread.creation_time DESC
LIMIT 2 OFFSET 2) AS t LEFT OUTER JOIN comment AS comment_1 ON comment_1.thread_id = t.id
WHERE z.count <= 3
Notice the sub-query ordering is incorrect, and subq2 somehow is selecting from comment twice. Manually fixing that gives the right results, I am just unsure of how to get SQLAlchemy to get it right.
Try this:
c = db.aliased(Comment, name='c')
c2 = db.aliased(Comment, name='c2')
sq = (db.session
.query(Thread.id, Thread.creation_time)
.order_by(Thread.creation_time.desc())
.limit(5)
).subquery(name='t')
sq2 = (
db.session.query(db.func.count(1))
.select_from(c2)
.filter(c.thread_id == c2.thread_id)
.filter(c.creation_time <= c2.creation_time)
.correlate(c)
.as_scalar()  # on SQLAlchemy 1.4+, use .scalar_subquery() instead
)
q = (db.session
.query(
sq.c.id, sq.c.creation_time,
c.id, c.creation_time,
)
.outerjoin(c, c.thread_id == sq.c.id)
.filter(3 >= sq2)
)

How can I adjust a JOIN clause so that rows that have columns with NULL values are returned in the result?

How can I adjust this JOIN clause so that rows with a NULL value for the CountLocId or CountNatId columns are returned in the result?
In other words, if there is no match in the local_ads table, I still want the user's result from the nat_ads table to be returned -- and vice-versa.
SELECT u.franchise, CountLocId, TotalPrice, CountNatId, TotalNMoney, (
TotalPrice + TotalNMoney
)TotalRev
FROM users u
LEFT JOIN local_rev lr ON u.user_id = lr.user_id
LEFT JOIN (
SELECT lrr_id, COUNT( lad_id ) CountLocId, SUM( price ) TotalPrice
FROM local_ads
GROUP BY lrr_id
)la ON lr.lrr_id = la.lrr_id
LEFT JOIN nat_rev nr ON u.user_id = nr.user_id
INNER JOIN (
SELECT nrr_id, COUNT( nad_id ) CountNatId, SUM( tmoney ) TotalNMoney
FROM nat_ads
WHERE MONTH = 'April'
GROUP BY nrr_id
)na ON nr.nrr_id = na.nrr_id
WHERE lr.month = 'April'
AND franchise != 'Corporate'
ORDER BY franchise
Thanks in advance for your help!
Try the following condition when making the left join. It takes all rows from the right table that match the condition, e.g.:
LEFT JOIN local_rev lr ON (u.user_id = lr.user_id) or (u.user_id IS NULL)
Use this template, as it ensures that:
you have only one record per user_id (notice that every subquery has a GROUP BY user_id), so each record in the users table matches one record (or none) in each subquery;
the independent joins (and their calculated data) do not get mixed together.
SELECT u.franchise, one.CountLocId, one.TotalPrice, two.CountNatId, two.TotalNMoney, (COALESCE(one.TotalPrice,0) + COALESCE(two.TotalNMoney,0)) TotalRev
FROM users u
LEFT JOIN (
SELECT x.user_id, sum(xORy.whatever) as TotalPrice, count(xORy.whatever) as CountLocId
FROM x -- where x is local_rev or local_ads, I don't know which
LEFT JOIN y on x.... = y.... -- where y is local_rev or local_ads, I don't know which
GROUP BY x.user_id
) as one on u.user_id = one.user_id
LEFT JOIN (
SELECT x.user_id, sum(xORy.whatever) as TotalNMoney, count(xORy.whatever) as CountNatId
FROM x -- where x is nat_rev or nat_ads, I don't know which
LEFT JOIN y on x.... = y.... -- where y is nat_rev or nat_ads, I don't know which
GROUP BY x.user_id
) as two on u.user_id = two.user_id
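Filled in with the table names from the question, the template could look like the sketch below. The schema is a guess based on the columns the question's query uses (in particular, month is assumed to live on local_rev and on nat_ads), the rows are invented, and Python's built-in sqlite3 module stands in for MySQL. The subqueries use an inner join between the rev and ads tables, which is one possible choice for the template's x and y.

```python
import sqlite3

# Guessed schema and invented rows; SQLite stands in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, franchise TEXT);
CREATE TABLE local_rev (lrr_id INTEGER, user_id INTEGER, month TEXT);
CREATE TABLE local_ads (lad_id INTEGER, lrr_id INTEGER, price INTEGER);
CREATE TABLE nat_rev (nrr_id INTEGER, user_id INTEGER);
CREATE TABLE nat_ads (nad_id INTEGER, nrr_id INTEGER, month TEXT, tmoney INTEGER);
INSERT INTO users VALUES (1, 'Austin'), (2, 'Boston'), (3, 'Corporate'), (4, 'Denver');
INSERT INTO local_rev VALUES (100, 1, 'April'), (101, 4, 'April');
INSERT INTO local_ads VALUES (1, 100, 50), (2, 100, 25), (3, 101, 10);
INSERT INTO nat_rev VALUES (200, 1), (201, 2);
INSERT INTO nat_ads VALUES (1, 200, 'April', 40), (2, 201, 'April', 30),
                           (3, 201, 'March', 99);
""")

# Each subquery is aggregated per user_id, then LEFT JOINed independently,
# so a user missing from one side still appears with NULLs for that side.
rows = conn.execute("""
SELECT u.franchise,
       loc.CountLocId, loc.TotalPrice,
       nat.CountNatId, nat.TotalNMoney,
       COALESCE(loc.TotalPrice, 0) + COALESCE(nat.TotalNMoney, 0) AS TotalRev
FROM users u
LEFT JOIN (
    SELECT lr.user_id, COUNT(la.lad_id) AS CountLocId, SUM(la.price) AS TotalPrice
    FROM local_rev lr
    JOIN local_ads la ON la.lrr_id = lr.lrr_id
    WHERE lr.month = 'April'
    GROUP BY lr.user_id
) loc ON loc.user_id = u.user_id
LEFT JOIN (
    SELECT nr.user_id, COUNT(na.nad_id) AS CountNatId, SUM(na.tmoney) AS TotalNMoney
    FROM nat_rev nr
    JOIN nat_ads na ON na.nrr_id = nr.nrr_id
    WHERE na.month = 'April'
    GROUP BY nr.user_id
) nat ON nat.user_id = u.user_id
WHERE u.franchise != 'Corporate'
ORDER BY u.franchise
""").fetchall()

for row in rows:
    print(row)
```

The month filters sit inside the subqueries rather than in the outer WHERE clause; a filter on a left-joined table in the outer WHERE would discard the rows that have no match.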