MySQL: Problem with max() and group by - wrong values

I have a problem with a SQL statement:
Using this
select a.id as ID, a.dur as DUR, DATE(FROM_UNIXTIME(timestampCol)) as date,
a_au.re as RE, a_au.stat as STAT from b_c
inner join c on b_c.c_id = c.id
inner join a on c.id = a.c_id
inner join a_au on a.id = a_au.id
inner join revi on a_au.rev = revi.rev
where b_c.b_id = 5
I get this result:
ID DUR date RE STAT
-------------------------------
31, 10, '2010-07-14', 2200, 0
31, 10, '2010-07-14', 2205, 0
31, 10, '2010-07-14', 2206, 2
31, 10, '2010-07-14', 2207, 0
31, 10, '2010-07-14', 2210, 2
31, 10, '2010-07-15', 2211, 0
31, 10, '2010-07-14', 2213, 1
32, 10, '2010-07-14', 2203, 0
32, 10, '2010-07-14', 2204, 0
32, 10, '2010-07-14', 2208, 2
32, 10, '2010-07-14', 2209, 0
32, 10, '2010-07-15', 2212, 2
Now I want to get one result row per ID and date combination, namely the row with the highest RE number.
So I write my statement:
select a.id as ID, a.dur as DUR, DATE(FROM_UNIXTIME(timestampCol)) as date,
max(a_au.re) as RE, a_au.stat as STAT from b_c
inner join c on b_c.c_id = c.id
inner join a on c.id = a.c_id
inner join a_au on a.id = a_au.id
inner join revi on a_au.rev = revi.rev
where b_c.b_id = 5
group by ID, date
Now I get this result:
ID DUR date RE STAT
-------------------------------
31, 10, '2010-07-14', 2213, 0
31, 10, '2010-07-15', 2211, 0
32, 10, '2010-07-14', 2209, 0
32, 10, '2010-07-15', 2212, 2
Everything seems to be okay, I have one result row per day/ID combination and the row with the highest RE number. But: the column STAT does not have the correct values!
The row
31, 10, '2010-07-14', 2213, 0
must have the status 1:
31, 10, '2010-07-14', 2213, 1
So there must be a mistake in my statement. It seems that MySQL takes the first STAT value it finds, but I want the one that belongs to the row with the highest RE.
What should I do?
I saw other topics about this like here:
Selecting all corresponding fields using MAX and GROUP BY
but I cannot adapt it to my SQL statement.
Thanks a lot in advance & Best Regards.

Use:
SELECT a.id as ID,
a.dur as DUR,
DATE(FROM_UNIXTIME(timestampCol)) as date,
a_au.re as RE,
a_au.stat as STAT
FROM b_c
JOIN c on b_c.c_id = c.id
JOIN a on c.id = a.c_id
JOIN a_au on a.id = a_au.id
JOIN revi on a_au.rev = revi.rev
JOIN ( SELECT a.id as ID,
DATE(FROM_UNIXTIME(timestampCol)) as date,
MAX(a_au.re) as Max_RE
FROM b_c
JOIN c on b_c.c_id = c.id
JOIN a on c.id = a.c_id
JOIN a_au on a.id = a_au.id
JOIN revi on a_au.rev = revi.rev
WHERE b_c.b_id = 5
GROUP BY a.id, DATE(FROM_UNIXTIME(timestampCol))) x ON x.id = a.id
AND x.date = DATE(FROM_UNIXTIME(timestampCol))
AND x.max_re = a_au.re
WHERE b_c.b_id = 5
When this answer was written, MySQL did not support the WITH clause, which would have made this a lot easier to read; MySQL 8.0 and later do support it.
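The join-back-to-the-maximum technique from the answer can be tried out on a small scale. The sketch below is only an illustration under assumptions: it uses Python's sqlite3 module instead of MySQL, and a hypothetical flattened table rows_in (columns id, day, re, stat) that stands in for the joined result from the question, loaded with a few of its rows. The second query expresses the same thing with WITH, which assumes a server with common table expression support (MySQL 8.0 or later).

```python
import sqlite3

# Hypothetical flattened table standing in for the joined result in the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rows_in (id INT, day TEXT, re INT, stat INT)")
conn.executemany(
    "INSERT INTO rows_in VALUES (?, ?, ?, ?)",
    [
        (31, "2010-07-14", 2200, 0),
        (31, "2010-07-14", 2210, 2),
        (31, "2010-07-15", 2211, 0),
        (31, "2010-07-14", 2213, 1),
        (32, "2010-07-14", 2208, 2),
        (32, "2010-07-14", 2209, 0),
        (32, "2010-07-15", 2212, 2),
    ],
)

# Derived table: find MAX(re) per (id, day), then join back to fetch
# the stat value that belongs to exactly that row.
latest = conn.execute(
    """
    SELECT r.id, r.day, r.re, r.stat
    FROM rows_in AS r
    JOIN (SELECT id, day, MAX(re) AS max_re
          FROM rows_in
          GROUP BY id, day) AS x
      ON x.id = r.id AND x.day = r.day AND x.max_re = r.re
    ORDER BY r.id, r.day
    """
).fetchall()

# Same logic written with a common table expression.
latest_cte = conn.execute(
    """
    WITH x AS (SELECT id, day, MAX(re) AS max_re
               FROM rows_in
               GROUP BY id, day)
    SELECT r.id, r.day, r.re, r.stat
    FROM rows_in AS r
    JOIN x ON x.id = r.id AND x.day = r.day AND x.max_re = r.re
    ORDER BY r.id, r.day
    """
).fetchall()

for row in latest:
    print(row)
```

The row for ID 31 on 2010-07-14 now carries STAT 1, the value stored next to RE 2213, rather than whichever STAT the engine happened to pick for the group.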

Related

Why is a query with a duplicated IN subquery faster than the same query with a single one?

The MySQL version is 5.7.
SELECT distinct(t.id),
t.sys_user_id,
t.update_time
FROM institute_student t
inner join institute_student_campus student_campus on student_campus.student_id = t.id
WHERE t.deleted = 0
AND t.birth_country_id = 9
AND t.type = 'FINISHED'
AND t.gender = 6
and t.id in (SELECT distinct s_alert.student_id FROM institute_student_alert_welfare s_alert inner join institute_alert_welfare alert on s_alert.alert_welfare_id=alert.id and alert.type='Alert')
and student_campus.campus_id in (1, 2, 17, 18, 19, 20)
ORDER by t.update_time desc
LIMIT 15;
This takes 4.25 sec.
SELECT distinct(t.id),
t.sys_user_id,
t.update_time
FROM institute_student t
inner join institute_student_campus student_campus on student_campus.student_id = t.id
WHERE t.deleted = 0
AND t.birth_country_id = 9
AND t.type = 'FINISHED'
AND t.gender = 6
and t.id in (SELECT distinct s_alert.student_id FROM institute_student_alert_welfare s_alert inner join institute_alert_welfare alert on s_alert.alert_welfare_id=alert.id and alert.type='Alert')
and t.id in (SELECT distinct s_alert.student_id FROM institute_student_alert_welfare s_alert inner join institute_alert_welfare alert on s_alert.alert_welfare_id=alert.id and alert.type='Alert')
and student_campus.campus_id in (1, 2, 17, 18, 19, 20)
ORDER by t.update_time desc
LIMIT 15;
This takes 0.02 sec.
The only difference is the duplicated IN subquery shown below.
and t.id in (SELECT distinct s_alert.student_id FROM institute_student_alert_welfare s_alert inner join institute_alert_welfare alert on s_alert.alert_welfare_id=alert.id and alert.type='Alert')
Can anyone tell me why? Thanks :)

Aliased LEFT JOIN affects SUM unexpectedly

Why do the two queries below give different results? I thought that joins like the ones below shouldn't interact with each other, but apparently they do.
The query:
SELECT
s.id,
SUM(CASE WHEN n.template_id = 1 THEN 1 ELSE 0 END) AS template_1
FROM schedulings AS s
LEFT JOIN notes AS calls_made
ON s.id = calls_made.schedule_id
AND calls_made.template_id IN (1, 2, 3, 5, 6, 9, 10, 11, 12, 14)
LEFT OUTER JOIN notes AS n
ON s.id = n.schedule_id
WHERE s.id = 48810;
The results:
id template_1
48810 70
However, if I change the query by commenting out (or removing) the first notes join, I get the expected result.
The query:
SELECT
s.id,
SUM(CASE WHEN n.template_id = 1 THEN 1 ELSE 0 END) AS template_1
FROM schedulings AS s
LEFT OUTER JOIN notes AS n
ON s.id = n.schedule_id
WHERE s.id = 48810;
The result:
id template_1
48810 7

Mysql multiple inner join don't work

I have two tables A & B
A :
id
data
B :
key
value
A_id
I have a problem with my SQL query (it's hard to explain, so I created an sqlfiddle)
SELECT A.id
FROM A
INNER JOIN B b1 ON b1.key = '20' AND b1.A_id = A.id
INNER JOIN B b2 ON b2.key = '18' AND b2.A_id = A.id
WHERE
b1.value = '1900' AND
b2.value >= '1900'
In this example I expect to get A.id = 12 and A.id = 13, but I get nothing.
CREATE TABLE A
(`id` int, `data` int)
;
INSERT INTO A
(`id`, `data`)
VALUES
(11, 11),
(12, 11),
(13, 12)
;
CREATE TABLE B
(`key` int, `value` int, `A_id` int)
;
INSERT INTO B
(`key`, `value`, `A_id`)
VALUES
(20, 1900, 12),
(2, 19, 11),
(11, 19, 11),
(9, 19, 11),
(18, 1950, 13),
(19, 1950, 12)
;
Any idea ?
thanks
First, if you code for more than 10 minutes, you will learn to despise the phrase "don't work"... "don't work" is the phrase that doesn't work.
/rant
You are trying to join tables in an effort to filter. Instead, filter accordingly. Check this out:
SELECT A.id
FROM A
INNER JOIN B b1 ON b1.A_id = A.id
WHERE
(b1.key = '20' AND b1.value = '1900')
OR
(b1.key = '18' AND b1.value >= '1900')
That asks for what you want and joins only when necessary.
INNER JOIN returns a row only when a match exists on both sides. Because you inner join B twice against the same A.id, each time with a different key, a row comes back only if a single A.id has both a key 20 row and a key 18 row in B, and no such A.id exists in your data.
I'd suggest something different:
SELECT A.id
FROM A
INNER JOIN B on A.id = B.A_id
WHERE
(B.key = '20' AND B.value = '1900') OR (B.key = '18' AND B.value >= '1900')
;
Your query uses the same A.id for both joins, so both joined rows must belong to the same row of A. For b2.key = 18, b2.A_id is 13, while b1.key = 20 forces a b1.A_id of 12. So you are asking for rows where b1.A_id = b2.A_id, yet one is 12 and the other is 13, which can never match. If you change the query to
SELECT A.id
FROM A
INNER JOIN B b1 ON b1.key = '20' AND b1.A_id = A.id
INNER JOIN B b2 ON b2.key = '19' AND b2.A_id = A.id
WHERE
b1.value = '1900' AND
b2.value >= '1900'
you'll get a return of 12 for A.id.
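The three behaviours discussed in the answers can be checked against the sample data from the question. This is a sketch under assumptions: it runs on SQLite through Python's sqlite3 module rather than on MySQL, quotes the key column with double quotes instead of backticks, and compares against integers because the columns are declared as int.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE A (id INT, data INT);
    INSERT INTO A VALUES (11, 11), (12, 11), (13, 12);
    CREATE TABLE B ("key" INT, value INT, A_id INT);
    INSERT INTO B VALUES
        (20, 1900, 12), (2, 19, 11), (11, 19, 11),
        (9, 19, 11), (18, 1950, 13), (19, 1950, 12);
    """
)

# The double self-join: both B rows must hang off the same A.id.
double_join = """
    SELECT A.id FROM A
    JOIN B b1 ON b1."key" = 20 AND b1.A_id = A.id
    JOIN B b2 ON b2."key" = ?  AND b2.A_id = A.id
    WHERE b1.value = 1900 AND b2.value >= 1900
"""
with_key_18 = conn.execute(double_join, (18,)).fetchall()  # original query
with_key_19 = conn.execute(double_join, (19,)).fetchall()  # key changed to 19

# A single join with OR: either condition is enough for an A.id to qualify.
single_join = conn.execute(
    """
    SELECT A.id FROM A
    JOIN B ON B.A_id = A.id
    WHERE (B."key" = 20 AND B.value = 1900)
       OR (B."key" = 18 AND B.value >= 1900)
    ORDER BY A.id
    """
).fetchall()

print(with_key_18, with_key_19, single_join)
```

With key 18 the double join is empty, with key 19 it yields 12 only, and the single join with OR yields 12 and 13, which is what the question expected.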

IF statement in mysql JOIN - Adding multiple conditions if condition success

Here is the case:-
I want to join 2 tables. Let's say table a and b
SELECT *
FROM a
JOIN b ON a.id = b.id AND b.status = '1'
Here is the problem:
b.status = '1'
should only be added when
b.stage in (1, 3, 5, 6, 8)
How can I add such a condition to the ON clause?
Like
ON a.id = b.id
CASE
IF (b.stage in (1, 3, 5, 6, 8))
THEN
AND b.status = '1'
END
Your condition is logically the same as "either stage is not in the list or status is 1":
SELECT *
FROM a
JOIN b ON a.id = b.id
AND (b.stage not in (1, 3, 5, 6, 8) OR b.status = '1')
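To see that the rewritten ON clause behaves like the conditional the question asks for, here is a small sketch. It assumes SQLite via Python's sqlite3 in place of MySQL, and the four sample rows in b are made up for illustration, one per interesting case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE a (id INT);
    CREATE TABLE b (id INT, stage INT, status TEXT);
    INSERT INTO a VALUES (1), (2), (3), (4);
    INSERT INTO b VALUES
        (1, 1, '1'),   -- stage in list, status is 1: kept
        (2, 3, '0'),   -- stage in list, status is not 1: dropped
        (3, 2, '0'),   -- stage not in list: kept regardless of status
        (4, 7, '1');   -- stage not in list: kept
    """
)

# The status check only bites when the stage is one of the listed values.
matched = [
    row[0]
    for row in conn.execute(
        """
        SELECT a.id FROM a
        JOIN b ON a.id = b.id
              AND (b.stage NOT IN (1, 3, 5, 6, 8) OR b.status = '1')
        ORDER BY a.id
        """
    )
]
print(matched)
```

Only id 2 is filtered out: its stage is in the list, so the status condition applies and fails.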

Optimize multiple row count

There are two tables: question and answer. In answer I hold user_id and question_id. I want to count how many times each choice is selected.
Below is a working query, but it joins the same table 4 times. Is there a faster way, i.e. one that joins the answer table only once?
SELECT question.question_id,
question.correct_choice,
COUNT(DISTINCT a.user_id) as num_of_a,
COUNT(DISTINCT b.user_id) as num_of_b,
COUNT(DISTINCT c.user_id) as num_of_c,
COUNT(DISTINCT d.user_id) as num_of_d
FROM answer a,
answer b,
answer c,
answer d,
question
WHERE a.question_id = question.question_id
AND b.question_id = question.question_id
AND c.question_id = question.question_id
AND d.question_id = question.question_id
AND a.choice = 'A'
AND b.choice = 'B'
AND c.choice = 'C'
AND d.choice = 'D'
GROUP BY question.question_id
ORDER BY question.question_id asc;
returns
273, D, 5, 2, 8, 39
274, C, 2, 14, 50, 2
277, C, 3, 5, 41, 17
278, C, 16, 9, 34, 9
279, C, 8, 30, 24, 12
280, B, 17, 21, 20, 3
284, C, 2, 3, 19, 1
286, A, 16, 3, 2, 2
287, D, 1, 2, 1, 18
289, B, 3, 18, 2, 2
290, D, 6, 9, 8, 6
This solution only does a single join... additionally, I converted your implicit joins to explicit, and rounded out your GROUP BY:
SELECT
q.question_id,
q.correct_choice,
COUNT(DISTINCT CASE WHEN a.choice = 'A' THEN a.user_id END) as num_of_a,
COUNT(DISTINCT CASE WHEN a.choice = 'B' THEN a.user_id END) as num_of_b,
COUNT(DISTINCT CASE WHEN a.choice = 'C' THEN a.user_id END) as num_of_c,
COUNT(DISTINCT CASE WHEN a.choice = 'D' THEN a.user_id END) as num_of_d
FROM
answer a
JOIN question q ON a.question_id = q.question_id
GROUP BY q.question_id, q.correct_choice
ORDER BY q.question_id asc;
This works because a CASE expression with no matching WHEN branch and no ELSE returns NULL, and COUNT(DISTINCT ...) ignores NULLs, so only the user IDs for the matching choice are counted.
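The single-join conditional aggregation can be tried on a tiny data set. The sketch below assumes SQLite through Python's sqlite3 instead of MySQL, and the rows in question and answer are invented for illustration (user 11 answers question 1 twice to show the effect of DISTINCT).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE question (question_id INT, correct_choice TEXT);
    CREATE TABLE answer (user_id INT, question_id INT, choice TEXT);
    INSERT INTO question VALUES (1, 'A'), (2, 'C');
    INSERT INTO answer VALUES
        (10, 1, 'A'), (11, 1, 'A'), (11, 1, 'A'), (12, 1, 'B'),
        (10, 2, 'C'), (12, 2, 'D');
    """
)

# One pass over answer: each CASE yields user_id for its choice, else NULL,
# and COUNT(DISTINCT ...) skips the NULLs.
counts = conn.execute(
    """
    SELECT q.question_id,
           q.correct_choice,
           COUNT(DISTINCT CASE WHEN a.choice = 'A' THEN a.user_id END) AS num_of_a,
           COUNT(DISTINCT CASE WHEN a.choice = 'B' THEN a.user_id END) AS num_of_b,
           COUNT(DISTINCT CASE WHEN a.choice = 'C' THEN a.user_id END) AS num_of_c,
           COUNT(DISTINCT CASE WHEN a.choice = 'D' THEN a.user_id END) AS num_of_d
    FROM answer a
    JOIN question q ON a.question_id = q.question_id
    GROUP BY q.question_id, q.correct_choice
    ORDER BY q.question_id
    """
).fetchall()

for row in counts:
    print(row)
```

Note one behavioural difference from the four-way self join in the question: that version drops any question for which some choice was never picked, whereas this one keeps the question and reports 0 for that choice.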
You might consider using a SELECT... UNION SELECT style if you are concerned about performance.
That said, I agree with @benjam that you should run EXPLAIN on the query to see what the optimizer is doing, since you do not have any dependent subqueries.
Make sure that you have indexes on question.question_id, and on answer.question_id, answer.choice, and answer.user_id and your query should be just as fast as any other that does not join answer for each choice. Then use the following query:
SELECT `question`.`question_id`,
`question`.`correct_choice`,
COUNT(DISTINCT `a`.`user_id`) as `num_of_a`,
COUNT(DISTINCT `b`.`user_id`) as `num_of_b`,
COUNT(DISTINCT `c`.`user_id`) as `num_of_c`,
COUNT(DISTINCT `d`.`user_id`) as `num_of_d`
FROM `question`
LEFT JOIN `answer` AS `a`
ON `a`.`question_id` = `question`.`question_id`
AND `a`.`choice` = 'A'
LEFT JOIN `answer` AS `b`
ON `b`.`question_id` = `question`.`question_id`
AND `b`.`choice` = 'B'
LEFT JOIN `answer` AS `c`
ON `c`.`question_id` = `question`.`question_id`
AND `c`.`choice` = 'C'
LEFT JOIN `answer` AS `d`
ON `d`.`question_id` = `question`.`question_id`
AND `d`.`choice` = 'D'
GROUP BY `question`.`question_id` ;
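In MySQL 5.7 and earlier, the ORDER BY clause is not needed because GROUP BY sorts the result implicitly. MySQL 8.0 removed that implicit sorting, so add ORDER BY `question`.`question_id` there.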