Optimize multiple row count - mysql

There are two tables: question and answer. The answer table holds user_id and question_id. I want to count how many times each choice is selected.
Below is a working query, but it joins the same table four times. Is there a faster way, i.e. one that joins the answer table only once?
SELECT question.question_id,
question.correct_choice,
COUNT(DISTINCT a.user_id) as num_of_a,
COUNT(DISTINCT b.user_id) as num_of_b,
COUNT(DISTINCT c.user_id) as num_of_c,
COUNT(DISTINCT d.user_id) as num_of_d
FROM answer a,
answer b,
answer c,
answer d,
question
WHERE a.question_id = question.question_id
AND b.question_id = question.question_id
AND c.question_id = question.question_id
AND d.question_id = question.question_id
AND a.choice = 'A'
AND b.choice = 'B'
AND c.choice = 'C'
AND d.choice = 'D'
GROUP BY question.question_id
ORDER BY question.question_id asc;
returns
273, D, 5, 2, 8, 39
274, C, 2, 14, 50, 2
277, C, 3, 5, 41, 17
278, C, 16, 9, 34, 9
279, C, 8, 30, 24, 12
280, B, 17, 21, 20, 3
284, C, 2, 3, 19, 1
286, A, 16, 3, 2, 2
287, D, 1, 2, 1, 18
289, B, 3, 18, 2, 2
290, D, 6, 9, 8, 6

This solution only does a single join... additionally, I converted your implicit joins to explicit, and rounded out your GROUP BY:
SELECT
q.question_id,
q.correct_choice,
COUNT(DISTINCT CASE WHEN a.choice = 'A' THEN a.user_id END) as num_of_a,
COUNT(DISTINCT CASE WHEN a.choice = 'B' THEN a.user_id END) as num_of_b,
COUNT(DISTINCT CASE WHEN a.choice = 'C' THEN a.user_id END) as num_of_c,
COUNT(DISTINCT CASE WHEN a.choice = 'D' THEN a.user_id END) as num_of_d
FROM
answer a
JOIN question q ON a.question_id = q.question_id
GROUP BY q.question_id, q.correct_choice
ORDER BY q.question_id asc;
This works because a CASE expression with no matching WHEN branch and no ELSE returns NULL, and COUNT(DISTINCT ...) ignores NULLs, so each column counts only the user ids that picked that choice.
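To see the NULL behaviour in action, here is a minimal sketch that runs the single-join query against a tiny made-up data set. The rows are hypothetical, and SQLite (through Python's sqlite3 module) stands in for MySQL, so treat it as an illustration rather than a benchmark.

```python
import sqlite3

# Hypothetical sample data; SQLite is used as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE question (question_id INTEGER PRIMARY KEY, correct_choice TEXT);
    CREATE TABLE answer (user_id INTEGER, question_id INTEGER, choice TEXT);
    INSERT INTO question VALUES (1, 'A'), (2, 'C');
    INSERT INTO answer VALUES
        (10, 1, 'A'), (11, 1, 'A'), (12, 1, 'B'),
        (10, 2, 'C'), (11, 2, 'D'), (12, 2, 'C'), (12, 2, 'C');
""")

# One join; each CASE yields NULL for non-matching rows, which COUNT skips.
rows = conn.execute("""
    SELECT q.question_id,
           q.correct_choice,
           COUNT(DISTINCT CASE WHEN a.choice = 'A' THEN a.user_id END) AS num_of_a,
           COUNT(DISTINCT CASE WHEN a.choice = 'B' THEN a.user_id END) AS num_of_b,
           COUNT(DISTINCT CASE WHEN a.choice = 'C' THEN a.user_id END) AS num_of_c,
           COUNT(DISTINCT CASE WHEN a.choice = 'D' THEN a.user_id END) AS num_of_d
    FROM answer a
    JOIN question q ON a.question_id = q.question_id
    GROUP BY q.question_id, q.correct_choice
    ORDER BY q.question_id
""").fetchall()

for row in rows:
    print(row)
```

User 12 answered question 2 with 'C' twice in this sample, and DISTINCT still counts that user once. Note also that both questions come back even though some choices were never picked; the four-way inner join would drop such questions entirely.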

You might consider using a SELECT ... UNION SELECT style if you are concerned about performance.
That said, I agree with @benjam that you should run EXPLAIN on the query to see what the optimizer is doing, since you do not have any dependent subqueries.

Make sure that you have indexes on question.question_id, and on answer.question_id, answer.choice, and answer.user_id and your query should be just as fast as any other that does not join answer for each choice. Then use the following query:
SELECT `question`.`question_id`,
`question`.`correct_choice`,
COUNT(DISTINCT `a`.`user_id`) as `num_of_a`,
COUNT(DISTINCT `b`.`user_id`) as `num_of_b`,
COUNT(DISTINCT `c`.`user_id`) as `num_of_c`,
COUNT(DISTINCT `d`.`user_id`) as `num_of_d`
FROM `question`
LEFT JOIN `answer` AS `a`
ON `a`.`question_id` = `question`.`question_id`
AND `a`.`choice` = 'A'
LEFT JOIN `answer` AS `b`
ON `b`.`question_id` = `question`.`question_id`
AND `b`.`choice` = 'B'
LEFT JOIN `answer` AS `c`
ON `c`.`question_id` = `question`.`question_id`
AND `c`.`choice` = 'C'
LEFT JOIN `answer` AS `d`
ON `d`.`question_id` = `question`.`question_id`
AND `d`.`choice` = 'D'
GROUP BY `question`.`question_id` ;
The ORDER BY clause is not needed on MySQL versions before 8.0, where GROUP BY sorts its result implicitly. MySQL 8.0 removed that implicit sort, so add an explicit ORDER BY there if the order matters.

Related

Why is a query with the same IN subquery repeated twice faster than with it only once?

The MySQL version is 5.7. This query:
SELECT distinct(t.id),
t.sys_user_id,
t.update_time
FROM institute_student t
inner join institute_student_campus student_campus on student_campus.student_id = t.id
WHERE t.deleted = 0
AND t.birth_country_id = 9
AND t.type = 'FINISHED'
AND t.gender = 6
and t.id in (SELECT distinct s_alert.student_id FROM institute_student_alert_welfare s_alert inner join institute_alert_welfare alert on s_alert.alert_welfare_id=alert.id and alert.type='Alert')
and student_campus.campus_id in (1, 2, 17, 18, 19, 20)
ORDER by t.update_time desc
LIMIT 15;
takes 4.25 sec, while this one:
SELECT distinct(t.id),
t.sys_user_id,
t.update_time
FROM institute_student t
inner join institute_student_campus student_campus on student_campus.student_id = t.id
WHERE t.deleted = 0
AND t.birth_country_id = 9
AND t.type = 'FINISHED'
AND t.gender = 6
and t.id in (SELECT distinct s_alert.student_id FROM institute_student_alert_welfare s_alert inner join institute_alert_welfare alert on s_alert.alert_welfare_id=alert.id and alert.type='Alert')
and t.id in (SELECT distinct s_alert.student_id FROM institute_student_alert_welfare s_alert inner join institute_alert_welfare alert on s_alert.alert_welfare_id=alert.id and alert.type='Alert')
and student_campus.campus_id in (1, 2, 17, 18, 19, 20)
ORDER by t.update_time desc
LIMIT 15;
takes 0.02 sec.
The only difference is that the second query contains the IN subquery twice:
and t.id in (SELECT distinct s_alert.student_id FROM institute_student_alert_welfare s_alert inner join institute_alert_welfare alert on s_alert.alert_welfare_id=alert.id and alert.type='Alert')
Can anyone tell me why? Thanks :)

Order by joined table through table

my mySQL (pun intended) is a bit rusty. I am trying to join a table through another table.
carparks has many clients
clients has many cars
This is the query
select `carparks`.* from `carparks`
left join `clients` on `carparks`.`carpark_id` = `clients`.`carpark_id`
left join `cars` on `clients`.`client_id` = `cars`.`client_id`
where `carparks`.`carpark_id` in (1, 3, 8, 33, 34, 38, 39)
order by `cars`.`created_at` desc
As you can see, I am trying to order by the created_at column of cars, but the query above returns each carpark once for every car in it.
What I want is to return only the carparks with the ids in the WHERE IN clause, once each, ordered by the created_at column of the cars table.
Thanks
You can aggregate in the ORDER BY clause, using the maximum created_at from the cars table:
SELECT cp.*
FROM `carparks` cp
LEFT JOIN `clients` cl ON cp.`carpark_id` = cl.`carpark_id`
LEFT JOIN `cars` c ON cl.`client_id` = c.`client_id`
WHERE cp.`carpark_id` IN (1, 3, 8, 33, 34, 38, 39)
GROUP BY cp.`carpark_id`
ORDER BY MAX(c.`created_at`) DESC
Reduce the wanted dates to one per carpark before joining back to carparks. Note that if a carpark can have no cars, then the left join is appropriate; however, I expect every carpark (that is open for business) will have cars, so the left join might not be needed.
SELECT `carparks`.*
FROM `carparks`
LEFT JOIN (
SELECT
`clients`.`carpark_id`
, max(`cars`.`created_at`) max_car_created
FROM `clients`
INNER JOIN `cars` ON `clients`.`client_id` = `cars`.`client_id`
GROUP BY
`clients`.`carpark_id`
) d ON `carparks`.`carpark_id` = d.`carpark_id`
WHERE `carparks`.`carpark_id` IN (1, 3, 8, 33, 34, 38, 39)
ORDER BY max_car_created DESC
Reduce the number of carparks and clients before doing the joins; this may reduce the execution time of the query.
SELECT A.*
FROM (SELECT * FROM `carparks`
WHERE `carpark_id` IN (1, 3, 8, 33, 34, 38, 39)) A
LEFT JOIN (SELECT `carpark_id`, `client_id` FROM `clients`
WHERE `carpark_id` IN (1, 3, 8, 33, 34, 38, 39)) B ON A.`carpark_id` = B.`carpark_id`
LEFT JOIN `cars` C ON B.`client_id` = C.`client_id`
GROUP BY A.`carpark_id`
ORDER BY MAX(C.`created_at`) DESC
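For anyone who wants to check the effect of grouping, here is a small sketch comparing the plain join with the GROUP BY / ORDER BY MAX(...) version. The table contents and the name column are invented for illustration, and SQLite (via Python's sqlite3) stands in for MySQL.

```python
import sqlite3

# Hypothetical data: carpark 1 has three cars, carpark 3 has one, carpark 8 has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE carparks (carpark_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE clients (client_id INTEGER PRIMARY KEY, carpark_id INTEGER);
    CREATE TABLE cars (car_id INTEGER PRIMARY KEY, client_id INTEGER, created_at TEXT);
    INSERT INTO carparks VALUES (1, 'North'), (3, 'South'), (8, 'East');
    INSERT INTO clients VALUES (100, 1), (101, 1), (102, 3);
    INSERT INTO cars VALUES
        (1000, 100, '2020-01-01'), (1001, 100, '2020-03-01'),
        (1002, 101, '2020-02-01'), (1003, 102, '2020-04-01');
""")

# Plain join: one output row per car, so carpark 1 appears three times.
plain = conn.execute("""
    SELECT cp.carpark_id
    FROM carparks cp
    LEFT JOIN clients cl ON cp.carpark_id = cl.carpark_id
    LEFT JOIN cars c ON cl.client_id = c.client_id
    WHERE cp.carpark_id IN (1, 3, 8)
    ORDER BY c.created_at DESC
""").fetchall()

# Grouped: one row per carpark, ordered by its most recent car.
grouped = conn.execute("""
    SELECT cp.carpark_id, cp.name
    FROM carparks cp
    LEFT JOIN clients cl ON cp.carpark_id = cl.carpark_id
    LEFT JOIN cars c ON cl.client_id = c.client_id
    WHERE cp.carpark_id IN (1, 3, 8)
    GROUP BY cp.carpark_id
    ORDER BY MAX(c.created_at) DESC
""").fetchall()

print(len(plain), grouped)
```

The carpark without cars has a NULL maximum and sorts last under DESC, both in SQLite and in MySQL.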

Mysql multiple inner join don't work

I have two tables A & B
A :
id
data
B :
key
value
A_id
I have a problem with my SQL query (it's hard to explain, so I created an sqlfiddle):
SELECT A.id
FROM A
INNER JOIN B b1 ON b1.key = '20' AND b1.A_id = A.id
INNER JOIN B b2 ON b2.key = '18' AND b2.A_id = A.id
WHERE
b1.value = '1900' AND
b2.value >= '1900'
In this example, I'm supposed to get A.id = 12 and A.id = 13, but I get nothing.
CREATE TABLE A
(`id` int, `data` int)
;
INSERT INTO A
(`id`, `data`)
VALUES
(11, 11),
(12, 11),
(13, 12)
;
CREATE TABLE B
(`key` int, `value` int, `A_id` int)
;
INSERT INTO B
(`key`, `value`, `A_id`)
VALUES
(20, 1900, 12),
(2, 19, 11),
(11, 19, 11),
(9, 19, 11),
(18, 1950, 13),
(19, 1950, 12)
;
Any idea ?
thanks
First, if you code for more than 10 minutes, you will learn to despise the phrase "don't work"... "don't work" is the phrase that doesn't work.
/rant
You are trying to join tables in an effort to filter. Instead, filter accordingly. Check this out:
SELECT A.id
FROM A
INNER JOIN B b1 ON b1.A_id = A.id
WHERE
(b1.key = '20' AND b1.value = '1900')
OR
(b1.key = '18' AND b1.value >= '1900')
That asks for what you want and joins only when necessary.
INNER JOIN only returns a row when a match exists on both sides of the join. Because you inner join B twice on the same A.id, each time with a different key, you only get a row when a single A.id has matching B rows for both keys, and no A.id in your data does.
I'd suggest something different:
SELECT A.id
FROM A
INNER JOIN B on A.id = B.A_id
WHERE
(B.key = '20' AND B.value = '1900') OR (B.key = '18' AND B.value >= '1900')
;
Your query uses the same A.id for both joins, so b1.A_id and b2.A_id must be equal on every returned row. The only row with key = 18 has an A_id of 13, while the only row with key = 20 has an A_id of 12. So you are asking for a row where b1.A_id = b2.A_id while one is 12 and the other is 13, which can never match. If you change the query to
SELECT A.id
FROM A
INNER JOIN B b1 ON b1.key = '20' AND b1.A_id = A.id
INNER JOIN B b2 ON b2.key = '19' AND b2.A_id = A.id
WHERE
b1.value = '1900' AND
b2.value >= '1900'
you'll get a return of 12 for A.id.
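The difference between the two approaches can be checked against the sample data from the question. This is only a sketch: SQLite (through Python's sqlite3) stands in for MySQL, the literals are written as numbers instead of strings, and DISTINCT / ORDER BY are added so the output is stable.

```python
import sqlite3

# Sample data copied from the question; `key` is quoted because it is a keyword.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INT, data INT);
    INSERT INTO A (id, data) VALUES (11, 11), (12, 11), (13, 12);
    CREATE TABLE B (`key` INT, value INT, A_id INT);
    INSERT INTO B (`key`, value, A_id) VALUES
        (20, 1900, 12), (2, 19, 11), (11, 19, 11),
        (9, 19, 11), (18, 1950, 13), (19, 1950, 12);
""")

# Original query: one A.id must satisfy BOTH joins, and none does.
both_joins = conn.execute("""
    SELECT A.id
    FROM A
    INNER JOIN B b1 ON b1.`key` = 20 AND b1.A_id = A.id
    INNER JOIN B b2 ON b2.`key` = 18 AND b2.A_id = A.id
    WHERE b1.value = 1900 AND b2.value >= 1900
""").fetchall()

# Single join with OR: an A.id qualifies if EITHER condition holds.
either_condition = conn.execute("""
    SELECT DISTINCT A.id
    FROM A
    INNER JOIN B b1 ON b1.A_id = A.id
    WHERE (b1.`key` = 20 AND b1.value = 1900)
       OR (b1.`key` = 18 AND b1.value >= 1900)
    ORDER BY A.id
""").fetchall()

print(both_joins, either_condition)
```

The double join asks for "and" across two B rows of the same A.id, while the expected result (12 and 13) corresponds to "or", which is why the single-join form returns what the question wanted.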

IF statement in mysql JOIN - Adding multiple conditions if condition success

Here is the case:-
I want to join 2 tables. Lets say table a and b
SELECT *
FROM a
JOIN b ON a.id = b.id AND b.status = '1'
Here is the problem:
b.status = '1'
should only be added when
b.stage in (1, 3, 5, 6, 8)
How can I add such condition in ON clause ?
Like
ON a.id = b.id
CASE
IF (b.stage in (1, 3, 5, 6, 8))
THEN
AND b.status = '1'
END
Your condition is logically the same as "either stage is not in the list or status is 1":
SELECT *
FROM a
JOIN b ON a.id = b.id
AND (b.stage not in (1, 3, 5, 6, 8) OR b.status = '1')
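As a quick sanity check of that rewrite, here is a sketch with four made-up rows, one for each combination of "stage in the list or not" and "status 1 or not". The data is hypothetical and SQLite (via Python's sqlite3) stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY);
    CREATE TABLE b (id INTEGER, stage INTEGER, status TEXT);
    INSERT INTO a VALUES (1), (2), (3), (4);
    INSERT INTO b VALUES
        (1, 1, '1'),  -- stage in list, status 1: kept
        (2, 3, '0'),  -- stage in list, status 0: dropped
        (3, 2, '0'),  -- stage not in list: kept whatever the status
        (4, 4, '1');  -- stage not in list: kept
""")

# The status test only bites when the stage is one of (1, 3, 5, 6, 8).
matched = [row[0] for row in conn.execute("""
    SELECT a.id
    FROM a
    JOIN b ON a.id = b.id
          AND (b.stage NOT IN (1, 3, 5, 6, 8) OR b.status = '1')
    ORDER BY a.id
""")]

print(matched)
```

One caveat worth keeping in mind: if b.stage can be NULL, the NOT IN test evaluates to NULL rather than true, so such a row is only joined when its status is '1'.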

MySQL: Problem with max() and group by - wrong values

I have a problem with a SQL statement:
Using this
select a.id as ID, a.dur as DUR, DATE(FROM_UNIXTIME(timestampCol)) as date,
a_au.re as RE, a_au.stat as STAT from b_c
inner join c on b_c.c_id = c.id
inner join a on c.id = a.c_id
inner join a_au on a.id = a_au.id
inner join revi on a_au.rev = revi.rev
where b_c.b_id = 5
I get this result:
ID DUR date RE STAT
-------------------------------
31, 10, '2010-07-14', 2200, 0
31, 10, '2010-07-14', 2205, 0
31, 10, '2010-07-14', 2206, 2
31, 10, '2010-07-14', 2207, 0
31, 10, '2010-07-14', 2210, 2
31, 10, '2010-07-15', 2211, 0
31, 10, '2010-07-14', 2213, 1
32, 10, '2010-07-14', 2203, 0
32, 10, '2010-07-14', 2204, 0
32, 10, '2010-07-14', 2208, 2
32, 10, '2010-07-14', 2209, 0
32, 10, '2010-07-15', 2212, 2
Now I want to get one result row for one ID and date combination. Also I want to get this result row with the highest RE number.
So I write my statement:
select a.id as ID, a.dur as DUR, DATE(FROM_UNIXTIME(timestampCol)) as date,
max(a_au.re) as RE, a_au.stat as STAT from b_c
inner join c on b_c.c_id = c.id
inner join a on c.id = a.c_id
inner join a_au on a.id = a_au.id
inner join revi on a_au.rev = revi.rev
where b_c.b_id = 5
group by ID, date
Now I get this result:
ID DUR date RE STAT
-------------------------------
31, 10, '2010-07-14', 2213, 0
31, 10, '2010-07-15', 2211, 0
32, 10, '2010-07-14', 2209, 0
32, 10, '2010-07-15', 2212, 2
Everything seems to be okay, I have one result row per day/ID combination and the row with the highest RE number. But: the column STAT does not have the correct values!
The row
31, 10, '2010-07-14', 2213, 0
must have the status 1:
31, 10, '2010-07-14', 2213, 1
So there must be a mistake in my statement. It seems that MySQL takes the first STAT column value it found. But I want to have the corresponding one.
What should I do?
I saw other topics about this like here:
Selecting all corresponding fields using MAX and GROUP BY
but I can not transfer it to my SQL statement.
Thanks a lot in advance & Best Regards.
Use:
SELECT a.id as ID,
a.dur as DUR,
DATE(FROM_UNIXTIME(timestampCol)) as date,
a_au.re as RE,
a_au.stat as STAT
FROM b_c
JOIN c on b_c.c_id = c.id
JOIN a on c.id = a.c_id
JOIN a_au on a.id = a_au.id
JOIN revi on a_au.rev = revi.rev
JOIN ( SELECT a.id as ID,
DATE(FROM_UNIXTIME(timestampCol)) as date,
MAX(a_au.re) as Max_RE
FROM b_c
JOIN c on b_c.c_id = c.id
JOIN a on c.id = a.c_id
JOIN a_au on a.id = a_au.id
JOIN revi on a_au.rev = revi.rev
WHERE b_c.b_id = 5
GROUP BY a.id, DATE(FROM_UNIXTIME(timestampCol))) x ON x.id = a.id
AND x.date = DATE(FROM_UNIXTIME(timestampCol))
AND x.max_re = a_au.re
WHERE b_c.b_id = 5
Sadly, MySQL versions before 8.0 don't support the WITH clause, which could've made this a lot easier to read.
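On engines that do accept WITH (MySQL 8.0 and later, and SQLite as used here), the same "join back to the per-group maximum" idea might be sketched as follows. To keep it short, the five-table join is replaced by a single hypothetical table, joined_rows, filled with the rows shown in the question; the table name and the day column are assumptions made for this illustration.

```python
import sqlite3

# joined_rows is a hypothetical flattened copy of the question's first result set.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE joined_rows (id INTEGER, dur INTEGER, day TEXT, re INTEGER, stat INTEGER);
    INSERT INTO joined_rows VALUES
        (31, 10, '2010-07-14', 2200, 0), (31, 10, '2010-07-14', 2205, 0),
        (31, 10, '2010-07-14', 2206, 2), (31, 10, '2010-07-14', 2207, 0),
        (31, 10, '2010-07-14', 2210, 2), (31, 10, '2010-07-15', 2211, 0),
        (31, 10, '2010-07-14', 2213, 1), (32, 10, '2010-07-14', 2203, 0),
        (32, 10, '2010-07-14', 2204, 0), (32, 10, '2010-07-14', 2208, 2),
        (32, 10, '2010-07-14', 2209, 0), (32, 10, '2010-07-15', 2212, 2);
""")

# Step 1 (CTE): highest RE per id/day. Step 2: join back to fetch that row's STAT.
rows = conn.execute("""
    WITH latest AS (
        SELECT id, day, MAX(re) AS max_re
        FROM joined_rows
        GROUP BY id, day
    )
    SELECT r.id, r.dur, r.day, r.re, r.stat
    FROM joined_rows r
    JOIN latest x ON x.id = r.id
                 AND x.day = r.day
                 AND x.max_re = r.re
    ORDER BY r.id, r.day
""").fetchall()

for row in rows:
    print(row)
```

With this data the row for id 31 on 2010-07-14 comes back with RE 2213 and STAT 1, which is the value the question expected instead of 0.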