There are two tables: question and answer. In answer I hold user_id and question_id. I want to count how many times each choice is selected.
Below is a working query, but it joins the same table four times. Is there a faster way, i.e. one that joins the answer table only once?
SELECT question.question_id,
question.correct_choice,
COUNT(DISTINCT a.user_id) as num_of_a,
COUNT(DISTINCT b.user_id) as num_of_b,
COUNT(DISTINCT c.user_id) as num_of_c,
COUNT(DISTINCT d.user_id) as num_of_d
FROM answer a,
answer b,
answer c,
answer d,
question
WHERE a.question_id = question.question_id
AND b.question_id = question.question_id
AND c.question_id = question.question_id
AND d.question_id = question.question_id
AND a.choice = 'A'
AND b.choice = 'B'
AND c.choice = 'C'
AND d.choice = 'D'
GROUP BY question.question_id
ORDER BY question.question_id asc;
returns
273, D, 5, 2, 8, 39
274, C, 2, 14, 50, 2
277, C, 3, 5, 41, 17
278, C, 16, 9, 34, 9
279, C, 8, 30, 24, 12
280, B, 17, 21, 20, 3
284, C, 2, 3, 19, 1
286, A, 16, 3, 2, 2
287, D, 1, 2, 1, 18
289, B, 3, 18, 2, 2
290, D, 6, 9, 8, 6
This solution only does a single join. Additionally, I converted your implicit joins to explicit ones and completed your GROUP BY:
SELECT
q.question_id,
q.correct_choice,
COUNT(DISTINCT CASE WHEN a.choice = 'A' THEN a.user_id END) as num_of_a,
COUNT(DISTINCT CASE WHEN a.choice = 'B' THEN a.user_id END) as num_of_b,
COUNT(DISTINCT CASE WHEN a.choice = 'C' THEN a.user_id END) as num_of_c,
COUNT(DISTINCT CASE WHEN a.choice = 'D' THEN a.user_id END) as num_of_d
FROM
answer a
JOIN question q ON a.question_id = q.question_id
GROUP BY q.question_id, q.correct_choice
ORDER BY q.question_id asc;
This works because when no WHEN branch matches, the CASE expression returns NULL, and COUNT(DISTINCT ...) ignores NULLs, so those rows are not counted among the user ids.
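To see the conditional aggregation on a small data set, here is a minimal sketch. It assumes SQLite (through Python's sqlite3 module) as a stand-in for MySQL, and the sample rows are made up for illustration, not taken from the asker's data.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the rows below are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE question (question_id INTEGER PRIMARY KEY, correct_choice TEXT);
    CREATE TABLE answer (user_id INTEGER, question_id INTEGER, choice TEXT);
    INSERT INTO question VALUES (273, 'D'), (274, 'C');
    INSERT INTO answer VALUES
        (1, 273, 'A'), (2, 273, 'D'), (3, 273, 'D'),
        (1, 274, 'C'), (2, 274, 'C'), (3, 274, 'B');
""")

# One join on answer; each CASE yields NULL for non-matching choices,
# and COUNT(DISTINCT ...) skips NULLs.
rows = conn.execute("""
    SELECT q.question_id,
           q.correct_choice,
           COUNT(DISTINCT CASE WHEN a.choice = 'A' THEN a.user_id END) AS num_of_a,
           COUNT(DISTINCT CASE WHEN a.choice = 'B' THEN a.user_id END) AS num_of_b,
           COUNT(DISTINCT CASE WHEN a.choice = 'C' THEN a.user_id END) AS num_of_c,
           COUNT(DISTINCT CASE WHEN a.choice = 'D' THEN a.user_id END) AS num_of_d
    FROM answer a
    JOIN question q ON a.question_id = q.question_id
    GROUP BY q.question_id, q.correct_choice
    ORDER BY q.question_id
""").fetchall()

for row in rows:
    print(row)
```

A choice that nobody picked simply shows up as 0, because every CASE for that column produced NULL.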
You might consider a SELECT ... UNION SELECT style if you are concerned about performance.
That said, I agree with #benjam that you should EXPLAIN the query to see what the optimizer says, since you do not have any dependent subqueries.
Make sure that you have indexes on question.question_id and on answer.question_id, answer.choice, and answer.user_id; with those in place, your query should be about as fast as any alternative that does not join answer once per choice. Then use the following query:
SELECT `question`.`question_id`,
`question`.`correct_choice`,
COUNT(DISTINCT `a`.`user_id`) as `num_of_a`,
COUNT(DISTINCT `b`.`user_id`) as `num_of_b`,
COUNT(DISTINCT `c`.`user_id`) as `num_of_c`,
COUNT(DISTINCT `d`.`user_id`) as `num_of_d`
FROM `question`
LEFT JOIN `answer` AS `a`
ON `a`.`question_id` = `question`.`question_id`
AND `a`.`choice` = 'A'
LEFT JOIN `answer` AS `b`
ON `b`.`question_id` = `question`.`question_id`
AND `b`.`choice` = 'B'
LEFT JOIN `answer` AS `c`
ON `c`.`question_id` = `question`.`question_id`
AND `c`.`choice` = 'C'
LEFT JOIN `answer` AS `d`
ON `d`.`question_id` = `question`.`question_id`
AND `d`.`choice` = 'D'
GROUP BY `question`.`question_id`, `question`.`correct_choice`;
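In MySQL 5.7 and earlier the ORDER BY clause is not needed, because GROUP BY sorts implicitly. MySQL 8.0 removed that implicit sort, so add ORDER BY there if you need ordered output.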
The MySQL version is 5.7. This query:
SELECT distinct(t.id),
t.sys_user_id,
t.update_time
FROM institute_student t
inner join institute_student_campus student_campus on student_campus.student_id = t.id
WHERE t.deleted = 0
AND t.birth_country_id = 9
AND t.type = 'FINISHED'
AND t.gender = 6
and t.id in (SELECT distinct s_alert.student_id FROM institute_student_alert_welfare s_alert inner join institute_alert_welfare alert on s_alert.alert_welfare_id=alert.id and alert.type='Alert')
and student_campus.campus_id in (1, 2, 17, 18, 19, 20)
ORDER by t.update_time desc
LIMIT 15;
takes 4.25 sec, while this one:
SELECT distinct(t.id),
t.sys_user_id,
t.update_time
FROM institute_student t
inner join institute_student_campus student_campus on student_campus.student_id = t.id
WHERE t.deleted = 0
AND t.birth_country_id = 9
AND t.type = 'FINISHED'
AND t.gender = 6
and t.id in (SELECT distinct s_alert.student_id FROM institute_student_alert_welfare s_alert inner join institute_alert_welfare alert on s_alert.alert_welfare_id=alert.id and alert.type='Alert')
and t.id in (SELECT distinct s_alert.student_id FROM institute_student_alert_welfare s_alert inner join institute_alert_welfare alert on s_alert.alert_welfare_id=alert.id and alert.type='Alert')
and student_campus.campus_id in (1, 2, 17, 18, 19, 20)
ORDER by t.update_time desc
LIMIT 15;
takes 0.02 sec.
The only difference is that the second query repeats the IN subquery shown below:
and t.id in (SELECT distinct s_alert.student_id FROM institute_student_alert_welfare s_alert inner join institute_alert_welfare alert on s_alert.alert_welfare_id=alert.id and alert.type='Alert')
Can anyone tell me why? Thanks :)
My MySQL (pun intended) is a bit rusty. I am trying to join a table through another table.
carparks has many clients
clients has many cars
This is the query:
select `carparks`.* from `carparks`
left join `clients` on `carparks`.`carpark_id` = `clients`.`carpark_id`
left join `cars` on `clients`.`client_id` = `cars`.`client_id`
where `carparks`.`carpark_id` in (1, 3, 8, 33, 34, 38, 39)
order by `cars`.`created_at` desc
As you can see, I am trying to order by the created_at column of cars. However, the query above returns a duplicated carpark row for each car within the carpark.
What I am looking for is to return only the carparks with the ids in the WHERE IN clause, each once, ordered by the created_at column of the cars table.
Thanks
You can use aggregation in your ORDER BY clause, ordering by the max created date from the cars table:
SELECT cp.*
FROM `carparks` cp
LEFT JOIN `clients` cl ON cp.`carpark_id` = cl.`carpark_id`
LEFT JOIN `cars` c ON cl.`client_id` = c.`client_id`
WHERE cp.`carpark_id` IN (1, 3, 8, 33, 34, 38, 39)
GROUP BY cp.`carpark_id`
ORDER BY MAX(c.`created_at`) DESC
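As a rough check of the GROUP BY plus ORDER BY MAX(...) idea, here is a small sketch. It assumes SQLite via Python's sqlite3 in place of MySQL, and the name column and all sample rows are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; `name` and the rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE carparks (carpark_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE clients (client_id INTEGER PRIMARY KEY, carpark_id INTEGER);
    CREATE TABLE cars (car_id INTEGER PRIMARY KEY, client_id INTEGER, created_at TEXT);
    INSERT INTO carparks VALUES (1, 'North'), (3, 'South'), (8, 'East');
    INSERT INTO clients VALUES (10, 1), (11, 1), (12, 3);
    INSERT INTO cars VALUES
        (100, 10, '2020-01-01'),
        (101, 11, '2020-03-01'),
        (102, 12, '2020-02-01');
""")

joins = """
    FROM carparks cp
    LEFT JOIN clients cl ON cp.carpark_id = cl.carpark_id
    LEFT JOIN cars c ON cl.client_id = c.client_id
    WHERE cp.carpark_id IN (1, 3, 8)
"""

# Without grouping, carpark 1 appears once per car.
ungrouped = conn.execute("SELECT cp.carpark_id" + joins).fetchall()

# Grouping collapses each carpark to one row, ordered by its newest car.
grouped = conn.execute(
    "SELECT cp.carpark_id, cp.name" + joins +
    " GROUP BY cp.carpark_id, cp.name ORDER BY MAX(c.created_at) DESC"
).fetchall()

print(len(ungrouped), grouped)
```

Carpark 8 has no cars, so its MAX is NULL and it sorts last under DESC in both SQLite and MySQL.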
Reduce the wanted dates to one per carpark before joining back to carparks. Note that if a carpark can have no cars, then a LEFT JOIN is the logical choice; however, I expect every carpark (that is open for business) will have cars, so the LEFT JOIN might not be needed.
SELECT `carparks`.*
FROM `carparks`
LEFT JOIN (
SELECT
`clients`.`carpark_id`
, max(`cars`.`created_at`) max_car_created
FROM `clients`
INNER JOIN `cars` ON `clients`.`client_id` = `cars`.`client_id`
GROUP BY
`clients`.`carpark_id`
) d ON `carparks`.`carpark_id` = d.`carpark_id`
WHERE `carparks`.`carpark_id` IN (1, 3, 8, 33, 34, 38, 39)
ORDER BY d.max_car_created DESC
Reduce the number of carparks and clients before doing the joins; this may reduce the execution time of the query.
SELECT A.*
FROM (SELECT * FROM `carparks`
WHERE `carpark_id` IN (1, 3, 8, 33, 34, 38, 39)) A
LEFT JOIN (SELECT `carpark_id`, `client_id` FROM `clients`
WHERE `carpark_id` IN (1, 3, 8, 33, 34, 38, 39)) B
ON A.`carpark_id` = B.`carpark_id`
LEFT JOIN `cars` C ON B.`client_id` = C.`client_id`
GROUP BY A.`carpark_id`
ORDER BY MAX(C.`created_at`) DESC
I have two tables A & B
A :
id
data
B :
key
value
A_id
I have a problem with my SQL query (it's hard to explain, so I created an sqlfiddle):
SELECT A.id
FROM A
INNER JOIN B b1 ON b1.key = '20' AND b1.A_id = A.id
INNER JOIN B b2 ON b2.key = '18' AND b2.A_id = A.id
WHERE
b1.value = '1900' AND
b2.value >= '1900'
In this example, I'm supposed to get A.id = 12 and A.id = 13, but I get nothing.
CREATE TABLE A
(`id` int, `data` int)
;
INSERT INTO A
(`id`, `data`)
VALUES
(11, 11),
(12, 11),
(13, 12)
;
CREATE TABLE B
(`key` int, `value` int, `A_id` int)
;
INSERT INTO B
(`key`, `value`, `A_id`)
VALUES
(20, 1900, 12),
(2, 19, 11),
(11, 19, 11),
(9, 19, 11),
(18, 1950, 13),
(19, 1950, 12)
;
Any idea ?
thanks
First, if you code for more than 10 minutes, you will learn to despise the phrase "don't work"... "don't work" is the phrase that doesn't work.
/rant
You are trying to join tables in an effort to filter. Instead, filter accordingly. Check this out:
SELECT A.id
FROM A
INNER JOIN B b1 ON b1.A_id = A.id
WHERE
(b1.key = '20' AND b1.value = '1900')
OR
(b1.key = '18' AND b1.value >= '1900')
That asks for what you want and joins only when necessary.
INNER JOIN only returns a row if a match exists in both tables. Since you inner join the same table twice, each time on a different key, a row comes back only when a single A row has matches for both keys, and no row in your data does.
I'd suggest something different:
SELECT A.id
FROM A
INNER JOIN B on A.id = B.A_id
WHERE
(B.key = '20' AND B.value = '1900') OR (B.key = '18' AND B.value >= '1900')
;
Your query uses the same A.id for both joins, so b1.A_id and b2.A_id must be equal on every returned row. But the only row with key = 18 has A_id 13, while the only row with key = 20 has A_id 12. So you are asking for a row where b1.A_id = b2.A_id while one is 12 and the other is 13, which can never match. If you change the query to
SELECT A.id
FROM A
INNER JOIN B b1 ON b1.key = '20' AND b1.A_id = A.id
INNER JOIN B b2 ON b2.key = '19' AND b2.A_id = A.id
WHERE
b1.value = '1900' AND
b2.value >= '1900'
you'll get a return of 12 for A.id.
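To compare the approaches on the sample data from the question, here is a small sketch. It assumes SQLite via Python's sqlite3 instead of MySQL and uses integer literals, since the fiddle declares the columns as int.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; data copied from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER, data INTEGER);
    CREATE TABLE B ("key" INTEGER, value INTEGER, A_id INTEGER);
    INSERT INTO A VALUES (11, 11), (12, 11), (13, 12);
    INSERT INTO B VALUES
        (20, 1900, 12), (2, 19, 11), (11, 19, 11),
        (9, 19, 11), (18, 1950, 13), (19, 1950, 12);
""")

# Original query: one A row must match key 20 AND key 18 at the same time.
double_join = conn.execute("""
    SELECT A.id
    FROM A
    JOIN B b1 ON b1."key" = 20 AND b1.A_id = A.id
    JOIN B b2 ON b2."key" = 18 AND b2.A_id = A.id
    WHERE b1.value = 1900 AND b2.value >= 1900
""").fetchall()

# Single join with an OR filter: either condition is enough.
or_filter = conn.execute("""
    SELECT A.id
    FROM A
    JOIN B b1 ON b1.A_id = A.id
    WHERE (b1."key" = 20 AND b1.value = 1900)
       OR (b1."key" = 18 AND b1.value >= 1900)
    ORDER BY A.id
""").fetchall()

print(double_join, or_filter)
```

Note that the OR version would return an id twice if one A row satisfied both conditions; adding DISTINCT avoids that.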
Here is the case:
I want to join 2 tables, let's say a and b:
SELECT *
FROM a
JOIN b ON a.id = b.id AND b.status = '1'
Here is the problem:
b.status = '1'
should only be added when
b.stage in (1, 3, 5, 6, 8)
How can I add such a condition in the ON clause?
Something like:
ON a.id = b.id
CASE
IF (b.stage in (1, 3, 5, 6, 8))
THEN
AND b.status = '1'
END
Your condition is logically the same as "either stage is not in the list or status is 1":
SELECT *
FROM a
JOIN b ON a.id = b.id
AND (b.stage not in (1, 3, 5, 6, 8) OR b.status = '1')
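A quick way to convince yourself of the rewrite is to run it on a few rows covering each case. This is only a sketch: it assumes SQLite via Python's sqlite3 rather than MySQL, and the rows are invented. It also shows one caveat: when b.stage is NULL, NOT IN evaluates to NULL, so such a row only joins if status is '1'.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY);
    CREATE TABLE b (id INTEGER, stage INTEGER, status TEXT);
    INSERT INTO a VALUES (1), (2), (3), (4);
    INSERT INTO b VALUES
        (1, 1, '1'),     -- stage in list, status 1: kept
        (2, 3, '0'),     -- stage in list, status not 1: dropped
        (3, 2, '0'),     -- stage not in list: kept regardless of status
        (4, NULL, '0');  -- NULL stage: NOT IN yields NULL, so dropped
""")

rows = conn.execute("""
    SELECT a.id
    FROM a
    JOIN b ON a.id = b.id
          AND (b.stage NOT IN (1, 3, 5, 6, 8) OR b.status = '1')
    ORDER BY a.id
""").fetchall()

print(rows)
```

If rows with a NULL stage should be kept, add OR b.stage IS NULL inside the parentheses.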
I have a problem with a SQL statement:
Using this
select a.id as ID, a.dur as DUR, DATE(FROM_UNIXTIME(timestampCol)) as date,
a_au.re as RE, a_au.stat as STAT from b_c
inner join c on b_c.c_id = c.id
inner join a on c.id = a.c_id
inner join a_au on a.id = a_au.id
inner join revi on a_au.rev = revi.rev
where b_c.b_id = 5
I get this result:
ID DUR date RE STAT
-------------------------------
31, 10, '2010-07-14', 2200, 0
31, 10, '2010-07-14', 2205, 0
31, 10, '2010-07-14', 2206, 2
31, 10, '2010-07-14', 2207, 0
31, 10, '2010-07-14', 2210, 2
31, 10, '2010-07-15', 2211, 0
31, 10, '2010-07-14', 2213, 1
32, 10, '2010-07-14', 2203, 0
32, 10, '2010-07-14', 2204, 0
32, 10, '2010-07-14', 2208, 2
32, 10, '2010-07-14', 2209, 0
32, 10, '2010-07-15', 2212, 2
Now I want to get one result row for one ID and date combination. Also I want to get this result row with the highest RE number.
So I write my statement:
select a.id as ID, a.dur as DUR, DATE(FROM_UNIXTIME(timestampCol)) as date,
max(a_au.re) as RE, a_au.stat as STAT from b_c
inner join c on b_c.c_id = c.id
inner join a on c.id = a.c_id
inner join a_au on a.id = a_au.id
inner join revi on a_au.rev = revi.rev
where b_c.b_id = 5
group by ID, date
Now I get this result:
ID DUR date RE STAT
-------------------------------
31, 10, '2010-07-14', 2213, 0
31, 10, '2010-07-15', 2211, 0
32, 10, '2010-07-14', 2209, 0
32, 10, '2010-07-15', 2212, 2
Everything seems to be okay, I have one result row per day/ID combination and the row with the highest RE number. But: the column STAT does not have the correct values!
The row
31, 10, '2010-07-14', 2213, 0
must have the status 1:
31, 10, '2010-07-14', 2213, 1
So there must be a mistake in my statement. It seems that MySQL takes the first STAT value it finds in each group, but I want the one that belongs to the row with the highest RE.
What should I do?
I saw other topics about this like here:
Selecting all corresponding fields using MAX and GROUP BY
but I can not transfer it to my SQL statement.
Thanks a lot in advance & Best Regards.
Use:
SELECT a.id as ID,
a.dur as DUR,
DATE(FROM_UNIXTIME(timestampCol)) as date,
a_au.re as RE,
a_au.stat as STAT
FROM b_c
JOIN c on b_c.c_id = c.id
JOIN a on c.id = a.c_id
JOIN a_au on a.id = a_au.id
JOIN revi on a_au.rev = revi.rev
JOIN ( SELECT a.id as ID,
DATE(FROM_UNIXTIME(timestampCol)) as date,
MAX(a_au.re) as Max_RE
FROM b_c
JOIN c on b_c.c_id = c.id
JOIN a on c.id = a.c_id
JOIN a_au on a.id = a_au.id
JOIN revi on a_au.rev = revi.rev
WHERE b_c.b_id = 5
GROUP BY a.id, DATE(FROM_UNIXTIME(timestampCol))) x ON x.id = a.id
AND x.date = DATE(FROM_UNIXTIME(timestampCol))
AND x.max_re = a_au.re
WHERE b_c.b_id = 5
Sadly, MySQL before 8.0 doesn't support the WITH clause, which could've made this a lot easier to read.
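For reference, the same "join back to the per-group maximum" pattern can be written with a WITH clause, which SQLite and MySQL 8.0 and later accept. The sketch below is simplified under stated assumptions: it runs on SQLite via Python's sqlite3, and it collapses the asker's five-table join into a single hypothetical table named revisions holding a subset of the rows shown in the question.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; `revisions` is a hypothetical
# flattened table standing in for the joined result in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE revisions (id INTEGER, day TEXT, re INTEGER, stat INTEGER);
    INSERT INTO revisions VALUES
        (31, '2010-07-14', 2200, 0),
        (31, '2010-07-14', 2210, 2),
        (31, '2010-07-14', 2213, 1),
        (31, '2010-07-15', 2211, 0),
        (32, '2010-07-14', 2208, 2),
        (32, '2010-07-14', 2209, 0),
        (32, '2010-07-15', 2212, 2);
""")

# Step 1 (CTE): highest re per (id, day).
# Step 2: join back so stat comes from that same row.
rows = conn.execute("""
    WITH latest AS (
        SELECT id, day, MAX(re) AS max_re
        FROM revisions
        GROUP BY id, day
    )
    SELECT r.id, r.day, r.re, r.stat
    FROM revisions r
    JOIN latest x ON x.id = r.id
                 AND x.day = r.day
                 AND x.max_re = r.re
    ORDER BY r.id, r.day
""").fetchall()

for row in rows:
    print(row)
```

The row for id 31 on 2010-07-14 now carries stat 1, the value that belongs to RE 2213.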