Optimize SQL query by eliminating extensive GROUP BY - mysql

I have a query that selects many columns plus a single aggregated value from a joined table. Because of the one-to-many relation with the joined table, this results in a long and ugly GROUP BY.
It's something like this:
SELECT user.id, user.name, user.type, etc.,
GROUP_CONCAT(car.id SEPARATOR ', ') AS cars
FROM user
INNER JOIN car ON user.id = car.userid
GROUP BY user.id, etc.
ORDER BY user.name, user.type, cars
I would like to eliminate the long GROUP BY, but how could I get the aggregated value without the JOIN? Is there a way with something like a subquery to join the values together like with the GROUP_CONCAT?

You can aggregate in car and then join to user:
SELECT u.id, u.name, u.type, etc.,
c.cars
FROM user u
INNER JOIN (
SELECT userid, GROUP_CONCAT(id SEPARATOR ', ') AS cars
FROM car
GROUP BY userid
) c ON u.id = c.userid
ORDER BY u.name, u.type, c.cars;
Or with a correlated subquery, which behaves like a LEFT JOIN (users without cars are kept, with cars set to NULL) and may perform better:
SELECT u.id, u.name, u.type, etc.,
(SELECT GROUP_CONCAT(c.id SEPARATOR ', ') FROM car c WHERE u.id = c.userid) AS cars
FROM user u
ORDER BY u.name, u.type, cars;
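Depending on the MySQL version, the long GROUP BY may not be needed in the first place. As a sketch, assuming MySQL 5.7.5 or later and that user.id is the table's primary key (neither is stated in the question), the other user columns are functionally dependent on user.id, so grouping by it alone is accepted even with ONLY_FULL_GROUP_BY enabled:

```sql
-- Assumes user.id is the PRIMARY KEY of user and MySQL >= 5.7.5.
-- The remaining user columns are functionally dependent on u.id,
-- so they do not have to be listed in GROUP BY.
SELECT u.id, u.name, u.type,
       GROUP_CONCAT(c.id ORDER BY c.id SEPARATOR ', ') AS cars
FROM user u
INNER JOIN car c ON c.userid = u.id
GROUP BY u.id
ORDER BY u.name, u.type, cars;
```

If either assumption does not hold, the derived-table and correlated-subquery forms above remain the safer choice.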

You can group this way:
SELECT user.id, user.name, user.type,
uc.cars
FROM (
SELECT userid, GROUP_CONCAT(id SEPARATOR ', ') AS cars
FROM car
GROUP BY userid) uc
INNER JOIN user ON user.id = uc.userid

Related

SQL Query WHERE clause gives errors

I have this SQL query, and it is rather advanced for me. I have a table add_design with two columns, status and home, and I want to add the condition WHERE status = 1 AND home = 'Yes' to the query. I just don't understand where to put it. Can someone break down the query for me, please?
SELECT mytable.*, CONCAT(fname, ' ', lname) AS user_name, user.image AS user_image,
user.fname, user.lname, add_design_doc.image as product_image
FROM
(
SELECT
add_design.id, add_design.user_id, title, category, COUNT(add_design.user_id) AS total_likes
FROM
add_design INNER JOIN
like_unlike ON (add_design.id = like_unlike.product_id AND
like_unlike.type = '1')
GROUP BY
add_design.id
ORDER BY
total_likes DESC ) AS mytable INNER JOIN
user ON mytable.user_id = user.id LEFT OUTER JOIN
add_design_doc ON mytable.id = add_design_id
GROUP BY
user_id LIMIT 4;

How to acquire the same result without disabling ONLY_FULL_GROUP_BY

The following query works after disabling ONLY_FULL_GROUP_BY in MySQL. Now I want the same result without disabling ONLY_FULL_GROUP_BY, for example with something like GROUP BY r.user_id, u.fullname, r.time_taken.
SELECT u.fullname, ROUND(AVG(r.correct), 2) avg_correct, date_format(r.time_taken,'%d-%m-%Y') time_taken
FROM (SELECT user_id, concat( first_name, ' ', last_name) fullname from user) u
LEFT JOIN test_result r ON u.user_id = r.user_id
GROUP BY r.user_id
ORDER BY r.time_taken DESC
Can anyone help me?
You must add u.fullname to the GROUP BY clause. This does not affect the results, because each user's id and fullname always go together. For time_taken, however, you must use ANY_VALUE():
SELECT
u.fullname,
ROUND(AVG(r.correct), 2) avg_correct,
date_format(any_value(r.time_taken),'%d-%m-%Y') time_taken
FROM (SELECT user_id, concat( first_name, ' ', last_name) fullname from user) u
LEFT JOIN test_result r ON u.user_id = r.user_id
GROUP BY r.user_id, u.fullname
ORDER BY time_taken DESC
You can find more here: MySQL Handling of GROUP BY
I would recommend writing this query as:
SELECT CONCAT(u.first_name, ' ', u.last_name) as fullname,
ROUND(AVG(r.correct), 2) as avg_correct,
DATE_FORMAT(MAX(r.time_taken), '%d-%m-%Y') as time_taken
FROM user u LEFT JOIN
test_result r
ON u.user_id = r.user_id
GROUP BY u.user_id, fullname
ORDER BY MAX(r.time_taken) DESC;
Notes:
The subquery in the FROM clause does not help the query. It might impede the optimizer.
Don't GROUP BY columns from the second table in a LEFT JOIN (unless you really know what you are doing). The value would be NULL for non-matches.
MySQL and MariaDB allow column aliases in the GROUP BY clause.
For the ORDER BY to work as you intend, it needs to be on the value before formatting, not after formatting. A string in the format %d-%m-%Y sorts by day of month first, not chronologically.
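To illustrate the last note, here is a minimal sketch with three made-up dates (the derived table t and its column d are purely illustrative). Sorting on the formatted string compares text character by character, so the day of the month decides the order:

```sql
-- Illustrative data only: three hard-coded dates in a derived table.
SELECT d, DATE_FORMAT(d, '%d-%m-%Y') AS formatted
FROM (
    SELECT DATE('2023-01-15') AS d
    UNION ALL SELECT DATE('2022-12-31')
    UNION ALL SELECT DATE('2023-02-01')
) t
ORDER BY formatted DESC;
-- String order: '31-12-2022', '15-01-2023', '01-02-2023'
-- The oldest date comes first, although DESC was requested.
```

Replacing the last line with ORDER BY d DESC sorts on the underlying date and returns the rows newest first.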

mysql JOIN between three tables

I have 4 tables in my database:
users (id,name)
roles (id,name)
positions (id,name)
position_user (user_id,position_id)
The relationship between users and roles is one-to-one.
The relationship between users and positions is many-to-many.
I want to get all users with their role name and a list of their positions, but I don't know how to structure my query. I think that one of my queries must be something like this:
SELECT pu.user_id AS user_id,
group_concat(p.name separator ',') AS list_pos
FROM position_user pu
INNER JOIN positions p
ON p.id = pu.position_id
GROUP BY pu.user_id
And the other one must be like this:
SELECT users.id, users.first_name, roles.name
FROM users
JOIN roles
ON users.role_id = roles.id
Can I combine these two into one query, and how?
Try something like this and check the MySQL documentation.
SELECT pu.user_id AS user_id, u.first_name, r.name AS role_name, GROUP_CONCAT(p.name SEPARATOR ',') AS list_pos
FROM position_user pu
INNER JOIN positions p ON p.id = pu.position_id
INNER JOIN users u ON u.id = pu.user_id
INNER JOIN roles r ON u.role_id = r.id
GROUP BY pu.user_id, u.first_name, r.name
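The question asks for all users, and a query built on INNER JOINs starting from position_user leaves out any user who has no position. A possible variant, sketched here under the assumption that the users table has the first_name and role_id columns used in the question's second query, starts from users and uses LEFT JOIN for the position tables:

```sql
-- Start from users so that users without any position are kept;
-- for them list_pos is NULL.
SELECT u.id AS user_id, u.first_name, r.name AS role_name,
       GROUP_CONCAT(p.name ORDER BY p.name SEPARATOR ',') AS list_pos
FROM users u
INNER JOIN roles r ON r.id = u.role_id
LEFT JOIN position_user pu ON pu.user_id = u.id
LEFT JOIN positions p ON p.id = pu.position_id
GROUP BY u.id, u.first_name, r.name;
```

If some users may have no role either, the join to roles would need to be a LEFT JOIN as well.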

Count different totals from multiple tables in mysql grouped by user_id in one query

I want to count the rows per user_id in the courses_taken and quiz_attempt tables, but my query returns the wrong numbers.
SELECT
u.id,
u.email,
u.user,
u.joined,
MAX(qa.last_attempt_time) as last_attempt_time,
COUNT(qa.user_id) total_quiz,
COUNT(ct.user_id) total_courses
FROM users u
LEFT JOIN courses_taken ct
ON u.id = ct.user_id
LEFT JOIN quiz_attempt qa
ON u.id = qa.user_id AND qa.attempt_mode=1
GROUP BY u.id
ORDER BY total_courses DESC
Table structure
users table
id, email, user, joined
quiz_attempt table
id,user_id, last_attempt_time, attempt_mode etc.
courses_taken table
id,user_id,course_id,taken_on etc.
Here I am trying to get all users with their total number of quiz attempts and total number of courses taken. But my query returns the same number for both quiz attempts and courses taken.
What you can do is use COUNT DISTINCT on a column which varies uniquely with the value that you are trying to count, i.e.:
...
COUNT(DISTINCT qa.id) total_quiz,
COUNT(DISTINCT ct.course_id) total_courses
...
You should not put DISTINCT on the user_id column; put it on the id column of each joined table, like this:
SELECT u.id, u.email, u.user, u.joined,
MAX(qa.last_attempt_time) as last_attempt_time,
COUNT(DISTINCT qa.id) as total_quiz,
COUNT(DISTINCT ct.id) as total_courses
FROM users u LEFT JOIN
courses_taken ct
ON u.id = ct.user_id LEFT JOIN
quiz_attempt qa
ON u.id = qa.user_id AND qa.attempt_mode = 1
GROUP BY u.id, u.email, u.user, u.joined
ORDER BY total_courses DESC;
Or, if this confuses you, you can use subqueries like this:
SELECT
u.id,
u.email,
u.user,
u.joined,
qa.last_attempt_time AS last_attempt_time,
qa.total_quizCOUNT,
ct.total_coursesCOUNT
FROM users u
LEFT JOIN
(SELECT user_id, COUNT(user_id) AS total_coursesCOUNT FROM courses_taken GROUP BY user_id) ct
ON u.id = ct.user_id
LEFT JOIN (SELECT user_id, COUNT(user_id) AS total_quizCOUNT, MAX(last_attempt_time) AS last_attempt_time FROM quiz_attempt WHERE attempt_mode = 1 GROUP BY user_id) qa
ON u.id = qa.user_id
ORDER BY total_coursesCOUNT DESC;
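You probably have a Cartesian product problem because of the joins: for each user, every courses_taken row is paired with every quiz_attempt row, so both counts come out as courses multiplied by attempts. The better solution is to pre-aggregate the results. However, in many cases, if the tables are not too big, COUNT(DISTINCT) solves the problem: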
SELECT u.id, u.email, u.user, u.joined,
MAX(qa.last_attempt_time) as last_attempt_time,
COUNT(DISTINCT qa.id) as total_quiz,
COUNT(DISTINCT ct.id) as total_courses
FROM users u LEFT JOIN
courses_taken ct
ON u.id = ct.user_id LEFT JOIN
quiz_attempt qa
ON u.id = qa.user_id AND qa.attempt_mode = 1
GROUP BY u.id
ORDER BY total_courses DESC;
Note that this works because you are using MAX() and COUNT(). It would not work with SUM() or AVG().
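To show why SUM() needs the pre-aggregated form, here is a sketch that also totals a score per user. The column quiz_attempt.score is hypothetical and not part of the schema in the question; it stands in for any value you might want to sum. Each derived table is reduced to one row per user before the join, so no row is multiplied:

```sql
-- quiz_attempt.score is a hypothetical column, used for illustration only.
SELECT u.id,
       COALESCE(qa.total_quiz, 0)    AS total_quiz,
       COALESCE(qa.total_score, 0)   AS total_score,
       COALESCE(ct.total_courses, 0) AS total_courses
FROM users u
LEFT JOIN (
    -- One row per user: attempts and summed score.
    SELECT user_id, COUNT(*) AS total_quiz, SUM(score) AS total_score
    FROM quiz_attempt
    WHERE attempt_mode = 1
    GROUP BY user_id
) qa ON qa.user_id = u.id
LEFT JOIN (
    -- One row per user: number of courses taken.
    SELECT user_id, COUNT(*) AS total_courses
    FROM courses_taken
    GROUP BY user_id
) ct ON ct.user_id = u.id
ORDER BY total_courses DESC;
```

With the plain double join, SUM(qa.score) would add each attempt's score once per course the user has taken, and DISTINCT cannot repair that, because two attempts may legitimately have the same score.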

Left join with group_concat is too slow

I have 2 tables:
users (id, firstname, lastname, etc)
users_to_groups (user_id(index), group_id(index))
I would like to make a query that returns records like the following:
firstname lastname groups
John Smith 1,2,3,5
Tom Doe 3,5
I use the GROUP_CONCAT function, and currently my query is:
SELECT * FROM users
LEFT OUTER JOIN
(
SELECT user_id, group_concat(group_id) FROM users_to_groups GROUP BY user_id
) AS i
ON users.id = i.user_id
It works, but it's very slow. I have 40k users and 260k records in the groups table.
Looks like the query doesn't use the index and goes through all the 260k lines for every user.
Is there any way to make it faster? It takes 3+ minutes, but I think it shouldn't.
Thanks!
Try:
SELECT
u.id, u.firstname, u.lastname, group_concat(g.group_id)
FROM users u
LEFT OUTER JOIN users_to_groups g ON u.id = g.user_id
GROUP BY u.id, u.firstname, u.lastname
It's not the left join but the subselect that makes your query slow. MySQL, older versions in particular, tends to optimize subselects in the FROM clause poorly.
This is probably faster:
SELECT
u.id, u.firstname, u.lastname,
group_concat(ug.group_id) AS `groups`
FROM
users u
LEFT JOIN users_to_groups ug ON ug.user_id = u.id
GROUP BY
u.id, u.firstname, u.lastname
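Since the question suspects that the index is not being used, one way to check is to prefix the query with EXPLAIN and look at the key column of the plan. The sketch below also adds a composite index; the index name idx_user_group and the column order are assumptions for illustration, and whether it helps depends on the data and the MySQL version:

```sql
-- Inspect the plan: the "key" column shows which index, if any,
-- is chosen for each table.
EXPLAIN
SELECT u.id, u.firstname, u.lastname,
       GROUP_CONCAT(ug.group_id) AS group_ids
FROM users u
LEFT JOIN users_to_groups ug ON ug.user_id = u.id
GROUP BY u.id, u.firstname, u.lastname;

-- Hypothetical composite index (name and column order are assumptions):
-- it lets the join read group_id straight from the index.
ALTER TABLE users_to_groups
    ADD INDEX idx_user_group (user_id, group_id);
```

Running EXPLAIN again after adding the index shows whether the optimizer actually picks it.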