Selecting a count of rows having a max value - mysql

Working example: http://sqlfiddle.com/#!9/80995/20
I have three tables, a user table, a user_group table, and a link table.
The link table contains the dates that users were added to user groups. I need a query that returns the count of users currently in each group. The most recent date determines the group that the user is currently in.
SELECT
user_groups.name,
COUNT(l.name) AS ct,
GROUP_CONCAT(l.`name` separator ", ") AS members
FROM user_groups
LEFT JOIN
(SELECT MAX(added), group_id, name FROM link LEFT JOIN users ON users.id = link.user_id GROUP BY user_id) l
ON l.group_id = user_groups.id
GROUP BY user_groups.id
My question is if the query I have written could be optimized, or written better.
Thanks!
Ben

Your current query does not give you the answer you want, at least as far as I understand your question. John joined Group 2 on 2017-01-05, yet in your results he appears in Group 1 (which he joined on 2017-01-01). Note also that Group 4 is missing from your results.
Using standard SQL, I think the next query is what you're looking for. The comments in the query should clarify what each part is doing:
SELECT
user_groups.name AS group_name,
COUNT(u.name) AS member_count,
group_concat(u.name separator ', ') AS members
FROM
user_groups
LEFT JOIN
(
SELECT * FROM
(-- For each user, find most recent date s/he got into a group
SELECT
user_id AS the_user_id, MAX(added) AS last_added
FROM
link
GROUP BY
user_id
) AS u_a
-- Join back to the link table, so that the `group_id` can be retrieved
JOIN link l2 ON l2.user_id = u_a.the_user_id AND l2.added = u_a.last_added
) AS most_recent_group ON most_recent_group.group_id = user_groups.id
-- And get the users...
LEFT JOIN users u ON u.id = most_recent_group.the_user_id
GROUP BY
user_groups.id, user_groups.name
ORDER BY
user_groups.name ;
This can be written more compactly in MySQL (abusing the fact that older versions of MySQL do not enforce the SQL standard's GROUP BY restrictions by default).
That's what you'll get:
group_name | member_count | members
:--------- | -----------: | :-------------
Group 1 | 2 | Mikie, Dominic
Group 2 | 2 | John, Paddy
Group 3 | 0 | null
Group 4 | 1 | Nellie
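A minimal sketch of the "find the latest date per user, then join back to link" approach, run against SQLite through Python's sqlite3 module rather than MySQL. The rows are hypothetical (only John's two dates come from the discussion above), and GROUP_CONCAT is left out to keep it short:

```python
import sqlite3

# Hypothetical sample data; the original fiddle's rows are not reproduced here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE user_groups (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE link (user_id INTEGER, group_id INTEGER, added TEXT);
INSERT INTO users VALUES (1, 'John'), (2, 'Nellie');
INSERT INTO user_groups VALUES (1, 'Group 1'), (2, 'Group 2'), (3, 'Group 3');
INSERT INTO link VALUES
  (1, 1, '2017-01-01'),  -- John's first group
  (1, 2, '2017-01-05'),  -- John's most recent group
  (2, 1, '2017-01-02');
""")

rows = conn.execute("""
SELECT g.name, COUNT(u.name) AS member_count
FROM user_groups g
LEFT JOIN (
    -- Latest date per user, joined back to link to recover the group_id
    SELECT l2.user_id, l2.group_id
    FROM (SELECT user_id, MAX(added) AS last_added
          FROM link GROUP BY user_id) AS u_a
    JOIN link l2 ON l2.user_id = u_a.user_id AND l2.added = u_a.last_added
) AS most_recent_group ON most_recent_group.group_id = g.id
LEFT JOIN users u ON u.id = most_recent_group.user_id
GROUP BY g.id, g.name
ORDER BY g.name
""").fetchall()

for name, member_count in rows:
    print(name, member_count)
```

John is counted only in Group 2, and the empty Group 3 still appears with a count of 0 because of the LEFT JOIN.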
Note that this query can be simplified if you use a database with window functions (such as MariaDB 10.2). Then, you can use:
SELECT
user_groups.name AS group_name,
COUNT(u.name) AS member_count,
group_concat(u.name separator ', ') AS members
FROM
user_groups
LEFT JOIN
(
SELECT DISTINCT
user_id AS the_user_id,
-- Most recent group per user: the first row when ordered newest first
first_value(group_id) OVER (PARTITION BY user_id ORDER BY added DESC) AS group_id
FROM
link
) AS most_recent_group ON most_recent_group.group_id = user_groups.id
-- And get the users...
LEFT JOIN users u ON u.id = most_recent_group.the_user_id
GROUP BY
user_groups.id, user_groups.name
ORDER BY
user_groups.name ;
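For comparison, here is a sketch of another common window-function formulation, which swaps in ROW_NUMBER() and keeps only the newest row per user. It runs on SQLite (3.25 or later) through Python's sqlite3 module, not on MariaDB, and the link rows are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE link (user_id INTEGER, group_id INTEGER, added TEXT);
-- Hypothetical rows: user 1 moves from group 1 to group 2
INSERT INTO link VALUES
  (1, 1, '2017-01-01'),
  (1, 2, '2017-01-05'),
  (2, 1, '2017-01-02');
""")

latest = conn.execute("""
SELECT user_id, group_id
FROM (
    -- Number each user's rows from newest to oldest
    SELECT user_id, group_id,
           ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY added DESC) AS rn
    FROM link
) AS ranked
WHERE rn = 1          -- keep only the newest row per user
ORDER BY user_id
""").fetchall()

print(latest)
```

Filtering on rn = 1 does not depend on the window frame, which is the part of first_value/last_value that is easy to get wrong.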

Related

GROUP_CONCAT in sub-query based on specified values

User Table:
ID InstructionSets
1 123,124
Instruction Set Table:
ID Name
123 Learning SQL
124 Learning More SQL
Desired Query Result:
UserID SetID SetNames
1 123,124 Learning SQL,Learning More SQL
Current SQL:
SELECT U1.ID AS UserID, U1.InstructionSets AS SetID, (
SELECT GROUP_CONCAT(Name ORDER BY FIELD(I1.ID, U1.InstructionSets))
FROM Instructions I1
WHERE I1.ID IN (U1.InstructionSets)
) AS SetName
FROM Users U1
WHERE `ID` = 1
RESULT
UserID SetID SetNames
1 123,124 Learning SQL
As expected, if I remove the WHERE clause in the sub-query, all of the SetNames appear; but if I specify the required IDs, I only get the name associated with the first ID. Obviously, I also need to fetch the SetNames in the same order as the IDs. Hence ORDER BY in GROUP_CONCAT.
Also:
Is there a better approach (other than storing the user instruction set assignments in a separate table, which is overkill for this application)? Couldn't see how to use JOIN in this situation.
Is there a better title for this question?
Thanks.
Instead of IN use LIKE operator like this:
SELECT U1.ID AS UserID, U1.InstructionSets AS SetID, (
SELECT GROUP_CONCAT(Name ORDER BY (I1.ID))
FROM Instructions I1
WHERE CONCAT(',', U1.InstructionSets, ',') LIKE concat('%,', I1.ID, ',%')
) AS SetName
FROM Users U1
WHERE `ID` = 1
Results:
| UserID | SetID | SetName |
| ------ | ------- | ------------------------------ |
| 1 | 123,124 | Learning SQL,Learning More SQL |
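A small sketch of why both sides of the LIKE are wrapped in commas. It uses SQLite through Python's sqlite3 module (so `||` replaces CONCAT), and the row with ID 12 is a hypothetical decoy that is not part of the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Users (ID INTEGER PRIMARY KEY, InstructionSets TEXT);
CREATE TABLE Instructions (ID INTEGER PRIMARY KEY, Name TEXT);
INSERT INTO Users VALUES (1, '123,124');
INSERT INTO Instructions VALUES
  (12, 'Hypothetical decoy'),   -- not in the question; tests partial matches
  (123, 'Learning SQL'),
  (124, 'Learning More SQL');
""")

def names(condition):
    """Return instruction names for user 1 that satisfy the join condition."""
    sql = f"""
        SELECT I1.Name
        FROM Users U1 JOIN Instructions I1 ON {condition}
        WHERE U1.ID = 1
        ORDER BY I1.ID
    """
    return [row[0] for row in conn.execute(sql)]

# Unwrapped pattern: '12' is a substring of '123,124', so the decoy matches too.
naive = names("U1.InstructionSets LIKE '%' || I1.ID || '%'")

# Wrapped pattern: ',12,' does not occur in ',123,124,', so only exact IDs match.
wrapped = names("',' || U1.InstructionSets || ',' LIKE '%,' || I1.ID || ',%'")

print(naive)
print(wrapped)
```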
We can use FIND_IN_SET(). In this context, using FIELD() function doesn't make sense.
We can also use FIND_IN_SET() in the WHERE clause. (Function returns 0 when the string isn't found in the string list.)
e.g.
SELECT u.id AS userid
, u.instructionsets AS setid
, ( SELECT GROUP_CONCAT(i.name ORDER BY FIND_IN_SET(i.id, u.instructionsets))
FROM `Instructions` i
WHERE FIND_IN_SET(i.id, u.instructionsets)
) AS setname
FROM `Users` u
WHERE u.id = 1
Storing comma separated lists is an anti-pattern; a separate table isn't overkill.
Assuming id is unique in Users table, we could do a join operation with a GROUP BY
SELECT u.id AS userid
, MIN(u.instructionsets) AS setid
, GROUP_CONCAT(i.name ORDER BY FIND_IN_SET(i.id, u.instructionsets)) AS setname
FROM `Users` u
LEFT
JOIN `Instructions` i
ON FIND_IN_SET(i.id, u.instructionsets)
WHERE u.id = 1
GROUP BY u.id
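To make the "separate table" remark concrete, here is a sketch of one possible normalized layout, run on SQLite through Python's sqlite3 module. The UserInstructions table and its Position column are hypothetical names, not part of the question. The ordering is done with a plain ORDER BY and the names are joined in Python:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Users (ID INTEGER PRIMARY KEY);
CREATE TABLE Instructions (ID INTEGER PRIMARY KEY, Name TEXT);
-- Hypothetical link table: one row per assignment, with an explicit position
CREATE TABLE UserInstructions (
    UserID   INTEGER REFERENCES Users(ID),
    SetID    INTEGER REFERENCES Instructions(ID),
    Position INTEGER,
    PRIMARY KEY (UserID, SetID)
);
INSERT INTO Users VALUES (1);
INSERT INTO Instructions VALUES (123, 'Learning SQL'), (124, 'Learning More SQL');
-- Set 124 is deliberately placed first to show that order is independent of ID
INSERT INTO UserInstructions VALUES (1, 124, 1), (1, 123, 2);
""")

names = [row[0] for row in conn.execute("""
    SELECT i.Name
    FROM UserInstructions ui
    JOIN Instructions i ON i.ID = ui.SetID
    WHERE ui.UserID = ?
    ORDER BY ui.Position
""", (1,))]

set_names = ",".join(names)
print(set_names)
```

With this layout the join can use ordinary indexes on the ID columns, and no string parsing is needed.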

Count comments and get average rating from mysql

I just can't figure out how to get average rating and count comments from my mysql database.
I have 3 tables (activity, rating, comments): activity contains the main data (the "activities"), rating holds the ratings, and comments holds, of course, the comments.
activity_table
id | title |short_desc | long_desc | address | lat | long |last_updated
rating_table
id | activityid | userid | rating
comment_table
id | activityid | userid | rating
I'm now trying to get the data from activity plus the comment count and average rating in one query.
SELECT activity.*, AVG(rating.rating) as average_rating, count(comments.activityid) as total_comments
FROM activity LEFT JOIN
rating
ON activity.aid = rating.activityid LEFT JOIN
comments
ON activity.aid = comments.activityid
GROUP BY activity.aid
...doesn't do the job. It gives me the right average_rating, but the wrong amount of comments.
Any ideas?
Thanks a lot!
You are aggregating along two different dimensions. The Cartesian product generated by the joins affects the aggregation.
So, you should aggregate before the joins:
SELECT a.*, r.average_rating, COALESCE(c.total_comments, 0) as total_comments
FROM activity a LEFT JOIN
(SELECT r.activityid, AVG(r.rating) as average_rating
FROM rating r
GROUP BY r.activityid
) r
ON a.aid = r.activityid LEFT JOIN
(SELECT c.activityid, COUNT(*) as total_comments
FROM comments c
GROUP BY c.activityid
) c
ON a.aid = c.activityid;
Notice that the outer GROUP BY is no longer needed.
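The effect is easy to reproduce. The sketch below uses SQLite through Python's sqlite3 module with hypothetical rows (one activity, 2 ratings, 3 comments) and compares the double join with the pre-aggregated version:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE activity (aid INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE rating (id INTEGER PRIMARY KEY, activityid INTEGER, rating INTEGER);
CREATE TABLE comments (id INTEGER PRIMARY KEY, activityid INTEGER);
-- Hypothetical data: one activity with 2 ratings and 3 comments
INSERT INTO activity VALUES (1, 'Hiking');
INSERT INTO rating (activityid, rating) VALUES (1, 4), (1, 2);
INSERT INTO comments (activityid) VALUES (1), (1), (1);
""")

# Joining both child tables first yields 2 x 3 = 6 rows for the activity.
joined = conn.execute("""
SELECT AVG(r.rating), COUNT(c.activityid)
FROM activity a
LEFT JOIN rating r ON a.aid = r.activityid
LEFT JOIN comments c ON a.aid = c.activityid
GROUP BY a.aid
""").fetchone()

# Aggregating each child table on its own keeps one row per activity.
pre_aggregated = conn.execute("""
SELECT r.average_rating, COALESCE(c.total_comments, 0)
FROM activity a
LEFT JOIN (SELECT activityid, AVG(rating) AS average_rating
           FROM rating GROUP BY activityid) r ON a.aid = r.activityid
LEFT JOIN (SELECT activityid, COUNT(*) AS total_comments
           FROM comments GROUP BY activityid) c ON a.aid = c.activityid
""").fetchone()

print(joined)          # comment count inflated to 6
print(pre_aggregated)  # comment count is 3
```

The average survives the double join only because every rating row is repeated the same number of times. The count does not.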

Sql conditional count with join

I cannot find the answer to my problem here on stackoverflow. I have a query that spans 3 tables:
newsitem
+------+----------+----------+----------+--------+----------+
| Guid | Supplier | LastEdit | ShowDate | Title | Contents |
+------+----------+----------+----------+--------+----------+
newsrating
+----+----------+--------+--------+
| Id | NewsGuid | UserId | Rating |
+----+----------+--------+--------+
usernews
+----+----------+--------+----------+
| Id | NewsGuid | UserId | ReadDate |
+----+----------+--------+----------+
Newsitem obviously contains newsitems, newsrating contains ratings that users give to newsitems, and usernews contains the date when a user has read a newsitem.
In my query I want to get every newsitem, including the number of ratings for that newsitem and the average rating, and how many times that newsitem has been read by the current user.
What I have so far is:
select newsitem.guid, supplier, count(newsrating.id) as numberofratings,
avg(newsrating.rating) as rating,
count(case usernews.UserId when 3 then 1 else null end) as numberofreads from newsitem
left join newsrating on newsitem.guid = newsrating.newsguid
left join usernews on newsitem.guid = usernews.newsguid
group by newsitem.guid
I have created an sql fiddle here: http://sqlfiddle.com/#!9/c8add/8
Both count() calls don't return the numbers I want. numberofratings should return the total number of ratings for that newsitem (by all users). numberofreads should return the number of reads for the current user for that newsitem.
So, newsitem with guid d104c330-c319-40e8-8be3-a7c4f549d35c should have 2 ratings and 3 reads for the current user with userid = 3.
I have tried conditional counts and sums, but no success yet. How can this be accomplished?
The main problem that I see is that you're joining in both tables together, which means that you're going to effectively be multiplying out by both numbers, which is why your counts aren't going to be correct. For example, if the Newsitem has been read 3 times by the user and rated by 8 users then you're going to end up getting 24 rows, so it will look like it has been rated 24 times. You can add a DISTINCT to your COUNT of the ratings IDs and that should correct that issue. Average should be unaffected because the average of 1 and 2 is the same as the average of 1, 1, 2, & 2 (for example).
You can then handle the reads by adding the userid to the JOIN condition (since it's an OUTER JOIN it shouldn't cause any loss of results) instead of in a CASE statement for your COUNT, then you can do a COUNT on distinct id values from Usernews. The resulting query would be:
SELECT
I.guid,
I.supplier,
COUNT(DISTINCT R.id) AS number_of_ratings,
AVG(R.rating) AS avg_rating,
COUNT(DISTINCT UN.id) AS number_of_reads
FROM
NewsItem I
LEFT OUTER JOIN NewsRating R ON R.newsguid = I.guid
LEFT OUTER JOIN UserNews UN ON
UN.newsguid = I.guid AND
UN.userid = @userid
GROUP BY
I.guid,
I.supplier
While that should work, you might get better results from a subquery, as the above needs to explode out the results and then aggregate them, perhaps unnecessarily. Also, some people might find the below to be a little clearer.
SELECT
I.guid,
I.supplier,
R.number_of_ratings,
R.avg_rating,
COUNT(UN.id) AS number_of_reads
FROM
NewsItem I
LEFT OUTER JOIN
(
SELECT
newsguid,
COUNT(*) AS number_of_ratings,
AVG(rating) AS avg_rating
FROM
NewsRating
GROUP BY
newsguid
) R ON R.newsguid = I.guid
LEFT OUTER JOIN UserNews UN ON UN.newsguid = I.guid AND UN.userid = @userid
GROUP BY
I.guid,
I.supplier,
R.number_of_ratings,
R.avg_rating
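The COUNT(DISTINCT ...) variant, with the user filter placed in the join condition, can be checked on a small data set. This is a sketch on SQLite through Python's sqlite3 module. The rows are hypothetical, and a bound parameter stands in for the user variable:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE newsitem (guid TEXT PRIMARY KEY, supplier TEXT);
CREATE TABLE newsrating (id INTEGER PRIMARY KEY, newsguid TEXT,
                         userid INTEGER, rating INTEGER);
CREATE TABLE usernews (id INTEGER PRIMARY KEY, newsguid TEXT, userid INTEGER);
-- Hypothetical data: n1 has 2 ratings, 3 reads by user 3 and 1 read by user 4;
-- n2 has neither ratings nor reads.
INSERT INTO newsitem VALUES ('n1', 'acme'), ('n2', 'acme');
INSERT INTO newsrating (newsguid, userid, rating) VALUES ('n1', 3, 1), ('n1', 4, 2);
INSERT INTO usernews (newsguid, userid) VALUES ('n1', 3), ('n1', 3), ('n1', 3), ('n1', 4);
""")

current_user = 3
rows = conn.execute("""
SELECT i.guid,
       COUNT(DISTINCT r.id)  AS number_of_ratings,
       AVG(r.rating)         AS avg_rating,
       COUNT(DISTINCT un.id) AS number_of_reads
FROM newsitem i
LEFT JOIN newsrating r ON r.newsguid = i.guid
-- Filtering on the user here (not in WHERE) keeps items the user never read
LEFT JOIN usernews un ON un.newsguid = i.guid AND un.userid = ?
GROUP BY i.guid
ORDER BY i.guid
""", (current_user,)).fetchall()

for row in rows:
    print(row)
```

Item n1 produces 2 × 3 = 6 joined rows, yet the distinct counts report 2 ratings and 3 reads, and n2 is kept with zero counts.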
I'm with Tom: you should use a subquery to calculate the user count.
SELECT NI.guid,
NI.supplier,
COUNT(NR.ID) as numberofratings,
AVG(NR.rating) as rating,
user_read as numberofreads
FROM newsitem NI
LEFT JOIN newsrating NR
ON NI.guid = NR.newsguid
LEFT JOIN (SELECT NewsGuid, COUNT(*) user_read
FROM usernews
WHERE UserId = 3 -- use a variable @user_id here
GROUP BY NewsGuid) UR
ON NI.guid = UR.NewsGuid
GROUP BY NI.guid,
NI.supplier,
numberofreads;

How to optimize this MySQL query? (CROSS JOIN, subquery)

I have a challenging question for MySQL experts.
I have a users permissions system with 4 tables:
users (id | email | created_at)
permissions (id | responsibility_id | key | weight)
permission_user (id | permission_id | user_id)
responsibilities (id | key | weight)
Users can have any number of permissions assigned and any permission can be granted to any number of users (many to many). Responsibilities are like groups for permissions, each permission belongs to exactly one responsibility. For example, one permission is called update with responsibility of customers. Another one would be delete with orders responsibility.
I need to get a full map of permissions per user, but only for those who have at least one permission granted. Results should be ordered by:
User's number of permissions from most to least
User's created_at column, oldest first
Responsibility's weight
Permission's weight
Example result set:
user_id | responsibility | permission | granted
-----------------------------------------------
5 | customers | create | 1
5 | customers | update | 1
5 | orders | create | 1
5 | orders | update | 1
2 | customers | create | 0
2 | customers | delete | 0
2 | orders | create | 1
2 | orders | update | 0
Let's say I have 10 users in database, but only two of them have any permissions granted. There are 4 permissions in total:
create of customers responsibility
update of customers responsibility
create of orders responsibility
update of orders responsibility.
That's why we have 8 records in results (2 users with any permission × 4 permissions). User with id = 5 is displayed first, because he's got more permissions. If there were any draws, the ones with older created_at date would go first. Permissions are always sorted by the weight of their responsibility and then by their own weight.
My question is how to write an optimal query for this case. I have already written one myself and it works well:
SELECT `users`.`id` AS `user_id`,
`responsibilities`.`key` AS `responsibility`,
`permissions`.`key` AS `permission`,
!ISNULL(`permission_user`.`id`) AS `granted`
FROM `users`
CROSS JOIN `permissions`
JOIN `responsibilities`
ON `responsibilities`.`id` = `permissions`.`responsibility_id`
LEFT JOIN `permission_user`
ON `permission_user`.`user_id` = `users`.`id`
AND `permission_user`.`permission_id` = `permissions`.`id`
WHERE (
SELECT COUNT(*)
FROM `permission_user`
WHERE `user_id` = `users`.`id`
) > 0
ORDER BY (
SELECT COUNT(*)
FROM `permission_user`
WHERE `user_id` = `users`.`id`
) DESC,
`users`.`created_at` ASC,
`responsibilities`.`weight` ASC,
`permissions`.`weight` ASC
The problem is that I'm using the same subquery twice.
Can I do better? I count on you, MySQL experts!
--- EDIT ---
Thanks to Gordon Linoff's comment I made it use HAVING clause:
SELECT `users`.`email`,
`responsibilities`.`key`,
`permissions`.`key`,
!ISNULL(`permission_user`.`id`) as `granted`,
(
SELECT COUNT(*)
FROM `permission_user`
WHERE `user_id` = `users`.`id`
) AS `total_permissions`
FROM `users`
CROSS JOIN `permissions`
JOIN `responsibilities`
ON `responsibilities`.`id` = `permissions`.`responsibility_id`
LEFT JOIN `permission_user`
ON `permission_user`.`user_id` = `users`.`id`
AND `permission_user`.`permission_id` = `permissions`.`id`
HAVING `total_permissions` > 0
ORDER BY `total_permissions` DESC,
`users`.`created_at` ASC,
`responsibilities`.`weight` ASC,
`permissions`.`weight` ASC
I was surprised to discover that HAVING can go alone without GROUP BY.
Can it now be improved for better performance?
Probably the most efficient way to do this is:
SELECT u.email, r.`key`, p.`key`,
!ISNULL(pu.id) as `granted`
FROM (SELECT u.*,
(SELECT COUNT(*) FROM `permission_user` pu WHERE pu.user_id = u.id
) AS `total_permissions`
FROM `users` u
) u CROSS JOIN
permissions p JOIN
responsibilities r
ON r.id = p.responsibility_id LEFT JOIN
permission_user pu
ON pu.user_id = u.id AND
pu.permission_id = p.id
WHERE u.total_permissions > 0
ORDER BY u.total_permissions DESC,
u.created_at ASC,
r.weight ASC,
p.weight ASC;
This will run the subquery once per user, rather than once per user/permission combination (as both the modified query and the original query were doing). This has two costs. The first is the materialization of the subquery, so the data in the users table has to be read and written again. Probably not a big deal, given everything else in the query. The second is the loss of indexes on the users table. Once again, with a cross join, indexes are (probably) not being used, so this is also minor.
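A runnable sketch of this derived-table approach, using SQLite through Python's sqlite3 module instead of MySQL (so `IS NOT NULL` replaces `!ISNULL()`). The sample rows are hypothetical and loosely follow the example in the question. It selects the user id rather than the email to keep the output short:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, created_at TEXT);
CREATE TABLE responsibilities (id INTEGER PRIMARY KEY, `key` TEXT, weight INTEGER);
CREATE TABLE permissions (id INTEGER PRIMARY KEY, responsibility_id INTEGER,
                          `key` TEXT, weight INTEGER);
CREATE TABLE permission_user (id INTEGER PRIMARY KEY, permission_id INTEGER,
                              user_id INTEGER);
-- Hypothetical data: user 5 holds all four permissions, user 2 one, user 7 none
INSERT INTO users VALUES (2, 'a@example.com', '2016-01-01'),
                         (5, 'b@example.com', '2016-02-01'),
                         (7, 'c@example.com', '2016-03-01');
INSERT INTO responsibilities VALUES (1, 'customers', 1), (2, 'orders', 2);
INSERT INTO permissions VALUES (1, 1, 'create', 1), (2, 1, 'update', 2),
                               (3, 2, 'create', 1), (4, 2, 'update', 2);
INSERT INTO permission_user (permission_id, user_id)
VALUES (1, 5), (2, 5), (3, 5), (4, 5), (3, 2);
""")

rows = conn.execute("""
SELECT u.id, r.`key`, p.`key`, pu.id IS NOT NULL AS granted
FROM (SELECT usr.*,
             -- Evaluated once per user, before the cross join
             (SELECT COUNT(*) FROM permission_user cnt
              WHERE cnt.user_id = usr.id) AS total_permissions
      FROM users usr) u
CROSS JOIN permissions p
JOIN responsibilities r ON r.id = p.responsibility_id
LEFT JOIN permission_user pu
       ON pu.user_id = u.id AND pu.permission_id = p.id
WHERE u.total_permissions > 0
ORDER BY u.total_permissions DESC, u.created_at ASC, r.weight ASC, p.weight ASC
""").fetchall()

for row in rows:
    print(row)
```

User 7 has no permissions and is filtered out, so the result has 2 × 4 = 8 rows, with user 5 listed first.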

Count not giving right results with three joins

I have three tables (MySQL)
forum: each line in this table is a comment in the forum related to the match by static_id and related to the author by user_id
|match_static_id| date | time | comments | user_id |
matches: this table contains matches with all its information
| static_id | localteam_name | visitorteam_name | date | time |.......
iddaa : this table contains a code for each match (some matches do not have codes here)
|match_static_id| iddaa_code |
I wrote a query like the following:
SELECT forum.match_static_id, forum.date, forum.time,
count(forum.comments) 'comments_no', matches.*, users.username, iddaa.iddaa_code
FROM forum
INNER JOIN matches ON forum.match_static_id = matches.static_id
INNER JOIN users on forum.user_id = users.id
LEFT JOIN iddaa on forum.match_static_id = iddaa.match_static_id
GROUP BY forum.match_static_id
ORDER BY forum.date DESC, forum.time DESC
The query works as I want (I get the match information, the iddaa code for the match if there is one, and the author of the last comment).
The problem is in the count: I should get the number of comments related to each match, but the query returns double each value.
For example, if I have 5 comments for a match, it returns 10.
I want to know whether all parts of my query are right. Any help would be appreciated.
Maybe it can be wrapped in a subquery? It's hard to tell without the table definitions and data.
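SELECT Sub.*, iddaa.iddaa_code
FROM
(
-- Count the comments per match here, before iddaa is joined, in case
-- iddaa holds more than one row per match and inflates the count
SELECT forum.match_static_id, forum.date, forum.time,
COUNT(forum.comments) AS comments_no,
-- matches.* would clash with forum.date / forum.time in a derived table
matches.localteam_name, matches.visitorteam_name, users.username
FROM forum
INNER JOIN matches ON forum.match_static_id = matches.static_id
INNER JOIN users on forum.user_id = users.id
GROUP BY forum.match_static_id
) Sub
LEFT JOIN iddaa on Sub.match_static_id = iddaa.match_static_id
ORDER BY Sub.date DESC, Sub.time DESC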