MySQL Frequency of frequency report - mysql

Given many users have many posts
I have a table of posts that has a foreign key user_id
I want to generate a report that shows the frequency of users against frequency of posts
e.g.
3 users wrote 2 posts each
2 users wrote 1 post each
1 user wrote 4 posts
Number of users | Number of posts
--------------- | ------------------
1 | 4
2 | 1
3 | 2
My attempt:
SELECT inner_table.frequency_posts,
Count(*) AS frequency_users
FROM posts
INNER JOIN (SELECT user_id,
Count(*) AS frequency_posts
FROM posts
GROUP BY user_id) AS inner_table
ON posts.user_id = inner_table.user_id
GROUP BY inner_table.frequency_posts
I think frequency_posts is working, but counting frequency_users isn't giving the right values: when I look at the inner select on its own and manually add up the posts, I don't get the same values.

You have to use GROUP BY twice: once to count the posts per user, and once more to count the users per post count:
SELECT
COUNT(*) AS NumberOfUsers,
foo.NumberOfPosts
FROM
(SELECT
p.user_id AS UserId,
COUNT(*) AS NumberOfPosts
FROM
posts AS p
GROUP BY UserId) as foo
GROUP BY foo.NumberOfPosts
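To see why grouping twice works where the join in the question does not, here is a minimal sketch that runs both queries against an in-memory SQLite database. The sample rows are an assumption built to match the example in the question (3 users with 2 posts, 2 users with 1 post, 1 user with 4 posts); SQLite stands in for MySQL here, and these two queries use only syntax common to both.

```python
import sqlite3

# Hypothetical sample data matching the example in the question:
# user_id -> number of posts written by that user.
posts_per_user = {1: 2, 2: 2, 3: 2, 4: 1, 5: 1, 6: 4}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL)")
for user_id, n in posts_per_user.items():
    conn.executemany("INSERT INTO posts (user_id) VALUES (?)", [(user_id,)] * n)

# Inner query: posts per user. Outer query: users per post count.
grouped_twice = conn.execute("""
    SELECT COUNT(*) AS number_of_users, per_user.number_of_posts
    FROM (SELECT user_id, COUNT(*) AS number_of_posts
          FROM posts
          GROUP BY user_id) AS per_user
    GROUP BY per_user.number_of_posts
    ORDER BY per_user.number_of_posts
""").fetchall()

# The attempt from the question: joining back to posts produces one row
# per post, so COUNT(*) counts posts rather than users.
joined_back = conn.execute("""
    SELECT inner_table.frequency_posts, COUNT(*) AS frequency_users
    FROM posts
    INNER JOIN (SELECT user_id, COUNT(*) AS frequency_posts
                FROM posts
                GROUP BY user_id) AS inner_table
      ON posts.user_id = inner_table.user_id
    GROUP BY inner_table.frequency_posts
    ORDER BY inner_table.frequency_posts
""").fetchall()

print(grouped_twice)  # [(2, 1), (3, 2), (1, 4)]
print(joined_back)    # [(1, 2), (2, 6), (4, 4)]
```

In the joined version, each "frequency_users" value is the number of users multiplied by the post count (for example, 3 users x 2 posts = 6), which is the inflated figure described in the question.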

Related

Joining 3 tables and using SUM to calculate ratings of forum posts belonging to one thread

I'm doing a simple discussion board for my school project. I have these tables to store users, posts and ratings for those posts (I'll leave out columns and tables that are insignificant for this question).
+------------+
| User |
+------------+
|PK| id_user|
+------------+
| username|
| profile_pic|
+------------+
+------------+
| Post |
+------------+
|PK| id_post|
+------------+
| id_user|
| id_thread|
| content|
| date_posted|
| deleted|
+------------+
+------------+
|Post_ratings|
+------------+
|PK| id_voter|
|PK| id_post|
+------------+
| rating|
+------------+
What I want to do is select all rows from the Post table with a specific id_thread, join it with the User table to select the username and profile_pic of the poster of each post, and a sum of ratings given to each post, so the columns of the result should be id_post, id_user, content, date_posted, deleted, username, profile_pic, and rating.
I managed to come up only with this sloppy query:
SELECT * FROM Post p LEFT JOIN
(SELECT id_user, username, profile_pic FROM User) u
ON p.id_user = u.id_user LEFT JOIN
(SELECT id_post redundant, SUM(rating) rating FROM Post_ratings) pr
ON p.id_post = pr.redundant
WHERE id_thread = 5 AND deleted = 0
ORDER BY date_posted
This does return all posts belonging to one thread, but it shows post's ID twice (the redundant column) and displays SUM of all ratings across all threads in a row with the lowest id_post, shows NULL in other rows.
If anyone can help, thank you in advance.
I would use a join to associate the users and the posts table, and a subquery to sum the ratings:
select p.*, u.username, u.profile_pic,
(select sum(pr.rating) from Post_ratings pr where pr.id_post = p.id_post) as rating
from Post p
inner join User u on u.id_user = p.id_user
where p.id_thread = 5 and p.deleted = 0
Of course, you can also use outer aggregation (with a left join on the ratings, so that posts without any rating are kept):
select p.*, u.username, u.profile_pic, sum(pr.rating) as rating
from Post p
inner join User u on u.id_user = p.id_user
left join Post_ratings pr on pr.id_post = p.id_post
where p.id_thread = 5 and p.deleted = 0
group by p.id_post, u.id_user
MySQL understands functionally dependent columns (since version 5.7.5), so it is sufficient to put the primary keys of the post and user tables in the group by clause.
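As a rough check of the two approaches, the sketch below runs the correlated subquery and the join-plus-aggregation against an in-memory SQLite database. The schema follows the question; all rows are hypothetical, and post 12 deliberately has no ratings. The join type on the ratings table is a parameter so the effect of an inner join versus a left join is visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE User (id_user INTEGER PRIMARY KEY, username TEXT, profile_pic TEXT);
    CREATE TABLE Post (id_post INTEGER PRIMARY KEY, id_user INTEGER, id_thread INTEGER,
                       content TEXT, date_posted TEXT, deleted INTEGER);
    CREATE TABLE Post_ratings (id_voter INTEGER, id_post INTEGER, rating INTEGER,
                               PRIMARY KEY (id_voter, id_post));
    -- Hypothetical rows: post 12 has no ratings, post 13 is in another thread.
    INSERT INTO User VALUES (1, 'ann', 'a.png'), (2, 'bob', 'b.png');
    INSERT INTO Post VALUES
        (10, 1, 5, 'first',  '2020-01-01', 0),
        (11, 2, 5, 'second', '2020-01-02', 0),
        (12, 1, 5, 'third',  '2020-01-03', 0),
        (13, 2, 6, 'other',  '2020-01-04', 0);
    INSERT INTO Post_ratings VALUES (1, 10, 1), (2, 10, 1), (1, 11, -1), (1, 13, 1);
""")

# Approach 1: correlated subquery that sums the ratings of each post.
subquery_rows = conn.execute("""
    SELECT p.id_post, u.username,
           (SELECT SUM(pr.rating) FROM Post_ratings pr
            WHERE pr.id_post = p.id_post) AS rating
    FROM Post p
    INNER JOIN User u ON u.id_user = p.id_user
    WHERE p.id_thread = 5 AND p.deleted = 0
    ORDER BY p.date_posted
""").fetchall()

# Approach 2: join the ratings and aggregate in the outer query.
aggregate_sql = """
    SELECT p.id_post, u.username, SUM(pr.rating) AS rating
    FROM Post p
    INNER JOIN User u ON u.id_user = p.id_user
    {join_type} JOIN Post_ratings pr ON pr.id_post = p.id_post
    WHERE p.id_thread = 5 AND p.deleted = 0
    GROUP BY p.id_post, u.id_user
    ORDER BY p.date_posted
"""
left_join_rows = conn.execute(aggregate_sql.format(join_type="LEFT")).fetchall()
inner_join_rows = conn.execute(aggregate_sql.format(join_type="INNER")).fetchall()

print(subquery_rows)    # [(10, 'ann', 2), (11, 'bob', -1), (12, 'ann', None)]
print(left_join_rows)   # same rows as the subquery version
print(inner_join_rows)  # [(10, 'ann', 2), (11, 'bob', -1)]
```

With an inner join on the ratings table, the unrated post 12 drops out of the result; the left join matches the subquery version and reports its rating as NULL.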

How to write an SQL count query based on 3 tables

I have three tables and I'm trying to count the number of likes per user across all of his/her posts.
USER TABLE
id name
1 John
2 Joe
POSTS TABLE
id user_id post_title
1 1 Some Title
2 1 Another Title
3 2 Yeah Title
LIKES TABLE
id post_id
1 1
2 1
3 1
4 2
5 3
My expected output is
ID LIKES
1 4
2 1
I'm kind of stuck with the code below. I don't know how to add and count the likes table.
SELECT *
FROM user
INNER JOIN posts
ON user.id = posts.user_id;
You need to extend the join to the LIKES table and then use GROUP BY to group by the user ID and COUNT() all of the records for that user...
SELECT user.id, COUNT(likes.id)
FROM user
INNER JOIN posts ON user.id = posts.user_id
INNER JOIN likes ON posts.id = likes.post_id
GROUP BY user.id
If you want to list people who don't have posts or likes, then you should use outer joins (so change INNER JOIN to LEFT JOIN) so that these users show up.
For your desired result, you don't need the user table. You can simply do:
SELECT p.user_id, COUNT(*)
FROM posts p JOIN
likes l
ON l.post_id = p.id
GROUP BY p.user_id;
The only information you are taking from users is the id, which is already in posts. This assumes that all the user_id values in posts are valid, but that seems like a very reasonable assumption.
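The inner-join and left-join variants discussed above can be compared with a small sketch on an in-memory SQLite database. The rows are the ones from the question, plus one assumed extra user (Jane, id 3) who has no posts, added only to show what the outer joins change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER, post_title TEXT);
    CREATE TABLE likes (id INTEGER PRIMARY KEY, post_id INTEGER);
    -- Jane (id 3) is a hypothetical user with no posts; the rest is from the question.
    INSERT INTO user  VALUES (1, 'John'), (2, 'Joe'), (3, 'Jane');
    INSERT INTO posts VALUES (1, 1, 'Some Title'), (2, 1, 'Another Title'), (3, 2, 'Yeah Title');
    INSERT INTO likes VALUES (1, 1), (2, 1), (3, 1), (4, 2), (5, 3);
""")

# Inner joins: only users who have at least one liked post appear.
inner_rows = conn.execute("""
    SELECT user.id, COUNT(likes.id) AS likes
    FROM user
    INNER JOIN posts ON user.id = posts.user_id
    INNER JOIN likes ON posts.id = likes.post_id
    GROUP BY user.id
    ORDER BY user.id
""").fetchall()

# Left joins: every user appears; COUNT(likes.id) skips NULLs, so users
# without posts or likes get 0 instead of disappearing.
left_rows = conn.execute("""
    SELECT user.id, COUNT(likes.id) AS likes
    FROM user
    LEFT JOIN posts ON user.id = posts.user_id
    LEFT JOIN likes ON posts.id = likes.post_id
    GROUP BY user.id
    ORDER BY user.id
""").fetchall()

print(inner_rows)  # [(1, 4), (2, 1)]
print(left_rows)   # [(1, 4), (2, 1), (3, 0)]
```

Counting `likes.id` rather than `*` matters in the left-join version: `COUNT(*)` would report 1 for a user with no posts, because the outer join still produces one row for that user.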

How to join the same table more than once to select different columns?

I need to find out users who have either made or received a booking.
I have two tables that look like this:
Users:
+----+
| id |
+----+
| 1 |
| 2 |
| 3 |
+----+
Bookings:
+----+-----+-----+
| id | rid | oid |
+----+-----+-----+
| 1 | 1 | 2 |
| 2 | 2 | 1 |
| 3 | 3 | 4 |
+----+-----+-----+
A booking has two users, a 'rider' (rid), and an 'owner' (oid).
The rider and owner can't be the same for each booking but riders can also be owners.
My output should be a list of user IDs that correspond with users who have made or received a booking.
So far I have written
select u.id, b1.rid, b2.oid
from users u
left join bookings b1
on u.id = b1.rid
left join bookings b2
on u.id = b2.oid;
And various other permutations, but I'm not getting the desired result. Any help would be appreciated.
You want all user IDs that are either in Bookings.rid or Bookings.oid. So you could do something like:
select
users.id
from
users
where
users.id in (select bookings.rid from bookings)
or
users.id in (select bookings.oid from bookings);
You should be able to utilize a UNION clause here.
You don't define what the "time window" is, though, so I am not sure we can come up with a complete solution for you. Try something like the following:
SELECT
users.id,
bookings.rid,
bookings.oid
FROM
users
LEFT JOIN bookings ON users.id = bookings.rid
UNION ALL
SELECT
users.id,
bookings.rid,
bookings.oid
FROM
users
LEFT JOIN bookings ON users.id = bookings.oid
My output should be a list of user IDs that correspond with users who have made or received a booking.
To do that, you only need to look at the bookings table :
SELECT rid AS id FROM bookings
UNION SELECT oid FROM bookings
UNION (without ALL) removes duplicate rows, both within each query and across the two, so no DISTINCT is needed. UNION ALL would keep the duplicates.
If you are looking to filter by time frame:
SELECT rid AS id FROM bookings WHERE some_date BETWEEN :start_date AND :end_date
UNION SELECT oid FROM bookings WHERE some_date BETWEEN :start_date AND :end_date
Where some_date is the column that contains the booking date, and :start_date and :end_date are the beginning and the end of the date interval.
I guess there is a name column in Users table.
If you want this too then:
select users.id, users.name from (
select rid userid from bookings
union
select oid userid from bookings
) t inner join users
on users.id = t.userid
group by users.id, users.name
If not you only need to scan the bookings table:
select distinct userid from (
select rid userid from bookings
union
select oid userid from bookings
) t
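The difference between UNION and UNION ALL, and between scanning only bookings and joining back to users, can be checked on the sample rows from the question. This is a minimal sketch using an in-memory SQLite database in place of MySQL; the variable names are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE bookings (id INTEGER PRIMARY KEY, rid INTEGER, oid INTEGER);
    -- Rows taken from the question; note that owner 4 has no row in users.
    INSERT INTO users VALUES (1), (2), (3);
    INSERT INTO bookings VALUES (1, 1, 2), (2, 2, 1), (3, 3, 4);
""")

# UNION removes duplicates within and across both selects.
union_ids = [row[0] for row in conn.execute(
    "SELECT rid AS id FROM bookings UNION SELECT oid FROM bookings ORDER BY id")]

# UNION ALL keeps every row, so users who are both rider and owner repeat.
union_all_ids = [row[0] for row in conn.execute(
    "SELECT rid AS id FROM bookings UNION ALL SELECT oid FROM bookings ORDER BY id")]

# Joining back to users restricts the result to IDs that exist in users.
known_user_ids = [row[0] for row in conn.execute("""
    SELECT users.id
    FROM (SELECT rid AS userid FROM bookings
          UNION
          SELECT oid AS userid FROM bookings) AS t
    INNER JOIN users ON users.id = t.userid
    ORDER BY users.id
""")]

print(union_ids)       # [1, 2, 3, 4]
print(union_all_ids)   # [1, 1, 2, 2, 3, 4]
print(known_user_ids)  # [1, 2, 3]
```

On this data the bookings-only scan also returns ID 4, which appears as an owner but is not in the users table, so whether the join is needed depends on whether such orphaned IDs should be reported.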

Selecting a count of rows having a max value

Working example: http://sqlfiddle.com/#!9/80995/20
I have three tables, a user table, a user_group table, and a link table.
The link table contains the dates that users were added to user groups. I need a query that returns the count of users currently in each group. The most recent date determines the group that the user is currently in.
SELECT
user_groups.name,
COUNT(l.name) AS ct,
GROUP_CONCAT(l.`name` separator ", ") AS members
FROM user_groups
LEFT JOIN
(SELECT MAX(added), group_id, name FROM link LEFT JOIN users ON users.id = link.user_id GROUP BY user_id) l
ON l.group_id = user_groups.id
GROUP BY user_groups.id
My question is if the query I have written could be optimized, or written better.
Thanks!
Ben
Your actual query is not giving you the answer you want; at least, as far as I understand your question. John actually joined group 2 on 2017-01-05, yet he appears in group 1 (which he joined on 2017-01-01) in your results. Note also that Group 4 is missing.
Using standard SQL, I think the following query is what you're looking for. The comments in the query should clarify what each part is doing:
SELECT
user_groups.name AS group_name,
COUNT(u.name) AS member_count,
group_concat(u.name separator ', ') AS members
FROM
user_groups
LEFT JOIN
(
SELECT * FROM
(-- For each user, find most recent date s/he got into a group
SELECT
user_id AS the_user_id, MAX(added) AS last_added
FROM
link
GROUP BY
the_user_id
) AS u_a
-- Join back to the link table, so that the `group_id` can be retrieved
JOIN link l2 ON l2.user_id = u_a.the_user_id AND l2.added = u_a.last_added
) AS most_recent_group ON most_recent_group.group_id = user_groups.id
-- And get the users...
LEFT JOIN users u ON u.id = most_recent_group.the_user_id
GROUP BY
user_groups.id, user_groups.name
ORDER BY
user_groups.name ;
This can be written more compactly in MySQL (abusing the fact that older versions of MySQL don't enforce the SQL standard's GROUP BY restrictions).
That's what you'll get:
group_name | member_count | members
:--------- | -----------: | :-------------
Group 1 | 2 | Mikie, Dominic
Group 2 | 2 | John, Paddy
Group 3 | 0 | null
Group 4 | 1 | Nellie
Note that this query can be simplified if you use a database with window functions (such as MariaDB 10.2 or MySQL 8.0). Then, you can use:
SELECT
user_groups.name AS group_name,
COUNT(u.name) AS member_count,
group_concat(u.name separator ', ') AS members
FROM
user_groups
LEFT JOIN
(
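-- first_value over a descending order picks the most recently added group;
-- DISTINCT collapses the result to one row per user
SELECT DISTINCT
user_id AS the_user_id,
first_value(group_id) OVER (PARTITION BY user_id ORDER BY added DESC) AS group_id
FROM
link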
) AS most_recent_group ON most_recent_group.group_id = user_groups.id
-- And get the users...
LEFT JOIN users u ON u.id = most_recent_group.the_user_id
GROUP BY
user_groups.id, user_groups.name
ORDER BY
user_groups.name ;
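The "find each user's latest date, then join back to link" idea can be tried out with the sketch below. The original fiddle data is not reproduced in this thread, so the rows here are an assumption chosen to be consistent with the result table above (John joins Group 1 on 2017-01-01 and Group 2 on 2017-01-05, and so on). SQLite stands in for MySQL, which is why `group_concat` takes the separator as a second argument instead of the `separator` keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE user_groups (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE link (user_id INTEGER, group_id INTEGER, added TEXT);
    -- Hypothetical rows, consistent with the result table in the answer.
    INSERT INTO users VALUES (1, 'John'), (2, 'Mikie'), (3, 'Dominic'),
                             (4, 'Paddy'), (5, 'Nellie');
    INSERT INTO user_groups VALUES (1, 'Group 1'), (2, 'Group 2'),
                                   (3, 'Group 3'), (4, 'Group 4');
    INSERT INTO link VALUES
        (1, 1, '2017-01-01'), (1, 2, '2017-01-05'),
        (2, 1, '2017-01-02'),
        (3, 3, '2017-01-01'), (3, 1, '2017-01-03'),
        (4, 2, '2017-01-04'),
        (5, 4, '2017-01-06');
""")

rows = conn.execute("""
    SELECT g.name AS group_name,
           COUNT(u.name) AS member_count,
           group_concat(u.name, ', ') AS members
    FROM user_groups g
    LEFT JOIN (
        -- Keep only the link row carrying each user's most recent date.
        SELECT l.user_id, l.group_id
        FROM link l
        JOIN (SELECT user_id, MAX(added) AS last_added
              FROM link
              GROUP BY user_id) AS newest
          ON newest.user_id = l.user_id AND newest.last_added = l.added
    ) AS latest_link ON latest_link.group_id = g.id
    LEFT JOIN users u ON u.id = latest_link.user_id
    GROUP BY g.id, g.name
    ORDER BY g.name
""").fetchall()

member_counts = {name: count for name, count, _ in rows}
members = {name: sorted(m.split(", ")) if m else [] for name, _, m in rows}
print(member_counts)
print(members)
```

Dominic's older membership of Group 3 is discarded by the join-back, so Group 3 still appears (because of the left join from user_groups) but with a count of 0. If a user could have two link rows with the same latest date, the join-back would keep both; that case is not handled here.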

SQL syntax to retrieve votes strangely misses some results

I've looked at other posts to work out a MySQL query. I think I'm almost there, but for some reason I have a bug.
I have those tables :
POSTS [id_post (int) / activity (int) ]
VOTES [id_vote (int) / id_post (int) / user_id (int)]
ACTIVITIES [id_activity (int) / activity_name (varchar)]
I have in fact more tables and fields than that, but these are the relevant ones for my problem. I created a vote system that adds a user vote to the VOTES table, referring to the post being voted on and the user account of the voter.
So the votes table may look like this :
id_vote | id_post | user_id
1 5 8
2 6 8
Every post belongs to an activity. I would like to list the posts which have the most votes for each activity.
That's where I am so far :
SELECT activities.id_activity, activities.activity_name, votes.id_post, votes.totalvotes AS allvotes
FROM ( SELECT votes.id_post, COUNT(*) as totalvotes
FROM `votes`
GROUP BY votes.id_post
) AS votes
JOIN posts ON posts.id_post = votes.id_post
JOIN activities ON posts.activity = activities.id_activity
GROUP BY activities.id_activity
HAVING allvotes = MAX(totalvotes)
This works well and I retrieve what I want, except that if 2 posts in the same activity have the same number of votes, only one of them appears after grouping, and I have no idea which one it will be or why it's not the other.
id_activity | activity_name | id_post | allvotes
5 eating 5 2
3 sleeping 6 1
More importantly, what is really bugging me is that some activities won't show up. I noticed that, in the above example, if post 5 in the eating category is indeed the one with the most votes, then the category appears. BUT if the post with the most votes in the eating category IS NOT post 5 (the one MySQL decided to show by default), then the whole row for the eating category just WON'T SHOW UP at all.
It's been 2 days I'm on this...
Any ideas ?
Thanks a bunch.
It is a little complex to implement in a single query; it would be easier with some temporary tables. If you want it in a single query, then you can try with a row number.
example:
SELECT id_activity,activity_name,id_post,v_count FROM(
SELECT id_activity,activity_name,id_post,v_count
,@rownum := IF(@prev_value=id_activity,@rownum+1,1) AS RowNumber
,@prev_value := id_activity
FROM (
select a.id_activity,a.activity_name,p.id_post,v.v_count
from activities a
left join posts p on p.activity=a.id_activity
left join (
select id_post,count(*) v_count from votes v1
group by id_post
) v on v.id_post=p.id_post
order by a.id_activity, v.v_count desc) as tmp
,(SELECT @rownum := 0) r
,(SELECT @prev_value := '') y
)tmp2
WHERE rownumber=1
I guess it's because you group by the activity. I wonder why this statement is allowed in MySQL; in MSSQL you are not allowed to group only by activities.id_activity in this case, you would have to group by activity and/or id_post.
Try this:
SELECT activities.id_activity, activities.activity_name, votes.id_post, COUNT(*) as totalvotes
FROM votes
INNER JOIN posts ON posts.id_post = votes.id_post
INNER JOIN activities ON posts.activity = activities.id_activity
GROUP BY votes.id_post
HAVING totalvotes = MAX(totalvotes)
Let me know if it works :)
Or your statement, modified:
SELECT activities.id_activity, activities.activity_name, votes.id_post, votes.totalvotes AS allvotes
FROM ( SELECT votes.id_post, COUNT(*) as totalvotes
FROM `votes`
GROUP BY votes.id_post
) AS votes
JOIN posts ON posts.id_post = votes.id_post
JOIN activities ON posts.activity = activities.id_activity
GROUP BY votes.id_post
HAVING allvotes = MAX(totalvotes)
If I understand you correctly, it's because you have 2 posts with the same activity: as soon as you group by activity and not by post, one post disappears.
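For comparison, here is a sketch of a different technique from the ones in the answers above: count the votes per post, compute the maximum count per activity, and join the two so that every post tied for the top spot is returned. It is not taken from any of the answers; the rows are hypothetical (posts 5 and 7 tie in the eating activity), and SQLite stands in for MySQL. Only derived tables are used, so no window functions or user variables are needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE activities (id_activity INTEGER PRIMARY KEY, activity_name TEXT);
    CREATE TABLE posts (id_post INTEGER PRIMARY KEY, activity INTEGER);
    CREATE TABLE votes (id_vote INTEGER PRIMARY KEY, id_post INTEGER, user_id INTEGER);
    -- Hypothetical rows: posts 5 and 7 tie with 2 votes each, post 8 has none.
    INSERT INTO activities VALUES (5, 'eating'), (3, 'sleeping');
    INSERT INTO posts VALUES (5, 5), (7, 5), (6, 3), (8, 3);
    INSERT INTO votes VALUES (1, 5, 8), (2, 5, 9), (3, 7, 8), (4, 7, 9), (5, 6, 8);
""")

# Votes per post, tagged with the post's activity.
per_post_sql = """
    SELECT p.activity, p.id_post, COUNT(*) AS totalvotes
    FROM votes v
    JOIN posts p ON p.id_post = v.id_post
    GROUP BY p.activity, p.id_post
"""

# Join each post's count to the best count of its activity; ties all match.
top_posts = conn.execute(f"""
    SELECT a.id_activity, a.activity_name, pv.id_post, pv.totalvotes
    FROM activities a
    JOIN ({per_post_sql}) AS pv ON pv.activity = a.id_activity
    JOIN (SELECT activity, MAX(totalvotes) AS maxvotes
          FROM ({per_post_sql}) AS counted
          GROUP BY activity) AS best
      ON best.activity = pv.activity AND best.maxvotes = pv.totalvotes
    ORDER BY a.id_activity, pv.id_post
""").fetchall()

print(top_posts)
# [(3, 'sleeping', 6, 1), (5, 'eating', 5, 2), (5, 'eating', 7, 2)]
```

Because the comparison is done with a join rather than with HAVING on a grouped row, no activity disappears when the arbitrarily chosen post is not the top one. Activities whose posts have no votes at all are omitted by this sketch.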