MySQL: where count is higher than average - mysql

I want to select posts from users whose number of specific followers is higher than the overall average (compared to other users).
The problem is that when I use AVG(), it limits the number of posts/users coming through. I can't use GROUP BY j.id either, because it breaks the average and WHERE j2.fCount >= j2.oAvg stops working properly.
Here's my code:
SELECT * FROM (
SELECT j.*, ROUND(AVG(j.fCount)) as oAvg
FROM (
SELECT p.id , COUNT(fCount.id) as fCount
FROM `post` p
LEFT JOIN `table` table ON ...
LEFT JOIN `user` user ON ....
LEFT JOIN `follow` fCount ON fCount.user_id=user.id AND fCount.follow_id=table.ids
WHERE p.user_id=fCount.user_id
group by p.id
) j
-- GROUP BY j.id here breaks the average
) j2
WHERE j2.fCount >= j2.oAvg
Thank you :)

Because you're trying to compare against the average, you might have to run your inner query twice, like this:
SELECT *,
(SELECT AVG(fCount) as average FROM
(SELECT COUNT(fCount.id) as fCount
FROM post p
LEFT JOIN follow fCount ON fCount.user_id = p.user_id
GROUP BY p.id
)j1
)as average
FROM
(SELECT p2.id, COUNT(fCount2.id) as fCount
FROM post p2
LEFT JOIN follow fCount2 ON fCount2.user_id = p2.user_id
GROUP BY p2.id
)j2
HAVING fCount >= average
sqlfiddle
Just replace the inner queries of j1 and j2 with your j.
If you want to run the inner query only once, you can use user-defined variables to total up the counts and divide by the number of rows to calculate the average yourself, like this:
SELECT id,fCount,@sum/@count as average
FROM
(SELECT id,
fCount,
@sum := @sum + fCount as total,
@count := @count + 1 as posts
FROM
(SELECT p.id,COUNT(fCount.id) as fCount
FROM post p
LEFT JOIN follow fCount ON fCount.user_id = p.user_id
GROUP BY p.id
)j,
(SELECT @sum:=0.0,@count:=0.0)initialize
)T
HAVING fCount >= average
sqlfiddle
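For anyone who wants to try the "compare each count to the average" idea without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module. The table contents are made up for illustration, and SQLite stands in for MySQL, so treat it as a demonstration of the query shape rather than of MySQL-specific behaviour. It puts the average in a scalar subquery in the WHERE clause instead of in HAVING.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE post   (id INTEGER PRIMARY KEY, user_id INTEGER);
    CREATE TABLE follow (id INTEGER PRIMARY KEY, user_id INTEGER);
    INSERT INTO post (id, user_id) VALUES (1, 10), (2, 20), (3, 30);
    -- hypothetical data: user 10 has 3 followers, user 20 has 1, user 30 has none
    INSERT INTO follow (user_id) VALUES (10), (10), (10), (20);
""")

# Per-post follower counts, written once and reused for both
# the overall average and the outer filter.
per_post = """
    SELECT p.id, COUNT(f.id) AS fCount
    FROM post p
    LEFT JOIN follow f ON f.user_id = p.user_id
    GROUP BY p.id
"""

# Keep only the posts whose count is at least the average of all counts.
rows = conn.execute(f"""
    SELECT j.id, j.fCount
    FROM ({per_post}) AS j
    WHERE j.fCount >= (SELECT AVG(fCount) FROM ({per_post}) AS a)
    ORDER BY j.id
""").fetchall()

print(rows)
```

With this sample data the counts are 3, 1 and 0, so the average is 4/3 and only post 1 passes the filter.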

Related

How to properly join these three tables in SQL?

I'm currently creating a small application where users can post a text which can be commented on, and the post can also be voted on (+1 or -1).
This is my database:
Now I want to select all information of all posts with status = 1 plus two extra columns: One column containing the count of comments and one column containing the sum (I call it score) of all votes.
I currently use the following query, which correctly adds the count of the comments:
SELECT *, COUNT(comments.fk_commented_post) as comments
FROM posts
LEFT JOIN comments
ON posts.id_post = comments.fk_commented_post
AND comments.status = 1
WHERE posts.status = 1
GROUP BY posts.id_post
Then I tried to additionally add the sum of the votes, using the following query:
SELECT *, COUNT(comments.fk_commented_post) as comments, SUM(votes_posts.type) as score
FROM posts
LEFT JOIN comments
ON posts.id_post = comments.fk_commented_post
AND comments.status = 1
LEFT JOIN votes_posts
ON posts.id_post = votes_posts.fk_voted_post
WHERE posts.status = 1
GROUP BY posts.id_post
The result is no longer correct for either the votes or the comments. Somehow some of the values seem to be getting multiplied...
This is probably simpler using correlated subqueries:
select p.*,
(select count(*)
from comments c
where c.fk_commented_post = p.id_post and c.status = 1
) as num_comments,
(select sum(vp.type)
from votes_posts vp
where vp.fk_voted_post = p.id_post
) as num_score
from posts p
where p.status = 1;
The problem with the join is that the counts get messed up because the two other tables are not related to each other, so you get a Cartesian product.
You want to join comments counts and votes counts to the posts. So, aggregate to get the counts, then join.
select
p.*,
coalesce(c.cnt, 0) as comments,
coalesce(v.cnt, 0) as votes
from posts p
left join
(
select fk_commented_post as id_post, count(*) as cnt
from comments
where status = 1
group by fk_commented_post
) c on c.id_post = p.id_post
left join
(
select fk_voted_post as id_post, count(*) as cnt
from votes_posts
group by fk_voted_post
) v on v.id_post = p.id_post
where p.status = 1
order by p.id_post;
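To see the multiplication effect and the "aggregate first, then join" fix side by side, here is a small sketch with Python's sqlite3 module. The rows are invented (one post, two comments, three votes), SQLite stands in for MySQL, and the vote subquery uses SUM(type) because the question asks for a score, which is an assumption that differs from the vote count in the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts       (id_post INTEGER PRIMARY KEY, status INTEGER);
    CREATE TABLE comments    (fk_commented_post INTEGER, status INTEGER);
    CREATE TABLE votes_posts (fk_voted_post INTEGER, type INTEGER);
    INSERT INTO posts VALUES (1, 1);
    INSERT INTO comments VALUES (1, 1), (1, 1);              -- 2 comments
    INSERT INTO votes_posts VALUES (1, 1), (1, 1), (1, -1);  -- score = 1
""")

# Joining both child tables directly pairs every comment with every vote.
naive = conn.execute("""
    SELECT p.id_post, COUNT(c.fk_commented_post), SUM(v.type)
    FROM posts p
    LEFT JOIN comments c ON c.fk_commented_post = p.id_post AND c.status = 1
    LEFT JOIN votes_posts v ON v.fk_voted_post = p.id_post
    WHERE p.status = 1
    GROUP BY p.id_post
""").fetchall()

# Aggregating each child table to one row per post before joining avoids that.
fixed = conn.execute("""
    SELECT p.id_post, COALESCE(c.cnt, 0), COALESCE(v.score, 0)
    FROM posts p
    LEFT JOIN (SELECT fk_commented_post AS id_post, COUNT(*) AS cnt
               FROM comments WHERE status = 1
               GROUP BY fk_commented_post) c ON c.id_post = p.id_post
    LEFT JOIN (SELECT fk_voted_post AS id_post, SUM(type) AS score
               FROM votes_posts
               GROUP BY fk_voted_post) v ON v.id_post = p.id_post
    WHERE p.status = 1
""").fetchall()

print(naive)  # 2 x 3 = 6 joined rows, so both aggregates are inflated
print(fixed)
```

In the naive version each comment is repeated three times and each vote twice, which gives 6 comments and a score of 2 instead of 2 and 1.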

Merge 2 SQL Queries/Tables

I spent so much time googling today, but I don't even know which keywords to use. So …
The project is an evaluation of a betting game (Football). I have 2 SQL Queries:
SELECT players.username, players.userid, matchdays.userid, matchdays.points, SUM(points) AS gesamt
FROM players INNER JOIN matchdays ON players.userid = matchdays.userid AND matchdays.season_id=5
GROUP BY players.username
ORDER BY gesamt DESC
And my second query:
SELECT max(matchday) as lastmd, points, players.username from players INNER JOIN matchdays ON players.userid = matchdays.userid WHERE matchdays.season_id=5 AND matchday=
(select max(matchday) from matchdays)group by players.username ORDER BY points DESC
The first one adds up the points of every matchday and shows the sum.
The second shows the points of the last gameday.
My goal is to merge those 2 queries/tables so that the output is a table like
Rank | Username | Points last gameday | Overall points |
I don't even know where to start or what to look for. Any help would be appreciated ;)
Use both queries with a join. An inner join works if every userid also has a row in the second query. Also add userid to the second query so it can be used for the join.
SET @rank = 0;
SELECT @rank := @rank + 1,
t1.username,
t2.points,
t1.gesamt
FROM (
SELECT players.username, players.userid puserid, matchdays.userid muserid, matchdays.points, SUM(points) AS gesamt
FROM players INNER JOIN matchdays ON players.userid = matchdays.userid AND matchdays.season_id=5
GROUP BY players.username
)t1
INNER JOIN
(
SELECT players.userid, max(matchday) as lastmd, points, players.username
from players INNER JOIN matchdays ON players.userid = matchdays.userid
WHERE matchdays.season_id=5 AND matchday=
(select max(matchday) from matchdays)group by players.username
)t2
ON t1.puserid = t2.userid
ORDER BY t1.gesamt DESC
You can use conditional aggregation, i.e. sum the points only when the day is the last day:
SELECT
p.username,
SUM(case when m.matchday = (select max(matchday) from matchdays) then m.points end)
AS last_day_points,
SUM(m.points) AS total_points
FROM players p
INNER JOIN matchdays m ON p.userid = m.userid AND m.season_id = 5
GROUP BY p.userid
ORDER BY total_points DESC;
Or with a join instead of a non-correlated subquery (MySQL should come to the same execution plan):
SELECT
p.username,
SUM(case when m.matchday = last_day.matchday then m.points end) AS last_day_points,
SUM(m.points) AS total_points
FROM players p
INNER JOIN matchdays m ON p.userid = m.userid AND m.season_id = 5
CROSS JOIN
(
select max(matchday) as matchday
from matchdays
) last_day
GROUP BY p.userid
ORDER BY total_points DESC;
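Here is a quick way to check the conditional aggregation idea on toy data. This is only a sketch: it uses Python's sqlite3 module instead of MySQL, and the two players and their points are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE players   (userid INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE matchdays (userid INTEGER, season_id INTEGER,
                            matchday INTEGER, points INTEGER);
    INSERT INTO players VALUES (1, 'anna'), (2, 'ben');
    -- hypothetical season 5 with two matchdays
    INSERT INTO matchdays VALUES (1, 5, 1, 3), (1, 5, 2, 5),
                                 (2, 5, 1, 4), (2, 5, 2, 1);
""")

# The CASE has no ELSE branch, so rows from other matchdays become NULL
# and SUM() ignores them; the second SUM() adds up every row.
table = conn.execute("""
    SELECT p.username,
           SUM(CASE WHEN m.matchday = (SELECT MAX(matchday) FROM matchdays)
                    THEN m.points END) AS last_day_points,
           SUM(m.points) AS total_points
    FROM players p
    INNER JOIN matchdays m ON p.userid = m.userid AND m.season_id = 5
    GROUP BY p.userid, p.username
    ORDER BY total_points DESC
""").fetchall()

print(table)
```

Each player ends up on one row with both the last matchday's points and the season total, which is the shape of the table asked for in the question (minus the rank column).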

Sub Query counting character strings in MySQL

LEFT JOIN
(
SELECT user_id, review, COUNT(user_id) totalCount
FROM reviews
GROUP BY user_id
) b ON b.user_id= u.user_id
I am trying to fit WHERE LENGTH(review) > 100 in this somewhere, but everywhere I put it, it gives me problems.
The sub-query above counts all total reviews by user_id. I simply want to add one more qualification: only count reviews longer than 100 characters.
On a side note, I've seen the function CHAR_LENGTH; not sure if that is what I need either.
EDIT:
Here is complete query working perfectly as expected for my needs:
static public $top_users = "
SELECT u.username, u.score,
(COALESCE(a.totalCount, 0) * 4) +
(COALESCE(b.totalCount, 0) * 5) +
(COALESCE(c.totalCount, 0) * 1) +
(COALESCE(d.totalCount, 0) * 2) +
(COALESCE(u.friend_points, 0)) AS totalScore
FROM users u
LEFT JOIN
(
SELECT user_id, COUNT(user_id) totalCount
FROM items
GROUP BY user_id
) a ON a.user_id= u.user_id
LEFT JOIN
(
SELECT user_id, COUNT(user_id) totalCount
FROM reviews
GROUP BY user_id
) b ON b.user_id= u.user_id
LEFT JOIN
(
SELECT user_id, COUNT(user_id) totalCount
FROM ratings
GROUP BY user_id
) c ON c.user_id = u.user_id
LEFT JOIN
(
SELECT user_id, COUNT(user_id) totalCount
FROM comments
GROUP BY user_id
) d ON d.user_id = u.user_id
ORDER BY totalScore DESC LIMIT 25;";
LENGTH() returns the length of the string measured in bytes. You probably want CHAR_LENGTH(), as it returns the length in characters.
SELECT user_id, review, COUNT(user_id) totalCount
FROM reviews
WHERE CHAR_LENGTH(review) > 100
GROUP BY user_id, review
You're also not using GROUP BY correctly.
See the documentation
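The byte/character difference only shows up with multi-byte text. As a rough illustration outside the database (Python rather than MySQL, with a made-up review string, and assuming the column is stored as UTF-8), the two measurements look like this:

```python
# Hypothetical review text containing two accented letters.
review = "Tr\u00e8s bon caf\u00e9"

# Number of characters, which is what CHAR_LENGTH() measures in MySQL.
char_count = len(review)

# Number of bytes in UTF-8, which is what LENGTH() measures in MySQL
# for a utf8/utf8mb4 column.
byte_count = len(review.encode("utf-8"))

print(char_count, byte_count)
```

The two accented letters take two bytes each in UTF-8, so the byte count comes out two higher than the character count. For plain ASCII reviews both functions give the same number.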
The query that you want is:
LEFT JOIN
(
SELECT user_id, COUNT(user_id) totalCount,
sum(case when length(review) > 100 then 1 else 0 end
) as NumLongReviews
FROM reviews
GROUP BY user_id
) b ON b.user_id= u.user_id
This counts both the reviews and the "long" reviews. The second count is done using a CASE expression nested in a SUM() function.
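A small sketch of that counting trick, using Python's sqlite3 module and invented reviews. Note that SQLite's LENGTH() counts characters for text values, unlike MySQL's, which does not matter for the ASCII sample data here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reviews (user_id INTEGER, review TEXT)")
conn.executemany(
    "INSERT INTO reviews VALUES (?, ?)",
    [
        (1, "short"),      # not counted as long
        (1, "x" * 150),    # long
        (2, "y" * 101),    # long (just over the limit)
    ],
)

# COUNT(*) counts every review; the SUM adds 1 only for reviews over 100.
counts = conn.execute("""
    SELECT user_id,
           COUNT(*) AS totalCount,
           SUM(CASE WHEN LENGTH(review) > 100 THEN 1 ELSE 0 END) AS NumLongReviews
    FROM reviews
    GROUP BY user_id
    ORDER BY user_id
""").fetchall()

print(counts)
```

Because the condition lives inside the aggregate rather than in WHERE, users with no long reviews still get a row with a total count.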

MySQL INNER JOIN select only one row from second table

I have a users table and a payments table. Each user who has payments may have multiple associated rows in the payments table. I would like to select all users who have payments, but only select their latest payment. I'm trying this SQL, but I've never tried nested SQL statements before, so I want to know what I'm doing wrong. I appreciate the help.
SELECT u.*
FROM users AS u
INNER JOIN (
SELECT p.*
FROM payments AS p
ORDER BY date DESC
LIMIT 1
)
ON p.user_id = u.id
WHERE u.package = 1
You need to have a subquery to get their latest date per user ID.
SELECT u.*, p.*
FROM users u
INNER JOIN payments p
ON u.id = p.user_ID
INNER JOIN
(
SELECT user_ID, MAX(date) maxDate
FROM payments
GROUP BY user_ID
) b ON p.user_ID = b.user_ID AND
p.date = b.maxDate
WHERE u.package = 1
SELECT u.*, p.*
FROM users AS u
INNER JOIN payments AS p ON p.id = (
SELECT id
FROM payments AS p2
WHERE p2.user_id = u.id
ORDER BY date DESC
LIMIT 1
)
Or
SELECT u.*, p.*
FROM users AS u
INNER JOIN payments AS p ON p.user_id = u.id
WHERE NOT EXISTS (
SELECT 1
FROM payments AS p2
WHERE
p2.user_id = p.user_id AND
(p2.date > p.date OR (p2.date = p.date AND p2.id > p.id))
)
These solutions are better than the accepted answer because they work correctly when there are multiple payments with the same user and date. You can try it on SQL Fiddle.
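The tie problem is easy to reproduce. Below is a minimal sketch with Python's sqlite3 module (standing in for MySQL) and invented data: one user whose two most recent payments share the same date. It compares the MAX(date) join with the NOT EXISTS version that breaks ties on the payment id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users    (id INTEGER PRIMARY KEY, package INTEGER);
    CREATE TABLE payments (id INTEGER PRIMARY KEY, user_id INTEGER, date TEXT);
    INSERT INTO users VALUES (1, 1);
    -- payments 2 and 3 are tied on the latest date
    INSERT INTO payments VALUES (1, 1, '2024-01-01'),
                                (2, 1, '2024-02-01'),
                                (3, 1, '2024-02-01');
""")

# Joining on MAX(date) keeps every payment that shares the latest date.
via_max_date = conn.execute("""
    SELECT u.id, p.id
    FROM users u
    JOIN payments p ON p.user_id = u.id
    JOIN (SELECT user_id, MAX(date) AS maxDate
          FROM payments GROUP BY user_id) b
      ON p.user_id = b.user_id AND p.date = b.maxDate
    WHERE u.package = 1
    ORDER BY p.id
""").fetchall()

# NOT EXISTS with an id tie-breaker keeps exactly one payment per user.
via_not_exists = conn.execute("""
    SELECT u.id, p.id
    FROM users u
    JOIN payments p ON p.user_id = u.id
    WHERE u.package = 1
      AND NOT EXISTS (
          SELECT 1 FROM payments p2
          WHERE p2.user_id = p.user_id
            AND (p2.date > p.date OR (p2.date = p.date AND p2.id > p.id)))
""").fetchall()

print(via_max_date)
print(via_not_exists)
```

The first query returns the user twice (once per tied payment), while the second returns only the payment with the higher id.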
SELECT u.*, p.*, max(p.date)
FROM payments p
JOIN users u ON u.id=p.user_id AND u.package = 1
GROUP BY u.id
ORDER BY p.date DESC
Check out this sqlfiddle
SELECT u.*
FROM users AS u
INNER JOIN (
SELECT p.*,
@num := if(@id = user_id, @num + 1, 1) as row_number,
@id := user_id as tmp
FROM payments AS p,
(SELECT @num := 0) x,
(SELECT @id := 0) y
ORDER BY p.user_id ASC, date DESC) AS p
ON (p.user_id = u.id) and (p.row_number=1)
WHERE u.package = 1
You can try this (window functions require MySQL 8.0 or later):
SELECT u.*, p.*
FROM users AS u LEFT JOIN (
SELECT *, ROW_NUMBER() OVER(PARTITION BY user_id ORDER BY date DESC) AS RowNo
FROM payments
) AS p ON u.id = p.user_id AND p.RowNo=1
There are two problems with your query:
Every table and subquery needs a name, so you have to name the subquery INNER JOIN (SELECT ...) AS p ON ....
The subquery as you have it only returns one row period, but you actually want one row for each user. For that you need one query to get the max date and then self-join back to get the whole row.
Assuming there are no ties for payments.date, try:
SELECT u.*, p.*
FROM (
SELECT MAX(p.date) AS date, p.user_id
FROM payments AS p
GROUP BY p.user_id
) AS latestP
INNER JOIN users AS u ON latestP.user_id = u.id
INNER JOIN payments AS p ON p.user_id = u.id AND p.date = latestP.date
WHERE u.package = 1
@John Woo's answer helped me solve a similar problem. I've improved upon his answer by setting the correct ordering as well. This has worked for me:
SELECT a.*, c.*
FROM users a
INNER JOIN payments c
ON a.id = c.user_ID
INNER JOIN (
SELECT user_ID, MAX(date) as maxDate FROM
(
SELECT user_ID, date
FROM payments
ORDER BY date DESC
) d
GROUP BY user_ID
) b ON c.user_ID = b.user_ID AND
c.date = b.maxDate
WHERE a.package = 1
I'm not sure how efficient this is, though.
SELECT U.*, V.* FROM users AS U
INNER JOIN (SELECT *
FROM payments
WHERE id IN (
SELECT MAX(id)
FROM payments
GROUP BY user_id
)) AS V ON U.id = V.user_id
This will get it working
Matei Mihai gave a simple and efficient solution, but it will not work until you put MAX(date) in the SELECT part, so the query becomes:
SELECT u.*, p.*, max(date)
FROM payments p
JOIN users u ON u.id=p.user_id AND u.package = 1
GROUP BY u.id
ORDER BY makes no difference to the grouping, but it can order the final result produced by GROUP BY. I tried it and it worked for me.
My answer is directly inspired by @valex's, and is very useful if you need several columns in the ORDER BY clause.
SELECT u.*
FROM users AS u
INNER JOIN (
SELECT p.*,
@num := if(@id = user_id, @num + 1, 1) as row_number,
@id := user_id as tmp
FROM (SELECT * FROM payments ORDER BY user_id ASC, date DESC) AS p,
(SELECT @num := 0) x,
(SELECT @id := 0) y
) AS p
ON (p.user_id = u.id) and (p.row_number=1)
WHERE u.package = 1
This is quite simple: do the inner join, then group by user_id and use the MAX aggregate function on the payment id. Assuming your tables are user and payment, the query can be:
SELECT user.id, max(payment.id)
FROM user INNER JOIN payment ON (user.id = payment.user_id)
GROUP BY user.id
If you do not have to return the payment from the query, you can do this with DISTINCT, like:
SELECT DISTINCT u.*
FROM users AS u
INNER JOIN payments AS p ON p.user_id = u.id
This returns only users who have at least one associated record in the payments table (because of the inner join), and a user with multiple payments is returned only once (because of DISTINCT). The payment itself is not returned; if you need it, you can use, for example, a subquery as others proposed.

mysql: multiple join problem

I'm trying to select from a table with multiple joins: one for the number of comments using COUNT and one for the total vote value using SUM. The problem is that the two joins affect each other. Instead of showing:
3 votes 2 comments
I get 3 * 2 = 6 votes and 2 * 3 comments
This is the query I'm using:
SELECT t.*, COUNT(c.id) as comments, COALESCE(SUM(v.vote), 0) as votes
FROM (topics t)
LEFT JOIN comments c ON c.topic_id = t.id
LEFT JOIN votes v ON v.topic_id = t.id
WHERE t.id = 9
What you're doing is an SQL antipattern that I call Goldberg Machine. Why make the problem so much harder by forcing it to be done in a single SQL query?
Here is how I would really solve this problem:
SELECT t.*, COUNT(c.id) as comments
FROM topics t
LEFT JOIN comments c ON c.topic_id = t.id
WHERE t.id = 9;
SELECT t.*, SUM(v.vote) as votes
FROM topics t
LEFT JOIN votes v ON v.topic_id = t.id
WHERE t.id = 9;
As you have found, combining these two into one query results in a Cartesian product. There may be clever and subtle ways to force it to give you the correct answer in one query, but what happens when you need a third statistic? It's much simpler to do it in two queries.
SELECT t.*, COUNT(c.id) as comments, COALESCE(SUM(v.vote), 0) as votes
FROM (topics t)
LEFT JOIN comments c ON c.topic_id = t.id
LEFT JOIN votes v ON v.topic_id = t.id
WHERE t.id = 9
GROUP BY t.id
or perhaps
SELECT `topics`.*,
(
SELECT COUNT(*)
FROM `comments`
WHERE `topic_id` = `topics`.`id`
) AS `num_comments`,
(
SELECT IFNULL(SUM(`vote`), 0)
FROM `votes`
WHERE `topic_id` = `topics`.`id`
) AS `vote_total`
FROM `topics`
WHERE `id` = 9
SELECT t.*, COUNT(DISTINCT c.id) as comments, COALESCE(SUM(v.vote), 0) as votes
FROM (topics t)
LEFT JOIN comments c ON c.topic_id = t.id
LEFT JOIN votes v ON v.topic_id = t.id
WHERE t.id = 9
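One thing worth checking with the COUNT(DISTINCT ...) version: it de-duplicates the comment count, but the SUM over votes still sees every vote once per comment. Here is a small sketch with Python's sqlite3 module (in place of MySQL) and invented rows that match the question's example of 3 votes and 2 comments, compared against the correlated-subquery form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE topics   (id INTEGER PRIMARY KEY);
    CREATE TABLE comments (id INTEGER PRIMARY KEY, topic_id INTEGER);
    CREATE TABLE votes    (id INTEGER PRIMARY KEY, topic_id INTEGER, vote INTEGER);
    INSERT INTO topics VALUES (9);
    INSERT INTO comments VALUES (1, 9), (2, 9);               -- 2 comments
    INSERT INTO votes VALUES (1, 9, 1), (2, 9, 1), (3, 9, 1); -- 3 votes of +1
""")

# Double join with COUNT(DISTINCT): the count is right, the sum is not.
with_distinct = conn.execute("""
    SELECT COUNT(DISTINCT c.id), COALESCE(SUM(v.vote), 0)
    FROM topics t
    LEFT JOIN comments c ON c.topic_id = t.id
    LEFT JOIN votes v ON v.topic_id = t.id
    WHERE t.id = 9
""").fetchone()

# Correlated subqueries compute each statistic independently.
with_subqueries = conn.execute("""
    SELECT (SELECT COUNT(*) FROM comments WHERE topic_id = t.id),
           (SELECT IFNULL(SUM(vote), 0) FROM votes WHERE topic_id = t.id)
    FROM topics t
    WHERE t.id = 9
""").fetchone()

print(with_distinct)
print(with_subqueries)
```

With this data the double join reports 2 comments but a vote total of 6, while the subquery form reports 2 and 3.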