MySQL nested query counting - mysql

A bit of background info: this is an application that allows users to create challenges and then vote on those challenges (a bog-standard userX-vs-userY type of application).
The end goal here is to get a list of 5 users sorted by the number of challenges they have won, to create a type of leaderboard. A challenge is won by a user if its status = expired and the user has > 50 votes for that challenge (challenges expire after 100 votes in total).
I'll simplify things a bit here, but essentially there are three tables:
users: id, username, ...
challenges: id, issued_to, issued_by, status
challenges_votes: id, challenge_id, user_id, voted_for
So far I have an inner query which looks like:
SELECT `challenges`.`id`
FROM `challenges_votes`
LEFT JOIN `challenges` ON (`challenges`.`id` = `challenges_votes`.`challenge_id`)
WHERE `voted_for` = 1
AND `challenges`.`status` = 'expired'
GROUP BY `challenges`.`id`
HAVING COUNT(`challenges_votes`.`id`) > 50
Which in this example would return challenge IDs that have expired and where the user with ID 1 has > 50 votes for.
What I need to do is count the number of rows returned here, apply it to each user from the users table, order this by the number of rows returned and limit it to 5.
To this end I have the following query:
SELECT `users`.`id`, `users`.`username`, COUNT(*) AS challenges_won
FROM (
SELECT `challenges`.`id`
FROM `challenges_votes`
LEFT JOIN `challenges` ON (`challenges`.`id` = `challenges_votes`.`challenge_id`)
WHERE `voted_for` = 1
GROUP BY `challenges`.`id`
HAVING COUNT(`challenges_votes`.`id`) > 0
) AS challenges_won, `users`
GROUP BY `users`.`id`
ORDER BY challenges_won
LIMIT 5
Which is kinda getting there but of course the voted_for user ID here is always 1. Is this even the right way to go about this type of query? Can anyone shed any light on how I should be doing it?
Thanks!

I guess the following script will solve your problem:
-- get the number of challenges won by each user and return top 5
SELECT usr.id, usr.username, COUNT(*) AS challenges_won
FROM users usr
JOIN (
SELECT vot.challenge_id, vot.voted_for
FROM challenges_votes vot
WHERE vot.challenge_id IN ( -- is this check really necessary?
SELECT cha.id -- if any user is voted 51 he wins, so
FROM challenges cha -- why wait another 49 votes that won't
WHERE cha.status = 'expired' -- change the result?
) --
GROUP BY vot.challenge_id, vot.voted_for -- count votes per candidate, not per challenge
HAVING COUNT(*) > 50
) aux ON (aux.voted_for = usr.id)
GROUP BY usr.id, usr.username
ORDER BY challenges_won DESC LIMIT 5;
Please allow me to raise a small point about the condition for closing a challenge: if a user has already won after 51 votes, why wait for another 49 votes that cannot change the result? If this constraint can be dropped, you won't have to check the challenges table, which may improve query performance. It may also make it worse, though; you can only tell by testing against your actual database.
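To try the leaderboard idea on a small data set, here is a minimal sketch in Python using the standard sqlite3 module as a stand-in for MySQL. The usernames, voter IDs and vote tallies are made up for illustration. It joins to challenges instead of using IN, and it groups by both challenge_id and voted_for so that votes are counted per candidate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE challenges (id INTEGER PRIMARY KEY, issued_to INTEGER,
                             issued_by INTEGER, status TEXT);
    CREATE TABLE challenges_votes (id INTEGER PRIMARY KEY, challenge_id INTEGER,
                                   user_id INTEGER, voted_for INTEGER);
    INSERT INTO users VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
    INSERT INTO challenges VALUES
        (10, 1, 2, 'expired'),
        (11, 1, 3, 'expired'),
        (12, 2, 3, 'expired'),
        (13, 3, 1, 'active');   -- still open, must be ignored
""")

def add_votes(challenge_id, tallies):
    """Insert one row per vote; tallies maps voted_for -> number of votes."""
    for voted_for, n in tallies.items():
        conn.executemany(
            "INSERT INTO challenges_votes (challenge_id, user_id, voted_for) "
            "VALUES (?, ?, ?)",
            # Voter IDs (1000 + i) are hypothetical placeholders.
            [(challenge_id, 1000 + i, voted_for) for i in range(n)],
        )

add_votes(10, {1: 60, 2: 40})   # user 1 wins
add_votes(11, {1: 51, 3: 49})   # user 1 wins
add_votes(12, {2: 70, 3: 30})   # user 2 wins
add_votes(13, {3: 55, 1: 5})    # user 3 leads, but the challenge is not expired

leaderboard = conn.execute("""
    SELECT usr.id, usr.username, COUNT(*) AS challenges_won
    FROM users usr
    JOIN (
        SELECT vot.challenge_id, vot.voted_for
        FROM challenges_votes vot
        JOIN challenges cha ON cha.id = vot.challenge_id
        WHERE cha.status = 'expired'
        GROUP BY vot.challenge_id, vot.voted_for
        HAVING COUNT(*) > 50
    ) aux ON aux.voted_for = usr.id
    GROUP BY usr.id, usr.username
    ORDER BY challenges_won DESC
    LIMIT 5
""").fetchall()

print(leaderboard)
```

Users with no wins do not appear at all, because the inner join drops them; a LEFT JOIN with COUNT(aux.challenge_id) would be one way to list them with 0.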

Related

Can't address parent field in query with multiple subqueries

EDIT: Better explanation
I have a page with a job. The job has an ID and three skills (skill_ids), each with a skill requirement (a user must have at least this skill value to be qualified).
I click on the job to find candidates, so I have the job_id, the three skill_ids and the skill_id_requirements. So far I can do what the first answer proposed with joins: I find all users who have the three skills. The skills are saved in skill_ratings. That works as long as I only look up the skill_ids.
But now I want the value, and this is where my code computes the final value (called rating). The rating takes all given values into account, but it isn't a simple average or sum. That's why I need the long, horrible code. In that code I usually insert a single user's ID. But here I need the user_ids of all users who have the skills mentioned above, just to calculate whether they are qualified. This is dynamic.
I have a table in which I want to find people who are qualified for a position under some requirements. I work with one table called skill_ratings, but (as far as I can see) I need to add some subqueries, and that is where the problem is. There are many subqueries, and I've tried to reference a field of the parent query. But that only seems to work in a first-level subquery directly below the parent query.
Here's my structure:
SELECT * FROM table t
WHERE EXISTS (SELECT * FROM table d WHERE x > 1
AND b=t.id
AND y <= (SELECT a FROM (MAIN SUBQUERY WITH CALCULATIONS)))
GROUP BY xyx
But the error I get is: #1054 - Unknown column 'skra.usr_id_get' in 'where clause'. skra is the parent table in this case.
I want to get the following (pseudo-sql):
SELECT all FROM table t AS x
WHERE EXISTS (
SELECT all FROM table t AS y
WHERE y.skill_id = 1
AND y.usr_id_get = t.usr_id_get
AND y.value <= (my algorithm)
)
The main subquery is important because I want to get a computed number. Elsewhere the code works because I was able to use predefined PHP variables for a user's ID. But I can't do that here, as I need to find the users within the boundaries of the WHERE clauses.
How can I solve this? Referencing a parent field in a subquery seems to be limited to a first-level subquery.
EDIT: Code
Code removed due to project status.
Error: #1054 - Unknown column 'c.usr_id_get' in 'where clause'
We want users that have certain skills of certain levels. For example all users that have skill 1 with at least level 20 and skill 2 with at least level 70.
Here is an algorithm:
First of all we must get the skill levels. A user has several skill ratings and the average rating per skill is the level.
Then we want a table of criteria (skill 1 / level 20, skill 2 / level 70 in our example).
We collect all user skill levels that match the criteria (EXISTS clause) and then
keep the users that match all skill levels (count(*) = <desired number of skills>).
The query:
select
sr.usr_id_get
from
(
select usr_id_get, skill_id, avg(value) as level
from skill_ratings
group by usr_id_get, skill_id
) sr
where exists
(
select *
from
(
select 1 as skill_id, 20 as level
union all
select 2 as skill_id, 70 as level
) criteria
where sr.skill_id = criteria.skill_id
and sr.level >= criteria.level
)
group by usr_id_get
having count(*) = 2;
You can also make criteria a real (temporary) table. Then your query stays the same, no matter how many skills are requested. You'd have
where exists
(
select *
from criteria
where sr.skill_id = criteria.skill_id
and sr.level >= criteria.level
)
group by usr_id_get
having count(*) = (select count(*) from criteria);
then.
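To see the criteria-table variant run end to end, here is a small sketch using Python's sqlite3 module in place of MySQL. The ratings are invented sample data; the criteria rows are the two from the example (skill 1 / level 20, skill 2 / level 70).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE skill_ratings (usr_id_get INTEGER, skill_id INTEGER, value INTEGER);
    CREATE TABLE criteria (skill_id INTEGER, level INTEGER);
    INSERT INTO skill_ratings VALUES
        (1, 1, 30), (1, 1, 20), (1, 2, 80),   -- user 1: skill 1 avg 25, skill 2 avg 80
        (2, 1, 50), (2, 2, 60),               -- user 2: skill 2 is below 70
        (3, 1, 90);                           -- user 3: no rating for skill 2
    INSERT INTO criteria VALUES (1, 20), (2, 70);
""")

qualified = conn.execute("""
    SELECT sr.usr_id_get
    FROM (
        -- skill level = average rating per user and skill
        SELECT usr_id_get, skill_id, AVG(value) AS level
        FROM skill_ratings
        GROUP BY usr_id_get, skill_id
    ) sr
    WHERE EXISTS (
        SELECT *
        FROM criteria
        WHERE sr.skill_id = criteria.skill_id
          AND sr.level >= criteria.level
    )
    GROUP BY sr.usr_id_get
    -- keep only users who met every criterion
    HAVING COUNT(*) = (SELECT COUNT(*) FROM criteria)
""").fetchall()

print(qualified)
```

Only user 1 meets both requirements here. Adding a third row to criteria changes the result without touching the query text.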
This looks like it could be done with a simple JOIN:
SELECT T.*
FROM your_table T
JOIN other_table Y ON (
T.usr_id_get = Y.usr_id_get
AND T.skill_id = 1
AND Y.value <= [...]
)
If you need to perform some sort of calculations before the join, then you could join with a subquery:
SELECT T.*
FROM your_table T
JOIN (
SELECT *
FROM other_table Y
WHERE Y.skill_id = 1
AND Y.value = [...]
) Y USING(usr_id_get)
If I understand correctly, you have a user, say user 123, and a skill, say skill 99. Now you want to get the average rating for user 123 and skill 99 and then find all users with an equal or better average rating on that skill.
This is how to get the average ratings for skill 99 per user:
select usr_id_get, avg(value)
from skill_ratings
where skill_id = 99
group by usr_id_get;
This is how to get all users with an equal or better average rating for skill 99 than user 123:
select usr_id_get
from skill_ratings
where skill_id = 99
group by usr_id_get
having avg(value) >=
(select avg(value) from skill_ratings where skill_id = 99 and usr_id_get = 123);
Add to this whatever other criteria you need.
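A quick way to check the HAVING comparison is to run it on a handful of rows. The sketch below uses Python's sqlite3 module rather than MySQL, with invented ratings; the skill ID 99 and user ID 123 are the ones from the example and are passed as parameters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE skill_ratings (usr_id_get INTEGER, skill_id INTEGER, value INTEGER);
    INSERT INTO skill_ratings VALUES
        (123, 99, 60), (123, 99, 80),   -- reference user: average 70
        (200, 99, 90),                  -- average 90, better
        (300, 99, 50), (300, 99, 60),   -- average 55, worse
        (400, 99, 70),                  -- average 70, equal
        (300,  5, 100);                 -- a different skill, must not count
""")

skill_id, reference_user = 99, 123

at_least_as_good = conn.execute("""
    SELECT usr_id_get
    FROM skill_ratings
    WHERE skill_id = ?
    GROUP BY usr_id_get
    HAVING AVG(value) >= (
        SELECT AVG(value)
        FROM skill_ratings
        WHERE skill_id = ? AND usr_id_get = ?
    )
    ORDER BY usr_id_get
""", (skill_id, skill_id, reference_user)).fetchall()

print(at_least_as_good)
```

Note that the reference user is part of the result too, since their average is equal to itself; add a condition on usr_id_get if that is not wanted.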

Group by with join and subquery cannot see parent table alias

I am building a question / answer panel.
Tables:
users - typical with id, name, etc...,
Replies - user's replies,
Ratings - another user's ratings (1 or -1) related to a reply, every rating got a row
I need one row per user, with a points column: the user gets 20 points for each rater who rated his replies exactly once. I also need 60 points when a rater sent 2 or more ratings (not included in the example), but once I have a solution I can extend it.
The problem with the following query: the subquery can't see the "user" alias; it says user.id is not found.
Thanks in advance!
SELECT user.*, (
SELECT COUNT(*) * 20
FROM (
SELECT SUM(rating_value) AS rv
FROM Ratings
LEFT JOIN Replies ON Replies.id = Ratings.replyId
WHERE rating_value > 0
AND Replies.userId = user.id
GROUP BY Ratings.userId
HAVING (rv = 1)
) ss
) AS points
FROM users user

MySQL ORDER BY multiple column ASC and DESC

I have 2 MySQL tables, users and scores. Detail:
users table:
scores table:
My intention is to get a list of 20 users sorted by the point field DESC (descending), combined with the avg_time field ASC (ascending). I use the query:
SELECT users.username, scores.point, scores.avg_time
FROM scores, users
WHERE scores.user_id = users.id
GROUP BY users.username
ORDER BY scores.point DESC, scores.avg_time
LIMIT 0, 20
The result is:
The result is wrong, because the first line should be exactly point = 100 and avg_time = 60.
My desired result is:
username point avg_time
demo123 100 60
demo123456 100 100
demo 90 120
I tried many times with different queries but the result is still wrong. Could you give me some solutions?
Ok, I THINK I understand what you want now, and let me clarify to confirm before the query. You want 1 record for each user. For each user, you want their BEST POINTS score record. Of the best points per user, you want the one with the best average time. Once you have all users "best" values, you want the final results sorted with best points first... Almost like ranking of a competition.
So now the query. If the above statement is accurate, you need to start by getting the best point/average time per person and assigning a "Rank" to that entry. This is easily done using MySQL @ variables. Then, just include a HAVING clause to keep only those records ranked 1 for each person. Finally, apply the ORDER BY of best points and shortest average time.
select
U.UserName,
PreSortedPerUser.Point,
PreSortedPerUser.Avg_Time,
@UserRank := if( @lastUserID = PreSortedPerUser.User_ID, @UserRank +1, 1 ) FinalRank,
@lastUserID := PreSortedPerUser.User_ID
from
( select
S.user_id,
S.point,
S.avg_time
from
Scores S
order by
S.user_id,
S.point DESC,
S.Avg_Time ) PreSortedPerUser
JOIN Users U
on PreSortedPerUser.user_ID = U.ID,
( select @lastUserID := 0,
@UserRank := 0 ) sqlvars
having
FinalRank = 1
order by
Point Desc,
Avg_Time
Results as handled by SQLFiddle
Note: due to the inline @variables needed to get the answer, there are two extra columns at the end of each row. These are just left over and can be ignored in any actual output presentation you are trying to do. Or, you can wrap the entire thing above in one more level to get just the few columns you want, like:
select
PQ.UserName,
PQ.Point,
PQ.Avg_Time
from
( entire query above pasted here ) as PQ
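For comparison, here is a sketch that gets "best score row per user" without user variables. It uses Python's sqlite3 module, which has no MySQL-style @variables, so it swaps in a different technique: a NOT EXISTS check that keeps a score row only if the same user has no better row. The three rows of the desired result come from the question; the other score rows and the user IDs are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE scores (user_id INTEGER, point INTEGER, avg_time INTEGER);
    INSERT INTO users VALUES (1, 'demo'), (4, 'demo123'), (7, 'demo123456');
    INSERT INTO scores VALUES
        (4, 100, 90), (4, 100, 60),    -- same points, 60 is the faster time
        (7, 100, 100), (7, 70, 30),
        (1, 90, 120), (1, 80, 50);
""")

best_per_user = conn.execute("""
    SELECT u.username, s.point, s.avg_time
    FROM scores s
    JOIN users u ON u.id = s.user_id
    WHERE NOT EXISTS (
        -- a row is "better" if it has more points,
        -- or the same points with a shorter average time
        SELECT 1
        FROM scores better
        WHERE better.user_id = s.user_id
          AND (better.point > s.point
               OR (better.point = s.point AND better.avg_time < s.avg_time))
    )
    ORDER BY s.point DESC, s.avg_time
    LIMIT 20
""").fetchall()

for row in best_per_user:
    print(row)
```

One caveat of this variant: if a user has two identical best rows (same point and same avg_time), both are returned, so a DISTINCT or an extra tie-breaker would be needed in that case.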
I think you misunderstand the table relation.
users : scores = 1 : *
A plain join is not a solution.
Is this what you intend?
SELECT users.username, avg(scores.point), avg(scores.avg_time)
FROM scores, users
WHERE scores.user_id = users.id
GROUP BY users.username
ORDER BY avg(scores.point) DESC, avg(scores.avg_time)
LIMIT 0, 20
(This query gets each user's average point and average avg_time, ordered by point DESC, then avg_time ASC.)
If you want to rank each individual score instead, use a left outer join:
SELECT users.username, scores.point, scores.avg_time
FROM scores left outer join users on scores.user_id = users.id
ORDER BY scores.point DESC, scores.avg_time
LIMIT 0, 20
@DRapp is a genius. I never understood how he coded his SQL, so I tried coding it according to my own understanding.
SELECT
f.username,
f.point,
f.avg_time
FROM
(
SELECT
userscores.username,
userscores.point,
userscores.avg_time
FROM
(
SELECT
users.username,
scores.point,
scores.avg_time
FROM
scores
JOIN users
ON scores.user_id = users.id
ORDER BY scores.point DESC
) userscores
ORDER BY
point DESC,
avg_time
) f
GROUP BY f.username
ORDER BY point DESC
It yields the same result by using GROUP BY instead of the user @variables.
GROUP BY orders by the primary key id by default, so the result is:
username point avg_time
demo123 100 90 ---> id = 4
demo123456 100 100 ---> id = 7
demo 90 120 ---> id = 1

Count retweets per user by linking two tables

I have the following tables:
tweets                 retweets
-----------------      ----------------
user_id  retweets      user_id (etc...)
-----------------      ----------------
1        0             1
2        0             1
                       1
                       2
                       2
I want to count the number of retweets per user and update tweets.retweets accordingly:
UPDATE users
SET retweets = (
SELECT COUNT(*) FROM retweets WHERE retweets.user_id = users.user_id
)
I have run this query twice, but it times out (on tables that are not that large). Is my query wrong?
Also see the SQL Fiddle (although it apparently doesn't allow UPDATE statements): http://www.sqlfiddle.com/#!2/f591e/1
This solution should be much faster than using subqueries for getting the count of tweets of each user (your correlated subquery will execute for each user):
UPDATE users a
LEFT JOIN
(
SELECT user_id, COUNT(1) AS retweet_count
FROM retweets
GROUP BY user_id
) b ON a.user_id = b.user_id
SET a.retweets = COALESCE(b.retweet_count, 0)
If your retweets table is not changing dynamically, why not gather the data first and then update the destination table, like this:
create table retweets_hist AS SELECT COUNT(*) AS retweets,user_id FROM retweets group by user_id;
then
UPDATE users
SET retweets = COALESCE((
SELECT retweets FROM retweets_hist WHERE retweets_hist.user_id = users.user_id
), 0)
If it is dynamic, then I think using triggers is better.
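Here is a small sketch of the "aggregate once, then update" idea, run through Python's sqlite3 module instead of MySQL. The retweet rows match the question's sample data; user 3, who never retweeted, is an extra row added to show the zero case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, retweets INTEGER);
    CREATE TABLE retweets (user_id INTEGER);
    INSERT INTO users VALUES (1, NULL), (2, NULL), (3, NULL);
    INSERT INTO retweets VALUES (1), (1), (1), (2), (2);

    -- Step 1: count once per user, in a single pass over retweets.
    CREATE TABLE retweets_hist AS
        SELECT user_id, COUNT(*) AS retweets
        FROM retweets
        GROUP BY user_id;

    -- Step 2: copy the counts over; users missing from retweets_hist get 0.
    UPDATE users
    SET retweets = COALESCE(
        (SELECT h.retweets FROM retweets_hist h WHERE h.user_id = users.user_id),
        0);
""")

counts = conn.execute(
    "SELECT user_id, retweets FROM users ORDER BY user_id"
).fetchall()

print(counts)
```

The lookup in step 2 is still a correlated subquery, but it reads one pre-aggregated row per user instead of counting again; an index on retweets_hist.user_id would likely help on larger tables.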
The main issue here is that for a user who has never retweeted, counting their retweets is still time-consuming.
To answer your question: yes, counting takes only a fraction of the time, but counting something that never existed takes time too. That is the problem!
Maybe this one would have better timing:
UPDATE users
SET retweets = (
SELECT COUNT(*)
FROM retweets
WHERE retweets.user_id = users.user_id)
WHERE EXISTS (SELECT *
FROM retweets
WHERE retweets.user_id = users.user_id)
But then you still have to set retweets to zero for the users who never retweeted.
Note: EXISTS is an Oracle keyword; I don't know whether MySQL supports it.

MySQL Join Query - joining tables into themselves many times

I have 4 queries I need to execute in order to suggest items to users based on items they've already expressed an interest in:
Select 5 random items the user already likes
SELECT item_id
FROM user_items
WHERE user_id = :user_person
ORDER BY RAND()
LIMIT 5
Select 50 people who like the same items
SELECT user_id
FROM user_items
WHERE user_id != :user_person
AND item_id = :selected_item_list
LIMIT 50
SELECT all items that the original user likes
SELECT item_id
FROM user_items
WHERE user_id = :user_person
SELECT 5 items the user doesn't already like to suggest to the user
SELECT item_id
FROM user_items
WHERE user_id = :user_id_list
AND item_id != :item_id_list
LIMIT 5
What I would like to know is: how would I execute this as one query?
There are a few reasons for me wanting to do this:
At the moment, I have to execute the 'select 50 people' query 5 times and pick the top 50 people from it.
I then have to execute the 'select 5 items' query 50 * (number of items the initial user likes) times.
Once the query has been executed, I intend to store the query result in a cookie (if the user consents to me using cookies; otherwise they don't get the 'item suggestion' at all) with the key being a hash of the query, meaning it will only fire once a day / once a week (that's why I return 5 suggestions and select a key at random to display).
Basically, if anybody knows how to write these queries as one query, could you show me and explain what is going on in the query?
This will select all items you need:
SELECT DISTINCT ui_items.item_id
FROM user_items AS ui_own
JOIN user_items AS ui_others ON ui_own.item_id = ui_others.item_id
JOIN user_items AS ui_items ON ui_others.user_id = ui_items.user_id
WHERE ui_own.user_id = :user_person
AND ui_others.user_id <> :user_person
AND ui_items.item_id NOT IN (SELECT item_id FROM user_items WHERE user_id = :user_person)
(Please check whether the results are exactly the same as with your version; I tested it on a very small fake data set.)
Next, just cache this list and show 5 items at random, because ORDER BY RAND() is VERY inefficient (non-deterministic query => no caching).
EDIT: Added the DISTINCT to not show duplicate rows.
You can also return the most popular suggestions in descending order of popularity by removing DISTINCT and adding the following to the end of the query:
GROUP BY ui_items.item_id
ORDER BY COUNT(*) DESC
LIMIT 20
This will return the 20 most popular items.
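To see what the self-join returns, here is a small sketch of the popularity variant, run through Python's sqlite3 module instead of MySQL. All user and item IDs are invented. It excludes items the user already likes with NOT IN over the user's full list, which is a stricter check than comparing against the single item in the joined row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_items (user_id INTEGER, item_id INTEGER);
    INSERT INTO user_items VALUES
        (1, 100), (1, 101),              -- the user we want suggestions for
        (2, 100), (2, 200), (2, 201),    -- shares item 100 with user 1
        (3, 101), (3, 200),              -- shares item 101 with user 1
        (4, 300);                        -- shares nothing, so 300 is never suggested
""")

suggestions = conn.execute("""
    SELECT ui_items.item_id, COUNT(*) AS popularity
    FROM user_items AS ui_own                        -- items the user likes
    JOIN user_items AS ui_others                     -- other people liking those items
         ON ui_own.item_id = ui_others.item_id
    JOIN user_items AS ui_items                      -- everything those people like
         ON ui_others.user_id = ui_items.user_id
    WHERE ui_own.user_id = :user_person
      AND ui_others.user_id <> :user_person
      AND ui_items.item_id NOT IN (
          SELECT item_id FROM user_items WHERE user_id = :user_person
      )
    GROUP BY ui_items.item_id
    ORDER BY popularity DESC, ui_items.item_id
    LIMIT 20
""", {"user_person": 1}).fetchall()

print(suggestions)
```

Item 200 comes first because two like-minded users have it; the cached list can then be sampled at random in application code rather than with ORDER BY RAND().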