Multiple aggregate count columns SQL - mysql

I have a table that roughly looks like the following:
winner_name  loser_name
Person A     Person B
Person A     Person C
Person B     Person A
Person C     Person B
I'm trying to return a table that looks like the following:
player_name  number_wins  number_losses
Person A     2            1
Person B     1            2
Person C     1            1
I can't quite figure out how to get there. I have been able to write a query that returns player_name and either number_wins or number_losses, but not both.
SELECT winner_name AS player_name, COUNT(*) AS number_wins
FROM my_table
GROUP BY player_name
I have looked into using procedures, functions, and subqueries to do this but I haven't found the right solution yet. Any direction would be helpful.

You need to pivot the names into the same column using UNION. Then you can calculate the sum of wins and losses.
SELECT player_name, SUM(win) AS number_wins, SUM(loss) AS number_losses
FROM (
SELECT winner_name AS player_name, 1 AS win, 0 AS loss
FROM my_table
UNION ALL
SELECT loser_name AS player_name, 0 AS win, 1 AS loss
FROM my_table
) AS x
GROUP BY player_name
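To check the UNION ALL approach end to end, here is a minimal sketch that loads the sample rows from the question into an in-memory SQLite database through Python's sqlite3 module. SQLite stands in for MySQL purely as an assumption of convenience; the query uses only standard SQL. The ORDER BY is an addition so that the output order is deterministic.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (winner_name TEXT, loser_name TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [
        ("Person A", "Person B"),
        ("Person A", "Person C"),
        ("Person B", "Person A"),
        ("Person C", "Person B"),
    ],
)

# Stack winners and losers into one player_name column, then sum the flags.
query = """
SELECT player_name, SUM(win) AS number_wins, SUM(loss) AS number_losses
FROM (
    SELECT winner_name AS player_name, 1 AS win, 0 AS loss FROM my_table
    UNION ALL
    SELECT loser_name AS player_name, 0 AS win, 1 AS loss FROM my_table
) AS x
GROUP BY player_name
ORDER BY player_name
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

UNION ALL (rather than UNION) matters here: plain UNION would remove duplicate rows, so a player with several wins would be counted only once.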

Since each aggregate statistic is for a different group (one for winner_name, the other for loser_name), they can't be calculated in the same query, but each query can be run separately and then combined with a JOIN. Simply take each query:
SELECT winner_name AS player, COUNT(loser_name) AS wins
FROM games
GROUP BY winner_name
;
SELECT loser_name AS player, COUNT(winner_name) AS losses
FROM games
GROUP BY loser_name
;
and join on the common attribute, the player name:
SELECT gw.player, gw.wins, gl.losses
FROM (
SELECT winner_name AS player, COUNT(loser_name) AS wins
FROM games
GROUP BY winner_name
) AS gw
JOIN (
SELECT loser_name AS player, COUNT(winner_name) AS losses
FROM games
GROUP BY loser_name
) AS gl
ON gl.player = gw.player
;
Whether using unions or joins, each distinct group that is the basis for aggregate statistics will require a separate sub-select.
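One caveat with the JOIN version: an inner join keeps only players who appear in both sub-selects, so anyone who has only won or only lost drops out of the result. The sketch below illustrates this with Python's sqlite3 and an in-memory database standing in for MySQL (an assumption of convenience). Person D is a hypothetical extra player who has only lost.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE games (winner_name TEXT, loser_name TEXT)")
conn.executemany(
    "INSERT INTO games VALUES (?, ?)",
    [
        ("Person A", "Person B"),
        ("Person A", "Person C"),
        ("Person B", "Person A"),
        ("Person C", "Person B"),
        ("Person A", "Person D"),  # hypothetical: Person D never wins
    ],
)

# Inner join of the two grouped sub-selects.
join_sql = """
SELECT gw.player, gw.wins, gl.losses
FROM (SELECT winner_name AS player, COUNT(*) AS wins
      FROM games GROUP BY winner_name) AS gw
JOIN (SELECT loser_name AS player, COUNT(*) AS losses
      FROM games GROUP BY loser_name) AS gl
  ON gl.player = gw.player
ORDER BY gw.player
"""

# UNION ALL of winners and losers, summed per player.
union_sql = """
SELECT player, SUM(win) AS wins, SUM(loss) AS losses
FROM (SELECT winner_name AS player, 1 AS win, 0 AS loss FROM games
      UNION ALL
      SELECT loser_name AS player, 0 AS win, 1 AS loss FROM games) AS x
GROUP BY player
ORDER BY player
"""

join_rows = conn.execute(join_sql).fetchall()
union_rows = conn.execute(union_sql).fetchall()
print("join: ", join_rows)   # Person D is missing
print("union:", union_rows)  # Person D appears with 0 wins, 1 loss
```

MySQL has no FULL OUTER JOIN, so when some players may have only wins or only losses, the UNION ALL form is the simpler way to keep them.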

Related

Good alternatives to comparing two subqueries in the 'Where' clause

I have this schema:
CLUB(Name, Address, City)
TEAM(TeamName, club)
PLAYER(Badge, teamName)
MATCH(matchNumber, player1, player2, club, winner)
I need to make this query:
For each club, find the number of players in that club that have won
at least two games.
I wrote this:
SELECT teamName
From TEAM t join Match m1 on t.club=m1.club
WHERE Q2 >= ALL Q1
Q1:
SELECT Count (Distinct winner)
FROM MATCH
WHERE match m join player p on m. winner=player.badge
GROUP BY teamName
Q2:
SELECT Count (distinct winner)
FROM match m2
WHERE m2.club=m1.club
I don’t know if it is correct; however, I heard that using this form, where I compare two counts, is not the best. Why?
Try something like this:
SELECT club, COUNT(*) as PlayerCount
FROM (SELECT club, winner
FROM match
GROUP BY club, winner
HAVING COUNT(*) > 1) a
GROUP BY club
The inner query should limit results to club/player combinations that have 2 or more wins, and the outer query will count the number of these players per club.
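The two-level aggregation can be tried on a small data set. This is a minimal sketch using Python's sqlite3 with an in-memory database as a stand-in (an assumption). The table is named matches here, a hypothetical rename that sidesteps the MATCH keyword, and the badge values p1 to p7 and the club names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE matches (matchNumber INTEGER, player1 TEXT, player2 TEXT,"
    " club TEXT, winner TEXT)"
)
conn.executemany(
    "INSERT INTO matches VALUES (?, ?, ?, ?, ?)",
    [
        (1, "p1", "p2", "North", "p1"),
        (2, "p1", "p3", "North", "p1"),
        (3, "p2", "p3", "North", "p2"),
        (4, "p2", "p1", "North", "p2"),
        (5, "p3", "p1", "North", "p3"),
        (6, "p4", "p5", "South", "p4"),
        (7, "p4", "p5", "South", "p4"),
        (8, "p5", "p4", "South", "p5"),
        (9, "p6", "p7", "East", "p6"),
    ],
)

# Inner query: one row per (club, winner) with at least two wins.
# Outer query: count those rows per club.
query = """
SELECT club, COUNT(*) AS PlayerCount
FROM (SELECT club, winner
      FROM matches
      GROUP BY club, winner
      HAVING COUNT(*) > 1) AS a
GROUP BY club
ORDER BY club
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Note that a club with no qualifying player (East in this data) does not appear at all; returning it with a count of zero would need an outer join from the club list.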
"I don’t know if it is correct; however, I heard that using this form, where I compare two counts, is not the best. Why?"
Comparing two count subqueries is fine if you need to, but a good rule of thumb is to hit each table as few times as possible. Using multiple subqueries will end up hitting each table multiple times, and will usually result in longer execution times.
Try this query
SELECT t.club, COUNT(*)
FROM TEAM t
JOIN PLAYER p ON p.teamName = t.TeamName
JOIN (
-- Won at least 2 matches.
SELECT club, winner, COUNT(*) AS TheCount
FROM MATCH
GROUP BY club, winner
HAVING COUNT(*) > 1
) w ON w.winner = p.badge AND w.club = t.club
GROUP BY t.club

MySQL: Count wins but only once per opponent

I'm looking for help using sum() in my SQL query:
Task: Count tournament wins of all players. (one number per player) (battles.result = 1 means Player1 wins)
SELECT members.id, members.name,
(
SELECT SUM(battles.result = 1)
FROM battles
WHERE members.id = battles.player1 AND battles.result=1 order by battles.gametime
) AS wins
FROM members
Next: Only count ONE result per two players.
So if there are multiple results of two players, count only the first result (first gametime).
I've already tried using order by battles.player2, but I guess there is a much better solution?
You can easily get the result by doing a join and aggregation instead. Try:
SELECT A.id, A.name, SUM(IFNULL(B.result,0)) wins
FROM members A LEFT JOIN battles B
ON A.id=B.player1
GROUP BY A.id, A.name;
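The query above counts every winning battle. For the second part of the question (only the first result per pair of players), one possible sketch keeps a battle only when no earlier battle exists for the same player1 and player2. It is shown with Python's sqlite3 as a stand-in for MySQL. The assumptions here are not from the question: a pair is treated as the ordered (player1, player2) combination, gametime is comparable as a string or number, and the member names and sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE members (id INTEGER, name TEXT)")
conn.execute(
    "CREATE TABLE battles (player1 INTEGER, player2 INTEGER,"
    " result INTEGER, gametime TEXT)"
)
conn.executemany(
    "INSERT INTO members VALUES (?, ?)", [(1, "Ann"), (2, "Bob"), (3, "Cy")]
)
conn.executemany(
    "INSERT INTO battles VALUES (?, ?, ?, ?)",
    [
        (1, 2, 1, "2024-01-01"),  # first Ann vs Bob: Ann wins (counted)
        (1, 2, 1, "2024-01-02"),  # rematch: ignored
        (1, 3, 0, "2024-01-01"),  # first Ann vs Cy: Ann does not win
        (1, 3, 1, "2024-01-02"),  # rematch: ignored
        (2, 3, 1, "2024-01-01"),  # first Bob vs Cy: Bob wins (counted)
    ],
)

# Keep a battle only if no earlier battle exists for the same ordered pair,
# then count the kept battles where player1 won (result = 1).
query = """
SELECT m.id, m.name, COALESCE(SUM(b.result = 1), 0) AS wins
FROM members m
LEFT JOIN battles b
  ON b.player1 = m.id
 AND NOT EXISTS (SELECT 1
                 FROM battles e
                 WHERE e.player1 = b.player1
                   AND e.player2 = b.player2
                   AND e.gametime < b.gametime)
GROUP BY m.id, m.name
ORDER BY m.id
"""
rows = conn.execute(query).fetchall()
print(rows)
```

If two battles between the same pair share the same gametime, both are kept by this filter, so a tie-breaker column would be needed in that case.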

Derived joins and aggregate functions throwing errors

I am stuck on a query. I have two tables, player and player_attributes, with player_api_id as the primary key.
I need to find the youngest and oldest players and the average overall rating of the oldest and youngest player.
Query to write the youngest and oldest player:
select player_name, birthday,YEAR(CURDATE()) - YEAR(birthday) as age from player where
birthday=(select max(birthday) from player)
or
birthday=(select min(birthday) from player)
Query for the average overall rating of all players:
SELECT player_api_id, avg(overall_rating) as avg_score
FROM (
SELECT player_api_id, overall_rating FROM player_attributes
) as p
GROUP BY player_api_id;
Error while joining:
select player_api_id, avg(overall_rating),min(birthday),max(birthday) as avg_score
FROM (
SELECT player_api_id, overall_rating FROM player_attributes
) as p
join
(select birthday from player) as p1
on p.player_api_id=p1.player_api_id
GROUP BY player_api_id;
I am confused now??
There is no reason to use subqueries just to select columns. In MySQL it is actually a bad idea, because older versions of MySQL materialize such subqueries as temporary tables.
So, just do:
select pa.player_api_id, avg(overall_rating) as avg_score,
min(p.birthday), max(p.birthday)
from player_attributes pa join
player p
on pa.player_api_id = p.player_api_id
group by pa.player_api_id;
I'm not sure if the rest of the logic is okay. But this should at least fix the syntax error.
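For the stated goal (the youngest and oldest players together with their average rating), one possible sketch combines the asker's birthday filter with a plain join. It is run here through Python's sqlite3 as a stand-in for MySQL. The sample rows are made up, and the sketch assumes player_attributes holds several rating rows per player.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE player (player_api_id INTEGER, player_name TEXT, birthday TEXT)"
)
conn.execute(
    "CREATE TABLE player_attributes (player_api_id INTEGER, overall_rating INTEGER)"
)
conn.executemany(
    "INSERT INTO player VALUES (?, ?, ?)",
    [
        (1, "Old Player", "1970-01-01"),
        (2, "Mid Player", "1985-03-03"),
        (3, "Young Player", "1999-05-05"),
    ],
)
conn.executemany(
    "INSERT INTO player_attributes VALUES (?, ?)",
    [(1, 70), (1, 80), (2, 60), (3, 90), (3, 70), (3, 80)],
)

# Filter to the earliest and latest birthdays, then average each
# remaining player's ratings.
query = """
SELECT p.player_name, p.birthday, AVG(pa.overall_rating) AS avg_score
FROM player p
JOIN player_attributes pa ON pa.player_api_id = p.player_api_id
WHERE p.birthday = (SELECT MAX(birthday) FROM player)
   OR p.birthday = (SELECT MIN(birthday) FROM player)
GROUP BY p.player_api_id, p.player_name, p.birthday
ORDER BY p.birthday
"""
rows = conn.execute(query).fetchall()
print(rows)
```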

Joining two tables twice

I'm writing a query for a basketball database. Our games table has a winnerID and a loserID, each being a teamID. I tried the following two queries, each correctly giving me the number of wins but giving me the same number for losses.
SELECT team.name as Team_Name, COUNT(team.teamID=winner.winnerID) as Wins, COUNT(team.teamID=loser.loserID) as Losses
FROM team join games winner on winner.winnerID=team.teamID join games loser on loser.loserID=team.teamID
GROUP BY team.name
ORDER BY Wins, Team_Name;
SELECT team.name as Team_Name, COUNT(team.teamID=games.winnerID) as Wins, COUNT(team.teamID=games.loserID) as Losses
FROM (team INNER JOIN games on games.winnerID=team.teamID)
GROUP BY team.name
ORDER BY Wins, team.name;
Help?
EDIT: Forgot to mention, purpose of query is to get number of wins and number of losses of each team.
The COUNT aggregate gets a count of non-NULL values. That means it includes ones and zeros.
Evaluated in a numeric context, an equality comparison returns 1 for TRUE and returns 0 for FALSE, and only returns NULL if either side (or both sides) is NULL.
To add up the ones and ignore the zeros, you could use a SUM aggregate instead.
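The difference is easy to see on a handful of rows. This is a minimal sketch using Python's sqlite3 with an in-memory database as a stand-in for MySQL (an assumption; both treat a comparison as 1 or 0 in a numeric context). The sample games are made up. It looks at the three games involving team 1 and compares COUNT and SUM over the same comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE games (winnerID INTEGER, loserID INTEGER)")
conn.executemany(
    "INSERT INTO games VALUES (?, ?)",
    [(1, 2), (1, 3), (2, 1), (3, 2)],
)

# Team 1 takes part in three games and wins two of them.
query = """
SELECT COUNT(winnerID = 1)                      AS count_of_comparison,
       SUM(winnerID = 1)                        AS sum_of_comparison,
       COUNT(CASE WHEN winnerID = 1 THEN 1 END) AS count_of_case
FROM games
WHERE winnerID = 1 OR loserID = 1
"""
result = conn.execute(query).fetchone()
print(result)
```

COUNT over the comparison counts all three rows because 0 is not NULL, while SUM, or COUNT over an expression that yields NULL for non-matches, gives the two wins.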
One of the big problems with the query is the potential to return duplicates, due to the cross join between winner and loser. If a team has 5 wins and 4 losses, the query is going to generate an intermediate set of 20 (= 5 x 4) rows.
To get a count of wins and losses from that, we'd need a unique identifier for a game (for example, a gameid column in the games table that is the PRIMARY KEY). With that, we could get a count of distinct values of gameid. For example:
COUNT(DISTINCT IF(winner.winnerid=team.teamid,winner.gameid,NULL)) AS wins
There are several query patterns that will get you the number of wins and the number of losses for each team.
Here's one example:
SELECT t.name AS team_name
, COUNT(IF(t.teamid=g.winnerid,1,NULL)) AS wins
, COUNT(IF(t.teamid=g.loserid ,1,NULL)) AS losses
FROM team t
LEFT
JOIN games g
ON ( g.winnerid = t.teamid OR g.loserid = t.teamid )
GROUP
BY t.name
ORDER
BY wins DESC
, t.name
With this, we are only joining to the games table once, so we won't get a cross product. Also note that if teamid is not equal to winnerid, we return a NULL instead of a 0. So a COUNT will include only winners, not all of the rows.
We use an outer join (rather than an inner join), in case there are no related rows in games for the team. That allows us to return a team with counts of zero.
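To see the single outer join return a team with zero counts, here is a minimal sketch run through Python's sqlite3 as a stand-in for MySQL (an assumption). CASE is used in place of MySQL's IF() so the same text runs on both. The team names and games are made up, and Nets is a hypothetical team with no games.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE team (teamid INTEGER, name TEXT)")
conn.execute("CREATE TABLE games (gameid INTEGER, winnerid INTEGER, loserid INTEGER)")
conn.executemany(
    "INSERT INTO team VALUES (?, ?)", [(1, "Hawks"), (2, "Bulls"), (3, "Nets")]
)
conn.executemany(
    "INSERT INTO games VALUES (?, ?, ?)", [(1, 1, 2), (2, 1, 2), (3, 2, 1)]
)

# One join to games; each game row matches a team at most once,
# so there is no wins x losses cross product.
query = """
SELECT t.name AS team_name,
       COUNT(CASE WHEN g.winnerid = t.teamid THEN 1 END) AS wins,
       COUNT(CASE WHEN g.loserid  = t.teamid THEN 1 END) AS losses
FROM team t
LEFT JOIN games g
  ON g.winnerid = t.teamid OR g.loserid = t.teamid
GROUP BY t.name
ORDER BY wins DESC, t.name
"""
rows = conn.execute(query).fetchall()
print(rows)
```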
We could use a SUM aggregate instead of a COUNT. For example:
SELECT t.name AS team_name
, IFNULL(SUM(t.teamid=g.winnerid),0) AS wins
, IFNULL(SUM(t.teamid=g.loserid) ,0) AS losses
FROM team t
LEFT
JOIN games g
ON ( g.winnerid = t.teamid OR g.loserid = t.teamid )
GROUP
BY t.name
ORDER
BY wins DESC
, t.name
With a SUM() we have a potential to return NULL values. To get those converted to zeros, we use an IFNULL function.
And there are several other query patterns that would get you an equivalent result... e.g. instead of using joins, use correlated subqueries in the SELECT list...
SELECT t.name
, ( SELECT COUNT(1)
FROM games w
WHERE w.winnerid = t.teamid
) AS wins
, ( SELECT COUNT(1)
FROM games l
WHERE l.loserid = t.teamid
) AS losses
FROM team t
ORDER BY wins DESC
, t.name

MySQL How to use value of variable in a Query in its own Subquery

I am in need of your help:
I have a set of relational tables set up for Users, Games and Results and I am trying to calculate each user's win percentage.
Thanks to a lot of browsing I have discovered the following method:
SELECT Winner, (COUNT(Winner) * games.TotalPlayed) AS PercentWon
FROM Results
JOIN (
SELECT 100/COUNT(*) AS TotalPlayed
FROM games
where Player_1 = 'John' OR Player_2 = 'John'
) AS games
where Winner = 'John'
GROUP BY Winner
ORDER BY PercentWon DESC;
This works perfectly for getting John's percentage, but I want it to scan the whole Results table and print values for everyone. However, I am not allowed to use:
SELECT Winner, (COUNT(Winner) * games.TotalPlayed) AS PercentWon
FROM Results
JOIN (
SELECT 100/COUNT(*) AS TotalPlayed
FROM games
where Player_1 = Winner OR Player_2 = Winner
) AS games
where Winner != 'Draw'
GROUP BY Winner
ORDER BY PercentWon DESC;
as Winner is not defined. How can I get the value of Winner passed to the subquery as it is being performed?
I've finally come up with something that'll work for you. It was tricky because the player can be in one of two fields in the games table.
SELECT player, (COALESCE(totalWon, 0) / count(*) * 100) as PercentWon
FROM ((SELECT DISTINCT Player_1 as player FROM games) UNION (SELECT DISTINCT Player_2 as player FROM games)) players
INNER JOIN games on (games.Player_1 = player or games.Player_2=player)
LEFT JOIN
(
SELECT Winner, COUNT(Winner) as totalWon
FROM Results
WHERE Winner != 'Draw'
GROUP BY Winner
) as winners
ON winners.Winner = player
GROUP BY player
ORDER BY PercentWon DESC;
The UNION in the FROM clause is building a user list that I can use to join games (to find the total number of games per player) and winners (to find the number of wins per player).
COALESCE is used for the case when a player has not won any games. Since winners is attached with a LEFT JOIN, totalWon is NULL for such a player, and COALESCE lets us default to 0 instead of NULL.
Here's an sqlfiddle based on my assumptions about your schema:
http://sqlfiddle.com/#!2/d762c/1/0
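For a self-contained check that does not depend on the fiddle, here is a minimal sketch of the same idea run through Python's sqlite3 as a stand-in for MySQL. The schema and rows are assumptions made for illustration. Two adjustments are specific to this sketch: the percentage is computed as 100.0 * wins / games because SQLite's / truncates integers (MySQL's / does not), and totalWon is wrapped in MAX() so that every selected column is either grouped or aggregated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE games (Player_1 TEXT, Player_2 TEXT)")
conn.execute("CREATE TABLE Results (Winner TEXT)")
conn.executemany(
    "INSERT INTO games VALUES (?, ?)",
    [("John", "Mary"), ("John", "Pete"), ("Mary", "Pete"), ("John", "Mary")],
)
conn.executemany(
    "INSERT INTO Results VALUES (?)", [("John",), ("Draw",), ("Mary",), ("John",)]
)

# players: every name that appears in either player column.
# The join to games gives games played; the left join to w gives wins.
query = """
SELECT p.player,
       ROUND(100.0 * COALESCE(MAX(w.totalWon), 0) / COUNT(*), 1) AS PercentWon
FROM (SELECT Player_1 AS player FROM games
      UNION
      SELECT Player_2 AS player FROM games) AS p
JOIN games g
  ON g.Player_1 = p.player OR g.Player_2 = p.player
LEFT JOIN (SELECT Winner, COUNT(*) AS totalWon
           FROM Results
           WHERE Winner <> 'Draw'
           GROUP BY Winner) AS w
  ON w.Winner = p.player
GROUP BY p.player
ORDER BY PercentWon DESC, p.player
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Pete plays two games and wins none, which is the case COALESCE covers.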