MySQL query - multiple counts using LEFT JOIN and WHERE clause

I'm currently trying to get the following data:
UserName, UserImageURL, Total Games Played, Games Completed, Games Lost, Average Won (as percentage) and Points of the user
As well as another set of data:
User Statistics data such as:
Most Games Played on League: 23 - Monster Killers
Games Most Won On: 19/23 - Monster Killers
Games Most Lost On: 3/32 - Frog Racers
Your Game Winning Accuracy (total from all games) - 68% accuracy
Site Stats:
Most Games Played on League: 650 - Helicopter Run
Top Game Played: 1200 - Monster Killers
Whole site winning accuracy: 82%
I have the following Tables:
-User Table-
userID (int-pk), userName (varchar), userImageUrl (text)
-Games table-
gameId (int-pk), gameName (varchar), gameUserID (int), gameLeagueId (int), score1 (int), score2 (int), gameResultOut (0 or 1), gameWon (0 or 1)
-UserBalance table-
ubId(int-pk) userId (int) balance (int)
-League table-
leagueId (int-pk) leagueName (varchar)
To explain what's happening: when a user plays a game and chooses some results, a row is inserted into the games table. Since the game is time based, once the results are out, a check finds the games with that id and sets gameResultOut to 1 and gameWon to 1 or 0, depending on the score the user had selected.
I tried the following:
SELECT u.userID, u.userName, u.userImageUrl, l.leagueName ,
COUNT(g.gameId) AS predTotal,
(SELECT COUNT(g.gameId) FROM games AS g WHERE g.gameResultOut = 1 AND g.gameWon = 1) AS gamesWon,
(SELECT COUNT(g.gameId) FROM games AS g WHERE g.gameResultOut = 1 AND g.gameWon = 0) AS gamesLost,
ub.balance
FROM games AS g
LEFT JOIN league AS l ON l.leagueId = g.gameLeagueId
LEFT JOIN user AS u ON u.user_id = g.gameUserID
LEFT JOIN user_balance AS ub ON ub.userId = u.userID
WHERE l.leagueId = 4
GROUP BY u.userId
ORDER BY ub.balance DESC
I can easily calculate the win percentage after the query, so that's not a problem. But the results for wins and losses are the same on every row, and they stay the same even when I change the leagueId, which is not what I want.
Can anyone help?
Thanks & Regards,
Necron

As far as I see, the games table stores games that users played. So, in order to know how many games each user played/won/lost, you're missing the link in the subqueries between games and users.
Your subqueries are:
(SELECT COUNT(g.gameId ) FROM games AS g WHERE g.gameResultOut = 1 AND g.gameWon = 1) AS gamesWon,
(SELECT COUNT(g.gameId) FROM games AS g WHERE g.gameResultOut = 1 AND g.gameWon = 0) AS gamesLost,
And they should be:
(SELECT COUNT(gw.gameId ) FROM games AS gw WHERE gw.gameResultOut = 1 AND gw.gameWon = 1 AND gw.gameUserID = u.user_id) AS gamesWon,
(SELECT COUNT(gl.gameId) FROM games AS gl WHERE gl.gameResultOut = 1 AND gl.gameWon = 0 AND gl.gameUserID = u.user_id) AS gamesLost,
I guess this is what you're looking for :)
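The correlated subqueries above still ignore the league filter, so the counts would not change with the leagueId. One possible alternative is conditional aggregation over the already-filtered rows. The following is only a minimal sketch: it uses Python's sqlite3 module as a stand-in for MySQL, the column names follow the question's table description, and the sample rows are invented for illustration.

```python
import sqlite3

# SQLite stands in for MySQL here; the sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user (userID INTEGER PRIMARY KEY, userName TEXT);
CREATE TABLE games (
    gameId INTEGER PRIMARY KEY, gameUserID INTEGER, gameLeagueId INTEGER,
    gameResultOut INTEGER, gameWon INTEGER
);
INSERT INTO user VALUES (1, 'alice'), (2, 'bob');
INSERT INTO games VALUES
    (1, 1, 4, 1, 1), (2, 1, 4, 1, 0), (3, 1, 4, 0, 0),
    (4, 2, 4, 1, 1), (5, 2, 7, 1, 1);
""")

# Every aggregate is computed over the same filtered, grouped rows,
# so the counts are per user AND per league.
rows = conn.execute("""
    SELECT u.userName,
           COUNT(g.gameId) AS predTotal,
           SUM(CASE WHEN g.gameResultOut = 1 AND g.gameWon = 1 THEN 1 ELSE 0 END) AS gamesWon,
           SUM(CASE WHEN g.gameResultOut = 1 AND g.gameWon = 0 THEN 1 ELSE 0 END) AS gamesLost
    FROM games AS g
    JOIN user AS u ON u.userID = g.gameUserID
    WHERE g.gameLeagueId = ?
    GROUP BY u.userID, u.userName
    ORDER BY u.userID
""", (4,)).fetchall()

for row in rows:
    print(row)
```

The same SELECT should carry over to MySQL largely unchanged; only the Python wrapper and the `?` placeholder style are specific to this sketch.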
EDIT based on comments, adding tips for User and Site statistics:
For that information you'll need to run several distinct queries, as most of them sum some values and/or group by a given column, which won't fit into another query. I'll try to give you some ideas so you can work on them.
User Statistics
Most Games Won or Lost
The previous answer for the query you provided counts how many times a user has lost/won any game, but does not break this data down by game.
So, if you want to know in which game a user has the most wins/losses, you should have something like this:
SELECT
g.gameName,
-- How many times the user won each game
SUM(g.gameResultOut = 1 AND g.gameWon = 1) AS gamesWon,
-- How many times the user played each game
COUNT(g.gameId) AS gamesPlayed,
-- The win ratio. This may need a little work, depending on what you want.
-- Be aware that if a user played a game 1 time and won, its ratio will be 1 (100%)
-- So maybe you'll want to add your own rule to determine which game should show up here
-- (MySQL does not allow select aliases to be reused in the same select list, so the expressions are repeated)
SUM(g.gameResultOut = 1 AND g.gameWon = 1) / COUNT(g.gameId) AS winRatio
FROM
games g
INNER JOIN user u ON u.user_id = g.gameUserID
-- Add "WHERE u.user_id = <id>" here to restrict the result to a single user
-- Groups and counts data per game name and user
GROUP BY g.gameName, u.user_id
-- Now you order by the win ratio
ORDER BY winRatio DESC
-- And get only the first result, which is the game with the best win ratio.
LIMIT 1
For lost games, it's pretty much the same query, changing the desired fields and maths.
Game winning accuracy
Much like the previous query, except that you won't group by the game anymore. Just group by the user and do your math.
Site Statistics
Well, as far as I see, we're still on a similar query. The difference is that for the whole Site statistics you won't ever group by user. You may group by game or league, depending on what you are trying to achieve.
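To make the grouping choices concrete, here is a rough sketch of three of the statistics: accuracy per user (group by user), site-wide accuracy (no grouping), and the most played game (group by game). It assumes that only rows with gameResultOut = 1 should count towards accuracy, uses Python's sqlite3 module as a stand-in for MySQL, and runs on invented sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE games (
    gameId INTEGER PRIMARY KEY, gameName TEXT, gameUserID INTEGER,
    gameResultOut INTEGER, gameWon INTEGER
);
INSERT INTO games VALUES
    (1, 'Monster Killers', 1, 1, 1),
    (2, 'Monster Killers', 1, 1, 1),
    (3, 'Monster Killers', 1, 1, 1),
    (4, 'Frog Racers',     1, 1, 0),
    (5, 'Frog Racers',     2, 1, 0),
    (6, 'Frog Racers',     2, 1, 1),
    (7, 'Frog Racers',     2, 1, 0),
    (8, 'Monster Killers', 2, 1, 0),
    (9, 'Monster Killers', 2, 0, 0);
""")

# Percentage of finished games that were won (assumption: unfinished games are ignored).
ACCURACY = "100.0 * SUM(gameWon) / COUNT(*)"

# User statistics: group by user.
user_accuracy = conn.execute(f"""
    SELECT gameUserID, {ACCURACY} FROM games
    WHERE gameResultOut = 1
    GROUP BY gameUserID ORDER BY gameUserID
""").fetchall()

# Site statistics: same expression, no grouping by user.
site_accuracy = conn.execute(f"""
    SELECT {ACCURACY} FROM games WHERE gameResultOut = 1
""").fetchone()[0]

# Top game played: group by game, keep the largest count.
top_game = conn.execute("""
    SELECT gameName, COUNT(*) AS plays FROM games
    GROUP BY gameName ORDER BY plays DESC, gameName LIMIT 1
""").fetchone()

print(user_accuracy, site_accuracy, top_game)
```

Swapping gameName for the league id in the last query would give the "most games played on league" figure in the same way.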
Bottom line: it looks like most of the queries are similar; you'll have to play with them and adapt them for each piece of information you need to retrieve. Please note that they might not work as-is, since I could not test them on your DB. You may need to correct some inconsistencies according to your database/table schema.
I hope this may give you some insight to work on.

Related

Relational Database Logic

I'm fairly new to php / mysql programming and I'm having a hard time figuring out the logic for a relational database that I'm trying to build. Here's the problem:
I have different leaders who will be in charge of a store anytime between 9am and 9pm.
A customer who has visited the store can rate their experience on a scale of 1 to 5.
I'm building a site that will allow me to store the shifts that a leader worked as seen below.
When I hit submit, the site would take the data leaderName:"George", shiftTimeArray: 11am, 1pm, 6pm (from the example in the picture) and the shiftDate and send them to an SQL database.
Later, I want to be able to get the average score for a person by sending a query to mysql, retrieving all of the scores that that leader received and averaging them together. I know the code to build the forms and to perform the search. However, I'm having a hard time coming up with the logic for the tables that will relate the data. Currently, I have a mysql table called responses that contains the following fields,
leader_id
shift_date // contains the date that the leader worked
shift_time // contains the time that the leader worked
visit_date // contains the date that the survey/score was given
visit_time // contains the time that the survey/score was given
score // contains the actual score of the survey (1-5)
I enter the shifts that the leader works at the beginning of the week and then enter the survey scores in as they come in during the week.
So Here's the Question: What mysql tables and fields should I create to relate this data so that I can query a leader's name and get the average score from all of their surveys?
You want tables like:
Leader (leader_id, name, etc)
Shift (leader_id, shift_date, shift_time)
SurveyResult (visit_date, visit_time, score)
Note: omitted the surrogate primary keys for Shift and SurveyResult that I would probably include.
To query, you join shifts and surveys, group on leader and take the average, then join that back to leader for a name.
The query might be something like this (but I haven't actually built it in MySQL to verify the syntax):
SELECT name
,AverageScore
FROM Leader a
INNER JOIN (
SELECT leader_id
, AVG(score) AverageScore
FROM Shift
INNER JOIN
SurveyResult ON shift_date = visit_date
AND shift_time = visit_time -- depends on how you are recording time what this really needs to be
GROUP BY leader_id
) b ON a.leader_id = b.leader_id
I would do the following structure:
leaders
id
name
leaders_timetable (can be multiple per leader)
id,
leader_id
shift_datetime (I assume it stores date and hour here; minutes and seconds are always 0)
survey_scores
id,
visit_datetime
score
SELECT l.id, l.name, AVG(s.score) FROM leaders l
INNER JOIN leaders_timetable lt ON lt.leader_id = l.id
INNER JOIN survey_scores s ON lt.shift_datetime = DATE_FORMAT(s.visit_datetime, '%Y-%m-%d %H:00:00')
GROUP BY l.id
DATE_FORMAT here zeroes out the minutes and seconds of visit_datetime so that it can be matched against shift_datetime. This is a MySQL function, so if you use something else you'll need a different function.
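The hour-truncation join can be tried out in a small self-contained sketch. It uses Python's sqlite3 module, so SQLite's strftime plays the role of MySQL's DATE_FORMAT (which takes the date first and a %-style format string second). The table names follow the structure above; the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE leaders (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE leaders_timetable (id INTEGER PRIMARY KEY, leader_id INTEGER, shift_datetime TEXT);
CREATE TABLE survey_scores (id INTEGER PRIMARY KEY, visit_datetime TEXT, score INTEGER);
INSERT INTO leaders VALUES (1, 'George'), (2, 'Maria');
INSERT INTO leaders_timetable VALUES
    (1, 1, '2024-03-04 11:00:00'), (2, 1, '2024-03-04 13:00:00'),
    (3, 2, '2024-03-04 12:00:00');
INSERT INTO survey_scores VALUES
    (1, '2024-03-04 11:15:42', 5), (2, '2024-03-04 13:59:59', 2),
    (3, '2024-03-04 12:30:00', 4), (4, '2024-03-04 20:05:00', 1);
""")

# Truncate each visit to the start of its hour, then match it to a shift hour.
rows = conn.execute("""
    SELECT l.name, AVG(s.score)
    FROM leaders l
    JOIN leaders_timetable lt ON lt.leader_id = l.id
    JOIN survey_scores s
      ON lt.shift_datetime = strftime('%Y-%m-%d %H:00:00', s.visit_datetime)
    GROUP BY l.id, l.name
    ORDER BY l.id
""").fetchall()

for name, avg_score in rows:
    print(name, avg_score)
```

The 20:05 survey matches no shift and is simply left out of every average, which is the behaviour an inner join gives here.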
Say you have a 'leader' who has 5 survey rows with scores 1, 2, 3, 4 and 5.
If you select all surveys for this leader, sum the survey scores and divide by 5 (the total number of surveys this leader has), you get the average, in this case 3.
(1 + 2 + 3 + 4 + 5) / 5 = 3
You wouldn't need to create any more tables or fields, you have what you need.

Count, max, and multiple sub querys SQL

I'm currently working on a league system for my sports team: a ladder, as seen in some video games.
It's a mobile web site that allows coaches to create games and monitor players' performances.
Games are balanced automatically, taking into account the players' experience and points. Then I give bonus points to all the players of the winning team, and remove points from the losers.
I have a relatively simple database. 3 tables.
User : id - name
Games : id - ETA - cration_date
game_joueur: id- id_game - id_joueur - team - result - bonus
game_joueur is an association table in which I register, for each new game, each player's id and the team he has been seeded on, and afterwards update the bonus field with the points earned and the result field with an integer (1 = lose, 2 = win).
That way i can sum the bonus on my players stat and get the total points.
You can have a better look at the table here :
http://sqlfiddle.com/#!2/d3e06/2
What I'm trying to accomplish is, for each player's stat page, to retrieve from the database the name of his most successful partner (the guy with whom he won the most games), and also his worst ally, the one he lost the most matches with.
This is what i do on my user stat page :
SELECT
(SELECT COUNT(lad_game_joueur.result) FROM lad_game_joueur WHERE result = 1 AND lad_game_joueur.id_joueur = lad_user.id) as lose,
(SELECT SUM(lad_game_joueur.bonus) FROM lad_game_joueur WHERE lad_game_joueur.id_joueur = lad_user.id) as points,
lad_user.id as id ,
(SELECT COUNT(lad_game_joueur.result) FROM lad_game_joueur WHERE lad_game_joueur.id_joueur = lad_user.id AND result =2) as win,
lad_user.name
FROM lad_user,lad_game_joueur
WHERE lad_game_joueur.id_joueur = lad_user.id AND lad_user.id = '.$id_joueur.'
GROUP BY lad_user.id
ORDER BY points DESC
I'm sure this is not the best way to do it, but it works :) (I'm no SQL specialist)
How can I tune this query to also retrieve the information I'm looking for?
I won't mind doing another query.
Thanks a lot in advance!
Ben
OK, I finally found a way.
Here's what i did :
SELECT
SUM(result)as result_sum, sum(Bonus) as bonus_sum, id_joueur
from lad_game_joueur
where result= 2
and id_game in
(SELECT lad_game_joueur.id_game from lad_game_joueur,lad_game where id_joueur=2
AND result= 2 and lad_game_joueur.id_game=lad_game.id)
group by id_joueur
order by result_sum DESC, bonus_sum desc
As you see, the sum of result would give me 4 if I won two games with the person, but I just divide by 2 in PHP and voilà :)
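A possible variation on this approach counts rows instead of summing result, so nothing has to be divided by 2 afterwards, and it excludes the player himself from his own list. This is only a sketch: it assumes teammates share the same id_game and team values, uses Python's sqlite3 module in place of MySQL, and the helper name top_partner and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE lad_user (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE lad_game_joueur (
    id INTEGER PRIMARY KEY, id_game INTEGER, id_joueur INTEGER,
    team INTEGER, result INTEGER, bonus INTEGER
);
INSERT INTO lad_user VALUES (1, 'Ann'), (2, 'Ben'), (3, 'Cy'), (4, 'Dee');
-- result: 1 = lose, 2 = win (as in the question)
INSERT INTO lad_game_joueur (id_game, id_joueur, team, result, bonus) VALUES
    (10, 2, 1, 2, 5), (10, 1, 1, 2, 5), (10, 3, 2, 1, -5), (10, 4, 2, 1, -5),
    (11, 2, 1, 2, 5), (11, 1, 1, 2, 5), (11, 3, 2, 1, -5), (11, 4, 2, 1, -5),
    (12, 2, 1, 2, 5), (12, 3, 1, 2, 5), (12, 1, 2, 1, -5), (12, 4, 2, 1, -5),
    (13, 2, 1, 1, -5), (13, 4, 1, 1, -5), (13, 1, 2, 2, 5), (13, 3, 2, 2, 5);
""")

def top_partner(player_id, result):
    """Return (name, shared_games) for the teammate sharing the most games with this result."""
    return conn.execute("""
        SELECT u.name, COUNT(*) AS shared_games
        FROM lad_game_joueur me
        JOIN lad_game_joueur mate
          ON mate.id_game = me.id_game
         AND mate.team = me.team
         AND mate.id_joueur <> me.id_joueur
        JOIN lad_user u ON u.id = mate.id_joueur
        WHERE me.id_joueur = ? AND me.result = ?
        GROUP BY mate.id_joueur, u.name
        ORDER BY shared_games DESC, u.name
        LIMIT 1
    """, (player_id, result)).fetchone()

best_ally = top_partner(2, 2)   # most wins together
worst_ally = top_partner(2, 1)  # most losses together
print(best_ally, worst_ally)
```

Passing the player id as a bound parameter also avoids concatenating it into the SQL string.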

Calculate a variable using 2 Mysql tables and make a select based on that variable

I own an online game in which you become the coach of a rugby team and I recently started to optimize my database. The website uses CodeIgniter framework.
I have the following tables (the tables have more fields but I posted only those which are important now):
LEAGUES: id
STANDINGS: league_id, team_id, points
TEAMS: id, active
Previously, the LEAGUES table had a field named teams. It held the number of active teams in that league (teams whose users logged in recently).
So, I was doing the following select to get a random league that has between 0 and 4 active teams (leagues with fewer teams first).
SELECT id FROM LEAGUES WHERE teams>0 AND teams<4 ORDER BY teams ASC, RAND( ) LIMIT 1
Is there any way I can do the same command now without having to add the teams field?
Is it efficient? Or is it better to keep the teams field in the database?
LATER EDIT
This is what I did until now:
function test()
{
$this->db->select('league_id, team_id');
$this->db->join('teams', 'teams.id = standings.team_id');
$this->db->where('active', 0);
$query = $this->db->get('standings');
return $query->result_array();
}
The function returns all inactive teams along with their league_id.
Now how do I count the number of inactive teams in each league, and how do I sort the leagues by this number?
Try this:
select league_id
from standings s
join teams t on t.id = s.team_id and t.active
group by 1
having count(*) < 5
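To also get the "fewest teams first, then random" ordering from the original select, the grouped count can be reused in ORDER BY. A minimal sketch follows, using Python's sqlite3 module as a stand-in for MySQL (SQLite spells the random function RANDOM() where MySQL uses RAND()). The threshold of 5 is taken from the answer above and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE teams (id INTEGER PRIMARY KEY, active INTEGER);
CREATE TABLE standings (league_id INTEGER, team_id INTEGER, points INTEGER);
INSERT INTO teams VALUES
    (1, 1), (2, 1), (3, 0), (4, 1), (5, 1), (6, 1), (7, 1), (8, 1), (9, 0), (10, 1);
INSERT INTO standings VALUES
    (100, 1, 0), (100, 2, 0), (100, 3, 0),
    (200, 4, 0), (200, 5, 0), (200, 6, 0), (200, 7, 0), (200, 8, 0),
    (300, 9, 0),
    (400, 10, 0);
""")

# The inner join keeps only active teams, so leagues with no active team
# never appear (the old "teams > 0" condition comes for free).
rows = conn.execute("""
    SELECT s.league_id, COUNT(*) AS active_teams
    FROM standings s
    JOIN teams t ON t.id = s.team_id AND t.active = 1
    GROUP BY s.league_id
    HAVING COUNT(*) < 5
    ORDER BY active_teams ASC, RANDOM()
""").fetchall()

print(rows)
```

Adding LIMIT 1 to the query would return a single league, as the original select did.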

Using SQL to Aggregate and Calculate Stats

I have a shoot 'em game where users compete against each other over the course of a week to accumulate the most points. I want to write a query that aggregates statistical data from the shots table. The tables and relationships of concern here are:
user has many competition_periods
competition_period belongs to user
competition_period has many shots
shot belongs to competition_period
In the shots table I have the following fields to work with:
result --> string values: WON, LOST or TIED
amount_won --> integer values: e.g., -100, 0, 2000, etc.
For each user, I want to return a result set with the following aggregated stats:
won_count
lost_count
tied_count
total_shots_count (won_count + lost_count + tied_count)
total_amount_won (sum of amount_won)
avg_amount_won_per_shot (total_amount_won / total_shots_count)
I've worked on this query for a few hours now, but haven't made much headway. The statistical functions trip me up. A friend suggested that I try to return the results in a new virtual table called shot_records.
Here is the basic solution, computing the statistics across all shots for a given player (you didn't specify if you want them on a per-competition-period basis or not):
SELECT U.user_id, SUM(IF(result = 'WON', 1, 0)) AS won_count,
SUM(IF(result = 'LOST', 1, 0)) AS lost_count,
SUM(IF(result = 'TIED', 1, 0)) AS tied_count,
COUNT(*) AS total_shots_count,
SUM(amount_won) AS total_amount_won,
(SUM(amount_won) / COUNT(*)) AS avg_amount_won_per_shot
FROM user U INNER JOIN competition_periods C ON U.user_id = C.user_id
INNER JOIN shots S ON C.competition_period_id = S.competition_period_id
GROUP BY U.user_id
Note that this includes negatives in calculating the "total won" figure (that is, the total is decreased by losses). If that's not the correct algorithm for your game, you would change SUM(amount_won) to SUM(IF(amount_won > 0, amount_won, 0)) in both places it occurs in the query.
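The conditional-aggregation pattern can be checked on a few rows. The sketch below uses Python's sqlite3 module as a stand-in for MySQL, with CASE WHEN in place of MySQL's IF(). The table and column names (competition_periods, shots, competition_period_id, user_id) are assumptions carried over from the query above, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE competition_periods (competition_period_id INTEGER PRIMARY KEY, user_id INTEGER);
CREATE TABLE shots (competition_period_id INTEGER, result TEXT, amount_won INTEGER);
INSERT INTO competition_periods VALUES (1, 7), (2, 7), (3, 8);
INSERT INTO shots VALUES
    (1, 'WON', 2000), (1, 'LOST', -100), (2, 'TIED', 0), (2, 'WON', 500),
    (3, 'LOST', -100), (3, 'LOST', -300);
""")

# One pass over the joined rows; each SUM(CASE ...) counts one kind of result.
rows = conn.execute("""
    SELECT c.user_id,
           SUM(CASE WHEN s.result = 'WON'  THEN 1 ELSE 0 END) AS won_count,
           SUM(CASE WHEN s.result = 'LOST' THEN 1 ELSE 0 END) AS lost_count,
           SUM(CASE WHEN s.result = 'TIED' THEN 1 ELSE 0 END) AS tied_count,
           COUNT(*) AS total_shots_count,
           SUM(s.amount_won) AS total_amount_won,
           1.0 * SUM(s.amount_won) / COUNT(*) AS avg_amount_won_per_shot
    FROM competition_periods c
    JOIN shots s ON s.competition_period_id = c.competition_period_id
    GROUP BY c.user_id
    ORDER BY c.user_id
""").fetchall()

for row in rows:
    print(row)
```

The `1.0 *` is only needed in SQLite, where dividing two integers truncates; MySQL's `/` already returns a fractional result.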

Determining Rookie Years in Lahman Database

I'm using the MySQL version of the Lahman Baseball Database and I'm having trouble trying to determine the year a player lost their rookie standing. The rules for an MLB player losing rookie standing are:
A player shall be considered a rookie unless, during a previous season or seasons, he has (a) exceeded 130 at-bats or 50 innings pitched in the Major Leagues; or (b) accumulated more than 45 days on the active roster of a Major League club or clubs during the period of 25-player limit (excluding time in the military service and time on the disabled list).
Is there a query that can be run to do this for Batters and Pitchers, or is this something that would be programmatically done?
Using the Lahman Database you can figure out Rookies by At Bats (>130) and Innings Pitched (>50), however there isn't anything for service time during the 25 man roster (non-Sept) limit.
You would need retrosheets {http://www.retrosheet.org/game.htm} data to do that.
The queries below would give you ALL of the rookies by At Bats and Innings Pitched; the service-time rookies would be the exception. There are only a few of those, as teams don't tend to keep rookies on the MLB roster and not play them: they lose development time (not playing) and accelerate their service time, which costs the team controlled years. So if you're happy with that, these tables will do.
You can use this as an Xref table with batters or pitchers to highlight their rookie year. Or you could add an extra column to batters and pitchers with the RookieYr distinction (I advise against it: a separate table means less customizing when you add new seasons to your Lahman DB).
/************************************ Create MLB Rookie Xref Table **********************************************
-- Sort Out Batters who accumulate 130 AB
-- Sort Out Pitchers who accumulate 50 IP
-- Define Rookie Year, Drop off years previous and years after
-- Can be updated Annually using "player ID not in (select distinct playerID from Xref_RookieYr)
-- Using the Sean Lahman Database
-- Authored By Paul DeVos {www.linkedin.com/in/devosp/}
*****************************************************************************************************************/
/****** Query uses T-SQL, Query ran in MS SQL 2012 - you may need to tweak for other platforms or versions. ******/
--Step 1 - Run this for hitter accumulated ABs and when Rookie Year (130 Career At Bats)
Select
concat(m.nameFirst, ' ', m.nameLast) as Name,
b.PlayerID,
b.yearID,
m.debut,
sum(b.ab) over (partition by b.playerID order by b.playerID, b.yearID) as CumulativeAB,
null as CumulativeIP, -- Place Holder for Rookie Pitchers Insert
case when sum(b.ab) over (partition by b.playerID order by b.playerID, b.yearID) > 130 then b.yearID end as RookieYR -- rule says "exceeded 130 at-bats"
into #temp_rookie_year
from
[master] m
inner join Batting b
on m.playerID=b.playerID
-- Selects Position Players
where b.playerID not in (select distinct f.playerID from Fielding f where f.pos = 'P')
--Step 2 - Run this to get accumulated IP and Rookie Year (50 Career IP)
Insert into #temp_rookie_year
(
Name, PlayerID, YearID, Debut, CumulativeAB, CumulativeIP, RookieYR
)
Select
concat(m.nameFirst, ' ', m.nameLast) as Name,
p.PlayerID,
p.yearID,
m.debut,
null as CumulativeAB,
sum(p.IPouts) over (partition by p.playerID order by p.playerID, p.yearID) as CumulativeIP,
case when sum(p.IPouts) over (partition by p.playerID order by p.playerID, p.yearID) > 150 then p.yearID end as RookieYR -- 50 IP = 150 outs; rule says "exceeded"
from [master] m
inner join pitching p
on m.playerID=p.playerID
--Chooses Pitchers
where p.playerID in (select distinct f.playerID from Fielding f where f.pos = 'P')
--Step 3 Run this - sorts out the rookie year into Rookie Xref Table
select Name, PlayerID, min(RookieYr) as RookieYear
into #Xref_RookieYr
from #temp_rookie_year
--where name = 'Hank Aaron'
group by Name, PlayerID
order by RookieYear desc
--Step 4 - run IF you want to remove players who never lost rookie status (cup of coffee players, etc - anyone under 130 AB or 50 IP)
select * from #Xref_RookieYr
order by playerID
Delete from #Xref_RookieYr where RookieYear is null
select * from #Xref_RookieYr
order by playerID
/*****************************************************************************************************************
You can drop the "#" in front of the table (and name it whatever you want) when you want a permanent table.
If you leave it, it'll drop off when you close the program. e.g. Xref_Rookie_2013
*****************************************************************************************************************/
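Since the question asked about MySQL and the script above is T-SQL, here is a rough sketch of the same cumulative at-bats idea without window functions or temp tables, so it should also work on older MySQL versions. It uses Python's sqlite3 module as a stand-in. The Batting columns (playerID, yearID, AB) follow the Lahman schema, the player IDs and numbers are invented, and only the at-bats rule is covered.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Batting (playerID TEXT, yearID INTEGER, AB INTEGER);
INSERT INTO Batting VALUES
    ('smithjo01', 2001, 40), ('smithjo01', 2002, 95), ('smithjo01', 2003, 500),
    ('doeja01',   2005, 12);
""")

# For every season, sum the player's at-bats up to and including that season,
# then keep the first season in which the career total exceeds 130.
rows = conn.execute("""
    SELECT career.playerID, MIN(career.yearID) AS RookieYear
    FROM (
        SELECT b.playerID, b.yearID,
               (SELECT SUM(p.AB) FROM Batting p
                 WHERE p.playerID = b.playerID AND p.yearID <= b.yearID) AS CumulativeAB
        FROM Batting b
    ) AS career
    WHERE career.CumulativeAB > 130
    GROUP BY career.playerID
    ORDER BY career.playerID
""").fetchall()

print(rows)
```

Players who never pass the threshold (the second invented player here) drop out of the result, which matches Step 4 above. The pitching side would follow the same shape with IPouts and a threshold of 150.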
This can be done in SQL. How it is done will be based upon what is the most optimal way of doing it. Most likely it could be done with one query like so (pseudo-code):
SELECT Master.*
FROM Master
LEFT JOIN Batting ON Master.player_id = Batting.player_id
LEFT JOIN Pitching ON Master.player_id = Pitching.player_id
WHERE Batting.AB > 130 OR Pitching.IPOuts > (50 * 3)
OR Master.DaysActive > 45 -- hypothetical column; see the note below
That last part of the WHERE statement is a bit iffy because I don't find anything like that in the data from your database provider. I see active games but that isn't the same thing. The Appearances table might get you close but that is about all you can do.
Here is the data I based my pseudo-code off of:
http://baseball1.com/files/database/readme58.txt
I did find another guy who was doing something similar to what you are doing (including calculating who is a rookie). Here is his site (with code):
http://baseballsimulator.com/blog/category/database/