Sean Lahman database sample queries - mysql

Is there any place where I can find sample queries (SELECT, UPDATE, DELETE) for the Sean Lahman database? I want to see what can be done with this database.

I used to ship the database with sample queries. Maybe I should revisit that idea. Here are a few to get you started.
A simple one to show all of the players named "Sean":
SELECT nameLast, nameFirst, debut
FROM Master
WHERE (nameFirst="Sean")
ORDER BY nameLast;
Here's one to show a list of players with 50 HRs in a season:
SELECT Master.nameLast, Master.nameFirst, Batting.HR, Batting.yearID
FROM Batting INNER JOIN Master ON Batting.playerID = Master.playerID
WHERE (((Batting.HR)>=50))
ORDER BY Batting.HR DESC;
Here's one to show the all-time leaders in strikeouts:
SELECT Master.nameLast, Master.nameFirst, Sum(Pitching.SO) AS SumOfSO
FROM Pitching INNER JOIN Master ON Pitching.playerID = Master.playerID
GROUP BY Pitching.playerID, Master.nameLast, Master.nameFirst
ORDER BY SumOfSO DESC;
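To try the aggregate-and-sort pattern of the strikeout query without installing the full database, here is a minimal sketch using Python's built-in sqlite3 module. The two tables are cut down to the columns the query touches, and the players and totals are made up for illustration; they are not Lahman data. Sorting by the summed column in descending order is what puts the leaders first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Master   (playerID TEXT PRIMARY KEY, nameLast TEXT, nameFirst TEXT);
    CREATE TABLE Pitching (playerID TEXT, yearID INTEGER, SO INTEGER);
    -- Hypothetical players and season totals, for illustration only.
    INSERT INTO Master VALUES ('alpha01', 'Alpha', 'Al'), ('beta01', 'Beta', 'Bo');
    INSERT INTO Pitching VALUES ('alpha01', 2001, 300), ('alpha01', 2002, 250),
                                ('beta01', 2001, 350);
""")

# Sum strikeouts per player, then sort by the total so the leaders come first.
leaders = conn.execute("""
    SELECT Master.nameLast, Master.nameFirst, SUM(Pitching.SO) AS SumOfSO
    FROM Pitching INNER JOIN Master ON Pitching.playerID = Master.playerID
    GROUP BY Pitching.playerID, Master.nameLast, Master.nameFirst
    ORDER BY SumOfSO DESC
""").fetchall()

for row in leaders:
    print(row)
```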
There are several websites with tutorials on using the database that include sample queries. See:
http://webdev.cas.msu.edu/cas992/weeks/week5.html
http://www.hardballtimes.com/main/article/databases-for-sabermetricians-part-one/
You can find more by googling 'sql lahman'

This database can do almost anything except game-by-game analysis; for that you will need to go to Retrosheet (http://www.retrosheet.org/game.htm).
If, say, you want to replicate the totals you see on Baseball-Reference.com, you can easily do that.
If you want some advanced metrics (sabermetric-style stats), I recommend Tom Tango's website. There you can find help writing your own queries for wOBA. You can also try to duplicate FanGraphs' or Baseball-Reference's WAR.
Basically, you can get anything you want from this database (provided you can do the calculations and master the SQL syntax) except game-by-game or pitch-by-pitch data.
Here's a query to determine salary by games played (offensively and defensively) to figure out how much a player costs/makes per game. (T-SQL w/ SQL Server 2012 Express)
select
m.namefirst,
m.namelast,
s.yearID,
s.teamID,
s.salary,
Cast ('162' as Int) as FullSeason,
round(sum(s.salary)*1.00/162,0) as Game_Rate,
sum (case when s.playerID=b.playerID then f.g else 0 end) as Gm_App_Field,
b.g as Batting,
--sum(case when s.playerID=b.playerID and s.yearID=b.yearID then b.g else 0 end) as Gm_App_Hit,
sum (case when s.playerID=b.playerID then f.innouts else 0 end) as InnOuts,
sum(F.InnOuts)/27 as FullGames,
round((sum (case when s.playerID=b.playerID then f.g else 0 end)/162.0)*s.salary,0) as PayByGmFielding,
round(sum(b.g*s.salary)/162,0) as PayByGmHitting,
round((sum(F.InnOuts)/27)*(s.salary/162),0) as PlayingSalary
from Fielding f
inner join batting b
on f.playerID=b.playerID and f.yearID=b.yearID
inner join salaries s
on f.playerID=s.playerID and f.yearID=s.yearID
inner join [master] m
on b.playerID=m.playerID and f.playerID=m.playerID and s.playerID=m.playerID
where
f.yearID = '2013' and f.POS <> 'P' --b.playerID = 'zimmejo02'
group by
m.namefirst,m.namelast, s.yearID , s.teamID, s.salary, b.g
Which outputs this:
namefirst namelast yearID teamID salary FullSeason Game_Rate Gm_App_Field Batting InnOuts FullGames PayByGmFielding PayByGmHitting PlayingSalary
A.J. Pollock 2013 ARI 491000 162 9093 119 137 2897 107 360672 1245685 324302
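As a quick sanity check on the sample row, the per-game arithmetic can be redone by hand. The sketch below uses only the numbers printed in that row. Note that 491000 / 162 is roughly 3031, while the Game_Rate column shows 9093. One possible reading, which the source does not confirm, is that the joins produce three rows for this player (for example several Fielding rows), so sum(s.salary) counts the salary three times; the variable assumed_rows encodes that assumption.

```python
salary = 491000        # salary column of the sample row
season_games = 162     # FullSeason column
inn_outs = 2897        # InnOuts column

# Salary per scheduled game, rounded like round(x, 0) in the query.
per_game = round(salary / season_games)

# Outs recorded divided by 27 outs per game; integer division matches
# the FullGames value shown in the sample row.
full_games = inn_outs // 27

# Assumption: three joined rows per player, so the summed salary is tripled.
assumed_rows = 3
game_rate_with_fanout = round(salary * assumed_rows / season_games)

print(per_game, full_games, game_rate_with_fanout)
```

If the tripled figure is not what you want, dividing s.salary itself (rather than its sum over the joined rows) by 162 gives the per-game rate.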
You can also create your own searches. Here's one for players who have more BB than SO, with full player cards, including the WAR I came up with (it may be off a little from FanGraphs or Baseball-Reference). (T-SQL w/ SQL Server 2012 Express)
--1. Retrieves full player records of guys with more BB than SO
select
m.namefirst,
m.namelast,
b.yearID,
b.yearID-m.birthyear as Age,
b.G,b.AB,b.R,b.H,b.[2B],b.[3B],b.HR,b.RBI,b.SB,b.BB,b.SO, left(round((b.bb*1.000/b.SO),3),4) [BB/SO Rate], left(round((b.h*1.000/b.ab),3),5) as Average
,b.IBB,b.HBP,b.SH,b.SF,b.SF,b.GIDP,case when br.yearID=b.yearID and br.playerID=b.playerID then br.War else 'error' end as WAR
from [master] m
inner join batting b on b.playerID=m.playerID
inner join BR_WAR_2013 br on br.playerID=m.playerID
where b.SO <> 0 and b.AB > 300 and b.bb>b.SO
group by
m.namefirst,
m.namelast,
b.yearID,
b.yearID-m.birthyear,
b.G,b.AB,b.R,b.H,b.[2B],b.[3B],b.HR,b.RBI,b.SB,b.BB,b.SO, left(round((b.bb*1.000/b.SO),3),4), left(round((b.h*1.000/b.ab),3),5)
,b.IBB,b.HBP,b.SH,b.SF,b.SF,b.GIDP,case when br.yearID=b.yearID and br.playerID=b.playerID then br.War else 'error' end
having case when br.yearID=b.yearID and br.playerID=b.playerID then br.War else 'error' end <> 'error'
order by b.yearID desc, left(round((b.bb*1.000/b.SO),3),4) desc


Optimizing Parameterized MySQL Queries

I have a query with a number of parameters which, if I run it from MySQLWorkbench, takes around a second to run.
If I take this query, get rid of the parameters and substitute the values directly into the query, it takes about 22 seconds to run. The same happens if I convert the query to a parameterized stored procedure and run that (it then also takes about 22 seconds).
I've enabled profiling on MySQL and I can see a few things there. For example, it shows the number of rows examined, and there's an order of magnitude difference (20,000 vs. 400,000), which I assume is the reason for the 20x increase in processing time.
The other difference in the profile is that the parameterized query sent from MySQLWorkbench still has the parameters in it (e.g. where limit < @lim), while in the sproc the values have been substituted (where limit < 300).
I've tried this a number of different ways. I'm using JetBrains' DataGrip as well as MySQLWorkbench, and it works like MySQLWorkbench (it sends the @ parameters through). I've tried executing the queries and the sproc from MySQLWorkbench, DataGrip, Java (JDBC) and .NET. I've also tried prepared statements in Java, but I can't get anywhere near the performance of sending the 'raw' SQL to MySQL.
I feel like I'm missing something obvious here but I don't know what it is.
The query is relatively complex: it has a CTE, a couple of sub-selects and a couple of joins, but as I said it runs quickly straight from MySQL.
My main question is why the query is 20x faster in one format than another.
Does the way the query is sent to MySQL have anything to do with this (the '@' values being sent through), and can I replicate this in a stored procedure?
Updated 1st Jan
Thanks for the comments. I didn't post the query originally as I'm more interested in the general concepts around the use of variables/parameters and how I could take advantage of them (or not).
Here is the original query:
with tmp_bat as (select bd.MatchId,
bd.matchtype,
bd.playerid,
bd.teamid,
bd.opponentsid,
bd.inningsnumber,
bd.dismissal,
bd.dismissaltype,
bd.bowlerid,
bd.fielderid,
bd.score,
bd.position,
bd.notout,
bd.balls,
bd.minutes,
bd.fours,
bd.sixes,
bd.hundred,
bd.fifty,
bd.duck,
bd.captain,
bd.wicketkeeper,
m.hometeamid,
m.awayteamid,
m.matchdesignator,
m.matchtitle,
m.location,
m.tossteamid,
m.resultstring,
m.whowonid,
m.howmuch,
m.victorytype,
m.duration,
m.ballsperover,
m.daynight,
m.LocationId
from (select *
from battingdetails
where matchid in
(select id
from matches
where id in (select matchid from battingdetails)
and matchtype = @match_type
)) as bd
join matches m on m.id = bd.matchid
join extramatchdetails emd1
on emd1.MatchId = m.Id
and emd1.TeamId = bd.TeamId
join extramatchdetails emd2
on emd2.MatchId = m.Id
and emd2.TeamId = bd.TeamId
)
select players.fullname name,
teams.teams team,
'' opponents,
players.sortnamepart,
innings.matches,
innings.innings,
innings.notouts,
innings.runs,
HS.score highestscore,
HS.NotOut,
CAST(TRUNCATE(innings.runs / (CAST((Innings.Innings - innings.notOuts) AS DECIMAL)),
2) AS DECIMAL(7, 2)) 'Avg',
innings.hundreds,
innings.fifties,
innings.ducks,
innings.fours,
innings.sixes,
innings.balls,
CONCAT(grounds.CountryName, ' - ', grounds.KnownAs) Ground,
'' Year,
'' CountryName
from (select count(case when inningsnumber = 1 then 1 end) matches,
count(case when dismissaltype != 11 and dismissaltype != 14 then 1 end) innings,
LocationId,
playerid,
MatchType,
SUM(score) runs,
SUM(notout) notouts,
SUM(hundred) Hundreds,
SUM(fifty) Fifties,
SUM(duck) Ducks,
SUM(fours) Fours,
SUM(sixes) Sixes,
SUM(balls) Balls
from tmp_bat
group by MatchType, playerid, LocationId) as innings
JOIN players ON players.id = innings.playerid
join grounds on Grounds.GroundId = LocationId and grounds.MatchType = innings.MatchType
join
(select pt.playerid, t.matchtype, GROUP_CONCAT(t.name SEPARATOR ', ') as teams
from playersteams pt
join teams t on pt.teamid = t.id
group by pt.playerid, t.matchtype)
as teams on teams.playerid = innings.playerid and teams.matchtype = innings.MatchType
JOIN
(SELECT playerid,
LocationId,
MAX(Score) Score,
MAX(NotOut) NotOut
FROM (SELECT battingdetails.playerid,
battingdetails.score,
battingdetails.notout,
battingdetails.LocationId
FROM tmp_bat as battingdetails
JOIN (SELECT battingdetails.playerid,
battingdetails.LocationId,
MAX(battingdetails.Score) AS score
FROM tmp_bat as battingdetails
GROUP BY battingdetails.playerid,
battingdetails.LocationId,
battingdetails.playerid) AS maxscore
ON battingdetails.score = maxscore.score
AND battingdetails.playerid = maxscore.playerid
AND battingdetails.LocationId = maxscore.LocationId ) AS internal
GROUP BY internal.playerid, internal.LocationId) AS HS
ON HS.playerid = innings.playerid and hs.LocationId = innings.LocationId
where innings.runs >= @runs_limit
order by runs desc, KnownAs, SortNamePart
limit 0, 300;
Wherever you see '@match_type', that is where I substitute a value ('t'). This query takes ~1.1 secs to run. The query with the hard-coded values rather than the variables is now down to ~3.5 secs (see the other note below). The EXPLAIN for this query gives this:
1,PRIMARY,<derived7>,,ALL,,,,,219291,100,Using temporary; Using filesort
1,PRIMARY,players,,eq_ref,PRIMARY,PRIMARY,4,teams.playerid,1,100,
1,PRIMARY,<derived2>,,ref,<auto_key3>,<auto_key3>,26,"teams.playerid,teams.matchtype",11,100,Using where
1,PRIMARY,grounds,,ref,GroundId,GroundId,4,innings.LocationId,1,10,Using where
1,PRIMARY,<derived8>,,ref,<auto_key0>,<auto_key0>,8,"teams.playerid,innings.LocationId",169,100,
8,DERIVED,<derived3>,,ALL,,,,,349893,100,Using temporary
8,DERIVED,<derived14>,,ref,<auto_key0>,<auto_key0>,13,"battingdetails.PlayerId,battingdetails.LocationId,battingdetails.Score",10,100,Using index
14,DERIVED,<derived3>,,ALL,,,,,349893,100,Using temporary
7,DERIVED,t,,ALL,PRIMARY,,,,3323,100,Using temporary; Using filesort
7,DERIVED,pt,,ref,TeamId,TeamId,4,t.Id,65,100,
2,DERIVED,<derived3>,,ALL,,,,,349893,100,Using temporary
3,DERIVED,matches,,ALL,PRIMARY,,,,114162,10,Using where
3,DERIVED,m,,eq_ref,PRIMARY,PRIMARY,4,matches.Id,1,100,
3,DERIVED,emd1,,ref,"PRIMARY,TeamId",PRIMARY,4,matches.Id,1,100,Using index
3,DERIVED,emd2,,eq_ref,"PRIMARY,TeamId",PRIMARY,8,"matches.Id,emd1.TeamId",1,100,Using index
3,DERIVED,battingdetails,,ref,"TeamId,MatchId,match_team",match_team,8,"emd1.TeamId,matches.Id",15,100,
3,DERIVED,battingdetails,,ref,MatchId,MatchId,4,matches.Id,31,100,Using index; FirstMatch(battingdetails)
and the EXPLAIN for the query with the hardcoded values looks like this:
1,PRIMARY,<derived8>,,ALL,,,,,20097,100,Using temporary; Using filesort
1,PRIMARY,players,,eq_ref,PRIMARY,PRIMARY,4,HS.PlayerId,1,100,
1,PRIMARY,grounds,,ref,GroundId,GroundId,4,HS.LocationId,1,100,Using where
1,PRIMARY,<derived2>,,ref,<auto_key0>,<auto_key0>,30,"HS.LocationId,HS.PlayerId,grounds.MatchType",17,100,Using where
1,PRIMARY,<derived7>,,ref,<auto_key0>,<auto_key0>,46,"HS.PlayerId,innings.MatchType",10,100,Using where
8,DERIVED,matches,,ALL,PRIMARY,,,,114162,10,Using where; Using temporary
8,DERIVED,m,,eq_ref,"PRIMARY,LocationId",PRIMARY,4,matches.Id,1,100,
8,DERIVED,emd1,,ref,"PRIMARY,TeamId",PRIMARY,4,matches.Id,1,100,Using index
8,DERIVED,emd2,,eq_ref,"PRIMARY,TeamId",PRIMARY,8,"matches.Id,emd1.TeamId",1,100,Using index
8,DERIVED,<derived14>,,ref,<auto_key2>,<auto_key2>,4,m.LocationId,17,100,
8,DERIVED,battingdetails,,ref,"PlayerId,TeamId,Score,MatchId,match_team",MatchId,8,"matches.Id,maxscore.PlayerId",1,3.56,Using where
8,DERIVED,battingdetails,,ref,MatchId,MatchId,4,matches.Id,31,100,Using index; FirstMatch(battingdetails)
14,DERIVED,matches,,ALL,PRIMARY,,,,114162,10,Using where; Using temporary
14,DERIVED,m,,eq_ref,PRIMARY,PRIMARY,4,matches.Id,1,100,
14,DERIVED,emd1,,ref,"PRIMARY,TeamId",PRIMARY,4,matches.Id,1,100,Using index
14,DERIVED,emd2,,eq_ref,"PRIMARY,TeamId",PRIMARY,8,"matches.Id,emd1.TeamId",1,100,Using index
14,DERIVED,battingdetails,,ref,"TeamId,MatchId,match_team",match_team,8,"emd1.TeamId,matches.Id",15,100,
14,DERIVED,battingdetails,,ref,MatchId,MatchId,4,matches.Id,31,100,Using index; FirstMatch(battingdetails)
7,DERIVED,t,,ALL,PRIMARY,,,,3323,100,Using temporary; Using filesort
7,DERIVED,pt,,ref,TeamId,TeamId,4,t.Id,65,100,
2,DERIVED,matches,,ALL,PRIMARY,,,,114162,10,Using where; Using temporary
2,DERIVED,m,,eq_ref,PRIMARY,PRIMARY,4,matches.Id,1,100,
2,DERIVED,emd1,,ref,"PRIMARY,TeamId",PRIMARY,4,matches.Id,1,100,Using index
2,DERIVED,emd2,,eq_ref,"PRIMARY,TeamId",PRIMARY,8,"matches.Id,emd1.TeamId",1,100,Using index
2,DERIVED,battingdetails,,ref,"TeamId,MatchId,match_team",match_team,8,"emd1.TeamId,matches.Id",15,100,
2,DERIVED,battingdetails,,ref,MatchId,MatchId,4,matches.Id,31,100,Using index; FirstMatch(battingdetails)
Pointers on ways to improve my SQL are always welcome (I'm definitely not a database person), but I'd still like to understand whether I can use the SQL with the variables from code, and why that improves the performance by so much.
Update 2 1st Jan
AAArrrggghhh. My machine rebooted overnight and now the queries are generally running much quicker. It's still 1 sec vs 3 secs, but the 20x slowdown does seem to have disappeared.
In your WITH construct, aren't you overthinking your select in (select in (select in ...))? It could be simplified to the innings CTE I have in my solution.
Also, you were joining to extramatchdetails TWICE on the same match and team conditions, but never utilized either of those joins in the CTE, which renders that component useless, doesn't it? However, the matches table has hometeamid and awayteamid, which is what I THINK your actual intent was.
Also, your CTE is pulling many columns that are not needed or used in the subsequent query, such as captain and wicketkeeper.
So, I have restructured: pre-query and summarize the batting details once up front, then you should be able to join off that.
Hopefully this MIGHT be a better fit in function and performance for your needs.
with innings as
(
select
bd.matchId,
bd.matchtype,
bd.playerid,
m.locationId,
count(case when bd.inningsnumber = 1 then 1 end) matches,
count(case when bd.dismissaltype not in ( 11, 14 ) then 1 end) innings,
SUM(bd.score) runs,
SUM(bd.notout) notouts,
SUM(bd.hundred) Hundreds,
SUM(bd.fifty) Fifties,
SUM(bd.duck) Ducks,
SUM(bd.fours) Fours,
SUM(bd.sixes) Sixes,
SUM(bd.balls) Balls
from
battingDetails bd
join matches m
on bd.MatchId = m.Id
where
bd.matchtype = @match_type
group by
bd.matchId,
bd.matchType,
bd.playerid,
m.locationId
)
select
p.fullname playerFullName,
p.sortnamepart,
CONCAT(g.CountryName, ' - ', g.KnownAs) Ground,
t.team,
i.matches,
i.innings,
i.runs,
i.notouts,
i.hundreds,
i.fifties,
i.ducks,
i.fours,
i.sixes,
i.balls,
CAST( TRUNCATE( i.runs / (CAST((i.Innings - i.notOuts) AS DECIMAL)), 2) AS DECIMAL(7, 2)) 'Avg',
hs.maxScore,
hs.maxNotOut,
'' opponents,
'' Year,
'' CountryName
from
innings i
JOIN players p
ON i.playerid = p.id
join grounds g
on i.locationId = g.GroundId
and i.matchType = g.matchType
join
(select
pt.playerid,
t.matchtype,
GROUP_CONCAT(t.name SEPARATOR ', ') team
from
playersteams pt
join teams t
on pt.teamid = t.id
group by
pt.playerid,
t.matchtype) as t
on i.playerid = t.playerid
and i.MatchType = t.matchtype
join
( select
i2.playerid,
i2.locationid,
max( i2.score ) maxScore,
max( i2.notOut ) maxNotOut
from
innings i2
group by
i2.playerid,
i2.LocationId ) HS
on i.playerid = HS.playerid
AND i.locationid = HS.locationid
where
i.runs >= @runs_limit
order by
i.runs desc,
g.KnownAs,
p.SortNamePart
limit
0, 300;
Now, I know you stated that performance is better after the server reboot, but what you DO have really appears to be overly bloated queries.
Not sure this is the correct answer but I thought I'd post this in case other people have the same issue.
The issue seems to be the use of CTEs in a stored procedure. I have a query that creates a CTE and then uses that CTE 8 times. If I run this query using interpolated variables it takes about 0.8 sec; if I turn it into a stored procedure and use the stored procedure parameters, it takes about a minute (between 45 and 63 seconds) to run!
I've found a couple of ways of fixing this. One is to use multiple temporary tables (8 in this case), as MySQL cannot refer to a temporary table more than once in the same query. This gets the query time right down but just doesn't feel like a maintainable or scalable solution. The other fix is to leave the variables in place and assign them from the stored procedure parameters; this also has no real performance issues. So my sproc looks like this:
create procedure bowling_individual_career_records_by_year_for_team_vs_opponent(IN team_id INT,
IN opponents_id INT)
begin
set @team_id = team_id;
set @opponents_id = opponents_id;
# use these variables in the SQL below
...
end
Not sure this is the best solution but it works for me and keeps the structure of the SQL the same as it was previously.
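For readers calling such a query from application code, here is a minimal sketch of binding named parameters into a query whose CTE is referenced more than once. It uses Python's built-in sqlite3 module and a single cut-down table with made-up rows, so it only illustrates the shape of the query and the binding; it says nothing about how MySQL's optimizer treats variables versus literals, which is the open question in this thread. The function name batting_records and the parameter names are assumptions chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE battingdetails (playerid INTEGER, matchtype TEXT, score INTEGER);
    -- Made-up rows, for illustration only.
    INSERT INTO battingdetails VALUES
        (1, 't', 50), (1, 't', 70), (1, 'o', 200), (2, 't', 10), (2, 't', 5);
""")

# The CTE is filtered once by :match_type and then used twice below.
QUERY = """
    WITH tmp_bat AS (
        SELECT playerid, score FROM battingdetails WHERE matchtype = :match_type
    )
    SELECT totals.playerid, totals.runs, hs.highestscore
    FROM (SELECT playerid, SUM(score) AS runs
          FROM tmp_bat GROUP BY playerid) AS totals
    JOIN (SELECT playerid, MAX(score) AS highestscore
          FROM tmp_bat GROUP BY playerid) AS hs
      ON hs.playerid = totals.playerid
    WHERE totals.runs >= :runs_limit
    ORDER BY totals.runs DESC
"""

def batting_records(match_type, runs_limit):
    """Run the query with both values bound as named parameters."""
    params = {"match_type": match_type, "runs_limit": runs_limit}
    return conn.execute(QUERY, params).fetchall()

records = batting_records("t", 100)
print(records)
```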

How to select a count of distinct values (unique truckers) without GROUP BY, and maybe without HAVING (not sure about the latter)

I have a task, but couldn't solve it:
There are truckers and they have to travel between cities.
We have data of these travels in our database in 2 tables:
trucker_traffic
tt_id (key)
date
starting_point_coordinate
destination_coordinate
traveller_id
event_type ('travel', 'accident')
parent_event_id (For 'accident' event type it's tt_id of the original travel. There might be few accidents within one travel.)
trucker_places
coordinate (key)
country
city
I need an SQL query to pull the number of all unique truckers who travelled more than once from or to London in June 2020.
In the same query, pull the number of these travels that involved an accident.
Example of my attempts:
SELECT
count(distinct(tt.traveller_id)),
FROM trucker_traffic tt
JOIN trucker_places tp
ON tt.starting_point_coordinate = tp.coordinate
OR tt.destination_coordinate = tp.coordinate
WHERE
tp.city = 'London'
AND month(tt.date) = 6
AND year(tt.date) = 2020
GROUP BY tt.traveller_id
HAVING count(tt.tt_id) > 1
But this counts distinct truckers per group, and works only if I have one trucker in the db.
For the second part of the task (where I have to select the number of travels with an accident), I think it would be good to use a function like this:
SUM(if(count(tt_id = parent_event_id),1,0))
But I'm not sure.
This is rather complicated, so make sure you do this step by step. WITH clauses help with this.
Steps
1. Find travels from and to London in June 2020. You can use IN or EXISTS in order to see whether a travel had accidents.
2. Group the London travels by traveller, count travels and accident travels, and only keep those travellers with more than one travel.
3. Take this result set to count the travellers and sum up their travels.
Query
with london_travels as
(
select
traveller_id,
case when tt_id in
(select parent_event_id from trucker_traffic where event_type = 'accident')
then 1 else 0 end as accident
from trucker_traffic tt
where event_type = 'travel'
and month(tt.date) = 6
and year(tt.date) = 2020
and exists
(
select null
from trucker_places tp
where tp.coordinate in (tt.starting_point_coordinate, tt.destination_coordinate)
and tp.city = 'London'
)
)
, london_travellers as
(
select
traveller_id,
count(*) as travels,
sum(accident) as accident_travels
from london_travels
group by traveller_id
having count(*) > 1
)
select
count(*) as total_travellers,
sum(travels) as total_travels,
sum(accident_travels) as total_accident_travels
from london_travellers;
If your MySQL version doesn't support WITH clauses, you can of course just nest the queries. I.e.
with a as (...), b as (... from a) select * from b;
becomes
select * from (... from (...) a) b;
You say in the question title that you don't want GROUP BY in the query. This is possible, but makes the query more complicated. If you want to do this, I leave it as a task for you. Hint: you can select travellers and count their travels in per-traveller subqueries.
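The three steps of the WITH-based approach can be tried on a toy data set. The sketch below uses Python's built-in sqlite3 module, so the month()/year() test is written as a date range (SQLite has no month() function); the rows are invented for the example, and integer coordinates stand in for real ones. Traveller 10 makes two London travels in June, one of which has two accident events; traveller 13 makes two without accidents; travellers 11 and 12 each have only one qualifying travel.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE trucker_places (coordinate INTEGER PRIMARY KEY, country TEXT, city TEXT);
    CREATE TABLE trucker_traffic (
        tt_id INTEGER PRIMARY KEY, date TEXT,
        starting_point_coordinate INTEGER, destination_coordinate INTEGER,
        traveller_id INTEGER, event_type TEXT, parent_event_id INTEGER);
    -- Invented sample rows, for illustration only.
    INSERT INTO trucker_places VALUES (1,'UK','London'), (2,'UK','Leeds'), (3,'FR','Paris');
    INSERT INTO trucker_traffic VALUES
        (1, '2020-06-01', 1, 2, 10, 'travel',   NULL),
        (2, '2020-06-05', 2, 1, 10, 'travel',   NULL),
        (3, '2020-06-05', 2, 1, 10, 'accident', 2),
        (4, '2020-06-05', 2, 1, 10, 'accident', 2),
        (5, '2020-06-10', 1, 3, 11, 'travel',   NULL),
        (6, '2020-06-11', 2, 3, 11, 'travel',   NULL),
        (7, '2020-05-30', 1, 2, 12, 'travel',   NULL),
        (8, '2020-06-02', 1, 2, 12, 'travel',   NULL),
        (9, '2020-06-03', 3, 1, 13, 'travel',   NULL),
        (10,'2020-06-04', 1, 3, 13, 'travel',   NULL);
""")

summary = conn.execute("""
    WITH london_travels AS (          -- step 1: June travels touching London
        SELECT tt.traveller_id,
               CASE WHEN tt.tt_id IN (SELECT parent_event_id FROM trucker_traffic
                                      WHERE event_type = 'accident')
                    THEN 1 ELSE 0 END AS accident
        FROM trucker_traffic tt
        WHERE tt.event_type = 'travel'
          AND tt.date >= '2020-06-01' AND tt.date < '2020-07-01'
          AND EXISTS (SELECT 1 FROM trucker_places tp
                      WHERE tp.coordinate IN (tt.starting_point_coordinate,
                                              tt.destination_coordinate)
                        AND tp.city = 'London')
    ), london_travellers AS (         -- step 2: keep travellers with > 1 travel
        SELECT traveller_id, COUNT(*) AS travels, SUM(accident) AS accident_travels
        FROM london_travels
        GROUP BY traveller_id
        HAVING COUNT(*) > 1
    )
    SELECT COUNT(*), SUM(travels), SUM(accident_travels)   -- step 3: totals
    FROM london_travellers
""").fetchone()

total_travellers, total_travels, total_accident_travels = summary
print(total_travellers, total_travels, total_accident_travels)
```

Because the accident flag is computed per travel with IN, a travel with several accident events is still counted once.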

My SUM with cases seems to be repeating twice

I have some camp management software that registers users for a camp.
I am trying to get how much a user owes on their account based on how much a camp costs and whether they are using the bus, and whether or not they sign up for the horse option. (These all cost extra).
I originally was grouping by registration_ids which a camper can have multiple of if they sign up for a camp. But when I put this in I get this:
https://imgur.com/i63Bnsu
This is my sql:
SELECT srbc_campers.camper_id,
/*Calculate how much the user owes*/
SUM(
srbc_camps.cost + (CASE WHEN srbc_registration.horse_opt = 1 THEN srbc_camps.horse_opt_cost
ELSE 0
END)
+
(CASE WHEN srbc_registration.busride = 'to' THEN 35
WHEN srbc_registration.busride = 'from' THEN 35
WHEN srbc_registration.busride = 'both' THEN 60
ELSE 0
END)
- IF(srbc_registration.discount IS NULL,0,srbc_registration.discount)
- IF(srbc_registration.scholarship_amt IS NULL,0,srbc_registration.scholarship_amt)
) AS owe
FROM srbc_registration
INNER JOIN srbc_camps ON srbc_registration.camp_id=srbc_camps.camp_id
INNER JOIN srbc_payments ON srbc_registration.registration_id = srbc_payments.registration_id
INNER JOIN srbc_campers ON srbc_campers.camper_id=srbc_registration.camper_id
WHERE NOT srbc_payments.payment_type='Store'
GROUP BY srbc_campers.camper_id
This seems to be affected by how many payments they have made in their account. It multiplies the amount they owe times how many individual payments were made toward that camp. I can't figure out how to stop this.
For instance, in the picture above:
We have camper_id #4 and they owe 678.
I expect camper_id #4 to owe 339. They have made 2 payments on their account in srbc_payments.
I haven't been using SQL for that long, so I am open to any suggestions for a better way!
You are not selecting anything from srbc_payments, just checking for registration_id in srbc_payments. Or did you forget to subtract payments from srbc_payments? You can replace the inner join with:
where srbc_registration.registration_id in
(
select t1.registration_id from srbc_payments t1
where t1.registration_id = srbc_registration.registration_id
and t1.payment_type <> 'Store'
)
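The doubling described in the question can be reproduced on a tiny data set. In the sketch below (Python's built-in sqlite3 module), the schema is simplified for illustration: the cost is stored directly on the registration row, which is an assumption and not the real srbc schema. Camper 4 has one registration costing 339 and two payments, so joining to the payments table yields two rows and the SUM becomes 678, while filtering with IN leaves one row per registration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, hypothetical schema: cost lives on the registration row.
    CREATE TABLE srbc_registration (registration_id INTEGER, camper_id INTEGER, cost INTEGER);
    CREATE TABLE srbc_payments (payment_id INTEGER, registration_id INTEGER, payment_type TEXT);
    INSERT INTO srbc_registration VALUES (1, 4, 339);
    INSERT INTO srbc_payments VALUES (1, 1, 'Check'), (2, 1, 'Cash');
""")

# Joining to payments repeats the registration row once per payment.
joined = conn.execute("""
    SELECT r.camper_id, SUM(r.cost) AS owe
    FROM srbc_registration r
    INNER JOIN srbc_payments p ON p.registration_id = r.registration_id
    WHERE p.payment_type <> 'Store'
    GROUP BY r.camper_id
""").fetchall()

# Filtering with IN only checks that a matching payment exists.
filtered = conn.execute("""
    SELECT r.camper_id, SUM(r.cost) AS owe
    FROM srbc_registration r
    WHERE r.registration_id IN (SELECT p.registration_id
                                FROM srbc_payments p
                                WHERE p.payment_type <> 'Store')
    GROUP BY r.camper_id
""").fetchall()

print(joined, filtered)
```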
This is what I ended up getting to work how I wanted it to:
SELECT owedTble.registration_id,owe
FROM (SELECT registration_id,
SUM(
srbc_camps.cost + (CASE WHEN srbc_registration.horse_opt = 1 THEN srbc_camps.horse_opt_cost
ELSE 0
END)
+
(CASE WHEN srbc_registration.busride = 'to' THEN 35
WHEN srbc_registration.busride = 'from' THEN 35
WHEN srbc_registration.busride = 'both' THEN 60
ELSE 0
END)
- IF(srbc_registration.discount IS NULL,0,srbc_registration.discount)
- IF(srbc_registration.scholarship_amt IS NULL,0,srbc_registration.scholarship_amt)
) AS owe
FROM srbc_camps INNER JOIN srbc_registration ON srbc_camps.camp_id=srbc_registration.camp_id
GROUP BY srbc_registration.registration_id
) as owedTble
I kind of understand what I did here. I ended up trying different things from this answer: My SUM with cases seems to be repeating twice
Thanks for the helpful comments from @nick and @a_horse_with_no_name.

mysql order with many criteria on two tables

I'm trying to sort a visitor list by several criteria and got stuck, as I can't figure out how to do this.
I have a queue of people who check in first, and the list is generated from that. A client is marked as showedUp when he comes to the door (after being called by his number on the list). If someone comes late, he must go to the end of the list. Another thing is that the list starts with a different number every day.
Day 1 -> List from 1 to 160
Day 2 -> List from 33 to 160, 1 to 32
Day 3 -> List from 65 to 160, 1 to 64
If someone comes late, meaning a number after his has already been called, he should be added to the end of the list. For example, with the list 1 to 160, if 10 was late because 20 had already been called, it should be 1 to 160, 10. With another starting number it should be 33 to 160, 1 to 32, 10. The criterion here is: if a placeNr after your number has already been called (showedUp), then you go to the end of the list.
Tables
clients (id, name, placeNr)
visits (id, pid, checkInTime, showedUp, showedUpTime)
Select
SELECT clients.id AS id, visits.id AS visitId, clients.placeNr AS placeNr, clients.name AS name
FROM clients, visits
WHERE clients.id = visits.pid AND visits.checkInTime >= '1447286401' AND visits.checkInTime <= '1447372799'
ORDER BY clients.placeNr < '1', if(visits.showedUpTime < visits.checkInTime, clients.placeNr, 1), ttc.placeNr
So how do I get the late showers at the end of my list?
Thank you very much in advance!
If I follow your logic, you need to specify whether or not someone is late. The following is the structure that you want for this type of query. I think I've captured the rules in your question:
select c.id, v.id AS visitId, c.placeNr, c.name,
(case when v.showedUpTime >
(select min(v.checkInTime)
from visits v2 join
clients c2
on v2.pid = c2.id
where date(v2.showedUpTime) = date(v.showedUpTime) and
c2.placeNr > c.placeNr
)
then 1 else 0 end) as IsLate
from clients c join
visits v
on c.id = v.pid
order by date(v.showedUpTime),
isLate,
c.placeNr;
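The idea of computing a "late" flag and sorting on it first can be sketched on a single day's data. The example below uses Python's built-in sqlite3 module and made-up rows, and it encodes one reading of the rule as an assumption: a client is late when some client with a higher placeNr showed up earlier than he did. The rotating start number and the per-day grouping are left out to keep the sketch short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE clients (id INTEGER PRIMARY KEY, name TEXT, placeNr INTEGER);
    CREATE TABLE visits (id INTEGER PRIMARY KEY, pid INTEGER, checkInTime INTEGER,
                         showedUp INTEGER, showedUpTime INTEGER);
    -- Made-up rows: client B (placeNr 2) shows up after 3 and 4 were called.
    INSERT INTO clients VALUES (1,'A',1), (2,'B',2), (3,'C',3), (4,'D',4);
    INSERT INTO visits VALUES (1,1,10,1,100), (2,2,20,1,400),
                              (3,3,30,1,200), (4,4,40,1,300);
""")

rows = conn.execute("""
    SELECT c.placeNr,
           -- 1 when a higher place number showed up before this client did.
           EXISTS (SELECT 1
                   FROM visits v2 JOIN clients c2 ON c2.id = v2.pid
                   WHERE c2.placeNr > c.placeNr
                     AND v2.showedUpTime < v.showedUpTime) AS is_late
    FROM clients c JOIN visits v ON v.pid = c.id
    ORDER BY is_late, c.placeNr
""").fetchall()

call_order = [place for place, _ in rows]
print(call_order)
```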

mysql query if condition

Hi there, I have two tables. The first is a2_deal (I haven't shown the entire table as it's very big):
deviceID companyID stage serverTime
1 14 -1 1349449200
1 1 -1 1349445600
2 21 -1 1349449200
3 17 -1 1349447160
1 14 3 1344449200
1 14 2 1340449200
and another table called a2_comp
companyID name
1 Microsoft
14 DELL
15 APPLE
17 Google
I am trying to get the most recent stage of a company by using the query below:
SELECT deal.companyID, companies.name as Company,
if(max(serverTime),stage,Null) as Stage
FROM `a2_deal` AS deal
LEFT JOIN `a2_comp` AS companies ON deal.companyID = companies.companyID
GROUP BY companyID
ORDER BY serverTime
In my query I am using if(max(serverTime),stage,Null) as Stage, which is meant to select the stage value related to the most recent server time, i.e. it should give me -1 as the stage of companyID 14. But for some reason I am not getting the correct output. Please explain how my logic is wrong here. Thank you.
You want the groupwise maximum:
SELECT a2_comp.*, a2_deal.*
FROM a2_deal NATURAL JOIN (
SELECT companyID, MAX(serverTime) AS serverTime
FROM a2_deal
GROUP BY companyID
) t JOIN a2_comp USING (companyID)
See it on sqlfiddle.
case is used for inline conditions in your query. Also, you may need to do
(case when max(serverTime) = serverTime then stage else null end) as Stage
I'm not totally sure that's valid, but you can try it out.
Try this
SELECT deal.companyID, deal.stage, comp.name
FROM a2_deal AS deal, a2_comp AS comp
WHERE deal.serverTime =
(SELECT MAX(deal2.serverTime)
FROM a2_deal AS deal2
WHERE deal2.companyID = deal.companyID)
AND comp.companyID = deal.companyID
GROUP BY deal.companyID
This might be a little confusing, but the most interesting part is the subquery, which selects the most recent serverTime for each company. I have used a theta-style join, so an explicit JOIN is not necessary.
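The groupwise-maximum pattern suggested in the answers can be checked against the sample rows from the question. This is a minimal sketch using Python's built-in sqlite3 module rather than MySQL; the alias latest and the explicit join conditions are choices made for the example. Each deal row is kept only if its serverTime equals the maximum serverTime for its company.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a2_deal (deviceID INTEGER, companyID INTEGER, stage INTEGER, serverTime INTEGER);
    CREATE TABLE a2_comp (companyID INTEGER PRIMARY KEY, name TEXT);
    -- Sample rows copied from the question.
    INSERT INTO a2_deal VALUES
        (1, 14, -1, 1349449200), (1, 1, -1, 1349445600), (2, 21, -1, 1349449200),
        (3, 17, -1, 1349447160), (1, 14, 3, 1344449200), (1, 14, 2, 1340449200);
    INSERT INTO a2_comp VALUES (1,'Microsoft'), (14,'DELL'), (15,'APPLE'), (17,'Google');
""")

latest_rows = conn.execute("""
    SELECT d.companyID, c.name, d.stage
    FROM a2_deal d
    JOIN (SELECT companyID, MAX(serverTime) AS serverTime
          FROM a2_deal
          GROUP BY companyID) AS latest
      ON latest.companyID = d.companyID AND latest.serverTime = d.serverTime
    LEFT JOIN a2_comp c ON c.companyID = d.companyID
    ORDER BY d.companyID
""").fetchall()

stage_by_company = {company_id: stage for company_id, _, stage in latest_rows}
print(latest_rows)
```

The LEFT JOIN keeps company 21, which has deals but no row in a2_comp; its name comes back as NULL.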