My SUM with cases seems to be repeating twice - mysql

I have some camp management software that registers users for a camp.
I am trying to calculate how much a user owes on their account, based on how much the camp costs, whether they are using the bus, and whether they signed up for the horse option (these all cost extra).
I was originally grouping by registration_id, of which a camper can have several. But when I run the query below, I get this:
https://imgur.com/i63Bnsu
This is my sql:
SELECT srbc_campers.camper_id,
/*Calculate how much the user owes*/
SUM(
srbc_camps.cost + (CASE WHEN srbc_registration.horse_opt = 1 THEN srbc_camps.horse_opt_cost
ELSE 0
END)
+
(CASE WHEN srbc_registration.busride = 'to' THEN 35
WHEN srbc_registration.busride = 'from' THEN 35
WHEN srbc_registration.busride = 'both' THEN 60
ELSE 0
END)
- IF(srbc_registration.discount IS NULL,0,srbc_registration.discount)
- IF(srbc_registration.scholarship_amt IS NULL,0,srbc_registration.scholarship_amt)
) AS owe
FROM (((
srbc_registration INNER JOIN srbc_camps ON srbc_registration.camp_id=srbc_camps.camp_id)
INNER JOIN srbc_payments ON srbc_registration.registration_id = srbc_payments.registration_id)
INNER JOIN srbc_campers ON srbc_campers.camper_id=srbc_registration.camper_id)
WHERE NOT srbc_payments.payment_type='Store'
GROUP BY srbc_campers.camper_id
This seems to be affected by how many payments they have made on their account. The amount they owe gets multiplied by the number of individual payments made toward that camp. I can't figure out how to stop this.
For instance, in the picture above, camper_id #4 is shown as owing 678.
I expect camper_id #4 to owe 339. They have made 2 payments on their account in srbc_payments.
I haven't been using SQL for that long, so I am open to any suggestions for a better way!

You are not selecting anything from srbc_payments, just checking for registration_id in srbc_payments. Or did you forget to subtract payments from srbc_payments? Each matching payment row produces its own joined row, which is why the sum gets multiplied. You can replace the inner join with:
where srbc_registration.registration_id in
(
select t1.registration_id from srbc_payments t1
where t1.registration_id = srbc_registration.registration_id
and t1.payment_type <> 'Store'
)
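
The effect described above can be reproduced on a tiny data set. The following is a minimal sketch, not the asker's schema: it uses Python's built-in sqlite3 module instead of MySQL, and the simplified tables `registration` and `payments` (with a hypothetical `amount` column) stand in for the real srbc_ tables. It compares the join against the IN filter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, hypothetical stand-ins for srbc_registration and srbc_payments.
conn.executescript("""
CREATE TABLE registration (registration_id INTEGER PRIMARY KEY, camper_id INTEGER, cost REAL);
CREATE TABLE payments (payment_id INTEGER PRIMARY KEY, registration_id INTEGER,
                       payment_type TEXT, amount REAL);
INSERT INTO registration VALUES (1, 4, 339);
INSERT INTO payments VALUES (1, 1, 'Check', 50), (2, 1, 'Cash', 25);
""")

# Joining payments yields one row per payment, so the cost is summed twice.
joined = conn.execute("""
    SELECT r.camper_id, SUM(r.cost)
    FROM registration r
    INNER JOIN payments p ON p.registration_id = r.registration_id
    WHERE p.payment_type <> 'Store'
    GROUP BY r.camper_id
""").fetchall()

# Filtering with IN keeps exactly one row per registration.
filtered = conn.execute("""
    SELECT r.camper_id, SUM(r.cost)
    FROM registration r
    WHERE r.registration_id IN (
        SELECT p.registration_id FROM payments p WHERE p.payment_type <> 'Store'
    )
    GROUP BY r.camper_id
""").fetchall()

print(joined)    # [(4, 678.0)]
print(filtered)  # [(4, 339.0)]
```

With two payments on one registration, the joined version reports 678 and the filtered version reports the expected 339.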

This is what I ended up getting to work how I wanted it to:
SELECT owedTble.registration_id,owe
FROM (SELECT registration_id,
SUM(
srbc_camps.cost + (CASE WHEN srbc_registration.horse_opt = 1 THEN srbc_camps.horse_opt_cost
ELSE 0
END)
+
(CASE WHEN srbc_registration.busride = 'to' THEN 35
WHEN srbc_registration.busride = 'from' THEN 35
WHEN srbc_registration.busride = 'both' THEN 60
ELSE 0
END)
- IF(srbc_registration.discount IS NULL,0,srbc_registration.discount)
- IF(srbc_registration.scholarship_amt IS NULL,0,srbc_registration.scholarship_amt)
) AS owe
FROM srbc_camps INNER JOIN srbc_registration ON srbc_camps.camp_id=srbc_registration.camp_id
GROUP BY srbc_registration.registration_id
) as owedTble
I kind of understand what I did here. I ended up trying different things from this answer: "My SUM with cases seems to be repeating twice".
Thanks for the helpful comments from @nick and @a_horse_with_no_name
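
If payments should eventually be subtracted from the amount owed, as the earlier answer hints, one possible approach is to total the payments per registration in a derived table first and only then join, so the join can never produce more than one row per registration. This is a rough sketch under assumptions: it runs on sqlite3 rather than MySQL, the tables are simplified, and the `amount` column is hypothetical because the real payment column is not shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, hypothetical tables; "amount" is an assumed column name.
conn.executescript("""
CREATE TABLE registration (registration_id INTEGER PRIMARY KEY, camper_id INTEGER, cost REAL);
CREATE TABLE payments (registration_id INTEGER, payment_type TEXT, amount REAL);
INSERT INTO registration VALUES (1, 4, 339), (2, 5, 200);
INSERT INTO payments VALUES (1, 'Check', 100), (1, 'Cash', 39), (1, 'Store', 10);
""")

# Payments are summed per registration before the join, so each
# registration matches at most one row of the derived table.
balances = conn.execute("""
    SELECT r.registration_id, r.cost - COALESCE(p.paid, 0) AS balance
    FROM registration r
    LEFT JOIN (
        SELECT registration_id, SUM(amount) AS paid
        FROM payments
        WHERE payment_type <> 'Store'
        GROUP BY registration_id
    ) p ON p.registration_id = r.registration_id
    ORDER BY r.registration_id
""").fetchall()

print(balances)  # [(1, 200.0), (2, 200.0)]
```

The LEFT JOIN with COALESCE keeps registrations that have no payments at all, which an INNER JOIN would drop.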


Use the result of a sub-query outside of the sub-query

I have a table structured like this.
User_id|Subscription_type|timestamp|
------------------------------------
100    |PAYING           |2/10/2021
99     |TRIAL            |2/10/2021
100    |TRIAL            |15/9/2021
I want my output to be the same, with an additional column pulling the trial start date when the subscriber converts to a paying subscription.
User_id|Subscription_type|timestamp|Trial_Start_date|
-----------------------------------------------------
100    |PAYING           |2/10/2021|15/9/2021
99     |TRIAL            |2/10/2021|
100    |TRIAL            |2/10/2021|
At the moment, I have this query:
SELECT *,
CASE WHEN
(SELECT `subscription_type` FROM subscription_event se1
WHERE se1.`timestamp` < se.`timestamp` AND se1.user_id = se.user_id
ORDER BY user_id DESC LIMIT 1) = 'TRIAL'
then se1.`timestamp` else 0 end as "Converted_from_TRIAL"
FROM subscription_event se
I get an error message saying that se1.timestamp is not defined. I understand why, but I cannot see a workaround.
Any pointers?
If you need to get two values out of the subquery, you have to join with it, not use it as an expression.
SELECT se.*,
MAX(se1.timestamp) AS Converted_from_TRIAL
FROM subscription_event AS se
LEFT JOIN subscription_event AS se1 ON se.user_id = se1.user_id AND se1.timestamp < se.timestamp AND se1.subscription_type = 'TRIAL'
GROUP BY se.user_id, se.subscription_type, se.timestamp
Thanks a lot!
For some reason I needed to list explicitly in the SELECT the columns used in the GROUP BY. I'm not sure why (I am using MySQL 5.7, so maybe it is linked to that).
In any case, this is the working query.
SELECT se.user_id, se.subscription_type, se.timestamp,
MAX(se1.timestamp) AS Converted_from_TRIAL
FROM subscription_event AS se
LEFT JOIN subscription_event AS se1 ON se.user_id = se1.user_id AND se1.timestamp < se.timestamp AND se1.subscription_type = 'TRIAL'
GROUP BY se.user_id, se.subscription_type, se.timestamp
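
To see what the self-join returns, here is a small sketch that runs the same query shape against the sample rows from the question. It uses sqlite3 rather than MySQL 5.7, and the dates are rewritten in ISO format (an assumption, so that they compare correctly as text). As for the explicit column list, a likely explanation is that MySQL 5.7 enables ONLY_FULL_GROUP_BY by default, so se.* would be rejected if the table has any column that is not in the GROUP BY; that is a guess, since the full table definition is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Sample rows from the question, with dates converted to ISO text.
conn.executescript("""
CREATE TABLE subscription_event (user_id INTEGER, subscription_type TEXT, timestamp TEXT);
INSERT INTO subscription_event VALUES
    (100, 'PAYING', '2021-10-02'),
    (99,  'TRIAL',  '2021-10-02'),
    (100, 'TRIAL',  '2021-09-15');
""")

# Each event is paired with the same user's earlier TRIAL events;
# MAX picks the latest of them, or NULL when there is none.
rows = conn.execute("""
    SELECT se.user_id, se.subscription_type, se.timestamp,
           MAX(se1.timestamp) AS Converted_from_TRIAL
    FROM subscription_event AS se
    LEFT JOIN subscription_event AS se1
           ON se.user_id = se1.user_id
          AND se1.timestamp < se.timestamp
          AND se1.subscription_type = 'TRIAL'
    GROUP BY se.user_id, se.subscription_type, se.timestamp
    ORDER BY se.user_id, se.timestamp
""").fetchall()

for row in rows:
    print(row)
# (99, 'TRIAL', '2021-10-02', None)
# (100, 'TRIAL', '2021-09-15', None)
# (100, 'PAYING', '2021-10-02', '2021-09-15')
```

Only the PAYING row of user 100 gets a trial start date; the other rows have no earlier TRIAL event and come back as NULL.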

mysql order with many criteria on two tables

I'm trying to sort a visitor list by several different criteria and got stuck, as I can't figure out how to do this.
I have a queue of people who check in first, and the list is generated from that queue. A client is marked as showedUp when he comes to the door (after being called by his number on the list). If someone comes late, he must go to the end of the list. Another thing is that the list starts with a different number every day:
Day 1 -> List from 1 to 160
Day 2 -> List from 33 to 160, 1 to 32
Day 3 -> List from 65 to 160, 1 to 64
If someone comes late, meaning a number after his has already been called, he should be added to the end of the list. For example, with a list of 1 to 160, if 10 was late because 20 was already called, the list should be 1 to 160, then 10. With a different starting number it should be 33 to 160, 1 to 32, then 10. The criterion here is: if a placeNr after your number has already been called (showedUp), then you go to the end of the list.
Tables
clients (id, name, placeNr)
visits (id, pid, checkInTime, showedUp, showedUpTime)
Select
SELECT clients.id AS id, visits.id AS visitId, clients.placeNr AS placeNr, clients.name AS name
FROM clients, visits
WHERE clients.id = visits.pid AND visits.checkInTime >= '1447286401' AND visits.checkInTime <= '1447372799'
ORDER BY clients.placeNr < '1', if(visits.showedUpTime < visits.checkInTime, clients.placeNr, 1), clients.placeNr
So how do I get the late showers at the end of my list?
Thank you very much in advance!
If I follow your logic, you need to specify whether or not someone is late. The following is the structure that you want for this type of query. I think I've captured the rules in your question:
select c.id, v.id AS visitId, c.placeNr, c.name,
-- late = showed up after someone with a higher placeNr had already showed up that day
(case when v.showedUpTime >
(select min(v2.showedUpTime)
from visits v2 join
clients c2
on v2.pid = c2.id
where date(v2.showedUpTime) = date(v.showedUpTime) and
c2.placeNr > c.placeNr
)
then 1 else 0 end) as IsLate
from clients c join
visits v
on c.id = v.pid
order by date(v.showedUpTime),
isLate,
c.placeNr;
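
The late-arrival rule can be tried out on a handful of rows. The sketch below makes several assumptions: it uses sqlite3 instead of MySQL, covers a single day whose list starts at 1 (the rotating start number is not handled), stores showedUpTime as a plain integer, and uses invented client names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE clients (id INTEGER PRIMARY KEY, name TEXT, placeNr INTEGER);
CREATE TABLE visits (id INTEGER PRIMARY KEY, pid INTEGER, showedUpTime INTEGER);
INSERT INTO clients VALUES (1, 'Ann', 1), (2, 'Bob', 2), (3, 'Cy', 3), (4, 'Dee', 4);
-- Bob (placeNr 2) shows up after Cy (placeNr 3) and Dee (placeNr 4).
INSERT INTO visits VALUES (1, 1, 100), (2, 2, 400), (3, 3, 200), (4, 4, 300);
""")

# A client is late when somebody with a higher placeNr showed up earlier.
# With no higher placeNr the subquery is NULL and the CASE falls to 0.
ordered = conn.execute("""
    SELECT c.placeNr,
           CASE WHEN v.showedUpTime > (
                    SELECT MIN(v2.showedUpTime)
                    FROM visits v2 JOIN clients c2 ON v2.pid = c2.id
                    WHERE c2.placeNr > c.placeNr
                ) THEN 1 ELSE 0 END AS isLate
    FROM clients c JOIN visits v ON c.id = v.pid
    ORDER BY isLate, c.placeNr
""").fetchall()

print(ordered)  # [(1, 0), (3, 0), (4, 0), (2, 1)]
```

Number 2 ends up last because number 3 had already been called when he arrived, while everyone else keeps their place order.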

SQL - Query same column twice with different dates in where clause

I have tried searching all over for answers but none have answered my exact issue. I have what should be a relatively simple query. However, I am very new and still learning SQL.
I need to query two columns with different dates. I want to return rows with the current number of accounts and current outstanding balance and in the same query, return rows for the same columns with data 90 days prior. This way, we can see how much the number of accounts and balance increased over the past 90 days. Optimally, I am looking for results like this:
PropCode|PropCat|Accts|AcctBal|PriorAccts|PriorBal|
----------------------------------------------------
77 |Comm | 350 | 1,000| 275 | 750
Below is my starting query. I realize it's completely wrong but I have tried numerous different solution attempts but none seem to work for my specific problem. I included it to give an idea of my needs. The accts & AcctBal columns would contain the 1/31/14 data. The PriorAcct & PriorBal columns would contain the 10/31/13 data.
select
prop_code AS PropCode,
prop_cat,
COUNT(act_num) Accts,
SUM(act_bal) AcctBal,
(SELECT
COUNT(act_num)
FROM table1
where date = '10/31/13'
and Pro_Group in ('BB','FF')
and prop_cat not in ('retail', 'personal')
and Not (Acct_Code = 53 and ACTType in (1,2,3,4,5,6,7))
)
AS PriorAccts,
(SELECT
SUM(act_bal)
FROM table1
where date = '10/31/13'
and Pro_Group in ('BB','FF')
and prop_cat not in ('retail', 'personal')
and Not (Acct_Code = 53 and ACTType in (1,2,3,4,5,6,7))
)
AS PriorBal
from table1
where date = '01/31/14'
and Pro_Group in ('BB','FF')
and prop_cat not in ('retail', 'personal')
and Not (Acct_Code = 53 and ACTType in (1,2,3,4,5,6,7))
group by prop_code, prop_cat
order by prop_cat
You can use a CASE with aggregates for this (at least in SQL Server, not sure about MySQL):
...
COUNT(CASE WHEN date='1/31/14' THEN act_num ELSE NULL END) as 'Accts'
,SUM(CASE WHEN date='1/31/14' THEN act_bal ELSE NULL END) as 'AcctBal'
,COUNT(CASE WHEN date='10/31/13' THEN act_num ELSE NULL END) as 'PriorAccts'
,SUM(CASE WHEN date='10/31/13' THEN act_bal ELSE NULL END) as 'PriorAcctBal'
....
WHERE Date IN ('1/31/14', '10/31/13')
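
As a quick check of the conditional-aggregation idea, here is a small sketch run through sqlite3 (not SQL Server or MySQL). The rows are invented, the dates are written in ISO format, and the extra filters from the question are left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Invented sample rows: two current accounts and one prior account.
conn.executescript("""
CREATE TABLE table1 (date TEXT, prop_code INTEGER, prop_cat TEXT,
                     act_num INTEGER, act_bal REAL);
INSERT INTO table1 VALUES
    ('2014-01-31', 77, 'Comm', 1, 600),
    ('2014-01-31', 77, 'Comm', 2, 400),
    ('2013-10-31', 77, 'Comm', 1, 750);
""")

# CASE returns NULL for rows from the other date, and COUNT and SUM
# skip NULLs, so each aggregate sees only the rows for its own date.
rows = conn.execute("""
    SELECT prop_code, prop_cat,
           COUNT(CASE WHEN date = '2014-01-31' THEN act_num END) AS Accts,
           SUM(CASE WHEN date = '2014-01-31' THEN act_bal END)   AS AcctBal,
           COUNT(CASE WHEN date = '2013-10-31' THEN act_num END) AS PriorAccts,
           SUM(CASE WHEN date = '2013-10-31' THEN act_bal END)   AS PriorBal
    FROM table1
    WHERE date IN ('2014-01-31', '2013-10-31')
    GROUP BY prop_code, prop_cat
""").fetchall()

print(rows)  # [(77, 'Comm', 2, 1000.0, 1, 750.0)]
```

Both periods come back on one row per group from a single pass over the table, instead of one scalar subquery per prior column.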

mySQL: LEFT JOIN where joining needs to be done on different type of data

I have 2 tables of my own with data and 2 tables that are not mine (in ReferenceDB) where an ID can be mapped to its name.
One of my tables is orders, with the following important columns: charName, stationID, typeID, bid.
The other table has the following important columns: transactionDateTime, stationID, typeID, person, transactionType.
I started out racking my brain over how to find orders that haven't had any records lately (e.g. within a given number of days). To begin with, though, I set myself the task of just finding orders that have no records at all. For that I came up with a LEFT JOIN; see the biggest query below.
For me, an order is a combination of charName/person + stationID + typeID + transactionType/bid, so if any one of those four changes, it is a different order.
The problem is that transactionType can be "yes" or "no", while bid is 0 or not 0. So I can't, or DON'T KNOW HOW to, JOIN ON different data types. Logically, I'd like to join on 4 columns like:
FROM ordersTable LEFT JOIN recordsTable ON ordersTable.typeID = recordsTable.typeID
AND ordersTable.stationID = recordsTable.stationID
AND ordersTable.charName = recordsTable.person
AND ordersTable.bid = recordsTable.transactionType
Clearly the last line above wouldn't work because of the different data types.
So for a moment I thought I could run such a query twice: once for bid = 0 with transactionType = "yes", and a second time for bid != 0 with transactionType = "no". See my query below for the 0/"yes" combination. But it doesn't seem to work the way I'd like, because AND ordersTable.bid IN (0) AND recordsTable.transactionType="yes" in the JOIN ON doesn't seem to do anything (I still get results where bid = 1).
SELECT invTypes.typeName, stastations.stationName, main.* FROM referenceDB.invTypes, referenceDB.stastations, (
SELECT ordersTable.charName, ordersTable.stationID, ordersTable.typeID, ordersTable.bid, ordersTable.orderState, ordersTable.volRemaining
FROM ordersTable LEFT JOIN recordsTable ON ordersTable.typeID = recordsTable.typeID
AND ordersTable.stationID = recordsTable.stationID
AND ordersTable.charName = recordsTable.person
AND ordersTable.bid IN (0) AND recordsTable.transactionType="yes"
WHERE recordsTable.typeID IS NULL
AND ordersTable.orderState IN (0) ) as main
WHERE stastations.stationID = main.stationID AND invTypes.typeID = main.typeID;
Questions:
Is it possible to tell MySQL to treat "yes" as 0 or vice versa? If yes, how do I do it in my query? If not, what would be my workaround (to find orders that don't have records related to them)?
And could someone suggest a query that finds orders that haven't had records within a given number of days?
Thank you in advance!
One way is to use the explicit comparisons:
((ordersTable.bid = 0 and recordsTable.transactionType = 'No') or
(ordersTable.bid = 1 and recordsTable.transactionType = 'Yes')
)
Another would be to use a case statement:
(case when recordsTable.transactionType = 'No' then 0 else 1 end) = ordersTable.bid
SELECT invTypes.typeName, stastations.stationName, main.* FROM referenceDB.invTypes, referenceDB.stastations, (
SELECT ordersTable.charName, ordersTable.stationID, ordersTable.typeID, ordersTable.bid, ordersTable.orderState, ordersTable.volRemaining
FROM ordersTable LEFT JOIN recordsTable ON ordersTable.typeID = recordsTable.typeID
AND ordersTable.stationID = recordsTable.stationID
AND ordersTable.charName = recordsTable.person
AND ((ordersTable.bid = 0 AND recordsTable.transactionType = 'yes') OR
(ordersTable.bid != 0 AND recordsTable.transactionType = 'no'))
WHERE recordsTable.typeID IS NULL
AND ordersTable.orderState IN (0) ) as main
WHERE stastations.stationID = main.stationID AND invTypes.typeID = main.typeID;
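
A note on why the original attempt returned rows with bid = 1: a condition in the ON clause of a LEFT JOIN never removes rows from the left table. An order with bid = 1 simply fails to match, comes back with NULLs on the right side, and then passes the IS NULL test. The sketch below shows the mapped comparison working as an anti-join. It assumes sqlite3 instead of MySQL, a reduced set of columns, invented rows, and the yes = 0 mapping used in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced, invented versions of the two tables from the question.
conn.executescript("""
CREATE TABLE ordersTable (charName TEXT, stationID INTEGER, typeID INTEGER, bid INTEGER);
CREATE TABLE recordsTable (person TEXT, stationID INTEGER, typeID INTEGER,
                           transactionType TEXT);
INSERT INTO ordersTable VALUES ('Ann', 1, 10, 0), ('Ann', 1, 10, 1), ('Bob', 2, 20, 0);
INSERT INTO recordsTable VALUES ('Ann', 1, 10, 'yes');
""")

# Both sides are mapped to 0/1 before comparing, so every order is
# tested against the records of its own kind in a single query.
unmatched = conn.execute("""
    SELECT o.charName, o.stationID, o.typeID, o.bid
    FROM ordersTable o
    LEFT JOIN recordsTable r
           ON o.typeID = r.typeID
          AND o.stationID = r.stationID
          AND o.charName = r.person
          AND (CASE WHEN r.transactionType = 'yes' THEN 0 ELSE 1 END)
              = (CASE WHEN o.bid = 0 THEN 0 ELSE 1 END)
    WHERE r.typeID IS NULL
    ORDER BY o.charName, o.bid
""").fetchall()

print(unmatched)  # [('Ann', 1, 10, 1), ('Bob', 2, 20, 0)]
```

Ann's bid = 0 order is dropped because a matching "yes" record exists, while her bid = 1 order and Bob's order are reported as having no records.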

Sean Lahman database sample queries

Is there any place where I can find sample queries (SELECT, UPDATE, DELETE) for the Sean Lahman database? I want to see what can be done with this database.
I used to ship the database with sample queries. Maybe I should revisit that idea. Here are a few to get you started.
A simple one to show all of the players named "Sean":
SELECT nameLast, nameFirst, debut
FROM Master
WHERE (nameFirst="Sean")
ORDER BY nameLast;
Here's one to show a list of players with 50 or more HRs in a season:
SELECT Master.nameLast, Master.nameFirst, Batting.HR, Batting.yearID
FROM Batting INNER JOIN Master ON Batting.playerID = Master.playerID
WHERE (((Batting.HR)>=50))
ORDER BY Batting.HR DESC;
Here's one to show the all-time leaders in strikeouts:
SELECT Master.nameLast, Master.nameFirst, Sum(Pitching.SO) AS SumOfSO
FROM Pitching INNER JOIN Master ON Pitching.playerID = Master.playerID
GROUP BY Pitching.playerID, Master.nameLast, Master.nameFirst
ORDER BY Sum(Pitching.SO) DESC;
There are several websites with tutorials on using the database that include sample queries. See:
http://webdev.cas.msu.edu/cas992/weeks/week5.html
http://www.hardballtimes.com/main/article/databases-for-sabermetricians-part-one/
You can find more by googling 'sql lahman'
This database can do almost anything except game-by-game analysis; for that you will need to go to Retrosheet (http://www.retrosheet.org/game.htm).
But if, say, you want to replicate the totals you see on Baseball-Reference.com, you could easily do that.
If you want some advanced metrics (Sabermetric-like stats), I recommend Tom Tango's website. There you can find help to do your own queries for wOBA. You can also formulate (try to duplicate) FanGraphs' or Baseball-Reference's WAR.
Basically, you can get anything you want from this (provided you can do the calculations and master the SQL syntax) except game-by-game or pitch-by-pitch types of data.
Here's a query that relates salary to games played (offensively and defensively) to figure out how much a player costs/makes per game. (T-SQL w/ SQL Server 2012 Express)
select
m.namefirst,
m.namelast,
s.yearID,
s.teamID,
s.salary,
Cast ('162' as Int) as FullSeason,
round(sum(s.salary)*1.00/162,0) as Game_Rate,
sum (case when s.playerID=b.playerID then f.g else 0 end) as Gm_App_Field,
b.g as Batting,
--sum(case when s.playerID=b.playerID and s.yearID=b.yearID then b.g else 0 end) as Gm_App_Hit,
sum (case when s.playerID=b.playerID then f.innouts else 0 end) as InnOuts,
sum(F.InnOuts)/27 as FullGames,
round((sum (case when s.playerID=b.playerID then f.g else 0 end)/162.0)*s.salary,0) as PayByGmFielding,
round(sum(b.g*s.salary)/162,0) as PayByGmHitting,
round((sum(F.InnOuts)/27)*(s.salary/162),0) as PlayingSalary
from Fielding f
inner join batting b
on f.playerID=b.playerID and f.yearID=b.yearID
inner join salaries s
on f.playerID=s.playerID and f.yearID=s.yearID
inner join [master] m
on b.playerID=m.playerID and f.playerID=m.playerID and s.playerID=m.playerID
where
f.yearID = '2013' and f.POS <> 'P' --b.playerID = 'zimmejo02'
group by
m.namefirst,m.namelast, s.yearID , s.teamID, s.salary, b.g
Which outputs this:
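namefirst|namelast|yearID|teamID|salary|FullSeason|Game_Rate|Gm_App_Field|Batting|InnOuts|FullGames|PayByGmFielding|PayByGmHitting|PlayingSalary
------------------------------------------------------------------------------------------------------------------------------------------------
A.J.     |Pollock |2013  |ARI   |491000|162       |9093     |119         |137    |2897   |107      |360672         |1245685       |324302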
You can also create your own searches. Here's one for players who have more BB than SO, with full player cards, including the WAR I came up with (may be off a little from FanGraphs or Baseball Reference) - (T-SQL w/ SQL Server 2012 Express)
--1. Retrieves Full Player Records of guys with more BB than SO
select
m.namefirst,
m.namelast,
b.yearID,
b.yearID-m.birthyear as Age,
b.G,b.AB,b.R,b.H,b.[2B],b.[3B],b.HR,b.RBI,b.SB,b.BB,b.SO, left(round((b.bb*1.000/b.SO),3),4) [BB/SO Rate], left(round((b.h*1.000/b.ab),3),5) as Average
,b.IBB,b.HBP,b.SH,b.SF,b.SF,b.GIDP,case when br.yearID=b.yearID and br.playerID=b.playerID then br.War else 'error' end as WAR
from [master] m
inner join batting b on b.playerID=m.playerID
inner join BR_WAR_2013 br on br.playerID=m.playerID
where b.SO <> 0 and b.AB > 300 and b.bb>b.SO
group by
m.namefirst,
m.namelast,
b.yearID,
b.yearID-m.birthyear,
b.G,b.AB,b.R,b.H,b.[2B],b.[3B],b.HR,b.RBI,b.SB,b.BB,b.SO, left(round((b.bb*1.000/b.SO),3),4), left(round((b.h*1.000/b.ab),3),5)
,b.IBB,b.HBP,b.SH,b.SF,b.SF,b.GIDP,case when br.yearID=b.yearID and br.playerID=b.playerID then br.War else 'error' end
having case when br.yearID=b.yearID and br.playerID=b.playerID then br.War else 'error' end <> 'error'
order by b.yearID desc, left(round((b.bb*1.000/b.SO),3),4) desc