MySQL summing totals by state by year - mysql

I have been playing around with this for what seems like hours and I can't get the results I want. Here is the query I am having trouble with:
SELECT year.year, dstate,
(SELECT sum(amount) FROM gift
WHERE year.year = gift.year
AND gift.donorno = donor.donorno)
FROM donor, gift, year
WHERE year.year = gift.year
AND gift.donorno = donor.donorno;
This seems redundant. Anyway, I am trying to display the total donations (gift.amount) for each state by year.
ex.
1999 GA 500 (donorno 1 from GA donated 200 and donorno 2 from GA donated 300)
1999 FL 400
2000 GA 600
2000 FL 500
...
To clarify, several donors can be from the same state, and I am trying to total the gift amounts for each state for the year in which they were donated.
Any advice is appreciated. I feel like the answer is right in front of me.
Here is a picture of tables for reference:

This is a very simple join & aggregation problem.
SELECT y.year, d.dstate, SUM(g.amount) AS total
FROM gift AS g
INNER JOIN year AS y ON y.year=g.year
INNER JOIN donor AS d ON d.donorno=g.donorno
GROUP BY y.year, d.dstate
You don't need the sub-query in your SELECT clause in order to get the total amount. You can sum it by grouping. (I think the GROUP BY clause is what you're missing. I recommend reading up on it.) What you've done is called a correlated sub-query and it is going to be very slow over large data sets because it has to be calculated row-by-row instead of as a set operation.
Also, please don't use the old style comma join syntax. Instead use the explicit join syntax as shown above. It is much clearer and will help avoid accidental Cartesian products.
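To see the grouping at work, here is a minimal sketch in Python using the standard-library sqlite3 module. It assumes SQLite rather than MySQL, a made-up data set matching the example in the question, and a schema where only donorno, dstate, year and amount come from the post. The year table is left out on the assumption that gift already carries the year.

```python
import sqlite3

# Hypothetical schema and rows; only the column names donorno, dstate,
# year and amount are taken from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE donor (donorno INTEGER PRIMARY KEY, dstate TEXT);
    CREATE TABLE gift  (donorno INTEGER, year INTEGER, amount INTEGER);
    INSERT INTO donor VALUES (1, 'GA'), (2, 'GA'), (3, 'FL');
    INSERT INTO gift  VALUES (1, 1999, 200), (2, 1999, 300), (3, 1999, 400),
                             (1, 2000, 600), (3, 2000, 500);
""")

# One output row per (year, state); SUM runs over every gift in that group.
state_totals = conn.execute("""
    SELECT g.year, d.dstate, SUM(g.amount) AS total
    FROM gift AS g
    INNER JOIN donor AS d ON d.donorno = g.donorno
    GROUP BY g.year, d.dstate
    ORDER BY g.year, total DESC
""").fetchall()

for row in state_totals:
    print(row)
```

The two GA donors in 1999 collapse into a single row, which is the behaviour the correlated sub-query was trying to reproduce by hand.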

Left Join and Sum

I have two tables, one that lists grants/loans and one that lists individual expenditures. They share an ID column as each expenditure is assigned to a specific grant or loan. I'm trying to use LEFT JOIN to sum the expenditures for all the loans combined, but not the grants.
Here's where I'm at:
SELECT SUM(expenses.total_amt) AS total
FROM expenses WHERE loans_grants.grant_loan_type = 'Loan'
LEFT JOIN loans_grants
ON expenses.grant_loan_id = loans_grants.internal_id;
Any tips much appreciated!
Edit: thanks all, and apologies for the half baked question, it was late and I was in the weeds.
Here's the basic structures:
expenses:
expenses table structure
loans_grants:
loans_grants table structure
I've updated the code based on @jwood74's answer to this:
SELECT l.internal_id, SUM(e.total_amt) amount
FROM loans_grants l
LEFT JOIN expenses e ON e.grant_loan_id = l.internal_id
WHERE grant_loan_type = 'Loan'
group by l.internal_id
which produces this:
internal_id amount
1 3234
4 null
5 7625
7 null
9 null
Please excuse my noviceness, but I'm trying to sum up all expenses for loans, so I'd like to return 3234 + 7625, rather than summing expenses for each loan separately. Thanks for your help!
If you are looking for a SINGLE ROW RETURNED, you do not need to do a group by anything... just the SUM() of what you are looking for.
Second, do not post pictures of your sample data and table structures. Edit your original post and type the values in, even if you copy/paste the data and format it for readability (via Ctrl+K, or the curly brackets {} icon above post editing header area).
In this case, your tables
Loan_Grants table
Internal_id grant_loan_type
1 Loan
2 Grant
3 Grant
4 Loan
5 Loan
6 Grant
7 Loan
8 Grant
9 Loan
Expenses Table
total_amt grant_loan_id
2000 1
245 5
4500 5
2200 5
445 5
185 5
1234 1
50 5
Starting with your Loan_Grants table filtered on just your 'Loan' records
select
sum( e.total_amt ) totalExpenses
from
loans_grants lg
JOIN expenses e
on lg.internal_id = e.grant_loan_id
where
lg.grant_loan_type = 'Loan'
You don't want a LEFT JOIN unless you explicitly want to see ALL "Loan" entries, even those with no expenses recorded yet. A regular (inner) JOIN means there MUST be a matching record in the expenses table. Again, it depends on your needs: if you have 10,000 loans and only 247 of them have expenses, do you want to see all 10,000, or just the 247 and their totals? Since you are summarizing to a single returned row, JOIN is your best choice here.
For future, ALWAYS try to apply a table.column or alias.column to all your fields so anyone assisting does not have to guess which table the column comes from.
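As a quick check of the single-row total, here is a minimal sketch in Python with the standard-library sqlite3 module, loaded with the sample rows typed out above. It assumes SQLite instead of MySQL and integer column types. It runs the same query once with JOIN and once with LEFT JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE loans_grants (internal_id INTEGER PRIMARY KEY, grant_loan_type TEXT);
    CREATE TABLE expenses (total_amt INTEGER, grant_loan_id INTEGER);
    INSERT INTO loans_grants VALUES
        (1, 'Loan'), (2, 'Grant'), (3, 'Grant'), (4, 'Loan'), (5, 'Loan'),
        (6, 'Grant'), (7, 'Loan'), (8, 'Grant'), (9, 'Loan');
    INSERT INTO expenses VALUES
        (2000, 1), (245, 5), (4500, 5), (2200, 5),
        (445, 5), (185, 5), (1234, 1), (50, 5);
""")

# No GROUP BY: the aggregate collapses all matching rows into one row.
query = """
    SELECT SUM(e.total_amt) AS totalExpenses
    FROM loans_grants AS lg
    {join} expenses AS e ON lg.internal_id = e.grant_loan_id
    WHERE lg.grant_loan_type = 'Loan'
"""
inner_total = conn.execute(query.format(join="JOIN")).fetchone()[0]
left_total = conn.execute(query.format(join="LEFT JOIN")).fetchone()[0]
print(inner_total, left_total)
```

Both variants give 3234 + 7625 here, because SUM skips the NULL amounts that LEFT JOIN produces for loans 4, 7 and 9. The join type only starts to matter once you list loans individually.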
Without knowing the exact format of the two tables, it's a bit hard. But here would be the general idea-
select
l.id,
sum(e.amount) amount
from loans_grants l
left join expenses e on e.grant_loan_id = l.internal_id
where grant_loan_type = 'Loan'
group by l.id

Mysql query with pivot table and multiple joins [closed]

I have a sales table, a sale_staff table, a staff table and an offices table.
We are selling properties, and I want to find out the numbers of sales per seller for X month and per office.
The pivot table looks like this
sale_id , staff_id , type
type can be either seller or lister, so I need a where clause for this.
The sales table has a FK to the offices table; office_id
What I have so far is below. I know it's totally wrong, but that's why I'm here: I need to fix the sums and include the office name from the offices table.
select st.first_name, st.last_name, office, count(*) as sold
from sales s, sale_staff ss
left join staff st
on st.id = ss.staff_id
left join offices off
on off.id = s.office_id
where ss.`type` = 'lister' and
year(s.sale_date) = 2017 and
month(s.sale_date) = 12
group by st.id
Sales table is simply a property sale item, price, address, office_id.
Besides the error unknown column s.office_id, as I said, the sum value is incorrect anyway. I'm really not experienced enough to understand this level of relationship joins and aggregating, any pointers please.
Basically I would like to simply see a resultset like
staff(seller) , count , office
Mike , 12 , West
Jim , 7 , East
Fred , 3 , East
Edit: SQLFiddle in case that helps :) Will add some sample test data.
Never use commas in the FROM clause. Always use proper, explicit JOIN syntax. Your problem is because of the scoping rules around commas.
I would recommend:
select st.first_name, st.last_name, o.office, count(*) as sold
from staff st left join
sale_staff ss
on st.id = ss.staff_id join
sales s
on s.sale_id = ss.sale_id join
offices o
on o.id = s.office_id
where ss.`type` = 'lister' and
s.sale_date >= '2017-12-01' and
s.sale_date < '2018-01-01'
group by st.first_name, st.last_name, o.office;
I think this has the join condition correctly laid out, but it is hard to be sure without sample data and desired results.
Notes:
left join is probably not necessary. If it is, you should probably be starting with the staff table (to keep all staff).
Qualify all column names.
The group by includes all the non-aggregated columns in the select. This is a good habit if you are learning SQL.
The date comparisons are direct, without the use of functions.
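The notes above can be tried out with a small sketch in Python and the standard-library sqlite3 module. Everything in it is an assumption made for illustration: the rows, the sales.sale_id key, the offices.office column, and dates stored as ISO-8601 text (which SQLite compares as strings). Only the table names and the join shape follow the thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE offices (id INTEGER PRIMARY KEY, office TEXT);
    CREATE TABLE staff (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
    CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, office_id INTEGER, sale_date TEXT);
    CREATE TABLE sale_staff (sale_id INTEGER, staff_id INTEGER, type TEXT);
    INSERT INTO offices VALUES (1, 'West'), (2, 'East');
    INSERT INTO staff VALUES (1, 'Mike', 'A'), (2, 'Jim', 'B');
    INSERT INTO sales VALUES (10, 1, '2017-12-03'), (11, 1, '2017-12-20'),
                             (12, 2, '2017-12-31'), (13, 2, '2018-01-01');
    INSERT INTO sale_staff VALUES (10, 1, 'lister'), (11, 1, 'lister'),
                                  (12, 2, 'lister'), (13, 2, 'lister'),
                                  (10, 2, 'seller');
""")

# Explicit joins, qualified columns, a half-open date range instead of
# YEAR()/MONTH(), and every non-aggregated column listed in GROUP BY.
sold_per_lister = conn.execute("""
    SELECT st.first_name, o.office, COUNT(*) AS sold
    FROM sale_staff AS ss
    JOIN staff   AS st ON st.id = ss.staff_id
    JOIN sales   AS s  ON s.sale_id = ss.sale_id
    JOIN offices AS o  ON o.id = s.office_id
    WHERE ss.type = 'lister'
      AND s.sale_date >= '2017-12-01'
      AND s.sale_date <  '2018-01-01'
    GROUP BY st.first_name, o.office
    ORDER BY sold DESC
""").fetchall()

for row in sold_per_lister:
    print(row)
```

The sale dated 2018-01-01 falls outside the half-open range, and the 'seller' row is filtered out, so only December listings are counted.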

Moving average query MS Access

I am trying to calculate the moving average of my data. I have googled and found many examples on this site and others but am still stumped. For each record, I need to calculate the average of the previous 5 flows for that record's product.
My Table looks like the following:
TMDT Prod Flow
8/21/2017 12:01:00 AM A 100
8/20/2017 11:30:45 PM A 150
8/20/2017 10:00:15 PM A 200
8/19/2017 5:00:00 AM B 600
8/17/2017 12:00:00 AM A 300
8/16/2017 11:00:00 AM A 200
8/15/2017 10:00:31 AM A 50
I have been trying the following query:
SELECT b.TMDT, b.Flow, (SELECT AVG(Flow) as MovingAVG
FROM(SELECT TOP 5 *
FROM [mytable] a
WHERE Prod="A" AND [a.TMDT]< b.TMDT
ORDER BY a.TMDT DESC))
FROM mytable AS b;
When I try to run this query I get an input prompt for b.TMDT. Why is b.TMDT not being pulled from mytable?
Should I be using a different method altogether to calculate my moving averages?
I would like to add that I started with another method that works but is extremely slow. It runs fast enough for tables with 100 records or less. However, if the table has more than 100 records it feels like the query comes to a screeching halt.
Original method below.
I created two queries for each product code (There are 15 products): Q_ProdA_Rank and Q_ProdA_MovAvg
Q_ProdA_Rank (T_ProdA is a table with Product A's information):
SELECT a.TMDT, a.Flow, (Select count(*) from [T_ProdA]
where TMDT<=a.TMDT) AS Rank
FROM [T_ProdA] AS a
ORDER BY a.TMDT DESC;
Q_ProdA_MovAvg
SELECT b.TMDT, b.Flow, Round((Select sum(Flow) from [Q_PRodA_Rank] where
Rank between b.Rank-1 and (b.Rank-5))/IIf([Rank]<5,Rank-1,5),0) AS
MovingAvg
FROM [Q_ProdA_Rank] AS b;
The problem is that you're using a nested subquery, and as far as I know (can't find the right site for the documentation at the moment), variable scope in subqueries is limited to the direct parent of the subquery. This means that for your nested query, b.TMDT is outside of the variable scope.
Edit: As this is an interesting problem, and a properly-asked question, here is the full SQL answer. It's somewhat more complex than your attempt, but should run more efficiently.
It contains a nested subquery that first lists the 5 previous flows per TMDT and Prod, then averages them, and then joins the result back to the actual query.
SELECT A.TMDT, A.Prod, B.MovingAverage
FROM MyTable AS A LEFT JOIN (
SELECT JoinKeys.TMDT, JoinKeys.Prod, Avg(Top5.Flow) As MovingAverage
FROM (
SELECT JoinKeys.TMDT, JoinKeys.Prod, Top5.Flow
FROM MyTable As JoinKeys INNER JOIN MyTable AS Top5 ON JoinKeys.Prod = Top5.Prod
WHERE Top5.TMDT In (
SELECT TOP 5 A.TMDT FROM MyTable As A WHERE JoinKeys.Prod = A.Prod AND A.TMDT < JoinKeys.TMDT ORDER BY A.TMDT DESC
)
)
GROUP BY JoinKeys.TMDT, JoinKeys.Prod
) AS B
ON A.Prod = B.JoinKeys.Prod AND A.TMDT = B.JoinKeys.TMDT
While in my previous version I advocated a VBA approach, this is probably more efficient, only more difficult to write and adjust.
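For comparison, the procedural route mentioned above can be sketched outside the database. This is not Access SQL or VBA but a plain-Python illustration of the same "average of the previous 5 flows per product" rule, using the sample rows from the question. The timestamps are rewritten in 24-hour ISO form so they sort as strings, and the function name is made up for this sketch.

```python
from collections import defaultdict, deque

# Sample rows from the question, with TMDT rewritten as sortable ISO strings.
rows = [
    ("2017-08-21 00:01:00", "A", 100),
    ("2017-08-20 23:30:45", "A", 150),
    ("2017-08-20 22:00:15", "A", 200),
    ("2017-08-19 05:00:00", "B", 600),
    ("2017-08-17 00:00:00", "A", 300),
    ("2017-08-16 11:00:00", "A", 200),
    ("2017-08-15 10:00:31", "A", 50),
]


def previous_flow_averages(records, window=5):
    """Map (tmdt, prod) to the mean of up to `window` earlier flows of that product."""
    recent = defaultdict(lambda: deque(maxlen=window))  # per-product history
    averages = {}
    for tmdt, prod, flow in sorted(records):  # oldest first
        history = recent[prod]
        # Average only what came before this record; None if nothing did.
        averages[(tmdt, prod)] = sum(history) / len(history) if history else None
        history.append(flow)
    return averages


moving_avg = previous_flow_averages(rows)
for key in sorted(moving_avg):
    print(key, moving_avg[key])
```

Each record is visited once, so the work grows roughly linearly with the table size (after the sort), which is the main attraction over ranking queries that rescan the table for every row.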

How to query the most profitable item in a specific year in MySQL

I have a practice problem where I am to write a query to find the top 15 percent most profitable products in the year 2005 from a database. The database does NOT have attributes like "Saleprice" or "Purchaseprice". It has tables like PUrchaseProductDetails or SalesOrderDetails, and others with Unitprice, orderquantity, ProdID, LIstPrice, ActualCost, StandardPrice, etc. as attributes. I am confused as to which ones I should use and how to come up with a formula. I tried to write a query, but it seemed to run forever.
SELECT A.ProdID, B.ProdID, A.Unitprice - (B.Unitprice * orderquantity) Profit
FROM SalesOrderDetails A join PurchaseOrderD B
ON A.ProdID = B.ProdID
WHERE year(DateOrdered) = 2005
Group by A.ProdID
I have spent hours on these types of questions and my brain is at a dead end right now. If someone can please direct me to do it the right way, it would really help me out.
SELECT sale.ProdID, SUM((sale.Unitprice - buy.Unitprice) * sale.Qty) AS profit
FROM SalesOrderDetails AS sale
JOIN PurchaseOrderD AS buy ON ...
WHERE year(...) = 2005
GROUP BY 1
ORDER BY 2 DESC
LIMIT 15 -- note: this keeps the top 15 rows, not the top 15 percent

MySQL: How to sum distinct rows in complex joined query

I have a MySQL question I cannot solve myself (for the first time).
I have a query-with-parameters database plus PHP program that, together, generate extensive MySQL queries to run.
The problem is actually a simple one: that of correct summation. I need to SUM distinct rows (not values) within a complex, multi-joined query, and I cannot get it to work.
Do not ask why I work with the data structure below - I am working with data that is supplied to me and it needs to be this way. (The tables represent existing invoices.)
I will try to reproduce the situation very simplified here.
TABLE INVOICE
=============
Inv.Nr (ID)   Other Data
------------------------
#1            Stuff
#2            Stuff
#3            More Stuff
TABLE INVOICE LINE
==================
ID   Inv.Nr   QUANTITY   ArticleID   UNIT PRICE
-----------------------------------------------
1    #1       1          5           € 2.50
2    #1       1          109         € 4.00
3    #2       4          77          € 5.00
4    #2       10         91          € 6.00
TABLE INVOICE LINE VAT
======================
ID   LINE-ID   AMOUNT    VATP   VAT
-------------------------------------
1    1         € 2.00    25%    € 0.50
2    2         € 2.00    25%    € 0.50
3    2         € 1.42    6%     € 0.08
4    3         €18.87    6%     € 1.23
5    4         €16.00    25%    € 4.00
6    4         €37.74    6%     € 2.26
As you can see: some articles have a double VAT rate, because they consist of more elements that have different VAT rates (i.e. a book with a cd).
The real queries are very long: many more tables are joined, and the WHERE and GROUP BY clauses can be dynamic. So a query might look somewhat like this (again much simplified):
SELECT `Inv.Nr`, ArticleID, SUM(Quantity), SUM(Amount), SUM(VAT)
FROM ((((`Invoice` INNER JOIN `Invoice Line`
ON `Invoice`.`Inv.Nr`=`Invoice Line`.`Inv.Nr`)
INNER JOIN `Invoice Line VAT`
ON `Invoice Line`.ID = `Invoice Line VAT`.`Line-ID`)
INNER JOIN `More Stuff`
ON .... )
INNER JOIN ....
ON ..... )
WHERE ....
GROUP BY .....
HAVING .....
The INNER JOINs defined by ... are many to 1, so Invoice Line VAT is on the many-side of both its JOIN relations.
The WHERE, GROUP BY and HAVING are semi-dynamically created in PHP code.
My problem is that I cannot get a proper SUM(Amount) and SUM(Quantity) at the same time, since the Quantity is added multiple times if there are multiple VAT rates for one invoice line.
SUM(DISTINCT Quantity) obviously doesn't work, since I need distinct rows, not values.
I cannot really create a subquery that either calculates the number of VAT rates (and divides the SUM(Quantity)), or calculates the Amount, since the subquery needs the same WHERE/HAVING parameters as the main query to work properly, and those are semi-dynamic (the queries are in a database and contain parameters that are filled in following the user's commands). Well, to be fair, I could do it, but it would leave the query-database and the php software extremely complicated, and I don't want to use a very complex solution for such a very simple problem, especially since someone else will have to maintain it in the future.
So how do I:
SUM the quantity only on distinct rows, or
COUNT the number of VAT rates per line, given the WHERE/HAVING (so without a subquery)?
I could add extra fields to the tables to help with this problem, but that possibility didn't help me - yet. For instance: storing the number of VAT rates doesn't help, since in the WHERE there may be a selection on VAT rate.
I hope it is something VERY simple that I overlooked, but I have been searching for hours now to no avail...
If anyone can help me that would be great! Thanks in advance!
EDIT: I found a solution, but I am not very pleased with it. I have to split up the WHERE, SUM the SUMs, and repeat columns... It is UGLY and hard to maintain.
It is as follows:
SELECT `Inv.Nr`, ArticleID, SUM(Quantity), SUM(Amount), SUM(VAT)
FROM ((`Invoice` INNER JOIN `Invoice Line`
ON `Invoice`.`Inv.Nr`=`Invoice Line`.`Inv.Nr`)
INNER JOIN
(SELECT SUM(Amount) AS Amount, SUM(VAT) AS VAT, `Line-ID`
FROM ((`Invoice Line VAT`
INNER JOIN `More Stuff`
ON .... )
INNER JOIN ....
ON ..... )
WHERE some-where-stuff
GROUP BY `Line-ID`) x
ON `Invoice Line`.ID = x.`Line-ID`)
WHERE other-where-stuff
GROUP BY .....
HAVING .....
I hope someone got a more elegant, simpler solution!
In an update to the question, I answered the question myself. I said that I hoped for a less ugly and badly maintainable solution than:
SELECT `Inv.Nr`, ArticleID, SUM(Quantity), SUM(Amount), SUM(VAT)
FROM ((`Invoice` INNER JOIN `Invoice Line`
ON `Invoice`.`Inv.Nr`=`Invoice Line`.`Inv.Nr`)
INNER JOIN
(SELECT SUM(Amount) AS Amount, SUM(VAT) AS VAT, `Line-ID`
FROM ((`Invoice Line VAT`
INNER JOIN `More Stuff`
ON .... )
INNER JOIN ....
ON ..... )
WHERE some-where-stuff
GROUP BY `Line-ID`) x
ON `Invoice Line`.ID = x.`Line-ID`)
WHERE other-where-stuff
GROUP BY .....
HAVING .....
It turns out that, now that I am working with this solution and rephrasing all my queries based on it, it is not so humongous and ugly after all. It works quite well, and much better than the other solutions and workarounds I have tried. Since I guess there is no other solution than what I wrote, I am closing this question by stating that the query cited above is the right answer.
Using the correct SQL code instead of workarounds is the right way to go, even when it looks too complicated at first. And since there is nothing like SUM(DISTINCT ...) that works on distinct records instead of values, in this case the above code is the correct code.
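To make the difference concrete, here is a small sketch in Python with the standard-library sqlite3 module, loaded with the invoice line and VAT rows from the question. It assumes SQLite instead of MySQL, simplified table and column names (invoice_line, invoice_line_vat, line_id and so on are stand-ins), and it leaves out the elided joins and the dynamic WHERE/HAVING parts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoice_line (id INTEGER PRIMARY KEY, inv_nr TEXT, quantity INTEGER);
    CREATE TABLE invoice_line_vat (id INTEGER PRIMARY KEY, line_id INTEGER,
                                   amount REAL, vat REAL);
    INSERT INTO invoice_line VALUES
        (1, '#1', 1), (2, '#1', 1), (3, '#2', 4), (4, '#2', 10);
    INSERT INTO invoice_line_vat VALUES
        (1, 1, 2.00, 0.50), (2, 2, 2.00, 0.50), (3, 2, 1.42, 0.08),
        (4, 3, 18.87, 1.23), (5, 4, 16.00, 4.00), (6, 4, 37.74, 2.26);
""")

# Naive join: a line with two VAT rates appears twice, so its quantity
# is counted twice.
naive = conn.execute("""
    SELECT l.inv_nr, SUM(l.quantity), ROUND(SUM(v.amount), 2)
    FROM invoice_line AS l
    JOIN invoice_line_vat AS v ON v.line_id = l.id
    GROUP BY l.inv_nr
    ORDER BY l.inv_nr
""").fetchall()

# Pre-aggregate the VAT rows to one row per line, then join: each invoice
# line now contributes its quantity exactly once.
pre_aggregated = conn.execute("""
    SELECT l.inv_nr, SUM(l.quantity), ROUND(SUM(x.amount), 2)
    FROM invoice_line AS l
    JOIN (SELECT line_id, SUM(amount) AS amount, SUM(vat) AS vat
          FROM invoice_line_vat
          GROUP BY line_id) AS x ON x.line_id = l.id
    GROUP BY l.inv_nr
    ORDER BY l.inv_nr
""").fetchall()

print(naive)
print(pre_aggregated)
```

The amounts agree in both versions, but only the pre-aggregated query reports quantities of 2 and 14. The naive join inflates them to 3 and 24 because lines 2 and 4 each carry two VAT rows.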