Left Join and Sum - mysql

I have two tables, one that lists grants/loans and one that lists individual expenditures. They share an ID column as each expenditure is assigned to a specific grant or loan. I'm trying to use LEFT JOIN to sum the expenditures for all the loans combined, but not the grants.
Here's where I'm at:
SELECT SUM(expenses.total_amt) AS total
FROM expenses WHERE loans_grants.grant_loan_type = 'Loan'
LEFT JOIN loans_grants
ON expenses.grant_loan_id = loans_grants.internal_id;
Any tips much appreciated!
Edit: thanks all, and apologies for the half-baked question; it was late and I was in the weeds.
Here's the basic structures:
expenses:
expenses table structure
loans_grants:
loans_grants table structure
I've updated the code based on @jwood74's answer to this:
SELECT l.internal_id, SUM(e.total_amt) amount
FROM loans_grants l
LEFT JOIN expenses e ON e.grant_loan_id = l.internal_id
WHERE grant_loan_type = 'Loan'
group by l.internal_id
which produces this:
internal_id   amount
1             3234
4             null
5             7625
7             null
9             null
Please excuse my noviceness, but I'm trying to sum up all expenses for loans, so I'd like to return 3234 + 7625, rather than summing expenses for each loan separately. Thanks for your help!

If you are looking for a SINGLE ROW RETURNED, you do not need to do a group by anything... just the SUM() of what you are looking for.
Second, do not post pictures of your sample data and table structures. Edit your original post and type the values in, even if you copy/paste the data and format it for readability (via Ctrl+K, or the curly brackets {} icon above post editing header area).
In this case, your tables
Loan_Grants table
Internal_id grant_loan_type
1 Loan
2 Grant
3 Grant
4 Loan
5 Loan
6 Grant
7 Loan
8 Grant
9 Loan
Expenses Table
total_amt grant_loan_id
2000 1
245 5
4500 5
2200 5
445 5
185 5
1234 1
50 5
Starting with your Loan_Grants table filtered on just your 'Loan' records
select
    sum( e.total_amt ) totalExpenses
from
    loans_grants lg
        JOIN expenses e
            on lg.internal_id = e.grant_loan_id
where
    lg.grant_loan_type = 'Loan'
You don't want a LEFT JOIN unless you explicitly want to see ALL "Loan" entries, even those with no expenses recorded yet. A regular (inner) JOIN requires a matching record in the expenses table. Which one you use depends on your needs: if you have 10,000 loans and only 247 of them have expenses, do you want to see all 10,000, or just the 247 and their totals? Since you are summarizing to a single returned row, JOIN is your best choice here.
For future, ALWAYS try to apply a table.column or alias.column to all your fields so anyone assisting does not have to guess which table the column comes from.
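To double-check the inner-join query against the sample data typed out above, here is a minimal sketch using Python's built-in sqlite3 module. SQLite stands in for MySQL here (an assumption for the sake of a runnable example), and the column types are guesses since the real table definitions were not posted.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE loans_grants (internal_id INTEGER PRIMARY KEY, grant_loan_type TEXT);
    CREATE TABLE expenses (total_amt INTEGER, grant_loan_id INTEGER);
    INSERT INTO loans_grants VALUES
        (1,'Loan'),(2,'Grant'),(3,'Grant'),(4,'Loan'),(5,'Loan'),
        (6,'Grant'),(7,'Loan'),(8,'Grant'),(9,'Loan');
    INSERT INTO expenses VALUES
        (2000,1),(245,5),(4500,5),(2200,5),(445,5),(185,5),(1234,1),(50,5);
""")

# No GROUP BY: the aggregate collapses every matching row into one total.
total_expenses = conn.execute("""
    SELECT SUM(e.total_amt) AS totalExpenses
    FROM loans_grants lg
    JOIN expenses e ON lg.internal_id = e.grant_loan_id
    WHERE lg.grant_loan_type = 'Loan'
""").fetchone()[0]

print(total_expenses)  # 10859, i.e. 3234 + 7625
```

The single row returned is the combined total for loans 1 and 5, which are the only loans with expenses in the sample data.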

Without knowing the exact format of the two tables, it's a bit hard to say. But here is the general idea:
select
l.id,
sum(e.amount) amount
from loans_grants l
left join expenses e on e.grant_loan_id = l.internal_id
where grant_loan_type = 'Loan'
group by l.id

Related

How to query scalable prices in MySQL

Dear stack overflow community,
This is my first post so please bear with me :)
I need to solve a SQL problem for a friend of mine.
He is running a web shop and wants to create a finance report.
The application he is using provides such functionality through an interface where MySQL queries can be executed.
I already created most of the report with my (limited) SQL knowledge, however I am struggling to solve the last problem.
The goal of the report is to UNION and JOIN several tables to get an overview of all commissions, invoices and proposals with their respective articles and prices.
So what I did so far:
I did a union of commissions, invoices and proposals (let's call them receipts) and joined them with articles and prices.
That worked very well.
However, here is my problem:
An article could have multiple prices depending on the date of the respective receipt.
So I end up with more rows in my result than there should be.
There is a "valid_until" field within the prices table, which I have to use for the filter ... but how?
Example:
receipt_id   receipt_date   article_id   article_price   valid_until   price_id
209986-1     2020-09-10     2925         13              2020-12-06    1
209986-1     2020-09-10     2931         13              2020-09-09    2
209986-1     2020-09-10     2937         12,6            2020-09-12    3
209986-1     2020-09-10     2980         12,32           0000-00-00    4
In this case, only price_id 3 is valid as the receipt_date is "2020-09-10".
My Query (with limited SQL knowledge):
SELECT *
FROM (SELECT * FROM commissions UNION ALL SELECT * FROM invoices UNION ALL SELECT * FROM proposals) AS receipt
LEFT JOIN article ON receipt.article = article.id
LEFT JOIN prices ON article.id = prices.artikel
WHERE receipt.date <= IF(prices.valid_until = '0000-00-00', Date('3000-01-01'), prices.valid_until)
With that query I still get 3 results (price_id 4, 3 and 1).
I managed to identify the valid price using DATEDIFF(), ORDER BY and LIMIT, however MySQL does not allow LIMIT in my subquery :(
Any help would be much appreciated.
KR,
Wlad

How to combine the data in 2 table, form the aggregation without duplicating the record by left join

Now I am working with SQL files and have a question:
I would like to review the effect of the promotion campaign with the data in the sql file. In the SQL file there are 2 tables, web traffic and promotion campaign
The web traffic table, let's say table web are as follows
visitor_id   purchase date   traffic_source   campaign_name   country   purchase_value
1            1/1/2018        Search           promotion101    US        100
2            2/1/2018        Direct           voucher02       UK        110
3            2/1/2018        Search           buyme01         US        50
4            3/1/2018        Banner           Example01       DE        130
..           .......         ...              ...             ..        ...
And in the second table I have the campaign information, let's say table promotion
Promotion_date   campaign_name   num_delivered   promotion_fee
1/12/2017        promotion101    50              30
2/12/2017        promotion101    30              20
2/12/2017        voucher02       40              10
3/12/2017        Example01       70              30
...              ...             ...             ...
In this case, I tried to use the left join to merge the table first but the record duplicated
Select
web.campaign_name,
sum(web.promotion_fee),
sum(promotion.purchase_value)
FROM
web LEFT JOIN promotion
ON web.campaign_name = promotion.campaign_name
GROUP BY
1
However, it doesn't work because the left join simply duplicates the records...
In this case, If I want to formulate the table like this:
Campaign_name Traffic_source Total_Customer Total_purchase_value Total expenditure
promotion101 Search 1000 2000 1500
Example01 Banner 2000 3750 3000
Is it possible to do so? If yes then How can I make it?
Many thanks for your help in advance!
You may perform the aggregations of each table in separate subqueries:
SELECT
    w.campaign_name,
    w.purchase_value AS Total_purchase_value,
    COALESCE(p.promotion_fee, 0) AS Total_expenditure
FROM
(
    SELECT campaign_name, SUM(purchase_value) AS purchase_value
    FROM web
    GROUP BY campaign_name
) w
LEFT JOIN
(
    SELECT campaign_name, SUM(promotion_fee) AS promotion_fee
    FROM promotion
    GROUP BY campaign_name
) p
    ON w.campaign_name = p.campaign_name;
A critical assumption I have made here is that the web table contains data for all campaigns. If not, then you might have to join to a third table containing all campaigns which happened. Actually, arguably such a table should already exist.
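To see why aggregating before joining matters, here is a small sketch that runs both a join-first query and the pre-aggregated query above on the sample rows from the question. It uses Python's sqlite3 module as a stand-in for MySQL (an assumption), and only the columns needed for the sums are created.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE web (visitor_id INTEGER, campaign_name TEXT, purchase_value INTEGER);
    CREATE TABLE promotion (campaign_name TEXT, promotion_fee INTEGER);
    INSERT INTO web VALUES
        (1,'promotion101',100),(2,'voucher02',110),(3,'buyme01',50),(4,'Example01',130);
    INSERT INTO promotion VALUES
        ('promotion101',30),('promotion101',20),('voucher02',10),('Example01',30);
""")

# Join first, aggregate second: each web row repeats once per promotion row.
naive = dict(conn.execute("""
    SELECT w.campaign_name, SUM(w.purchase_value)
    FROM web w LEFT JOIN promotion p ON w.campaign_name = p.campaign_name
    GROUP BY w.campaign_name
""").fetchall())

# Aggregate each table first, then join one row per campaign to one row per campaign.
pre_aggregated = {name: (value, fee) for name, value, fee in conn.execute("""
    SELECT w.campaign_name, w.purchase_value, COALESCE(p.promotion_fee, 0)
    FROM (SELECT campaign_name, SUM(purchase_value) AS purchase_value
          FROM web GROUP BY campaign_name) w
    LEFT JOIN (SELECT campaign_name, SUM(promotion_fee) AS promotion_fee
               FROM promotion GROUP BY campaign_name) p
      ON w.campaign_name = p.campaign_name
""")}

print(naive["promotion101"])           # 200: the single purchase of 100 is counted twice
print(pre_aggregated["promotion101"])  # (100, 50)
print(pre_aggregated["buyme01"])       # (50, 0): no promotion rows, so COALESCE gives 0
```

Because promotion101 has two promotion rows, the join-first version doubles its purchase value, while the pre-aggregated version keeps each side's total intact.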

Joining and selecting multiple tables and creating new column names

I have very limited experience with MySQL past standard queries, but when it comes to joins and relations between multiple tables I have a bit of an issue.
I've been tasked with creating a job that will pull a few values from a mysql database every 15 minutes but the info it needs to display is pulled from multiple tables.
I have worked with it for a while to figure out the relationships between everything for the phone system and I have discovered how I need to pull everything out but I'm trying to find the right way to create the job to do the joins.
I'm thinking of creating a new table for the info I need, with columns named as:
Extension | Total Talk Time | Total Calls | Outbound Calls | Inbound Calls | Missed Calls
I know that I need to start with the extension ID from my 'user' table and match it with 'extensionID' in my 'callSession'. There may be multiple instances of each extensionID but each instance creates a new 'UniqueCallID'.
The 'UniqueCallID' field then matches to 'UniqueCallID' in my 'CallSum' table. At that point, I just need to be able to say "For each 'uniqueCallID' that is associated with the same 'extensionID', get the sum of all instances in each column or a count of those instances".
Here is an example of what I need it to do:
callSession Table
UniqueCallID | extensionID |
----------------------------
A 123
B 123
C 123
callSum table
UniqueCallID | Duration | Answered |
------------------------------------
A 10 1
B 5 1
C 15 0
newReport table
Extension | Total Talk Time | Total Calls | Missed Calls
--------------------------------------------------------
123 30 3 1
Hopefully that conveys my idea properly.
If I create a table to hold these values, I need to know how I would select, join and insert those things based on that diagram but I'm unable to construct the right query/statement.
You simply JOIN the two tables, and do a group by on the extensionID. Also, add formulas to summarize and gather the info.
SELECT
    a.`extensionID` AS `Extension`,
    SUM(b.`Duration`) AS `Total Talk Time`,
    -- UniqueCallID exists in both tables, so it must be qualified
    COUNT(DISTINCT a.`UniqueCallID`) AS `Total Calls`,
    SUM(IF(b.`Answered` = 1,0,1)) AS `Missed Calls`
FROM `callSession` a
JOIN `callSum` b
    ON a.`UniqueCallID` = b.`UniqueCallID`
GROUP BY a.`extensionID`
ORDER BY a.`extensionID`
You can use a join and group by
select
a.extensionID
, sum(b.Duration) as Total_Talk_Time
, count(b.Answered) as Total_Calls
, count(b.Answered) -sum(b.Answered) as Missed_calls
from callSession as a
inner join callSum as b on a.UniqueCallID = b.UniqueCallID
group by a.extensionID
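As a quick check, the join and group by above can be run against the three sample calls from the question. This is only a sketch: it uses Python's sqlite3 module in place of MySQL (an assumption), with column types guessed from the example tables.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE callSession (UniqueCallID TEXT, extensionID INTEGER);
    CREATE TABLE callSum (UniqueCallID TEXT, Duration INTEGER, Answered INTEGER);
    INSERT INTO callSession VALUES ('A',123),('B',123),('C',123);
    INSERT INTO callSum VALUES ('A',10,1),('B',5,1),('C',15,0);
""")

# Answered is 1 or 0, so total calls minus the sum of Answered gives missed calls.
report = conn.execute("""
    SELECT a.extensionID,
           SUM(b.Duration) AS Total_Talk_Time,
           COUNT(b.Answered) AS Total_Calls,
           COUNT(b.Answered) - SUM(b.Answered) AS Missed_Calls
    FROM callSession AS a
    INNER JOIN callSum AS b ON a.UniqueCallID = b.UniqueCallID
    GROUP BY a.extensionID
""").fetchall()

print(report)  # [(123, 30, 3, 1)]
```

That matches the newReport row in the question: extension 123, 30 total talk time, 3 calls, 1 missed.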
This should do the trick. What you are being asked to do is to aggregate the number of and duration of calls. Unless explicitly requested, you do not need to create a new table to do this. The right combination of JOINs and AGGREGATEs will get the information you need. This should be pretty straightforward... the only semi-interesting part is calculating the number of missed calls, which is accomplished here using a "CASE" statement as a conditional check on whether each call was answered or not.
Pardon my syntax... My experience is with SQL Server.
SELECT CS.extensionID AS `Extension`, SUM(CA.Duration) AS `Total Talk Time`, COUNT(CS.UniqueCallID) AS `Total Calls`, SUM(CASE WHEN CA.Answered = 0 THEN 1 ELSE 0 END) AS `Missed Calls`
FROM callSession CS
INNER JOIN callSum CA ON CA.UniqueCallID = CS.UniqueCallID
GROUP BY CS.extensionID

MySQL: How to sum distinct rows in complex joined query

I have a MySQL question I cannot solve myself (for the first time).
I have a query-with-parameters database plus PHP program that, together, generate extensive MySQL queries to run.
The problem is actually a simple one: that of correct summation. I need to SUM distinct rows (not values) within a complex, multi-joined query, and I cannot get it to work.
Do not ask why I work with the data structure below - I am working with data that is supplied to me and it needs to be this way. (The tables represent existing invoices.)
I will try to reproduce the situation very simplified here.
TABLE INVOICE
=============
Inv.Nr (ID) Other Data
------------------------
#1 Stuff
#2 Stuff
#3 More Stuff
TABLE INVOICE LINE
==================
ID Inv.Nr QUANTITY ArticleID UNIT PRICE
----------------------------------------------
1 #1 1 5 € 2.50
2 #1 1 109 € 4.00
3 #2 4 77 € 5.00
4 #2 10 91 € 6.00
TABLE INVOICE LINE VAT
======================
ID LINE-ID AMOUNT VATP VAT
1 1 € 2.00 25% € 0.50
2 2 € 2.00 25% € 0.50
3 2 € 1.42 6% € 0.08
4 3 €18.87 6% € 1.23
5 4 €16.00 25% € 4.00
6 4 €37.74 6% € 2.26
As you can see: some articles have a double VAT rate, because they consist of more elements that have different VAT rates (i.e. a book with a cd).
Now the queries are very long, and many more tables are joined, which can have dynamic WHERE and GROUP BY clauses. So a query might look somewhat like this (again much simplified):
SELECT `Inv.Nr`, ArticleID, SUM(Quantity), SUM(Amount), SUM(VAT)
FROM ((((`Invoice` INNER JOIN `Invoice Line`
ON `Invoice`.`Inv.Nr`=`Invoice Line`.`Inv.Nr`)
INNER JOIN `Invoice Line VAT`
ON `Invoice Line`.ID = `Invoice Line VAT`.`Line-ID`)
INNER JOIN `More Stuff`
ON .... )
INNER JOIN ....
ON ..... )
WHERE ....
GROUP BY .....
HAVING .....
The INNER JOINs defined by ... are many to 1, so Invoice Line VAT is on the many-side of both its JOIN relations.
The WHERE, GROUP BY and HAVING are semi-dynamically created in PHP code.
My problem is that I cannot get a proper SUM(Amount) and SUM(Quantity) at the same time, since the Quantity is added multiple times if there are multiple VAT rates for one invoice line.
SUM(DISTINCT Quantity) obviously doesn't work, since I need distinct rows, not values.
I cannot really create a subquery that either calculates the number of VAT rates (and divides the SUM(Quantity)), or calculates the Amount, since the subquery needs the same WHERE/HAVING parameters as the main query to work properly, and those are semi-dynamic (the queries are in a database and contain parameters that are filled in following the user's commands). Well, to be fair, I could do it, but it would leave the query-database and the php software extremely complicated, and I don't want to use a very complex solution for such a very simple problem, especially since someone else will have to maintain it in the future.
So how do I:
SUM the quantity only on distinct rows, or
COUNT the number of VAT rates per line, given the WHERE/HAVING (so without a subquery)?
I could add extra fields to the tables to help with this problem, but that possibility didn't help me - yet. For instance: storing the number of VAT rates doesn't help, since in the WHERE there may be a selection on VAT rate.
I hope it is something VERY simple that I overlooked, but I have been searching for hours now to no avail...
If anyone can help me that would be great! Thanks in advance!
EDIT: I found a solution, but I am not very pleased with it. I have to split up the WHERE, and SUM SUMs and repeat columns... It is UGLY and badly maintainable.
It is as follows:
SELECT `Inv.Nr`, ArticleID, SUM(Quantity), SUM(Amount), SUM(VAT)
FROM ((`Invoice` INNER JOIN `Invoice Line`
ON `Invoice`.`Inv.Nr`=`Invoice Line`.`Inv.Nr`)
INNER JOIN
(SELECT SUM(Amount) AS Amount, SUM(VAT) AS VAT, `Line-ID`
FROM ((`Invoice Line VAT`
INNER JOIN `More Stuff`
ON .... )
INNER JOIN ....
ON ..... )
WHERE some-where-stuff
GROUP BY `Line-ID`) x
ON `Invoice Line`.ID = x.`Line-ID`)
WHERE other-where-stuff
GROUP BY .....
HAVING .....
I hope someone got a more elegant, simpler solution!
In an update to the question, I answered the question myself. I said that I hoped for a less ugly and badly maintainable solution than:
SELECT `Inv.Nr`, ArticleID, SUM(Quantity), SUM(Amount), SUM(VAT)
FROM ((`Invoice` INNER JOIN `Invoice Line`
ON `Invoice`.`Inv.Nr`=`Invoice Line`.`Inv.Nr`)
INNER JOIN
(SELECT SUM(Amount) AS Amount, SUM(VAT) AS VAT, `Line-ID`
FROM ((`Invoice Line VAT`
INNER JOIN `More Stuff`
ON .... )
INNER JOIN ....
ON ..... )
WHERE some-where-stuff
GROUP BY `Line-ID`) x
ON `Invoice Line`.ID = x.`Line-ID`)
WHERE other-where-stuff
GROUP BY .....
HAVING .....
It turns out that, now that I am working with this solution and rephrasing all my queries based on it, it is not so humongous and ugly after all. It works quite well, and much better than the other solutions and workarounds I have tried. Since I suspect there is no other solution than the one I wrote, I am closing this question by confirming that the query cited above is the right answer.
Using the correct SQL code instead of workarounds is the right way to go, even when it looks too complicated at first. And since there is nothing like SUM(DISTINCT ...) that works on distinct records instead of distinct values, the above code is the correct code in this case.
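For anyone who wants to see the effect on the sample data, here is a simplified sketch of the derived-table approach above. It runs on Python's sqlite3 module rather than MySQL, and the names invoice_line, invoice_line_vat, amount_cents and vat_cents are hypothetical stand-ins for the real tables and columns (amounts are stored as integer cents to avoid rounding noise).

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
# Table and column names are hypothetical simplifications of the question's tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoice_line (id INTEGER, inv_nr TEXT, quantity INTEGER, article_id INTEGER);
    CREATE TABLE invoice_line_vat (id INTEGER, line_id INTEGER, amount_cents INTEGER, vat_cents INTEGER);
    INSERT INTO invoice_line VALUES (1,'#1',1,5),(2,'#1',1,109),(3,'#2',4,77),(4,'#2',10,91);
    INSERT INTO invoice_line_vat VALUES
        (1,1,200,50),(2,2,200,50),(3,2,142,8),(4,3,1887,123),(5,4,1600,400),(6,4,3774,226);
""")

# Direct join: a line with two VAT rows contributes its quantity twice.
naive = {r[0]: r[1:] for r in conn.execute("""
    SELECT l.inv_nr, SUM(l.quantity), SUM(v.amount_cents)
    FROM invoice_line l
    INNER JOIN invoice_line_vat v ON l.id = v.line_id
    GROUP BY l.inv_nr
""")}

# Collapse the VAT rows to one row per line first, then join and sum.
fixed = {r[0]: r[1:] for r in conn.execute("""
    SELECT l.inv_nr, SUM(l.quantity), SUM(x.amount_cents)
    FROM invoice_line l
    INNER JOIN (SELECT line_id, SUM(amount_cents) AS amount_cents
                FROM invoice_line_vat GROUP BY line_id) x
      ON l.id = x.line_id
    GROUP BY l.inv_nr
""")}

print(naive)  # quantities 3 and 24: inflated by the lines with two VAT rates
print(fixed)  # quantities 2 and 14, amounts unchanged
```

The amount totals are the same either way; only the quantity is inflated by the direct join, which is exactly the symptom described in the question.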

Efficient MySQL query method for multiple joins

I am asking this question in the hope that there is a more efficient (faster) way to pull and insert data in the tables I am working with.
The basic structure of the data table is
ID Doc_ID Field Value
1 10 Title abc
2 10 Abstract xyz
3 10 Author Bob
4 11 Publisher Bookworms
5 11 Title zzz
6 11 Abstract bbb
7 12 Title aaa
8 12 Sale No
In other words, the data tables are row based: each row contains a document ID and the corresponding field value. Not all documents have the same number of fields defined. Indeed, books may differ radically from magazines.
The data table has 10,000,000 rows; a document typically has 100 fields associated with it.
So the performance problem I am finding is pulling a report with reference to 50+ different fields, for example if I have a query list in an order_table the query could be like
select ord.number as 'Order ID', d1.value as 'Title', d2.value as 'Author' .......
from order_table ord
LEFT JOIN data_table as d1 on d1.Doc_ID=ord.Doc_ID and d1.Field='Title'
LEFT JOIN data_table as d2 on d2.Doc_ID=ord.Doc_ID and d2.Field='Author'
........
LEFT JOIN data_table as d50 on d50.Doc_ID=ord.Doc_ID and d50.Field='Qty'
Using LEFT JOINS as there is no guarantee that the field is defined for that document.
Given that there may be some WHERE parameters to limit the list of items (in stock, for example, or below a price), it is a slow query. Indexes don't really help much.
Without being able to change the data model, what is the best way to pull volumes of information out?
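One pattern that is sometimes tried for this kind of row-based layout is conditional aggregation: join data_table once and pivot the fields with MAX(CASE ...) instead of adding one LEFT JOIN per field. The sketch below only illustrates the shape of such a query; it uses Python's sqlite3 module instead of MySQL, the order_table rows are made up for the example, and whether it is actually faster on 10,000,000 rows would need to be measured.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE data_table (ID INTEGER, Doc_ID INTEGER, Field TEXT, Value TEXT);
    CREATE TABLE order_table (number INTEGER, Doc_ID INTEGER);
    INSERT INTO data_table VALUES
        (1,10,'Title','abc'),(2,10,'Abstract','xyz'),(3,10,'Author','Bob'),
        (4,11,'Publisher','Bookworms'),(5,11,'Title','zzz'),(6,11,'Abstract','bbb'),
        (7,12,'Title','aaa'),(8,12,'Sale','No');
    -- Hypothetical orders, one per document, invented for this example.
    INSERT INTO order_table VALUES (1,10),(2,11),(3,12);
""")

# A single join to data_table; each CASE picks out one field, MAX collapses the group.
rows = conn.execute("""
    SELECT ord.number,
           MAX(CASE WHEN d.Field = 'Title'  THEN d.Value END) AS Title,
           MAX(CASE WHEN d.Field = 'Author' THEN d.Value END) AS Author
    FROM order_table ord
    LEFT JOIN data_table d
           ON d.Doc_ID = ord.Doc_ID AND d.Field IN ('Title', 'Author')
    GROUP BY ord.number
    ORDER BY ord.number
""").fetchall()

print(rows)  # [(1, 'abc', 'Bob'), (2, 'zzz', None), (3, 'aaa', None)]
```

Fields that are not defined for a document come back as NULL (None in Python), the same as with the LEFT JOIN version.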