Getting sum of two rows in an inner joined table - mysql

I have these two tables;
trips
id date revenue
1 01/01/2020 5000
2 01/01/2020 3000
3 02/01/2020 4000
4 02/01/2020 2000
expenses
id tripid amount
1 1 500
2 1 300
3 2 400
4 2 200
5 2 700
I would like to get the sum of revenue collected in a day AND sum of expenses in a day. I have the following sql which gives me results but the sums are entirely wrong.
SELECT i.id, sum(i.revenue) as total, i.date trip , sum(c.amount) as exp, c.tripid expenses FROM trip i INNER JOIN expenses c ON i.id = c.tripid GROUP BY i.date ORDER BY trip DESC

You can preaggregate the expenses by trip, and then aggregate again in the outer query:
select t.date, sum(t.revenue) as revenue, coalesce(sum(e.expense), 0) as expense
from trips t
left join (
select tripid, sum(amount) as expense
from expenses
group by tripid
) e on e.tripid = t.id
group by t.date
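To see why the derived table matters, here is a minimal sketch that loads the question's sample rows into an in-memory SQLite database and compares a direct join with the pre-aggregated version. SQLite is an assumption made so the example is self-contained (the question targets MySQL), and the variable names naive and fixed are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE trips (id INTEGER PRIMARY KEY, date TEXT, revenue INTEGER);
CREATE TABLE expenses (id INTEGER PRIMARY KEY, tripid INTEGER, amount INTEGER);
INSERT INTO trips VALUES (1,'01/01/2020',5000),(2,'01/01/2020',3000),
                         (3,'02/01/2020',4000),(4,'02/01/2020',2000);
INSERT INTO expenses VALUES (1,1,500),(2,1,300),(3,2,400),(4,2,200),(5,2,700);
""")

# Direct join: each trip's revenue is repeated once per matching expense row,
# and trips without expenses disappear because of the inner join.
naive = conn.execute("""
SELECT t.date, SUM(t.revenue), SUM(e.amount)
FROM trips t
JOIN expenses e ON e.tripid = t.id
GROUP BY t.date
""").fetchall()

# Pre-aggregated: expenses are collapsed to one row per trip before joining,
# so each trip's revenue is counted exactly once.
fixed = conn.execute("""
SELECT t.date, SUM(t.revenue) AS revenue, COALESCE(SUM(e.expense), 0) AS expense
FROM trips t
LEFT JOIN (SELECT tripid, SUM(amount) AS expense
           FROM expenses
           GROUP BY tripid) e ON e.tripid = t.id
GROUP BY t.date
ORDER BY t.date
""").fetchall()

print(naive)  # [('01/01/2020', 19000, 2100)]
print(fixed)  # [('01/01/2020', 8000, 2100), ('02/01/2020', 6000, 0)]
```

The direct join inflates the revenue for 01/01/2020 (trip 1 is counted twice, trip 2 three times) and drops 02/01/2020 entirely, while the pre-aggregated query returns one correct row per date.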

Related

How to group rows in SQL by earliest date when there are multiple rows with the earliest date?

I am trying to come up with a query that will return the aggregate data for the earliest orders the customers have placed. What I cannot quite wrap my head around is how to construct this query when there are multiple orders placed on the same day for the earliest purchase date for customer 2.
customers
id name created_at
1 Sam 2019-07-12
2 Jimmy 2019-01-22
items
id name price
1 Watch 200
2 Belt 75
3 Wallet 150
orders
id customer_id item_id created_at
1 1 1 2018-08-01
2 1 2 2018-08-11
3 2 1 2019-01-22
4 2 3 2019-01-22
5 2 2 2019-03-03
expected query
customer_id name first_purchase_date n_items total_price
1 Sam 2018-08-01 1 200
2 Jimmy 2019-01-22 2 350
I currently have the following query set up, but this query is grouping by the customer_id such that the total number of items and total price do not reflect the earliest orders.
SELECT
orders.customer_id,
customers.name AS name,
MIN(orders.created_at) AS first_purchase_date,
COUNT(*) as n_items,
SUM(items.price) as total_price
FROM orders
INNER JOIN customers
ON orders.customer_id = customers.id
INNER JOIN items
ON orders.item_id = items.id
GROUP BY
customers.id
my incorrect query
customer_id name first_purchase_date n_items total_price
1 Sam 2018-08-01 2 275
2 Jimmy 2019-01-22 3 425
I recreated the tables in a SQL Server environment, but this should still help, as it gives you the query result you're looking for. The data is exactly the same, but I'm using temporary tables, hence the # prefixes.
SELECT
#orders.customer_id,
#customer.name AS name,
#orders.created_at as first_purchase_date,
--MIN(#orders.created_at) AS first_purchase_date,
COUNT(*) as n_items,
SUM(#items.price) as total_price
FROM #orders
INNER JOIN #customer
ON #orders.customer_id = #customer.id
INNER JOIN #items
ON #orders.item_id = #items.id
inner join
(
select customer_id, name, MIN(first_purchase_date) as
first_purchase_date
from
(
SELECT
#orders.customer_id,
#customer.name AS name,
#orders.created_at as first_purchase_date,
--MIN(#orders.created_at) AS first_purchase_date,
COUNT(*) as n_items,
SUM(#items.price) as total_price
FROM #orders
INNER JOIN #customer
ON #orders.customer_id = #customer.id
INNER JOIN #items
ON #orders.item_id = #items.id
group by #orders.customer_id,#customer.name, #orders.created_at
)base
group by customer_id, name
) firstorders
on
#customer.id = firstorders.customer_id
and
#customer.name = firstorders.name
and
#orders.created_at = firstorders.first_purchase_date
group by
#orders.customer_id,#customer.name, #orders.created_at
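The same idea can be written more compactly. The sketch below is a simplified variant, not the answerer's exact query: it finds each customer's earliest order date in one derived table (aliased f, a name chosen here for illustration) and joins back so that only orders placed on that date are aggregated. It runs against an in-memory SQLite database, which is an assumption made to keep the example self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, created_at TEXT);
CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, price INTEGER);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER,
                     item_id INTEGER, created_at TEXT);
INSERT INTO customers VALUES (1,'Sam','2019-07-12'),(2,'Jimmy','2019-01-22');
INSERT INTO items VALUES (1,'Watch',200),(2,'Belt',75),(3,'Wallet',150);
INSERT INTO orders VALUES (1,1,1,'2018-08-01'),(2,1,2,'2018-08-11'),
                          (3,2,1,'2019-01-22'),(4,2,3,'2019-01-22'),
                          (5,2,2,'2019-03-03');
""")

rows = conn.execute("""
SELECT o.customer_id, c.name, o.created_at AS first_purchase_date,
       COUNT(*) AS n_items, SUM(i.price) AS total_price
FROM orders o
JOIN customers c ON c.id = o.customer_id
JOIN items i ON i.id = o.item_id
-- f holds one row per customer with that customer's earliest order date
JOIN (SELECT customer_id, MIN(created_at) AS first_date
      FROM orders
      GROUP BY customer_id) f
  ON f.customer_id = o.customer_id AND f.first_date = o.created_at
GROUP BY o.customer_id, c.name, o.created_at
ORDER BY o.customer_id
""").fetchall()

print(rows)  # [(1, 'Sam', '2018-08-01', 1, 200), (2, 'Jimmy', '2019-01-22', 2, 350)]
```

Because the join condition keeps every order that shares the earliest date, both of Jimmy's orders from 2019-01-22 are counted, which matches the expected result in the question.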

MySQL sums from multiple tables

I have multiple tables and I need to aggregate the data from all of them, but it seems that I always get the wrong results for the sums. What am I doing wrong?
customers
ID Name
1 c1
2 c2
3 c3
budget
ID Cust_ID Value
1 1 100
2 1 300
3 2 600
4 3 450
forecast
ID Cust_ID Value
1 1 200
2 1 500
3 2 100
4 2 700
5 3 550
orders
ID Cust_ID Net_Sales
1 1 100
2 1 200
3 1 300
4 2 400
5 3 500
Here is the expected result:
ID Name sum(budget.Value) sum(forecast.Value) sum(orders.Net_Sales) count(orders.ID)
1 c1 400 700 600 3
2 c2 600 800 400 1
3 c3 450 550 500 1
And here's what I've tried so far:
SELECT customers.ID, customers.Name, sum(budget.Value), sum(forecast.Value), sum(orders.Net_Sales), count(orders.ID)
FROM customers
INNER JOIN budget ON budget.Cust_ID = customers.ID
INNER JOIN forecast ON forecast.Cust_ID = customers.ID
INNER JOIN orders ON orders.Cust_ID = customers.ID
GROUP BY customers.ID
ORDER BY customers.ID ASC
You are joining along multiple dimensions, which multiplies the results.
A simple solution is correlated subqueries:
SELECT c.ID, c.Name,
(SELECT SUM(b.Value)
FROM budget b
WHERE b.Cust_ID = c.ID
) as budget,
(SELECT SUM(f.Value)
FROM forecast f
WHERE f.Cust_ID = c.ID
) as forecast,
(SELECT SUM(o.Net_Sales)
FROM orders o
WHERE o.Cust_ID = c.ID
) as net_sales
FROM customers c
ORDER BY c.ID ASC;
With the right indexes on the child tables (budget(cust_id, value), and so on), this may actually be faster than a JOIN approach.
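The multiplication is easy to reproduce. The following sketch uses an in-memory SQLite database (an assumption for portability; the question is about MySQL) with the question's data, and runs both the triple join and the correlated-subquery version. The names naive and fixed are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (ID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE budget (ID INTEGER PRIMARY KEY, Cust_ID INTEGER, Value INTEGER);
CREATE TABLE forecast (ID INTEGER PRIMARY KEY, Cust_ID INTEGER, Value INTEGER);
CREATE TABLE orders (ID INTEGER PRIMARY KEY, Cust_ID INTEGER, Net_Sales INTEGER);
INSERT INTO customers VALUES (1,'c1'),(2,'c2'),(3,'c3');
INSERT INTO budget VALUES (1,1,100),(2,1,300),(3,2,600),(4,3,450);
INSERT INTO forecast VALUES (1,1,200),(2,1,500),(3,2,100),(4,2,700),(5,3,550);
INSERT INTO orders VALUES (1,1,100),(2,1,200),(3,1,300),(4,2,400),(5,3,500);
""")

# Joining three child tables at once: customer c1 produces 2 * 2 * 3 = 12 rows,
# so every sum is inflated by the row counts of the other two tables.
naive = conn.execute("""
SELECT c.ID, c.Name, SUM(b.Value), SUM(f.Value), SUM(o.Net_Sales)
FROM customers c
JOIN budget b ON b.Cust_ID = c.ID
JOIN forecast f ON f.Cust_ID = c.ID
JOIN orders o ON o.Cust_ID = c.ID
GROUP BY c.ID, c.Name
ORDER BY c.ID
""").fetchall()

# Correlated subqueries: each sum is computed against a single table.
fixed = conn.execute("""
SELECT c.ID, c.Name,
       (SELECT SUM(b.Value) FROM budget b WHERE b.Cust_ID = c.ID) AS budget,
       (SELECT SUM(f.Value) FROM forecast f WHERE f.Cust_ID = c.ID) AS forecast,
       (SELECT SUM(o.Net_Sales) FROM orders o WHERE o.Cust_ID = c.ID) AS net_sales
FROM customers c
ORDER BY c.ID
""").fetchall()

print(naive[0])  # (1, 'c1', 2400, 4200, 2400)
print(fixed)     # [(1, 'c1', 400, 700, 600), (2, 'c2', 600, 800, 400), (3, 'c3', 450, 550, 500)]
```

For c1 the budget total of 400 is repeated for each of the 2 forecast rows and 3 order rows, giving 2400. The subquery version matches the expected sums from the question.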

Creating an overdraft statement

I'm currently stuck on how to create a statement that shows daily overdraft statements for a particular council.
I have the following tables: councils, users, markets, market_transactions, user_deposits.
market_transactions run daily, reducing the user's account balance. When the account_balance reaches 0, the user goes into overdraft (negative). When users make a deposit, their account balance increases.
I have included the following tables to show how transactions and deposits are stored.
If I reverse today's transactions, I can get the account balance a user had yesterday, but formulating a query to get the daily OD amount is where the problem is.
USERS
user_id name account_bal
1 Wells -5
2 James 100
3 Joy 10
4 Mumbi -300
DEPOSITS
id user_id amount date
1 1 5 2021-04-26
2 3 10 2021-04-26
3 3 5 2021-04-25
4 4 5 2021-04-25
TRANSACTIONS
id user_id amount_tendered date
1 1 5 2021-04-27
2 2 10 2021-04-26
3 3 15 2021-04-26
4 4 50 2021-04-25
The Relationships are as follows,
COUNCILS
council_id name
1 a
2 b
3 c
MARKETS
market_id name council_id
1 x 3
2 y 1
3 z 2
MARKET_USER_LINK
id market_id user_id
1 1 3
2 2 2
3 3 1
I'm running this SQL query to get the total amount users have spent, which I then subtract from the current user account balance.
I don't know if I can use this to figure out the account_balance for each day.
SELECT u.user_id, total_spent, total_deposits,m.council_id
FROM users u
JOIN market_user_link ul ON ul.user_id= u.user_id
LEFT JOIN markets m ON ul.market_id =m.market_id
LEFT JOIN councils c ON m.council_id =c.council_id
LEFT JOIN (
SELECT user_id, SUM(amount_tendered) AS total_spent
FROM transactions
WHERE DATE(date) BETWEEN DATE('2021-02-01') AND DATE(NOW())
GROUP BY user_id
) t ON t.user_id= u.user_id
ORDER BY user_id, total_spent ASC
The output looks like this when run:
| user_id | total_spent | council_id |
|-------------|----------------|------------|
| 1 | 50.00 | 1 |
| 2 | 2.00 | 3 |
I was hoping to reverse the transactions and deposits to get the account balance for a day, then sum the balances of users whose account balance is < 0, but this has failed to work.
The goal is to produce a query that shows the daily overdraft (summing only the account balances of users whose balance is below 0) for a particular council.
Expected Result
date council_id o_d_amount
2021-04-24 1 -300.00
2021-04-24 2 -60.00
2021-04-24 3 -900.00
2021-04-25 1 -600.00
2021-04-25 2 -100.00
2021-04-25 3 -1200.00
This is actually not that hard, but the way you asked makes it hard to follow.
Also, your expected result should match the data you provided.
Edited: The previous solution was wrong. It counted withdrawals and deposits more than once when a user had more than one event on the same date.
Start by computing the total exchanged on each day, like this:
select user_id, date, sum(amount) exchanged_on_day from (
select user_id, date, amount amount from deposits
union all select user_id, date, -amount_tendered amount from transactions
) d
group by user_id, date
order by user_id, date;
The next query gets the state of the account only on days that had any deposits or withdrawals.
To get results for all days (and not just those with account movement), change the cross join part to use a table with all the dates you want (see "Get all dates between two dates in SQL Server"), but I digress...
select dates.date, c.council_id, u.name username
, u.account_bal - sum(case when e.date >= dates.date then e.exchanged_on_day else 0 end) as amount_on_start_of_day
, u.account_bal - sum(case when e.date > dates.date then e.exchanged_on_day else 0 end) as amount_on_end_of_day
from councils c
inner join markets m on c.council_id=m.council_id
inner join market_user_link mul on m.market_id=mul.market_id
inner join users u on mul.user_id=u.user_id
left join (
select user_id, date, sum(amount) exchanged_on_day from (
select user_id, date, amount amount from deposits
union all select user_id, date, -amount_tendered amount from transactions
) d group by user_id, date
) e on u.user_id=e.user_id -- exchange on each day
cross join (select distinct date from (select date from deposits union select date from transactions) datesInternal) dates -- all days that had a transaction
group by dates.date, c.council_id, u.name, u.account_bal
order by dates.date desc, c.council_id, u.name;
From there you can rearrange to get the result you want.
select date, council_id
, sum(case when amount_on_start_of_day<0 then amount_on_start_of_day else 0 end) o_d_amount_start
, sum(case when amount_on_end_of_day<0 then amount_on_end_of_day else 0 end) o_d_amount_end
from (
select dates.date, c.council_id, u.name username
, u.account_bal - sum(case when e.date >= dates.date then e.exchanged_on_day else 0 end) as amount_on_start_of_day
, u.account_bal - sum(case when e.date > dates.date then e.exchanged_on_day else 0 end) as amount_on_end_of_day
from councils c
inner join markets m on c.council_id=m.council_id
inner join market_user_link mul on m.market_id=mul.market_id
inner join users u on mul.user_id=u.user_id
left join (
select user_id, date, sum(amount) exchanged_on_day from (
select user_id, date, amount amount from deposits
union all select user_id, date, -amount_tendered amount from transactions
) d group by user_id, date
) e on u.user_id=e.user_id -- exchange on each day
cross join (select distinct date from (select date from deposits union select date from transactions) datesInternal) dates -- all days that had a transaction
group by dates.date, c.council_id, u.name, u.account_bal
) result
group by date, council_id
order by date;
You can check it on https://www.db-fiddle.com/f/msScT6B5F7FjU2aQXVr2da/6
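The backward reconstruction used in the queries above can be tried in isolation. The sketch below is a reduced version under stated assumptions: it uses the question's users, deposits and transactions in an in-memory SQLite database, leaves out the council mapping, and keeps only the end-of-day balance. The CTE names net, days and bal are hypothetical names chosen for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT, account_bal INTEGER);
CREATE TABLE deposits (id INTEGER PRIMARY KEY, user_id INTEGER, amount INTEGER, date TEXT);
CREATE TABLE transactions (id INTEGER PRIMARY KEY, user_id INTEGER,
                           amount_tendered INTEGER, date TEXT);
INSERT INTO users VALUES (1,'Wells',-5),(2,'James',100),(3,'Joy',10),(4,'Mumbi',-300);
INSERT INTO deposits VALUES (1,1,5,'2021-04-26'),(2,3,10,'2021-04-26'),
                            (3,3,5,'2021-04-25'),(4,4,5,'2021-04-25');
INSERT INTO transactions VALUES (1,1,5,'2021-04-27'),(2,2,10,'2021-04-26'),
                                (3,3,15,'2021-04-26'),(4,4,50,'2021-04-25');
""")

rows = conn.execute("""
WITH net AS (
    -- signed movement per user and day: deposits positive, spending negative
    SELECT user_id, date, SUM(amount) AS net
    FROM (SELECT user_id, date, amount FROM deposits
          UNION ALL
          SELECT user_id, date, -amount_tendered FROM transactions) m
    GROUP BY user_id, date
),
days AS (
    SELECT date FROM deposits UNION SELECT date FROM transactions
),
bal AS (
    -- end-of-day balance = current balance minus everything that happened later
    SELECT d.date, u.user_id,
           u.account_bal
             - COALESCE(SUM(CASE WHEN n.date > d.date THEN n.net END), 0) AS end_bal
    FROM users u
    CROSS JOIN days d
    LEFT JOIN net n ON n.user_id = u.user_id
    GROUP BY d.date, u.user_id, u.account_bal
)
SELECT date, SUM(CASE WHEN end_bal < 0 THEN end_bal ELSE 0 END) AS o_d_amount
FROM bal
GROUP BY date
ORDER BY date
""").fetchall()

print(rows)  # [('2021-04-25', -305), ('2021-04-26', -300), ('2021-04-27', -305)]
```

On 2021-04-26, for example, user 1 ends the day at 0 (the current -5 with the later spend of 5 undone), so only user 4's -300 counts as overdraft.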
Basically the query maps users to councils, calculates periods of overdraft for users, then aggregates over councils. I assume that the starting balance is dated at the start of the month, '2021-04-01' (it could be an ending balance as well, see below); change it as needed. I also assume that a negative starting balance counts as an overdraft. For simplicity and debugging, the query is divided into a number of steps.
with uc as (
select distinct m.council_id, mul.user_id
from markets m
join market_user_link mul on m.market_id = mul.market_id
),
user_running_total as (
select user_id, date,
coalesce(lead(date) over(partition by user_id order by date) - interval 1 day, date) nxt,
sum(sum(s)) over(partition by user_id order by date) rt
from (
select user_id, date, -amount_tendered s
from transactions
union all
select user_id, date, amount
from deposits
union all
select user_id, se.d, se.s
from users
cross join lateral (
select date(NOW() + interval 1 day) d, 0 s
union all
select '2021-04-01' d, account_bal
) se
) t
group by user_id, date
),
user_overdraft as (
select user_id, date, nxt, least(rt, 0) ovd
from user_running_total
where date <= date(NOW())
),
dates as (
select date
from user_overdraft
union
select nxt
from user_overdraft
),
council__overdraft as (
select uc.council_id, d.date, sum(uo.ovd) total_overdraft, lag(sum(uo.ovd), 1, sum(uo.ovd) - 1) over(partition by uc.council_id order by d.date) prev_ovd
from uc
cross join dates d
join user_overdraft uo on uc.user_id = uo.user_id and d.date between uo.date and uo.nxt
group by uc.council_id, d.date
)
select council_id, date, total_overdraft
from council__overdraft
where total_overdraft <> prev_ovd
order by date, council_id
Really, council__overdraft is quite usable on its own; the last step just compacts the output by excluding intermediate dates on which the overdraft does not change.
With the following sample data:
users
user_id name account_bal
1 Wells -5
2 James 100
3 Joy 10
4 Mumbi -300
deposits, ordered by date, with an extra row added for the last date
id user_id amount date
3 3 5 2021-04-25
4 4 5 2021-04-25
1 1 5 2021-04-26
2 3 10 2021-04-26
5 3 73 2021-05-06
transactions, ordered by date (note the added row, to illustrate the running total in action)
id user_id amount_tendered date
5 4 50 2021-04-25
2 2 10 2021-04-26
3 3 15 2021-04-26
1 1 5 2021-04-27
4 3 17 2021-04-27
councils
council_id name
1 a
2 b
3 c
markets
market_id name council_id
1 x 3
2 y 1
3 z 2
market_user_link
id market_id user_id
1 1 3
2 2 2
3 3 1
4 3 4
the query output is
council_id date overdraft
1 2021-04-01 0
2 2021-04-01 -305
3 2021-04-01 0
2 2021-04-25 -350
2 2021-04-26 -345
2 2021-04-27 -350
3 2021-04-27 -7
3 2021-05-06 0
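The forward running total at the heart of user_running_total can be checked on its own. This is a minimal sketch, not the full query: it assumes an in-memory SQLite database (version 3.25 or later, for window functions), treats account_bal as the opening balance dated '2021-04-01' as the answer does, and skips the council mapping and date ranges. The CTE name moves and the Python variables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT, account_bal INTEGER);
CREATE TABLE deposits (id INTEGER PRIMARY KEY, user_id INTEGER, amount INTEGER, date TEXT);
CREATE TABLE transactions (id INTEGER PRIMARY KEY, user_id INTEGER,
                           amount_tendered INTEGER, date TEXT);
INSERT INTO users VALUES (1,'Wells',-5),(2,'James',100),(3,'Joy',10),(4,'Mumbi',-300);
INSERT INTO deposits VALUES (3,3,5,'2021-04-25'),(4,4,5,'2021-04-25'),
                            (1,1,5,'2021-04-26'),(2,3,10,'2021-04-26'),
                            (5,3,73,'2021-05-06');
INSERT INTO transactions VALUES (5,4,50,'2021-04-25'),(2,2,10,'2021-04-26'),
                                (3,3,15,'2021-04-26'),(1,1,5,'2021-04-27'),
                                (4,3,17,'2021-04-27');
""")

rows = conn.execute("""
WITH moves AS (
    -- opening balance plus signed daily movements, one row per user and day
    SELECT user_id, date, SUM(s) AS net
    FROM (SELECT user_id, '2021-04-01' AS date, account_bal AS s FROM users
          UNION ALL
          SELECT user_id, date, amount FROM deposits
          UNION ALL
          SELECT user_id, date, -amount_tendered FROM transactions) t
    GROUP BY user_id, date
)
SELECT user_id, date,
       SUM(net) OVER (PARTITION BY user_id ORDER BY date) AS rt
FROM moves
ORDER BY user_id, date
""").fetchall()

# Overdraft is the negative part of the running total, like least(rt, 0) above.
overdraft = [(user_id, date, min(rt, 0)) for user_id, date, rt in rows]
joy = [(date, ovd) for user_id, date, ovd in overdraft if user_id == 3]

print(joy)
# [('2021-04-01', 0), ('2021-04-25', 0), ('2021-04-26', 0), ('2021-04-27', -7), ('2021-05-06', 0)]
```

User 3 (Joy) is the only user linked to council 3 in the sample data, so her -7 on 2021-04-27 and the return to 0 on 2021-05-06 line up with the council 3 rows in the output above.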
Alternatively, provided the users table holds a closing (NOW()) balance, replace the user_running_total CTE with the following code:
user_running_total as (
select user_id, date,
coalesce(lead(date) over(partition by user_id order by date) - interval 1 day, date) nxt,
coalesce(sum(sum(s)) over(partition by user_id order by date desc
rows between unbounded preceding and 1 preceding), sum(s)) rt
from (
select user_id, date, amount_tendered s
from transactions
union all
select user_id, date, -amount
from deposits
union all
select user_id, se.d, se.s
from users
cross join lateral (
select date(NOW() + interval 1 day) d, account_bal s
union all
select '2021-04-01' d, 0
) se
) t
where DATE(date) between date '2021-04-01' and date(NOW() + interval 1 day)
group by user_id, date
),
This way the query starts with the closing balance, dated the day after now, and rolls the running total out in reverse order back to '2021-04-01' as the starting date.
Output
council_id date overdraft
1 2021-04-01 0
2 2021-04-01 -260
3 2021-04-01 -46
2 2021-04-25 -305
3 2021-04-25 -41
2 2021-04-26 -300
3 2021-04-26 -46
2 2021-04-27 -305
3 2021-04-27 -63
3 2021-05-06 0
db-fiddle both versions

Mysql SUM and GROUP BY from 3 tables

I have 3 database tables as follows:
projects:
id, project_name, project_location, status
project_expenses:
id, project_id, expense_category, expense_subcategory, amount, userid, date, payment_status, project_status
project_income:
id, project_id, project_income, date, userid, project_status
projects.id, project_expenses.project_id and project_income.project_id are related.
I need to create a query displaying Project_ID, Project_Name, SUM of Project_Income, and SUM of Project_expenses.
I tried the following query, but I'm not getting the correct result.
SELECT p.id AS id, p.project_name AS project_name, SUM(i.project_income) AS income, SUM(e.amount) AS expenses
FROM project_income i, project_expenses e, projects p
WHERE i.project_id = p.id AND e.project_id = p.id AND p.status = 'Active'
GROUP BY id
I currently have 2 rows in project_income and 4 rows in project_expenses. The result is that project_income displays double the values. Something is wrong with my JOIN.
Being a newbie, I am unable to understand what I am doing wrong. Requesting help...
Use sub-selects in the query result columns. No need for GROUP BY.
SELECT p.id
, p.project_name
, ( SELECT SUM(i.project_income)
FROM project_income i
WHERE i.project_id = p.id
) AS income
, ( SELECT SUM(e.amount)
FROM project_expenses e
WHERE e.project_id = p.id
) AS expenses
FROM projects p
WHERE p.status = 'Active'
The problem with the query in the question is best explained with an example.
You say there are 2 rows in project_income and 4 rows in project_expenses. Let's say the 2 incomes are 1000 and 1500, and the 4 expenses are 615, 750, 840, and 900.
Since there are no restrictions between them, that means you'll get the cross join, i.e. 8 records:
income expense
1000 615
1000 750
1000 840
1000 900
1500 615
1500 750
1500 840
1500 900
Now, when you sum income you get 4 times the value you want, and when you sum expense you get 2 times the value you want.
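The arithmetic can be confirmed with a short sketch. It assumes an in-memory SQLite database and a single hypothetical project named 'Demo', and loads the example figures from the explanation above (incomes 1000 and 1500; expenses 615, 750, 840 and 900). Only the columns needed for the sums are created.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE projects (id INTEGER PRIMARY KEY, project_name TEXT, status TEXT);
CREATE TABLE project_income (id INTEGER PRIMARY KEY, project_id INTEGER,
                             project_income INTEGER);
CREATE TABLE project_expenses (id INTEGER PRIMARY KEY, project_id INTEGER,
                               amount INTEGER);
INSERT INTO projects VALUES (1, 'Demo', 'Active');
INSERT INTO project_income (project_id, project_income) VALUES (1,1000),(1,1500);
INSERT INTO project_expenses (project_id, amount)
VALUES (1,615),(1,750),(1,840),(1,900);
""")

# Query shape from the question: 2 income rows x 4 expense rows = 8 joined rows.
naive = conn.execute("""
SELECT p.id, SUM(i.project_income) AS income, SUM(e.amount) AS expenses
FROM project_income i, project_expenses e, projects p
WHERE i.project_id = p.id AND e.project_id = p.id AND p.status = 'Active'
GROUP BY p.id
""").fetchone()

# Sub-select version from the answer: each sum only sees its own table.
fixed = conn.execute("""
SELECT p.id,
       (SELECT SUM(i.project_income) FROM project_income i
        WHERE i.project_id = p.id) AS income,
       (SELECT SUM(e.amount) FROM project_expenses e
        WHERE e.project_id = p.id) AS expenses
FROM projects p
WHERE p.status = 'Active'
""").fetchone()

print(naive)  # (1, 10000, 6210)
print(fixed)  # (1, 2500, 3105)
```

The joined sums are 4 x 2500 = 10000 for income and 2 x 3105 = 6210 for expenses, exactly the inflation described above.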

How to apply aggregate function only on distinct records

I have two tables, orders and order_item.
orders table:
Id Total DeliveryCharge Status DeliveryDate
2001 600 120 30 2015-09-01 11:56:32
2002 1500 150 30 2015-09-09 09:56:32
2003 1200 100 30 2015-09-30 08:05:32
order_item table:
Id OrderTotal Quantity
12001 2001 2
12002 2001 1
12003 2002 1
12004 2003 1
12005 2003 1
Each order can contain multiple products, so the order_item table can hold multiple records for a single order.
The result I want from the query is:
OrderCount Quantity OrderTotal DeliveryCharge
3 6 3300 370
I wrote a query
select count(distinct od.Id) as OrderCount,
sum(oi.Quantity) as Quantity,
(select sum(ord.OrderTotal) from orders ord
where ord.DeliveryDate between '2015-09-01' and '2015-10-01' and ord.Status=30 ) as OrderTotal
from orders od
join Order_items oi on od.Id=oi.orderId
where od.Status=30
and od.DeliveryDate between '2015-09-01' and '2015-10-01'
which has the result
OrderCount Quantity OrderTotal
3 6 3300
But now I also want the sum of DeliveryCharge from the orders table, so I would have to write another sub-query like the one I wrote for OrderTotal.
Is there a good way to get this in a single query without using multiple sub-queries?
Put subqueries in the from clause:
select o.OrderCount, o.OrderTotal, o.OrderDeliveryCharge, oi.quantity
from (select count(*) as OrderCount, sum(Total) as OrderTotal,
sum(DeliveryCharge) as OrderDeliveryCharge
from orders
) o cross join
(select sum(quantity) as quantity
from order_item
) oi;
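As a quick check of this approach, the sketch below runs the same shape of query in an in-memory SQLite database (an assumption for portability). The order_item foreign key is named OrderId here, which is an assumption: the table listing in the question labels that column OrderTotal, but the question's own query joins on orderId. Like the answer, the sketch omits the Status and DeliveryDate filters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (Id INTEGER PRIMARY KEY, Total INTEGER, DeliveryCharge INTEGER,
                     Status INTEGER, DeliveryDate TEXT);
-- OrderId is an assumed column name for the reference to orders.Id
CREATE TABLE order_item (Id INTEGER PRIMARY KEY, OrderId INTEGER, Quantity INTEGER);
INSERT INTO orders VALUES (2001,600,120,30,'2015-09-01 11:56:32'),
                          (2002,1500,150,30,'2015-09-09 09:56:32'),
                          (2003,1200,100,30,'2015-09-30 08:05:32');
INSERT INTO order_item VALUES (12001,2001,2),(12002,2001,1),(12003,2002,1),
                              (12004,2003,1),(12005,2003,1);
""")

# Each table is aggregated to a single row first, so the cross join
# combines exactly one row with one row and nothing is double counted.
row = conn.execute("""
SELECT o.OrderCount, o.OrderTotal, o.OrderDeliveryCharge, oi.Quantity
FROM (SELECT COUNT(*) AS OrderCount, SUM(Total) AS OrderTotal,
             SUM(DeliveryCharge) AS OrderDeliveryCharge
      FROM orders) o
CROSS JOIN (SELECT SUM(Quantity) AS Quantity FROM order_item) oi
""").fetchone()

print(row)  # (3, 3300, 370, 6)
```

If the date and status filters from the question are needed, they would go inside the orders subquery, and the order_item subquery would then need a join or an IN condition against the same filtered orders.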
Use this
SELECT SUM(oi.oc) AS 'OrderCount', SUM(oi.q) AS 'Quantity', SUM(o.total) AS 'OrderTotal', SUM(o.deliverycharge) AS 'DeliveryCharge'
FROM
orders o INNER JOIN
(SELECT ordertotal, COUNT(DISTINCT(ordertotal)) AS oc, SUM(quantity) AS q FROM order_item GROUP BY 1) oi ON o.id=oi.ordertotal