About correct use of sum in case statement - mysql

There is a table "payments"
user_id payment_time amount sale_type
1 2018-04-01 10 cash
1 2018-04-01 10 cash
1 2018-04-01 10 cash
1 2018-04-01 20 bank
2 2018-04-01 10 cash
2 2018-04-01 10 cash
Need the sum of cash.
I don't understand why this query gives wrong results:
select SUM(CASE WHEN p1.sale_type='cash' THEN p1.amount ELSE 0 END)
as cash
FROM
(SELECT distinct user_id, SUM(amount) AS amount, sale_type FROM payments where
payment_time = '2018-04-01' group by user_id) p1

Add sale_type to the GROUP BY of the inner query, i.e. group by user_id, sale_type. Grouping by user_id alone sums each user's cash and bank payments together and takes sale_type from an arbitrary row of the group.
P.S. Actually, I don't think you need a subquery at all.
Your original query gives 60 as the result, while
select SUM(CASE WHEN p1.sale_type='cash' THEN p1.amount ELSE 0 END) as cash
from
(select distinct user_id, SUM(amount) AS amount, sale_type
from payments
where payment_time = date'2018-04-01'
group by user_id, sale_type) p1;
or
select SUM(CASE WHEN sale_type='cash' THEN amount ELSE 0 END) as cash
from payments
where payment_time = date'2018-04-01'
gives 40 for the resulting cash column.
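A minimal sketch for checking the two corrected queries locally, using Python's sqlite3 module as a stand-in for MySQL (an assumption of this sketch; the date is written as a plain string because SQLite has no date'...' literal). It loads only the six rows listed in the question, for which both queries return 50; the 60 and 40 quoted above presumably come from the demo's own data set.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE payments ("
    "user_id INTEGER, payment_time TEXT, amount INTEGER, sale_type TEXT)"
)
# The six rows shown in the question.
rows = [
    (1, "2018-04-01", 10, "cash"),
    (1, "2018-04-01", 10, "cash"),
    (1, "2018-04-01", 10, "cash"),
    (1, "2018-04-01", 20, "bank"),
    (2, "2018-04-01", 10, "cash"),
    (2, "2018-04-01", 10, "cash"),
]
conn.executemany("INSERT INTO payments VALUES (?, ?, ?, ?)", rows)

# Subquery grouped by both user_id and sale_type.
grouped_cash = conn.execute(
    """
    SELECT SUM(CASE WHEN p1.sale_type = 'cash' THEN p1.amount ELSE 0 END) AS cash
    FROM (SELECT user_id, sale_type, SUM(amount) AS amount
          FROM payments
          WHERE payment_time = '2018-04-01'
          GROUP BY user_id, sale_type) p1
    """
).fetchone()[0]

# Same total without any subquery.
direct_cash = conn.execute(
    """
    SELECT SUM(CASE WHEN sale_type = 'cash' THEN amount ELSE 0 END) AS cash
    FROM payments
    WHERE payment_time = '2018-04-01'
    """
).fetchone()[0]

print(grouped_cash, direct_cash)
```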

Why not use a HAVING clause, which is made for this purpose?
SELECT SUM(amount) AS cash FROM payments
WHERE payment_time = '2018-04-01'
GROUP BY sale_type
HAVING sale_type= 'cash'
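As a rough comparison, the sketch below runs the HAVING form next to a plain WHERE filter on the question's six rows. It uses Python's sqlite3 module in place of MySQL, which is an assumption of this example.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE payments ("
    "user_id INTEGER, payment_time TEXT, amount INTEGER, sale_type TEXT)"
)
conn.executemany(
    "INSERT INTO payments VALUES (?, ?, ?, ?)",
    [
        (1, "2018-04-01", 10, "cash"),
        (1, "2018-04-01", 10, "cash"),
        (1, "2018-04-01", 10, "cash"),
        (1, "2018-04-01", 20, "bank"),
        (2, "2018-04-01", 10, "cash"),
        (2, "2018-04-01", 10, "cash"),
    ],
)

# Group by sale_type, then keep only the 'cash' group.
having_cash = conn.execute(
    """
    SELECT SUM(amount) AS cash
    FROM payments
    WHERE payment_time = '2018-04-01'
    GROUP BY sale_type
    HAVING sale_type = 'cash'
    """
).fetchone()[0]

# Filter the rows first, then sum what is left.
where_cash = conn.execute(
    """
    SELECT SUM(amount) AS cash
    FROM payments
    WHERE payment_time = '2018-04-01' AND sale_type = 'cash'
    """
).fetchone()[0]

print(having_cash, where_cash)
```

Both forms agree on this data. They differ when there are no cash rows at all: the HAVING form then returns no row, while the WHERE form returns a single row containing NULL.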

I think you may not need the distinct in your sub-query or the entire sub-query at all.
select p.user_id as id, sum(case when p.sale_type = 'cash' then p.amount else 0 end) as amount
from payments p
where p.payment_time = '2018-04-01'
group by p.user_id
or without CASE:
select p.user_id, sum(p.amount)
from payments p
where p.sale_type = 'cash' and p.payment_time = '2018-04-01'
group by p.user_id

Optimize subquery in SELECT

My table schema is as follows:
Indexes:
products.id PRIMARY KEY
products.description UNIQUE
expenses.id PRIMARY KEY
expenses.product_id FOREIGN KEY to products.id
My goal is to load
Cost of each product of current month (AS costs_november)
Cost of each product of last month (AS costs_october)
Change in costs of current month compared to last (current month costs - last month costs) (AS costs)
Percentage change of current month costs compared to last (last month costs * 100 / current month costs) (AS percent_diff)
I've managed to code SQL that does exactly that:
SELECT description, (SUM(cost) - IFNULL(
(
SELECT SUM(cost)
FROM expenses
WHERE month = 9 AND year = 2019 AND product_id = e.product_id
GROUP BY product_id
), 0)) AS costs,
SUM(cost) * 100 /
(
SELECT SUM(cost)
FROM expenses
WHERE month = 9 AND year = 2019 AND product_id = e.product_id
GROUP BY product_id
) AS percent_diff,
SUM(cost) AS costs_october,
IFNULL(
(
SELECT SUM(cost)
FROM expenses
WHERE month = 9 AND year = 2019 AND product_id = e.product_id
GROUP BY product_id
), 0) AS costs_september
FROM expenses e
JOIN products p ON (e.product_id = p.id)
WHERE month = 10 AND year = 2019
GROUP BY product_id
ORDER BY product_id;
But is copy-pasting the same subquery three times really the solution? In theory it runs four queries per product. Is there a more elegant way?
I'd appreciate any help!
I would address this with conditional aggregation:
select
p.description,
sum(case when e.month = 11 then e.cost else 0 end) costs_november,
sum(case when e.month = 10 then e.cost else 0 end) costs_october,
sum(case when e.month = 11 then e.cost else -1 * e.cost end) costs,
sum(case when e.month = 10 then e.cost else 0 end)
* 100
/ nullif(
sum(case when e.month = 11 then e.cost else 0 end),
0
) percent_diff
from expenses e
inner join products p on p.id = e.product_id
where e.year = 2019 and e.month in (10, 11)
group by e.product_id
You can avoid repeating the same conditional sums by using a subquery (your RDBMS would probably optimize it anyway, but this tends to make the query more readable):
select
description,
costs_november,
costs_october,
costs_november - costs_october costs,
costs_october * 100 / nullif(costs_november, 0) percent_diff
from (
select
p.description,
sum(case when e.month = 11 then e.cost else 0 end) costs_november,
sum(case when e.month = 10 then e.cost else 0 end) costs_october
from expenses e
inner join products p on p.id = e.product_id
where e.year = 2019 and e.month in (10, 11)
group by e.product_id
) t
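The subquery form above can be tried out with a small sketch. It uses Python's sqlite3 module in place of MySQL, and the two products and four expense rows are hypothetical sample data, not taken from the question. The factor 100.0 is used because SQLite would otherwise do integer division, and p.description is added to the GROUP BY to keep the query portable.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE products (id INTEGER PRIMARY KEY, description TEXT UNIQUE);
    CREATE TABLE expenses (
        id INTEGER PRIMARY KEY,
        product_id INTEGER REFERENCES products(id),
        cost INTEGER, month INTEGER, year INTEGER
    );
    -- Hypothetical sample rows, not taken from the question.
    INSERT INTO products VALUES (1, 'A'), (2, 'B');
    INSERT INTO expenses (product_id, cost, month, year) VALUES
        (1, 100, 10, 2019), (1, 50, 10, 2019),
        (1, 200, 11, 2019), (2, 80, 10, 2019);
    """
)

# Inner query: one pass over expenses with conditional sums per product.
# Outer query: derive the difference and the percentage from those sums.
report = conn.execute(
    """
    SELECT description, costs_november, costs_october,
           costs_november - costs_october AS costs,
           costs_october * 100.0 / NULLIF(costs_november, 0) AS percent_diff
    FROM (SELECT p.description,
                 SUM(CASE WHEN e.month = 11 THEN e.cost ELSE 0 END) AS costs_november,
                 SUM(CASE WHEN e.month = 10 THEN e.cost ELSE 0 END) AS costs_october
          FROM expenses e
          JOIN products p ON p.id = e.product_id
          WHERE e.year = 2019 AND e.month IN (10, 11)
          GROUP BY e.product_id, p.description) t
    ORDER BY description
    """
).fetchall()

for row in report:
    print(row)
```

Product B has no November costs in this sample, so NULLIF turns the division by zero into NULL instead of an error.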
You can calculate this for all months and all products at once (LAG requires MySQL 8.0 or later):
SELECT e.product_id, e.year, e.month,
SUM(e.cost) as curr_month_costs,
LAG(SUM(e.cost)) OVER (PARTITION BY e.product_id ORDER BY e.year, e.month) as prev_month_costs,
(SUM(e.cost) -
LAG(SUM(e.cost)) OVER (PARTITION BY e.product_id ORDER BY e.year, e.month)
) as diff,
LAG(SUM(e.cost)) OVER (PARTITION BY e.product_id ORDER BY e.year, e.month) * 100 / SUM(e.cost) as percent_diff
FROM expenses e JOIN
products p
ON e.product_id = p.id
GROUP BY e.product_id, e.year, e.month
ORDER BY e.year, e.month, e.product_id;
You can use a subquery if you want to select only the current month.
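A sketch of that last step, under several assumptions: it runs on Python's sqlite3 module instead of MySQL (the bundled SQLite must be 3.25 or newer for window functions), the expense rows are hypothetical, and the window is partitioned by product_id so that each product is compared with its own previous month. The monthly totals are computed in the innermost query, LAG is applied in the middle one, and the month filter sits in the outermost one.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE expenses (product_id INTEGER, year INTEGER, month INTEGER, cost INTEGER);
    -- Hypothetical sample rows, not taken from the question.
    INSERT INTO expenses VALUES
        (1, 2019, 9, 100),
        (1, 2019, 10, 100), (1, 2019, 10, 50),
        (2, 2019, 10, 80);
    """
)

current_month = conn.execute(
    """
    SELECT product_id, curr_month_costs, prev_month_costs,
           curr_month_costs - prev_month_costs AS diff
    FROM (SELECT product_id, year, month,
                 total AS curr_month_costs,
                 LAG(total) OVER (PARTITION BY product_id
                                  ORDER BY year, month) AS prev_month_costs
          FROM (SELECT product_id, year, month, SUM(cost) AS total
                FROM expenses
                GROUP BY product_id, year, month) monthly) with_prev
    WHERE year = 2019 AND month = 10
    ORDER BY product_id
    """
).fetchall()

for row in current_month:
    print(row)
```

The filter has to stay outside the windowed query: filtering on the month before LAG runs would remove the previous month's row that LAG needs. Note also that LAG returns the previous row that exists for the product, which is the previous calendar month only if that month has expenses.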

How to add a few restrictions to a query?

I have difficulty with syntax...
This is my query:
SELECT t.diapason,
Count(*) AS 'number_of_users'
FROM (SELECT CASE
WHEN amount < 200 THEN '0-200'
WHEN amount >= 200 THEN '200 +'
end AS diapason
FROM (SELECT Sum(amount) AS amount
FROM payments) p) t
GROUP BY t.diapason
ORDER BY number_of_users DESC;
But now I need to select only users who had activity.login_time between '2018-01-01' and '2018-01-12'.
So I think I should use an INNER JOIN and set the period of time. But how?
My tables:
activity
user_id login_time
1 01.01.2018
2 01.01.2018
3 03.01.2018
4 30.02.2018
payments
user_id amount payment_time
1 50 10.12.2017
1 200 09.12.2017
2 40 08.08.2017
What should I change in my query to add activity.login_time?
Output for period 01.01.2018-12.01.2018
diapason number_of_users
0-200 2
200+ 1
I understand your question as follows. Three users (user_id = 1, 2, 3) logged in during the period 01.01.2018-12.01.2018. Of those users, user_id 1 made 2 payments totalling 250, user_id 2 made 1 payment of 40, and user_id 3 made 0 payments, so their total is 0. Hence there are 2 values in the range 0-200 and 1 in the range 200 +. If that understanding is correct, this query will give you the desired results:
SELECT CASE
WHEN amount < 200 THEN '0-200'
WHEN amount >= 200 THEN '200 +'
END AS diapason,
COUNT(*) AS number_of_users
FROM (SELECT a.user_id, COALESCE(SUM(p.amount), 0) AS amount
FROM activity a
LEFT JOIN payments p ON p.user_id = a.user_id
WHERE a.login_time BETWEEN '2018-01-01' AND '2018-01-12'
GROUP BY a.user_id) p
GROUP BY diapason;
Output:
diapason number_of_users
0-200 2
200 + 1
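The bucketing query can be checked with a short sketch. It assumes Python's sqlite3 module in place of MySQL and stores the question's dates as ISO-formatted strings so that BETWEEN compares them correctly; the derived table is named totals here purely for readability.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE activity (user_id INTEGER, login_time TEXT);
    CREATE TABLE payments (user_id INTEGER, amount INTEGER, payment_time TEXT);
    -- Rows from the question, with dates rewritten as YYYY-MM-DD strings.
    INSERT INTO activity VALUES
        (1, '2018-01-01'), (2, '2018-01-01'), (3, '2018-01-03'), (4, '2018-02-30');
    INSERT INTO payments VALUES
        (1, 50, '2017-12-10'), (1, 200, '2017-12-09'), (2, 40, '2017-08-08');
    """
)

# LEFT JOIN keeps users who logged in but never paid; COALESCE turns
# their missing sum into 0 so they land in the '0-200' bucket.
buckets = conn.execute(
    """
    SELECT CASE WHEN amount < 200 THEN '0-200' ELSE '200 +' END AS diapason,
           COUNT(*) AS number_of_users
    FROM (SELECT a.user_id, COALESCE(SUM(p.amount), 0) AS amount
          FROM activity a
          LEFT JOIN payments p ON p.user_id = a.user_id
          WHERE a.login_time BETWEEN '2018-01-01' AND '2018-01-12'
          GROUP BY a.user_id) totals
    GROUP BY diapason
    ORDER BY diapason
    """
).fetchall()

for row in buckets:
    print(row)
```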
Update
To add another row with the total number_of_users, just add WITH ROLLUP to the GROUP BY clause:
SELECT CASE
WHEN amount < 200 THEN '0-200'
WHEN amount >= 200 THEN '200 +'
END AS diapason,
COUNT(*) AS number_of_users
FROM (SELECT a.user_id, COALESCE(SUM(p.amount), 0) AS amount
FROM activity a
LEFT JOIN payments p ON p.user_id = a.user_id
WHERE a.login_time BETWEEN '2018-01-01' AND '2018-01-12'
GROUP BY a.user_id) p
GROUP BY diapason WITH ROLLUP
Output:
diapason number_of_users
0-200 2
200 + 1
(null) 3
In your application framework you can use the fact that the diapason value is NULL to output something like Total instead.
You can also do the same in MySQL (see this SQLFiddle) by wrapping this query up as a subquery and using a COALESCE on the diapason column. In that case the output would be:
diapason number_of_users
0-200 2
200 + 1
Total 3
Add a WHERE clause to filter:
SELECT t.diapason,
COUNT(*) AS 'number_of_users'
FROM (
SELECT
CASE
WHEN amount < 200 THEN '0-200'
WHEN amount >= 200 THEN '200 +'
END AS diapason
FROM (
SELECT payments.user_id, SUM(amount) AS amount
FROM payments
INNER JOIN activity ON payments.user_id = activity.user_id AND activity.login_time = payments.payment_time
WHERE activity.login_time BETWEEN '2018-01-10' AND '2018-01-12'
GROUP BY payments.user_id
) p
) t
GROUP BY t.diapason
ORDER BY number_of_users DESC;

How to divide the result of a calculation into groups?

Good day! Could you help me with a query?
I have a table "payments":
payments
user_id amount payment_time sale_type
1 20 31.01.2011 card
1 10 02.01.2012 cash
3 10 03.01.2012 card
4 15 05.02.2012 cash
...and so on
The task is to select the total amount of payments made between 01.01.2012 and 30.01.2012 and split this sum into groups according to the total amount each user has ever paid.
The groups are "0-10" if the user's all-time sum is 0-10 $ and
"10 and more" if that sum is > 10 $.
My code:
SELECT * from (select IFnull(t.diapason,'total') as diapason, total_amount
FROM
(SELECT p.user_id, p.amount as total_amount, CASE
when amount<=10 then '0-10'
when amount>10 then '10 and more' END AS diapason
FROM (SELECT distinct payments.user_id, SUM(amount) AS amount
FROM payments inner JOIN (SELECT DISTINCT user_id
FROM payments where payment_time between '2012-01-01'
and '2012-01-30') a ON payments.user_id = a.user_id
GROUP BY payments.user_id) p) t GROUP BY diapason WITH ROLLUP) as
t1 ORDER BY total_amount desc;
What is wrong here?
Expected output
diapason total_amount
0-10 10 - here is user with id 3
10 and more 10 - here is user with id 1 (because he has paid 30 in total)
total
Try this query -
select case when p2.amount <=10 then '0-10'
else '10 and more' end diapason
,p1.amount "total amount"
,p1.payment_by_card
,p1.cash
from (select user_id, sum(amount) amount, payment_by_card, cash
from payments
where payment_time between '2012-01-01' and '2012-01-30'
group by user_id, payment_by_card, cash) p1
join (select user_id, sum(amount) amount
from payments
group by user_id) p2
on p1.user_id = p2.user_id
Here is the fiddle - http://www.sqlfiddle.com/#!9/22caaa/8

Don't select data from database with two tables using mysql

I have two tables, sale and receipt. The table structures and the desired result are shown below.
sale
date total sale_type
15-8-2014 50 credit
16-8-2014 100 credit
17-8-2014 200 return
18-8-2014 300 return
receipt
date net_amount
15-8-2014 100
16-8-2014 200
17-8-2014 300
result
date sale receipt
15-8-2014 50 100
16-8-2014 100 200
17-8-2014 200 300
18-8-2014 300
With my query I get this result structure, but I also want the sum of total for sale_type='credit' and for sale_type='return'. Can anybody help me?
My query is
select date,total,net_amount from
(select date, total, null as net_amount, 2 as sort_col from sale union
all select date,
null as total, net_amount as net_amount, 1 as sort_col from receipt)
as a order by date desc, sort_col desc
Does the following query (using JOIN and CASE) get you the expected results?
SELECT
s.date,
s.total AS sale,
r.net_amount AS receipt,
SUM(CASE
WHEN s.sale_type = 'credit' THEN s.total
ELSE 0
END) AS sum_credit,
SUM(CASE
WHEN s.sale_type = 'return' THEN s.total
ELSE 0
END) AS sum_return
FROM sale s
LEFT JOIN receipt r
ON s.date = r.date
GROUP BY s.date, s.total, r.net_amount;

SQL query with subqueries performing terribly

I have this quite long query that should give me some information about shipments. It works, but it performs terribly: it takes about 4500ms to load.
SELECT
DATE(paid_at) AS day,
COUNT(*) as order_count,
(
SELECT COUNT(*) FROM line_items
WHERE order_id IN (SELECT id from orders WHERE DATE(paid_at) = day)
) as product_count,
(
SELECT COUNT(*) FROM orders
WHERE shipping_method = 'colissimo'
AND DATE(paid_at) = day
AND state IN ('paid','shipped','completed')
) as orders_co,
(
SELECT COUNT(*) FROM orders
WHERE shipping_method = 'colissimo'
AND DATE(paid_at) = day
AND state IN ('paid','shipped','completed')
AND paid_amount < 70
) as co_less_70,
(
SELECT COUNT(*) FROM orders
WHERE shipping_method = 'colissimo'
AND DATE(paid_at) = day
AND state IN ('paid','shipped','completed')
AND paid_amount >= 70
) as co_plus_70,
(
SELECT COUNT(*) FROM orders
WHERE shipping_method = 'mondial_relais'
AND DATE(paid_at) = day
AND state IN ('paid','shipped','completed')
) as orders_mr,
(
SELECT COUNT(*) FROM orders
WHERE shipping_method = 'mondial_relais'
AND DATE(paid_at) = day
AND state IN ('paid','shipped','completed')
AND paid_amount < 70
) as mr_less_70,
(
SELECT COUNT(*) FROM orders
WHERE shipping_method = 'mondial_relais'
AND DATE(paid_at) = day
AND state IN ('paid','shipped','completed')
AND paid_amount >= 70
) as mr_plus_70
FROM orders
WHERE MONTH(paid_at) = 11
AND YEAR(paid_at) = 2011
AND state IN ('paid','shipped','completed')
GROUP BY day;
Any idea what I could be doing wrong or what I could be doing better? I have other queries of similar length that don't take as much time to load as this. I thought this would be faster than for example having an individual query for each day (in my programming instead of the SQL query).
It is because you are using sub-queries where you don't need them.
As a general rule, where you have a sub-query within a main SELECT clause, that sub-query will query the tables within it once for each row in the main SELECT clause - so if you have 7 subqueries and are selecting a date range of 30 days, you will effectively be running 210 separate subqueries (plus your main query).
(Some query optimisers can resolve sub-queries into the main query under some circumstances, but as a general rule you can't rely on this.)
In this case, you don't need any of the orders sub-queries, because all the orders data you require is included in the main query - so you can rewrite this as:
SELECT
DATE(paid_at) AS day,
COUNT(*) as order_count,
(
SELECT COUNT(*) FROM line_items
WHERE order_id IN (SELECT id from orders WHERE DATE(paid_at) = day)
) as product_count,
sum(case when shipping_method = 'colissimo' then 1 else 0 end) as orders_co,
sum(case when shipping_method = 'colissimo' AND
paid_amount < 70 then 1 else 0 end) as co_less_70,
sum(case when shipping_method = 'colissimo' AND
paid_amount >= 70 then 1 else 0 end) as co_plus_70,
sum(case when shipping_method = 'mondial_relais' then 1 else 0 end) as orders_mr,
sum(case when shipping_method = 'mondial_relais' AND
paid_amount < 70 then 1 else 0 end) as mr_less_70,
sum(case when shipping_method = 'mondial_relais' AND
paid_amount >= 70 then 1 else 0 end) as mr_plus_70
FROM orders
WHERE MONTH(paid_at) = 11
AND YEAR(paid_at) = 2011
AND state IN ('paid','shipped','completed')
GROUP BY day;
The problem with your query is that it scans the same table over and over. All scans (selects in your case) of the orders table can be transformed into multiple SUM+CASE or COUNT+CASE expressions, as in "SQL query with count and case statement".
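A small sketch of the SUM+CASE rewrite, run with Python's sqlite3 module rather than MySQL. The five orders are hypothetical sample data, only some of the counters are shown, and the date-range filter on paid_at is a substitution made for this example in place of MONTH() and YEAR().

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY, paid_at TEXT, shipping_method TEXT,
        paid_amount INTEGER, state TEXT
    );
    -- Hypothetical sample rows, not taken from the question.
    INSERT INTO orders VALUES
        (1, '2011-11-01 10:00:00', 'colissimo',      50, 'paid'),
        (2, '2011-11-01 12:00:00', 'colissimo',      90, 'shipped'),
        (3, '2011-11-01 13:00:00', 'mondial_relais', 20, 'completed'),
        (4, '2011-11-02 09:00:00', 'mondial_relais', 70, 'paid'),
        (5, '2011-11-02 09:30:00', 'colissimo',      10, 'cancelled');
    """
)

# One scan of orders; every counter is a conditional sum over the same rows.
daily = conn.execute(
    """
    SELECT DATE(paid_at) AS day,
           COUNT(*) AS order_count,
           SUM(CASE WHEN shipping_method = 'colissimo' THEN 1 ELSE 0 END) AS orders_co,
           SUM(CASE WHEN shipping_method = 'colissimo'
                     AND paid_amount < 70 THEN 1 ELSE 0 END) AS co_less_70,
           SUM(CASE WHEN shipping_method = 'mondial_relais' THEN 1 ELSE 0 END) AS orders_mr,
           SUM(CASE WHEN shipping_method = 'mondial_relais'
                     AND paid_amount >= 70 THEN 1 ELSE 0 END) AS mr_plus_70
    FROM orders
    WHERE paid_at >= '2011-11-01' AND paid_at < '2011-12-01'
      AND state IN ('paid', 'shipped', 'completed')
    GROUP BY DATE(paid_at)
    ORDER BY day
    """
).fetchall()

for row in daily:
    print(row)
```

The cancelled order is excluded by the state filter, so the second day counts a single order.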