I have three tables: monthly_revenue, currencies and foreign_exchange.
monthly_revenue table
|------------------------------------------------------|
| id | product_id | currency_id | value | month | year |
|------------------------------------------------------|
| 1 | 1 | 1 | 100 | 1 | 2015 |
| 2 | 1 | 2 | 125 | 1 | 2015 |
| 3 | 1 | 3 | 115 | 1 | 2015 |
| 4 | 1 | 1 | 100 | 2 | 2015 |
| 5 | 1 | 2 | 125 | 2 | 2015 |
| 6 | 1 | 3 | 115 | 2 | 2015 |
|------------------------------------------------------|
foreign_exchange table
|---------------------------------------|
| id | base | target | rate | rate_date |
|---------------------------------------|
| 1 | GBP | USD | 1.6 |2015-01-01 |
| 2 | GBP | USD | 1.62 |2015-01-15 |
| 3 | GBP | USD | 1.61 |2015-01-31 |
| 4 | EUR | USD | 1.2 |2015-01-01 |
| 5 | EUR | USD | 1.4 |2015-01-15 |
| 6 | EUR | USD | 1.4 |2015-01-31 |
| 7 | GBP | EUR | 1.4 |2015-01-01 |
| 8 | GBP | EUR | 1.45 |2015-01-15 |
| 9 | GBP | EUR | 1.44 |2015-01-31 |
|---------------------------------------|
From this, we can see the average fx rates:
GBP > USD in January is 1.61
EUR > USD in January is 1.33
GBP > EUR in January is 1.43
No rates are available for USD as a base currency, and no rates are available for February.
currencies table
|-----------|
| id | name |
|-----------|
| 1 | GBP |
| 2 | USD |
| 3 | EUR |
|-----------|
What I'm trying to achieve
Each row within the monthly_revenue table can have a different currency_id, as orders are placed in different currencies. I want to see all revenue for a given month, in a common currency. So, rather than looking at all revenue in January in GBP, and then separately looking at all revenue in January in USD, I'd like to get one value for all revenue in January, converted to USD (for example).
This can be calculated for each row, using the following (using January for this example):
revenue value x average fx rate for January between base and target currency
If I have 50 orders in January, in 4 different currencies, this lets me see all revenue in any single currency.
Example - get all revenue in January, in USD
This should return:
|------------------------------------------------------|
| id | product_id | currency_id | value | month | year |
|------------------------------------------------------|
| 1 | 1 | 1 | 100 | 1 | 2015 |
| 2 | 1 | 2 | 125 | 1 | 2015 |
| 3 | 1 | 3 | 115 | 1 | 2015 |
|------------------------------------------------------|
However, rows 1 and 3 are not in USD (these are GBP, and EUR respectively).
What I'd like to see is each row returned with the average FX rate used for the conversion, plus a converted column. For example:
|-------------------------------------------------------------------------|
| id | prod_id | currency_id | value | month | year | fx_avg | converted |
|-------------------------------------------------------------------------|
| 1 | 1 | 1 | 100 | 1 | 2015 | 1.61 | 161 |
| 2 | 1 | 2 | 125 | 1 | 2015 | 1 | 125 |
| 3 | 1 | 3 | 115 | 1 | 2015 | 1.33 | 152.95 |
|-------------------------------------------------------------------------|
Where I'm at
I can currently get the basic calculation done using the query below, but a couple of key features are lacking:
If there is no FX rate available (for example for future dates where of course an FX rate isn't available) then the entire row is ignored. What I'd like in this instance is for the latest month's average to be used.
If the calculation is being performed where the target currency is the same as the base currency, the entire row is ignored (as there is no record in the FX table where the base equals the target). In this instance, the rate should be hard defined as 1.
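The two rules above (fall back to the latest available month, and use a rate of 1 when base and target match) can be sketched outside the database. This is a minimal Python illustration of the intended logic, not a MySQL solution; the helper names `monthly_averages` and `rate_for` are hypothetical, and the rates are copied from the sample foreign_exchange rows.

```python
from collections import defaultdict
from datetime import date

# Sample rows from the foreign_exchange table: (base, target, rate, rate_date)
FX_ROWS = [
    ("GBP", "USD", 1.60, date(2015, 1, 1)),
    ("GBP", "USD", 1.62, date(2015, 1, 15)),
    ("GBP", "USD", 1.61, date(2015, 1, 31)),
    ("EUR", "USD", 1.20, date(2015, 1, 1)),
    ("EUR", "USD", 1.40, date(2015, 1, 15)),
    ("EUR", "USD", 1.40, date(2015, 1, 31)),
]


def monthly_averages(rows):
    """Average rate per (base, target, year, month)."""
    buckets = defaultdict(list)
    for base, target, rate, day in rows:
        buckets[(base, target, day.year, day.month)].append(rate)
    return {key: sum(vals) / len(vals) for key, vals in buckets.items()}


def rate_for(averages, base, target, year, month):
    """Same currency -> 1; missing month -> latest earlier month's average."""
    if base == target:
        return 1.0
    candidates = [
        (y, m) for (b, t, y, m) in averages
        if b == base and t == target and (y, m) <= (year, month)
    ]
    if not candidates:
        return None  # no rate on or before the requested month
    y, m = max(candidates)
    return averages[(base, target, y, m)]


fx_averages = monthly_averages(FX_ROWS)
jan_gbp = rate_for(fx_averages, "GBP", "USD", 2015, 1)  # January average
feb_eur = rate_for(fx_averages, "EUR", "USD", 2015, 2)  # falls back to January
same = rate_for(fx_averages, "USD", "USD", 2015, 2)     # fixed at 1
print(100 * jan_gbp, 115 * feb_eur, 125 * same)
```

Note that the February lookup reuses the unrounded January average, so the converted EUR value differs slightly from a figure computed with the rounded 1.33.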
Query so far
SELECT
r.value * IFNULL(AVG(fx.rate),1) as converted, AVG(fx.rate) as averageFx,
r.*, fx.*
FROM
foreign_exchange fx, monthly_revenue r, order_headers h
WHERE
fx.base IN (SELECT name FROM currencies WHERE id = r.currency_id) AND
r.order_header_id = h.id AND
fx.target = 'USD' AND
MONTH(fx.rate_date) = r.month AND
YEAR(fx.rate_date) = r.year AND
r.year = 2015
GROUP BY r.id
ORDER BY month ASC
If there are no records available for FX, it looks like a separate subquery should be performed to get the average of the latest month's rates.
Any input would be appreciated. If any further info is required, please post a comment.
Thanks.
Edit: Here is a SQL Fiddle which has the example schemas and the code which highlights the issue.
Here is an approximation of a function that computes your exchange for a given currency and start of month:
DELIMITER //
CREATE FUNCTION MonthRate(_curr CHAR(3) CHARACTER SET ascii,
                          _date DATE)
RETURNS FLOAT
READS SQL DATA
BEGIN
    -- Note: _date must be the first of some month, such as '2015-02-01'
    DECLARE _avg FLOAT;
    DECLARE _prev FLOAT;
    -- First, try to get the average for the month:
    SELECT AVG(rate) INTO _avg FROM foreign_exchange
    WHERE base = _curr
      AND target = 'USD'
      AND rate_date >= _date
      AND rate_date < _date + INTERVAL 1 MONTH;
    IF _avg IS NOT NULL THEN
        RETURN _avg;
    END IF;
    -- Fall back onto the last rate before the month:
    SELECT rate INTO _prev
    FROM foreign_exchange
    WHERE base = _curr
      AND target = 'USD'
      AND rate_date < _date
    ORDER BY rate_date DESC
    LIMIT 1;
    -- NULL here means we ran off the start of the rates table
    RETURN _prev;
END //
DELIMITER ;
There are probably syntax errors, etc. But hopefully you can work with it.
It should be easy to call the function from the rest of the code.
For performance, this would be beneficial:
INDEX(base, target, rate_date, rate)
Create a view :
create view avg_rate as
select base, target, year(rate_date) year, month(rate_date) month,
avg(rate) avg_rate
from foreign_exchange
group by base, target, year(rate_date), month(rate_date)
Join it twice, once for current month, and once for previous
select r.id, r.month,
r.value * avg(coalesce(cr.avg_rate, pr.avg_rate, 1)) converted,
avg(coalesce(cr.avg_rate, pr.avg_rate, 0)) rate
from monthly_revenue r, avg_rate cr, avg_rate pr, order_headers h
where
r.year = 2015 and
cr.year = r.year and cr.month = r.month and cr.target='USD' and
pr.year = r.year and pr.month = r.month - 1 and pr.target='USD' and
r.order_header_id = h.id
group by r.id
order by r.month
Also, I personally don't like this way of writing queries and prefer explicit joins, because they let you group conditions logically and keep the WHERE clause tidy, i.e.:
...
from monthly_revenue r
inner join order_headers h on r.order_header_id = h.id
left join avg_rate cr on cr.year = r.year and cr.month = r.month and cr.target='USD'
left join avg_rate pr on pr.year = r.year and pr.month = r.month - 1 and pr.target='USD'
where r.year = 2015
http://sqlfiddle.com/#!9/6a41a/1
This fiddle is based on your original one, but I added some rate values for February and March to test and show how it works.
SELECT t.*,
IF(@first=t.id, @flag := @flag+1,@flag:=1) `flag`,
@first:=t.id
FROM
(SELECT
coalesce(fx.rate,1) `rate`, (r.value * coalesce(fx.rate,1)) as converted,
r.*, fx.base,fx.target, fx.avg_date, fx.rate frate
FROM
monthly_revenue r
LEFT JOIN
currencies
ON r.currency_id = currencies.id
LEFT JOIN
(SELECT AVG(rate) `rate`,
`base`,
`target`,
STR_TO_DATE(CONCAT('1/',MONTH(rate_date),'/',YEAR(rate_date)), '%d/%m/%Y') avg_date
FROM foreign_exchange
GROUP BY `base`, `target`, `avg_date`
) fx
ON currencies.name = fx.base
AND fx.target = 'USD'
AND fx.avg_date <= STR_TO_DATE(CONCAT('1/',r.month,'/',r.year), '%d/%m/%Y')
ORDER BY r.id, fx.avg_date DESC) t
HAVING `flag` = 1
If you need records for a specific month only, you can add a WHERE before the ORDER BY, like this:
WHERE r.month = 1 and r.year = 2015
ORDER BY r.id, fx.avg_date DESC) t
You may test this query on the fiddle link you provided : http://sqlfiddle.com/#!9/33def/2
select id,product_id,currency_id,currency_name,
value,month,year,@prev_fx_avg:=ifnull(fx_avg,@prev_fx_avg) fx_avg,
value*@prev_fx_avg as converted
from (SELECT
r.id,r.product_id,r.currency_id,c.name as currency_name,
r.value,r.month,r.year,if(c.name="USD",1,temp.avg_rate) as fx_avg
FROM
monthly_revenue r
left join currencies c on r.currency_id=c.id
left join
(select base , avg(rate) as avg_rate, MONTH(fx.rate_date) month,
YEAR(fx.rate_date) year
from foreign_exchange fx
where target="USD"
group by base,target,MONTH(fx.rate_date),
YEAR(fx.rate_date)) temp on(r.month=temp.month and r.year=temp.year and c.name=temp.base)
group by r.id
order by r.currency_id,r.month ASC, r.year ASC) final,(select @prev_fx_avg:=-1) temp2;
I have a table sales with some columns and data like this:
SELECT order_date, sale FROM sales;
+------------+------+
| order_date | sale |
+------------+------+
| 2020-01-01 | 20 |
| 2020-01-02 | 25 |
| 2020-01-03 | 15 |
| 2020-01-04 | 30 |
| 2020-02-05 | 20 |
| 2020-02-10 | 20 |
| 2020-02-06 | 25 |
| 2020-03-07 | 15 |
| 2020-03-08 | 30 |
| 2020-03-09 | 20 |
| 2020-03-10 | 40 |
| 2020-04-01 | 20 |
| 2020-04-02 | 25 |
| 2020-04-03 | 10 |
+------------+------+
and I would like to calculate, for example, monthly growth rate.
From the previous data example the expected result would be like this:
month sale growth_rate
1 90 0
2 65 -27.78
3 105 61.54
4 55 -47.62
We have an old MySQL version, 5.x.
Could anyone help or give me some clues to achieve this?
It is a bit complicated:
select
s.*
-- calculate rate
, ifnull(round((s.mnt_sale - n.mnt_sale)/n.mnt_sale * 10000)/100, 0) as growth_rate
from (
-- calculate monthly summary
select month(order_date) mnt, sum(sale) mnt_sale
from sales
group by mnt
) s
left join ( -- join previous month summary
-- calculate monthly summary one more time
select month(order_date) mnt, sum(sale) mnt_sale
from sales
group by mnt) n on n.mnt = s.mnt - 1
;
DB Fiddle
You can use aggregation and window functions (window functions require MySQL 8.0 or later). Something like this:
select year(order_date) as year, month(order_date) as month, sum(sale) as sale,
100 * (sum(sale) / lag(sum(sale), 1, sum(sale)) over (order by min(order_date)) - 1) as growth_rate
from sales
group by year, month
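The lag-based calculation can be checked against the expected figures outside the database. The following is a minimal Python sketch under the assumption that growth is (current - previous) / previous * 100 with the first month reported as 0; the function name `monthly_growth` is hypothetical, and the rows are copied from the question's sales table.

```python
from datetime import date

# (order_date, sale) rows copied from the question's sales table
SALES = [
    (date(2020, 1, 1), 20), (date(2020, 1, 2), 25), (date(2020, 1, 3), 15),
    (date(2020, 1, 4), 30), (date(2020, 2, 5), 20), (date(2020, 2, 10), 20),
    (date(2020, 2, 6), 25), (date(2020, 3, 7), 15), (date(2020, 3, 8), 30),
    (date(2020, 3, 9), 20), (date(2020, 3, 10), 40), (date(2020, 4, 1), 20),
    (date(2020, 4, 2), 25), (date(2020, 4, 3), 10),
]


def monthly_growth(rows):
    """Return [(year, month, total, growth_pct)] ordered by year and month."""
    totals = {}
    for day, sale in rows:
        key = (day.year, day.month)
        totals[key] = totals.get(key, 0) + sale

    result = []
    previous = None  # plays the role of LAG(): the prior month's total
    for key in sorted(totals):
        total = totals[key]
        if previous is None or previous == 0:
            growth = 0.0  # nothing meaningful to compare against
        else:
            growth = round((total - previous) / previous * 100, 2)
        result.append((key[0], key[1], total, growth))
        previous = total
    return result


growth_rows = monthly_growth(SALES)
for row in growth_rows:
    print(row)
```

Grouping by year and month, rather than by month alone, keeps the comparison correct once the data spans more than one year.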
A little tricky for me, but I think the code below works as expected
SELECT month, sale,growth_rate
FROM(
SELECT month, sale,
IF(@last_entry = 0, 0, ROUND(((sale - @last_entry) / @last_entry) * 100,2)) AS growth_rate,
@last_entry := sale AS last_entry
FROM
(SELECT @last_entry := 0) x,
(SELECT month, sum(sale) sale
FROM (SELECT month(order_date) as month,sum(sale) as sale
FROM sales GROUP BY month(order_date)) monthly_sales
GROUP BY month) y) t;
expected result
+-------+------+-------------+
| month | sale | growth_rate |
+-------+------+-------------+
| 1 | 90 | 0.00 |
| 2 | 65 | -27.78 |
| 3 | 105 | 61.54 |
| 4 | 55 | -47.62 |
+-------+------+-------------+
I have two tables below with the following information
project.analytics
| proj_id | list_date | state
| 1 | 03/05/10 | CA
| 2 | 04/05/10 | WA
| 3 | 03/05/10 | WA
| 4 | 04/05/10 | CA
| 5 | 03/05/10 | WA
| 6 | 04/05/10 | CA
employees.analytics
| employee_id | proj_id | worked_date
| 20 | 1 | 3/12/10
| 30 | 1 | 3/11/10
| 40 | 2 | 4/15/10
| 50 | 3 | 3/16/10
| 60 | 3 | 3/17/10
| 70 | 4 | 4/18/10
What query can I write to determine the average number of unique employees who have worked on the project in the first 7 days that it was listed by month and state?
Desired output:
| list_date | state | # Unique Employees of projects first 7 day list
| March | CA | 1
| April | WA | 2
| July | WA | 2
| August | CA | 1
My Attempt
select
month(list_date),
state_name,
count(*) as Projects,
from projects
group by
month(list_date),
state_name;
I understand the next steps are to subtract the worked_date - list_date and if value is <7 then average count of employees from the 2nd table but I'm not sure what query functions to use.
You could use a CASE with a DISTINCT to COUNT the unique employees that worked within the first 7 days of the list_date.
Once you have that total of employees per project, then you can calculate those averages per month & state.
SELECT
MONTHNAME(list_date) as `ListMonth`,
state,
AVG(TotalUniqEmp7Days) AS `Average Unique Employees of projects first 7 day list`
FROM
(
SELECT
proj.proj_id,
proj.list_date,
proj.state,
COUNT(DISTINCT CASE
WHEN emp.worked_date BETWEEN proj.list_date and DATE_ADD(proj.list_date, INTERVAL 6 DAY)
THEN emp.employee_id
END) AS TotalUniqEmp7Days
-- , COUNT(DISTINCT emp.employee_id) AS TotalUniqEmp
FROM project.analytics proj
LEFT JOIN employees.analytics emp ON emp.proj_id = proj.proj_id
GROUP BY proj.proj_id, proj.list_date, proj.state
) AS ProjectTotals
GROUP BY YEAR(list_date), MONTH(list_date), MONTHNAME(list_date), state;
A Sql Fiddle test can be found here
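To see what the conditional COUNT(DISTINCT ...) followed by AVG produces on the sample rows, here is a small Python sketch of the same two steps. It assumes the dates in the question are month/day/year and that list_date itself counts as day 1 of the 7-day window; the function name `avg_early_employees` is hypothetical.

```python
from datetime import date, timedelta

# Sample rows from the question; dates are assumed to be month/day/year
PROJECTS = [  # (proj_id, list_date, state)
    (1, date(2010, 3, 5), "CA"), (2, date(2010, 4, 5), "WA"),
    (3, date(2010, 3, 5), "WA"), (4, date(2010, 4, 5), "CA"),
    (5, date(2010, 3, 5), "WA"), (6, date(2010, 4, 5), "CA"),
]
WORK = [  # (employee_id, proj_id, worked_date)
    (20, 1, date(2010, 3, 12)), (30, 1, date(2010, 3, 11)),
    (40, 2, date(2010, 4, 15)), (50, 3, date(2010, 3, 16)),
    (60, 3, date(2010, 3, 17)), (70, 4, date(2010, 4, 18)),
]


def avg_early_employees(projects, work):
    """Average distinct early employees per project, keyed by (month, state)."""
    per_group = {}
    for proj_id, listed, state in projects:
        window_end = listed + timedelta(days=6)  # list_date counts as day 1
        early = {
            emp for emp, pid, worked in work
            if pid == proj_id and listed <= worked <= window_end
        }
        # Projects with no early workers still count, contributing 0
        per_group.setdefault((listed.month, state), []).append(len(early))
    return {key: sum(v) / len(v) for key, v in per_group.items()}


early_avgs = avg_early_employees(PROJECTS, WORK)
print(early_avgs)
```

With this sample data only employee 30 falls inside a project's first seven days, so most groups average to zero; the LEFT JOIN in the query is what keeps those zero-count projects in the average.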
I think this is the code that you want
select
p.list_date, p.state,
emp.no_of_unique_emp
from project.analytics p
inner join (
select
t.project_id,
count(t.employee_id) as no_of_unique_emp
from (
select distinct employee_id, project_id
from employees.analytics
) t
group by t.project_id
) emp
on emp.project_id = p.project_id
where datediff (p.list_date, getdate()) <= 7
I came across a task where I have to return the total COUNT and SUM of issued policies for each day of the month and compare it to the previous year.
Table PolicyOrder has fields:
PolicyOrderId - primary key
CreatedAt (DATETIME)
CalculatedPremium - cost of policy or "premium"
PolicyOrderStatusId - irrelevant to the question but still - status of the policy.
To solve this I came up with a query that inner joins self table and sums/counts by grouping according to DAY of the creation date.
SELECT
DATE(po1.CreatedAt) AS dayDate_2017,
SUM(po1.CalculatedPremium) AS premiumSum_2017,
COUNT(po1.PolicyOrderId) AS policyCount_2017,
po2.*
FROM
PolicyOrder po1
INNER JOIN (
SELECT
DATE(CreatedAt) AS dayDate_2018,
SUM(CalculatedPremium) AS premiumSum_2018,
COUNT(PolicyOrderId) AS policyCount_2018
FROM
PolicyOrder po2
WHERE
YEAR(CreatedAt) = 2018 AND
MONTH(CreatedAt) = 10 AND
PolicyOrderStatusId = 6
GROUP BY
DAY(CreatedAt)
) po2 ON (
DAY(po2.dayDate_2018) = DAY(po1.CreatedAt)
)
WHERE
YEAR(po1.CreatedAt) = 2017 AND
MONTH(po1.CreatedAt) = 10 AND
PolicyOrderStatusId = 6
GROUP BY
DAY(po1.CreatedAt)
The above query returns these results:
dayDate_2017 | premiumSum_2017 | policyCount_2017 | dayDate_2018 | premiumSum_2018 | policyCount_2018
2017-10-01 | 4699.36 | 98 | 2018-10-01 | 8524.21 | 144
2017-10-02 | 9114.55 | 168 | 2018-10-02 | 7942.25 | 140
2017-10-03 | 9512.43 | 178 | 2018-10-03 | 9399.61 | 161
2017-10-04 | 9291.77 | 155 | 2018-10-04 | 6922.83 | 137
2017-10-05 | 8063.27 | 155 | 2018-10-05 | 9278.58 | 178
2017-10-06 | 9743.40 | 184 | 2018-10-06 | 6139.38 | 136
...
2017-10-31 | ...
The problem is that I now have to add two more columns in which policies have to be counted and amounts summed from the start of the year UP UNTIL each returned row.
Desired results:
dayDate_2017 | premiumSum_2017 | policyCount_2017 | sumFromYearBegining | countFromYearBegining
2017-10-01 | 4699.36 | 98 | 150000.34 | 5332
2017-10-02 | 9114.55 | 168 | 156230.55 | 5443
2017-10-03 | 9512.43 | 178 | 160232.44 | 5663
...
2017-10-31 | ...
WHERE:
sumFromYearBegining (150000.34) - SUM of premiumSum from 2017-01-01 until 2017-10-01 (excluding)
countFromYearBegining (5332) - COUNT of policies from 2017-01-01 until 2017-10-01 (excluding)
sumFromYearBegining (156230.55) - SUM of premiumSum from 2017-01-01 until 2017-10-02 (excluding)
countFromYearBegining (5443) - COUNT of policies from 2017-01-01 until 2017-10-02 (excluding)
sumFromYearBegining (160232.44) - SUM of premiumSum from 2017-01-01 until 2017-10-03 (excluding)
countFromYearBegining (5663) - COUNT of policies from 2017-01-01 until 2017-10-03 (excluding)
I have tried inner joining the same table with COUNT and SUM applied, which failed because I cannot specify the range up to which I need to count and sum. I have also tried LEFT joining and then counting, which fails because the results are counted not until each row's date but until the last result, and so on.
DB Fiddle: https://www.db-fiddle.com/f/ckM8HyTD6NjLbK41Mq1gct/5
Any help from you SQL ninjas highly appreciated.
We can use user-defined variables to calculate a rolling sum/count when window functions are not available.
We first need to determine the sum and count for every day in the year 2017, even though you need rows for a particular month only. This is because, to calculate the rolling sum for the days in March, we also need the sum/count values from January and February. One possible optimization is to restrict the calculation to the range from the first month up to the required month only.
Note that ORDER BY daydate_2017 is necessary in order to calculate the rolling sum correctly. By default, data is unordered; without defining the order, we cannot guarantee that the sum will be correct.
Also, we need two levels of sub-select queries. The first level is used to calculate the rolling sum values. The second level is used to restrict the result to February only. Since WHERE is executed before SELECT, we cannot restrict the result to February in the first level itself.
If you need a similar rolling sum for the year 2018 as well, the same query logic can be implemented in another set of sub-select queries.
SELECT dt2_2017.*, dt_2018.*
FROM
(
SELECT dt_2017.*,
@totsum := @totsum + dt_2017.premiumsum_2017 AS sumFromYearBegining_2017,
@totcount := @totcount + dt_2017.policycount_2017 AS countFromYearBeginning_2017
FROM (SELECT Date(po1.createdat) AS dayDate_2017,
Sum(po1.calculatedpremium) AS premiumSum_2017,
Count(po1.policyorderid) AS policyCount_2017
FROM PolicyOrder AS po1
WHERE po1.policyorderstatusid = 6 AND
YEAR(po1.createdat) = 2017 AND
MONTH(po1.createdat) <= 2 -- calculate upto February for 2017
GROUP BY daydate_2017
ORDER BY daydate_2017) AS dt_2017
CROSS JOIN (SELECT @totsum := 0, @totcount := 0) AS user_init_vars
) AS dt2_2017
INNER JOIN (
SELECT
DATE(po2.CreatedAt) AS dayDate_2018,
SUM(po2.CalculatedPremium) AS premiumSum_2018,
COUNT(po2.PolicyOrderId) AS policyCount_2018
FROM
PolicyOrder po2
WHERE
YEAR(po2.CreatedAt) = 2018 AND
MONTH(po2.CreatedAt) = 2 AND
po2.PolicyOrderStatusId = 6
GROUP BY
dayDate_2018
) dt_2018 ON DAY(dt_2018.dayDate_2018) = DAY(dt2_2017.dayDate_2017)
WHERE YEAR(dt2_2017.daydate_2017) = 2017 AND
MONTH(dt2_2017.daydate_2017) = 2;
RESULT: View on DB Fiddle
| dayDate_2017 | premiumSum_2017 | policyCount_2017 | sumFromYearBegining_2017 | countFromYearBeginning_2017 | dayDate_2018 | premiumSum_2018 | policyCount_2018 |
| ------------ | --------------- | ---------------- | ------------------------ | --------------------------- | ------------ | --------------- | ---------------- |
| 2017-02-01 | 4131.16 | 131 | 118346.77 | 3627 | 2018-02-01 | 8323.91 | 149 |
| 2017-02-02 | 2712.74 | 85 | 121059.51000000001 | 3712 | 2018-02-02 | 9469.33 | 153 |
| 2017-02-03 | 3888.59 | 111 | 124948.1 | 3823 | 2018-02-03 | 6409.21 | 97 |
| 2017-02-04 | 2447.99 | 74 | 127396.09000000001 | 3897 | 2018-02-04 | 5693.69 | 120 |
| 2017-02-05 | 1437.5 | 45 | 128833.59000000001 | 3942 | 2018-02-05 | 8574.97 | 129 |
| 2017-02-06 | 4254.48 | 127 | 133088.07 | 4069 | 2018-02-06 | 8277.51 | 133 |
| 2017-02-07 | 4746.49 | 136 | 137834.56 | 4205 | 2018-02-07 | 9853.75 | 173 |
| 2017-02-08 | 3898.05 | 125 | 141732.61 | 4330 | 2018-02-08 | 9116.33 | 144 |
| 2017-02-09 | 8306.86 | 286 | 150039.46999999997 | 4616 | 2018-02-09 | 8818.32 | 166 |
| 2017-02-10 | 6740.99 | 204 | 156780.45999999996 | 4820 | 2018-02-10 | 7880.17 | 134 |
| 2017-02-11 | 4290.38 | 133 | 161070.83999999997 | 4953 | 2018-02-11 | 8394.15 | 180 |
| 2017-02-12 | 3687.58 | 122 | 164758.41999999995 | 5075 | 2018-02-12 | 10378.29 | 171 |
| 2017-02-13 | 4939.31 | 159 | 169697.72999999995 | 5234 | 2018-02-13 | 9383.15 | 160 |
If you want a way that avoids using @variables in the select list, and also avoids analytics (only MySQL 8 supports them), you can do it with a semi-cartesian product:
WITH prevYr AS(
SELECT
YEAR(CreatedAt) AS year_prev,
MONTH(CreatedAt) AS month_prev,
DAY(CreatedAt) AS day_prev,
SUM(CalculatedPremium) AS premiumSum_prev,
COUNT(PolicyOrderId) AS policyCount_prev
FROM
PolicyOrder
WHERE
CreatedAt BETWEEN '2017-02-01' AND '2017-02-28' AND
PolicyOrderStatusId = 6
GROUP BY
YEAR(CreatedAt), MONTH(CreatedAt), DAY(CreatedAt)
),
currYr AS (
SELECT
YEAR(CreatedAt) AS year_curr,
MONTH(CreatedAt) AS month_curr,
DAY(CreatedAt) AS day_curr,
SUM(CalculatedPremium) AS premiumSum_curr,
COUNT(PolicyOrderId) AS policyCount_curr
FROM
PolicyOrder
WHERE
CreatedAt BETWEEN '2018-02-01' AND '2018-02-28' AND
PolicyOrderStatusId = 6
GROUP BY
YEAR(CreatedAt), MONTH(CreatedAt), DAY(CreatedAt)
)
SELECT
*
FROM
prevYr
INNER JOIN
currYr
ON
currYr.day_curr = prevYr.day_prev
INNER JOIN
(
SELECT
main.day_prev AS dayRolling_prev,
SUM(pre.premiumSum_prev) AS premiumSumRolling_prev,
SUM(pre.policyCount_prev) AS policyCountRolling_prev
FROM
prevYr main LEFT OUTER JOIN prevYr pre ON pre.day_prev < main.day_prev
GROUP BY
main.day_prev
) rollingPrev
ON
currYr.day_curr = rollingPrev.dayRolling_prev
ORDER BY 1,2,3
We summarise the year 2017 and year 2018 data into two CTEs because it makes things a lot cleaner and neater later, particularly for this rolling count. You can probably follow the logic of the CTE easily because it's lifted more or less straight from your query. I only dropped the DATE column in favour of a year/month/day triplet because it made other things (joins) cleaner, and it can be recombined into a date if needed. I also swapped the WHERE clauses to use date BETWEEN x AND y because this can leverage an index on the column, whereas using YEAR(date) = x AND MONTH(date) = y might not.
The rolling count works via something I referred to as a semi-cartesian product. It's actually a cartesian product: any database join that results in rows from one or both tables multiplying and being represented repeatedly in the output is a cartesian product. Rather than being a full product (every row crossed with every other row), in this case it uses a less-than predicate, so every row is only crossed with a subset of rows. As the date increases, more rows match the predicate, because a date of the 30th has 29 rows that are less than it.
This thus causes the following pattern of data:
maindate predate maincount precount
2017-02-01 NULL 10 NULL
2017-02-02 2017-02-01 20 10
2017-02-03 2017-02-01 30 10
2017-02-03 2017-02-02 30 20
2017-02-04 2017-02-01 40 10
2017-02-04 2017-02-02 40 20
2017-02-04 2017-02-03 40 30
You can see that any given main date repeats N - 1 times, because there are N - 1 dates lower than it that satisfy the join condition predate < maindate.
If we group by the maindate and sum the counts associated with each predate, we get the rolling sum of all the pre-counts on that main date. So, on the 4th day of the month, it's the SUM of the pre-counts for the 1st to the 3rd, i.e. 10+20+30 = 60. On the 5th day, we sum the counts for days 1 to 4; on the 6th day, we sum days 1 to 5, and so on.
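The same less-than join and grouping can be traced in a few lines of Python. This is only an illustrative sketch of the pattern using the four sample days from the table above; the function name `rolling_before` is hypothetical.

```python
# Daily counts from the illustration above: (day, count)
DAILY = [(1, 10), (2, 20), (3, 30), (4, 40)]


def rolling_before(rows):
    """For each day, sum the counts of all strictly earlier days.

    Mirrors: main LEFT JOIN pre ON pre.day < main.day GROUP BY main.day
    """
    # Build the join: every (main, pre) pair where pre is strictly earlier
    joined = []
    for main_day, _ in rows:
        matches = [(main_day, cnt) for day, cnt in rows if day < main_day]
        # LEFT JOIN keeps unmatched main rows, with NULL (None) on the right
        joined.extend(matches or [(main_day, None)])

    # GROUP BY main day; SUM() skips NULLs and stays NULL if all are NULL
    totals = {}
    for main_day, cnt in joined:
        totals.setdefault(main_day, None)
        if cnt is not None:
            totals[main_day] = (totals[main_day] or 0) + cnt
    return totals


rolling = rolling_before(DAILY)
print(rolling)
```

The first day has no earlier rows, so its rolling value is NULL rather than 0, which is also what the LEFT OUTER JOIN version of the query returns unless it is wrapped in COALESCE.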
I have a table that looks like this:
+--------+---------------------+-------+--------+-----------+
| PartNo | Date | Inv | Retail | Wholesale |
+--------+---------------------+-------+--------+-----------+
| 1 | 2018-05-12 00:00:00 | 15 | $100 | $90 |
| 2 | 2018-05-12 00:00:00 | 20 | $200 | $150 |
| 3 | 2018-05-12 00:00:00 | 25 | $300 | $200 |
| 1 | 2018-05-13 00:00:00 | 10 | $95 | $90 |
| 2 | 2018-05-14 00:00:00 | 15 | $200 | $150 |
| 3 | 2018-05-14 00:00:00 | 20 | $300 | $200 |
+--------+---------------------+-------+--------+-----------+
And I want it to look like this with a MySQL query:
+--------+------+--------+
| PartNo | Sold | Profit |
+--------+------+--------+
| 1 | 5 | $25 |
| 2 | 5 | $250 |
| 3 | 5 | $500 |
+--------+------+--------+
I need to group by PartNo while calculating the difference between totals and profits over a date range.
The unit profit has to be calculated by subtracting the wholesale from retail on the last day (or record) of the date range.
I feel like this should be easy, but the differences over the date ranges are confusing me, and I am lost on how to handle records within the date range that don't start or end exactly on the range boundaries.
Any help would be super appreciated.
Thank you.
You can look up the situation at the start and at the end of the period. If no start situation is found, assume no stock. If no end situation is found, that means no sales during the period.
For example for the period starting 2018-05-13 and ending 2018-05-14:
select parts.PartNo
, coalesce(FirstSale.Total, 0) - coalesce(LastSale.Total, FirstSale.Total, 0) as Sold
, (coalesce(FirstSale.Total, 0) - coalesce(LastSale.Total, FirstSale.Total, 0)) *
coalesce(LastSale.Retail - LastSale.Wholesale, 0) as Profit
from (
select PartNo
, max(case when Date < '2018-05-13' then Date end) as FirstEntry
, max(case when Date <= '2018-05-14' then Date end) as LastEntry
from Sales
group by
PartNo
) parts
left join
Sales FirstSale
on FirstSale.PartNo = parts.PartNo
and FirstSale.Date = parts.FirstEntry
left join
Sales LastSale
on LastSale.PartNo = parts.PartNo
and LastSale.Date = parts.LastEntry
Example at SQL Fiddle.
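The opening/closing snapshot logic can be checked against the desired output with a short Python sketch. It assumes the inventory column is the question's Inv and that prices are plain numbers; the function name `sold_and_profit` is hypothetical.

```python
from datetime import date

# (part_no, date, inventory, retail, wholesale) rows from the question
ROWS = [
    (1, date(2018, 5, 12), 15, 100, 90),
    (2, date(2018, 5, 12), 20, 200, 150),
    (3, date(2018, 5, 12), 25, 300, 200),
    (1, date(2018, 5, 13), 10, 95, 90),
    (2, date(2018, 5, 14), 15, 200, 150),
    (3, date(2018, 5, 14), 20, 300, 200),
]


def sold_and_profit(rows, start, end):
    """Return {part_no: (sold, profit)} for the period [start, end]."""
    report = {}
    for part in sorted({r[0] for r in rows}):
        history = sorted(r for r in rows if r[0] == part)
        before = [r for r in history if r[1] < start]   # opening snapshot
        upto = [r for r in history if r[1] <= end]      # closing snapshot
        first = before[-1] if before else None
        last = upto[-1] if upto else None
        opening = first[2] if first else 0              # no stock if unknown
        closing = last[2] if last else opening          # no sales if unknown
        unit_profit = (last[3] - last[4]) if last else 0
        sold = opening - closing
        report[part] = (sold, sold * unit_profit)
    return report


report = sold_and_profit(ROWS, date(2018, 5, 13), date(2018, 5, 14))
print(report)
```

For the period 2018-05-13 to 2018-05-14 this reproduces the Sold and Profit figures in the desired output table.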
SELECT c.partno AS partno,
MAX(c.inv) - MIN(c.inv) AS sold,
SUM(CASE WHEN c.date = c.last_date THEN profit ELSE 0 END) * (MAX(c.inv) - MIN(c.inv)) AS profit
FROM (SELECT partno, date, inv, retail - wholesale AS profit,
MAX(date) OVER (PARTITION BY partno) AS last_date
FROM test1) c
GROUP BY c.partno
ORDER BY c.partno;
Using the window function, first append a new column to track the max date for each partno. The inner query inside FROM will then produce rows like this one, with one column added to the original dataset:
| 1 | 2018-05-12 00:00:00 | 15 | $100 | $90 | **2018-05-13 00:00:00** |
The highlighted field is the one added to the dataset which is the last date in the date range for that part number!
Now from this result, we can pull out profit by checking for the row in which date column is equal to the new column we appended, which is essentially calculating the profit for the last date by subtracting wholesale from retail and multiplying with items sold.
PS : The logic for items sold is grouping by partno and subtracting MIN(Inv) from MAX(Inv)
Link to SQL Fiddle
This query runs on an invoices table to help me decide who I need to pay
Here's the base table:
The users table
+---------+--------+
| user_id | name |
+---------+--------+
| 1 | Peter |
| 2 | Lois |
| 3 | Stewie |
+---------+--------+
The invoices table:
+------------+---------+----------+--------+---------------+---------+
| invoice_id | user_id | currency | amount | description | is_paid |
+------------+---------+----------+--------+---------------+---------+
| 1 | 1 | usd | 140 | Cow hoof | 0 |
| 2 | 1 | usd | 45 | Cow tail | 0 |
| 3 | 1 | gbp | 1 | Cow nostril | 0 |
| 4 | 2 | gbp | 1500 | Cow nose hair | 0 |
| 5 | 2 | cad | 1 | eyelash | 1 |
+------------+---------+----------+--------+---------------+---------+
I want a resulting table that looks like this:
+---------+-------+----------+-------------+
| user_id | name | currency | SUM(amount) |
+---------+-------+----------+-------------+
| 1 | Peter | usd | 185 |
| 2 | Lois | gbp | 1500 |
+---------+-------+----------+-------------+
The conditions are:
Only consider invoices that have not been paid, so where is_paid = 0
Group them by user_id, by currency
If the SUM(amount) < $100 for the user_id, currency pair then don't bother showing the result, since we don't pay invoices that are less than $100 (or equivalent, based on a fixed exchange rate).
Here's what I've got so far (not working -- which I guess is because I'm filtering by a GROUP'ed parameter):
SELECT
users.user_id, users.name,
invoices.currency, SUM(invoices.amount)
FROM
mydb.users,
mydb.invoices
WHERE
users.user_id = invoices.user_id AND
invoices.is_paid != true AND
SUM(invoices.amount) >=
CASE
WHEN invoices.currency = 'usd' THEN 100
WHEN invoices.currency = 'gbp' THEN 155
WHEN invoices.currency = 'cad' THEN 117
END
GROUP BY
invoices.currency, users.user_id
ORDER BY
users.name, invoices.currency;
Help?
You can't use SUM in a WHERE. Use HAVING instead.
Use HAVING clause instead of SUM in WHERE condition
Try this:
SELECT u.user_id, u.name, i.currency, SUM(i.amount) invoiceAmount
FROM mydb.users u
INNER JOIN mydb.invoices i ON u.user_id = i.user_id
WHERE i.is_paid = 0
GROUP BY u.user_id, i.currency
HAVING SUM(i.amount) >= (CASE i.currency WHEN 'usd' THEN 100 WHEN 'gbp' THEN 155 WHEN 'cad' THEN 117 END)
ORDER BY u.name, i.currency;
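The effect of moving the aggregate test from WHERE to HAVING can be tried on the sample data with Python's built-in sqlite3 module. SQLite stands in for MySQL here, which is an assumption; the query is otherwise the same shape as the one above, with the per-currency thresholds taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE invoices (
        invoice_id INTEGER PRIMARY KEY, user_id INTEGER, currency TEXT,
        amount INTEGER, description TEXT, is_paid INTEGER);
    INSERT INTO users VALUES (1, 'Peter'), (2, 'Lois'), (3, 'Stewie');
    INSERT INTO invoices VALUES
        (1, 1, 'usd', 140, 'Cow hoof', 0),
        (2, 1, 'usd', 45, 'Cow tail', 0),
        (3, 1, 'gbp', 1, 'Cow nostril', 0),
        (4, 2, 'gbp', 1500, 'Cow nose hair', 0),
        (5, 2, 'cad', 1, 'eyelash', 1);
""")

# WHERE filters rows before grouping; HAVING filters the grouped totals.
query = """
    SELECT u.user_id, u.name, i.currency, SUM(i.amount) AS total
    FROM users u
    JOIN invoices i ON i.user_id = u.user_id
    WHERE i.is_paid = 0
    GROUP BY u.user_id, u.name, i.currency
    HAVING SUM(i.amount) >= CASE i.currency
        WHEN 'usd' THEN 100 WHEN 'gbp' THEN 155 WHEN 'cad' THEN 117 END
    ORDER BY u.name, i.currency
"""
payable = conn.execute(query).fetchall()
print(payable)
```

Peter's single unpaid gbp invoice is dropped by HAVING because its group total is below the gbp threshold, while the paid cad invoice never reaches the grouping step because WHERE removes it first.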
Try something like this:
SELECT
u.user_id, u.name, i.currency, sum(i.amount) due
FROM
invoices i
JOIN users u ON i.user_id=u.user_id
WHERE
i.is_paid = 0
GROUP BY u.user_id, u.name, i.currency
having due >= 100
Do you store exchange rates? If so, multiply each amount by its rate to get the amount in the base currency:
sum(amount*ex_rate) due