Group consecutive days in week in MySQL - mysql

I want to be able to group a table of sales by week, but only over the first x days of each week.
I can group by week easily:
SELECT SUM( order_total_price ) AS total, WEEK(order_time,1) AS week_number
FROM orders
WHERE YEAR( order_time ) = 2014
GROUP BY WEEK( order_time, 1 )
What I want is the cumulative sum of orders up to each day of the week.
As it stands, I would need to run this query 7 times, once for each day.
Here is some sample data. I have selected a range of totals from Mon-Sun:
+-------------+-------------------+
| order_time | order_total_price |
+-------------+-------------------+
| 2014-03-03 | 20 |
| 2014-03-04 | 25 |
| 2014-03-05 | 30 |
| 2014-03-06 | 15 |
| 2014-03-07 | 20 |
| 2014-03-08 | 15 |
| 2014-03-09 | 30 |
| 2014-03-10 | 20 |
| 2014-03-11 | 15 |
| 2014-03-12 | 10 |
| 2014-03-13 | 25 |
| 2014-03-14 | 30 |
| 2014-03-15 | 25 |
| 2014-03-16 | 10 |
+-------------+-------------------+
Here are the results that I am after:
+----------+-------------+-------+
| end_day | week_number | total |
+----------+-------------+-------+
| 1 | 10 | 20 |
| 2 | 10 | 45 |
| 3 | 10 | 75 |
| 4 | 10 | 90 |
| 5 | 10 | 110 |
| 6 | 10 | 125 |
| 7 | 10 | 155 |
| 1 | 11 | 20 |
| 2 | 11 | 35 |
| 3 | 11 | 45 |
| 4 | 11 | 70 |
| 5 | 11 | 100 |
| 6 | 11 | 125 |
| 7 | 11 | 135 |
+----------+-------------+-------+
The end_day (1 = Mon, 7 = Sun) is the day up to which the week's total is accumulated. Notice how each total is the running total up to that day of the week.

SELECT
sum(order_total_price) AS total,
WEEK(order_time,1) AS week_number, date_format(order_time,'%w') as day_of_week
FROM orders
WHERE YEAR(order_time) = 2014
AND date_format(order_time,'%w') >= 1
AND date_format(order_time,'%w') <= 2
-- date_format(date,format)
-- %w: Day of the week (0=Sunday, 6=Saturday)
GROUP BY WEEK(order_time,1), date_format(order_time, '%w')
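For what it's worth, the running total is easy to check outside the database. Below is a minimal Python sketch (not MySQL) that reproduces the desired table from the sample rows in the question. The function name running_totals_by_week is made up for illustration, and it assumes ISO week numbers, which match WEEK(order_time, 1) for these March 2014 dates but can differ around the turn of the year.

```python
from collections import defaultdict
from datetime import date

# Sample rows from the question: (order_time, order_total_price)
orders = [
    ("2014-03-03", 20), ("2014-03-04", 25), ("2014-03-05", 30),
    ("2014-03-06", 15), ("2014-03-07", 20), ("2014-03-08", 15),
    ("2014-03-09", 30), ("2014-03-10", 20), ("2014-03-11", 15),
    ("2014-03-12", 10), ("2014-03-13", 25), ("2014-03-14", 30),
    ("2014-03-15", 25), ("2014-03-16", 10),
]


def running_totals_by_week(rows):
    """Return (end_day, week_number, total) tuples, Monday = 1 ... Sunday = 7.

    total is the sum of all orders from Monday up to and including end_day.
    Assumption: ISO weeks stand in for MySQL's WEEK(order_time, 1).
    """
    # First collapse multiple orders on the same day into one daily sum.
    daily = defaultdict(int)
    for day_text, price in rows:
        iso_year, iso_week, iso_weekday = date.fromisoformat(day_text).isocalendar()
        daily[(iso_year, iso_week, iso_weekday)] += price

    result = []
    current_week = None
    running = 0
    for key in sorted(daily):
        iso_year, iso_week, iso_weekday = key
        if (iso_year, iso_week) != current_week:
            # A new week starts: reset the running total.
            current_week = (iso_year, iso_week)
            running = 0
        running += daily[key]
        result.append((iso_weekday, iso_week, running))
    return result


totals = running_totals_by_week(orders)
for end_day, week_number, total in totals:
    print(end_day, week_number, total)
```

Days without any orders simply produce no row for that end_day.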

Related

MYSQL - Get all months between two dates (from, to) and data for this months

I have a table something like this:
| id | date       | user_id | value |
-------------------------------------
| 1  | 2019-01-10 | 3       | 20    |
| 2  | 2019-04-08 | 3       | 30    |
| 3  | 2019-06-04 | 3       | 40    |
| 4  | 2019-08-20 | 3       | 50    |
| 5  | 2019-11-19 | 3       | 60    |
| 6  | 2019-01-11 | 4       | 70    |
| 7  | 2019-02-20 | 4       | 11    |
| 8  | 2019-03-11 | 4       | 12    |
| 9  | 2019-07-12 | 4       | 23    |
-------------------------------------
and I want to get the values between two dates, date_from and date_to, along with every month in that interval.
For example:
date_from = 2019-01-08;
date_to = 2019-09-10;
So for user_id = 3 I want to get something like this:
| date    | value |
-------------------
| 2019-01 | 20    |
| 2019-02 | NULL  |
| 2019-03 | NULL  |
| 2019-04 | 30    |
| 2019-05 | NULL  |
| 2019-06 | 40    |
| 2019-07 | NULL  |
| 2019-08 | 50    |
| 2019-09 | NULL  |
-------------------
Can anyone help me? Thanks!
You can use a recursive CTE to generate the dates and then left join:
with recursive dates as (
select date('2019-01-08') as dte
union all
select dte + interval 1 day
from dates
where dte < '2019-09-10'
)
select extract(year_month from d.dte) as yyyymm, sum(t.value)
from dates d left join
t
on d.dte = t.date and t.user_id = 3
group by yyyymm;
Here is a db<>fiddle.
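A variation on the same idea generates one row per month rather than one per day. This is only a sketch, run through Python's sqlite3 module so that it is self-contained, which means the date functions are SQLite's (date(..., 'start of month'), strftime) rather than MySQL's. The table name t and the parameter names date_from, date_to and user_id are assumptions taken from the wording of the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, date TEXT, user_id INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        (1, "2019-01-10", 3, 20), (2, "2019-04-08", 3, 30),
        (3, "2019-06-04", 3, 40), (4, "2019-08-20", 3, 50),
        (5, "2019-11-19", 3, 60), (6, "2019-01-11", 4, 70),
        (7, "2019-02-20", 4, 11), (8, "2019-03-11", 4, 12),
        (9, "2019-07-12", 4, 23),
    ],
)

query = """
WITH RECURSIVE months(month_start) AS (
    -- first day of the month that contains date_from
    SELECT date(:date_from, 'start of month')
    UNION ALL
    -- step one month at a time until the month of date_to is reached
    SELECT date(month_start, '+1 month')
    FROM months
    WHERE month_start < date(:date_to, 'start of month')
)
SELECT strftime('%Y-%m', m.month_start) AS yyyymm, SUM(t.value) AS value
FROM months AS m
LEFT JOIN t
  ON strftime('%Y-%m', t.date) = strftime('%Y-%m', m.month_start)
 AND t.date BETWEEN :date_from AND :date_to
 AND t.user_id = :user_id
GROUP BY m.month_start
ORDER BY m.month_start
"""

params = {"date_from": "2019-01-08", "date_to": "2019-09-10", "user_id": 3}
rows = conn.execute(query, params).fetchall()
for yyyymm, value in rows:
    print(yyyymm, value)
```

Months without a matching row come back with None (NULL), because the filter on user_id sits in the ON clause instead of a WHERE clause.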

sql joins with multiple conditions

I have two tables (say bill and soldproduct):
select * from bill;
+------+------------+------------+
| id | solddate | customerId |
+------+------------+------------+
| 11 | 2018-07-23 | 1 |
| 12 | 2018-07-21 | 1 |
| 13 | 2018-08-02 | 2 |
| 14 | 2018-08-08 | 2 |
| 15 | 2018-08-08 | 1 |
| 16 | 2018-08-08 | 1 |
+------+------------+------------+
select * from soldproduct;
+--------+-------------+----------+-------+------------+
| billid | productname | quantity | price | totalprice |
+--------+-------------+----------+-------+------------+
| 11 | book | 2 | 100 | 200 |
| 11 | pen | 10 | 10 | 100 |
| 11 | pencil | 5 | 2 | 10 |
| 12 | pencil | 5 | 2 | 10 |
| 13 | pen | 10 | 10 | 100 |
| 13 | book | 2 | 100 | 200 |
| 14 | pen | 1 | 10 | 10 |
| 14 | bottle | 1 | 75 | 75 |
| 15 | phone | 1 | 5000 | 5000 |
| 16 | lock | 15 | 50 | 750 |
+--------+-------------+----------+-------+------------+
I need to find the bill id with the highest total, using totalprice.
I tried using:
select billid,sum(totalprice)
from soldproduct
where billid in (select id from bill where solddate >= date_sub(curdate(),interval 1 month))
group by billid
order by totalprice desc;
and my output is
+--------+-----------------+
| billid | sum(totalprice) |
+--------+-----------------+
| 15 | 5000 |
| 16 | 750 |
| 11 | 310 |
| 13 | 300 |
| 12 | 10 |
| 14 | 85 |
+--------+-----------------+
How do I get the same output with a single query using joins (without using a subquery)?
Try the following join:
select billid, sum(totalprice)
from soldproduct
join bill on soldproduct.billid = bill.id
 and solddate >= date_sub(curdate(), interval 1 month)
group by billid
-- sort by the aggregated total rather than a single row's totalprice
order by sum(totalprice) desc;
Can you try the query below? (I have not tested it.)
SELECT SP.billid, SUM(SP.totalprice)
FROM soldproduct SP
JOIN bill B ON (B.id = SP.billid)
WHERE B.solddate BETWEEN (CURRENT_DATE() - INTERVAL 1 MONTH) AND CURRENT_DATE()
GROUP BY SP.billid
-- sort by the aggregated total rather than a single row's totalprice
ORDER BY SUM(SP.totalprice) DESC;
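To see the join and the aggregate ordering on the sample data, here is a small sketch using Python's sqlite3 module. It is not MySQL: date(:today, '-1 month') stands in for date_sub(curdate(), interval 1 month), and the reference date 2018-08-20 is an assumption chosen so that every sample bill falls inside the one-month window.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bill (id INTEGER, solddate TEXT, customerId INTEGER);
CREATE TABLE soldproduct (
    billid INTEGER, productname TEXT, quantity INTEGER,
    price INTEGER, totalprice INTEGER
);
INSERT INTO bill VALUES
    (11, '2018-07-23', 1), (12, '2018-07-21', 1), (13, '2018-08-02', 2),
    (14, '2018-08-08', 2), (15, '2018-08-08', 1), (16, '2018-08-08', 1);
INSERT INTO soldproduct VALUES
    (11, 'book', 2, 100, 200), (11, 'pen', 10, 10, 100),
    (11, 'pencil', 5, 2, 10), (12, 'pencil', 5, 2, 10),
    (13, 'pen', 10, 10, 100), (13, 'book', 2, 100, 200),
    (14, 'pen', 1, 10, 10), (14, 'bottle', 1, 75, 75),
    (15, 'phone', 1, 5000, 5000), (16, 'lock', 15, 50, 750);
""")

query = """
SELECT sp.billid, SUM(sp.totalprice) AS bill_total
FROM soldproduct AS sp
JOIN bill AS b ON b.id = sp.billid
WHERE b.solddate >= date(:today, '-1 month')  -- assumed fixed 'today'
GROUP BY sp.billid
ORDER BY bill_total DESC                      -- order by the sum per bill
"""

rows = conn.execute(query, {"today": "2018-08-20"}).fetchall()
for billid, bill_total in rows:
    print(billid, bill_total)
```

Ordering by the summed column puts bill 14 (85) ahead of bill 12 (10), whereas the output in the question has them the other way round because it sorts on a single row's totalprice.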

mysql query sum values group by month but divided in years

I have a table with 4 columns: ID, fieldDATE, fieldINT1, fieldINT2.
Table is like this:
ID | fieldDATE  | fieldINT1 | fieldINT2 |
=========================================
1  | 2016-01-01 | 100       | 1         |
2  | 2016-01-08 | 200       | 1         |
3  | 2016-02-01 | 150       | 1         |
4  | 2016-02-05 | 400       | 2         |
5  | 2017-01-01 | 120       | 1         |
6  | 2017-01-21 | 123       | 1         |
7  | 2017-02-03 | 30        | 1         |
8  | 2018-01-01 | 123       | 1         |
9  | 2018-01-03 | 30        | 1         |
I'd like to create a table with 12 rows, where the first column is the month name and the other columns are the sums of fieldINT1 and fieldINT2, grouped by month, for a specific year. So in my example there will be 4 columns (MONTH NAME, 2016, 2017, 2018).
How can I do it?

group by week, but new group if it falls into next month

I am having a hard time wrapping my head around how I would do this.
I have daily (most days) invoice data that I need to group into weekly buckets. However, if a week runs into the next month, the bucket should contain only the days that fall in the current month. The next bucket then runs from the 1st to the following Saturday, so that the next full week starts again on Sunday.
Right now we just don't group it at all, and just export by day which gives us ~60 million rows for the rolling 2 years (it is more complex than the example as it also is split by item and customer). This then gets imported into our demand planning software which has both a weekly and monthly model. It has no problem dumping them into the correct buckets when it is by day.
However I would like to decrease this ~60 million rows as we are running into some time constraints. But it still has to accurately work with both the weekly and monthly models the data gets imported into.
How can I group this way?
Example Data set
+------------+------------+
| date | sales |
+------------+------------+
| 2014-06-22 | 100 |
| 2014-06-23 | 200 |
| 2014-06-24 | 300 |
| 2014-06-25 | 150 |
| 2014-06-26 | 170 |
| 2014-06-27 | 210 |
| 2014-06-28 | 220 |
| 2014-06-29 | 120 |
| 2014-06-30 | 110 |
| 2014-07-01 | 190 |
| 2014-07-02 | 210 |
| 2014-07-03 | 100 |
| 2014-07-04 | 140 |
| 2014-07-05 | 150 |
| 2014-07-06 | 130 |
| 2014-07-07 | 420 |
| 2014-07-08 | 310 |
| 2014-07-09 | 290 |
| 2014-07-10 | 180 |
| 2014-07-11 | 140 |
| 2014-07-12 | 210 |
+------------+------------+
Expected Result:
+------------+------------+
| date | sum(sales) |
+------------+------------+
| 2014-06-22 | 1350 | 7 days in group
| 2014-06-29 | 230 | 2 days in group
| 2014-07-01 | 790 | 5 days in group
| 2014-07-06 | 1680 | 7 days in group
+------------+------------+
EDIT:
We came up with a working solution. Feel free to improve on it if wanted, or not.
SELECT DATE(IF(
MONTH(DATE_SUB(`date`, INTERVAL DAYOFWEEK(`date`) - 1 DAY)) = MONTH(`date`)
, DATE_SUB(`date`, INTERVAL DAYOFWEEK(`date`) - 1 DAY)
, DATE_FORMAT(`date`,'%Y-%m-01')
)) AS datekey
, SUM(val) AS valsum
FROM tmp.testdata
GROUP BY IF(
MONTH(DATE_SUB(`date`, INTERVAL DAYOFWEEK(`date`) - 1 DAY)) = MONTH(`date`) -- If the closest previous Sunday from date falls within the same month as the date...
, DATE_SUB(`date`, INTERVAL DAYOFWEEK(`date`) - 1 DAY) -- ...use the date of the closest previous Sunday as the key...
, DATE_FORMAT(`date`,'%Y-%m-01') -- ...otherwise use the 1st of the month the date falls in as the key (since that must mean the date falls in that opening partial week).
)
ORDER BY datekey
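The bucket rule in that query can be checked against the example data with a few lines of Python. This is only a sketch of the same logic outside MySQL, and the helper name bucket_start is made up for illustration: take the most recent Sunday if it is in the same month as the date, otherwise take the 1st of the date's month.

```python
from collections import defaultdict
from datetime import date, timedelta


def bucket_start(day):
    """Return the key of the week-within-month bucket that `day` belongs to."""
    # Python's weekday() is Monday = 0 ... Sunday = 6, so this is the number
    # of days since the most recent Sunday (0 if `day` is itself a Sunday).
    days_since_sunday = (day.weekday() + 1) % 7
    previous_sunday = day - timedelta(days=days_since_sunday)
    if previous_sunday.month == day.month:
        return previous_sunday
    # The Sunday lies in the previous month: this is an opening partial week.
    return day.replace(day=1)


# Example data from the question: consecutive days starting 2014-06-22.
sales = [100, 200, 300, 150, 170, 210, 220, 120, 110, 190, 210,
         100, 140, 150, 130, 420, 310, 290, 180, 140, 210]
first_day = date(2014, 6, 22)
rows = [(first_day + timedelta(days=i), amount) for i, amount in enumerate(sales)]

totals = defaultdict(int)
for day, amount in rows:
    totals[bucket_start(day)] += amount

for key in sorted(totals):
    print(key, totals[key])
```

Because the key is computed from the date alone, it stays a Sunday or the 1st even when some days have no invoices.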
Thanks all! We combined some of this together and ended up with:
SELECT MIN(`date`) AS datekey
, SUM(val) AS valsum
FROM tmp.testdata
GROUP BY DATE_FORMAT(`date`, '%U'), MONTH(`date`), YEAR(`date`)
ORDER BY datekey
Or, in case we always want the bucket key to be a Sunday or the 1st (for instance, when not every day has invoices), we combined my solution with the one here, since the grouping here was faster:
SELECT
DATE(IF(MONTH(DATE_SUB(`date`,
INTERVAL DAYOFWEEK(`date`) - 1 DAY)) = MONTH(`date`),
DATE_SUB(`date`,
INTERVAL DAYOFWEEK(`date`) - 1 DAY),
DATE_FORMAT(`date`, '%Y-%m-01'))) AS datekey,
SUM(val) AS valsum
FROM
tmp.testdata
GROUP BY DATE_FORMAT(`date`, '%U') , MONTH(`date`) , YEAR(`date`)
ORDER BY datekey
Here's something to think about...
calendar is a simple table of dates...
SELECT MIN(dt),YEARWEEK(dt),MONTH(dt) FROM calendar WHERE dt BETWEEN '2014-01-01' AND '2014-12-31' GROUP BY YEARWEEK(dt),MONTH(dt);
+------------+--------------+-----------+
| MIN(dt) | YEARWEEK(dt) | MONTH(dt) |
+------------+--------------+-----------+
| 2014-01-01 | 201352 | 1 |
| 2014-01-05 | 201401 | 1 |
| 2014-01-12 | 201402 | 1 |
| 2014-01-19 | 201403 | 1 |
| 2014-01-26 | 201404 | 1 |<-- Overlap
| 2014-02-01 | 201404 | 2 |<-- Overlap
| 2014-02-02 | 201405 | 2 |
| 2014-02-09 | 201406 | 2 |
| 2014-02-16 | 201407 | 2 |
| 2014-02-23 | 201408 | 2 |<-- Overlap
| 2014-03-01 | 201408 | 3 |<-- Overlap
| 2014-03-02 | 201409 | 3 |
| 2014-03-09 | 201410 | 3 |
| 2014-03-16 | 201411 | 3 |
| 2014-03-23 | 201412 | 3 |
| 2014-03-30 | 201413 | 3 |<-- Overlap
| 2014-04-01 | 201413 | 4 |<-- Overlap
| 2014-04-06 | 201414 | 4 |
| 2014-04-13 | 201415 | 4 |
| 2014-04-20 | 201416 | 4 |
| 2014-04-27 | 201417 | 4 |<-- Overlap
| 2014-05-01 | 201417 | 5 |<-- Overlap
| 2014-05-04 | 201418 | 5 |
| 2014-05-11 | 201419 | 5 |
| 2014-05-18 | 201420 | 5 |
| 2014-05-25 | 201421 | 5 |<-- No overlap
| 2014-06-01 | 201422 | 6 |<-- No overlap
| 2014-06-08 | 201423 | 6 |
| 2014-06-15 | 201424 | 6 |
| 2014-06-22 | 201425 | 6 |
| 2014-06-29 | 201426 | 6 |<-- Overlap
| 2014-07-01 | 201426 | 7 |<-- Overlap
| 2014-07-06 | 201427 | 7 |
| 2014-07-13 | 201428 | 7 |
| 2014-07-20 | 201429 | 7 |
| 2014-07-27 | 201430 | 7 |<-- Overlap
| 2014-08-01 | 201430 | 8 |<-- Overlap
| 2014-08-03 | 201431 | 8 |
| 2014-08-10 | 201432 | 8 |
| 2014-08-17 | 201433 | 8 |
| 2014-08-24 | 201434 | 8 |
| 2014-08-31 | 201435 | 8 |<-- Overlap
| 2014-09-01 | 201435 | 9 |<-- Overlap
| 2014-09-07 | 201436 | 9 |
| 2014-09-14 | 201437 | 9 |
| 2014-09-21 | 201438 | 9 |
| 2014-09-28 | 201439 | 9 |<-- Overlap
| 2014-10-01 | 201439 | 10 |<-- Overlap
| 2014-10-05 | 201440 | 10 |
| 2014-10-12 | 201441 | 10 |
| 2014-10-19 | 201442 | 10 |
| 2014-10-26 | 201443 | 10 |<-- Overlap
| 2014-11-01 | 201443 | 11 |<-- Overlap
| 2014-11-02 | 201444 | 11 |
| 2014-11-09 | 201445 | 11 |
| 2014-11-16 | 201446 | 11 |
| 2014-11-23 | 201447 | 11 |
| 2014-11-30 | 201448 | 11 |<-- Overlap
| 2014-12-01 | 201448 | 12 |<-- Overlap
| 2014-12-07 | 201449 | 12 |
| 2014-12-14 | 201450 | 12 |
| 2014-12-21 | 201451 | 12 |
| 2014-12-28 | 201452 | 12 |
+------------+--------------+-----------+
SELECT min(date),sum(sales) FROM sales GROUP BY WEEKOFYEAR(date), MONTH(date);
Update: WEEKOFYEAR() will use the MySQL calendar which starts the week on a Monday. So I found you can use DATE_FORMAT to get the week number starting with Sunday.
SELECT min(date),sum(sales) FROM sales GROUP BY DATE_FORMAT(date, '%U'), MONTH(date);

MySQL left outer join trouble

Here is a query that groups transactions by pricepoint on an hourly basis:
SELECT hour(Stamp) AS hour, PointID AS pricepoint, count(1) AS counter
FROM Transactions
GROUP BY 1,2;
Sample output:
+------+------------+---------+
| hour | pricepoint | counter |
+------+------------+---------+
| 0 | 19 | 5 |
| 0 | 20 | 14 |
| 1 | 19 | 3 |
| 1 | 20 | 12 |
| 2 | 19 | 2 |
| 2 | 20 | 8 |
| 3 | 19 | 2 |
| 3 | 20 | 4 |
| 4 | 19 | 1 |
| 4 | 20 | 1 |
| 5 | 19 | 4 |
| 5 | 20 | 1 |
| 6 | 20 | 2 |
| 8 | 19 | 1 |
| 8 | 20 | 4 |
| 9 | 19 | 2 |
| 9 | 20 | 5 |
| 10 | 19 | 6 |
| 10 | 20 | 1 |
| 11 | 19 | 10 |
| 11 | 20 | 2 |
| 12 | 19 | 10 |
| 12 | 20 | 3 |
| 13 | 19 | 10 |
| 13 | 20 | 10 |
| 14 | 19 | 8 |
| 14 | 20 | 3 |
| 15 | 19 | 6 |
| 15 | 20 | 8 |
| 16 | 19 | 11 |
| 16 | 20 | 10 |
| 17 | 19 | 7 |
| 17 | 20 | 17 |
| 18 | 19 | 7 |
| 18 | 20 | 9 |
| 19 | 19 | 10 |
| 19 | 20 | 12 |
| 20 | 19 | 17 |
| 20 | 20 | 11 |
| 21 | 19 | 12 |
| 21 | 20 | 29 |
| 22 | 19 | 6 |
| 22 | 20 | 21 |
| 23 | 19 | 9 |
| 23 | 20 | 23 |
+------+------------+---------+
As you can see, some hours have no transactions (e.g. 7am), and some hours only have transactions for a single pricepoint (e.g. 6am has pricepoint 20 only, with no transactions for pricepoint 19).
I would like the result set to show "0" when there are no transactions, rather than the row being missing as it is now.
I tried a LEFT OUTER JOIN. The inHour table contains the values 0..23:
SELECT H.hour, PointID AS Pricepoint, COALESCE(T.counter, 0) AS Count
FROM inHour H
LEFT OUTER JOIN
(
SELECT hour(Stamp) AS Hour, PointID, count(1) AS counter
FROM Transactions
GROUP BY 1,2
) T
ON T.Hour = H.hour;
This produces the following output (truncated for brevity):
| 5 | 19 | 4 |
| 5 | 20 | 1 |
| 6 | 20 | 2 |
| 7 | NULL | 0 |
| 8 | 19 | 1 |
| 8 | 20 | 4 |
What I would like in fact would be:
| 5 | 19 | 4 |
| 5 | 20 | 1 |
| 6 | 19 | 0 |
| 6 | 20 | 2 |
| 7 | 19 | 0 |
| 7 | 20 | 0 |
| 8 | 19 | 1 |
| 8 | 20 | 4 |
In my desired output, the value "0" is put next to pricepoints that had no transactions during a given hour.
Your suggestions would be welcome! Thanks.
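-- COUNT(t.PointID) ignores the NULLs of unmatched rows, so empty slots count as 0
-- (COUNT(t.*) is not valid MySQL syntax)
SELECT h.Hour, p.Pricepoint, COUNT(t.PointID) AS Count
FROM inHour h
-- explicit CROSS JOIN: with a comma join, MySQL cannot see h.Hour in the ON clause below
CROSS JOIN (SELECT DISTINCT PointID AS Pricepoint FROM Transactions) p
LEFT OUTER JOIN Transactions t
ON h.Hour = hour(t.Stamp) AND p.Pricepoint = t.PointID
GROUP BY h.Hour, p.Pricepoint
ORDER BY h.Hour, p.Pricepoint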
I don't have time at the moment to try this, so let me know if it doesn't work and I'll try to adjust.
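For anyone who wants to try the idea (build every hour/pricepoint pair first, then left join the transactions onto it), here is a small self-contained sketch using Python's sqlite3 module. It is not MySQL, so strftime('%H', ...) replaces hour(), and the handful of Transactions rows are invented purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inHour (hour INTEGER)")
conn.executemany("INSERT INTO inHour VALUES (?)", [(h,) for h in range(24)])
conn.execute("CREATE TABLE Transactions (Stamp TEXT, PointID INTEGER)")
# Hypothetical rows: hour 6 only has pricepoint 20, hour 7 has nothing.
conn.executemany(
    "INSERT INTO Transactions VALUES (?, ?)",
    [
        ("2014-01-01 05:10:00", 19),
        ("2014-01-01 05:20:00", 20),
        ("2014-01-01 06:05:00", 20),
        ("2014-01-01 06:45:00", 20),
        ("2014-01-01 08:30:00", 19),
    ],
)

query = """
SELECT h.hour, p.pricepoint, COUNT(t.PointID) AS counter
FROM inHour AS h
-- every hour paired with every pricepoint that occurs at all
CROSS JOIN (SELECT DISTINCT PointID AS pricepoint FROM Transactions) AS p
-- attach the matching transactions; unmatched pairs get NULLs
LEFT JOIN Transactions AS t
  ON CAST(strftime('%H', t.Stamp) AS INTEGER) = h.hour
 AND t.PointID = p.pricepoint
GROUP BY h.hour, p.pricepoint
ORDER BY h.hour, p.pricepoint
"""

counts = {(hour, point): n for hour, point, n in conn.execute(query)}
for hour in (5, 6, 7, 8):
    for point in (19, 20):
        print(hour, point, counts[(hour, point)])
```

Counting t.PointID rather than using count(1) matters here: count(1) would count the NULL-padded row of an unmatched pair as 1 instead of 0.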
Someone probably has a better solution than this, but I would use a UNION to simplify things:
SELECT hour(Stamp) AS hour, PointID AS pricepoint, count(1) AS counter
FROM Transactions
GROUP BY 1,2
UNION
SELECT hour,0 AS pricepoint,0 AS counter FROM inHour WHERE hour NOT IN (SELECT hour(Stamp) FROM Transactions)