MySQL GROUP BY: group results by week and month

I have a table named "counter" that has 3 fields:
+--------+------------+------+-----+---------+----------------+
| Field  | Type       | Null | Key | Default | Extra          |
+--------+------------+------+-----+---------+----------------+
| id     | int(11)    | NO   | PRI | NULL    | auto_increment |
| record | datetime   | YES  |     | NULL    |                |
| passed | tinyint(1) | YES  |     | NULL    |                |
+--------+------------+------+-----+---------+----------------+
Sample rows from the table:
mysql> SELECT record, passed FROM counter;
+---------------------+--------+
| record              | passed |
+---------------------+--------+
| 2019-09-19 00:00:00 |      1 |
+---------------------+--------+
I would like to write a query that groups the results by week and by month, based on the record field.
I have tried the following, but it doesn't look right:
SELECT DATE_FORMAT(record, '%d-%m-%Y') AS ts, COUNT(*) FROM counter WHERE record >= '2019-10-09' AND record < '2019-10-09' + INTERVAL 7 DAY GROUP BY DATE_FORMAT(record, '%d-%m-%Y');

Simple grouping with date extraction should work:
SELECT
EXTRACT(YEAR FROM record) V_YEAR,
EXTRACT(MONTH FROM record) V_MONTH,
EXTRACT(WEEK FROM record) V_WEEK,
COUNT(*)
FROM counter
GROUP BY
EXTRACT(YEAR FROM record),
EXTRACT(MONTH FROM record),
EXTRACT(WEEK FROM record)
Edit: the same works with DATE_FORMAT(date, format) as well ('%M' is the month name, '%U' the week of the year):
SELECT
DATE_FORMAT(record, '%Y') V_YEAR,
DATE_FORMAT(record, '%M') V_MONTH,
DATE_FORMAT(record, '%U') V_WEEK,
COUNT(*)
FROM counter
GROUP BY
DATE_FORMAT(record, '%Y'),
DATE_FORMAT(record, '%M'),
DATE_FORMAT(record, '%U')

Number of passes, fails and total rows per week:
SELECT week(record) as weekofyear, sum(passed) as numpasses, count(*) - sum(passed) as numfails, count(*) as total
FROM counter
WHERE record >= '2019-01-01' and record < '2020-01-01'
GROUP BY week(record)
Swap week for month to group by month instead.
MySQL supports ROLLUP, which can summarise weeks and months in the same statement:
SELECT month(record) as monthofyear, week(record) as weekofyear, sum(passed) as numpasses, count(*) - sum(passed) as numfails, count(*) as total
FROM counter
WHERE record >= '2019-01-01' and record < '2020-01-01'
GROUP BY month(record), week(record) WITH ROLLUP
Rows where weekofyear is NULL hold the total for that month; the final row, where both columns are NULL, holds the grand total.
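To make the grouping arithmetic concrete outside the database, here is a minimal Python sketch (not MySQL) that buckets rows by month and by week and counts passes and fails the same way as SUM(passed) and COUNT(*) above. The sample rows and the summarise helper are made up for illustration, and passed is assumed to be stored as 0 or 1. Week numbers come from strftime("%U") (weeks starting on Sunday), which I assume lines up with MySQL's default WEEK() mode.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical sample rows: (record, passed), with passed stored as 0 or 1.
rows = [
    (datetime(2019, 9, 19), 1),
    (datetime(2019, 9, 20), 0),
    (datetime(2019, 9, 30), 1),
    (datetime(2019, 10, 1), 1),
]


def summarise(rows, key):
    """Group rows by key(record) and count passes, fails and total."""
    out = defaultdict(lambda: {"passes": 0, "fails": 0, "total": 0})
    for record, passed in rows:
        bucket = out[key(record)]
        bucket["passes"] += passed
        bucket["total"] += 1
        bucket["fails"] = bucket["total"] - bucket["passes"]
    return dict(out)


# The year is part of both keys so the same week or month number
# from two different years never lands in one bucket.
by_month = summarise(rows, lambda d: (d.year, d.month))
by_week = summarise(rows, lambda d: (d.year, int(d.strftime("%U"))))

for key, counts in sorted(by_month.items()):
    print("month", key, counts)
for key, counts in sorted(by_week.items()):
    print("week", key, counts)
```

Keeping the year in the key matters once the date range spans more than one year, which is also why the SQL above restricts the range to 2019.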

Related

count by person by month between days in mysql

I have a table of absences with 3 columns: id, begin_dt, end_dt. I need to count how many ids have at least one day of absence in each month. For example, given the following rows:
id begin_dt end_dt
1 01/01/2020 02/02/2020
2 02/02/2020 02/02/2020
the result should be:
month count
01-2020 1
02-2020 2
I thought of a GROUP BY on DATE_FORMAT(SYSDATE(), '%Y-%m'), but I don't know how to account for the whole period from begin_dt to end_dt.
You can find a working table definition for this example here: https://www.db-fiddle.com/f/rYBsxQzTjjQ9nGBEmeAX6W/0
Schema (MySQL v5.7)
CREATE TABLE absence (
`id` VARCHAR(6),
`begin_dt` DATETIME,
`end_dt` DATETIME
);
INSERT INTO absence
(`id`, `begin_dt`, `end_dt`)
VALUES
('1', DATE('2019-01-01'), DATE('2019-02-02')),
('2', DATE('2019-02-02'), DATE('2019-02-02'));
Query #1
select * from absence;
| id | begin_dt | end_dt |
| --- | ------------------- | ------------------- |
| 1 | 2019-01-01 00:00:00 | 2019-02-02 00:00:00 |
| 2 | 2019-02-02 00:00:00 | 2019-02-02 00:00:00 |
SELECT DATE_FORMAT(startofmonth, '%Y-%m-01') year_and_month,
COUNT(DISTINCT id) absent_person_count
FROM absence
JOIN ( SELECT DATE_FORMAT(dt + INTERVAL n MONTH, '%Y-%m-01') startofmonth,
DATE_FORMAT(dt + INTERVAL n MONTH, '%Y-%m-01') + INTERVAL 1 MONTH - INTERVAL 1 DAY endofmonth
FROM ( SELECT MIN(begin_dt) dt
FROM absence ) startdate,
( SELECT 0 n UNION ALL
SELECT 1 UNION ALL
SELECT 2 UNION ALL
SELECT 3 UNION ALL
SELECT 4 UNION ALL
SELECT 5 ) numbers,
( SELECT DATE_FORMAT(MIN(begin_dt), '%Y-%m') mindate,
DATE_FORMAT(MAX(end_dt), '%Y-%m') maxdate
FROM absence ) datesrange
WHERE DATE_FORMAT(dt + INTERVAL n MONTH, '%Y-%m') BETWEEN mindate AND maxdate ) dateslist
ON begin_dt <= endofmonth
AND end_dt >= startofmonth
GROUP BY year_and_month;
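The idea behind the query is to generate one row per calendar month and keep every absence whose period overlaps that month. As a rough illustration of the same overlap logic in Python (not MySQL), using the two sample rows from the schema; the helper names months_touched and absent_ids_per_month are made up for this sketch:

```python
from datetime import date

# Sample rows from the schema above: (id, begin_dt, end_dt).
absences = [
    ("1", date(2019, 1, 1), date(2019, 2, 2)),
    ("2", date(2019, 2, 2), date(2019, 2, 2)),
]


def months_touched(begin, end):
    """Yield (year, month) for every calendar month overlapping [begin, end]."""
    year, month = begin.year, begin.month
    while (year, month) <= (end.year, end.month):
        yield year, month
        # Step to the next month, rolling over the year after December.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)


def absent_ids_per_month(rows):
    """Map 'YYYY-MM' to the number of distinct ids absent in that month."""
    ids_by_month = {}
    for person_id, begin, end in rows:
        for year, month in months_touched(begin, end):
            ids_by_month.setdefault(f"{year}-{month:02d}", set()).add(person_id)
    return {m: len(ids) for m, ids in sorted(ids_by_month.items())}


result = absent_ids_per_month(absences)
print(result)
```

Collecting ids in a set means a person with several absences in the same month is only counted once.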

Can't run MySQL code that seems to work for others

The goal of the code is to select Month, SaleID, Total and Growth. I can display Month, SaleID and Total, but I can't get Growth to work because it always calculates from the first row. What am I doing wrong?
I've tried setting up variables and emulating LAG(), PREV, CURRENT and NEXT to get the row the calculation should use, but it won't recognise the native functions.
CREATE VIEW SalesTemp
AS
SELECT
DATE_FORMAT(Sales.SaleDate, "%Y-%m") AS Month,
Sales.SaleID,
Sales.Total
FROM Sales
WHERE SaleDate BETWEEN '2018-04-00' AND '2040-00-00'
GROUP BY DATE_FORMAT(Sales.SaleDate, "%Y-%m");
SELECT * FROM SalesTemp;
DROP VIEW IF EXISTS PercentageGrowth;
CREATE VIEW PercentageGrowth
AS
SELECT
DATE_FORMAT(Sales.SaleDate, "%Y-%m") AS Month,
Sales.SaleID,
Sales.Total,
CONCAT(ROUND(((Sales.Total) - SalesTemp.Total) / (SELECT SalesTemp.Total FROM SalesTemp GROUP BY DATE_FORMAT(SalesTemp.Month, "%Y-%m")) * 100, 2), "%") AS Growth
FROM Sales, SalesTemp
GROUP BY DATE_FORMAT(Sales.SaleDate, "%Y-%m");
SELECT * FROM PercentageGrowth;
DROP VIEW PercentageGrowth;
DROP VIEW SalesTemp;
I want it to display growth of a company through the calculation of ((newValue - oldValue) / oldValue).
Since I can't link pictures I'll ascii what the result is. What I get from the SELECT now is:
+--------------------------------------+
| Month | SaleID | Total | Growth |
| ------- | ------ | ------- | ------- |
| 2018-04 | 1 | 310.46 | 00.00% |
| 2018-05 | 3 | 2160.62 | 595.54% |
| 2018-06 | 6 | 1087.89 | 250.21% |
| 2018-07 | 14 | 2314.54 | 645.09% |
+--------------------------------------+
I want it to say:
+--------------------------------------+
| Month | SaleID | Total | Growth |
| ------- | ------ | ------- | ------- |
| 2018-04 | 1 | 310.46 | 00.00% |
| 2018-05 | 3 | 2160.62 | 595.54% |
| 2018-06 | 6 | 1087.89 | -49.64% |
| 2018-07 | 14 | 2314.54 | 112.76% |
+--------------------------------------+
Currently, you are cross joining Sales with SalesTemp, so every sale is paired with every formatted month, and the result is further distorted by the ambiguous aggregation (Total is selected without an aggregate function).
Assuming you use MySQL 8+, consider a pair of CTEs in which the second applies LAG with an offset of one to your aggregated monthly totals:
WITH cte1 AS
(SELECT DATE_FORMAT(Sales.SaleDate, "%Y-%m") AS `Month`,
Sales.SaleID,
SUM(Sales.Total) AS `Total_Sales`
FROM Sales
WHERE SaleDate BETWEEN '2018-04-00' AND '2040-00-00'
GROUP BY
DATE_FORMAT(Sales.SaleDate, "%Y-%m"),
Sales.SaleID
),
cte2 AS
(SELECT *,
LAG(`Total_Sales`) OVER (PARTITION BY `SaleID`
ORDER BY `Month`) AS `Lag_Total_Sales`
FROM cte1)
SELECT `Month`, `SaleID`, `Total_Sales`,
CONCAT(
ROUND(
(`Total_Sales` - `Lag_Total_Sales`) / `Lag_Total_Sales` * 100
, 2)
, '%') AS `Growth`
FROM cte2
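As a small illustration of what the LAG step computes, here is a Python sketch (not MySQL) that walks monthly totals in order and compares each one with the previous row. The totals are made-up round numbers and growth_by_month is a hypothetical helper. The ratio is multiplied by 100 before rounding so the percentage keeps two decimals, as in the desired output in the question.

```python
# Hypothetical monthly totals, already aggregated and sorted by month.
monthly_totals = [("2018-04", 100.0), ("2018-05", 150.0), ("2018-06", 75.0)]


def growth_by_month(totals):
    """Return (month, total, growth %) using the previous row as baseline."""
    result = []
    previous = None  # plays the role of LAG(total): None for the first row
    for month, total in totals:
        if previous is None or previous == 0:
            growth = None  # no usable baseline to compare against
        else:
            growth = round((total - previous) / previous * 100, 2)
        result.append((month, total, growth))
        previous = total
    return result


report = growth_by_month(monthly_totals)
for month, total, growth in report:
    print(month, total, "n/a" if growth is None else f"{growth}%")
```

The first month has no previous row, so its growth is left empty here, just as LAG returns NULL for the first row of a partition.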
For MySQL 5.7 or earlier, consider a self-join of two subqueries that matches on SaleID and on the previous month, after normalizing all dates to the first day of their respective months:
SELECT DATE_FORMAT(curr.FirstMonth, "%Y-%m") AS `Month`,
curr.SaleID,
curr.Total_Sales,
CONCAT(
ROUND((`curr`.Total_Sales - `prev`.Total_Sales) / `prev`.Total_Sales * 100
, 2)
, '%') AS `Growth`
FROM
(SELECT DATE_ADD(LAST_DAY(DATE_SUB(SaleDate, INTERVAL 1 MONTH))
, INTERVAL 1 DAY) As FirstMonth,
SaleID,
SUM(`Total`) As `Total_Sales`
FROM Sales
GROUP BY
DATE_ADD(LAST_DAY(DATE_SUB(SaleDate, INTERVAL 1 MONTH))
, INTERVAL 1 DAY),
SaleID
) AS `curr`
LEFT JOIN
(SELECT DATE_ADD(LAST_DAY(DATE_SUB(SaleDate, INTERVAL 1 MONTH))
, INTERVAL 1 DAY) As FirstMonth,
SaleID,
SUM(`Total`) As `Total_Sales`
FROM Sales
GROUP BY
DATE_ADD(LAST_DAY(DATE_SUB(SaleDate,
INTERVAL 1 MONTH))
, INTERVAL 1 DAY),
SaleID
) AS `prev`
ON `curr`.SaleID = `prev`.SaleID
AND `curr`.FirstMonth - INTERVAL 1 MONTH = `prev`.FirstMonth
WHERE `curr`.FirstMonth BETWEEN '2018-04-00' AND '2040-00-00'

With MySQL, why is this date query showing incorrect results?

Sample table:
id | foreign_key_id | timestamp           | amt
-------------------------------------------------
1  | 223344         | 2018-06-01 09:22:31 | 3
2  | 233445         | 2018-06-15 23:22:31 | 2
2  | 233445         | 2018-06-30 23:22:31 | 5
3  | 334455         | 2018-07-01 12:22:31 | 1
3  | 334455         | 2018-07-15 12:22:31 | 1
4  | 344556         | 2018-07-31 20:22:31 | 2
And what I want is a grouped result of the total of amt per month, something like,
year | month | total_amt
------------------------
2018 | 6 | 10
2018 | 7 | 4
Which I thought would be easily enough achieved with a query like,
SELECT YEAR(timestamp) year, MONTH(timestamp) month, SUM(amt) total_amt
FROM sample_table
WHERE timestamp >= '2018-06-01'
AND timestamp <= '2018-07-31'
GROUP BY YEAR(timestamp), MONTH(timestamp)
Unfortunately, the result of this query is incorrect,
year | month | total_amt
------------------------
2018 | 6 | 10
2018 | 7 | 2
The amount for June is correct, but July is wrong.
This comes down to a misunderstanding of what timestamp holds and what it is being compared against.
The timestamp column is a date-time value, with both a date part and a time part. The string used in the query contains only the date part, so when MySQL compares the two it treats the missing time as midnight, in effect "zeroing out" the rest of the value.
So when the query is
...
AND timestamp <= '2018-07-31'
It basically turns into,
...
AND timestamp <= '2018-07-31 00:00:00'
And when you plug in the row that the query fails to match:
...
AND '2018-07-31 20:22:31' <= '2018-07-31 00:00:00'
Whether it is compared as a date or as a plain string, the missing row's timestamp is not less than or equal to the date passed in; it is later.
You have a few options to fix this. Use a full date-time value in the comparison, with the latest possible time of day:
...
AND timestamp <= '2018-07-31 23:59:59'
Change the operator to less-than and compare against the next day:
...
AND timestamp < '2018-08-01'
Or convert the timestamp from a date-time value to a date:
...
AND DATE(timestamp) <= '2018-07-31'
All three work on this data. Performance-wise, the first two compare the bare column and so can use an index on timestamp, while wrapping the column in DATE() generally prevents that; the less-than form also stays correct if the column ever stores fractional seconds.
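The effect is easy to reproduce outside MySQL. Here is a minimal Python sketch using the sample rows from the question, where a bare date behaves like midnight of that day, comparing the closed range from the original query with the half-open range. The monthly_totals helper is made up for illustration.

```python
from datetime import datetime

# Rows from the sample table: (timestamp, amt).
rows = [
    (datetime(2018, 6, 1, 9, 22, 31), 3),
    (datetime(2018, 6, 15, 23, 22, 31), 2),
    (datetime(2018, 6, 30, 23, 22, 31), 5),
    (datetime(2018, 7, 1, 12, 22, 31), 1),
    (datetime(2018, 7, 15, 12, 22, 31), 1),
    (datetime(2018, 7, 31, 20, 22, 31), 2),
]


def monthly_totals(rows, keep):
    """Sum amt per (year, month) for the rows accepted by keep(timestamp)."""
    totals = {}
    for ts, amt in rows:
        if keep(ts):
            key = (ts.year, ts.month)
            totals[key] = totals.get(key, 0) + amt
    return totals


start = datetime(2018, 6, 1)

# Closed range ending at midnight on 31 July: drops the 20:22:31 row.
closed = monthly_totals(rows, lambda ts: start <= ts <= datetime(2018, 7, 31))

# Half-open range ending before 1 August: keeps the whole of 31 July.
half_open = monthly_totals(rows, lambda ts: start <= ts < datetime(2018, 8, 1))

print(closed)
print(half_open)
```

The closed range reproduces the wrong July total of 2, while the half-open range gives the expected 4.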
Try this:
SELECT YEAR(timestamp) year, MONTH(timestamp) month, SUM(amt) total_amt
FROM sample_table
WHERE timestamp >= '2018-06-01'
AND timestamp < '2018-08-01'
GROUP BY YEAR(timestamp), MONTH(timestamp)
This will also work:
SELECT YEAR(timestamp) year, MONTH(timestamp) month, SUM(amt) total_amt
FROM sample_table
WHERE DATE(timestamp) >= '2018-06-01'
AND DATE(timestamp) <= '2018-07-31'
GROUP BY YEAR(timestamp), MONTH(timestamp);
OR
SELECT YEAR(timestamp) year, MONTH(timestamp) month, SUM(amt) total_amt
FROM sample_table
WHERE DATE(timestamp) BETWEEN '2018-06-01' AND '2018-07-31'
GROUP BY YEAR(timestamp), MONTH(timestamp);

Finding count for a Period in sql

I have a table with :
user_id | order_date
---------+------------
12 | 2014-03-23
12 | 2014-01-24
14 | 2014-01-26
16 | 2014-01-23
15 | 2014-03-21
20 | 2013-10-23
13 | 2014-01-25
16 | 2014-03-23
13 | 2014-01-25
14 | 2014-03-22
An active user is someone who has logged in within the last 12 months.
I need the output as:
Period | count of Active user
----------------------------
Oct-2013 - 1
Jan-2014 - 5
Mar-2014 - 10
(The Jan-2014 value includes the 1 record from Oct-2013 plus the 4 non-duplicate records for Jan-2014.)
You can use a variable to calculate the running total of active users:
SELECT Period,
@total:=@total+cnt AS `Count of Active Users`
FROM (
SELECT CONCAT(MONTHNAME(order_date), '-', YEAR(order_date)) AS Period,
COUNT(DISTINCT user_id) AS cnt
FROM mytable
GROUP BY Period
ORDER BY YEAR(order_date), MONTH(order_date) ) t,
(SELECT @total:=0) AS var
The subquery returns the number of distinct active users per month and year. The outer query uses the @total variable to calculate the running total of the active users' count.
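For comparison, here is the same running-total idea as a short Python sketch (not MySQL), using the rows from the question with order_date reduced to its year and month. The variable names are made up for illustration.

```python
from itertools import accumulate

# Rows from the question: (user_id, "YYYY-MM" of order_date).
orders = [
    (12, "2014-03"), (12, "2014-01"), (14, "2014-01"), (16, "2014-01"),
    (15, "2014-03"), (20, "2013-10"), (13, "2014-01"), (16, "2014-03"),
    (13, "2014-01"), (14, "2014-03"),
]

# Distinct users per period, like COUNT(DISTINCT user_id) ... GROUP BY Period.
users_by_period = {}
for user_id, period in orders:
    users_by_period.setdefault(period, set()).add(user_id)

periods = sorted(users_by_period)
distinct_counts = [len(users_by_period[p]) for p in periods]

# Running total over the ordered periods, like the user variable above.
running = list(accumulate(distinct_counts))

for period, total in zip(periods, running):
    print(period, total)
```

Note that a running sum counts a user again in every month they appear, so a user who ordered in both January and March contributes twice to the March figure.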
I've got two queries that do the job. I'm not sure which one is faster; check them against your database:
Query 1:
select per.yyyymm,
(select count(DISTINCT o.user_id) from orders o where o.order_date >=
(per.yyyymm - INTERVAL 1 YEAR) and o.order_date < per.yyyymm + INTERVAL 1 MONTH) as `count`
from
(select DISTINCT LAST_DAY(order_date) + INTERVAL 1 DAY - INTERVAL 1 MONTH as yyyymm
from orders) per
order by per.yyyymm
Results:
| yyyymm | count |
|---------------------------|-------|
| October, 01 2013 00:00:00 | 1 |
| January, 01 2014 00:00:00 | 5 |
| March, 01 2014 00:00:00 | 6 |
Query 2:
select DATE_FORMAT(order_date, '%Y-%m'),
(select count(DISTINCT o.user_id) from orders o where o.order_date >=
(LAST_DAY(o1.order_date) + INTERVAL 1 DAY - INTERVAL 13 MONTH) and
o.order_date <= LAST_DAY(o1.order_date)) as `count`
from orders o1
group by DATE_FORMAT(order_date, '%Y-%m')
Results:
| DATE_FORMAT(order_date, '%Y-%m') | count |
|----------------------------------|-------|
| 2013-10 | 1 |
| 2014-01 | 5 |
| 2014-03 | 6 |
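Both queries count distinct users inside a window that covers the month itself plus the twelve months before it. A rough Python equivalent of that window logic (not MySQL), using the rows from the question; active_users is a hypothetical helper:

```python
from datetime import date

# Rows from the question: (user_id, order_date).
orders = [
    (12, date(2014, 3, 23)), (12, date(2014, 1, 24)), (14, date(2014, 1, 26)),
    (16, date(2014, 1, 23)), (15, date(2014, 3, 21)), (20, date(2013, 10, 23)),
    (13, date(2014, 1, 25)), (16, date(2014, 3, 23)), (13, date(2014, 1, 25)),
    (14, date(2014, 3, 22)),
]


def active_users(orders, year, month):
    """Distinct users with an order in the window ending with the given month."""
    # Window start: first day of the same month one year earlier.
    window_start = date(year - 1, month, 1)
    # Window end (exclusive): first day of the following month.
    window_end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    return len({user for user, day in orders if window_start <= day < window_end})


periods = sorted({(d.year, d.month) for _, d in orders})
counts = {f"{y}-{m:02d}": active_users(orders, y, m) for y, m in periods}
print(counts)
```

Because the users are collected in a set per window, someone who ordered in several months is counted once, which is why March comes out as 6 here rather than as a running sum.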
The best thing I could do is this:
SELECT Date, COUNT(*) as ActiveUsers
FROM
(
SELECT DISTINCT user_id, CONCAT(YEAR(order_date), "-", MONTH(order_date)) as Date
FROM `a`
ORDER BY Date
)
AS `b`
GROUP BY Date
The output is the following:
| Date | ActiveUsers |
|---------|-------------|
| 2013-10 | 1 |
| 2014-1 | 4 |
| 2014-3 | 4 |
Now, for every row you need to add up the number of active users in the previous rows.
For example, here is the code in C#:
int total = 0;
while (reader.Read())
{
    // COUNT(*) may come back as a 64-bit integer, so convert rather than cast.
    total += Convert.ToInt32(reader["ActiveUsers"]);
    // Print the running total, not just this row's count.
    Console.WriteLine("{0} - {1} active users", reader["Date"], total);
}
By the way, for March 2014 the answer is 9 because one row is duplicated.
Try this, though it doesn't handle the last requirement (the Jan 2014 value including Oct 2013):
select TO_CHAR(order_dt,'MON-YYYY'), count(distinct User_ID ) cnt from [orders]
where User_ID in
(select User_ID from
(select a.User_ID from [orders] a,
(select a.User_ID,count (a.order_dt) from [orders] a
where a.order_dt > (select max(b.order_dt)-365 from [orders] b where a.User_ID=b.User_ID)
group by a.User_ID
having count(order_dt)>1) b
where a.User_ID=b.User_ID) a
)
group by TO_CHAR(order_dt,'MON-YYYY');
This is what I think you are looking for
SET @cnt = 0;
SELECT Period, @cnt := @cnt + total_active_users AS total_active_users
FROM (
SELECT DATE_FORMAT(order_date, '%b-%Y') AS Period , COUNT( id) AS total_active_users
FROM t
GROUP BY DATE_FORMAT(order_date, '%b-%Y')
ORDER BY order_date
) AS t
This is the output that I get
Period total_active_users
Oct-2013 1
Jan-2014 6
Mar-2014 10
You can also do COUNT(DISTINCT id) to get the unique Ids only

mysql grouping by days

I have this schema:
posts:
+----+-------+---------------------+
| ID | title | timestamp           |
+----+-------+---------------------+
| 1  | t1    | 2011-04-05 17:54:55 |
| 2  | t2    | 2011-04-06 09:10:11 |
| 3  | t3    | 2011-04-07 02:07:22 |
+----+-------+---------------------+
How can I get the total number of posts for the last 7 days, grouped like this:
Monday: 3
Tuesday: 9
Wednesday: 2
MySQL specific solutions:
SELECT WEEKDAY(timestamp_field) AS wd, count(*) FROM your_table GROUP BY wd;
or
SELECT count(*) FROM your_table GROUP BY WEEKDAY(timestamp_field);
Well,
you would have to select the date, make a count(*) and group by date.
SELECT date_format(TIMESTAMP, '%d %m')
, COUNT(*)
FROM posts
WHERE TIMESTAMP BETWEEN FROMDATE AND TODATE
GROUP BY date_format(TIMESTAMP, '%d %m')
further help and explanation:
MySQL Manual for DATE_FORMAT
EDIT:
Weekday can also be achieved with this function by using %W.
SELECT WEEKDAY(timestamp), count(*)
FROM POSTS as p1
WHERE DATE_SUB(NOW(), INTERVAL 7 DAY) < timestamp
GROUP BY YEAR(timestamp), MONTH(timestamp), DAY(timestamp)
ORDER BY YEAR(timestamp) desc, MONTH(timestamp) desc, DAY(timestamp) desc
Check out the date and time functions in MySQL.