MySQL total user count
Grouping by month
I want to list the total count of registered users grouped by month.
The difficulty is that I don't want the count per month,
but the total count of users up to (and including) that month.
User table structure
+---------------+--------------+------+-------------------+----------------+
| Field | Type | Null | Default | Extra |
+---------------+--------------+------+-------------------+----------------+
| ID | int(11) | NO | NULL | auto_increment |
| email | varchar(225) | NO | NULL | |
................................-CUT-.......................................
| registered | timestamp | NO | CURRENT_TIMESTAMP | |
+---------------+--------------+------+-------------------+----------------+
Example data
1 example1@mail 2012-04-04 xx:xx:xx
2 example2@mail 2012-05-04 xx:xx:xx
3 example3@mail 2012-05-04 xx:xx:xx
Preferred output
+------+-------+-------+
| Year | Month | Count |
+------+-------+-------+
| 2012 | 01 | 0 |
| 2012 | 02 | 0 |
| 2012 | 03 | 0 |
| 2012 | 04 | 1 |
| 2012 | 05 | 3 |
+------+-------+-------+
The NULL results aren't necessary.
How could I achieve that result in pure MySQL?
I have not tried this, but something along these lines should work:
SELECT tots.*, @var := @var + tots.`count`
FROM (
SELECT
YEAR(registered) AS `year`,
MONTH(registered) AS `month`,
COUNT(*) AS `count`
FROM user
GROUP BY `year`, `month`
) AS tots, (SELECT @var := 0) AS inc
You can do it with a couple of user variables:
set @c = 0;
set @d = 0;
select y, m, @d := @d + Count as Count from
(select year(registered) as y,
month(registered) as m,
@c := @c + count(*) as Count
from user
group by y,m) as t;
gives you
+------+------+-------+
| y | m | Count |
+------+------+-------+
| 2011 | 1 | 2455 |
| 2011 | 2 | 14253 |
| 2011 | 3 | 42311 |
This approach first computes the last second of each month in which any registration occurred (the column is aliased first_day_of_month, but the expression takes the first day of the month, adds one month and subtracts one second). It then joins to every user who registered at or before that instant, and counts those users.
SELECT
YEAR(dates.first_day_of_month) AS registration_year,
MONTH(dates.first_day_of_month) AS registration_month,
COUNT(u.ID)
FROM (
SELECT DISTINCT
DATE_SUB(
DATE_ADD(
DATE_SUB(registered,INTERVAL (DAY(registered)-1) DAY),
INTERVAL 1 MONTH),
INTERVAL 1 SECOND) first_day_of_month
FROM user
) dates
LEFT JOIN user u ON u.registered <= dates.first_day_of_month
GROUP BY dates.first_day_of_month
If you want to avoid the gaps in months where no registrations occurred, you could substitute the sub-query with another that used a pre-existing "numbers" table to get a list of all possible months.
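On MySQL 8.0 or later, a window function can replace the user variables. The sketch below is not one of the answers above. It runs the running-total pattern against an in-memory SQLite database from Python so that it can be executed as-is, which means it uses SQLite's strftime where MySQL would use YEAR() and MONTH(). The three rows are the example data from the question, with placeholder times assumed for the elided xx:xx:xx part.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (ID INTEGER PRIMARY KEY, email TEXT, registered TEXT)")
conn.executemany(
    "INSERT INTO user (email, registered) VALUES (?, ?)",
    [
        # Times are placeholders; the question elides them.
        ("example1@mail", "2012-04-04 00:00:00"),
        ("example2@mail", "2012-05-04 00:00:00"),
        ("example3@mail", "2012-05-04 00:00:00"),
    ],
)

query = """
SELECT y, m, SUM(cnt) OVER (ORDER BY y, m) AS total
FROM (
    -- registrations per month
    SELECT strftime('%Y', registered) AS y,
           strftime('%m', registered) AS m,
           COUNT(*) AS cnt
    FROM user
    GROUP BY y, m
) AS per_month
ORDER BY y, m
"""

# The window SUM adds up every monthly count up to and including the current month.
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Like the answers above, this only returns months that have at least one registration. The empty months still need a calendar or numbers table.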
Related
This is the table I am working with:
+---------------------+--------------+
| Field | Type |
+---------------------+--------------+
| ID | binary(17) |
| MiscSensor_ID | binary(17) |
| rawValue | varchar(100) |
| RawValueUnitType_ID | int |
| timestamp | timestamp |
+---------------------+--------------+
Now my goal is to implement an event which deletes all entries older than a month BUT for each week I want to leave one entry per MiscSensor_ID (the one with the lowest rawValue).
This is what I have so far:
CREATE EVENT delete_old_miscsensordatahistory
ON SCHEDULE EVERY 1 DAY
STARTS CURRENT_DATE + INTERVAL 1 DAY
DO
DELETE
FROM history
WHERE TIMESTAMPDIFF(DAY, timestamp,NOW()) > 31;
I need to do something like "delete if (value > minvalue)", grouped by MiscSensor_ID and by 7-day periods, but I am stuck on how to do that.
Any help would be much appreciated.
You can try using the ROW_NUMBER window function to match the rows which you don't want to delete. Records having row number equal to 1 will be those rows with the minimum "rawValue" for each combination of (week, sensorId).
WITH cte AS (
SELECT *, ROW_NUMBER() OVER(
PARTITION BY MiscSensor_ID, WEEK(timestamp)
ORDER BY rawValue ) AS rn
FROM history
WHERE TIMESTAMPDIFF(DAY, timestamp,NOW()) > 31
)
DELETE history
FROM history
INNER JOIN cte
ON history.ID = cte.ID
WHERE rn > 1;
This is how I implemented the event:
CREATE EVENT delete_old_miscsensordatahistory
ON SCHEDULE EVERY 1 DAY
STARTS CURRENT_DATE + INTERVAL 1 DAY
DO
WITH cte AS (
SELECT *, ROW_NUMBER() OVER(
PARTITION BY MiscSensor_ID, WEEK(timestamp)
ORDER BY CAST(rawValue AS SIGNED) ) AS rn
FROM MiscSensorDataHistory
WHERE TIMESTAMPDIFF(DAY, timestamp,NOW()) > 31
)
DELETE MiscSensorDataHistory
FROM MiscSensorDataHistory
INNER JOIN cte
ON cte.ID = MiscSensorDataHistory.ID
WHERE rn > 1
Testing my method I found out that there are still entries with the same MiscSensor_ID and less than 7 days apart:
| 0x3939333133303037343939353436393032 | 0x3439303031303031303730303030303535 | 554 | 30 | 2022-02-17 23:09:21 |
| 0x3939333133303037343939313631333039 | 0x3439303031303031303730303030303535 | 554 | 30 | 2022-02-06 16:52:48 |
| 0x3939333133303037343938383835353239 | 0x3439303031303031303730303030303535 | 553 | 30 | 2022-01-30 08:21:55 |
| 0x3939333133303037343938383639333436 | 0x3439303031303031303730303030303535 | 554 | 30 | 2022-01-29 22:48:06 |
| 0x3939333133303037343937303734353537 | 0x3439303031303031303730303030303535 | 444 | 30 | 2021-12-26 06:12:07 |
| 0x3939333133303037343937303530363738 | 0x3439303031303031303730303030303535 | 446 | 30 | 2021-12-25 21:53:03 |
| 0x3939333133303037343936333034343238 | 0x3439303031303031303730303030303535 | 0 | 30 | 2021-12-14 13:08:04 |
| 0x3939333133303037343935393934303832 | 0x3439303031303031303730303030303535 | 415 | 30 | 2021-12-08 12:56:43 |
Any suggestions would be much appreciated.
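One possible explanation, offered as a hedged sketch rather than a confirmed diagnosis: WEEK(timestamp) buckets rows by calendar week, not by distance between rows, so two entries on either side of a week boundary land in different partitions and both survive. Assuming the server's default_week_format is 0 (weeks start on Sunday), Python's %U directive follows the same numbering, so the close pairs from the listing above can be checked without a database:

```python
from datetime import datetime

# Timestamps copied from the remaining rows listed above.
timestamps = [
    "2022-01-29 22:48:06",
    "2022-01-30 08:21:55",
    "2021-12-25 21:53:03",
    "2021-12-26 06:12:07",
]


def sunday_week(stamp):
    """Week of the year with Sunday as the first day of the week (strftime %U)."""
    return int(datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S").strftime("%U"))


weeks = {stamp: sunday_week(stamp) for stamp in timestamps}
for stamp, week in weeks.items():
    print(stamp, "-> week", week)
```

Each pair is a Saturday followed by a Sunday, so the two rows fall into consecutive week numbers even though they are less than a day apart. Any fixed bucketing has this boundary effect. WEEK() also ignores the year, so the same week number from different years shares one partition unless the year is part of the PARTITION BY.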
In my MySQL database, I have 2 columns that store the start and end date of a process, respectively. I need to write a query that counts the number of rows per month for each column and presents the two counts separately.
Table example:
+----+-------------+----------------+
| id | startData | endData |
+----+-------------+----------------+
| 1 | 02/03/2020 | 02/03/2020 |
| 2 | 02/04/2020 | 02/04/2020 |
| 3 | 02/04/2020 | 02/05/2020 |
| 4 | 02/04/2020 | 02/05/2020 |
| 5 | 02/05/2020 | 02/06/2020 |
| 6 | 02/05/2020 | 02/06/2020 |
| 7 | 02/06/2020 | 02/07/2020 |
+----+-------------+----------------+
I want as a result:
+-------+--------------------+-------------------+
| month | count_month_start | count_month_end |
+-------+--------------------+-------------------+
| 03 | 01 | 01 |
| 04 | 03 | 01 |
| 05 | 02 | 02 |
| 06 | 01 | 02 |
| 07 | 00 | 01 |
+-------+--------------------+-------------------+
Assuming your start date and end date columns are of datatype date, you can do -
Select ifnull(Tb1.mn,Tb2.mn) As mn, ifnull(count_mn_start,0) As count_mn_start, ifnull(count_mn_end,0) As count_mn_end
from
(Select Month(StartDate) as mn, count(id) as count_mn_start
from
my_table
Group by Month(StartDate))Tb1
left Join (Select Month(EndDate) as mn, count(id) as count_mn_end
from my_table
Group by Month(EndDate)) Tb2
on Tb1.mn = Tb2.mn
UNION
Select ifnull(Tb1.mn,Tb2.mn) As mn, ifnull(count_mn_start,0) As count_mn_start, ifnull(count_mn_end,0) As count_mn_end
from
(Select Month(StartDate) as mn, count(id) as count_mn_start
from
my_table
Group by Month(StartDate))Tb1
Right Join (Select Month(EndDate) as mn, count(id) as count_mn_end
from my_table
Group by Month(EndDate)) Tb2
on Tb1.mn = Tb2.mn;
DB fiddle - https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=84ecddb9f5ed431ddff6a9eaab87e5df
PS: This works if your dates all fall within one year (2020 in your example). If the data spans several years, include the year in the output as well: add Year(datefield) to the select list and the group by of each sub-query (the same way Month is used above), and join on both year and month.
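A shorter way to get the same counts, swapped in here for the LEFT JOIN / RIGHT JOIN / UNION emulation of a full outer join, is to stack start events and end events with UNION ALL and sum two flag columns. The sketch below runs that idea on an in-memory SQLite database from Python; it assumes the dates are stored as ISO strings (the question shows them as dd/mm/yyyy) and uses SQLite's strftime where MySQL would use MONTH(). The flag column names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, startData TEXT, endData TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [
        # Rows from the question, rewritten from dd/mm/yyyy to ISO format.
        (1, "2020-03-02", "2020-03-02"),
        (2, "2020-04-02", "2020-04-02"),
        (3, "2020-04-02", "2020-05-02"),
        (4, "2020-04-02", "2020-05-02"),
        (5, "2020-05-02", "2020-06-02"),
        (6, "2020-05-02", "2020-06-02"),
        (7, "2020-06-02", "2020-07-02"),
    ],
)

query = """
SELECT mn,
       SUM(is_start) AS count_month_start,
       SUM(is_end)   AS count_month_end
FROM (
    -- one row per start date, flagged as a start
    SELECT strftime('%m', startData) AS mn, 1 AS is_start, 0 AS is_end FROM my_table
    UNION ALL
    -- one row per end date, flagged as an end
    SELECT strftime('%m', endData), 0, 1 FROM my_table
) AS events
GROUP BY mn
ORDER BY mn
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Because every month that appears in either column contributes at least one row to the stacked set, a month such as 07 with no starts still shows up with a start count of 0.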
A pretty simple way is to expand the time periods into days using a recursive CTE. Then just aggregate:
with recursive cte as (
select id, startdate as dte, enddate
from t
union all
select id, dte + interval 1 day, enddate
from cte
where dte < enddate
)
select year(dte), month(dte),
sum( day(dte) = 1 ) as cnt_start,
sum( day(dte) = day(last_day(dte)) ) as cnt_end
from cte
group by year(dte), month(dte) ;
Here is a db<>fiddle.
I have a table with 'ON' and 'OFF' values in column activity and another column datetime.
id(AUTOINCREMENT) id_device activity datetime
1 a ON 2017-05-26 22:00:00
2 b ON 2017-05-26 05:00:00
3 a OFF 2017-05-27 04:00:00
4 b OFF 2017-05-26 08:00:00
5 a ON 2017-05-28 12:00:00
6 a OFF 2017-05-28 15:00:00
I need to get total ON time by day
day id_device total_minutes_on
2017-05-26 a 120
2017-05-26 b 180
2017-05-27 a 240
2017-05-27 b 0
2017-05-28 a 180
2017-05-28 b 0
I have searched and tried answers from other posts. Using a time difference I can get the correct total time,
but I can't find a way to get the total time grouped by date.
I appreciate your help.
I'm not posting this as a definitive answer; it's more of an experiment for me, and hopefully you'll find it useful in your case. I would also like to mention that the MySQL version I'm working with is quite old, so the method I'm using is very manual, to say the least.
First of all, let's break down your expected output:
The date value in day needs to be repeated twice, once for each of id_device a and b.
Minutes are calculated based on the activity; if the activity is 'ON' into the next day, the minutes are counted until the day ends at 24:00:00, and the next day counts the minutes from 00:00:00 until the activity is 'OFF'.
What I came up with is this:
Creating condition (1):
SELECT * FROM
(SELECT DATE(datetime) dtt FROM mytable GROUP BY DATE(datetime)) a,
(SELECT id_device FROM mytable GROUP BY id_device) b
ORDER BY dtt,id_device;
The query above will return the following result:
+------------+-----------+
| dtt | id_device |
+------------+-----------+
| 2017-05-26 | a |
| 2017-05-26 | b |
| 2017-05-27 | a |
| 2017-05-27 | b |
| 2017-05-28 | a |
| 2017-05-28 | b |
+------------+-----------+
*The query above only returns dates that exist in the table. If you want every date regardless of whether there is any activity, I suggest you create a calendar table (refer: Generating a series of dates).
This becomes the base query. Then I've added an outer query that left joins the query above with the original data table:
SELECT v.*,
GROUP_CONCAT(w.activity ORDER BY w.datetime SEPARATOR ' ') activity,
GROUP_CONCAT(TIME_TO_SEC(TIME(w.datetime)) ORDER BY w.datetime SEPARATOR ' ') tr
FROM
-- this was the first query
(SELECT * FROM
(SELECT DATE(datetime) dtt FROM mytable GROUP BY DATE(datetime)) a,
(SELECT id_device FROM mytable GROUP BY id_device) b
ORDER BY a.dtt,b.id_device) v
--
LEFT JOIN
mytable w
ON v.dtt=DATE(w.datetime) AND v.id_device=w.id_device
GROUP BY DATE(v.dtt),v.id_device
What's new in the query is the GROUP_CONCAT operation on both activity and the time value extracted from the datetime column, which is converted into seconds. Note that both GROUP_CONCAT calls use the same ORDER BY condition; this is important so that each activity lines up with its corresponding time value.
The query above will return the following result:
+------------+-----------+----------+-------------+
| dtt | id_device | activity | tr |
+------------+-----------+----------+-------------+
| 2017-05-26 | a | ON | 79200 |
| 2017-05-26 | b | ON OFF | 18000 28800 |
| 2017-05-27 | a | OFF | 14400 |
| 2017-05-27 | b | (NULL) | (NULL) |
| 2017-05-28 | a | ON OFF | 43200 54000 |
| 2017-05-28 | b | (NULL) | (NULL) |
+------------+-----------+----------+-------------+
From here, I've added another query outside to calculate how many minutes and attempt to get the expected result:
SELECT dtt,id_device,
CASE
WHEN SUBSTRING_INDEX(activity,' ',1)='ON' AND SUBSTRING_INDEX(activity,' ',-1)='OFF'
THEN (SUBSTRING_INDEX(tr,' ',-1)-SUBSTRING_INDEX(tr,' ',1))/60
WHEN activity='ON' THEN 1440-(tr/60)
WHEN activity='OFF' THEN tr/60
WHEN activity IS NULL AND tr IS NULL THEN 0
END AS 'total_minutes_on'
FROM
-- from the last query
(SELECT v.*,
GROUP_CONCAT(w.activity ORDER BY w.datetime SEPARATOR ' ') activity,
GROUP_CONCAT(TIME_TO_SEC(TIME(w.datetime)) ORDER BY w.datetime SEPARATOR ' ') tr
FROM
-- this was the first query
(SELECT * FROM
(SELECT DATE(datetime) dtt FROM mytable GROUP BY DATE(datetime)) a,
(SELECT id_device FROM mytable GROUP BY id_device) b
ORDER BY a.dtt,b.id_device) v
--
LEFT JOIN
mytable w
ON v.dtt=DATE(w.datetime) AND v.id_device=w.id_device
GROUP BY DATE(v.dtt),v.id_device
--
) z
The last step works as follows. If a day has both an ON and an OFF value, the total is (OFF - ON) / 60 seconds = total minutes. If the day has only an ON, the device stays on until the day ends at 24:00:00, so the total is 1440 - (ON / 60), where 1440 = 24 hours * 60 minutes. If the day has only an OFF, I just convert its seconds to minutes, because the day starts at 00:00:00 anyway.
+------------+-----------+------------------+
| dtt | id_device | total_minutes_on |
+------------+-----------+------------------+
| 2017-05-26 | a | 120 |
| 2017-05-26 | b | 180 |
| 2017-05-27 | a | 240 |
| 2017-05-27 | b | 0 |
| 2017-05-28 | a | 180 |
| 2017-05-28 | b | 0 |
+------------+-----------+------------------+
Hopefully this will give you some ideas. ;)
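The CASE expression above covers days with a single ON, a single OFF, or one ON followed by one OFF. As a rough cross-check of the same midnight-splitting rule outside SQL, here is a minimal Python sketch (not part of the answer's query) that pairs each ON with the next OFF for the same device and cuts the interval at every midnight. The function and variable names are made up for illustration; the events are the rows from the question.

```python
from collections import defaultdict
from datetime import datetime, time, timedelta

# (id_device, activity, datetime) rows from the question.
events = [
    ("a", "ON", "2017-05-26 22:00:00"),
    ("b", "ON", "2017-05-26 05:00:00"),
    ("a", "OFF", "2017-05-27 04:00:00"),
    ("b", "OFF", "2017-05-26 08:00:00"),
    ("a", "ON", "2017-05-28 12:00:00"),
    ("a", "OFF", "2017-05-28 15:00:00"),
]


def minutes_on_per_day(events):
    """Return {(day, device): minutes} with ON intervals split at midnight."""
    totals = defaultdict(int)
    switched_on = {}  # device -> moment of its pending ON event
    for device, activity, stamp in sorted(events, key=lambda e: e[2]):
        moment = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if activity == "ON":
            switched_on[device] = moment
        elif device in switched_on:
            start = switched_on.pop(device)
            while start < moment:
                # Cut the interval at the next midnight so each piece stays within one day.
                midnight = datetime.combine(start.date() + timedelta(days=1), time.min)
                end = min(moment, midnight)
                minutes = int((end - start).total_seconds() // 60)
                totals[(start.date().isoformat(), device)] += minutes
                start = end
    return dict(totals)


totals = minutes_on_per_day(events)
for key in sorted(totals):
    print(key, totals[key])
```

Days on which a device was never on (device b on 2017-05-27 and 2017-05-28) are simply absent from the result; the cross join of dates and devices in the SQL above is what fills those in with 0.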
I am currently struggling with how to aggregate my daily data into other time aggregations (weeks, months, quarters, etc.).
Here is what my raw data looks like:
| date | traffic_type | visits |
|----------|--------------|---------|
| 20180101 | 1 | 1221650 |
| 20180101 | 2 | 411424 |
| 20180101 | 4 | 108407 |
| 20180101 | 5 | 298117 |
| 20180101 | 6 | 26806 |
| 20180101 | 7 | 12033 |
| 20180101 | 8 | 80368 |
| 20180101 | 9 | 69544 |
| 20180101 | 10 | 39919 |
| 20180101 | 11 | 26291 |
| 20180102 | 1 | 1218490 |
| 20180102 | 2 | 410965 |
| 20180102 | 4 | 108037 |
| 20180102 | 5 | 297727 |
| 20180102 | 6 | 26719 |
| 20180102 | 7 | 12019 |
| 20180102 | 8 | 80074 |
First, I would like to check the sum of visits regardless of traffic_type:
SELECT date, SUM(visits) as visits_per_day
FROM visits_tbl
GROUP BY date
Here is the outcome:
| ymd | visits_per_day |
|:--------:|:--------------:|
| 20180101 | 2294563 |
| 20180102 | 2289145 |
| 20180103 | 2300367 |
| 20180104 | 2310256 |
| 20180105 | 2368098 |
| 20180106 | 2372257 |
| 20180107 | 2373863 |
| 20180108 | 2364236 |
However, if I want to find the specific day on which visits_per_day was the highest for each time aggregation (e.g. month), I am struggling to retrieve the right output.
Here is what I did:
SELECT
(date div 100) as y_month, MAX(visits_per_day) as max_visit_per_day
FROM
(SELECT date, SUM(visits) as visits_per_day
FROM visits_tbl
GROUP BY date) as t1
GROUP BY
y_month
And here is the output of my query:
| y_month | max_visit_per_day |
|:-------:|:-----------------:|
| 201801 | 2435845 |
| 201802 | 2519000 |
| 201803 | 2528097 |
| 201804 | 2550645 |
However, this does not tell me the exact day on which visits_per_day was the highest.
Desired output:
| y_month | max_visit_per_day | ymd |
|:-------:|:-----------------:|:--------:|
| 201801 | 2435845 | 20180130 |
| 201802 | 2519000 | 20180220 |
| 201803 | 2528097 | 20180325 |
| 201804 | 2550645 | 20180406 |
ymd would represent the day in which the visits_per_day was the highest.
This logic would be used in a dashboard with the help of programming in order to automatically select the time aggregation.
Can someone please help me?
This is a job for the structured part of structured query language. That is, you will write some subqueries and treat them as tables.
You already know how to find the number of visits per day. Let's add the month for each day to that query (http://sqlfiddle.com/#!9/a8455e/13/0).
SELECT date DIV 100 as month, date,
SUM(visits) as visits
FROM visits_tbl
GROUP BY date
Next you need to find the largest number of daily visits in each month. (http://sqlfiddle.com/#!9/a8455e/12/0)
SELECT month, MAX(visits) max_daily_visits
FROM (
SELECT date DIV 100 as month, date,
SUM(visits) as visits
FROM visits_tbl
GROUP BY date
) dayvisits
GROUP BY month
Then, the trick is retrieving the date on which that maximum occurred in each month. That requires a join. Without common table expressions (which MySQL lacked before version 8.0) you need to repeat the first subquery. (http://sqlfiddle.com/#!9/a8455e/11/0)
SELECT detail.*
FROM (
SELECT month, MAX(visits) max_daily_visits
FROM (
SELECT date DIV 100 as month, date,
SUM(visits) as visits
FROM visits_tbl
GROUP BY date
) dayvisits
GROUP BY month
) maxvisits
JOIN (
SELECT date DIV 100 as month, date,
SUM(visits) as visits
FROM visits_tbl
GROUP BY date
) detail ON detail.visits = maxvisits.max_daily_visits
AND detail.month = maxvisits.month
The outline of this rather complex query helps explain it. Instead of that subquery, we'll use an imaginary table called dayvisits.
SELECT detail.*
FROM (
SELECT month, MAX(visits) max_daily_visits
FROM dayvisits
GROUP BY date DIV 100
) maxvisits
JOIN dayvisits detail ON detail.visits = maxvisits.max_daily_visits
AND detail.month = maxvisits.month
You're seeking an extreme value for each month in the subquery. (This is a fairly standard sort of SQL operation.) To do that you find that value with a MAX() ... GROUP BY query. Then you join that to the subquery itself to find the other values corresponding to the extreme value.
If you have common table expressions, the query looks like this. MySQL 8.0 and later support CTEs; on an older MySQL you might consider upgrading or adopting the MySQL fork called MariaDB, which has CTEs.
WITH dayvisits AS (
SELECT date DIV 100 as month, date,
SUM(visits) as visits
FROM visits_tbl
GROUP BY date
)
SELECT dayvisits.*
FROM (
SELECT month, MAX(visits) max_daily_visits
FROM dayvisits
GROUP BY month
) maxvisits
JOIN dayvisits ON dayvisits.visits = maxvisits.max_daily_visits
AND dayvisits.month = maxvisits.month
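Where window functions are available (MySQL 8.0 and later), ROW_NUMBER() is an alternative to the MAX() ... GROUP BY self-join. The sketch below is an illustration, not the answer's method. It runs on an in-memory SQLite database from Python, uses a few made-up rows rather than the question's data, and writes SQLite's integer division date / 100 where MySQL uses date DIV 100.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits_tbl (date INTEGER, traffic_type INTEGER, visits INTEGER)")
conn.executemany(
    "INSERT INTO visits_tbl VALUES (?, ?, ?)",
    [
        # Illustrative values only, not taken from the question.
        (20180101, 1, 100),
        (20180101, 2, 50),
        (20180102, 1, 200),
        (20180201, 1, 30),
        (20180202, 1, 10),
        (20180202, 2, 40),
    ],
)

query = """
WITH dayvisits AS (
    -- total visits per day, tagged with the yyyymm month
    SELECT date / 100 AS month, date, SUM(visits) AS visits
    FROM visits_tbl
    GROUP BY date
),
ranked AS (
    -- rank the days of each month from busiest to quietest
    SELECT month, date, visits,
           ROW_NUMBER() OVER (PARTITION BY month ORDER BY visits DESC, date) AS rn
    FROM dayvisits
)
SELECT month, visits, date
FROM ranked
WHERE rn = 1
ORDER BY month
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

One behavioural difference: when two days tie for the monthly maximum, the join version returns both days, while ROW_NUMBER() keeps one (here the earlier date, because of the secondary ORDER BY on date).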
[Query checked on MSSQL] It's quick and efficient.
select visit_sum_day_wise.date
, visit_sum_day_wise.Max_Visits
, visit_sum_day_wise.traffic_type
, LAST_VALUE(visit_sum_day_wise.visits) OVER(PARTITION BY
visit_sum_day_wise.date ORDER BY visit_sum_day_wise.date ROWS BETWEEN
UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING ) AS max_visit_per_day
from (
select visits_tbl.date , visits_tbl.visits , visits_tbl.traffic_type
,max(visits_tbl.visits ) OVER ( PARTITION BY visits_tbl.date ORDER
BY visits_tbl.date ROWS BETWEEN UNBOUNDED PRECEDING AND 0
PRECEDING) Max_visits
from visits_tbl
) as visit_sum_day_wise
where visit_sum_day_wise.visits = (select max(visits_B.visits ) from
visits_tbl visits_B where visits_B.Date = visit_sum_day_wise.date )
I have this database:
| id | name | email | control_number | created |
|:--:|-------|-----------------|----------------|------------|
| 1 | john | john@gmail.com | 1 | 14/09/2016 |
| 2 | carl | carl@gmail.com | 1 | 13/08/2016 |
| 3 | frank | frank@gmail.com | 2 | 12/08/2016 |
I want to get the COUNT for the last 12 months by control_number.
Basically it is a COUNT where control_number = 1, but per month.
So if the query is run today (it's September), it should cover September 2016 back to October 2015 and display the count of records for each month.
Result should be:
09/2016 = 50
08/2016 = 35
07/2016 = 20
06/2016 = 50
05/2016 = 21
04/2016 = 33
03/2016 = 60
02/2016 = 36
01/2016 = 11
12/2015 = 0
11/2015 = 0
10/2015 = 0
Hmmm. Getting the 0 values can be tricky. Assuming that you have some data each month (even if not for "1"), then you can do:
select extract(year_month from created) as yyyymm,
sum(control_number = 1)
from t
where created >= date_sub(curdate(), interval 12 month)
group by extract(year_month from created)
order by yyyymm;
If you don't have at least one record for each month, then you'll need a left join and a table with one row per month.
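The "one row per month" idea can be sketched outside SQL as follows: build the list of the last 12 months first, then look up each month's count and fall back to 0, which is what the left join against a month table would do. This is a minimal Python illustration with hypothetical helper names, using the three rows from the question and an assumed "today" in September 2016.

```python
from collections import Counter
from datetime import date


def last_12_months(today):
    """Return (year, month) pairs from the current month back 11 months."""
    year, month = today.year, today.month
    months = []
    for _ in range(12):
        months.append((year, month))
        month -= 1
        if month == 0:
            year, month = year - 1, 12
    return months


def monthly_counts(rows, control_number, today):
    """Count rows per month for one control_number, keeping empty months as 0."""
    counts = Counter(
        (created.year, created.month)
        for created, number in rows
        if number == control_number
    )
    return [
        (f"{month:02d}/{year}", counts.get((year, month), 0))
        for year, month in last_12_months(today)
    ]


# (created, control_number) pairs from the question's table.
rows = [
    (date(2016, 9, 14), 1),
    (date(2016, 8, 13), 1),
    (date(2016, 8, 12), 2),
]

# The reference date is an assumption; the question only says "today, September".
result = monthly_counts(rows, control_number=1, today=date(2016, 9, 20))
for label, count in result:
    print(label, "=", count)
```

In SQL the month list would come from a calendar table or a generated series, with the counts left-joined onto it.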
Try this:
select CONCAT(SUBSTRING(ym, 5, 2), '/', SUBSTRING(ym, 1, 4)) Month, Count from (
select EXTRACT(YEAR_MONTH FROM created) ym, count(*) Count
from mytable
where EXTRACT(YEAR_MONTH FROM created) > EXTRACT(YEAR_MONTH FROM SUBDATE(NOW(), INTERVAL 1 YEAR))
group by 1
order by 1 desc) x
Try:
select concat(month(created),'/',year(created)) as period, count(*) as cnt
from mytable
where control_number=1 and TIMESTAMPDIFF(year, created, now())=0
group by (month(created));