Count of every day (MySQL)

I want to get something like this.
MySQL data:
(dat_reg)
1.1.2000
1.1.2000
1.1.2000
2.1.2000
2.1.2000
3.1.2000
I want to get a running (cumulative) count per day:
(dat_reg) (count)
1.1.2000 - 3
2.1.2000 - 5
3.1.2000 - 6
What I tried is this:
SELECT COUNT( * ) as a , DATE_FORMAT( dat_reg, '%d.%m.%Y' ) AS dat
FROM members
WHERE (dat_reg > DATE_SUB(NOW() , INTERVAL 5 DAY))
GROUP BY DATE_FORMAT(dat_reg, '%d.%m.%Y')
ORDER BY dat_reg
but I get:
1.1.2000 - 3 | 2.1.2000 - 2 | 3.1.2000 - 1
Any tips on how to write a query for this?

I would suggest using variables in MySQL:
SELECT d.*, (@sumc := @sumc + cnt) as running_cnt
FROM (SELECT DATE_FORMAT(dat_reg, '%d.%m.%Y') as dat, COUNT(*) as cnt
      FROM members
      WHERE dat_reg > DATE_SUB(NOW() , INTERVAL 5 DAY)
      GROUP BY dat
      ORDER BY dat_reg
     ) d CROSS JOIN
     (SELECT @sumc := 0) params;
If you want a cumulative count from the beginning of time, then you need an additional subquery:
SELECT d.*
FROM (SELECT d.*, (@sumc := @sumc + cnt) as running_cnt
      FROM (SELECT DATE_FORMAT(dat_reg, '%d.%m.%Y') as dat, dat_reg, COUNT(*) as cnt
            FROM members
            GROUP BY dat
            ORDER BY dat_reg
           ) d CROSS JOIN
           (SELECT @sumc := 0) params
     ) d
WHERE dat_reg > DATE_SUB(NOW() , INTERVAL 5 DAY)
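For readers who want to check the logic outside the database, here is a minimal Python sketch of the same idea: count per day, sort by day, then accumulate. The function name `running_counts` and the sample list are illustrative assumptions, not part of the original answer; the dates are the ones from the question.

```python
from collections import Counter
from datetime import date
from itertools import accumulate

# Sample registration dates taken from the question (dat_reg column).
registrations = [
    date(2000, 1, 1), date(2000, 1, 1), date(2000, 1, 1),
    date(2000, 1, 2), date(2000, 1, 2),
    date(2000, 1, 3),
]

def running_counts(dates):
    """Return (day, cumulative count) pairs in chronological order."""
    per_day = Counter(dates)                        # GROUP BY day, COUNT(*)
    days = sorted(per_day)                          # ORDER BY day
    totals = accumulate(per_day[d] for d in days)   # running sum of the counts
    return list(zip(days, totals))

for day, total in running_counts(registrations):
    print(f"{day.day}.{day.month}.{day.year} - {total}")
```

With the question's data this prints 3, 5 and 6 for the three days, which is the desired output.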

A subquery counting the rows where the registration date is less than or equal to the current registration date could help you out.
SELECT m2.dat_reg,
       (SELECT count(*)
        FROM members m3
        WHERE m3.dat_reg <= m2.dat_reg) count
FROM (SELECT DISTINCT m1.dat_reg
      FROM members m1
      WHERE m1.dat_reg > date_sub(now(), INTERVAL 5 DAY)) m2
ORDER BY m2.dat_reg;
(If there are days on which no one registered and you don't want gaps in the result, replace the subquery aliased m2 with a table or subquery that contains every day in the respective range.)
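The gap-filling remark can be illustrated with a small Python sketch. It walks every calendar day in a range and counts the rows registered on or before that day, so days without registrations still appear. The function name `cumulative_by_day` and the sample data are hypothetical, and the sketch assumes `dat_reg` holds plain dates rather than timestamps.

```python
from bisect import bisect_right
from datetime import date, timedelta

def cumulative_by_day(dates, first_day, last_day):
    """Cumulative registrations for every calendar day in [first_day, last_day]."""
    ordered = sorted(dates)
    result = []
    day = first_day
    while day <= last_day:
        # bisect_right counts the rows with dat_reg <= day,
        # the same condition as the correlated subquery.
        result.append((day, bisect_right(ordered, day)))
        day += timedelta(days=1)
    return result

# Hypothetical data with a gap: nobody registered on 2 January.
sample = [date(2000, 1, 1)] * 3 + [date(2000, 1, 3)]
for day, total in cumulative_by_day(sample, date(2000, 1, 1), date(2000, 1, 4)):
    print(day.isoformat(), total)
```

The day without registrations simply repeats the previous total instead of disappearing from the result.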

I believe you can use the window functions to do the work:
mysql> SELECT employee, sale, date, SUM(sale) OVER (PARTITION by employee ORDER BY date) AS cum_sales FROM sales;
+----------+------+------------+-----------+
| employee | sale | date | cum_sales |
+----------+------+------------+-----------+
| odin | 200 | 2017-03-01 | 200 |
| odin | 300 | 2017-04-01 | 500 |
| odin | 400 | 2017-05-01 | 900 |
| thor | 400 | 2017-03-01 | 400 |
| thor | 300 | 2017-04-01 | 700 |
| thor | 500 | 2017-05-01 | 1200 |
+----------+------+------------+-----------+
In your case you already have the right groups; it is only a matter of specifying the order in which you want the data to be aggregated.
Source: https://mysqlserverteam.com/mysql-8-0-2-introducing-window-functions/
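As a rough illustration of what SUM(...) OVER (PARTITION BY ... ORDER BY ...) computes, here is a Python sketch that keeps one running total per partition. The rows are the ones from the example output above; the function name `cumulative_sales` is an assumption made for illustration, and the sketch assumes each employee has at most one row per date (the SQL default frame would treat equal dates as peers).

```python
from collections import defaultdict

# Rows from the example output: (employee, sale, date).
sales = [
    ("odin", 200, "2017-03-01"),
    ("odin", 300, "2017-04-01"),
    ("odin", 400, "2017-05-01"),
    ("thor", 400, "2017-03-01"),
    ("thor", 300, "2017-04-01"),
    ("thor", 500, "2017-05-01"),
]

def cumulative_sales(rows):
    """Running sum of sales per employee, ordered by date within each employee."""
    running = defaultdict(int)          # one accumulator per partition
    result = []
    for employee, sale, day in sorted(rows, key=lambda r: (r[0], r[2])):
        running[employee] += sale
        result.append((employee, sale, day, running[employee]))
    return result

for row in cumulative_sales(sales):
    print(row)
```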
Cheers

Here is a solution using ROW_NUMBER() and a running count variable:
WITH ranked AS (
    SELECT m.*
        ,ROW_NUMBER() OVER (PARTITION BY m.dat_reg ORDER BY m.id DESC) AS rn
    FROM (
        select id, dat_reg
            ,@cnt := @cnt + 1 AS ccount from members
            ,(SELECT @cnt := 0) var
        WHERE (dat_reg > DATE_SUB(NOW(), INTERVAL 5 DAY))
    ) AS m
)
SELECT DATE_FORMAT(dat_reg, '%d.%m.%Y') as dat, ccount FROM ranked WHERE rn = 1;

Related

How to count concurrent bookings in SQL per minute within a time interval?

If I have a start and stop time for each booking, how can I calculate the number of bookings there are each minute? A simplified version of my database table looks like this:
Start time | End time | booking |
--------------------------------------------------
2020-09-01 10:00 | 2020-09-01 10:10 | Booking 1 |
2020-09-01 10:00 | 2020-09-01 10:05 | Booking 2 |
2020-09-01 10:05 | 2020-09-01 10:10 | Booking 3 |
2020-09-01 10:09 | 2020-09-01 10:10 | Booking 4 |
I want to have the bookings between a given time interval like 10:02 - 10:09. It should be something like this as result:
Desired result
Time | count
-----------
10:02 | 2 |
10:03 | 2 |
10:04 | 2 |
10:05 | 3 |
10:06 | 2 |
10:07 | 2 |
10:08 | 2 |
10:09 | 3 |
Question
How can this be achieved? Today I export the data to Python, but I think it should be possible to do this directly in SQL.
You can use a recursive CTE directly on your data:
with recursive cte as (
select start_time, end_time
from t
union all
select start_time + interval 1 minute, end_time
from cte
where start_time < end_time
)
select start_time, count(*)
from cte
group by start_time
order by start_time;
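To see what the recursive CTE produces, here is a minimal Python sketch of the same expansion: every booking is unrolled into one entry per minute from its start to its end (inclusive, as in the CTE), and the entries are counted per minute. The function name `bookings_per_minute` and the window parameters are illustrative assumptions; the bookings are the four rows from the question.

```python
from collections import Counter
from datetime import datetime, timedelta

# (start, end) pairs for Booking 1..4 from the question.
bookings = [
    (datetime(2020, 9, 1, 10, 0), datetime(2020, 9, 1, 10, 10)),
    (datetime(2020, 9, 1, 10, 0), datetime(2020, 9, 1, 10, 5)),
    (datetime(2020, 9, 1, 10, 5), datetime(2020, 9, 1, 10, 10)),
    (datetime(2020, 9, 1, 10, 9), datetime(2020, 9, 1, 10, 10)),
]

def bookings_per_minute(rows, window_start, window_end):
    """Count bookings per minute, keeping only minutes inside the window."""
    counts = Counter()
    step = timedelta(minutes=1)
    for start, end in rows:
        minute = start
        while minute <= end:            # the end minute is included, as in the CTE
            if window_start <= minute <= window_end:
                counts[minute] += 1
            minute += step
    return sorted(counts.items())

result = bookings_per_minute(
    bookings,
    datetime(2020, 9, 1, 10, 2),
    datetime(2020, 9, 1, 10, 9),
)
for minute, count in result:
    print(minute.strftime("%H:%M"), count)
```

For the 10:02 to 10:09 window this gives 2, 2, 2, 3, 2, 2, 2, 3, which matches the desired result in the question.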
EDIT:
In earlier versions of MySQL, it helps to have a tally table. You can create one on the fly, using something like:
(select @rn := @rn + 1 as n
 from t cross join
      (select @rn := 0) params
) tally
You need enough numbers for your maximum span, but then you can do:
select t.start_time + interval tally.n minute, count(*)
from t join
     (select @rn := @rn + 1 as n
      from t cross join
           (select @rn := -1) params -- so it starts from 0
      limit 100
     ) tally
     on t.start_time + interval tally.n minute <= t.end_time
group by t.start_time + interval tally.n minute;
You can use a recursive query to generate the timestamp range, then unpivot the table and join:
-- a full day needs 1440 iterations; raise cte_max_recursion_depth (default 1000) if required
with recursive dates (ts) as (
    select cast('2020-09-01' as datetime)
    union all
    select ts + interval 1 minute
    from dates
    where ts + interval 1 minute < '2020-09-02'
)
select d.ts, coalesce(sum(t.cnt), 0) cnt
from dates d
left join (
    select start_time ts, 1 cnt from mytable
    union all select end_time, -1 from mytable
) t on t.ts <= d.ts
group by d.ts
order by d.ts
If you are going to run this repeatedly and/or against large time periods, you would be better off materializing the date ranges in a calendar table rather than using a recursive query. The calendar table has one row per minute over a large period of dates. Assuming a table called date_calendar, you would do:
select d.ts, coalesce(sum(t.cnt), 0) cnt
from date_calendar d
left join (
    select start_time ts, 1 cnt from mytable
    union all select end_time, -1 from mytable
) t on t.ts <= d.ts
where d.ts >= '2020-09-01' and d.ts < '2020-09-02'
group by d.ts
order by d.ts
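The unpivot idea (a +1 event at each start and a -1 event at each end, summed up to each minute) can be sketched in Python as follows. The function name `active_per_minute` is an assumption for illustration. Note that with a -1 at `end_time` a booking no longer counts in its end minute, so the numbers differ from an inclusive-end count.

```python
from datetime import datetime, timedelta

# (start, end) pairs for Booking 1..4 from the question.
bookings = [
    (datetime(2020, 9, 1, 10, 0), datetime(2020, 9, 1, 10, 10)),
    (datetime(2020, 9, 1, 10, 0), datetime(2020, 9, 1, 10, 5)),
    (datetime(2020, 9, 1, 10, 5), datetime(2020, 9, 1, 10, 10)),
    (datetime(2020, 9, 1, 10, 9), datetime(2020, 9, 1, 10, 10)),
]

def active_per_minute(rows, window_start, window_end):
    """Sum the +1/-1 events that happened at or before each minute in the window."""
    events = sorted([(s, 1) for s, _ in rows] + [(e, -1) for _, e in rows])
    result = []
    active = 0
    i = 0
    minute = window_start
    while minute <= window_end:
        # Apply every event with a timestamp <= the current minute.
        while i < len(events) and events[i][0] <= minute:
            active += events[i][1]
            i += 1
        result.append((minute, active))
        minute += timedelta(minutes=1)
    return result

sweep = active_per_minute(
    bookings,
    datetime(2020, 9, 1, 10, 2),
    datetime(2020, 9, 1, 10, 9),
)
for minute, count in sweep:
    print(minute.strftime("%H:%M"), count)
```

Because the events are sorted once and consumed in order, the work grows with the number of events plus the number of minutes, rather than with their product.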

MySQL: How to get the active members by month in a year

I took the members who were working in the previous year and subtracted the employees relieved in the previous year, then took the list of employees relieved in the previous month and subtracted it from the result set. Then I added the members newly added in the current month.
SQL Fiddle Link
I sense that there are a lot of improvements we could make to the current query, but right now I am out of ideas. Can someone kindly help with this?
If I have interpreted your existing query correctly, I suggest the following:
select
mnth.num, count(*)
from (
select 1 AS num union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all
select 7 union all select 8 union all select 9 union all select 10 union all select 11 union all select 12
) mnth
left join (
select
e.emp_id
, case
when e.hired_date < date_format(current_date(), '%Y-01-01') then 1
else month(e.hired_date)
end AS start_month
, case
when es.relieving_date < date_format(current_date(), '%Y-01-01') then 0
when es.relieving_date >= date_format(current_date(), '%Y-01-01') then month(es.relieving_date)
else month(current_date())
end AS end_month
from employee e
left join employee_separation es on e.emp_id = es.emp_id
) emp on mnth.num between emp.start_month and emp.end_month
where mnth.num <= month(current_date())
group by
mnth.num
;
This produced the following result (with current_date() on Nov 21 2017):
| num | count(*) |
|-----|----------|
| 1 | 6 |
| 2 | 7 |
| 3 | 8 |
| 4 | 9 |
| 5 | 10 |
| 6 | 9 |
| 7 | 10 |
| 8 | 11 |
| 9 | 12 |
| 10 | 13 |
| 11 | 14 |
Depending on data volumes, adding a where clause in the emp subquery may help; this also affects a case expression:
, case
when es.relieving_date >= date_format(current_date(), '%Y-01-01') then month(es.relieving_date)
else month(current_date())
end AS end_month
from employee e
left join employee_separation es on e.emp_id = es.emp_id
where es.relieving_date >= date_format(current_date(), '%Y-01-01')
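The start_month / end_month logic of the main query can be sketched in Python like this. Each employee is mapped to the range of months of the current year in which they were active, and the months are counted. The function name `active_by_month` and the sample employees are hypothetical, and the sketch assumes, like the query, that no hire or relieving date lies beyond the current year.

```python
from datetime import date

def active_by_month(employees, today):
    """Count active employees for each month of today's year, up to today's month.

    employees: iterable of (hired_date, relieving_date or None).
    """
    year_start = date(today.year, 1, 1)
    counts = {month: 0 for month in range(1, today.month + 1)}
    for hired, relieved in employees:
        start = 1 if hired < year_start else hired.month
        if relieved is None:
            end = today.month           # still employed
        elif relieved < year_start:
            end = 0                     # left before this year: never active
        else:
            end = relieved.month
        for month in range(start, min(end, today.month) + 1):
            counts[month] += 1
    return counts

# Hypothetical sample data.
staff = [
    (date(2015, 3, 1), None),
    (date(2017, 2, 10), None),
    (date(2016, 5, 1), date(2017, 6, 15)),
    (date(2014, 1, 1), date(2016, 12, 31)),
]
by_month = active_by_month(staff, date(2017, 11, 21))
print(by_month)
```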
I think what you need to do is to get all the employees who are already working from the employee table with:
SELECT * FROM employee WHERE hired_date<= CURRENT_DATE;
Then get the list of employees whose relieving date is still in the future using:
SELECT * FROM employee_separation WHERE relieving_date > CURRENT_DATE;
Then join the two results and group by the month and year of the relieving date, as shown below:
SELECT DATE_FORMAT(B.relieving_date, "%Y-%M") RELIEVING_DATE,
       COUNT(*) NUMBER_OF_ACTIVE_MEMBERS
FROM (SELECT * FROM employee WHERE hired_date <= CURRENT_DATE) A
INNER JOIN (SELECT * FROM employee_separation WHERE relieving_date > CURRENT_DATE) B
        ON A.emp_id = B.emp_id
GROUP BY DATE_FORMAT(B.relieving_date, "%Y-%M");

Finding count for a Period in sql

I have a table with:
user_id | order_date
---------+------------
12 | 2014-03-23
12 | 2014-01-24
14 | 2014-01-26
16 | 2014-01-23
15 | 2014-03-21
20 | 2013-10-23
13 | 2014-01-25
16 | 2014-03-23
13 | 2014-01-25
14 | 2014-03-22
An active user is someone who has logged in within the last 12 months.
I need output like this:
Period | count of Active user
----------------------------
Oct-2013 - 1
Jan-2014 - 5
Mar-2014 - 10
(The Jan-2014 value includes the 1 record from Oct-2013 and the 4 non-duplicate records for Jan-2014.)
You can use a variable to calculate the running total of active users:
SELECT Period,
       @total:=@total+cnt AS `Count of Active Users`
FROM (
    SELECT CONCAT(MONTHNAME(order_date), '-', YEAR(order_date)) AS Period,
           COUNT(DISTINCT user_id) AS cnt
    FROM mytable
    GROUP BY Period
    ORDER BY YEAR(order_date), MONTH(order_date) ) t,
    (SELECT @total:=0) AS var
The subquery returns the number of distinct active users per Month/Year. The outer query uses the @total variable in order to calculate the running total of active users' count.
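As a cross-check, the same computation can be sketched in Python on the question's rows: collect the distinct users per month, then accumulate the per-month counts. The function name `running_active_users` is an illustrative assumption. With distinct users per month the March 2014 total comes out as 9, not the 10 shown in the question, because the duplicated January row is counted once.

```python
from collections import defaultdict
from datetime import date
from itertools import accumulate

# (user_id, order_date) rows from the question.
orders = [
    (12, date(2014, 3, 23)), (12, date(2014, 1, 24)), (14, date(2014, 1, 26)),
    (16, date(2014, 1, 23)), (15, date(2014, 3, 21)), (20, date(2013, 10, 23)),
    (13, date(2014, 1, 25)), (16, date(2014, 3, 23)), (13, date(2014, 1, 25)),
    (14, date(2014, 3, 22)),
]

def running_active_users(rows):
    """Return ((year, month), running total of distinct users per month)."""
    users_by_month = defaultdict(set)
    for user_id, order_date in rows:
        users_by_month[(order_date.year, order_date.month)].add(user_id)
    months = sorted(users_by_month)                 # chronological order
    totals = accumulate(len(users_by_month[m]) for m in months)
    return list(zip(months, totals))

for (year, month), total in running_active_users(orders):
    print(f"{year}-{month:02d}: {total}")
```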
I've got two queries that do the job. I am not sure which one is faster. Check them against your database:
Query 1:
select per.yyyymm,
(select count(DISTINCT o.user_id) from orders o where o.order_date >=
(per.yyyymm - INTERVAL 1 YEAR) and o.order_date < per.yyyymm + INTERVAL 1 MONTH) as `count`
from
(select DISTINCT LAST_DAY(order_date) + INTERVAL 1 DAY - INTERVAL 1 MONTH as yyyymm
from orders) per
order by per.yyyymm
Results:
| yyyymm | count |
|---------------------------|-------|
| October, 01 2013 00:00:00 | 1 |
| January, 01 2014 00:00:00 | 5 |
| March, 01 2014 00:00:00 | 6 |
Query 2:
select DATE_FORMAT(order_date, '%Y-%m'),
(select count(DISTINCT o.user_id) from orders o where o.order_date >=
(LAST_DAY(o1.order_date) + INTERVAL 1 DAY - INTERVAL 13 MONTH) and
o.order_date <= LAST_DAY(o1.order_date)) as `count`
from orders o1
group by DATE_FORMAT(order_date, '%Y-%m')
Results:
| DATE_FORMAT(order_date, '%Y-%m') | count |
|----------------------------------|-------|
| 2013-10 | 1 |
| 2014-01 | 5 |
| 2014-03 | 6 |
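The windowed count in Query 1 can be sketched in Python as follows. For every month that has orders, it counts the distinct users who ordered between the same month one year earlier and the end of that month. The function name `active_in_trailing_year` is an assumption for illustration; the rows are the ones from the question.

```python
from datetime import date

# (user_id, order_date) rows from the question.
orders = [
    (12, date(2014, 3, 23)), (12, date(2014, 1, 24)), (14, date(2014, 1, 26)),
    (16, date(2014, 1, 23)), (15, date(2014, 3, 21)), (20, date(2013, 10, 23)),
    (13, date(2014, 1, 25)), (16, date(2014, 3, 23)), (13, date(2014, 1, 25)),
    (14, date(2014, 3, 22)),
]

def active_in_trailing_year(rows):
    """For each month with orders, count distinct users in the trailing window."""
    month_starts = sorted({d.replace(day=1) for _, d in rows})
    result = []
    for start in month_starts:
        window_start = start.replace(year=start.year - 1)
        # First day of the following month (rolls over to January after December).
        window_end = date(start.year + start.month // 12, start.month % 12 + 1, 1)
        users = {u for u, d in rows if window_start <= d < window_end}
        result.append((start, len(users)))
    return result

for month_start, count in active_in_trailing_year(orders):
    print(month_start.strftime("%Y-%m"), count)
```

On the question's data this yields 1, 5 and 6, the same counts as the results shown above.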
The best thing I could do is this:
SELECT Date, COUNT(*) as ActiveUsers
FROM
(
    SELECT DISTINCT user_id, CONCAT(YEAR(order_date), "-", MONTH(order_date)) as Date
    FROM `a`
    ORDER BY Date
)
AS `b`
GROUP BY Date
The output is the following:
| Date | ActiveUsers |
|---------|-------------|
| 2013-10 | 1 |
| 2014-1 | 4 |
| 2014-3 | 4 |
Now, for every row you need to sum up the number of active users in previous rows.
For example, here is the code in C#.
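int total = 0;
while (reader.Read())
{
    // COUNT(*) may come back as a 64-bit value, so convert rather than cast.
    total += Convert.ToInt32(reader["ActiveUsers"]);
    // Print the running total, not the per-month count.
    Console.WriteLine("{0} - {1} active users", reader["Date"], total);
}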
By the way, for March 2014 the answer is 9, because one row is duplicated.
Try this, but this doesn't handle the last part (the Jan 2014 value includes Oct-2013):
select TO_CHAR(order_dt,'MON-YYYY'), count(distinct User_ID ) cnt from [orders]
where User_ID in
(select User_ID from
(select a.User_ID from [orders] a,
(select a.User_ID,count (a.order_dt) from [orders] a
where a.order_dt > (select max(b.order_dt)-365 from [orders] b where a.User_ID=b.User_ID)
group by a.User_ID
having count(order_dt)>1) b
where a.User_ID=b.User_ID) a
)
group by TO_CHAR(order_dt,'MON-YYYY');
This is what I think you are looking for
SET @cnt = 0;
SELECT Period, @cnt := @cnt + total_active_users AS total_active_users
FROM (
    SELECT DATE_FORMAT(order_date, '%b-%Y') AS Period, COUNT(id) AS total_active_users
    FROM t
    GROUP BY DATE_FORMAT(order_date, '%b-%Y')
    ORDER BY order_date
) AS t
This is the output that I get
Period total_active_users
Oct-2013 1
Jan-2014 6
Mar-2014 10
You can also use COUNT(DISTINCT id) to count only the unique ids.

Change MySQL subquery to a join for efficiency

Is it possible to change the following MySQL query to use a join instead of a subquery for efficiency (or is there another way to increase efficiency)? I have a table of patient visits to an emergency department. The table lists arrival and departure times. I need the query to return the total number of patients that were already present in the emergency department (the "census") when each patient arrived.
My table looks something like this:
+------+------+---------------------+---------------------+
| id | name | arrival | departure |
+------+------+---------------------+---------------------+
| 1 | Joe | 2010-01-01 00:00:00 | 2010-01-01 02:00:00 |
| 2 | John | 2010-01-01 00:05:00 | 2010-01-01 03:00:00 |
| 3 | Jane | 2010-01-01 01:00:00 | 2010-01-01 04:00:00 |
...
With a desired result like this:
+------+--------+
| name | census |
+------+--------+
| Joe | 0 |
| John | 1 |
| Jane | 2 |
...
The following query works, but is quite slow (about 3.5 seconds on 180,000 rows). Is there a way to increase the efficiency of this query (with some sort of join, or other method)?
select name, arrival,
(SELECT count(*)
FROM patient_arrivals as b
WHERE b.arrival <= a.arrival and b.departure >= a.departure) as census
FROM patient_arrivals as a
I don't think a join will help. Instead, you need to restructure the query. The following gets the number of patients in the room at any particular time:
select t, sum(num) as num, @total := @total + num as total
from (select arrival as t, 1 as num
      from patient_arrivals
      union all
      select departure, -1
      from patient_arrivals
     ) t cross join
     (select @total := 0) vars
group by t
order by t
Then, you can use this as a subquery for the join:
select pa.*, tnum.total as census
from patient_arrivals pa join
     (select t, sum(num) as num, @total := @total + num as total
      from (select arrival as t, 1 as num
            from patient_arrivals
            union all
            select departure, -1
            from patient_arrivals
           ) t cross join
           (select @total := 0) vars
      group by t
      order by t
     ) tnum
     on pa.arrival = tnum.t;
This gives the number when the patient arrives. For the total that overlap:
select pa.*, max(tnum.total) as census
from patient_arrivals pa join
     (select t, sum(num) as num, @total := @total + num as total
      from (select arrival as t, 1 as num
            from patient_arrivals
            union all
            select departure, -1
            from patient_arrivals
           ) t cross join
           (select @total := 0) vars
      group by t
      order by t
     ) tnum
     on tnum.t between pa.arrival and pa.departure
group by pa.id
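The arrival/departure event idea can be sketched in Python as a single pass over sorted events. This is only an illustration under stated assumptions: the function name `census_on_arrival` is hypothetical, the rows are the three sample patients from the question, and the census is recorded before the arriving patient is added, so it counts only patients already present. Departures at the same instant as an arrival are applied first.

```python
from datetime import datetime

# (id, name, arrival, departure) rows from the question's sample table.
patients = [
    (1, "Joe", datetime(2010, 1, 1, 0, 0), datetime(2010, 1, 1, 2, 0)),
    (2, "John", datetime(2010, 1, 1, 0, 5), datetime(2010, 1, 1, 3, 0)),
    (3, "Jane", datetime(2010, 1, 1, 1, 0), datetime(2010, 1, 1, 4, 0)),
]

def census_on_arrival(rows):
    """Map patient id -> number of patients already present at arrival."""
    events = []
    for pid, _name, arrival, departure in rows:
        events.append((arrival, 1, pid))
        events.append((departure, -1, pid))
    events.sort()                       # by time; -1 sorts before +1 on ties
    present = 0
    census = {}
    for _time, delta, pid in events:
        if delta == 1:
            census[pid] = present       # record before counting this patient
        present += delta
    return census

census = census_on_arrival(patients)
for pid, name, _arrival, _departure in patients:
    print(name, census[pid])
```

Sorting dominates the cost, so this avoids comparing every patient with every other patient the way the correlated subquery does.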

Optimize mysql query with UNION

I have a requirement to get the next date and the previous date.
The table structure is as follows:
| auto_id | id | next_date | next_activity |
| 1 | 1 | 22-12-2012 | - |
| 2 | 1 | 25-12-2012 | - |
| 3 | 1 | 26-12-2012 | - |
| 4 | 1 | 28-12-2012 | - |
So I need next_day and previous_day:
next_day = next_date after the current day
previous_day = next_date before the current day
(SELECT * FROM `activity` WHERE id = 1 and next_date > CURDATE() order by next_date asc limit 1)
UNION
(SELECT * FROM `activity` WHERE id = 1 and next_date = CURDATE() )
UNION
(SELECT * FROM `activity` WHERE id = 1 and next_date < CURDATE() order by next_date desc limit 1)
ORDER BY next_date desc limit 2
Another way to do it is to self-join the table.
Is there a way to optimize this?
Here is another way:
SELECT next_date, date_diff
FROM (SELECT *,
             @dateDiff := datediff(next_date, curdate()) AS date_diff,
             @pDateDiff :=
                 IF((@dateDiff < 0 AND @dateDiff > @pDateDiff),
                    @dateDiff,
                    @pDateDiff)
                 AS pDateDiff,
             @nDateDiff :=
                 IF((@dateDiff > 0 AND @dateDiff < @nDateDiff),
                    @dateDiff,
                    @nDateDiff)
                 AS nDateDiff
      FROM activity, (SELECT @pDateDiff := -9999, @nDateDiff := 9999) tmp
      WHERE id = 1) aView
WHERE date_diff IN (@pDateDiff, 0, @nDateDiff)
ORDER BY next_date;
The date_diff value indicates whether a row is the previous or the next date.
Pick all dates where id = 1.
Find the date difference between next_date and curdate() and store it in a user-defined variable, @dateDiff.
@pDateDiff is another variable, which tracks the maximum among the negative @dateDiff values (our previous date).
@nDateDiff is yet another variable, which tracks the minimum among the positive @dateDiff values (our next date).
In the end, select only those dates whose difference is in (negative maximum, 0, positive minimum).
PS: if you have duplicate date entries, the query may return all of them.
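The selection rule described above (closest date before today, any date equal to today, closest date after today) can be sketched in Python as follows. The function name `neighbours` and the chosen "today" are assumptions for illustration; the dates are the four rows from the question.

```python
from datetime import date

# next_date values for id = 1 from the question's table.
activity_dates = [
    date(2012, 12, 22), date(2012, 12, 25),
    date(2012, 12, 26), date(2012, 12, 28),
]

def neighbours(dates, today):
    """Return (previous date, dates equal to today, next date)."""
    previous = max((d for d in dates if d < today), default=None)   # largest negative diff
    following = min((d for d in dates if d > today), default=None)  # smallest positive diff
    on_today = [d for d in dates if d == today]
    return previous, on_today, following

# Hypothetical "current date" chosen so that all three cases occur.
prev_day, same_day, next_day = neighbours(activity_dates, date(2012, 12, 26))
print(prev_day, same_day, next_day)
```

If no earlier or later date exists, the corresponding value is None instead of a sentinel such as -9999 or 9999.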