Counting entries within time range in MySQL - mysql

I have a table in my database where I am recording how long a user
is active or inactive. If a user switches from active to inactive or
vice versa (or changes any other attribute like address, etc) a new record is added. A simplified version looks like this:
UserID state state_from state_to
1 active 2016-12-14 2017-01-15
2 active 2016-12-14 2017-02-02
3 active 2017-01-01 NULL
1 inactive 2017-01-16 2017-03-05
2 inactive 2017-02-02 NULL
1 active 2017-03-06 NULL
I would like to count how many users had at some point the state "active", on a monthly basis. The desired output table for January till March would be:
date user_count
2017-01 3
2017-02 2
2017-03 2
For a specific timestamp (e.g. January 2017) I use the following query.
SET @stamp = '2017-01';
SELECT @stamp AS date, COUNT(UserID) AS user_count
FROM se_user
WHERE (state LIKE 'active' AND @stamp BETWEEN DATE_FORMAT(state_from,'%Y-%m') AND DATE_FORMAT(state_to,'%Y-%m') )
OR (state LIKE 'active' AND DATE_FORMAT(state_from,'%Y-%m') <= @stamp AND DATE_FORMAT(state_to,'%Y-%m') IS NULL);
My problem is that I don't know how to do this for a range of dates (say, from January to March). Is there a way to do this?
Example database and table
-- Set up a test database
DROP DATABASE IF EXISTS se_toy_example;
CREATE DATABASE se_toy_example;
USE se_toy_example;
-- Create example table
DROP TABLE IF EXISTS se_user;
CREATE TABLE se_user (
UserID INT(10),
state VARCHAR(10),
state_from DATE,
state_to DATE
);
-- Populate table
INSERT INTO se_user (UserID,state,state_from,state_to)
VALUES (1,'active','2016-12-14','2017-01-15'),
(2,'active','2016-12-14','2017-02-02'),
(3,'active','2017-01-01',NULL),
(1,'inactive','2017-01-16','2017-03-05'),
(2,'inactive','2017-02-02',NULL),
(1,'active','2017-03-06',NULL);

You need a calendar_table that holds the month date ranges, with dt_begin and dt_end for each month.
Your query for the current year then becomes:
SELECT c.dt_begin, count(*)
FROM calendar_table c
JOIN se_user u
ON c.dt_begin <= COALESCE(u.state_to, curdate() )
AND c.dt_end >= u.state_from
WHERE dt_begin >= '2017-01-01'
AND dt_begin <= '2017-12-01'
AND state = 'active'
GROUP BY c.dt_begin
;
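NOTE: Use a LEFT JOIN if you want months with 0 activity to appear. In that case, move the state = 'active' condition from the WHERE clause into the ON clause and count u.UserID instead of *; otherwise the unmatched months are either filtered out or counted as 1.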
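The calendar join can be tried without a MySQL server. The following is only a sketch, under these assumptions: it uses Python's built-in sqlite3 module instead of MySQL, it builds the month calendar with a recursive CTE rather than a stored calendar_table, and the names monthly_active_counts, ROWS and MONTHLY_ACTIVE_SQL are made up for illustration. It counts distinct users so that a user with two active periods in the same month is counted once.

```python
import sqlite3

# Sample rows copied from the question; a NULL state_to means "still in this state".
ROWS = [
    (1, "active", "2016-12-14", "2017-01-15"),
    (2, "active", "2016-12-14", "2017-02-02"),
    (3, "active", "2017-01-01", None),
    (1, "inactive", "2017-01-16", "2017-03-05"),
    (2, "inactive", "2017-02-02", None),
    (1, "active", "2017-03-06", None),
]

# Assumption: SQLite date() syntax; MySQL would use DATE_ADD / LAST_DAY instead.
MONTHLY_ACTIVE_SQL = """
WITH RECURSIVE calendar(dt_begin) AS (
    SELECT :first_month
    UNION ALL
    SELECT date(dt_begin, '+1 month') FROM calendar WHERE dt_begin < :last_month
)
SELECT substr(c.dt_begin, 1, 7) AS month,
       COUNT(DISTINCT u.UserID) AS user_count
FROM calendar c
LEFT JOIN se_user u
       ON u.state = 'active'
      AND u.state_from <= date(c.dt_begin, '+1 month', '-1 day')  -- starts before month end
      AND COALESCE(u.state_to, '9999-12-31') >= c.dt_begin        -- ends after month start
GROUP BY c.dt_begin
ORDER BY c.dt_begin
"""


def monthly_active_counts(first_month, last_month):
    """Return (month, user_count) for every month between the two first-of-month dates."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE se_user (UserID INT, state TEXT, state_from TEXT, state_to TEXT)"
    )
    conn.executemany("INSERT INTO se_user VALUES (?, ?, ?, ?)", ROWS)
    rows = conn.execute(
        MONTHLY_ACTIVE_SQL, {"first_month": first_month, "last_month": last_month}
    ).fetchall()
    conn.close()
    return rows


print(monthly_active_counts("2017-01-01", "2017-03-01"))
```

Because the state filter sits in the ON clause of the LEFT JOIN, a month with no active user would still be listed with a count of 0.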

Related

How to find datetimes where some conditions hold in MySQL?

We have a MySQL database containing bookings on different courts. Table properties (shortened):
CREATE TABLE `booking` (
`startDate` datetime NOT NULL,
`endDate` datetime NOT NULL,
`courtId` varchar(36),
FOREIGN KEY (`courtId`) REFERENCES `court` (`id`) ON DELETE CASCADE
)
Usually, bookings are paid, but under certain conditions (which I can check in the WHERE-part of a query), bookings can be free.
Given a court and booking duration, I want to query the next datetime at which the booking can be created for free. The conditions are not the problem, the problem is how to query not for entities but for datetime values.
How to realize this efficiently in MySQL?
EDIT: Maybe it helps to outline the conditions under which bookings are free:
The conditions under which bookings are free are dependent on how many courts are offered at the startDate by someone (courts are always offered except if there are special "not-offered"-bookings on that court) and how many other bookings overlapping the startDate are already free. This means bookings can be (and probably are) free even if there are no bookings at all in the database.
Solution
Finding an available slot before the last booking:
Find the gap between each booking and the one that follows it. If the gap is greater than the number of days of the new booking, you can use that slot.
Finding an available slot after the last booking:
If there is no such slot, you can assign the day after the end date of the last booking.
If this query returns null, it means there is no booking for the court. You can handle that on the client side.
Code
SET @c := 1; # Court id
SET @n := 2; # Number of days
/*
Previous booking
*/
SET @i := 0;
CREATE TEMPORARY TABLE bp AS
SELECT @i := @i + 1 AS id, startDate, endDate FROM booking
WHERE courtId = @c
ORDER BY startDate;
/*
Next booking
*/
SET @i := -1;
CREATE TEMPORARY TABLE bn AS
SELECT @i := @i + 1 AS id, startDate, endDate FROM booking
WHERE courtId = @c
ORDER BY startDate;
/*
Finding available slot before the last booking (Intermediate slot).
*/
SELECT DATE_ADD(MIN(bp.endDate), INTERVAL 1 DAY) INTO @si FROM
bp
JOIN
bn
ON bn.id = bp.id
WHERE DATEDIFF(bn.startDate, bp.endDate) > @n;
/*
Finding available slot after the last booking
*/
SELECT DATE_ADD(MAX(endDate), INTERVAL 1 DAY) INTO @sa FROM bn;
SELECT IFNULL(@si, @sa);
Using the code
Just replace the values of the variables @c and @n.
An idea to solve this is to rephrase it as: for the given :court_id parameter, give me the smallest future end_date for which no other booking starts within the given :duration parameter.
This can be expressed in different ways in SQL.
With a not exists condition and a correlated subquery that ensures that no further booking on the same court starts within :duration minutes.
select min(b.end_date) next_possible_start_date
from bookings b
where
b.court_id = :court_id
and b.end_date > now()
and not exists (
select 1
from bookings b1
where
b1.court_id = b.court_id
and b1.start_date > b.end_date
and b1.start_date < DATE_ADD(b.end_date, interval :duration minute)
)
Note: if you have additional conditions, they must be repeated in the where clause of the query and of the subquery.
The same logic as not exists can be implemented with a left join anti-join pattern:
select min(b.end_date) next_possible_start_date
from bookings b
left join bookings b1
on b1.court_id = b.court_id
and b1.start_date > b.end_date
and b1.start_date < DATE_ADD(b.end_date, interval :duration minute)
where
b.court_id = :court_id
and b.end_date > now()
and b1.court_id is null
In MySQL 8.0, it is also possible to use window functions: lead() retrieves the start_date of the next booking, which can then be compared with the end_date of the current booking.
select min(end_date) next_possible_start_date
from (
select
end_date,
lead(start_date) over(partition by court_id order by start_date) next_start_date
from bookings b
where court_id = :court_id
) t
where
next_start_date is null
or next_start_date >= DATE_ADD(end_date, interval :duration minute)
Edit
Here is a new version of the query that addresses the use case when the court is immediately free at the time the search is performed:
select
court_id,
greatest(min(b.end_date), now()) next_possible_start_date
from bookings b
where
-- b.court_id = :court_id and
not exists (
select 1
from bookings b1
where
b1.court_id = b.court_id
and b1.start_date > b.end_date
and b1.start_date < date_add(greatest(b.end_date, now()), interval :duration minute)
)
group by court_id
Note: this searches for all available courts at once; you can uncomment the where clause to filter on a specific court.
Given this sample data:
court_id | start_date | end_date
-------: | :------------------ | :------------------
1 | 2019-10-29 13:00:00 | 2019-10-29 13:30:00
1 | 2019-10-29 14:00:00 | 2019-10-29 15:00:00
2 | 2019-10-29 23:14:05 | 2019-10-30 00:14:05
2 | 2019-10-30 01:14:05 | 2019-10-30 02:14:05
Court 1 is immediately free. Court 2 is booked for the next hour, then there is a 60-minute vacancy before the next booking.
If we run the query for a duration of 60 minutes, we get:
court_id | next_possible_start_date
-------: | :-----------------------
1 | 2019-10-29 23:14:05 -- available right now
2 | 2019-10-30 00:14:05 -- available in 1 hour
While for 90 minutes, we get:
court_id | next_possible_start_date
-------: | :-----------------------
1 | 2019-10-29 23:14:05 -- available right now
2 | 2019-10-30 02:14:05 -- available in 3 hours
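The two result tables above can be reproduced locally. This is a hedged sketch rather than the answer's exact code: it runs on Python's built-in sqlite3 instead of MySQL, so greatest() becomes SQLite's two-argument max() and date_add() becomes datetime(). The current time is passed in as a :now parameter (assumed to be 2019-10-29 23:14:05, as in the sample) so the output is repeatable. The names next_free_slots, BOOKINGS and NEXT_SLOT_SQL are illustrative only.

```python
import sqlite3

# Sample bookings from the table above: (court_id, start_date, end_date).
BOOKINGS = [
    (1, "2019-10-29 13:00:00", "2019-10-29 13:30:00"),
    (1, "2019-10-29 14:00:00", "2019-10-29 15:00:00"),
    (2, "2019-10-29 23:14:05", "2019-10-30 00:14:05"),
    (2, "2019-10-30 01:14:05", "2019-10-30 02:14:05"),
]

# Keep only bookings after which no other booking on the same court starts
# within :duration minutes, then take the earliest such end per court.
NEXT_SLOT_SQL = """
SELECT b.court_id,
       max(min(b.end_date), :now) AS next_possible_start_date
FROM bookings b
WHERE NOT EXISTS (
    SELECT 1
    FROM bookings b1
    WHERE b1.court_id = b.court_id
      AND b1.start_date > b.end_date
      AND b1.start_date < datetime(max(b.end_date, :now),
                                   '+' || :duration || ' minutes')
)
GROUP BY b.court_id
ORDER BY b.court_id
"""


def next_free_slots(now, duration_minutes):
    """Return (court_id, next_possible_start_date) for every court with bookings."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE bookings (court_id INT, start_date TEXT, end_date TEXT)")
    conn.executemany("INSERT INTO bookings VALUES (?, ?, ?)", BOOKINGS)
    rows = conn.execute(
        NEXT_SLOT_SQL, {"now": now, "duration": duration_minutes}
    ).fetchall()
    conn.close()
    return rows


print(next_free_slots("2019-10-29 23:14:05", 60))
print(next_free_slots("2019-10-29 23:14:05", 90))
```

Note that a court with no bookings at all produces no row with this approach, so that case still has to be handled separately.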

MYSQL , Count week wise and also show sum with empty dates

I have two tables
Table_1 : Routes_Day_plan
Date Status_Id
------------------------
2019-06-09 1
2019-06-10 2
2019-06-09 2
2019-06-11 3
2019-06-14 4
2019-06-14 6
2019-06-15 8
Table_2 : Codes
id code
-------
1 Leave
2 Half_leave
3 Holiday
4 Work
5 Full_Hours
Now my task is to count, week-wise, the rows from table 1 whose code (from the second table) is Leave, Half_leave or Work, and then also show the sum. Where a date is not found, it should show 0. I wrote this query; it returns data, but not the empty dates. Can someone please help?
My Query:
select COUNT(*) as available, DATE(date)
from Table_1
where status_id in (
select id from codes
where code in ('Leave','Half_leave','work'))
AND DATE(date) >= DATE('2019-06-09') AND DATE(date) <= DATE('2019-06-16')
group by date
UNION ALL
SELECT COUNT(date), 'SUM' date
FROM Table_1
where status_id in (
select id from codes
where code in ('Leave','Half_leave','work'))
AND DATE(date) >= DATE('2019-06-09') AND DATE(date) <= DATE('2019-06-16')
The result looks something like this:
available Dates
------------------------
5 2019-06-09
2 2019-06-10
3 2019-06-11
3 2019-06-12
2 2019-06-14
2 2019-06-15
17 SUM
I want like this
available Dates
------------------------
5 2019-06-09
2 2019-06-10
3 2019-06-11
3 2019-06-12
0 2019-06-13
2 2019-06-14
2 2019-06-15
17 SUM
Your best bet here would be to have a Date Dimension/Lookup table which contains pre-populated dates for the entire year. By left joining this lookup to your record table, every date that actually exists (e.g. 2019-06-13) appears in the result, and where no record matches a date, the record columns are NULL.
COUNT(column) skips NULLs, so counting a column from the record table gives 0 for those dates (COUNT(*) would give 1). Just make sure you group on the date field from your lookup table and not from your record table.
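Create a date dimension table that contains every date from beginning to end. In MySQL the loop has to run inside a stored program, for example (fill_dim_date is just an illustrative name, and dim_date must already exist):
DELIMITER //
CREATE PROCEDURE fill_dim_date()
BEGIN
-- local variables must be declared before use
DECLARE RunDate DATE DEFAULT '1900-01-01';
DECLARE EndDate DATE DEFAULT '2099-01-01';
WHILE RunDate <= EndDate DO
insert into dim_date (`DATE`) values (RunDate);
Set RunDate = ADDDATE(RunDate,1);
END WHILE;
END //
DELIMITER ;
CALL fill_dim_date();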
Then create a temporary table from dim_date LEFT JOIN Routes_Day_plan, setting the status to 0 for dates that have no match, and use this temporary table instead of Routes_Day_plan in your queries.
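To make the date-dimension idea concrete, here is a small sketch under stated assumptions: it uses Python's built-in sqlite3 rather than MySQL, generates the dates with a recursive CTE instead of a stored dim_date table, renames the Date column to plan_date, and spells the code 'Work' exactly as stored because SQLite compares text case-sensitively. The names daily_counts_with_sum, PLAN_ROWS, CODES and DAILY_SQL are hypothetical.

```python
import sqlite3

# Rows from the question: (plan_date, status_id) and (id, code).
PLAN_ROWS = [
    ("2019-06-09", 1), ("2019-06-10", 2), ("2019-06-09", 2), ("2019-06-11", 3),
    ("2019-06-14", 4), ("2019-06-14", 6), ("2019-06-15", 8),
]
CODES = [(1, "Leave"), (2, "Half_leave"), (3, "Holiday"), (4, "Work"), (5, "Full_Hours")]

# Every date in the range comes from dim_date; COUNT(c.id) ignores NULLs,
# so dates without a matching row are reported as 0.
DAILY_SQL = """
WITH RECURSIVE dim_date(d) AS (
    SELECT :start
    UNION ALL
    SELECT date(d, '+1 day') FROM dim_date WHERE d < :end
)
SELECT dd.d, COUNT(c.id) AS available
FROM dim_date dd
LEFT JOIN routes_day_plan r ON r.plan_date = dd.d
LEFT JOIN codes c
       ON c.id = r.status_id
      AND c.code IN ('Leave', 'Half_leave', 'Work')
GROUP BY dd.d
ORDER BY dd.d
"""


def daily_counts_with_sum(start, end):
    """Return one (date, count) pair per day plus a final ('SUM', total) row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE routes_day_plan (plan_date TEXT, status_id INT)")
    conn.execute("CREATE TABLE codes (id INT, code TEXT)")
    conn.executemany("INSERT INTO routes_day_plan VALUES (?, ?)", PLAN_ROWS)
    conn.executemany("INSERT INTO codes VALUES (?, ?)", CODES)
    rows = conn.execute(DAILY_SQL, {"start": start, "end": end}).fetchall()
    conn.close()
    total = sum(count for _, count in rows)
    return rows + [("SUM", total)]


for row in daily_counts_with_sum("2019-06-09", "2019-06-15"):
    print(row)
```

The total is added in the client here; in SQL it could equally be appended with a UNION ALL, as in the question.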

Complex mysql query with conditional results

I have the next structure in a MySQL database:
boats
id name
-------------
1 name1
2 name2
boat_prices
id boat_id date duration price is_default
---------------------------------------------------------------
1 1 '2018-01-01' 1 100
2 1 '2018-01-01' 2 200
3 1 null null 100 1
4 2 '2018-01-02' 2 400
5 2 '2018-01-02' 4 800
6 2 null null 200 1
7 3 '2018-01-03' 5 1500
8 3 null null 300 1
The boats have a price for a specific date and duration in days.
All boats have a default "from" price that is identified by date = null and duration = null.
But, not all boats have prices for all days.
When I search for boat prices for a specific date and duration, the query should return all rows with a price for that date and duration, and in case a boat hasn't got a price for that date, return its default "from" price.
Example: For date = '2018-01-01' and duration = 1, the result should be:
boat_prices
id boat_id date duration price is_default
----------------------------------------------------------------
1 1 '2018-01-01' 1 100
6 2 null null 200 1
8 3 null null 300 1
I did this query example just to simplify, but please take into account apart from this, the query has some other joins with other tables.
I need help with the query.
I believe Rick was heading in the right direction with a LEFT JOIN, but you probably need two joins: one to get the boat prices that qualify for the date you are interested in, and another explicitly for the default.
select
b.id,
b.name,
DefPrice.price as DefaultPrice,
Specials.price as SpecialsPrice,
COALESCE( Specials.price, DefPrice.price ) as DiscountOrDefaultPrice
from
( select @parmDate := '2018-01-01' ) sqlvars,
boats b
JOIN boat_prices DefPrice
on b.id = DefPrice.boat_id
AND DefPrice.date IS NULL
AND DefPrice.Duration IS NULL
LEFT JOIN boat_prices Specials
on b.id = Specials.boat_id
AND Specials.date <= @parmDate
AND @parmDate <= Date_Add( Specials.Date, INTERVAL (Specials.duration -1 ) DAY )
Now, you could always return only the one price in question by doing a COALESCE() in case there is no Specials price, it gets the default via the DiscountOrDefaultPrice column.
Take your pick version of which column(s) you want to run with. This should get ALL boats, regardless of some special price based on durations. As you change whatever your parameter date in question is -- even if you do a current date, it will work. This is because you are testing the date in question against ALL possible special boat prices and its beginning to beginning + duration end date range. If you have multiple prices that overlap dates, that will just return those multiple rows that overlap.
My Adding of the duration is subtracting 1. For example, if your date is 2018-01-01 and its good for 1 day, does that mean it is only good for that one day? or up to and including 2018-01-02. The -1 forces the qualification to just the one day. So the price on 2018-01-01 good for 1 day is ONLY 2018-01-01.
Your other example for 2018-01-02 has two day duration. To me, indicating 2 days including 01-02 through 01-03. Two actual days.
CONFIRMATION from comment about dates and range
I guess my interpretation was wrong then on your data needs. Your sample of TWO dated boat price records apparently is not enough. You stated you want ALL boats regardless of qualification of a special price record. So you must start with the boat and the join to get all possible "Default" pricing no matter what. It is only the LEFT-JOIN component that needs to be adjusted.
That being said, lets simulate more data. Assume you have the following
Boat ID Date Duration Rate
1 2018-01-01 1 x
1 2018-01-02 4 y
2 2018-01-02 2 z
2 2018-01-04 4 a
3 2018-01-03 5 b
If I provide the date 2018-01-01, what rate records should I see?
If I provide date 2018-01-03, what records?
If I provide date 2018-01-05, what records?
For the particular date "2018-01-01" and a duration of 1, I would use a UNION clause like this:
(Note: edited to add the is_default column)
-- Get prices for particular day and duration.
(SELECT
boat_id,
date,
duration,
price,
0 AS is_default
FROM
boat_prices
WHERE
date = "2018-01-01" AND duration = 1)
UNION
-- Add defaults prices for those don't have a price on the particular day and duration
(SELECT
boat_id,
date,
duration,
price,
is_default
FROM
boat_prices
WHERE
date IS NULL
AND
duration IS NULL
AND
boat_id NOT IN (SELECT boat_id
FROM boat_prices
WHERE date ="2018-01-01" AND duration = 1))
EXAMPLE WITH STORED PROCEDURE SOLUTION
DELIMITER //
CREATE PROCEDURE GetPricesByDateAndDuration(IN pDate DATE, IN pDuration INT)
BEGIN
-- Get prices for particular day and duration.
(SELECT
boat_id,
date,
duration,
price,
0 AS is_default
FROM
boat_prices
WHERE
date = pDate AND duration = pDuration)
UNION
-- Add defaults prices for those don't have a price on the particular day and duration
(SELECT
boat_id,
date,
duration,
price,
is_default
FROM
boat_prices
WHERE
date IS NULL
AND
duration IS NULL
AND
boat_id NOT IN (SELECT boat_id
FROM boat_prices
WHERE date = pDate AND duration = pDuration));
END //
DELIMITER ;
Then you can call the procedure like this:
CALL GetPricesByDateAndDuration('2018-01-01', 1);
Instead of that clunky output, consider:
boat_id price default
-----------------------------
1 100
2 300 (default)
Something like this should generate that:
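SELECT dflt.boat_id,
IF(b.price IS NULL, dflt.price, b.price) AS price,
IF(b.price IS NULL, '(default)', '') AS `default` -- DEFAULT is a reserved word, so it is quoted
FROM boat_prices AS dflt
LEFT JOIN boat_prices AS b
ON b.boat_id = dflt.boat_id
-- the date tests belong in ON; in WHERE they would discard boats without a dated price
AND '2018-01-01' >= b.date
AND '2018-01-01' < b.date + INTERVAL b.duration DAY
WHERE dflt.date IS NULL
AND dflt.duration IS NULL
GROUP BY dflt.boat_id -- needs ONLY_FULL_GROUP_BY off; otherwise aggregate the price columns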
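The fallback-to-default idea can be checked against the sample data from the question. The sketch below makes several assumptions: it runs on Python's built-in sqlite3 rather than MySQL, it matches the date and duration exactly (as the question's expected output does) instead of testing a date range, and it stores 0 in is_default where the question's table is blank. The names prices_for, BOAT_PRICES and PRICE_SQL are illustrative.

```python
import sqlite3

# (id, boat_id, date, duration, price, is_default) from the question's table.
BOAT_PRICES = [
    (1, 1, "2018-01-01", 1, 100, 0),
    (2, 1, "2018-01-01", 2, 200, 0),
    (3, 1, None, None, 100, 1),
    (4, 2, "2018-01-02", 2, 400, 0),
    (5, 2, "2018-01-02", 4, 800, 0),
    (6, 2, None, None, 200, 1),
    (7, 3, "2018-01-03", 5, 1500, 0),
    (8, 3, None, None, 300, 1),
]

# Start from each boat's default row, then LEFT JOIN the dated price;
# COALESCE picks the dated price when there is one.
PRICE_SQL = """
SELECT d.boat_id,
       COALESCE(s.price, d.price) AS price,
       (s.id IS NULL) AS is_default
FROM boat_prices d
LEFT JOIN boat_prices s
       ON s.boat_id = d.boat_id
      AND s.date = :date
      AND s.duration = :duration
WHERE d.date IS NULL
  AND d.duration IS NULL
ORDER BY d.boat_id
"""


def prices_for(date, duration):
    """Return (boat_id, price, is_default) with one row per boat."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE boat_prices "
        "(id INT, boat_id INT, date TEXT, duration INT, price INT, is_default INT)"
    )
    conn.executemany("INSERT INTO boat_prices VALUES (?, ?, ?, ?, ?, ?)", BOAT_PRICES)
    rows = conn.execute(PRICE_SQL, {"date": date, "duration": duration}).fetchall()
    conn.close()
    return rows


print(prices_for("2018-01-01", 1))
```

Because the join conditions on the dated row sit in the ON clause, boats without a matching dated price keep their default row instead of disappearing.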

Complicated SELECT statement includes multiple joins not returning desired results

I am trying to create a SELECT to return information from a couple of tables. I had it working but then received an additional requirement and now I am having trouble figuring out how to get what I want.
I have a table with information on programs that could be included in the report (based on further requirements)...this file is called milestones.
I have another table with projects in it that relate to the programs - if the IDs match
I have a new table that has a manually entered override end date - this is the new requirement. There is a system end date in the milestones table, but if this override date is entered then it takes precedence over the system end date. If an override date has been entered, the exception file will have the same program ID and two dates which match dates in the milestones table.
dates are yyyy-mm-dd formatted
Example data:
Milestones:
prgId | startDate | endDate
------------------------------
123 | 2014-03-09 | 2014-11-10
123 | 2014-07-10 | 2014-11-10
324 | 2014-05-09 | 2014-11-12
exceptions:
prgId | startDate | overEnd
-------------------------------
123 | 2014-03-09 | 2014-05-31
projects:
prgId | cust
-------------
123 | 12121
123 | 4323
what I currently have being returned is:
prgId prjCnt startDate endDate overEnd
123 2 2014-03-09 2014-11-10 2014-05-31
123 2 2014-07-10 2014-11-10
324 0 2014-05-09 2014-11-12
I do realize that right now the two projects for program 123 will show for both lines - we will be looking for a way to associate them with the right ones but do not have that yet.
We added the override date requirement so that a report of current programs would not show both the '123' lines but only the one that is current (the second one).
My current SELECT is like this (sorry, I can't get this to show easier it is really long):
SELECT milestones.*, newtbl.prjcnt, exceptions.overEnd
FROM milestones
LEFT JOIN ((
SELECT prgGuid, count( prgGuid ) AS prjcnt
FROM projects
GROUP BY prgGuid
) AS newtbl )
ON milestones.prgId = newtbl.prgId
LEFT JOIN exceptions
ON (milestones.prgId = exceptions.prgId
AND milestones.startDate = exceptions.startDate)
WHERE <(milestones.startDate > '2013-00-00')
AND (milestones.startDate <= CURDATE() AND milestones.endDate >= CURDATE())
ORDER BY milestones.endDate, milestones.startDate DESC
Now what I want is to change this to only grab programs, project counts, start and end dates, and the override end date for programs where the start date is anything from 2013 to the current date and that have not ended yet. Now....if a program has an override end date and that end date is current (>= the current date) it should be included but if the override date is NULL or <= the current date, I do not want to include it.
What I want to have returned is:
prgId prjCnt startDate endDate overEnd
123 2 2014-07-10 2014-11-10
324 0 2014-05-09 2014-11-12
The first line before has expired so shouldn't show.
I've tried a few things but I either end up with no results or I get everything that I am currently getting.
Can someone help me figure out what the SELECT should be?
So if I follow you, your DDL might look like this:
CREATE TABLE MILESTONES
(`prgId` int, `startDate` varchar(10), `endDate` varchar(10))
;
INSERT INTO MILESTONES
(`prgId`, `startDate`, `endDate`)
VALUES
(123, '2014-03-09', '2014-11-10'),
(123, '2014-07-10', '2014-11-10'),
(324, '2014-05-09', '2014-11-12')
;
CREATE TABLE EXCEPTIONS
(`prgId` int, `startDate` varchar(10), `overEnd` varchar(10))
;
INSERT INTO EXCEPTIONS
(`prgId`, `startDate`, `overEnd`)
VALUES
(123, '2014-03-09', '2014-05-31')
;
CREATE TABLE PROJECTS
(`prgId` int, `cust` int)
;
INSERT INTO PROJECTS
(`prgId`, `cust`)
VALUES
(123, 12121),
(123, 4323)
;
And your current query which isn't working is this (note I've corrected what I presume are typos in your query from your question):
SELECT milestones.*, newtbl.prjcnt, exceptions.overEnd
FROM milestones
LEFT JOIN ((
SELECT prgId, count( prgId ) AS prjcnt
FROM projects
GROUP BY prgId
) AS newtbl )
ON milestones.prgId = newtbl.prgId
LEFT JOIN exceptions
ON (milestones.prgId = exceptions.prgId
AND milestones.startDate = exceptions.startDate)
WHERE (milestones.startDate > '2013-00-00')
AND (milestones.startDate <= CURDATE() AND milestones.endDate >= CURDATE())
ORDER BY milestones.endDate, milestones.startDate DESC
A working solution looks like this:
SELECT DISTINCT
M.prgId as PRGID
, ( SELECT COUNT(X.prgID)
FROM PROJECTS X
WHERE X.prgID = M.prgID ) as PRJCNT
, M.startDate as STARTDATE
, M.endDate as ENDDATE
, COALESCE(E.overEnd,'') as OVEREND
FROM MILESTONES M
LEFT OUTER JOIN PROJECTS P
ON M.prgId = P.prgId
LEFT JOIN EXCEPTIONS E
ON M.prgId = E.prgId
AND M.startDate = E.startDate
WHERE M.startDate > '2013-01-01'
AND M.startDate <= CURDATE()
AND M.endDate >= CURDATE()
AND ( E.overEnd IS NULL
OR E.overEnd > CURDATE() )
You can see it in action here: SQLFiddle.
Note that the solution relies on the COALESCE function for clean output and more of your business rules being put in place in the WHERE clause.
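Since CURDATE() has long passed the sample end dates, the filter is easier to verify with a fixed "today". The following sketch assumes Python's built-in sqlite3 in place of MySQL and passes the date in as a :today parameter (2014-08-01 is an arbitrary choice inside the sample range). The names current_programs and CURRENT_SQL are made up for this example; the override test uses > as in the solution above.

```python
import sqlite3

SETUP_SQL = """
CREATE TABLE milestones (prgId INT, startDate TEXT, endDate TEXT);
INSERT INTO milestones VALUES
    (123, '2014-03-09', '2014-11-10'),
    (123, '2014-07-10', '2014-11-10'),
    (324, '2014-05-09', '2014-11-12');
CREATE TABLE exceptions (prgId INT, startDate TEXT, overEnd TEXT);
INSERT INTO exceptions VALUES (123, '2014-03-09', '2014-05-31');
CREATE TABLE projects (prgId INT, cust INT);
INSERT INTO projects VALUES (123, 12121), (123, 4323);
"""

# A milestone row is kept when it is current by its system dates and its
# override end date is either absent or still in the future.
CURRENT_SQL = """
SELECT m.prgId,
       (SELECT COUNT(*) FROM projects p WHERE p.prgId = m.prgId) AS prjCnt,
       m.startDate,
       m.endDate,
       COALESCE(e.overEnd, '') AS overEnd
FROM milestones m
LEFT JOIN exceptions e
       ON e.prgId = m.prgId
      AND e.startDate = m.startDate
WHERE m.startDate > '2013-01-01'
  AND m.startDate <= :today
  AND m.endDate >= :today
  AND (e.overEnd IS NULL OR e.overEnd > :today)
ORDER BY m.endDate, m.startDate DESC
"""


def current_programs(today):
    """Return the programs that are current on the given yyyy-mm-dd date."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SETUP_SQL)
    rows = conn.execute(CURRENT_SQL, {"today": today}).fetchall()
    conn.close()
    return rows


print(current_programs("2014-08-01"))
```

With the correlated subquery doing the project count, the extra join to PROJECTS and the DISTINCT are not needed in this variant.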

mysql select rows by consecutive date

I have a table of available date blocks (7 days in my case) which may or may not be consecutive:
start_date end_date booked id room_id
2012-07-14 2012-07-21 0 1 6
2012-07-21 2012-07-28 0 2 6
2012-07-28 2012-08-04 1 3 6
2012-08-04 2012-08-11 0 4 6
What I'd like to do is be able to get a result set that gives me one row per X weeks of consecutive unbooked dates, within a date range.
So, for 2 week blocks starting on the 14th of July and using the above table data, I would expect the following:
start_date end_date booked
2012-07-14 2012-07-28 0
The second block of 2 weeks would not be returned as one of the component weeks is booked.
Here are a few ideas I've tried already:
SELECT
MIN(start_date) AS start_date_min,
MAX(end_date) AS end_date_max,
CAST(GROUP_CONCAT(id) AS CHAR) AS ids,
SUM(booked) AS booked
FROM
available_dates
WHERE
(start_date>=20120714 AND end_date<=DATE_ADD(20120714, INTERVAL 14 DAY))
GROUP BY
room_id
HAVING
end_date_max=DATE_ADD(20120714, INTERVAL 14 DAY)
This gets me part of the way, however doesn't get me the consecutive results - that is the important part. It also only returns a single result (probably because of the HAVING clause) when I widen the test data.
Can anyone point me in the right direction?
If you have a calendar or a numbers table:
CREATE TABLE num
( i INT NOT NULL
, PRIMARY KEY (i)
) ;
INSERT INTO num
(i)
VALUES
(0), (1), (2), ..., (1000) ;
You could use something like this:
SELECT
avail.room_id,
MIN(avail.start_date) AS start_date_min,
MAX(avail.end_date) AS end_date_max,
CAST(GROUP_CONCAT(avail.id) AS CHAR) AS ids,
SUM(avail.booked) AS booked
FROM
available_dates AS avail
CROSS JOIN
( SELECT DATE('2012-07-14') AS start_date_check
, 52 AS max_week_check
) AS param
JOIN
num
ON avail.start_date = param.start_date_check + INTERVAL num.i WEEK
AND num.i < param.max_week_check
WHERE
avail.booked = 0
GROUP BY
avail.room_id,
( num.i DIV 2 ) -- integer division; "/" yields 0, 0.5, 1, ... so no two weeks would share a group
HAVING
COUNT(*) = 2
You could also have this:
WHERE
1 = 1 -- no WHERE condition
GROUP BY
avail.room_id,
( num.i DIV 2 )
HAVING -- and optionally
SUM(avail.booked) = 0 -- this
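The grouping trick (week index divided by the block length) can be tried on the question's four rows. This is a sketch with assumptions spelled out: it uses Python's built-in sqlite3 instead of MySQL, where / on two integers is already integer division, it generates the numbers with a recursive CTE instead of a stored num table, and the names consecutive_blocks, AVAILABLE and BLOCK_SQL are hypothetical.

```python
import sqlite3

# (start_date, end_date, booked, id, room_id) from the question.
AVAILABLE = [
    ("2012-07-14", "2012-07-21", 0, 1, 6),
    ("2012-07-21", "2012-07-28", 0, 2, 6),
    ("2012-07-28", "2012-08-04", 1, 3, 6),
    ("2012-08-04", "2012-08-11", 0, 4, 6),
]

# Week i belongs to block i / block_weeks; a block is returned only when
# all of its weeks are present and unbooked.
BLOCK_SQL = """
WITH RECURSIVE num(i) AS (
    SELECT 0
    UNION ALL
    SELECT i + 1 FROM num WHERE i < :max_weeks - 1
)
SELECT a.room_id,
       MIN(a.start_date) AS start_date_min,
       MAX(a.end_date)   AS end_date_max
FROM available_dates a
JOIN num n
  ON a.start_date = date(:start, '+' || (n.i * 7) || ' days')
WHERE a.booked = 0
GROUP BY a.room_id, n.i / :block_weeks
HAVING COUNT(*) = :block_weeks
ORDER BY a.room_id, start_date_min
"""


def consecutive_blocks(start, block_weeks, max_weeks=52):
    """Return (room_id, start, end) for each fully unbooked block of block_weeks weeks."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE available_dates "
        "(start_date TEXT, end_date TEXT, booked INT, id INT, room_id INT)"
    )
    conn.executemany("INSERT INTO available_dates VALUES (?, ?, ?, ?, ?)", AVAILABLE)
    rows = conn.execute(
        BLOCK_SQL,
        {"start": start, "block_weeks": block_weeks, "max_weeks": max_weeks},
    ).fetchall()
    conn.close()
    return rows


print(consecutive_blocks("2012-07-14", 2))
```

The blocks are aligned to the given start date, so with 2-week blocks only weeks (0,1), (2,3), and so on are considered; a free pair such as weeks (1,2) would not be reported.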