MySQL Number of Days inside a DateRange, inside a month (Booking Table)

I'm attempting to create a report for an accommodation service with the following information:
Number of Bookings (Easy, use the COUNT function)
Revenue Amount (Kind of easy).
Number of Room nights. (Rather Hard it seems)
Broken down into each month of the year.
Limitations - I'm currently using PHP/MySQL to create this report.
I'm pulling the data out of the booking system 1 month at a time, then using an ETL process to put it into MySQL.
Because of this, I have duplicate records, when a booking splits across the end of the Month. (eg BookingID = 9216 below - This is because for Revenue purposes we need to split the percentage of the revenue into the corresponding month).
The Question.
How do I write some SQL that will:
Calculate the number of room nights booked into a Property, grouped by month. If a booking spans the end of a month, the room nights that fall in the check-in month should count towards that month, and the room nights that fall in the check-out month should count towards that one.
At first I used this: DATEDIFF(Checkout, Checkin).
But that led to one month having 48 room nights in a 31-day month (because a) it counted one booking as 11 nights, even though it was split across the 2 months, and b) that booking appears twice).
Then once I have the statement I need to integrate it back into my CrossTab SQL for the entire year.
Some resources that I have found, but can't seem to make work (MySql Query- Date Range within a Date Range & php mysql double date range)
Here is a Sample of the Table: (There are ~100,000 rows of similar data).
CREATE TABLE IF NOT EXISTS `bookingdata` (
`idBookingData` int(11) NOT NULL AUTO_INCREMENT,
`PropertyID` int(10) NOT NULL,
`Checkin` date DEFAULT NULL,
`Checkout` date DEFAULT NULL,
`Rent` decimal(10,2) DEFAULT NULL,
`BookingID` int(11) DEFAULT NULL,
PRIMARY KEY (`idBookingData`),
UNIQUE KEY `idBookingData_UNIQUE` (`idBookingData`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=10472 ;
INSERT INTO `bookingdata` (`idBookingData`, `PropertyID`, `Checkin`, `Checkout`, `Rent`, `BookingID`) VALUES
(5148, 2, '2011-07-02', '2011-07-05', 1105.00, 10612),
(5149, 2, '2011-07-05', '2011-07-13', 2155.00, 10184),
(5151, 2, '2011-07-14', '2011-07-17', 1105.00, 11102),
(5153, 2, '2011-07-22', '2011-07-24', 930.00, 14256),
(5154, 2, '2011-07-24', '2011-08-04', 1832.73, 9216),
(5907, 2, '2011-07-24', '2011-08-04', 687.27, 9216),
(5910, 2, '2011-08-11', '2011-08-14', 1140.00, 13633),
(5911, 2, '2011-08-15', '2011-08-16', 380.00, 17770),
(5915, 2, '2011-08-25', '2011-08-29', 1350.00, 17719),
(5916, 2, '2011-08-30', '2011-09-01', 740.00, 16813);

You're on the right lines. You need to join your query with a table of the months for which you want data, which can either be permanent or (as shown in my example below) created dynamically in a UNION subquery:
SELECT YEAR(month.d),
MONTHNAME(month.d),
SUM(1 + DATEDIFF( -- add 1 because start&finish on same day is still 1 day
LEAST(Checkout, LAST_DAY(month.d)), GREATEST(Checkin, month.d)
)) AS days
FROM bookingdata
RIGHT JOIN (
SELECT 20110101 AS d
UNION ALL SELECT 20110201 UNION ALL SELECT 20110301
UNION ALL SELECT 20110401 UNION ALL SELECT 20110501
UNION ALL SELECT 20110601 UNION ALL SELECT 20110701
UNION ALL SELECT 20110801 UNION ALL SELECT 20110901
UNION ALL SELECT 20111001 UNION ALL SELECT 20111101
UNION ALL SELECT 20111201
) AS month ON
Checkin <= LAST_DAY(month.d)
AND month.d <= Checkout
GROUP BY month.d
See it on sqlfiddle.
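As a cross-check on the clamping idea (LEAST/GREATEST against the month boundaries), here is a minimal Python sketch that recomputes the totals for the sample rows. It rests on two assumptions that are not part of the query above: a night is counted in the month of the date on which it begins (so the checkout date itself is not counted), and the duplicate rows produced by the revenue split are collapsed first. The function name room_nights_by_month is hypothetical.

```python
from datetime import date

# Sample rows from the question: (BookingID, Checkin, Checkout).
rows = [
    (10612, date(2011, 7, 2), date(2011, 7, 5)),
    (10184, date(2011, 7, 5), date(2011, 7, 13)),
    (11102, date(2011, 7, 14), date(2011, 7, 17)),
    (14256, date(2011, 7, 22), date(2011, 7, 24)),
    (9216, date(2011, 7, 24), date(2011, 8, 4)),
    (9216, date(2011, 7, 24), date(2011, 8, 4)),  # duplicate from the revenue split
    (13633, date(2011, 8, 11), date(2011, 8, 14)),
    (17770, date(2011, 8, 15), date(2011, 8, 16)),
    (17719, date(2011, 8, 25), date(2011, 8, 29)),
    (16813, date(2011, 8, 30), date(2011, 9, 1)),
]


def room_nights_by_month(rows, year):
    """Return {month: nights} for one year (assumed night-counting rule)."""
    stays = set(rows)  # assumption: identical tuples are the same stay
    totals = {}
    for month in range(1, 13):
        start = date(year, month, 1)
        # First day of the following month, used as an exclusive upper bound.
        end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
        nights = 0
        for _, checkin, checkout in stays:
            # Clamp the stay to the month; a negative value means no overlap.
            overlap = (min(checkout, end) - max(checkin, start)).days
            nights += max(0, overlap)
        totals[month] = nights
    return totals


totals = room_nights_by_month(rows, 2011)
print(totals[7], totals[8], totals[9])  # 24 13 0
```

Under these assumptions no month can exceed its number of days for a single property, which is a quick sanity check on any SQL version of the same calculation.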

Related

SQL query to order by day, date and lead count

I'm trying to figure out how to pull lead statistics arranged by day and date, in the following format:
Example output format
Day Date Leads
Today 2020/09/14 3
Yesterday 2020/09/13 64
Saturday 2020/09/12 18
Friday 2020/09/11 29
Thursday 2020/09/10 17
Wednesday 2020/09/09 94
A lead is either an email or a number.
What SQL query can I use to get this?
Example data
CREATE TABLE weektest(
date datetime,
lead VARCHAR(100)
);
INSERT INTO weektest(date, lead)
VALUES
(
'2020/09/04 10:36:51', 'number'
);
INSERT INTO weektest(date, lead)
VALUES
(
'2020/09/08 00:47:52', 'email'
);
INSERT INTO weektest(date, lead)
VALUES
(
'2020/09/11 03:03:41', ''
);
Do you just want aggregation?
select dayname(w.date) day, date(w.date) as date, count(*) cnt
from weektest w
group by date(w.date), dayname(w.date)
order by date(w.date) desc
I am not sure what you want to count: the above query gives you the number of rows per day, newest day first. If you want the count of distinct lead values, then use count(distinct `lead`) instead of count(*).
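The aggregation above returns the weekday name for every row, while the requested output labels the two most recent days "Today" and "Yesterday". A minimal Python sketch of that labelling step follows; the function name leads_per_day, the sample timestamps, and the fixed value of today are all hypothetical, and the weekday names assume the default English locale.

```python
from collections import Counter
from datetime import date, datetime, timedelta


def leads_per_day(timestamps, today):
    """Count rows per calendar day and label them, newest day first."""
    counts = Counter(ts.date() for ts in timestamps)
    report = []
    for day in sorted(counts, reverse=True):
        if day == today:
            label = "Today"
        elif day == today - timedelta(days=1):
            label = "Yesterday"
        else:
            label = day.strftime("%A")  # weekday name, e.g. "Friday"
        report.append((label, day.strftime("%Y/%m/%d"), counts[day]))
    return report


# Hypothetical data, with "today" pinned so the result is reproducible.
sample = [
    datetime(2020, 9, 14, 9, 0, 0),
    datetime(2020, 9, 14, 17, 30, 0),
    datetime(2020, 9, 13, 8, 15, 0),
    datetime(2020, 9, 11, 3, 3, 41),
]
report = leads_per_day(sample, today=date(2020, 9, 14))
for line in report:
    print(*line)
```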

How to get a rolling data set by week with sql

I had a sql query I would run that would get a rolling sum (or moving window) data set. I would run this query for every 7 days, increase the interval number by 7 (28 in example below) until I reached the start of the data. It would give me the data split by week so I can loop through it on the view to create a weekly graph.
SELECT *
FROM `table`
WHERE `row_date` >= DATE_SUB(NOW(), INTERVAL 28 DAY)
AND `row_date` <= DATE_SUB(NOW(), INTERVAL 21 DAY)
This is of course very slow once you have several weeks worth of data. I wanted to replace it with a single query. I came up with this.
SELECT *,
CONCAT(YEAR(row_date), '/', WEEK(row_date)) as week_date
FROM `table`
GROUP BY week_date
ORDER BY row_date DESC
It appeared mostly accurate, except I noticed the current week and the last week of 2015 were much lower than usual. That's because this query groups by calendar week, starting on Sunday (or Monday?), so the count resets at the start of each week instead of covering a full 7 days.
Here's a data set of employees that you can use to demonstrate the behavior.
CREATE TABLE employees (
id INT NOT NULL,
first_name VARCHAR(14) NOT NULL,
last_name VARCHAR(16) NOT NULL,
row_date DATE NOT NULL,
PRIMARY KEY (id)
);
INSERT INTO `employees` VALUES
(1,'Bezalel','Simmel','2016-12-25'),
(2,'Bezalel','Simmel','2016-12-31'),
(3,'Bezalel','Simmel','2017-01-01'),
(4,'Bezalel','Simmel','2017-01-05')
This data will return the last 3 rows on the same data point on the old query (last 7 days) assuming you run it today 2017-01-06, but only the last 2 rows on the same data point on the new query (Sunday to Saturday).
For more information on what I mean by rolling or moving window, see this English stack exchange link.
https://english.stackexchange.com/questions/362791/word-for-graph-that-counts-backwards-vs-graph-that-counts-forwards
How can I write a query in MySQL that will bring me rolling data, where the last data point is the last 7 days of data, the previous point is the previous 7 days, and so on?
I've had to interpret your question a lot so this answer might be unsuitable. It sounds like you are trying to get a graph showing data historically grouped into 7-day periods. Your current attempt does this by grouping on calendar week instead of by 7-day period leading to inconsistent size of periods.
So using a modification of your dataset on sql fiddle ( http://sqlfiddle.com/#!9/90f1f2 ) I have come up with this
SELECT
-- Figure out how many periods of 7 days ago this record applies to
FLOOR( DATEDIFF( CURRENT_DATE , row_date ) / 7 ) AS weeks_ago,
-- Count the number of ids in this group
COUNT( DISTINCT id ) AS number_in_week,
-- Because this is grouped, make sure to have some consistency on what we select instead of leaving it to chance
MIN( row_date ) AS min_date_in_week_in_dataset
FROM `sample_data`
-- Groups by weeks ago because that's what you are interested in
GROUP BY weeks_ago
ORDER BY
min_date_in_week_in_dataset DESC;
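To see what the weeks_ago expression does, here is a small Python sketch that applies the same floor(days / 7) bucketing to the employees rows from the question. The date 2017-01-06 is the "today" the question assumes; the function name count_by_weeks_ago is hypothetical.

```python
from collections import Counter
from datetime import date

# row_date values from the question's employees table.
row_dates = [
    date(2016, 12, 25),
    date(2016, 12, 31),
    date(2017, 1, 1),
    date(2017, 1, 5),
]


def count_by_weeks_ago(dates, today):
    """Bucket each date by how many whole 7-day periods ago it falls."""
    return Counter((today - d).days // 7 for d in dates)


buckets = count_by_weeks_ago(row_dates, date(2017, 1, 6))
print(sorted(buckets.items()))  # [(0, 3), (1, 1)]
```

The last three rows land in bucket 0, which matches the behaviour the question describes for the original 7-day window.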

Some questions about SQL group by week

I have some problems writing a SQL GROUP BY by week.
I have a MySQL table named order.
This table has several columns, called 'order_id', 'order_date', 'amount', etc.
I want to build a table showing the order sales amounts for the past 7 days.
I think I should first get today's date.
Since I use JavaServer Pages, the code looks like this:
Calendar cal = Calendar.getInstance();
int day = cal.get(Calendar.DATE);
int Month = cal.get(Calendar.MONTH) + 1;
int year = cal.get(Calendar.YEAR);
String today = year + "-" + Month + "-" + day;
Then I need to use a GROUP BY statement to calculate the SUM of the sales amount for the past 7 days,
like this:
ResultSet rs=statement.executeQuery("select order_date, SUM(amount) " +
"from `testing`.`order` GROUP BY order_date");
I have a problem here. With this SQL, every order_date is displayed.
How can I modify this SQL so that it only displays the order sales amounts for the past seven days?
Besides that, I discovered a problem in my original SQL.
If there are no sales on a given day, no row is returned for that day.
Of course, I know the ResultSet will not contain null rows for those days.
I need all of the past 7 days, even when the amount is 0 dollars.
Is there another way to show the 0?
Please advise if you have any ideas.
Thank you.
The usual approach is to create a calendar table containing all dates, using a script or a stored procedure.
However, if you prefer, you can build a table with just a few dates (in your case, the dates of the last week) with a single query.
This is an example:
create table orders(
id int not null auto_increment primary key,
dorder date,
amount int
) engine = myisam;
insert into orders (dorder,amount)
values (curdate(),100),
(curdate(),200),
('2011-02-24',50),
('2011-02-24',150),
('2011-02-22',10),
('2011-02-22',20),
('2011-02-22',30),
('2011-02-22',5),
('2011-02-19',10);
select t.cdate,sum(coalesce(o.amount,0)) as total
from (
select curdate() -
interval tmp.digit * 1 day as `cdate`
from (
select 0 as digit union all
select 1 union all
select 2 union all
select 3 union all
select 4 union all
select 5 union all
select 6 union all
select 7 ) as tmp) as t
left join orders as o
on t.cdate = o.dorder and o.dorder >= curdate() - interval 7 day
group by t.cdate
order by t.cdate desc
Hope that it helps. Regards.
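The reason the derived date table matters is that it supplies a row for every day, so days without orders still appear with a total of 0. A minimal Python sketch of the same zero-filling idea follows. The function name daily_totals is hypothetical, "today" is pinned to 2011-02-25 purely for illustration, and the window here is exactly 7 days (the query above generates 8 with its digits 0 to 7).

```python
from datetime import date, timedelta


def daily_totals(orders, today, days=7):
    """Sum amounts per day for the last `days` days, keeping empty days as 0."""
    # Start with one zero entry per day, like the generated calendar rows.
    totals = {today - timedelta(days=n): 0 for n in range(days)}
    for order_date, amount in orders:
        if order_date in totals:  # ignore orders outside the window
            totals[order_date] += amount
    return totals


# Rows modelled on the answer's sample data, with curdate() replaced by a fixed date.
orders = [
    (date(2011, 2, 25), 100),
    (date(2011, 2, 25), 200),
    (date(2011, 2, 24), 50),
    (date(2011, 2, 24), 150),
    (date(2011, 2, 22), 10),
    (date(2011, 2, 22), 20),
    (date(2011, 2, 22), 30),
    (date(2011, 2, 22), 5),
    (date(2011, 2, 19), 10),
]
totals = daily_totals(orders, today=date(2011, 2, 25))
for day in sorted(totals, reverse=True):
    print(day.isoformat(), totals[day])
```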
To answer your question "How can I modify this SQL so that only display past seven days order sale amount?"
Modify the SQL statement by adding a WHERE clause to it:
WHERE order_date >= @date_7days_ago
The value of the @date_7days_ago variable can be set before your statement (MySQL syntax):
SET @date_7days_ago = DATE_SUB(CURDATE(), INTERVAL 7 DAY);
Adding that WHERE clause to your query will return only those records whose order date is in the last seven days.
Hope this helps.
You can try using this:
ResultSet rs = statement.executeQuery(
"SELECT IFNULL(SUM(amount), 0) " +
"FROM `testing`.`order` " +
"WHERE order_date >= DATE_SUB('" + today + "', INTERVAL 7 DAY)"
);
This will get you the total sales amount for the last 7 days, or 0 if there were no orders.

MySQL: Average interval between records

Assume this table:
id date
----------------
1 2010-12-12
2 2010-12-13
3 2010-12-18
4 2010-12-22
5 2010-12-23
How do I find the average intervals between these dates, using MySQL queries only?
For instance, the calculation on this table will be
(
( 2010-12-13 - 2010-12-12 )
+ ( 2010-12-18 - 2010-12-13 )
+ ( 2010-12-22 - 2010-12-18 )
+ ( 2010-12-23 - 2010-12-22 )
) / 4
----------------------------------
= ( 1 DAY + 5 DAY + 4 DAY + 1 DAY ) / 4
= 2.75 DAY
Intuitively, what you are asking should be equivalent to the interval between the first and last dates, divided by the number of dates minus 1.
Let me explain more thoroughly. Imagine the dates are points on a line (+ are dates present, - are dates missing, the first date is the 12th, and I changed the last date to Dec 24th for illustration purposes):
++----+---+-+
Now, what you really want to do, is evenly space your dates out between these lines, and find how long it is between each of them:
+--+--+--+--+
To do that, you simply take the number of days between the last and first days, in this case 24 - 12 = 12, and divide it by the number of intervals you have to space out, in this case 4: 12 / 4 = 3.
With a MySQL query
SELECT DATEDIFF(MAX(dt), MIN(dt)) / (COUNT(dt) - 1) FROM a;
This works on this table (with your values it returns 2.75):
CREATE TABLE IF NOT EXISTS `a` (
`dt` date NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
INSERT INTO `a` (`dt`) VALUES
('2010-12-12'),
('2010-12-13'),
('2010-12-18'),
('2010-12-22'),
('2010-12-24');
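The shortcut works because the consecutive differences telescope: their sum is simply the last date minus the first. A small Python sketch, using the dates from the question, compares the pairwise average with the endpoint formula; both function names are hypothetical.

```python
from datetime import date

# Dates from the question's table.
dates = [
    date(2010, 12, 12),
    date(2010, 12, 13),
    date(2010, 12, 18),
    date(2010, 12, 22),
    date(2010, 12, 23),
]


def average_gap_pairwise(dates):
    """Average of the differences between consecutive dates."""
    ordered = sorted(dates)
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    return sum(gaps) / len(gaps)


def average_gap_endpoints(dates):
    """Same value via (max - min) / (count - 1), as in the SQL above."""
    return (max(dates) - min(dates)).days / (len(dates) - 1)


pairwise = average_gap_pairwise(dates)
endpoints = average_gap_endpoints(dates)
print(pairwise, endpoints)  # 2.75 2.75
```

Both functions need at least two dates; with a single row the divisor is zero, just as COUNT(dt) - 1 would be in the query.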
If the ids are uniformly incremented without gaps, join the table to itself on id+1:
SELECT d.id, d.date AS date, n.date AS next_date, DATEDIFF(n.date, d.date) AS gap_days
FROM dates d
JOIN dates n ON (n.id = d.id + 1)
Then GROUP BY and average as needed.
If the ids are not uniform, do an inner query to assign ordered ids first.
I guess you'll also need to add a subquery to get the total number of rows.
Alternatively
Create an aggregate function that keeps track of the previous date, and a running sum and count. You'll still need to select from a subquery to force the ordering by date (actually, I'm not sure if that's guaranteed in MySQL).
Come to think of it, this is a much better way of doing it.
And Even Simpler
Just noting that Vegard's solution is much better.
The following query returns the correct result:
SELECT AVG(
DATEDIFF(i.date, (SELECT MAX(date)
FROM intervals WHERE date < i.date)
)
)
FROM intervals i
but it runs a dependent subquery which might be really inefficient with no index and on a larger number of rows.
You need to do a self join, compute the differences with the DATEDIFF function, and then take the average.

Computing average values over sections of date/time

Problem:
I have a database of sensor readings with a timestamp for the time the sensor was read. Basically it looks like this:
Sensor | Timestamp | Value
Now I want to make a graph out of this data and I want to make several different graphs. Say I want one for the last day, one for the last week and one for the last month. The resolution of each graph will be different so for the day-graph the resolution would be 1 minute. For the week graph it would be one hour and for the month graph it would be one day, or quarter of a day.
So I would like an output that is the average of each resolution (eg. Day = Average over the minute, Week = Average over the hour and so on)
Ex:
Sensor | Start | End | Average
How do I do this easily and quickly in MySQL? I suspect it involves creating a temporary table of sorts and joining the sensor data with that to get the average values of the sensor? But my knowledge of MySQL is limited at best.
Is there a really clever way to do this?
SELECT DAY(Timestamp), HOUR(Timestamp), MINUTE(Timestamp), AVG(value)
FROM mytable
GROUP BY
DAY(Timestamp), HOUR(Timestamp), MINUTE(Timestamp) WITH ROLLUP
The WITH ROLLUP clause here produces extra rows with averages for each HOUR and DAY. The following example uses COUNT(*) to show which rows are produced:
SELECT DAY(ts), HOUR(ts), MINUTE(ts), COUNT(*)
FROM (
SELECT CAST('2009-06-02 20:00:00' AS DATETIME) AS ts
UNION ALL
SELECT CAST('2009-06-02 20:30:00' AS DATETIME) AS ts
UNION ALL
SELECT CAST('2009-06-02 21:30:00' AS DATETIME) AS ts
UNION ALL
SELECT CAST('2009-06-03 21:30:00' AS DATETIME) AS ts
) q
GROUP BY
DAY(ts), HOUR(ts), MINUTE(ts) WITH ROLLUP
2, 20, 0, 1
2, 20, 30, 1
2, 20, NULL, 2
2, 21, 30, 1
2, 21, NULL, 1
2, NULL, NULL, 3
3, 21, 30, 1
3, 21, NULL, 1
3, NULL, NULL, 1
NULL, NULL, NULL, 4
2, 20, NULL, 2 here means that COUNT(*) is 2 for DAY = 2, HOUR = 20 and all minutes.
Not quite the result table you wanted, but here's a starter for doing a 1 minute resolution:
SELECT sensor,minute(timestamp),avg(value)
FROM table
WHERE <time period specifier limits to a single hour>
GROUP BY sensor, minute(timestamp)
I've used code very similar to this (untested, but it's taken from working code).
Set the variables:
$seconds = 3600;
$start = mktime(...); // say 2 hrs ago
$end = .... // 1 hour after $start
then run the query
SELECT MAX(`when`) AS top_When, MIN(`when`) AS low_When,
ROUND(AVG(sensor)) AS Avg_S,
(MAX(`when`) - MIN(`when`)) AS dur, /* the duration in seconds of the actual period */
((floor(UNIX_TIMESTAMP(`when`) / $seconds)) * $seconds) as Epoch
FROM `sensor_stats`
WHERE `when` >= '$start' AND `when` <= '$end' and duration=30
GROUP BY Epoch/*((floor(UNIX_TIMESTAMP(`when`) / $seconds)) * $seconds)*/
The advantage of this is that you can use whatever time periods you want; they are not even required to fall on 'round numbers' such as a complete clock hour (or even a clock minute, 0-59).
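The Epoch expression, floor(timestamp / $seconds) * $seconds, snaps each reading to the start of its bucket, and grouping on that value gives one average per bucket. A minimal Python sketch of the same arithmetic follows; the function name average_by_bucket and the readings (Unix timestamp, value) are hypothetical.

```python
from collections import defaultdict


def average_by_bucket(readings, seconds):
    """Average values per bucket of `seconds`, keyed by the bucket start time."""
    groups = defaultdict(list)
    for timestamp, value in readings:
        # Same arithmetic as the Epoch column: floor to the bucket boundary.
        groups[(timestamp // seconds) * seconds].append(value)
    return {epoch: sum(vals) / len(vals) for epoch, vals in sorted(groups.items())}


# Hypothetical readings as (unix_timestamp, value).
readings = [(7200, 10.0), (7260, 20.0), (10800, 30.0), (12000, 50.0)]
averages = average_by_bucket(readings, 3600)
print(averages)  # {7200: 15.0, 10800: 40.0}
```

Changing the seconds argument (60, 3600, 86400, or any other width) changes the resolution without touching the rest of the code, which is the flexibility the answer points out.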