Ordering within a MySQL group - mysql

I have two tables which are joined - one holds schedules and the other holds actual worked times.
This works fine if a given user only has a single schedule on a day but when they have more than one schedule I cannot get the query to match up the "right" slot to the right time.
I am beginning to think the only way to do this is to allocate the time to the schedule when the clock event happens but that is going to be a big rewrite so I am hoping there is a way in MySQL.
As this is inside a third party application, I am limited in what I can do to the query - I can modify the basics like from, group, joins etc and I can add aggregates to the fields (I have toyed with using min/max on the times). However, if the only way is to write a hugely complex query especially within the field selections then this system simply doesn't give me that option.
Schedule table:
CREATE TABLE `schedule` (
`id` int(11) NOT NULL,
`user_id` int(11) NOT NULL,
`date` date NOT NULL,
`start_time` time NOT NULL,
`end_time` time NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
--
-- Dumping data for table `schedule`
--
INSERT INTO `schedule` (`id`, `user_id`, `date`, `start_time`, `end_time`) VALUES
(1, 1, '2019-07-07', '08:00:00', '12:00:00'),
(2, 1, '2019-07-07', '16:00:00', '22:00:00'),
(3, 1, '2019-07-06', '10:00:00', '18:00:00');
Time table
CREATE TABLE `time` (
`id` int(11) NOT NULL,
`user_id` int(11) NOT NULL,
`date` date NOT NULL,
`start_time` time NOT NULL,
`end_time` time NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
--
-- Dumping data for table `time`
--
INSERT INTO `time` (`id`, `user_id`, `date`, `start_time`, `end_time`) VALUES
(1, 1, '2019-07-07', '08:00:00', '12:00:00'),
(2, 1, '2019-07-07', '16:00:00', '22:00:00'),
(3, 1, '2019-07-06', '10:00:00', '18:00:00');
Current query
select
t.date as date, t.user_id,
s.start_time as schedule_start,
s.end_time as schedule_end,
t.start_time as actual_start,
t.end_time as actual_end
from time t
left join schedule s on
t.user_id=s.user_id and t.date=s.date
group by t.date, t.start_time
Current output
== Dumping data for table s
|2019-07-06|1|10:00:00|18:00:00|10:00:00|18:00:00
|2019-07-07|1|08:00:00|12:00:00|08:00:00|12:00:00
|2019-07-07|1|08:00:00|12:00:00|16:00:00|22:00:00
Desired output
== Dumping data for table s
|2019-07-06|1|10:00:00|18:00:00|10:00:00|18:00:00
|2019-07-07|1|08:00:00|12:00:00|08:00:00|12:00:00
|2019-07-07|1|16:00:00|22:00:00|16:00:00|22:00:00
Is this possible to achieve?

I would try something like this.
I chose a 15-minute window: a scheduled shift is matched when it starts within 15 minutes of the actual start time.
select
t.date as date, t.user_id,
s.start_time as schedule_start,
s.end_time as schedule_end,
t.start_time as actual_start,
t.end_time as actual_end
from time t
left join schedule s on
t.user_id=s.user_id and t.date=s.date
and s.start_time BETWEEN t.start_time - INTERVAL 15 MINUTE
AND t.start_time + INTERVAL 15 MINUTE
order by date,schedule_start;
You would only need grouping if you wanted to add up the time for each user and day.
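As a rough way to try the window join without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in: its time(value, '-15 minutes') modifier plays the role of MySQL's - INTERVAL 15 MINUTE, and the clock-in times are made-up values a few minutes off schedule, not the data from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE schedule (id INTEGER, user_id INTEGER, date TEXT,
                       start_time TEXT, end_time TEXT);
CREATE TABLE "time"   (id INTEGER, user_id INTEGER, date TEXT,
                       start_time TEXT, end_time TEXT);
INSERT INTO schedule VALUES
  (1, 1, '2019-07-07', '08:00:00', '12:00:00'),
  (2, 1, '2019-07-07', '16:00:00', '22:00:00'),
  (3, 1, '2019-07-06', '10:00:00', '18:00:00');
-- Hypothetical clock-ins, a few minutes off the scheduled starts.
INSERT INTO "time" VALUES
  (1, 1, '2019-07-07', '08:05:00', '12:00:00'),
  (2, 1, '2019-07-07', '15:50:00', '22:00:00'),
  (3, 1, '2019-07-06', '10:00:00', '18:00:00');
""")

# Match each worked slot to the schedule that starts within 15 minutes of it.
query = """
SELECT t.date, t.user_id,
       s.start_time AS schedule_start, s.end_time AS schedule_end,
       t.start_time AS actual_start,   t.end_time AS actual_end
FROM "time" t
LEFT JOIN schedule s
  ON  t.user_id = s.user_id
  AND t.date = s.date
  AND s.start_time BETWEEN time(t.start_time, '-15 minutes')
                       AND time(t.start_time, '+15 minutes')
ORDER BY t.date, s.start_time
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

A clock-in that is further than 15 minutes from every schedule still appears, with NULL schedule columns, because of the left join.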

You need a much more complicated query to distinguish the 2 shifts.
You must run 2 separate queries, one for each shift, and combine them with UNION:
select
s.date, s.user_id,
s.schedule_start,
s.schedule_end,
t.actual_start,
t.actual_end
from (
select s.date, s.user_id,
min(s.start_time) as schedule_start,
min(s.end_time) as schedule_end
from schedule s
group by s.date, s.user_id
) s left join (
select t.date, t.user_id,
min(t.start_time) as actual_start,
min(t.end_time) as actual_end
from time t
group by t.date, t.user_id
) t on t.user_id=s.user_id and t.date=s.date
union
select
s.date, s.user_id,
s.schedule_start,
s.schedule_end,
t.actual_start,
t.actual_end
from (
select s.date, s.user_id,
max(s.start_time) as schedule_start,
max(s.end_time) as schedule_end
from schedule s
group by s.date, s.user_id
) s left join (
select t.date, t.user_id,
max(t.start_time) as actual_start,
max(t.end_time) as actual_end
from time t
group by t.date, t.user_id
) t on t.user_id=s.user_id and t.date=s.date
See the demo.
Results:
> date | user_id | schedule_start | schedule_end | actual_start | actual_end
> :--------- | ------: | :------------- | :----------- | :----------- | :---------
> 2019-07-06 | 1 | 10:00:00 | 18:00:00 | 10:00:00 | 18:00:00
> 2019-07-07 | 1 | 08:00:00 | 12:00:00 | 08:00:00 | 12:00:00
> 2019-07-07 | 1 | 16:00:00 | 22:00:00 | 16:00:00 | 22:00:00

Related

mysql union Merge different columns

I want to remove the NULL values and move the yesterDay values up so that they sit on the same rows as the toDay values,
but I don't know how to do it.
Full sql:
(SELECT
COUNT(1) toDay, NULL AS yesterDay
FROM
bas_user
WHERE UNIX_TIMESTAMP(user_datetime) BETWEEN UNIX_TIMESTAMP(
DATE_FORMAT(CURDATE(), '%Y-%m-%d %H:%i:%s')
)
AND UNIX_TIMESTAMP(NOW())
GROUP BY HOUR(user_datetime))
UNION
(SELECT
NULL AS toDay,COUNT(1) yesterDay
FROM
bas_user
WHERE UNIX_TIMESTAMP(user_datetime) BETWEEN UNIX_TIMESTAMP(
DATE_SUB(
DATE_FORMAT(CURDATE(), '%Y-%m-%d %H:%i:%s'),
INTERVAL 1 DAY
)
)
AND UNIX_TIMESTAMP(DATE_SUB(NOW(), INTERVAL 1 DAY))
GROUP BY HOUR(user_datetime)
)
In order to merge the two result sets, you need a join key. For example, assume user_id is the join key of both.
-- Step 1
create table user_today (
user_id int,
today_count int);
create table user_yesterday (
user_id int,
yesterday_count int);
insert into user_today values (101, 10), (102, 20), (103, 30);
insert into user_yesterday values (102, 25), (103, 35), (104, 45);
-- Step 2
select COALESCE(t.user_id, y.user_id) as user_id,
t.today_count,
y.yesterday_count
from user_today t
left join user_yesterday y
using (user_id)
union
select COALESCE(y.user_id, t.user_id) as user_id,
t.today_count,
y.yesterday_count
from user_yesterday y
left join user_today t
using (user_id);
Result:
user_id|today_count|yesterday_count|
-------+-----------+---------------+
101| 10| |
102| 20| 25|
103| 30| 35|
104| | 45|
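The two left joins glued together with UNION act as a full outer join, which MySQL does not provide directly. A minimal sketch to check the result shape, run against SQLite through Python's sqlite3 module as a stand-in for MySQL (the join is written with ON instead of USING, which is my own choice and not part of the answer):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user_today     (user_id INTEGER, today_count INTEGER);
CREATE TABLE user_yesterday (user_id INTEGER, yesterday_count INTEGER);
INSERT INTO user_today     VALUES (101, 10), (102, 20), (103, 30);
INSERT INTO user_yesterday VALUES (102, 25), (103, 35), (104, 45);
""")

# Left join in both directions; UNION removes the rows found by both halves.
query = """
SELECT COALESCE(t.user_id, y.user_id) AS user_id,
       t.today_count, y.yesterday_count
FROM user_today t
LEFT JOIN user_yesterday y ON t.user_id = y.user_id
UNION
SELECT COALESCE(y.user_id, t.user_id) AS user_id,
       t.today_count, y.yesterday_count
FROM user_yesterday y
LEFT JOIN user_today t ON t.user_id = y.user_id
ORDER BY user_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

For the original question the natural join key would probably be the hour bucket rather than a user id, but that is an assumption about the data.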

Query average with nested subquery

I cannot figure out how to calculate the running average per customer up until each month.
I tried to write it as one big query using subqueries, and also with joins, but with no luck.
Here is the query I tried with a subquery:
SELECT
date_format(z1.ServiceDate, '%y-%b') as months,
(
SELECT
AVG(cc.total) + 1 AS 'avg'
FROM
(
SELECT
z.Customer_ID,
COUNT(z.BookingId) 'total'
from
Orders z
where
YEAR(z.ServiceDate) <= YEAR(z1.months) AND
MONTH(z.ServiceDate) <= MONTH(z1.months)
GROUP BY
z.Customer_ID
) cc
)
from
Orders z1
GROUP BY
YEAR(z1.ServiceDate),
MONTH(z1.ServiceDate)
I also tried to join these two queries with no luck:
SELECT date_format(Orders.ServiceDate, '%y-%b') from Orders
GROUP BY YEAR(Orders.ServiceDate), month(Orders.ServiceDate)
Could not join it with this one:
(
SELECT AVG(cc.total) + 1 AS 'avg' FROM (
SELECT Orders.Customer_ID as 'c',
COUNT(BookingId) 'total' from Orders
where year(Orders.ServiceDate) <= '2019' and month(Orders.ServiceDate)
<= '01'
GROUP BY Orders.Customer_ID
) cc
)
where '2019' and '01' would be taken from the first query.
Here is my test schema:
CREATE TABLE IF NOT EXISTS `orders` (
`BookingId` INT(6) NOT NULL,
`ServiceDate` DATETIME NOT NULL,
`Customer_ID` varchar(1) NOT NULL,
PRIMARY KEY (`BookingId`)
) DEFAULT CHARSET=utf8;
INSERT INTO `orders` (`BookingId`, `ServiceDate`, `Customer_ID`) VALUES
('1', '2019-01-03T12:00:00', '1'),
('2', '2019-01-04T12:00:00', '2'),
('3', '2019-01-12T12:00:00', '2'),
('4', '2019-02-03T12:00:00', '1'),
('5', '2019-02-04T12:00:00', '2'),
('6', '2019-02-12T12:00:00', '3');
I was expecting something like this for all months
month AVG
19-Jan 1.5
19-Feb 2
...
...
The dots are there only to show that there are many more months in my original dataset.
For January, there were 3 bookings and two Customer_IDs. Therefore the average for bookings up until that month was 1.5. Up until February, there have been 6 bookings and 3 Customer_IDs. Therefore the new average is 2.
Join a subquery that returns the distinct months to the table and aggregate:
SELECT d.month,
COUNT(o.bookingid) / COUNT(DISTINCT o.customer_id) avg
FROM (
SELECT DISTINCT
EXTRACT(YEAR_MONTH FROM servicedate) yearmonth,
DATE_FORMAT(servicedate, '%y-%b') month
FROM orders
) d INNER JOIN orders o
ON EXTRACT(YEAR_MONTH FROM o.servicedate) <= d.yearmonth
GROUP BY d.yearmonth, d.month
See the demo.
Results:
| month | avg |
| ------ | --- |
| 19-Jan | 1.5 |
| 19-Feb | 2 |
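To see why the inequality join gives a running figure, here is a minimal sketch against SQLite via Python's sqlite3 module. It is a stand-in for the MySQL query above: strftime('%Y%m', ...) replaces EXTRACT(YEAR_MONTH FROM ...), and the CAST to REAL is needed only because SQLite would otherwise do integer division.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (BookingId INTEGER PRIMARY KEY,
                     ServiceDate TEXT NOT NULL,
                     Customer_ID TEXT NOT NULL);
INSERT INTO orders VALUES
  (1, '2019-01-03 12:00:00', '1'),
  (2, '2019-01-04 12:00:00', '2'),
  (3, '2019-01-12 12:00:00', '2'),
  (4, '2019-02-03 12:00:00', '1'),
  (5, '2019-02-04 12:00:00', '2'),
  (6, '2019-02-12 12:00:00', '3');
""")

# Each distinct month d is joined to every order made in that month or earlier,
# so the aggregates per d cover all bookings "up until" that month.
query = """
SELECT d.ym,
       CAST(COUNT(o.BookingId) AS REAL) / COUNT(DISTINCT o.Customer_ID) AS avg
FROM (SELECT DISTINCT strftime('%Y%m', ServiceDate) AS ym FROM orders) d
JOIN orders o ON strftime('%Y%m', o.ServiceDate) <= d.ym
GROUP BY d.ym
ORDER BY d.ym
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```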

Combine temporary table of dates with two tables and fill in missing values?

I have a rates table which holds rows of nightly rates per day. I have a ratecodes table which houses different ratecodes mapped to rates.
My goal is to find any missing rates for any days for an X period of time. For this example let's use 1 month.
Desired result: 64 rows of which 2 rows are filled with information with the first rate code. The second rate code has absolutely no rows in rates but I need to show that it's actually missing dates. ( 64 because 1 month from now returns 32 days x 2 rate codes )
Two tables in question:
CREATE TABLE `ratecode` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`ratecode` varchar(100) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8;
INSERT INTO `ratecode` VALUES ('1', 'BLAH');
INSERT INTO `ratecode` VALUES ('2', 'NAH');
CREATE TABLE `rates` (
`thedate` date DEFAULT NULL,
`rate` double DEFAULT NULL,
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`ratecode` int(11) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8;
INSERT INTO `rates` VALUES ('2014-12-27', '999', '1', '1');
INSERT INTO `rates` VALUES ('2014-12-26', '99', '2', '1');
I am using a query in 2 parts. Part 1 is a temporary table of dates from today to 1 month ahead:
CREATE TEMPORARY TABLE IF NOT EXISTS `myDates` AS (
SELECT
CAST((SYSDATE()+INTERVAL (H+T+U) DAY) AS date) d
FROM ( SELECT 0 H
UNION ALL SELECT 100 UNION ALL SELECT 200 UNION ALL SELECT 300
) H CROSS JOIN ( SELECT 0 T
UNION ALL SELECT 10 UNION ALL SELECT 20 UNION ALL SELECT 30
UNION ALL SELECT 40 UNION ALL SELECT 50 UNION ALL SELECT 60
UNION ALL SELECT 70 UNION ALL SELECT 80 UNION ALL SELECT 90
) T CROSS JOIN ( SELECT 0 U
UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3
UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6
UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
) U
WHERE
(SYSDATE()+INTERVAL (H+T+U) DAY) <= (SYSDATE()+INTERVAL 1 MONTH)
ORDER BY d ASC
);
And part 2 is the actual selection going on:
SELECT
*
FROM
rates
RIGHT JOIN myDates ON ( myDates.d = rates.thedate )
LEFT OUTER JOIN ratecode ON ( rates.ratecode = ratecode.id )
This returns only 32 rows back because in rates, there are 2 records for the first entry in ratecode. I don't get back the 32 missing rows for the other ratecode. How can I adjust in order to retain this information?
After I get the 64 rows back, I also need to filter for which ones are "blank" or haven't been entered in rates. So missing values only.
If I understand correctly, you want to generate all the rows using a cross join, then left join to the data and filter out all the matches:
select rc.ratecode, d.d as missingdate
from ratecode rc cross join
mydates d left join
rates r
on rc.id = r.ratecode and d.d = r.thedate
where r.id is null;
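A minimal sketch of the same cross join plus anti-join, run on SQLite through Python's sqlite3 module. The three-day mydates table is a made-up, shortened calendar so the output stays readable; in the question it would be the generated temporary table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ratecode (id INTEGER PRIMARY KEY, ratecode TEXT);
CREATE TABLE rates (thedate TEXT, rate REAL,
                    id INTEGER PRIMARY KEY, ratecode INTEGER);
CREATE TABLE mydates (d TEXT);  -- hypothetical 3-day calendar
INSERT INTO ratecode VALUES (1, 'BLAH'), (2, 'NAH');
INSERT INTO rates VALUES ('2014-12-27', 999, 1, 1), ('2014-12-26', 99, 2, 1);
INSERT INTO mydates VALUES ('2014-12-26'), ('2014-12-27'), ('2014-12-28');
""")

# Every (rate code, date) pair is generated first; pairs with no matching
# row in rates come back with r.id NULL and are the missing ones.
query = """
SELECT rc.ratecode, d.d AS missingdate
FROM ratecode rc
CROSS JOIN mydates d
LEFT JOIN rates r ON rc.id = r.ratecode AND d.d = r.thedate
WHERE r.id IS NULL
ORDER BY rc.ratecode, d.d
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Dropping the WHERE clause returns the full grid (here 6 rows, 64 in the question), with the rate columns filled in where a rate exists.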

INSERT SELECT ON DUPLICATE not updating

Short
I want to SUM a column in TABLE_A based on CRITERIA X and insert into TABLE_B.total_x
I want to SUM a column in TABLE_A based on CRITERIA Y and insert into TABLE_B.total_y
Problem: Step 2 does not update TABLE_B.total_y
LONG
TABLE_A: Data
| year | month | type | total |
---------------------------------------
| 2013 | 11 | down | 100 |
| 2013 | 11 | down | 50 |
| 2013 | 11 | up | 60 |
| 2013 | 10 | down | 200 |
| 2013 | 10 | up | 15 |
| 2013 | 10 | up | 9 |
TABLE_B: structure
CREATE TABLE `TABLE_B` (
`year` INT(4) NULL DEFAULT NULL,
`month` INT(2) UNSIGNED ZEROFILL NULL DEFAULT NULL,
`total_x` INT(10) NULL DEFAULT NULL,
`total_y` INT(10) NULL DEFAULT NULL,
UNIQUE INDEX `unique` (`year`, `month`)
)
SQL: CRITERIA_X
INSERT INTO TABLE_B (
`year`, `month`, `total_x`
)
SELECT
t.`year`, t.`month`,
SUM(t.`total`) as total_x
FROM TABLE_A t
WHERE
t.`type` = 'down'
GROUP BY
t.`year`, t.`month`
ON DUPLICATE KEY UPDATE
`total_x` = total_x
;
SQL: CRITERIA_Y
INSERT INTO TABLE_B (
`year`, `month`, `total_y`
)
SELECT
t.`year`, t.`month`,
SUM(t.`total`) as total_y
FROM TABLE_A t
WHERE
t.`type` = 'up'
GROUP BY
t.`year`, t.`month`
ON DUPLICATE KEY UPDATE
`total_y` = total_y
;
The second SQL (CRITERIA_Y) does not update total_y as expected. WHY?
I would do it another way, computing both totals in a single pass:
insert into TABLE_B (year, month, total_x, total_y)
select year, month
, sum(case type when 'down' then total else 0 end) as total_x
, sum(case type when 'up' then total else 0 end) as total_y
from TABLE_A
group by year, month
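A minimal sketch of the single-pass insert above, run on SQLite through Python's sqlite3 module with the TABLE_A rows from the question. SQLite is only a stand-in for MySQL here; the CASE aggregation itself is plain SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_a (year INTEGER, month INTEGER, type TEXT, total INTEGER);
CREATE TABLE table_b (year INTEGER, month INTEGER,
                      total_x INTEGER, total_y INTEGER,
                      UNIQUE (year, month));
INSERT INTO table_a VALUES
  (2013, 11, 'down', 100), (2013, 11, 'down', 50), (2013, 11, 'up', 60),
  (2013, 10, 'down', 200), (2013, 10, 'up', 15),   (2013, 10, 'up', 9);
""")

# One scan of table_a: each CASE routes a row's total into the right column.
conn.execute("""
INSERT INTO table_b (year, month, total_x, total_y)
SELECT year, month,
       SUM(CASE type WHEN 'down' THEN total ELSE 0 END),
       SUM(CASE type WHEN 'up'   THEN total ELSE 0 END)
FROM table_a
GROUP BY year, month
""")

rows = conn.execute(
    "SELECT year, month, total_x, total_y FROM table_b ORDER BY year, month"
).fetchall()
for row in rows:
    print(row)
```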
Or, using two subqueries (this relies on FULL OUTER JOIN, which MySQL does not support, so it only applies to engines that do):
insert into TABLE_B (year, month, total_x, total_y)
select coalesce(t1.year, t2.year) year
, coalesce(t1.month, t2.month) month
, t1.total_x total_x
, t2.total_y total_y
from (select year, month, sum(total) total_x
from TABLE_A where type='down'
group by year, month) t1
full outer join
(select year, month, sum(total) total_y
from TABLE_A where type='up'
group by year, month) t2
on t1.year = t2.year and t1.month = t2.month
Or using union:
insert into TABLE_B (year, month, total_x, total_y)
select year, month, sum(total_x), sum(total_y)
from (
select year, month, sum(total) total_x, 0 total_y
from TABLE_A where type='down'
group by year, month
union
select year, month, 0 total_x, sum(total) total_y
from TABLE_A where type='up'
group by year, month) t
group by year, month
Reading specs on INSERT...ON DUPLICATE KEY UPDATE, I noticed this:
If ... matches several rows, only one row is updated. In general, you should try to avoid using an ON DUPLICATE KEY UPDATE clause on tables with multiple unique indexes.
So syntax with composite key is kind of cumbersome, and I personally would avoid using it.
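On the "WHY?" in the question: inside ON DUPLICATE KEY UPDATE, a bare total_y refers to the column of the existing row, so `total_y` = total_y assigns the column to itself. In MySQL the value that would have been inserted is reached with VALUES(total_y). The sketch below reproduces the effect on SQLite through Python's sqlite3 module, where ON CONFLICT ... DO UPDATE and the excluded. prefix play the corresponding roles. This is a stand-in, not MySQL syntax, and it assumes an SQLite build of 3.24 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_a (year INTEGER, month INTEGER, type TEXT, total INTEGER);
CREATE TABLE table_b (year INTEGER, month INTEGER,
                      total_x INTEGER, total_y INTEGER,
                      UNIQUE (year, month));
INSERT INTO table_a VALUES
  (2013, 11, 'down', 100), (2013, 11, 'down', 50), (2013, 11, 'up', 60),
  (2013, 10, 'down', 200), (2013, 10, 'up', 15),   (2013, 10, 'up', 9);
""")

def upsert(column, row_type, assignment):
    # Insert per-month sums; on a key clash apply the given SET expression.
    conn.execute(f"""
        INSERT INTO table_b (year, month, {column})
        SELECT year, month, SUM(total)
        FROM table_a
        WHERE type = ?
        GROUP BY year, month
        ON CONFLICT (year, month) DO UPDATE SET {column} = {assignment}
    """, (row_type,))

def snapshot():
    return conn.execute(
        "SELECT year, month, total_x, total_y FROM table_b ORDER BY year, month"
    ).fetchall()

upsert("total_x", "down", "excluded.total_x")  # step 1: rows are created
upsert("total_y", "up", "total_y")             # step 2 as in the question: no-op
before = snapshot()
upsert("total_y", "up", "excluded.total_y")    # step 2 using the incoming value
after = snapshot()
print(before)
print(after)
```

The MySQL counterpart of the last call would be ON DUPLICATE KEY UPDATE `total_y` = VALUES(`total_y`).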

Sum amount of overlapping datetime ranges in MySQL

I have a question that is almost the same as Sum amount of overlapping datetime ranges in MySQL, so I'm reusing part of his text, hope that is ok...
I have a table of events, each with a StartTime and EndTime (as type DateTime) in a MySQL Table.
I'm trying to output the sum of overlapping times for each type of event and the number of events that overlapped.
What is the most efficient / simple way to perform this query in MySQL?
CREATE TABLE IF NOT EXISTS `events` (
`EventID` int(10) unsigned NOT NULL auto_increment,
`EventType` int(10) unsigned NOT NULL,
`StartTime` datetime NOT NULL,
`EndTime` datetime default NULL,
PRIMARY KEY (`EventID`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=37 ;
INSERT INTO `events` (`EventID`, EventType,`StartTime`, `EndTime`) VALUES
(10001,1, '2009-02-09 03:00:00', '2009-02-09 10:00:00'),
(10002,1, '2009-02-09 05:00:00', '2009-02-09 09:00:00'),
(10003,1, '2009-02-09 07:00:00', '2009-02-09 09:00:00'),
(10004,3, '2009-02-09 11:00:00', '2009-02-09 13:00:00'),
(10005,3, '2009-02-09 12:00:00', '2009-02-09 14:00:00');
# if the query was run using the data above,
# the table below would be the desired output
# Number of Overlapped Events , The event type, | Total Amount of Time those events overlapped.
1,1, 03:00:00
2,1, 02:00:00
3,1, 02:00:00
1,3, 01:00:00
There is a really beautiful solution given there by Mark Byers and I'm wondering if that one can be extended to include "Event Type".
His solution without event type was:
SELECT `COUNT`, SEC_TO_TIME(SUM(Duration))
FROM (
SELECT
COUNT(*) AS `Count`,
UNIX_TIMESTAMP(Times2.Time) - UNIX_TIMESTAMP(Times1.Time) AS Duration
FROM (
SELECT @rownum1 := @rownum1 + 1 AS rownum, `Time`
FROM (
SELECT DISTINCT(StartTime) AS `Time` FROM events
UNION
SELECT DISTINCT(EndTime) AS `Time` FROM events
) AS AllTimes, (SELECT @rownum1 := 0) AS Rownum
ORDER BY `Time` DESC
) As Times1
JOIN (
SELECT @rownum2 := @rownum2 + 1 AS rownum, `Time`
FROM (
SELECT DISTINCT(StartTime) AS `Time` FROM events
UNION
SELECT DISTINCT(EndTime) AS `Time` FROM events
) AS AllTimes, (SELECT @rownum2 := 0) AS Rownum
ORDER BY `Time` DESC
) As Times2
ON Times1.rownum = Times2.rownum + 1
JOIN events ON Times1.Time >= events.StartTime AND Times2.Time <= events.EndTime
GROUP BY Times1.rownum
) Totals
GROUP BY `Count`
SELECT
COUNT(*) as occurrence
, sub.event_id
, SEC_TO_TIME(SUM(LEAST(e1end, e2end) - GREATEST(e1start, e2start))) as duration
FROM
( SELECT
e1.EventID as event_id
, UNIX_TIMESTAMP(e1.starttime) as e1start
, UNIX_TIMESTAMP(e1.endtime) as e1end
, UNIX_TIMESTAMP(e2.starttime) as e2start
, UNIX_TIMESTAMP(e2.endtime) as e2end
FROM events e1
INNER JOIN events e2
ON (e1.eventtype = e2.eventtype AND e1.EventID <> e2.EventID
AND NOT(e1.starttime > e2.endtime OR e1.endtime < e2.starttime))
) sub
GROUP BY sub.event_id
ORDER BY occurrence DESC
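For comparison, here is a minimal sketch of a per-type sweep over start and end times, written in plain Python rather than SQL. It assumes that "number of overlapped events" means the number of events of one type that are active at the same moment; the function name overlap_totals is made up for this example.

```python
from collections import defaultdict
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"

# Sample rows from the question: (EventID, EventType, StartTime, EndTime).
events = [
    (10001, 1, "2009-02-09 03:00:00", "2009-02-09 10:00:00"),
    (10002, 1, "2009-02-09 05:00:00", "2009-02-09 09:00:00"),
    (10003, 1, "2009-02-09 07:00:00", "2009-02-09 09:00:00"),
    (10004, 3, "2009-02-09 11:00:00", "2009-02-09 13:00:00"),
    (10005, 3, "2009-02-09 12:00:00", "2009-02-09 14:00:00"),
]

def overlap_totals(rows):
    """Return {(event_type, active_count): total time at that count}."""
    marks = defaultdict(list)
    for _event_id, event_type, start, end in rows:
        marks[event_type].append((datetime.strptime(start, FMT), +1))
        marks[event_type].append((datetime.strptime(end, FMT), -1))

    totals = defaultdict(timedelta)
    for event_type, points in marks.items():
        points.sort()  # chronological; an end sorts before a start at the same time
        active, previous = 0, None
        for when, delta in points:
            if previous is not None and active > 0:
                # The span since the last boundary had `active` events running.
                totals[(event_type, active)] += when - previous
            active += delta
            previous = when
    return dict(totals)

totals = overlap_totals(events)
for (event_type, count), duration in sorted(totals.items()):
    print(count, event_type, duration)
```

Keeping a separate list of boundaries per event type is what stops events of different types from being counted against each other.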