Date split in MySQL based on rate info in table

The system is a hotel management software with multiple hotels attached to it. The schema is as follows:
CREATE TABLE `ms_property` (
`id` int(10) NOT NULL,
`name` varchar(254) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `ms_property` (`id`, `name`) VALUES(1, 'Black Forest');
CREATE TABLE `ms_property_room` ( `id` int(10) NOT NULL, `property_id` int(10) NOT NULL,
`room_name` varchar(254) NOT NULL) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `ms_property_room` (`id`, `property_id`, `room_name`) VALUES (1, 1, 'Standard Room'),
(2, 1, 'AC Room');
CREATE TABLE `ms_tariff_type` (
`tt_id` bigint(20) NOT NULL,
`tt_tariff_name` text
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `ms_tariff_type` (`tt_id`,`tt_tariff_name`) VALUES
(1, 'Season Rates'),
(2, 'Contracted Rates');
CREATE TABLE `room_tariff` (
`id` bigint(20) NOT NULL ,
`room_id` bigint(20) ,
`tariff_type_id` bigint(20) ,
`tariff_from` date,
`tariff_to` date,
`single_rate` int(11),
`default_rate` int(11)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `room_tariff` (`id`, `room_id`,`tariff_type_id`,`tariff_from`, `tariff_to`, `single_rate`, `default_rate`) VALUES
(1, 1, 1, '2019-01-01', '2019-01-20',1000,2000),
(2, 1, 2, '2019-02-06', '2019-02-12',5000,10000),
(3, 2, 1, '2019-03-05', '2019-04-10',8000,7000);
CREATE TABLE `tariff_hike_day` (
`id` bigint(20) NOT NULL,
`room_id` bigint(20) ,
`tariff_type_id` bigint(20) ,
`hd_tariff_from` date,
`hd_tariff_to` date,
`hd_single_rate` int(11),
`hd_default_rate` int(11),
`thd_sunday` smallint(6) COMMENT 'Is rate applicable on Sunday 1=>yes 0=>no',
`thd_monday` smallint(6) COMMENT 'Is rate applicable on Monday 1=>yes 0=>no',
`thd_thuesday` smallint(6) COMMENT 'Is rate applicable on Tuesday 1=>yes 0=>no',
`thd_wednesday` smallint(6) COMMENT 'Is rate applicable on Wednesday 1=>yes 0=>no',
`thd_thursday` smallint(6) COMMENT 'Is rate applicable on Thursday 1=>yes 0=>no',
`thd_friday` smallint(6) COMMENT 'Is rate applicable on Friday 1=>yes 0=>no',
`thd_saturday` smallint(6) COMMENT 'Is rate applicable on Saturday 1=>yes 0=>no'
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `tariff_hike_day` (`id`, `room_id`, `tariff_type_id`,`hd_tariff_from`, `hd_tariff_to`, `hd_single_rate`, `hd_default_rate`, `thd_sunday`, `thd_monday`, `thd_thuesday`, `thd_wednesday`, `thd_thursday`, `thd_friday`, `thd_saturday`) VALUES
(1, 1, 1, '2019-01-05', '2019-01-10',100,200, 1, 1, 1, 1, 1, 1, 1),
(1, 2, 1, '2019-03-09', '2019-03-25',400,600, 1, 0, 0, 1, 0, 0, 0);
The goal is to display the room rates that apply to each hotel, based on rate information stored in two tables. A room normally has several rate types, such as "Contracted Rates" and "Season Rates", and for each type the Hotel Administrative Team enters the applicable rates and the date range in which they apply.
The problem arises when the Hotel Administrative Team wants to specify additional hikes that apply only on certain days. This information is stored in the tariff_hike_day table, where the team can specify the date range and the days of the week (Sunday, Monday, etc.) on which the hike is added to the base rate.
When the full entry is completed, the system is expected to display the result as follows:
+-------+---------------+---------------+------------------+------------+------------+-------------+--------------+
| Sl No | Property Name | Room | Tariff Type | Date From | Date To | Single Rate | Default Rate |
+-------+---------------+---------------+------------------+------------+------------+-------------+--------------+
| 1 | Black Forest | Standard Room | Season Rates | 2019-01-01 | 2019-01-04 | 1000 | 2000 |
| 2 | Black Forest | Standard Room | Season Rates | 2019-01-05 | 2019-01-10 | 1100 | 2200 |
| 3 | Black Forest | Standard Room | Season Rates | 2019-01-11 | 2019-01-20 | 1000 | 2000 |
| 4 | Black Forest | Standard Room | Contracted Rates | 2019-02-06 | 2019-02-12 | 5000 | 10000 |
| 5 | Black Forest | AC Room | Season Rates | 2019-03-05 | 2019-03-09 | 8000 | 7000 |
| 6 | Black Forest | AC Room | Season Rates | 2019-03-10 | 2019-03-10 | 8400 | 8600 |
| 7 | Black Forest | AC Room | Season Rates | 2019-03-11 | 2019-03-12 | 8000 | 7000 |
| 8 | Black Forest | AC Room | Season Rates | 2019-03-13 | 2019-03-13 | 8400 | 8600 |
| 9 | Black Forest | AC Room | Season Rates | 2019-03-14 | 2019-03-16 | 8000 | 7000 |
| 10 | Black Forest | AC Room | Season Rates | 2019-03-17 | 2019-03-17 | 8400 | 8600 |
| 11 | Black Forest | AC Room | Season Rates | 2019-03-18 | 2019-03-19 | 8000 | 7000 |
| 12 | Black Forest | AC Room | Season Rates | 2019-03-20 | 2019-03-20 | 8400 | 8600 |
| 13 | Black Forest | AC Room | Season Rates | 2019-03-21 | 2019-03-23 | 8000 | 7000 |
| 14 | Black Forest | AC Room | Season Rates | 2019-03-24 | 2019-03-24 | 8400 | 8600 |
| 15 | Black Forest | AC Room | Season Rates | 2019-03-25 | 2019-04-10 | 8000 | 7000 |
+-------+---------------+---------------+------------------+------------+------------+-------------+--------------+
Any help would be appreciated.

I know this answer is a bit late, but I hope it helps.
First, make sure there are no overlapping date ranges for the same room and tariff type within either table, room_tariff or tariff_hike_day.
You can check for overlaps with the queries below.
Finding Duplicate Dates(Overlapped dates) in room_tariff Table
SELECT
a.*
FROM
`room_tariff` AS a
INNER JOIN `room_tariff` AS b
ON a.`id` != b.`id`
AND a.`room_id` = b.`room_id`
AND a.`tariff_type_id` = b.`tariff_type_id`
AND NOT (
(
a.`tariff_from` > b.`tariff_from`
AND a.`tariff_from` > b.`tariff_to`
)
OR (
a.`tariff_to` < b.`tariff_from`
AND a.`tariff_to` < b.`tariff_to`
)
)
GROUP BY a.`room_id`,
a.`tariff_type_id`,
a.`tariff_from`,
a.`tariff_to`
ORDER BY a.`room_id` ASC,
a.`tariff_type_id` ASC,
a.`tariff_from` ASC ;
Finding Duplicate Dates(Overlapped dates) in tariff_hike_day Table
SELECT
a.*
FROM
`tariff_hike_day` AS a
INNER JOIN `tariff_hike_day` AS b
ON a.`id` != b.`id`
AND a.`room_id` = b.`room_id`
AND a.`tariff_type_id` = b.`tariff_type_id`
AND NOT (
(
a.`hd_tariff_from` > b.`hd_tariff_from`
AND a.`hd_tariff_from` > b.`hd_tariff_to`
)
OR (
a.`hd_tariff_to` < b.`hd_tariff_from`
AND a.`hd_tariff_to` < b.`hd_tariff_to`
)
)
GROUP BY a.`room_id`,
a.`tariff_type_id`,
a.`hd_tariff_from`,
a.`hd_tariff_to`
ORDER BY a.`room_id` ASC,
a.`tariff_type_id` ASC,
a.`hd_tariff_from` ASC ;
Both queries should return zero rows; any row returned is part of an overlap. Each query joins the table to itself and checks for overlapping date ranges for the same room and tariff type.
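When every range satisfies from <= to, the condition in both queries reduces to one rule: two inclusive date ranges overlap unless one starts after the other ends. A minimal Python sketch of that rule, using date ranges from the question's sample data (the function name `ranges_overlap` and the second, conflicting range are made up for illustration):

```python
from datetime import date

def ranges_overlap(a_from, a_to, b_from, b_to):
    """Return True when two inclusive date ranges share at least one day.

    Assumes each range is well formed (from <= to), as the queries do.
    """
    return not (a_from > b_to or a_to < b_from)

# Date ranges of two room_tariff rows from the question: no shared day.
print(ranges_overlap(date(2019, 1, 1), date(2019, 1, 20),
                     date(2019, 2, 6), date(2019, 2, 12)))

# A hypothetical conflicting range that starts inside the first one.
print(ranges_overlap(date(2019, 1, 1), date(2019, 1, 20),
                     date(2019, 1, 15), date(2019, 1, 25)))
```

The first call prints False and the second prints True. Ranges that merely touch on one day (one ends on the day the other starts) also count as overlapping, because both ends are inclusive.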
To get the result you expect, you can use a stored procedure like the following.
DELIMITER $$
DROP PROCEDURE IF EXISTS `testprocedure`$$
CREATE PROCEDURE `testprocedure`()
BEGIN
DECLARE my_id,
my_room_id,
my_tariff_type_id,
my_hd_id BIGINT ;
DECLARE my_single_rate,
my_default_rate,
my_hd_single_rate,
my_hd_default_rate INT ;
DECLARE my_tariff_from,
my_tariff_to,
my_hd_tariff_from,
my_hd_tariff_to,
currentdate,
startdate,
stopdate DATE ;
DECLARE my_thd_sunday,
my_thd_monday,
my_thd_tuesday,
my_thd_wednesday,
my_thd_thursday,
my_thd_friday,
my_thd_saturday SMALLINT ;
DECLARE cur_done INTEGER DEFAULT 0 ;
DECLARE `should_rollback` BOOL DEFAULT FALSE;
DECLARE cur1 CURSOR FOR
SELECT
a1.*,
a2.id,
hd_tariff_from,
hd_tariff_to,
hd_single_rate,
hd_default_rate,
thd_sunday,
thd_monday,
thd_thuesday,
thd_wednesday,
thd_thursday,
thd_friday,
thd_saturday
FROM
`room_tariff` AS a1
LEFT JOIN `tariff_hike_day` a2
ON a1.`room_id` = a2.`room_id`
AND a1.`tariff_type_id` = a2.`tariff_type_id`
AND a2.`hd_tariff_from` != '0000-00-00'
AND NOT (
a1.`tariff_from` > a2.`hd_tariff_to`
OR a1.`tariff_to` < a2.`hd_tariff_from`
)
WHERE a1.tariff_from != '0000-00-00'
AND a1.`tariff_from` <= a1.`tariff_to`
ORDER BY a1.`room_id` ASC,
a1.`tariff_type_id` ASC,
a1.`tariff_from` ASC,
a2.`hd_tariff_from` ASC ;
DECLARE CONTINUE HANDLER FOR NOT FOUND SET cur_done = 1 ;
DECLARE CONTINUE HANDLER FOR SQLEXCEPTION SET `should_rollback` = TRUE;
START TRANSACTION;
CREATE TABLE IF NOT EXISTS `room_rate_temp` (
`id` INT (11) UNSIGNED NOT NULL AUTO_INCREMENT,
`room_id` BIGINT (20) NOT NULL,
`tariff_type_id` BIGINT (20) NOT NULL,
`tariff_from` DATE NOT NULL,
`tariff_to` DATE NOT NULL,
`single_rate` INT (11) NOT NULL,
`default_rate` INT (11) NOT NULL,
`resultset_id` INT (11) UNSIGNED NOT NULL,
PRIMARY KEY (`id`)
) ENGINE = INNODB DEFAULT CHARSET = utf8 ;
SET @last_res_id := 0 ;
TRUNCATE TABLE room_rate_temp ;
OPEN cur1 ;
loop_matched_tables :
LOOP
FETCH cur1 INTO my_id,
my_room_id,
my_tariff_type_id,
my_tariff_from,
my_tariff_to,
my_single_rate,
my_default_rate,
my_hd_id,
my_hd_tariff_from,
my_hd_tariff_to,
my_hd_single_rate,
my_hd_default_rate,
my_thd_sunday,
my_thd_monday,
my_thd_tuesday,
my_thd_wednesday,
my_thd_thursday,
my_thd_friday,
my_thd_saturday ;
IF cur_done = 1 THEN
CLOSE cur1 ;
LEAVE loop_matched_tables ;
END IF ;
IF my_tariff_from <= my_tariff_to THEN
IF @last_res_id = my_id THEN
SELECT id,tariff_from FROM `room_rate_temp` WHERE `resultset_id` = my_id ORDER BY id DESC LIMIT 1 INTO @lastid,@last_tariff_from ;
SET my_tariff_from := @last_tariff_from ;
DELETE FROM room_rate_temp WHERE id = @lastid ;
END IF ;
IF my_hd_id IS NULL THEN
INSERT INTO room_rate_temp
VALUES
(
NULL,
my_room_id,
my_tariff_type_id,
my_tariff_from,
my_tariff_to,
my_single_rate,
my_default_rate,
my_id
) ;
ELSE
IF ( my_hd_tariff_from <= my_hd_tariff_to ) THEN
SET startdate := my_tariff_from ;
SET currentdate := my_tariff_from ;
SET stopdate := my_tariff_to ;
SET @insflag := 1 ;
SET @last_insid := @last_hike_flag := @hiketablecovered := @splitonce := 0 ;
WHILE
currentdate <= stopdate DO
SET @my_repeat_col_name := DAYNAME(currentdate) ;
SET @hd_single_rate := my_single_rate ;
SET @hd_default_rate := my_default_rate ;
SELECT
CASE
@my_repeat_col_name
WHEN 'Sunday'
THEN my_thd_sunday
WHEN 'Monday'
THEN my_thd_monday
WHEN 'Tuesday'
THEN my_thd_tuesday
WHEN 'Wednesday'
THEN my_thd_wednesday
WHEN 'Thursday'
THEN my_thd_thursday
WHEN 'Friday'
THEN my_thd_friday
WHEN 'Saturday'
THEN my_thd_saturday
ELSE NULL
END AS mydate INTO @hikeapplicable ;
IF ( currentdate BETWEEN my_hd_tariff_from AND my_hd_tariff_to ) THEN
IF ( @last_hike_flag != @hikeapplicable ) THEN
SET @insflag := 1 ;
SET @last_hike_flag := @hikeapplicable ;
SET @splitonce := 1 ;
IF ( @hikeapplicable = 1 ) THEN
SET @hd_single_rate := my_single_rate + my_hd_single_rate ;
SET @hd_default_rate := my_default_rate + my_hd_default_rate ;
END IF ;
END IF ;
SET @hiketablecovered := 1;
ELSEIF ( (currentdate > my_hd_tariff_to) AND ( @hiketablecovered = 1 ) AND (@splitonce = 1) ) THEN
IF(@last_hike_flag = 1) THEN
SET @insflag := 1;
END IF ;
SET @hiketablecovered := @splitonce := 0 ;
END IF ;
IF (@insflag = 1) THEN
INSERT INTO room_rate_temp VALUES ( NULL, my_room_id, my_tariff_type_id, currentdate, currentdate, @hd_single_rate, @hd_default_rate, my_id );
SET @last_insid := LAST_INSERT_ID() ;
SET @insflag := 0 ;
ELSE
UPDATE room_rate_temp SET tariff_to = currentdate WHERE id = @last_insid;
END IF ;
SET currentdate = ADDDATE(currentdate, INTERVAL 1 DAY) ;
END WHILE ;
END IF ;
END IF ;
SET @last_res_id := my_id;
END IF ;
END LOOP loop_matched_tables ;
SET @count:=0;
SELECT (@count:=@count+1) AS `Sl No`, d.name AS `Property Name`, c.room_name AS Room, b.tt_tariff_name AS `Tariff Type`, a.tariff_from AS `Date From`, a.tariff_to AS `Date To`, a.single_rate AS `Single Rate`, a.default_rate AS `Default Rate`
FROM room_rate_temp AS a INNER JOIN ms_tariff_type AS b ON a.tariff_type_id = b.tt_id INNER JOIN ms_property_room AS c
ON a.room_id = c.id INNER JOIN ms_property AS d ON c.property_id = d.id;
IF `should_rollback` THEN
ROLLBACK;
ELSE
COMMIT;
END IF;
END$$
DELIMITER ;
In this procedure:
The result is stored in a regular table, room_rate_temp, which is emptied at the start of each call, so the last result stays available until the procedure runs again.
First, the tariff and hike tables are joined to find hikes whose date range overlaps each tariff.
Then the procedure loops over the joined rows and splits a tariff into separate rows wherever a hike applies.
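The row-splitting idea can be sketched outside SQL. The Python below is not a port of the procedure: it uses a simpler rule (start a new row whenever a day's rates differ from the previous day's) and assumes at most one hike per tariff. The function name `split_by_hike` and the tuple layouts are hypothetical.

```python
from datetime import date, timedelta

def split_by_hike(base, hike):
    """Split one base tariff into runs of consecutive days with equal rates.

    base: (date_from, date_to, single_rate, default_rate)
    hike: (date_from, date_to, single_add, default_add, weekdays) or None,
          where weekdays is a set of date.weekday() values (Monday=0 ... Sunday=6).
    """
    start, stop, single, default = base
    segments = []
    day = start
    while day <= stop:
        rates = (single, default)
        if hike is not None:
            h_from, h_to, h_single, h_default, weekdays = hike
            if h_from <= day <= h_to and day.weekday() in weekdays:
                rates = (single + h_single, default + h_default)
        if segments and segments[-1][2:] == rates:
            # Same rates as the previous day: extend the current run.
            segments[-1] = (segments[-1][0], day) + rates
        else:
            # Rates changed (or first day): start a new run.
            segments.append((day, day) + rates)
        day += timedelta(days=1)
    return segments

# Standard Room, Season Rates: the hike applies every day from 5 to 10 January.
base = (date(2019, 1, 1), date(2019, 1, 20), 1000, 2000)
hike = (date(2019, 1, 5), date(2019, 1, 10), 100, 200, set(range(7)))
for seg in split_by_hike(base, hike):
    print(seg[0], seg[1], seg[2], seg[3])
```

For the Standard Room sample this prints three rows (1 to 4 January at 1000/2000, 5 to 10 January at 1100/2200, 11 to 20 January at 1000/2000), which matches rows 1 to 3 of the expected output in the question.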

Related

SQL procedure - Multiple value

I have this procedure :
BEGIN
DECLARE done INT DEFAULT FALSE;
DECLARE `id_var` varchar(255);
DECLARE `cur1` CURSOR FOR
SELECT `id` FROM `clients`
WHERE `status` = 'Active';
DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;
DROP TABLE IF EXISTS `tblquota_nc`;
CREATE TABLE IF NOT EXISTS `tblquota_nc` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(11) NOT NULL,
`email` varchar(255),
`pack_id` int(11) NOT NULL,
`pack_name` varchar(255) NOT NULL,
`quota` int(11) NULL,
PRIMARY KEY (`id`)
);
OPEN cur1;
read_loop: LOOP
FETCH NEXT
FROM cur1
INTO id_var;
IF done THEN
LEAVE read_loop;
END IF;
SELECT clients.id as `User ID`, `email`, `packageid` as `Pack ID`, `name` as `Pack`, (CASE
WHEN `name` = "Basic" THEN '10'
WHEN `name` = "Silver" THEN '100'
WHEN `name` = "Gold" THEN '1000'
ELSE '10'
END) as Quota INTO @mid, @mail, @p_id, @packname, @quota
FROM `clients`
INNER JOIN `tblhosting` ON clients.id = tblhosting.userid
INNER JOIN `tblproducts` ON tblhosting.packageid = tblproducts.id
WHERE clients.status = 'Active'
AND tblhosting.domainstatus = 'Active'
AND clients.id = id_var;
IF (SELECT id FROM tblquota_nc WHERE user_id = @mid) THEN
BEGIN
END;
ELSE
BEGIN
if (@mid IS NOT NULL AND @p_id IS NOT NULL AND @packname IS NOT NULL) then
INSERT INTO tblquota_nc (user_id, email, pack_id, pack_name, quota) VALUES (@mid, @mail, @p_id, @packname, @quota);
end if;
END;
END IF;
END LOOP;
CLOSE cur1;
END
It seems I get an error because the SELECT statement returns several values. I thought about adding another loop over these results to insert them into the new table. I want to build a new table from this information.
table clients:
id | email | status
----------------------------
1 | user1@mail.com | Active
2 | user2@mail.com | Inactive
3 | user3@mail.com | Active
table tblhosting
id | userid | packageid | domainstatus
------------------------------------------------
1 | 1 | 2 | Active
2 | 2 | 3 | Active
3 | 3 | 1 | Active
table tblproducts
id | name
-----------
1 | Basic
2 | Silver
3 | Gold
I expect result like :
id | user_id | email | pack_id | pack_name | quota
-----------------------------------------------------------
1 | 1 | user1@mail.com | 2 | Silver | 100
2 | 2 | user2@mail.com | 3 | Gold | 1000
3 | 3 | user3@mail.com | 1 | Basic | 10
If I put MAX in the CASE statement it works, but it does not show all the data.
I don't think you need a stored procedure to do this. Just use CREATE TABLE ... SELECT syntax:
CREATE TABLE tblquota_nc (id INT AUTO_INCREMENT PRIMARY KEY) AS
SELECT c.id as user_id
, c.`email`
, h.`packageid` as pack_id
, p.`name` as pack_name
, (CASE WHEN `name` = "Basic" THEN '10'
WHEN `name` = "Silver" THEN '100'
WHEN `name` = "Gold" THEN '1000'
ELSE '10'
END) as quota
FROM `clients` c
LEFT JOIN `tblhosting` h ON c.id = h.userid
INNER JOIN `tblproducts` p ON h.packageid = p.id
ORDER BY c.id;
Output from SELECT * FROM tblquota_nc:
id user_id email pack_id pack_name quota
1 1 user1@mail.com 2 Silver 100
2 2 user2@mail.com 3 Gold 1000
3 3 user3@mail.com 1 Basic 10

SQL performance issue : find a route

I'm struggling with a performance issue in one of my SQL queries.
I have a train journey that passes through 5 stations named "A - B - C - D - E".
A passenger books a ticket for only the "B - C - D" part of the ride.
I need to retrieve all the stations my passenger goes through.
What I have stored:
JOURNEY
+----+--------------------+-------------------+-------------------+-----------------+
| id | departure_datetime | arrival_datetime | departure_station | arrival_station |
+----+--------------------+-------------------+-------------------+-----------------+
| 1 | 2018-01-01 06:00 | 2018-01-01 10:00 | A | E |
+----+--------------------+-------------------+-------------------+-----------------+
BOOKING
+----+------------+-------------------+-----------------+
| id | journey_id | departure_station | arrival_station |
+----+------------+-------------------+-----------------+
| 1 | 1 | B | D |
+----+------------+-------------------+-----------------+
LEG
+----+------------+-------------------+-----------------+------------------+------------------+
| id | journey_id | departure_station | arrival_station | departure_time | arrival_time |
+----+------------+-------------------+-----------------+------------------+------------------+
| 1 | 1 | A | B | 2018-01-01 06:00 | 2018-01-01 07:00 |
| 2 | 1 | B | C | 2018-01-01 07:00 | 2018-01-01 08:00 |
| 3 | 1 | C | D | 2018-01-01 08:00 | 2018-01-01 09:00 |
| 4 | 1 | D | E | 2018-01-01 09:00 | 2018-01-01 10:00 |
+----+------------+-------------------+-----------------+------------------+------------------+
The only way I found to retrieve the stations is:
select b.id as booking, l.departure_station, l.arrival_station
from JOURNEY j
inner join BOOKING b on j.id = b.journey_id
inner join LEG dl on (j.id = dl.journey_id and b.departure_station = dl.departure_station)
inner join LEG al on (j.id = al.journey_id and b.arrival_station = al.arrival_station)
inner join LEG l on (j.id = l.journey_id and l.departure_time >= dl.departure_time and l.arrival_time <= al.arrival_time)
where b.id = 1
But my LEG table is huge, and doing these 3 joins on it is very slow. Is there a way to join the LEG table only once to improve performance?
Intended return :
+------------+-------------------+-----------------+
| booking_id | departure_station | arrival_station |
+------------+-------------------+-----------------+
| 1 | B | C |
| 1 | C | D |
+------------+-------------------+-----------------+
I work on MariaDB 10.2, so I have access to window functions, but I'm still not very comfortable with them.
Thanks.
EDIT : create tables :
CREATE TABLE `BOOKING` (
`id` INT(11) NOT NULL,
`journey_id` INT(11) NULL DEFAULT NULL,
`departure_station` VARCHAR(50) NULL DEFAULT NULL,
`arrival_station` VARCHAR(50) NULL DEFAULT NULL,
PRIMARY KEY (`id`)
);
CREATE TABLE `JOURNEY` (
`id` INT(11) NOT NULL AUTO_INCREMENT,
`departure_time` DATETIME NULL DEFAULT NULL,
`arrival_time` DATETIME NULL DEFAULT NULL,
`departure_station` VARCHAR(50) NULL DEFAULT NULL,
`arrival_station` VARCHAR(50) NULL DEFAULT NULL,
PRIMARY KEY (`id`)
);
CREATE TABLE `LEG` (
`id` INT(11) NOT NULL,
`journey_id` INT(11) NULL DEFAULT NULL,
`departure_station` VARCHAR(50) NULL DEFAULT NULL,
`arrival_station` VARCHAR(50) NULL DEFAULT NULL,
`departure_time` DATETIME NULL DEFAULT NULL,
`arrival_time` DATETIME NULL DEFAULT NULL,
PRIMARY KEY (`id`)
);
I don't like your DB schema.
But in your particular case, since your query already returns what you need,
I would just create a few indexes to speed up execution.
In general, there is nothing wrong with joining a table to itself several times.
http://sqlfiddle.com/#!9/1a467/1
Try adding these 4 indexes:
CREATE INDEX journey ON BOOKING (journey_id);
CREATE INDEX arrival ON LEG (journey_id, arrival_station);
CREATE INDEX departure ON LEG (journey_id, departure_station);
CREATE INDEX d_a_time ON LEG (journey_id, departure_time, arrival_time);
And run your query again, it should be much faster when using indexes.
I would suggest using Common Table Expression (CTE):
WITH leg_cte as
(
SELECT l.* FROM leg l
JOIN booking b
ON l.journey_id = b.journey_id
WHERE b.id = 1
)
SELECT
b.id as booking,
l.departure_station,
l.arrival_station
FROM
booking b
JOIN leg_cte dl
ON b.departure_station = dl.departure_station
JOIN leg_cte al
ON b.arrival_station = al.arrival_station
JOIN leg_cte l
ON l.departure_time >= dl.departure_time AND l.arrival_time <= al.arrival_time
WHERE b.id = 1
Try a left join and use REGEXP to filter departure_station and arrival_station:
select T3.id booking_id , T1.departure_station,T1.arrival_station
from LEG T1
left join JOURNEY T2 on T1.`journey_id` = T2.`id`
and (T1.`departure_time` >= T2.`departure_datetime` and T1.`arrival_time` <= T2.`arrival_datetime`)
left join BOOKING T3 on T3.`journey_id` = T2.`id`
and T1.departure_station REGEXP (CONCAT('[',T3.departure_station , '-' , T3.arrival_station,']' ))
and T1.arrival_station REGEXP (CONCAT('[',T3.departure_station , '-' , T3.arrival_station,']' ))
where T1.journey_id = 1 and T3.id is not null ;
| booking_id | departure_station | arrival_station |
|------------|-------------------|-----------------|
| 1 | B | C |
| 1 | C | D |
Test DDL:
CREATE TABLE JOURNEY
(`id` int, `departure_datetime` datetime, `arrival_datetime` datetime, `departure_station` varchar(1), `arrival_station` varchar(1))
;
INSERT INTO JOURNEY
(`id`, `departure_datetime`, `arrival_datetime`, `departure_station`, `arrival_station`)
VALUES
(1, '2018-01-01 06:00:00', '2018-01-01 10:00:00', 'A', 'E')
;
CREATE TABLE BOOKING
(`id` int, `journey_id` int, `departure_station` varchar(1), `arrival_station` varchar(1))
;
INSERT INTO BOOKING
(`id`, `journey_id`, `departure_station`, `arrival_station`)
VALUES
(1, 1, 'B', 'D')
;
CREATE TABLE LEG
(`id` int, `journey_id` int, `departure_station` varchar(1), `arrival_station` varchar(1), `departure_time` datetime, `arrival_time` datetime)
;
INSERT INTO LEG
(`id`, `journey_id`, `departure_station`, `arrival_station`, `departure_time`, `arrival_time`)
VALUES
(1, 1, 'A', 'B', '2018-01-01 06:00:00', '2018-01-01 07:00:00'),
(2, 1, 'B', 'C', '2018-01-01 07:00:00', '2018-01-01 08:00:00'),
(3, 1, 'C', 'D', '2018-01-01 08:00:00', '2018-01-01 09:00:00'),
(4, 1, 'D', 'E', '2018-01-01 09:00:00', '2018-01-01 10:00:00')
;

select from tables with different numbers of rows

I'm hoping there is a simple answer to this. Competitors race over a series of 3 races. Some competitors only show up for one race. How could I show a final result for ALL competitors?
race 1
+------+--------+
| name | result |
+------+--------+
| Ali | 30 |
| Bob | 28 |
| Cal | 26 |
+------+--------+
race 2
+------+--------+
| name | result |
+------+--------+
| Ali | 32 |
| Bob | 31 |
| Dan | 24 |
+------+--------+
race 3
+------+--------+
| name | result |
+------+--------+
| Eva | 23 |
| Dan | 25 |
+------+--------+
The final result should look like this:
+------+--------+--------+--------+
| name | result | result | result |
+------+--------+--------+--------+
| Ali | 30 | 32 | |
| Bob | 28 | 31 | |
| Cal | 26 | | |
| Dan | | 24 | 25 |
| Eva | | | 23 |
+------+--------+--------+--------+
The problem I have is with ordering by name from multiple tables.
Here is the example data:
CREATE TABLE race (name varchar(20), result int);
CREATE TABLE race1 LIKE race;
INSERT INTO race1 VALUES ('Ali', '30'), ('Bob', '28'), ('Cal', '26');
CREATE TABLE race2 like race;
insert INTO race2 VALUES ('Ali', '32'), ('Bob', '31'), ('Dan', '24');
CREATE TABLE race3 LIKE race;
INSERT INTO race3 VALUES ('Eva', '23'), ('Dan', '25');
Many thanks!
Here we go !!!
select race1.name as name, race1.result, race2.result, race3.result from race1
left join race2 on race2.name = race1.name
left join race3 on race3.name = race1.name
union
select race2.name as name, race1.result, race2.result, race3.result from race2
left join race1 on race1.name = race2.name
left join race3 on race3.name = race2.name
union
select race3.name as name, race1.result, race2.result, race3.result from race3
left join race1 on race1.name = race3.name
left join race2 on race2.name = race3.name;
It is working :)
select s.name,
max(case when s.R = 'Result1' then s.result else '' end) as result1,
max(case when s.R = 'Result2' then s.result else '' end) as result2,
max(case when s.R = 'Result3' then s.result else '' end) as result3
from
(
select 'Result1' as R,r1.* from race1 r1
union all
select 'Result2' as R,r2.* from race2 r2
union all
select 'Result3' as R,r3.* from race3 r3
) s
group by s.name
result
+------+---------+---------+---------+
| name | result1 | result2 | result3 |
+------+---------+---------+---------+
| Ali | 30 | 32 | |
| Bob | 28 | 31 | |
| Cal | 26 | | |
| Dan | | 24 | 25 |
| Eva | | | 23 |
+------+---------+---------+---------+
5 rows in set (0.00 sec)
I personally would create the schema in a different way.
One table for the users, one for the races and one that connects both:
-- Create syntax for TABLE 'races'
CREATE TABLE `races` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`name` varchar(255) DEFAULT NULL,
`created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
-- Create syntax for TABLE 'users'
CREATE TABLE `users` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`name` varchar(255) DEFAULT NULL,
`created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
-- Create syntax for TABLE 'race_results'
CREATE TABLE `race_results` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`race_id` int(11) NOT NULL,
`user_id` int(11) NOT NULL,
`result` int(11) NOT NULL,
`created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Let's insert some data (should be equal to your data set).
-- Insert data
INSERT INTO users (name)values('Ali'),('Bob'),('Cal'),('Dan'), ('Eva');
INSERT INTO races (name)values('Race1'),('Race2'),('Race3');
INSERT INTO race_results (user_id, race_id, result)values(1,1,30),(2,1,28),(1,2,32),(2,2,31),(3,1,26),(4,2,24),(4,3,25),(5,3,23);
Then you could write the query like this:
-- Static version
SELECT us.name, sum(if(ra.name='Race1', result, null)) as Race1, sum(if(ra.name='Race2', result, null)) as Race2, sum(if(ra.name='Race3', result, null)) as Race3
FROM race_results as rr
LEFT JOIN users as us on us.id = rr.user_id
LEFT JOIN races as ra on ra.id = rr.race_id
GROUP BY us.id;
Which gives you the result you're looking for. (I changed the column names to make it more obvious which result belongs to which race.)
I have to admit that this works fine for 3 races, but what if you have 30 or more?
Here is a more dynamic version of the above query, which kind of creates itself ;)
-- Dynamic version
SET @sql = '';
SELECT
@sql := CONCAT(@sql,if(@sql='','',', '),temp.output)
FROM
(SELECT
CONCAT("sum(if(ra.name='", race.name, "', result, null)) as ", race.name) as output
FROM races as race
) as temp;
SET @sql = CONCAT("SELECT us.name,", @sql, " FROM race_results as rr LEFT JOIN users as us on us.id = rr.user_id LEFT JOIN races as ra on ra.id = rr.race_id GROUP BY 1;");
SELECT @sql;
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
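The dynamic version works by generating one aggregate column per race and then running the generated statement. The same idea can be sketched in Python. As assumptions of this sketch: an in-memory SQLite database stands in for MySQL, `CASE WHEN` replaces MySQL's `IF()`, the tables hold the race results from the question, and race names are trusted to be safe SQL identifiers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE races (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE race_results (race_id INTEGER, user_id INTEGER, result INTEGER);
    INSERT INTO users (name) VALUES ('Ali'), ('Bob'), ('Cal'), ('Dan'), ('Eva');
    INSERT INTO races (name) VALUES ('Race1'), ('Race2'), ('Race3');
    INSERT INTO race_results (user_id, race_id, result) VALUES
        (1, 1, 30), (2, 1, 28), (1, 2, 32), (2, 2, 31),
        (3, 1, 26), (4, 2, 24), (4, 3, 25), (5, 3, 23);
""")

# One aggregate column per race, generated from the races table.
race_names = [row[0] for row in conn.execute("SELECT name FROM races ORDER BY id")]
columns = ", ".join(
    f"SUM(CASE WHEN ra.name = '{name}' THEN rr.result END) AS {name}"
    for name in race_names
)
sql = (
    f"SELECT us.name, {columns} "
    "FROM race_results AS rr "
    "JOIN users AS us ON us.id = rr.user_id "
    "JOIN races AS ra ON ra.id = rr.race_id "
    "GROUP BY us.id ORDER BY us.name"
)
pivot = conn.execute(sql).fetchall()
for row in pivot:
    print(row)
```

Each printed tuple is a competitor's name followed by one value per race, with None where the competitor did not take part. Adding a fourth row to races adds a fourth column without touching the query code.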

SQL Restoring historical data from the changelog

There is a table A. One row of table A looks like the following:
+----+---------+---------+---------+------------+------------+------------------+------------------+
| id | value_a | value_b | value_c | created_on | created_by | last_modified_on | last_modified_by |
+----+---------+---------+---------+------------+------------+------------------+------------------+
| 42 | x | y | z | 2016-04-01 | Maria | 2016-05-01 | Jim |
+----+---------+---------+---------+------------+------------+------------------+------------------+
So, table A contains only the latest values.
There is also a table called changelog. It stores all the changes/updates concerning table A. The changelog records for table A look like the following:
+-----+-----------+--------+---------+-----------+-----------------------------------------+------------+------------+
| id | object_id | action | field | old_value | new_value | created_on | created_by |
+-----+-----------+--------+---------+-----------+-----------------------------------------+------------+------------+
| 234 | 42 | insert | NULL | NULL | {value_a: xx, value_b: yy, value_c: zz} | 2016-04-01 | Maria |
| 456 | 42 | update | value_a | xx | x | 2016-04-05 | Bob |
| 467 | 42 | update | value_b | yy | y | 2016-05-01 | Jim |
| 678 | 42 | update | value_c | zz | z | 2016-05-01 | Jim |
+-----+-----------+--------+---------+-----------+-----------------------------------------+------------+------------+
I need to create a historical_A table, which for this specific record will look as follows:
+----+---------+---------+---------+------------+------------+------------+--------------+
| id | value_a | value_b | value_c | valid_from | created_by | valid_to | modified_by |
+----+---------+---------+---------+------------+------------+------------+--------------+
| 42 | xx | yy | zz | 2016-04-01 | Maria | 2016-04-05 | Bob |
| 42 | x | yy | zz | 2016-04-05 | Bob | 2016-05-01 | Jim |
| 42 | x | y | z | 2016-05-01 | Jim | | |
+----+---------+---------+---------+------------+------------+------------+--------------+
Table A has about 1 500 000 rows, changelog table for table A has about 27 000 000 rows.
Currently I am doing the initial transformation (load) using both SQL and Python scripting. Basically, I generate an insert statement for the initial row (by parsing the JSON), and then generate all following insert statements by grouping on the created_on column of the changelog table.
It currently takes around 3 minutes to process 1000 rows of table A, so I am parallelising (x10) the script execution to get a result in a more timely manner.
I suspect that SQL + Python scripting is not the best solution to the problem. Is there a purely SQL solution?
Are there any established best practices for such problems?
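The replay described above (start from the inserted values, then apply the updates grouped by created_on) can be sketched in a few lines of Python. This is only an illustration of the logic, not the asker's script: the function name `build_history` and the tuple layout are hypothetical, and the initial values are assumed to be already parsed into a dict.

```python
from itertools import groupby

def build_history(initial, created_on, created_by, updates):
    """Replay updates on top of the inserted values and emit one row per version.

    initial: dict of field -> value taken from the 'insert' changelog record.
    updates: list of (created_on, created_by, field, new_value), sorted by created_on.
    """
    current = dict(initial)
    rows = [{**current, "valid_from": created_on, "created_by": created_by,
             "valid_to": None, "modified_by": None}]
    # All updates sharing a created_on value form one new version.
    for changed_on, group in groupby(updates, key=lambda u: u[0]):
        group = list(group)
        changed_by = group[-1][1]
        for _, _, field, new_value in group:
            current[field] = new_value
        # Close the previous version, then open the next one.
        rows[-1]["valid_to"] = changed_on
        rows[-1]["modified_by"] = changed_by
        rows.append({**current, "valid_from": changed_on, "created_by": changed_by,
                     "valid_to": None, "modified_by": None})
    return rows

history = build_history(
    {"value_a": "xx", "value_b": "yy", "value_c": "zz"}, "2016-04-01", "Maria",
    [("2016-04-05", "Bob", "value_a", "x"),
     ("2016-05-01", "Jim", "value_b", "y"),
     ("2016-05-01", "Jim", "value_c", "z")],
)
for row in history:
    print(row)
```

For the sample record this yields three versions, with the last one left open (valid_to and modified_by are None). If several people change a record on the same created_on value, the sketch attributes the version to the last of them, which is a simplification.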
Unfortunately my MySQL box is broken, so I've done this in SQL Server, but I don't think there are any compatibility issues in the code. I would be interested to know whether it works for you and how well it performs. You may need to add indexes to speed it up.
--SQL Restoring historical data from the changelog
/*
create table a
( id int, value_a varchar(20), value_b varchar(20), value_c varchar(20),
created_on date, created_by varchar(20), last_modified_on date, last_modified_by varchar(20));
create table changelog
( id int, object_id int, action varchar(20), field varchar(20) , old_value varchar(20), new_value varchar(50), created_on date, created_by varchar(20));
create table history_work
(changeid int,objectid int, value_a varchar(20), value_b varchar(20), value_c varchar(20), value_a_new varchar(20), value_b_new varchar(20), value_c_new varchar(20),
created_on date, created_by varchar(20), last_modified_on date, last_modified_by varchar(20));
CREATE TABLE `history` (
`changeid` INT(11) NULL DEFAULT NULL,
`objectid` INT(11) NULL DEFAULT NULL,
`value_a` VARCHAR(20) NULL DEFAULT NULL,
`value_b` VARCHAR(20) NULL DEFAULT NULL,
`value_c` VARCHAR(20) NULL DEFAULT NULL,
`valid_from` DATE NULL DEFAULT NULL,
`created_by` VARCHAR(20) NULL DEFAULT NULL,
`valid_to` DATE NULL DEFAULT NULL,
`last_modified_by` VARCHAR(20) NULL DEFAULT NULL
)
COLLATE='latin1_swedish_ci'
ENGINE=InnoDB
drop table if exists t;
CREATE TABLE `t` (
`changeid` INT(11) NULL DEFAULT NULL,
`objectid` INT(11) NULL DEFAULT NULL,
`value_a` VARCHAR(20) NULL DEFAULT NULL,
`value_b` VARCHAR(20) NULL DEFAULT NULL,
`value_c` VARCHAR(20) NULL DEFAULT NULL,
`value_a_new` VARCHAR(20) NULL DEFAULT NULL,
`value_b_new` VARCHAR(20) NULL DEFAULT NULL,
`value_c_new` VARCHAR(20) NULL DEFAULT NULL,
`created_on` DATE NULL DEFAULT NULL,
`created_by` VARCHAR(20) NULL DEFAULT NULL,
`last_modified_on` DATE NULL DEFAULT NULL,
`last_modified_by` VARCHAR(20) NULL DEFAULT NULL
)
COLLATE='latin1_swedish_ci'
ENGINE=InnoDB
;
;
expected result
+----+---------+---------+---------+------------+------------+------------+--------------+
| id | value_a | value_b | value_c | valid_from | created_by | valid_to | modified_by |
+----+---------+---------+---------+------------+------------+------------+--------------+
| 42 | xx | yy | zz | 2016-04-01 | Maria | 2016-04-05 | Bob |
| 42 | x | yy | zz | 2016-04-05 | Bob | 2016-05-01 | Jim |
| 42 | x | y | z | 2016-05-01 | Jim | | |
+----+---------+---------+---------+------------+------------+------------+--------------+
*/
truncate table a;
truncate table changelog;
truncate table history_work;
Insert into a values
( 42 , 'x' , 'y' , 'z' ,'2016-04-01' ,'Maria','2016-05-01', 'Jim');
insert into changelog values
( 234 , 42 , 'insert' , NULL , NULL , '{value_a: xx, value_b: yy, value_c: zz}' , '2016-04-01', 'Maria'),
( 456 , 42 , 'update' , 'value_a' ,'xx', 'x', '2016-04-05', 'Bob' ),
( 467 , 42 , 'update' , 'value_b' ,'yy', 'y', '2016-05-01', 'Jim' ),
( 678 , 42 , 'update' , 'value_c' ,'zz', 'z', '2016-05-01', 'Jim' ) ;
/*Dummy Insert record*/
insert into history_work
(changeid ,objectid,
#, value_a , value_b , value_c,
created_on , created_by, last_modified_on,last_modified_by
)
select
000,id, #, value_a , value_b , value_c,
created_on, created_by, last_modified_on, last_modified_by
from a;
/*
insert into history_work
(changeid ,objectid , value_a , value_b , value_c, created_on , created_by, last_modified_on,last_modified_by)
select
999,id , value_a , value_b , value_c, created_on, created_by, last_modified_on, last_modified_by
from a
*/
insert into history_work
(changeid ,objectid , value_a , value_b , value_c, value_a_new, value_b_new , value_c_new,
created_on , created_by, last_modified_on,last_modified_by)
select a.id,
a.object_id,
case
when field = 'value_a' then a.old_value
else null
end,
case
when field = 'value_b' then a.old_value
else null
end,
case
when field = 'value_c' then a.old_value
else null
end,
case
when field = 'value_a' then a.new_value
else null
end,
case
when field = 'value_b' then a.new_value
else null
end,
case
when field = 'value_c' then a.new_value
else null
end,
a.created_on,a.created_by,
a.created_on,a.created_by
from changelog a
#join history_work h on h.objectid = a.object_id and h.changeid = 999
where action <> 'insert';
/*Derive Insert values from first old_value*/
truncate table t;
insert into t
(changeid, objectid)
select distinct 0,objectid from history_work;
update t
set value_a = (select hw.value_a from history_work hw
where hw.objectid = t.objectid
and hw.changeid = (select min(changeid) from history_work a where a.objectid = hw.objectid and a.value_a is not null)),
value_b = (select hw.value_b from history_work hw
where hw.objectid = t.objectid
and hw.changeid = (select min(changeid) from history_work a where a.objectid = hw.objectid and a.value_b is not null)),
value_c = (select hw.value_c from history_work hw
where hw.objectid = t.objectid
and hw.changeid = (select min(changeid) from history_work a where a.objectid = hw.objectid and a.value_c is not null));
update history_work h
join t on t.objectid = h.objectid
set h.value_a = t.value_a, h.value_b = t.value_b, h.value_c = t.value_c
where h.changeid = 0;
#select * from history_work;
/*Get Changes*/
update history_work set value_a = value_a_new where value_a_new is not null;
update history_work set value_b = value_b_new where value_b_new is not null;
update history_work set value_c = value_c_new where value_c_new is not null;
/*Downfill and create final table*/
truncate table history;
insert into history
( `changeid` ,
`objectid` ,
`value_a` ,
`value_b` ,
`value_c` ,
`valid_from` ,
`created_by` ,
`valid_to` ,
`last_modified_by`
)
select h.changeid,h.objectid ,
(select a.value_a from history_work a where a.changeid =
(select max(changeid) from history_work h1 where h1.objectid = h.objectid and h1.value_a is not null and h1.changeid <= h.changeid)
) value_a,
(select a.value_b from history_work a where a.changeid =
(select max(changeid) from history_work h1 where h1.objectid = h.objectid and h1.value_b is not null and h1.changeid <= h.changeid)
) value_b,
(select a.value_c from history_work a where a.changeid =
(select max(changeid) from history_work h1 where h1.objectid = h.objectid and h1.value_c is not null and h1.changeid <= h.changeid)
) value_c,
h.created_on,h.created_by,h.last_modified_on,h.last_modified_by
from history_work h
where h.changeid in (select maxid from
(select a.created_on, a.created_by,a.object_id, min(id) minid,max(a.id) maxid
from changelog a
group by a.created_on, a.created_by,a.object_id) s
)
or h.changeid = 0
order by h.changeid;
truncate table t;
insert into t
(changeid, objectid,value_a,value_b,value_c,created_on,created_by,last_modified_on,last_modified_by)
select changeid,objectid,
value_a,
value_b,
value_c,
valid_from,created_by,
valid_to,last_modified_by
from history
;
update history h
set h.valid_to =
((select a.created_on from t a where a.changeid = (select min(b.changeid) from t b where b.objectid = a.objectid and b.changeid > h.changeid))),
last_modified_by =
(select a.created_by from t a where a.changeid = (select min(changeid) from t b where b.objectid = a.objectid and b.changeid > h.changeid))
;
select * from history;

Get data in SQL when the status changes

Let's say I have data like below.
+-------+-------+
| time | status |
+-------+-------+
| 01:00 | On |
| 02:00 | On |
| 03:00 | On |
| 04:00 | Off |
| 05:00 | On |
| 06:00 | On |
| 07:00 | Off |
| 08:00 | Off |
| 09:00 | On |
| 10:00 | On |
| 11:00 | Off |
+-------+-------+
My expected result table is
+-------+-------+
| On | Off |
+-------+-------+
| 01:00 | 04:00 |
| 05:00 | 07:00 |
| 09:00 | 11:00 |
+-------+-------+
How can I write a query that produces my expected result? I cannot use DECODE in this case.
This is my CREATE TABLE command:
CREATE TABLE `gps_data` (
`id` int(255) NOT NULL AUTO_INCREMENT,
`imei_no` int(255) DEFAULT NULL,
`car_identification_no` varchar(255) DEFAULT NULL,
`datetime` bigint(10) DEFAULT NULL,
`longitude` float(10,7) DEFAULT NULL,
`latitude` float(10,7) DEFAULT NULL,
`engine_status` varchar(100) DEFAULT NULL,
`speed` int(100) DEFAULT NULL,
`mileage` bigint(255) DEFAULT NULL,
`alarm` int(20) DEFAULT NULL,
`address` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `car_identification_no` (`car_identification_no`)
) ENGINE=MyISAM AUTO_INCREMENT=430564 DEFAULT CHARSET=latin1
I want to use the datetime and engine_status (On/Off) columns.
Thanks
Try this:
SELECT MIN(CASE WHEN (grp - 1) MOD 2 + 1 = 1 THEN `time` END) AS 'On',
MIN(CASE WHEN (grp - 1) MOD 2 + 1 = 2 THEN `time` END) AS 'Off'
FROM (
SELECT `time`, `status`,
@grp := IF(@prev_status = `status`, @grp,
IF(@prev_status := `status`, @grp + 1, @grp + 1)) AS grp
FROM mytable
CROSS JOIN (SELECT @grp := 0, @prev_status := '') AS vars
ORDER BY `time`) AS t
GROUP BY (grp - 1) DIV 2
Explanation:
The inner query:
SELECT `time`, `status`,
@grp := IF(@prev_status = `status`, @grp,
IF(@prev_status := `status`, @grp + 1, @grp + 1)) AS grp
FROM mytable
CROSS JOIN (SELECT @grp := 0, @prev_status := '') AS vars
ORDER BY `time`
uses variables in order to generate the derived table seen below:
# time, status, grp
======================
01:00:00, On, 1
02:00:00, On, 1
03:00:00, On, 1
04:00:00, Off, 2
05:00:00, On, 3
06:00:00, On, 3
07:00:00, Off, 4
08:00:00, Off, 4
09:00:00, On, 5
10:00:00, On, 5
11:00:00, Off, 6
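So @grp identifies runs of consecutive records that have the same status value.
The outer query groups by (grp - 1) DIV 2, that is, integer division by 2, which puts each On run together with the Off run that follows it. Finally, conditional aggregation returns both the On and the Off value in the same SELECT: the earliest time of the odd-numbered run and the earliest time of the even-numbered run. This assumes the data starts with an On record.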
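The same grouping idea can also be sketched without user variables, assuming MySQL 8.0 or later, where window functions are available and assigning user variables inside expressions is deprecated. The table definition and sample rows below are assumptions that mirror the data in the question; the aliases is_start, flagged and numbered are names chosen only for this illustration.

```sql
-- Hypothetical sample table mirroring the question's data.
CREATE TABLE mytable (
  `time`   TIME       NOT NULL,
  `status` VARCHAR(3) NOT NULL
);

INSERT INTO mytable (`time`, `status`) VALUES
  ('01:00:00', 'On'),  ('02:00:00', 'On'),  ('03:00:00', 'On'),
  ('04:00:00', 'Off'), ('05:00:00', 'On'),  ('06:00:00', 'On'),
  ('07:00:00', 'Off'), ('08:00:00', 'Off'), ('09:00:00', 'On'),
  ('10:00:00', 'On'),  ('11:00:00', 'Off');

SELECT MIN(CASE WHEN `status` = 'On'  THEN `time` END) AS `On`,
       MIN(CASE WHEN `status` = 'Off' THEN `time` END) AS `Off`
FROM (
  -- A running total of the start flags numbers each run of equal statuses: 1, 2, 3, ...
  SELECT `time`, `status`,
         SUM(is_start) OVER (ORDER BY `time`) AS grp
  FROM (
    -- Flag a row with 1 when its status differs from the previous row's status.
    -- LAG() returns NULL for the first row, so the first row is flagged as well.
    SELECT `time`, `status`,
           CASE WHEN LAG(`status`) OVER (ORDER BY `time`) = `status`
                THEN 0 ELSE 1 END AS is_start
    FROM mytable
  ) AS flagged
) AS numbered
-- Runs 1 and 2 form the first On/Off pair, runs 3 and 4 the second, and so on.
GROUP BY (grp - 1) DIV 2
ORDER BY `On`;
```

On the sample rows this should return the three On/Off pairs from the expected result. Like the variable-based query, it assumes the first record is an On; if the data could start with Off, the pairing would need an extra filter.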
SELECT t1.t AS `On`, MIN(t2.`datetime`) AS `Off`
FROM (
SELECT `datetime` t, @stat := engine_status s
FROM gps_data JOIN (SELECT @stat := '') vars
WHERE engine_status != @stat) t1
JOIN gps_data t2 ON t1.t < t2.`datetime`
WHERE t1.s = 'On' AND t2.engine_status = 'Off'
GROUP BY t1.t;