Right join / inner join / multiselect [MYSQL] TABLE RESULTS - mysql

I am having trouble finding the correct way to select a column from another table and show a single result set that combines both tables.
First table:
id | times | project_id |
12 | 12.24 | 40 |
13 | 13.22 | 40 |
14 | 13.22 | 20 |
15 | 12.22 | 20 |
16 | 13.30 | 40 |
Second table:
id | times | project_id |
32 | 22.24 | 40 |
33 | 23.22 | 40 |
34 | 23.22 | 70 |
35 | 22.22 | 70 |
36 | 23.30 | 40 |
I want to select all the times from the first table for project_id = 40 and join them to the times from the second table for the same project_id = 40.
The results should be like this below:
id | time | time | project_id |
12 | 12.24 | 22.24 | 40 |
13 | 13.22 | 23.22 | 40 |
16 | 13.30 | 23.30 | 40 |

You need to use UNION ALL between those two tables; otherwise you will get incorrect results. Once you have all the rows together, you can use variables to carry over "previous values", as shown below.
MySQL 5.6 Schema Setup:
CREATE TABLE Table1
(`id` int, `times` decimal(6,2), `project_id` int)
;
INSERT INTO Table1
(`id`, `times`, `project_id`)
VALUES
(12, 12.24, 40),
(13, 13.22, 40),
(14, 13.22, 20),
(15, 12.22, 20),
(16, 13.30, 40)
;
CREATE TABLE Table2
(`id` int, `times` decimal(6,2), `project_id` int)
;
INSERT INTO Table2
(`id`, `times`, `project_id`)
VALUES
(32, 22.24, 40),
(33, 23.22, 40),
(34, 23.22, 70),
(35, 22.22, 70),
(36, 23.30, 40)
;
Query 1:
select
project_id, id, prev_time, times
from (
select
@row_num :=IF(@prev_value=d.project_id,@row_num+1,1) AS RowNumber
, d.*
, IF(@row_num %2 = 0, @prev_time, '') prev_time
, @prev_value := d.project_id
, @prev_time := times
from (
select `id`, `times`, `project_id` from Table1
union all
select `id`, `times`, `project_id` from Table2
) d
cross join (select @prev_value := 0, @row_num := 0) vars
order by d.project_id, d.times
) d2
where prev_time <> ''
Results:
| project_id | id | prev_time | times |
|------------|----|-----------|-------|
| 20 | 14 | 12.22 | 13.22 |
| 40 | 13 | 12.24 | 13.22 |
| 40 | 32 | 13.30 | 22.24 |
| 40 | 36 | 23.22 | 23.3 |
| 70 | 34 | 22.22 | 23.22 |
Note: MySQL did not support the LEAD() and LAG() functions when this answer was prepared. Where they are supported (MySQL 8.0 and later), that approach is simpler and probably more efficient:
select
d.*
from (
select
d1.*
, LEAD(times,1) OVER(partition by project_id order by times ASC) next_time
from (
select id, times, project_id from Table1
union all
select id, times, project_id from Table2
) d1
) d
where next_time is not null
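The expected output in the question pairs the n-th row of the first table with the n-th row of the second table for one project_id. A minimal sketch of that pairing follows. It uses Python's sqlite3 module as a stand-in for MySQL, numbers the rows with a correlated COUNT(*) (the alias rn and the parameter :pid are names made up for this illustration), and assumes that rows should be matched by ascending id. On MySQL 8.0 and later, ROW_NUMBER() could do the numbering instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table1 (id INTEGER, times REAL, project_id INTEGER);
CREATE TABLE Table2 (id INTEGER, times REAL, project_id INTEGER);
""")
conn.executemany("INSERT INTO Table1 VALUES (?, ?, ?)", [
    (12, 12.24, 40), (13, 13.22, 40), (14, 13.22, 20),
    (15, 12.22, 20), (16, 13.30, 40),
])
conn.executemany("INSERT INTO Table2 VALUES (?, ?, ?)", [
    (32, 22.24, 40), (33, 23.22, 40), (34, 23.22, 70),
    (35, 22.22, 70), (36, 23.30, 40),
])

# Number the rows of each table within the project (ordered by id),
# then join the two numbered sets on that position.
PAIR_BY_POSITION = """
SELECT a.id, a.times, b.times, a.project_id
FROM (
    SELECT t.*, (SELECT COUNT(*) FROM Table1 x
                 WHERE x.project_id = t.project_id AND x.id <= t.id) AS rn
    FROM Table1 t
    WHERE t.project_id = :pid
) a
JOIN (
    SELECT t.*, (SELECT COUNT(*) FROM Table2 x
                 WHERE x.project_id = t.project_id AND x.id <= t.id) AS rn
    FROM Table2 t
    WHERE t.project_id = :pid
) b ON a.rn = b.rn
ORDER BY a.id
"""

paired_rows = conn.execute(PAIR_BY_POSITION, {"pid": 40}).fetchall()
for row in paired_rows:
    print(row)
```

If the two tables hold a different number of rows for the project, the inner join silently drops the unmatched rows; a LEFT JOIN would keep them with a NULL time.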

Related

MySQL select all rows from last N groups

I have a dataset like this, where there can be multiple transactions per trade
| tx_id | trade_id |
--------------------
| 100 | 11 |
| 99 | 11 |
| 98 | 11 |
| 97 | 10 |
| 96 | 10 |
| 95 | 9 |
| 94 | 9 |
| 93 | 8 |
...
I want to select all of the transactions from the last N trades. For instance if I wanted to select all rows from the last 2 trades, I would get:
| tx_id | trade_id |
--------------------
| 100 | 11 |
| 99 | 11 |
| 98 | 11 |
| 97 | 10 |
| 96 | 10 |
I cannot guarantee that the trade_id will always have an interval of 1.
How can I accomplish this in mysql?
This will also work with MySQL 5.
By changing the LIMIT, you can choose how many trades you want to receive.
CREATE TABLE tab1 (
`tx_id` INTEGER,
`trade_id` INTEGER
);
INSERT INTO tab1
(`tx_id`, `trade_id`)
VALUES
('100', '11'),
('99', '11'),
('98', '11'),
('97', '10'),
('96', '10'),
('95', '9'),
('94', '9'),
('93', '8');
SELECT t1.* FROM tab1 t1 JOIN (SELECT DISTINCT `trade_id` FROM tab1 ORDER BY `trade_id` DESC LIMIT 2) t2
ON t1.`trade_id` = t2.`trade_id`
tx_id | trade_id
----: | -------:
100 | 11
99 | 11
98 | 11
97 | 10
96 | 10
You use DENSE_RANK on trade_id descending, then filter on your required X for "last X":
CREATE TABLE t (tx_id int, trade_id int);
INSERT INTO t (tx_id, trade_id) VALUES
(100,11),
(99,11),
(98,11),
(97,10),
(96,10),
(95,9),
(94,9),
(93,8);
SET @ngroups=2;
WITH dat
AS
(
SELECT tx_id, trade_id, DENSE_RANK() OVER (ORDER BY trade_id DESC) AS trade_id_rank
FROM t
)
SELECT tx_id, trade_id
FROM dat
WHERE trade_id_rank <= @ngroups;
If we assume the "last trades" are the ones with the highest trade_id numbers, then you can use DENSE_RANK().
For example:
select *
from (
select *,
dense_rank() over(order by trade_id desc) as dr
from t
) x
where dr <= 2
This can be done with a CTE:
WITH trades AS (
-- the N highest trade_id values; DESC so that LIMIT keeps the last trades
SELECT trade_id AS tid
FROM myTable
GROUP BY trade_id
ORDER BY trade_id DESC
LIMIT 2
)
SELECT * FROM
trades
JOIN myTable ON trade_id = tid
ORDER BY tx_id;

how to track score gains in mysql

I would like to display a player's current score as well as how many points they have gained within a selected time frame.
I have 2 tables
skills table
+----+---------+---------------------+
| id | name | created_at |
+----+---------+---------------------+
| 1 | skill 1 | 2020-06-05 00:00:00 |
| 2 | skill 2 | 2020-06-05 00:00:00 |
| 3 | skill 3 | 2020-06-05 00:00:00 |
+----+---------+---------------------+
scores table
+----+-----------+----------+-------+---------------------+
| id | player_id | skill_id | score | created_at |
+----+-----------+----------+-------+---------------------+
| 1 | 1 | 1 | 5 | 2020-06-06 00:00:00 |
| 2 | 1 | 1 | 10 | 2020-07-06 00:00:00 |
| 3 | 1 | 2 | 1 | 2020-07-06 00:00:00 |
| 4 | 2 | 1 | 11 | 2020-07-06 00:00:00 |
| 5 | 1 | 1 | 13 | 2020-07-07 00:00:00 |
| 6 | 1 | 2 | 10 | 2020-07-07 00:00:00 |
| 7 | 2 | 1 | 12 | 2020-07-07 00:00:00 |
| 8 | 1 | 1 | 20 | 2020-07-08 00:00:00 |
| 9 | 1 | 2 | 15 | 2020-07-08 00:00:00 |
| 10 | 2 | 1 | 17 | 2020-07-08 00:00:00 |
+----+-----------+----------+-------+---------------------+
my expected results are:-
24 hour query
+-----------+---------+-------+------+
| player_id | name | score | gain |
+-----------+---------+-------+------+
| 1 | skill 1 | 20 | 7 |
| 1 | skill 2 | 15 | 5 |
+-----------+---------+-------+------+
7 day query
+-----------+---------+-------+------+
| player_id | name | score | gain |
+-----------+---------+-------+------+
| 1 | skill 1 | 20 | 10 |
| 1 | skill 2 | 15 | 14 |
+-----------+---------+-------+------+
31 day query
+-----------+---------+-------+------+
| player_id | name | score | gain |
+-----------+---------+-------+------+
| 1 | skill 1 | 20 | 15 |
| 1 | skill 2 | 15 | 14 |
+-----------+---------+-------+------+
So far I have the following, but all it does is return the last 2 records for each skill. I am struggling to calculate the gains and to handle the different time frames:
SELECT player_id, skill_id, name, score
FROM (SELECT player_id, skill_id, name, score,
@skill_count := IF(@current_skill = skill_id, @skill_count + 1, 1) AS skill_count,
@current_skill := skill_id
FROM skill_scores
INNER JOIN skills
ON skill_id = skills.id
WHERE player_id = 1
ORDER BY skill_id, score DESC
) counted
WHERE skill_count <= 2
I would like some help figuring out the query I need to build to get the desired results, or is it best to do this with php instead of in the db?
EDIT:-
MySQL 8.0.20 dummy data. The ids are auto-increment primary keys, but I didn't add that here for simplicity:
CREATE TABLE skills
(
id bigint,
name VARCHAR(80)
);
CREATE TABLE skill_scores
(
id bigint,
player_id bigint,
skill_id bigint,
score bigint,
created_at timestamp
);
INSERT INTO skills VALUES (1, 'skill 1');
INSERT INTO skills VALUES (2, 'skill 2');
INSERT INTO skills VALUES (3, 'skill 3');
INSERT INTO skill_scores VALUES (1, 1, 1 , 5, '2020-06-06 00:00:00');
INSERT INTO skill_scores VALUES (2, 1, 1 , 10, '2020-07-06 00:00:00');
INSERT INTO skill_scores VALUES (3, 1, 2 , 1, '2020-07-06 00:00:00');
INSERT INTO skill_scores VALUES (4, 2, 1 , 11, '2020-07-06 00:00:00');
INSERT INTO skill_scores VALUES (5, 1, 1 , 13, '2020-07-07 00:00:00');
INSERT INTO skill_scores VALUES (6, 1, 2 , 10, '2020-07-07 00:00:00');
INSERT INTO skill_scores VALUES (7, 2, 1 , 12, '2020-07-07 00:00:00');
INSERT INTO skill_scores VALUES (8, 1, 1 , 20, '2020-07-08 00:00:00');
INSERT INTO skill_scores VALUES (9, 1, 2 , 15, '2020-07-08 00:00:00');
INSERT INTO skill_scores VALUES (10, 2, 1 , 17, '2020-07-08 00:00:00');
WITH cte AS (
SELECT id, player_id, skill_id,
FIRST_VALUE(score) OVER (PARTITION BY player_id, skill_id ORDER BY created_at DESC) score,
FIRST_VALUE(score) OVER (PARTITION BY player_id, skill_id ORDER BY created_at DESC) - FIRST_VALUE(score) OVER (PARTITION BY player_id, skill_id ORDER BY created_at ASC) gain,
ROW_NUMBER() OVER (PARTITION BY player_id, skill_id ORDER BY created_at DESC) rn
FROM skill_scores
WHERE created_at BETWEEN @current_date - INTERVAL @interval DAY AND @current_date
)
SELECT cte.player_id, skills.name, cte.score, cte.gain
FROM cte
JOIN skills ON skills.id = cte.skill_id
WHERE rn = 1
ORDER BY player_id, name;
P.S. I don't understand where gain = 15 comes from for the 31-day period: the difference between '2020-07-08 00:00:00' and '2020-06-06 00:00:00' is 32 days.
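The date arithmetic behind that remark can be checked with a short Python sketch. The variable names are made up for this illustration, and the 31-day window is assumed to end at the latest score's timestamp.

```python
from datetime import datetime, timedelta

first_score_at = datetime(2020, 6, 6)    # created_at of the score of 5
latest_score_at = datetime(2020, 7, 8)   # created_at of the score of 20

# Whole days between the oldest and the newest score for skill 1.
days_between = (latest_score_at - first_score_at).days
print(days_between)

# Does the oldest score fall inside a 31-day window ending at the newest one?
in_31_day_window = first_score_at >= latest_score_at - timedelta(days=31)
print(in_31_day_window)
```

Under that assumption the oldest row falls just outside the window, so the earliest score inside it is 10 and the gain would be 20 - 10 = 10 rather than 15.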
Well, I think you need a (temporary) table for this. I will call it "player_skill_gains". It is basically the players' skills ordered by created_at, with an auto-incremented id:
CREATE TABLE player_skill_gains
(`id` int PRIMARY KEY AUTO_INCREMENT NOT NULL
, `player_id` int
, skill_id int
, score int
, created_at date)
;
INSERT INTO player_skill_gains(player_id, skill_id, score, created_at)
SELECT player_skills.player_id AS player_id
, player_skills.skill_id
, SUM(player_skills.score) AS score
, player_skills.created_at
FROM player_skills
GROUP BY player_skills.id, player_skills.skill_id, player_skills.created_at
ORDER BY player_skills.player_id, player_skills.skill_id, player_skills.created_at ASC;
Using this table, we can relatively easily select the previous score for each row (id - 1) and use it to calculate the gains:
SELECT player_skill_gains.player_id, skills.name, player_skill_gains.score
, player_skill_gains.score - IFNULL(bef.score,0) AS gain
, player_skill_gains.created_at
FROM player_skill_gains
INNER JOIN skills ON player_skill_gains.skill_id = skills.id
LEFT JOIN player_skill_gains AS bef ON (player_skill_gains.id - 1) = bef.id
AND player_skill_gains.player_id = bef.player_id
AND player_skill_gains.skill_id = bef.skill_id
For the different queries you want to have (24 hours, 7 days, etc.) you just have to specify the needed where-part for the query.
You can see all this in action here: http://sqlfiddle.com/#!9/1571a8/11/0

How to identify and delete duplicate rows, except for most recent

I'm working in HeidiSQL and I'm trying to figure out how to delete all duplicate rows except for the most recent. There are some slight differences amongst the "duplicates," but whenever more than four specific values are identical (i.e. UserID, ContactID, SMSID, and EventID) the row is considered a duplicate. I need to remove these according to the most recent row (identified by CreatedDate).
The following query identifies these rows:
SELECT a.UserID, a.ContactID, a.SMSID, a.EventID, CreatedDate
FROM WhenToText a
JOIN (SELECT UserID, ContactID, SMSID, EventID
FROM WhenToText
GROUP BY UserID, ContactID, SMSID, EventID
HAVING COUNT(*) > 1 ) b
ON a.UserID = b.UserID
AND a.ContactID = b.ContactID
AND a.SMSID = b.SMSID
AND a.EventID = b.EventID
ORDER BY UserID, ContactID, SMSID, EventID, CreatedDate DESC
However, I'm not sure how to delete these duplicates after I've identified them.
Here is some sample data:
Here is one approach:
DELETE w1
FROM WhenToText w1
INNER JOIN
(
SELECT UserID, ContactID, SMSID, EventID, MAX(CreatedDate) AS MaxDate
FROM WhenToText
GROUP BY UserID, ContactID, SMSID, EventID
) w2
ON w1.UserID = w2.UserID AND w1.ContactID = w2.ContactID AND w1.SMSID = w2.SMSID
AND w1.EventID = w2.EventID
AND w1.CreatedDate != w2.MaxDate
This will delete any record for a given (UserID, ContactID, SMSID, EventID) group whose CreatedDate is not the most recent. Keep in mind this may leave behind more than one record for each group in the event that the latest CreatedDate is shared.
If you want to test this query first to see which records will be targeted for deletion, you can replace DELETE w1 with SELECT w1.*.
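The caveat about a shared latest CreatedDate can be illustrated with a small sketch. It uses Python's sqlite3 module as a stand-in for MySQL, so the DELETE is written with a correlated subquery rather than a join, and the tied row is hypothetical data added for the demonstration. The tie-break relies on SQLite's implicit rowid; a MySQL table would need its own unique column (for example an auto-increment id) to do the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE WhenToText "
    "(UserID INT, ContactID INT, SMSID INT, EventID INT, CreatedDate TEXT)"
)
conn.executemany("INSERT INTO WhenToText VALUES (?, ?, ?, ?, ?)", [
    (4, 25, 7934, 7407, "2016-02-10 00:00:11"),
    (4, 25, 7934, 7407, "2016-02-10 00:00:11"),  # hypothetical tie on the latest date
    (4, 25, 7934, 7407, "2016-02-09 00:00:12"),
])

# Step 1: delete every row that is older than the newest row of its group.
conn.execute("""
DELETE FROM WhenToText
WHERE CreatedDate <> (
    SELECT MAX(w2.CreatedDate)
    FROM WhenToText w2
    WHERE w2.UserID = WhenToText.UserID
      AND w2.ContactID = WhenToText.ContactID
      AND w2.SMSID = WhenToText.SMSID
      AND w2.EventID = WhenToText.EventID
)
""")
remaining_after_date_filter = conn.execute(
    "SELECT COUNT(*) FROM WhenToText").fetchone()[0]

# Step 2: break the tie by keeping a single rowid per group.
conn.execute("""
DELETE FROM WhenToText
WHERE rowid NOT IN (
    SELECT MAX(rowid)
    FROM WhenToText
    GROUP BY UserID, ContactID, SMSID, EventID
)
""")
remaining_after_tiebreak = conn.execute(
    "SELECT COUNT(*) FROM WhenToText").fetchone()[0]

print(remaining_after_date_filter, remaining_after_tiebreak)
```

Step 2 is only safe after step 1 has run, because the highest rowid in a group is not necessarily the row with the newest CreatedDate.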
Here is a solution using DELETE FROM JOIN, w/ a full demo with your data.
SQL:
-- Data preparation
create table WhenToText(UserID int, ContactID int, SMSID int, EventID int, CreatedDate datetime);
insert into WhenToText values
(4, 25, 7934, 7407, '2016-02-10 00:00:11'),
(4, 25, 7934, 7407, '2016-02-09 00:00:12'),
(4, 29, 5132, 7407, '2016-02-10 00:00:11'),
(4, 29, 5132, 7407, '2016-02-09 00:00:12'),
(4, 31, 12944, 7405, '2016-02-10 07:03:02'),
(4, 31, 12944, 7405, '2016-02-10 05:03:02'),
(4, 146, 12908, 7405, '2016-02-10 06:52:02'),
(4, 146, 12908, 7405, '2016-02-10 04:52:02'),
(15, 63, 12964, 7401, '2016-02-10 03:42:04'),
(15, 63, 12964, 7401, '2016-02-10 03:41:04'),
(15, 64, 12326, 7401, '2016-02-07 03:01:03'),
(15, 64, 12326, 7401, '2016-02-07 03:00:03');
SELECT * FROM WhenToText;
-- SQL needed
DELETE a FROM
WhenToText a INNER JOIN
(
SELECT UserID, ContactID, SMSID, EventID, MAX(CreatedDate) CreatedDate
FROM WhenToText
GROUP BY UserID, ContactID, SMSID, EventID
) b
USING(UserID, ContactID, SMSID, EventID)
WHERE
a.CreatedDate != b.CreatedDate;
SELECT * FROM WhenToText;
Output:
mysql> SELECT * FROM WhenToText;
+--------+-----------+-------+---------+---------------------+
| UserID | ContactID | SMSID | EventID | CreatedDate |
+--------+-----------+-------+---------+---------------------+
| 4 | 25 | 7934 | 7407 | 2016-02-10 00:00:11 |
| 4 | 25 | 7934 | 7407 | 2016-02-09 00:00:12 |
| 4 | 29 | 5132 | 7407 | 2016-02-10 00:00:11 |
| 4 | 29 | 5132 | 7407 | 2016-02-09 00:00:12 |
| 4 | 31 | 12944 | 7405 | 2016-02-10 07:03:02 |
| 4 | 31 | 12944 | 7405 | 2016-02-10 05:03:02 |
| 4 | 146 | 12908 | 7405 | 2016-02-10 06:52:02 |
| 4 | 146 | 12908 | 7405 | 2016-02-10 04:52:02 |
| 15 | 63 | 12964 | 7401 | 2016-02-10 03:42:04 |
| 15 | 63 | 12964 | 7401 | 2016-02-10 03:41:04 |
| 15 | 64 | 12326 | 7401 | 2016-02-07 03:01:03 |
| 15 | 64 | 12326 | 7401 | 2016-02-07 03:00:03 |
+--------+-----------+-------+---------+---------------------+
12 rows in set (0.00 sec)
mysql>
mysql> -- SQL needed
mysql> DELETE a FROM
-> WhenToText a INNER JOIN
-> (
-> SELECT UserID, ContactID, SMSID, EventID, MAX(CreatedDate) CreatedDate
-> FROM WhenToText
-> GROUP BY UserID, ContactID, SMSID, EventID
-> ) b
-> USING(UserID, ContactID, SMSID, EventID)
-> WHERE
-> a.CreatedDate != b.CreatedDate;
Query OK, 6 rows affected (0.00 sec)
mysql>
mysql> SELECT * FROM WhenToText;
+--------+-----------+-------+---------+---------------------+
| UserID | ContactID | SMSID | EventID | CreatedDate |
+--------+-----------+-------+---------+---------------------+
| 4 | 25 | 7934 | 7407 | 2016-02-10 00:00:11 |
| 4 | 29 | 5132 | 7407 | 2016-02-10 00:00:11 |
| 4 | 31 | 12944 | 7405 | 2016-02-10 07:03:02 |
| 4 | 146 | 12908 | 7405 | 2016-02-10 06:52:02 |
| 15 | 63 | 12964 | 7401 | 2016-02-10 03:42:04 |
| 15 | 64 | 12326 | 7401 | 2016-02-07 03:01:03 |
+--------+-----------+-------+---------+---------------------+
6 rows in set (0.00 sec)
This should provide the solution you're looking for, given CreatedDate is a date datatype. This is also under the assumption that the most recent row is technically the most recent CreatedDate.
SELECT UserID, ContactID, SMSID, EventID, MAX(CreatedDate) AS CreatedDate
FROM WhenToText
GROUP BY 1, 2, 3, 4;
With these values you could just overwrite WhenToText table...which would look something like this...
CREATE TABLE tmp_table LIKE WhenToText;
INSERT INTO tmp_table (SELECT UserID, ContactID, SMSID, EventID, MAX(CreatedDate) AS CreatedDate
FROM WhenToText
GROUP BY 1, 2, 3, 4);
TRUNCATE WhenToText;
INSERT INTO WhenToText (SELECT * FROM tmp_table);
DROP TABLE tmp_table;

MySQL count changes

I would like to count number of changes in column Value grouped by Id using MySQL.
Source Table:
create table sequence
(
`Id` int,
`Date` date,
`Value` int not null,
PRIMARY KEY (`Id`,`Date`)
);
insert into sequence
( `Id`,`Date`, `Value` )
values
(1, '2016-01-01' , 0 ),
(1, '2016-01-02' , 10 ),
(1, '2016-01-03' , 0 ),
(1, '2016-01-05' , 0 ),
(1, '2016-01-06' , 10 ),
(1, '2016-01-07' , 15 ),
(2, '2016-01-08' , 15 );
Visualization:
+------------+-------+-------+
| Date | ID | Value |
+------------+-------+-------+
| 2016-01-01 | 1 | 0 |
| 2016-01-02 | 1 | 10 | (change)
| 2016-01-03 | 1 | 0 | (change)
| 2016-01-05 | 1 | 0 |
| 2016-01-06 | 1 | 10 | (change)
| 2016-01-07 | 1 | 15 | (change)
| 2016-01-08 | 2 | 15 |
+------------+-------+-------+
Expected output:
+-------+-------+
| ID | Value |
+-------+-------+
| 1 | 4 |
| 2 | 0 |
+-------+-------+
I would like to ask if there is a way how to do this in SQL.
This is not a very efficient or elegant solution,
but it shows what you can achieve using MySQL user variables :-)
http://sqlfiddle.com/#!9/1db14/6
SELECT t1.id, MAX(t1.changes)
FROM (SELECT t.*,
IF (@i IS NULL,@i:=0,IF(@lastId <> id,@i:=0,IF (@lastV <> `value`, @i:=@i+1, @i:=@i))) as changes,
@lastV := `value`,
@lastId := `id`
FROM (
SELECT *
FROM sequence
ORDER BY id, date) t
) t1
GROUP BY t1.id
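As an alternative that does not depend on user variables, each row can be joined to the previous row of the same Id and the differing pairs counted. The sketch below is an illustration under assumptions: it runs on Python's sqlite3 module rather than MySQL, and the aliases s, p and x are made up here. The SELECT itself uses only a LEFT JOIN and a correlated subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sequence (Id INT, Date TEXT, Value INT NOT NULL, "
    "PRIMARY KEY (Id, Date))"
)
conn.executemany("INSERT INTO sequence VALUES (?, ?, ?)", [
    (1, "2016-01-01", 0),
    (1, "2016-01-02", 10),
    (1, "2016-01-03", 0),
    (1, "2016-01-05", 0),
    (1, "2016-01-06", 10),
    (1, "2016-01-07", 15),
    (2, "2016-01-08", 15),
])

# p is the row immediately before s within the same Id (the latest earlier Date).
# A change is counted whenever that previous row exists and its Value differs.
COUNT_CHANGES = """
SELECT s.Id,
       SUM(CASE WHEN p.Value IS NOT NULL AND p.Value <> s.Value
                THEN 1 ELSE 0 END) AS changes
FROM sequence s
LEFT JOIN sequence p
       ON p.Id = s.Id
      AND p.Date = (SELECT MAX(x.Date)
                    FROM sequence x
                    WHERE x.Id = s.Id AND x.Date < s.Date)
GROUP BY s.Id
ORDER BY s.Id
"""

changes_per_id = conn.execute(COUNT_CHANGES).fetchall()
print(changes_per_id)
```

The first row of each Id has no predecessor, so it never counts as a change, which is why an Id with a single row reports 0.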

How do I create a period date range from a mysql table grouping every common sequence of value in a column

My goal is to return a start and end date for each run of rows that have the same values in the A and B columns. Here is my table. The rows marked (*) show where I want the "EndDate" of each run to fall.
ID | DayDate | A | B
-----------------------------------------------
1 | 2010/07/1 | 200 | 300
2 | 2010/07/2 | 200 | 300 *
3 | 2010/07/3 | 150 | 250
4 | 2010/07/4 | 150 | 250 *
8 | 2010/07/5 | 150 | 350 *
9 | 2010/07/6 | 200 | 300
10 | 2010/07/7 | 200 | 300 *
11 | 2010/07/8 | 100 | 200
12 | 2010/07/9 | 100 | 200 *
and I want to get the following result table from the above table
| DayDate |EndDate | A | B
-----------------------------------------------
| 2010/07/1 |2010/07/2 | 200 | 300
| 2010/07/3 |2010/07/4 | 150 | 250
| 2010/07/5 |2010/07/5 | 150 | 350
| 2010/07/6 |2010/07/7 | 200 | 300
| 2010/07/8 |2010/07/9 | 100 | 200
UPDATE:
Thanks Mike. Your approach seems to work if the following row is treated as a mistake.
8 | 2010/07/5 | 150 | 350 *
However, it is not a mistake. This type of data is like a log of market price changes by date. The real problem in my case is to select, with a beginning and ending date, all consecutive rows in which both A and B match, then to do the same for the rows that follow, and so on, so that no data in the table is left out.
I can explain a real-world scenario. A hotel with rooms A and B has the room rates for each day entered into the table, as explained in my question. Now the hotel needs a report that shows the price calendar in a shorter form, using start and end dates instead of listing every date entered. For example, from 2010/07/01 to 2010/07/02 the price of A is 200 and B is 300. The price changes for the 3rd and 4th, and on the 5th there is a different price for that day only, where the price of room B changes to 350. This is treated as a single-day period, which is why its start and end dates are the same.
I hope this explains the scenario. Also note that the hotel may be closed for a specific period, which is an additional problem on top of my first question. What if the rate is not entered for specific dates? For example, on Sundays the hotel does not sell these two rooms, so no price is entered, meaning the row will not exist in the table.
Creating related tables allows you much greater freedom to query and extract relevant information. Here's a few links that you might find useful:
You could start with these tutorials:
http://dev.mysql.com/tech-resources/articles/intro-to-normalization.html
http://net.tutsplus.com/tutorials/databases/sql-for-beginners/
There are also a couple of questions here on stackoverflow that might be useful:
Normalization in plain English
What exactly does database normalization do?
Anyway, on to a possible solution. The following examples use your hotel rooms analogy.
First, create a table to hold information about the hotel rooms. This table just contains the room ID and its name, but you could store other information in here, such as the room type (single, double, twin), its view (ocean front, ocean view, city view, pool view), and so on:
CREATE TABLE `room` (
`id` INT UNSIGNED NOT NULL AUTO_INCREMENT,
`name` VARCHAR(45) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE INDEX `name_UNIQUE` (`name` ASC) )
ENGINE = InnoDB;
Now create a table to hold the changing room rates. This table links to the room table through the room_id column. The foreign key constraint prevents records being inserted into the rate table which refer to rooms that do not exist:
CREATE TABLE `rate` (
`id` INT UNSIGNED NOT NULL AUTO_INCREMENT ,
`room_id` INT UNSIGNED NOT NULL,
`date` DATE NOT NULL,
`rate` DECIMAL(6,2) UNSIGNED NOT NULL,
PRIMARY KEY (`id`),
INDEX `fk_room_rate` (`room_id` ASC),
CONSTRAINT `fk_room_rate`
FOREIGN KEY (`room_id` )
REFERENCES `room` (`id` )
ON DELETE CASCADE
ON UPDATE CASCADE)
ENGINE = InnoDB;
Create two rooms, and add some daily rate information about each room:
INSERT INTO `room` (`id`, `name`) VALUES (1, 'A'), (2, 'B');
INSERT INTO `rate` (`id`, `room_id`, `date`, `rate`) VALUES
( 1, 1, '2010-07-01', 200),
( 2, 1, '2010-07-02', 200),
( 3, 1, '2010-07-03', 150),
( 4, 1, '2010-07-04', 150),
( 5, 1, '2010-07-05', 150),
( 6, 1, '2010-07-06', 200),
( 7, 1, '2010-07-07', 200),
( 8, 1, '2010-07-08', 100),
( 9, 1, '2010-07-09', 100),
(10, 2, '2010-07-01', 300),
(11, 2, '2010-07-02', 300),
(12, 2, '2010-07-03', 250),
(13, 2, '2010-07-04', 250),
(14, 2, '2010-07-05', 350),
(15, 2, '2010-07-06', 300),
(16, 2, '2010-07-07', 300),
(17, 2, '2010-07-08', 200),
(18, 2, '2010-07-09', 200);
With that information stored, a simple SELECT query with a JOIN will show you the all the daily room rates:
SELECT
room.name,
rate.date,
rate.rate
FROM room
JOIN rate
ON rate.room_id = room.id;
+------+------------+--------+
| A | 2010-07-01 | 200.00 |
| A | 2010-07-02 | 200.00 |
| A | 2010-07-03 | 150.00 |
| A | 2010-07-04 | 150.00 |
| A | 2010-07-05 | 150.00 |
| A | 2010-07-06 | 200.00 |
| A | 2010-07-07 | 200.00 |
| A | 2010-07-08 | 100.00 |
| A | 2010-07-09 | 100.00 |
| B | 2010-07-01 | 300.00 |
| B | 2010-07-02 | 300.00 |
| B | 2010-07-03 | 250.00 |
| B | 2010-07-04 | 250.00 |
| B | 2010-07-05 | 350.00 |
| B | 2010-07-06 | 300.00 |
| B | 2010-07-07 | 300.00 |
| B | 2010-07-08 | 200.00 |
| B | 2010-07-09 | 200.00 |
+------+------------+--------+
To find the start and end dates for each room rate, you need a more complex query:
SELECT
id,
room_id,
MIN(date) AS start_date,
MAX(date) AS end_date,
COUNT(*) AS days,
rate
FROM (
SELECT
id,
room_id,
date,
rate,
(
SELECT COUNT(*)
FROM rate AS b
WHERE b.rate <> a.rate
AND b.date <= a.date
AND b.room_id = a.room_id
) AS grouping
FROM rate AS a
ORDER BY a.room_id, a.date
) c
GROUP BY rate, grouping
ORDER BY room_id, MIN(date);
+----+---------+------------+------------+------+--------+
| id | room_id | start_date | end_date | days | rate |
+----+---------+------------+------------+------+--------+
| 1 | 1 | 2010-07-01 | 2010-07-02 | 2 | 200.00 |
| 3 | 1 | 2010-07-03 | 2010-07-05 | 3 | 150.00 |
| 6 | 1 | 2010-07-06 | 2010-07-07 | 2 | 200.00 |
| 8 | 1 | 2010-07-08 | 2010-07-09 | 2 | 100.00 |
| 10 | 2 | 2010-07-01 | 2010-07-02 | 2 | 300.00 |
| 12 | 2 | 2010-07-03 | 2010-07-04 | 2 | 250.00 |
| 14 | 2 | 2010-07-05 | 2010-07-05 | 1 | 350.00 |
| 15 | 2 | 2010-07-06 | 2010-07-07 | 2 | 300.00 |
| 17 | 2 | 2010-07-08 | 2010-07-09 | 2 | 200.00 |
+----+---------+------------+------------+------+--------+
You can find a good explanation of the technique used in the above query here:
http://www.sqlteam.com/article/detecting-runs-or-streaks-in-your-data
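To see why the correlated COUNT(*) separates the runs, it can help to look at the value it produces for every row. The sketch below replays room A's rates on Python's sqlite3 module as a stand-in for MySQL. The alias run_key is a name chosen for this illustration (the query above calls it grouping), and the data is limited to one room to keep the output short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rate (room_id INTEGER, date TEXT, rate INTEGER)")
room_a_rates = [200, 200, 150, 150, 150, 200, 200, 100, 100]
conn.executemany(
    "INSERT INTO rate VALUES (1, ?, ?)",
    [(f"2010-07-{day:02d}", r) for day, r in enumerate(room_a_rates, start=1)],
)

# For each row, count the earlier-or-same-day rows of the room whose rate differs.
# The count stays constant inside a run and grows once other rates intervene,
# so (rate, run_key) identifies one unbroken run.
RUN_KEY = """
SELECT a.date, a.rate,
       (SELECT COUNT(*) FROM rate AS b
        WHERE b.rate <> a.rate
          AND b.date <= a.date
          AND b.room_id = a.room_id) AS run_key
FROM rate AS a
ORDER BY a.date
"""
run_keys = [row[2] for row in conn.execute(RUN_KEY)]
print(run_keys)

# Collapse each (rate, run_key) pair into one start/end period.
runs = conn.execute(f"""
SELECT MIN(date), MAX(date), COUNT(*), rate
FROM ({RUN_KEY}) AS c
GROUP BY rate, run_key
ORDER BY MIN(date)
""").fetchall()
for run in runs:
    print(run)
```

The two separate runs at rate 200 receive the keys 0 and 3, which is what keeps them from being merged into one period.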
My general approach is to join the table to itself on DayDate = DayDate + 1 where the A or B values are not equal.
This finds the end date of each period (the value is different on the following day).
The only problem is that it won't find an end date for the final period. To get around this, I select the max date from the table and union it into my list of end dates.
Once the list of end dates is defined, you can join it to the original table on the end date being greater than or equal to the original date.
From this final list, select the minimum DayDate grouped by the other fields:
select
min(DayDate) as DayDate,EndDate,A,B from
(SELECT DayDate, A, B, min(ends.EndDate) as EndDate
FROM yourtable
LEFT JOIN
(SELECT max(DayDate) as EndDate FROM yourtable UNION
SELECT t1.DayDate as EndDate
FROM yourtable t1
JOIN yourtable t2
ON date_add(t1.DayDate, INTERVAL 1 DAY) = t2.DayDate
AND (t1.A<>t2.A OR t1.B<>t2.B)) ends
ON ends.EndDate>=DayDate
GROUP BY DayDate, A, B) x
GROUP BY EndDate,A,B
I think I have found a solution which does produce the table desired.
SELECT
a.DayDate AS StartDate,
( SELECT b.DayDate
FROM Dates AS b
WHERE b.DayDate > a.DayDate AND (b.B = a.B OR b.B IS NULL)
ORDER BY b.DayDate ASC LIMIT 1
) AS StopDate,
a.A as A,
a.B AS B
FROM Dates AS a
WHERE Coalesce(
(SELECT c.B
FROM Dates AS c
WHERE c.DayDate <= a.DayDate
ORDER BY c.DayDate DESC LIMIT 1,1
), -99999
) <> a.B
AND a.B IS NOT NULL
ORDER BY a.DayDate ASC;
is able to generate the following table result
StartDate StopDate A B
2010-07-01 2010-07-02 200 300
2010-07-03 2010-07-04 150 250
2010-07-05 NULL 150 350
2010-07-06 2010-07-07 200 300
2010-07-08 2010-07-09 100 200
But I need a way to replace the NULL with the same date of the start date.
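One possible way to do that is to wrap the StopDate subquery in COALESCE so that it falls back to the row's own DayDate. The sketch below is a hedged illustration: it runs the query above, with only that change, on Python's sqlite3 module as a stand-in for MySQL, and it assumes the Dates table holds the sample rows from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Dates (DayDate TEXT, A INTEGER, B INTEGER)")
conn.executemany("INSERT INTO Dates VALUES (?, ?, ?)", [
    ("2010-07-01", 200, 300), ("2010-07-02", 200, 300),
    ("2010-07-03", 150, 250), ("2010-07-04", 150, 250),
    ("2010-07-05", 150, 350),
    ("2010-07-06", 200, 300), ("2010-07-07", 200, 300),
    ("2010-07-08", 100, 200), ("2010-07-09", 100, 200),
])

# Same query as above; the only change is the COALESCE around the StopDate
# subquery, which substitutes the start date when no later match exists.
PERIODS = """
SELECT a.DayDate AS StartDate,
       COALESCE(
           (SELECT b.DayDate
            FROM Dates AS b
            WHERE b.DayDate > a.DayDate AND (b.B = a.B OR b.B IS NULL)
            ORDER BY b.DayDate ASC LIMIT 1),
           a.DayDate) AS StopDate,
       a.A AS A,
       a.B AS B
FROM Dates AS a
WHERE COALESCE(
          (SELECT c.B
           FROM Dates AS c
           WHERE c.DayDate <= a.DayDate
           ORDER BY c.DayDate DESC LIMIT 1, 1),
          -99999) <> a.B
  AND a.B IS NOT NULL
ORDER BY a.DayDate ASC
"""

periods = conn.execute(PERIODS).fetchall()
for period in periods:
    print(period)
```

This only addresses the NULL. The StopDate subquery still picks the next later row with the same B, so it matches the sample data but would not return the true end of a run longer than two days.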