SQL request excluding periods of time - mysql

I need to get all DISTINCT users excluding those who are not available according to unavailability periods of time.
The user table:
+------+-----------+--------------------------------------+
| id | firstname | content |
+------+-----------+--------------------------------------+
| 13 | John | ... |
| 44 | Marc | ... |
| 55 | Elise | ... |
+------+-----------+--------------------------------------+
The unavailability periods table:
+------+-----------+--------------+--------------+
| id | user_id | start | end |
+------+-----------+--------------+--------------+
| 1 | 13 | 2019-07-01 | 2019-07-10 |
| 2 | 13 | 2019-07-20 | 2019-07-30 |
| 3 | 13 | 2019-09-01 | 2019-09-30 |
| 4 | 44 | 2019-08-01 | 2019-08-15 |
+------+-----------+--------------+--------------+
For example, we want the users who are available from 2019-06-20 to 2019-07-05: Marc and Elise are available.
Do I have to use a LEFT JOIN? This query does not work:
SELECT DISTINCT user.*, unavailability.start, unavailability.end,
FROM user
LEFT JOIN unavailability ON unavailability.user_id = user.id
WHERE
unavailability.start < "2019-06-20" AND unavailability.end > "2019-06-20"
AND unavailability.start < "2019-07-05" AND unavailability.end > "2019-07-05"
The result I need is:
+------+-----------+--------------------------------------+
| id | firstname | content |
+------+-----------+--------------------------------------+
| 44 | Marc | ... |
| 55 | Elise | ... |
+------+-----------+--------------------------------------+
With this query I don't get Elise, who has no unavailability periods.

DROP TABLE IF EXISTS user;
CREATE TABLE user
(id SERIAL PRIMARY KEY
,firstname VARCHAR(12) NOT NULL UNIQUE
);
INSERT INTO user VALUES
(13,'John'),
(44,'Marc'),
(55,'Elise');
DROP TABLE IF EXISTS unavailability ;
CREATE TABLE unavailability
(id SERIAL PRIMARY KEY
,user_id INT NOT NULL
,start DATE NOT NULL
,end DATE NOT NULL
);
INSERT INTO unavailability VALUES
(1,13,'2019-07-01','2019-07-10'),
(2,13,'2019-07-20','2019-07-30'),
(3,13,'2019-09-01','2019-09-30'),
(4,44,'2019-08-01','2019-08-15');
SELECT x.*
FROM user x
LEFT
JOIN unavailability y
ON y.user_id = x.id
AND y.start <= '2019-07-05'
AND y.end >= '2019-06-20'
WHERE y.id IS NULL;
+----+-----------+
| id | firstname |
+----+-----------+
| 44 | Marc |
| 55 | Elise |
+----+-----------+
2 rows in set (0.01 sec)

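A NOT EXISTS subquery can also be used:
select * from user k
where not exists (
select 1 from unavailability u
where u.user_id = k.id
and u.start <= '2019-07-05' and u.end >= '2019-06-20')
The overlap test has to compare both ends of the period. Checking only whether the two boundary dates fall inside a period would miss a period that lies entirely within the requested range.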

You can select the ids of the unavailable users and use the result in a subquery:
Schema (MySQL v5.7)
CREATE TABLE user (
`id` INTEGER,
`firstname` VARCHAR(5),
`content` VARCHAR(3)
);
INSERT INTO user
(`id`, `firstname`, `content`)
VALUES
(13, 'John', '...'),
(44, 'Marc', '...'),
(55, 'Elise', '...');
CREATE TABLE unavailability (
`id` INTEGER,
`user_id` INTEGER,
`start` DATETIME,
`end` DATETIME
);
INSERT INTO unavailability
(`id`, `user_id`, `start`, `end`)
VALUES
(1, 13, '2019-07-01', '2019-07-10'),
(2, 13, '2019-07-20', '2019-07-30'),
(3, 13, '2019-09-01', '2019-09-30'),
(4, 44, '2019-08-01', '2019-08-15');
Query #1
SELECT *
FROM user us
WHERE us.id NOT IN (
SELECT u.user_id
FROM unavailability u
WHERE u.start <= '2019-07-05' AND u.end >= '2019-06-20'
);
| id | firstname | content |
| --- | --------- | ------- |
| 44 | Marc | ... |
| 55 | Elise | ... |
Note
This condition:
unavailability.start < '2019-06-20' AND unavailability.end > '2019-06-20'
AND unavailability.start < '2019-07-05' AND unavailability.end > '2019-07-05'
is equivalent to:
unavailability.start < '2019-06-20' AND unavailability.end > '2019-07-05'
Because the parts are combined with AND, unavailability.start < '2019-06-20' AND unavailability.start < '2019-07-05' reduces to the stricter bound, unavailability.start < '2019-06-20'. In the same way, the two conditions on unavailability.end reduce to unavailability.end > '2019-07-05'. The condition therefore only matches periods that cover the whole requested range.
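The overlap rule behind the working queries can be checked outside the database. The following is a minimal Python sketch, not taken from the answers: the names overlaps and available_users are hypothetical, and the sample rows are copied from the question. Two closed date ranges overlap exactly when each one starts on or before the day the other one ends.

```python
from datetime import date

def overlaps(start_a, end_a, start_b, end_b):
    """True when the closed ranges [start_a, end_a] and [start_b, end_b] share a day."""
    return start_a <= end_b and end_a >= start_b

# Sample data copied from the question.
users = {13: "John", 44: "Marc", 55: "Elise"}
unavailability = [
    (13, date(2019, 7, 1), date(2019, 7, 10)),
    (13, date(2019, 7, 20), date(2019, 7, 30)),
    (13, date(2019, 9, 1), date(2019, 9, 30)),
    (44, date(2019, 8, 1), date(2019, 8, 15)),
]

def available_users(wanted_start, wanted_end):
    """Names of users with no unavailability period overlapping the wanted range."""
    busy = {
        user_id
        for user_id, start, end in unavailability
        if overlaps(start, end, wanted_start, wanted_end)
    }
    return sorted(name for user_id, name in users.items() if user_id not in busy)

print(available_users(date(2019, 6, 20), date(2019, 7, 5)))
```

The same predicate also catches a period that lies entirely inside the requested range, which a test on the two boundary dates alone would miss.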

Related

How to join table based on two column in mysql?

I have two tables as mentioned below.
user table
id | username | password | status |
1 | Prajna | ***** | active |
2 | Akshata | ***** | active |
3 | Sanjana | ***** | inactive |
test table
id | project_name | created_by (user id) | edited_by (user id) |
1 | Test | 1 | 2 |
2 | Trial | 1 | 1 |
3 | Pro1 | 2 | 2 |
I am trying the query below.
select project_name, user.username from test join user on user.id=test.created_by where user.status='active';
I want to retrieve the result below. How can I do that?
project_name | username(created by) | username (edited by) |
Test | Prajna | Akshata |
Trial | Prajna | Prajna |
Pro1 | Akshata | Akshata |
Try this code.
create table `user`
(
`id` int,
`username` varchar(20),
`password` varchar(20),
`status` varchar(20)
);
insert into `user` (`id`,`username`,`password`,`status`) values
(1, 'Prajna', '*****', 'active'),
(2, 'Akshata', '*****', 'active'),
(3, 'Sanjana', '*****', 'inactive');
create table `test`
(
`id` int,
`project_name` varchar(20),
`created_by` int,
`edited_by` int
);
insert into `test` (`id`,`project_name`,`created_by`,`edited_by`) values
(1, 'Test', 1, 2),
(2, 'Trial', 1, 1),
(3, 'Pro1', 2, 2);
SELECT
`t`.`project_name`,
`ua`.`username` as 'username (created by)' ,
`ub`.`username` as 'username (edited by)'
FROM `test` `t`
JOIN `user` `ua` ON `t`.`created_by` = `ua`.`id`
JOIN `user` `ub` ON `t`.`edited_by` = `ub`.`id`
WHERE
`ua`.`status` = 'active'
AND `ub`.`status` = 'active'
order by `t`.`id`;
project_name | username (created by) | username (edited by)
:----------- | :-------------------- | :-------------------
Test | Prajna | Akshata
Trial | Prajna | Prajna
Pro1 | Akshata | Akshata
SELECT
test.project_name, user.username
FROM test
INNER JOIN user
ON user.id = test.created_by
WHERE user.status='active';
PS: you have an error here: user.id=test=created_by.
You need two subqueries, one for each user column, joined to each other:
select Table1.project_name, Table1.created_by, table2.edited_by from
(
select u.id, t1.project_name, u.username as created_by from user u left join test t1 on
u.id = t1.created_by
where u.status='active'
) Table1
inner join
(
select u.id, t2.project_name, u.username as edited_by from user u left join test t2 on
u.id = t2.edited_by
where u.status='active'
) table2 on Table1.project_name=table2.project_name

How to identify and delete duplicate rows, except for most recent

I'm working in HeidiSQL and I'm trying to figure out how to delete all duplicate rows except for the most recent. There are some slight differences among the "duplicates," but whenever four specific values are identical (i.e. UserID, ContactID, SMSID, and EventID) the row is considered a duplicate. For each such group I need to keep only the most recent row (identified by CreatedDate) and remove the rest.
The following query identifies these rows:
SELECT a.UserID, a.ContactID, a.SMSID, a.EventID, CreatedDate
FROM WhenToText a
JOIN (SELECT UserID, ContactID, SMSID, EventID
FROM WhenToText
GROUP BY UserID, ContactID, SMSID, EventID
HAVING COUNT(*) > 1 ) b
ON a.UserID = b.UserID
AND a.ContactID = b.ContactID
AND a.SMSID = b.SMSID
AND a.EventID = b.EventID
ORDER BY UserID, ContactID, SMSID, EventID, CreatedDate DESC
However, I'm not sure how to delete these duplicates after I've identified them.
Here is some sample data:
Here is one approach:
DELETE w1
FROM WhenToText w1
INNER JOIN
(
SELECT UserID, ContactID, SMSID, EventID, MAX(CreatedDate) AS MaxDate
FROM WhenToText
GROUP BY UserID, ContactID, SMSID, EventID
) w2
ON w1.UserID = w2.UserID AND w1.ContactID = w2.ContactID AND w1.SMSID = w2.SMSID
AND w1.EventID = w2.EventID
AND w1.CreatedDate != w2.MaxDate
This will delete any record for a given (UserID, ContactID, SMSID, EventID) group whose CreatedDate is not the most recent. Keep in mind that this may leave behind more than one record per group if the latest CreatedDate is shared by several rows.
If you want to test this query first to see which records will be targeted for deletion, you can replace DELETE w1 with SELECT w1.*.
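The "keep only the newest row per group" rule can be tried on a scratch database before touching real data. The following is a rough sketch using Python's built-in sqlite3 module, not MySQL: SQLite has no multi-table DELETE, so the delete is phrased with EXISTS instead of a join (MySQL generally rejects a subquery on the table being deleted from, so this form is for illustration only). Storing CreatedDate as TEXT and using a subset of the sample rows are assumptions of the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE WhenToText ("
    "UserID INT, ContactID INT, SMSID INT, EventID INT, CreatedDate TEXT)"
)
conn.executemany(
    "INSERT INTO WhenToText VALUES (?, ?, ?, ?, ?)",
    [
        (4, 25, 7934, 7407, "2016-02-10 00:00:11"),
        (4, 25, 7934, 7407, "2016-02-09 00:00:12"),
        (15, 63, 12964, 7401, "2016-02-10 03:42:04"),
        (15, 63, 12964, 7401, "2016-02-10 03:41:04"),
        (15, 64, 12326, 7401, "2016-02-07 03:01:03"),
    ],
)

# Delete every row for which a strictly newer row exists in the same group.
# Rows that tie on the latest CreatedDate are all kept, as in the JOIN version.
conn.execute(
    """
    DELETE FROM WhenToText
    WHERE EXISTS (
        SELECT 1
        FROM WhenToText AS newer
        WHERE newer.UserID = WhenToText.UserID
          AND newer.ContactID = WhenToText.ContactID
          AND newer.SMSID = WhenToText.SMSID
          AND newer.EventID = WhenToText.EventID
          AND newer.CreatedDate > WhenToText.CreatedDate
    )
    """
)

remaining = conn.execute(
    "SELECT ContactID, CreatedDate FROM WhenToText ORDER BY ContactID"
).fetchall()
print(remaining)
```

ISO-formatted timestamps compare correctly as strings, which is why the TEXT column is enough for this check.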
Here is a solution using DELETE ... FROM ... JOIN, with a full demo using your data.
SQL:
-- Data preparation
create table WhenToText(UserID int, ContactID int, SMSID int, EventID int, CreatedDate datetime);
insert into WhenToText values
(4, 25, 7934, 7407, '2016-02-10 00:00:11'),
(4, 25, 7934, 7407, '2016-02-09 00:00:12'),
(4, 29, 5132, 7407, '2016-02-10 00:00:11'),
(4, 29, 5132, 7407, '2016-02-09 00:00:12'),
(4, 31, 12944, 7405, '2016-02-10 07:03:02'),
(4, 31, 12944, 7405, '2016-02-10 05:03:02'),
(4, 146, 12908, 7405, '2016-02-10 06:52:02'),
(4, 146, 12908, 7405, '2016-02-10 04:52:02'),
(15, 63, 12964, 7401, '2016-02-10 03:42:04'),
(15, 63, 12964, 7401, '2016-02-10 03:41:04'),
(15, 64, 12326, 7401, '2016-02-07 03:01:03'),
(15, 64, 12326, 7401, '2016-02-07 03:00:03');
SELECT * FROM WhenToText;
-- SQL needed
DELETE a FROM
WhenToText a INNER JOIN
(
SELECT UserID, ContactID, SMSID, EventID, MAX(CreatedDate) CreatedDate
FROM WhenToText
GROUP BY UserID, ContactID, SMSID, EventID
) b
USING(UserID, ContactID, SMSID, EventID)
WHERE
a.CreatedDate != b.CreatedDate;
SELECT * FROM WhenToText;
Output:
mysql> SELECT * FROM WhenToText;
+--------+-----------+-------+---------+---------------------+
| UserID | ContactID | SMSID | EventID | CreatedDate |
+--------+-----------+-------+---------+---------------------+
| 4 | 25 | 7934 | 7407 | 2016-02-10 00:00:11 |
| 4 | 25 | 7934 | 7407 | 2016-02-09 00:00:12 |
| 4 | 29 | 5132 | 7407 | 2016-02-10 00:00:11 |
| 4 | 29 | 5132 | 7407 | 2016-02-09 00:00:12 |
| 4 | 31 | 12944 | 7405 | 2016-02-10 07:03:02 |
| 4 | 31 | 12944 | 7405 | 2016-02-10 05:03:02 |
| 4 | 146 | 12908 | 7405 | 2016-02-10 06:52:02 |
| 4 | 146 | 12908 | 7405 | 2016-02-10 04:52:02 |
| 15 | 63 | 12964 | 7401 | 2016-02-10 03:42:04 |
| 15 | 63 | 12964 | 7401 | 2016-02-10 03:41:04 |
| 15 | 64 | 12326 | 7401 | 2016-02-07 03:01:03 |
| 15 | 64 | 12326 | 7401 | 2016-02-07 03:00:03 |
+--------+-----------+-------+---------+---------------------+
12 rows in set (0.00 sec)
mysql>
mysql> -- SQL needed
mysql> DELETE a FROM
-> WhenToText a INNER JOIN
-> (
-> SELECT UserID, ContactID, SMSID, EventID, MAX(CreatedDate) CreatedDate
-> FROM WhenToText
-> GROUP BY UserID, ContactID, SMSID, EventID
-> ) b
-> USING(UserID, ContactID, SMSID, EventID)
-> WHERE
-> a.CreatedDate != b.CreatedDate;
Query OK, 6 rows affected (0.00 sec)
mysql>
mysql> SELECT * FROM WhenToText;
+--------+-----------+-------+---------+---------------------+
| UserID | ContactID | SMSID | EventID | CreatedDate |
+--------+-----------+-------+---------+---------------------+
| 4 | 25 | 7934 | 7407 | 2016-02-10 00:00:11 |
| 4 | 29 | 5132 | 7407 | 2016-02-10 00:00:11 |
| 4 | 31 | 12944 | 7405 | 2016-02-10 07:03:02 |
| 4 | 146 | 12908 | 7405 | 2016-02-10 06:52:02 |
| 15 | 63 | 12964 | 7401 | 2016-02-10 03:42:04 |
| 15 | 64 | 12326 | 7401 | 2016-02-07 03:01:03 |
+--------+-----------+-------+---------+---------------------+
6 rows in set (0.00 sec)
This should provide the solution you're looking for, given that CreatedDate is a date or datetime column. It also assumes that the most recent row is the one with the latest CreatedDate.
SELECT UserID, ContactID, SMSID, EventID, MAX(CreatedDate) AS CreatedDate
FROM WhenToText
GROUP BY 1, 2, 3, 4;
With these values you could simply overwrite the WhenToText table, which would look something like this:
CREATE TABLE tmp_table LIKE WhenToText;
INSERT INTO tmp_table (SELECT UserID, ContactID, SMSID, EventID, MAX(CreatedDate) AS CreatedDate
FROM WhenToText
GROUP BY 1, 2, 3, 4);
TRUNCATE WhenToText;
INSERT INTO WhenToText (SELECT * FROM tmp_table);
DROP TABLE tmp_table;

MySQL SELECT latest record from a subquery with UNION

I have the following table
+--------+-----------+---------+----------------------------+---------------------+--------------------+
| msg_id | user_from | user_to | msg_text | msg_time | msg_read |
+--------+-----------+---------+----------------------------+---------------------+--------------------+
| 1 | 1 | 72 | Hello Mark from Andy | 2014-09-18 12:44:09 | 2014-09-20 12:44:09|
| 2 | 72 | 1 | Hello Andy from Mark | 2014-09-22 12:45:26 | 2014-09-28 12:45:26|
| 3 | 1 | 72 | Back to you Mark from Andy | 2014-10-18 12:46:01 | |
| 4 | 12388 | 1 | Hello Andy from Graham | 2014-09-20 12:45:37 | 2014-09-20 12:46:37|
| 5 | 1 | 12388 | Hello Graham from Andy | 2014-09-20 12:51:08 | |
| 6 | 106 | 1 | Hello Andy from Carol | 2015-04-18 12:47:04 | |
+--------+-----------+---------+----------------------------+---------------------+--------------------+
As SQLFiddle is down at the moment, here is the query.
-- ----------------------------
-- Table structure for `messages`
-- ----------------------------
DROP TABLE IF EXISTS `messages`;
CREATE TABLE `messages` (
`msg_id` int(11) NOT NULL AUTO_INCREMENT,
`user_from` int(11) NOT NULL,
`user_to` int(11) NOT NULL,
`msg_text` text,
`msg_time` datetime DEFAULT NULL,
`msg_read` datetime DEFAULT NULL,
PRIMARY KEY (`msg_id`),
KEY `IX_MESSAGES` (`user_from`)
) ENGINE=InnoDB AUTO_INCREMENT=7 DEFAULT CHARSET=latin1;
-- ----------------------------
-- Records of messages
-- ----------------------------
INSERT INTO `messages` VALUES ('1', '1', '72', 'Hello Mark from Andy', '2014-09-18 12:44:09', '2014-09-20 12:44:09');
INSERT INTO `messages` VALUES ('2', '72', '1', 'Hello Andy from Mark', '2014-09-22 12:45:26', '2014-09-28 12:45:26');
INSERT INTO `messages` VALUES ('3', '1', '72', 'Back to you Mark from Andy', '2014-10-18 12:46:01', null);
INSERT INTO `messages` VALUES ('4', '12388', '1', 'Hello Andy from Graham', '2014-09-20 12:45:37', '2014-09-20 12:46:37');
INSERT INTO `messages` VALUES ('5', '1', '12388', 'Hello Graham from Andy', '2014-09-20 12:51:08', null);
INSERT INTO `messages` VALUES ('6', '106', '1', 'Hello Andy from Carol', '2015-04-18 12:47:04', null);
As you may have guessed, this is for a messaging system. In order to show a Facebook-style inbox, I want to extract a result set that shows the distinct users that a specific user has communicated with, but I also want to include the latest message along with the message time and whether it was read, bearing in mind that the latest message could be from either the sender or the recipient.
Getting the distinct users was easy enough. I simply use a UNION as follows:
SELECT
t.user_id,
t.msg_read,
t.msg_time,
t.msg_text
FROM
(
(
SELECT
m.user_from AS user_id,
m.msg_time,
m.msg_text,
m.msg_read
FROM
messages m
WHERE
m.user_to = 1
)
UNION
(
SELECT
m.user_to AS user_id,
m.msg_time AS msg_time,
m.msg_text,
m.msg_read
FROM
messages m
WHERE
m.user_from = 1
)
) t
GROUP BY user_id
This produces:
+---------+--------------------+---------------------+------------------------+
| user_id | msg_read | msg_time | msg_text |
+---------+--------------------+---------------------+------------------------+
| 72 | 2014-09-28 12:45:26| 2014-09-22 12:45:26 | Hello Andy from Mark |
| 106 | | 2015-04-18 12:47:04 | Hello Andy from Carol |
| 12388 | 2014-09-20 12:46:37| 2014-09-20 12:45:37 | Hello Andy from Graham |
+---------+--------------------+---------------------+------------------------+
Getting the latest message though is proving tricky. In the past I have simply used a JOIN to another subquery, but when trying to do the same with this, it (of course) doesn't recognise the t table.
SELECT
t.user_id,
t.msg_read,
t.msg_time AS msg_time,
t.msg_text
FROM
(
(
SELECT
m.user_from AS user_id,
m.msg_time AS msg_time,
m.msg_text,
m.msg_read
FROM
messages m
WHERE
m.user_to = 1
)
UNION
(
SELECT
m.user_to AS user_id,
m.msg_time AS msg_time,
m.msg_text,
m.msg_read
FROM
messages m
WHERE
m.user_from = 1
)
) t
INNER JOIN (SELECT MAX(msg_time) AS msg_time, user_id FROM t GROUP BY user_id) t2 ON (t.user_id=t2.user_id AND t.msg_time=t2.msg_time)
GROUP BY user_id
Table 't' doesn't exist
I realise that I could simply JOIN to another query containing the UNION but this seems a rather inefficient way of working.
I also hoped that I could create a temporary table, however it seems this is forbidden by the hosting provider.
Does anyone have any suggestions? I am happy to consider alternatives to the UNION concept.
For reference, the expected outcome should be:
+---------+--------------------+---------------------+----------------------------+
| user_id | msg_read | msg_time | msg_text |
+---------+--------------------+---------------------+----------------------------+
| 106 | | 2015-04-18 12:47:04 | Hello Andy from Carol |
| 72 | | 2014-10-18 12:46:01 | Back to you Mark from Andy |
| 12388 | 2014-09-20 12:46:37| 2014-09-20 12:51:08 | Hello Graham from Andy |
+---------+--------------------+---------------------+----------------------------+
First, you don't need the union. The following query gets all messages:
SELECT (case when m.user_to = 1 then m.user_from else m.user_to end) AS user_id,
m.msg_time, m.msg_text, m.msg_read
FROM messages m
WHERE 1 in (m.user_to, m.user_from);
If you want the most recent one for each user, just use aggregation to get the most recent message and use a join for filtering:
SELECT m.*
FROM (SELECT (case when m.user_to = 1 then m.user_from else m.user_to end) AS user_id,
m.msg_time, m.msg_text, m.msg_read
FROM messages m
WHERE 1 in (m.user_to, m.user_from)
) m JOIN
(SELECT (case when m.user_to = 1 then m.user_from else m.user_to end) AS user_id,
MAX(m.msg_time) as maxt
FROM messages m
WHERE 1 in (m.user_to, m.user_from)
GROUP BY (case when m.user_to = 1 then m.user_from else m.user_to end)
) mm
ON m.user_id = mm.user_id and
m.msg_time = mm.maxt;
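To see what the join-on-MAX query returns for the sample data, it can be run against an in-memory database. This is a minimal sketch using Python's sqlite3 module rather than MySQL; the column types, the ME constant, the named parameter :me, and the final ORDER BY are assumptions added for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages ("
    "msg_id INTEGER PRIMARY KEY, user_from INT, user_to INT, "
    "msg_text TEXT, msg_time TEXT, msg_read TEXT)"
)
conn.executemany(
    "INSERT INTO messages VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 1, 72, "Hello Mark from Andy", "2014-09-18 12:44:09", "2014-09-20 12:44:09"),
        (2, 72, 1, "Hello Andy from Mark", "2014-09-22 12:45:26", "2014-09-28 12:45:26"),
        (3, 1, 72, "Back to you Mark from Andy", "2014-10-18 12:46:01", None),
        (4, 12388, 1, "Hello Andy from Graham", "2014-09-20 12:45:37", "2014-09-20 12:46:37"),
        (5, 1, 12388, "Hello Graham from Andy", "2014-09-20 12:51:08", None),
        (6, 106, 1, "Hello Andy from Carol", "2015-04-18 12:47:04", None),
    ],
)

ME = 1  # the user whose inbox is being built

# Inner query m: every message involving ME, labelled with the other party.
# Inner query mm: the latest msg_time per other party.
query = """
SELECT m.user_id, m.msg_read, m.msg_time, m.msg_text
FROM (SELECT CASE WHEN user_to = :me THEN user_from ELSE user_to END AS user_id,
             msg_time, msg_text, msg_read
      FROM messages
      WHERE :me IN (user_to, user_from)) AS m
JOIN (SELECT CASE WHEN user_to = :me THEN user_from ELSE user_to END AS user_id,
             MAX(msg_time) AS maxt
      FROM messages
      WHERE :me IN (user_to, user_from)
      GROUP BY CASE WHEN user_to = :me THEN user_from ELSE user_to END) AS mm
  ON m.user_id = mm.user_id AND m.msg_time = mm.maxt
ORDER BY m.msg_time DESC
"""
inbox = conn.execute(query, {"me": ME}).fetchall()
for row in inbox:
    print(row)
```

If two messages in the same conversation share the exact same latest msg_time, both are returned; a tie-break on msg_id would be needed to guarantee one row per user.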

Update the next row of the target row in MySQL

Suppose I have a table that tracks if a payment is missed like this:
+----+---------+------------+------------+---------+--------+
| id | loan_id | amount_due | due_at | paid_at | missed |
+----+---------+------------+------------+---------+--------+
| 1 | 1 | 100 | 2013-08-17 | NULL | NULL |
| 5 | 1 | 100 | 2013-09-17 | NULL | NULL |
| 7 | 1 | 100 | 2013-10-17 | NULL | NULL |
+----+---------+------------+------------+---------+--------+
And, for example, I ran a query that checks if a payment is missed like this:
UPDATE loan_payments
SET missed = 1
WHERE DATEDIFF(NOW(), due_at) >= 10
AND paid_at IS NULL
Then suppose that the row with id = 1 gets affected. I want the amount_due of the row with id = 1 to be added to the amount_due of the next row, so the table would look like this:
+----+---------+------------+------------+---------+--------+
| id | loan_id | amount_due | due_at | paid_at | missed |
+----+---------+------------+------------+---------+--------+
| 1 | 1 | 100 | 2013-08-17 | NULL | 1 |
| 5 | 1 | 200 | 2013-09-17 | NULL | NULL |
| 7 | 1 | 100 | 2013-10-17 | NULL | NULL |
+----+---------+------------+------------+---------+--------+
Any advice on how to do it?
Thanks
Take a look at this :
SQL Fiddle
MySQL 5.5.32 Schema Setup:
CREATE TABLE loan_payments
(`id` int, `loan_id` int, `amount_due` int,
`due_at` varchar(10), `paid_at` varchar(4), `missed` varchar(4))
;
INSERT INTO loan_payments
(`id`, `loan_id`, `amount_due`, `due_at`, `paid_at`, `missed`)
VALUES
(1, 1, 100, '2013-09-17', NULL, NULL),
(3, 2, 100, '2013-09-17', NULL, NULL),
(5, 1, 100, '2013-10-17', NULL, NULL),
(7, 1, 100, '2013-11-17', NULL, NULL)
;
UPDATE loan_payments AS l
LEFT OUTER JOIN (SELECT loan_id, MIN(ID) AS ID
FROM loan_payments
WHERE DATEDIFF(NOW(), due_at) < 0
GROUP BY loan_id) AS l2 ON l.loan_id = l2.loan_id
LEFT OUTER JOIN loan_payments AS l3 ON l2.id = l3.id
SET l.missed = 1, l3.amount_due = l3.amount_due + l.amount_due
WHERE DATEDIFF(NOW(), l.due_at) >= 10
AND l.paid_at IS NULL
;
Query 1:
SELECT *
FROM loan_payments
Results:
| ID | LOAN_ID | AMOUNT_DUE | DUE_AT | PAID_AT | MISSED |
|----|---------|------------|------------|---------|--------|
| 1 | 1 | 100 | 2013-09-17 | (null) | 1 |
| 3 | 2 | 100 | 2013-09-17 | (null) | 1 |
| 5 | 1 | 200 | 2013-10-17 | (null) | (null) |
| 7 | 1 | 100 | 2013-11-17 | (null) | (null) |
Unfortunately I don't have time at the moment to write out full-blown SQL, but here's the pseudocode I think you need to implement:
select all DISTINCT loan_id from table loan_payments
for each loan_id:
    set missed = 1 for all outstanding payments for loan_id (as determined by date)
    select the sum of all outstanding payments for loan_id
    add this sum to the amount_due for the loan's next due date after today
Refer to this for how to loop using pure MySQL: http://dev.mysql.com/doc/refman/5.7/en/cursors.html
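As a rough illustration of the pseudocode, here is the same loop written in application code instead of a MySQL cursor. It is only a sketch: the function name roll_over_missed, the today and grace_days parameters, and the list-of-dicts representation of the rows are all assumptions, and "next due date" is interpreted as the next payment of the same loan that is not yet past the grace period.

```python
from datetime import date

def roll_over_missed(payments, today, grace_days=10):
    """Mark overdue unpaid payments as missed and add their amounts to the
    next payment of the same loan that is not yet overdue.

    `payments` is a list of dicts with the columns from the question.
    """
    by_loan = {}
    for p in payments:
        by_loan.setdefault(p["loan_id"], []).append(p)

    for rows in by_loan.values():
        rows.sort(key=lambda p: p["due_at"])
        carried = 0  # sum of newly missed amounts waiting to be rolled over
        for p in rows:
            overdue = (today - p["due_at"]).days >= grace_days
            if overdue and p["paid_at"] is None and p["missed"] is None:
                p["missed"] = 1
                carried += p["amount_due"]
            elif not overdue and carried:
                p["amount_due"] += carried
                carried = 0
    return payments

payments = [
    {"id": 1, "loan_id": 1, "amount_due": 100, "due_at": date(2013, 8, 17), "paid_at": None, "missed": None},
    {"id": 5, "loan_id": 1, "amount_due": 100, "due_at": date(2013, 9, 17), "paid_at": None, "missed": None},
    {"id": 7, "loan_id": 1, "amount_due": 100, "due_at": date(2013, 10, 17), "paid_at": None, "missed": None},
]
roll_over_missed(payments, today=date(2013, 8, 28))
for p in payments:
    print(p["id"], p["amount_due"], p["missed"])
```

Rows already flagged as missed are skipped, so running the function twice on the same day does not add the same amount twice.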
I fixed my own problem by adding a missed_at field. I put the current timestamp ($now) in a variable before updating the first row to missed = 1 and missed_at = $now, then I ran this query to update the next row's amount_due:
UPDATE loan_payments lp1 JOIN loan_payments lp2 ON lp1.due_at > lp2.due_at
SET lp1.amount_due = lp2.amount_due + lp1.amount_due
WHERE lp2.missed_at = $now AND DATEDIFF(lp1.due_at, lp2.due_at) <= DAYOFMONTH(LAST_DAY(lp1.due_at))
I wish I could just use LIMIT 1 in that query, but it turns out that this is not possible for an UPDATE query with a JOIN.
So all in all, I used two queries to achieve what I want. It did the trick.
Please advise if you have better solutions.
Thanks!

How do I create a period date range from a mysql table grouping every common sequence of value in a column

My goal is to return a start and end date having same value in a column. Here is my table. The (*) have been marked to give you the idea of how I want to get "EndDate" for every similar sequence value of A & B columns
ID | DayDate | A | B
-----------------------------------------------
1 | 2010/07/1 | 200 | 300
2 | 2010/07/2 | 200 | 300 *
3 | 2010/07/3 | 150 | 250
4 | 2010/07/4 | 150 | 250 *
8 | 2010/07/5 | 150 | 350 *
9 | 2010/07/6 | 200 | 300
10 | 2010/07/7 | 200 | 300 *
11 | 2010/07/8 | 100 | 200
12 | 2010/07/9 | 100 | 200 *
and I want to get the following result table from the above table
| DayDate |EndDate | A | B
-----------------------------------------------
| 2010/07/1 |2010/07/2 | 200 | 300
| 2010/07/3 |2010/07/4 | 150 | 250
| 2010/07/5 |2010/07/5 | 150 | 350
| 2010/07/6 |2010/07/7 | 200 | 300
| 2010/07/8 |2010/07/9 | 100 | 200
UPDATE:
Thanks Mike. Your approach seems to work if the following row is considered a mistake.
8 | 2010/07/5 | 150 | 350 *
However, it is not a mistake. The challenge with this type of data is that it resembles a log of market price changes by date. The real problem in my case is to select each run of rows, with its beginning and ending date, in which both A and B match across all the rows. Then the rows that follow the previously selected run are selected in the same way, and so on, so that no data in the table is left out.
I can explain a real-world scenario. A hotel with rooms A and B has room rates for each day entered into the table, as explained in my question. Now the hotel needs a report that shows the price calendar in a shorter form, using start and end dates instead of listing every date entered. For example, from 2010/07/01 to 2010/07/02 the price of A is 200 and B is 300. The price changes for the 3rd to the 4th, and on the 5th there is a different price for that day only, where the price of room B changes to 350. This is treated as a single-day period, which is why its start and end dates are the same.
I hope this explains the scenario. Also note that the hotel may be closed for a specific time period; treat this as an additional problem on top of my first question. What if the rate is not entered for specific dates? For example, on Sundays the hotel does not sell these two rooms, so no price is entered, meaning the row will not exist in the table.
Creating related tables allows you much greater freedom to query and extract relevant information. Here's a few links that you might find useful:
You could start with these tutorials:
http://dev.mysql.com/tech-resources/articles/intro-to-normalization.html
http://net.tutsplus.com/tutorials/databases/sql-for-beginners/
There are also a couple of questions here on stackoverflow that might be useful:
Normalization in plain English
What exactly does database normalization do?
Anyway, on to a possible solution. The following examples use your hotel rooms analogy.
First, create a table to hold information about the hotel rooms. This table just contains the room ID and its name, but you could store other information in here, such as the room type (single, double, twin), its view (ocean front, ocean view, city view, pool view), and so on:
CREATE TABLE `room` (
`id` INT UNSIGNED NOT NULL AUTO_INCREMENT,
`name` VARCHAR(45) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE INDEX `name_UNIQUE` (`name` ASC) )
ENGINE = InnoDB;
Now create a table to hold the changing room rates. This table links to the room table through the room_id column. The foreign key constraint prevents records being inserted into the rate table which refer to rooms that do not exist:
CREATE TABLE `rate` (
`id` INT UNSIGNED NOT NULL AUTO_INCREMENT ,
`room_id` INT UNSIGNED NOT NULL,
`date` DATE NOT NULL,
`rate` DECIMAL(6,2) UNSIGNED NOT NULL,
PRIMARY KEY (`id`),
INDEX `fk_room_rate` (`room_id` ASC),
CONSTRAINT `fk_room_rate`
FOREIGN KEY (`room_id` )
REFERENCES `room` (`id` )
ON DELETE CASCADE
ON UPDATE CASCADE)
ENGINE = InnoDB;
Create two rooms, and add some daily rate information about each room:
INSERT INTO `room` (`id`, `name`) VALUES (1, 'A'), (2, 'B');
INSERT INTO `rate` (`id`, `room_id`, `date`, `rate`) VALUES
( 1, 1, '2010-07-01', 200),
( 2, 1, '2010-07-02', 200),
( 3, 1, '2010-07-03', 150),
( 4, 1, '2010-07-04', 150),
( 5, 1, '2010-07-05', 150),
( 6, 1, '2010-07-06', 200),
( 7, 1, '2010-07-07', 200),
( 8, 1, '2010-07-08', 100),
( 9, 1, '2010-07-09', 100),
(10, 2, '2010-07-01', 300),
(11, 2, '2010-07-02', 300),
(12, 2, '2010-07-03', 250),
(13, 2, '2010-07-04', 250),
(14, 2, '2010-07-05', 350),
(15, 2, '2010-07-06', 300),
(16, 2, '2010-07-07', 300),
(17, 2, '2010-07-08', 200),
(18, 2, '2010-07-09', 200);
With that information stored, a simple SELECT query with a JOIN will show you all the daily room rates:
SELECT
room.name,
rate.date,
rate.rate
FROM room
JOIN rate
ON rate.room_id = room.id;
+------+------------+--------+
| name | date | rate |
+------+------------+--------+
| A | 2010-07-01 | 200.00 |
| A | 2010-07-02 | 200.00 |
| A | 2010-07-03 | 150.00 |
| A | 2010-07-04 | 150.00 |
| A | 2010-07-05 | 150.00 |
| A | 2010-07-06 | 200.00 |
| A | 2010-07-07 | 200.00 |
| A | 2010-07-08 | 100.00 |
| A | 2010-07-09 | 100.00 |
| B | 2010-07-01 | 300.00 |
| B | 2010-07-02 | 300.00 |
| B | 2010-07-03 | 250.00 |
| B | 2010-07-04 | 250.00 |
| B | 2010-07-05 | 350.00 |
| B | 2010-07-06 | 300.00 |
| B | 2010-07-07 | 300.00 |
| B | 2010-07-08 | 200.00 |
| B | 2010-07-09 | 200.00 |
+------+------------+--------+
To find the start and end dates for each room rate, you need a more complex query:
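SELECT
MIN(id) AS id,
room_id,
MIN(date) AS start_date,
MAX(date) AS end_date,
COUNT(*) AS days,
rate
FROM (
SELECT
id,
room_id,
date,
rate,
-- grp counts the earlier rows of the same room that have a different rate,
-- so it stays constant within one run of identical rates
(
SELECT COUNT(*)
FROM rate AS b
WHERE b.rate <> a.rate
AND b.date <= a.date
AND b.room_id = a.room_id
) AS grp
FROM rate AS a
ORDER BY a.room_id, a.date
) c
-- room_id is part of the key so that runs of different rooms never merge
GROUP BY room_id, rate, grp
ORDER BY room_id, MIN(date);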
+----+---------+------------+------------+------+--------+
| id | room_id | start_date | end_date | days | rate |
+----+---------+------------+------------+------+--------+
| 1 | 1 | 2010-07-01 | 2010-07-02 | 2 | 200.00 |
| 3 | 1 | 2010-07-03 | 2010-07-05 | 3 | 150.00 |
| 6 | 1 | 2010-07-06 | 2010-07-07 | 2 | 200.00 |
| 8 | 1 | 2010-07-08 | 2010-07-09 | 2 | 100.00 |
| 10 | 2 | 2010-07-01 | 2010-07-02 | 2 | 300.00 |
| 12 | 2 | 2010-07-03 | 2010-07-04 | 2 | 250.00 |
| 14 | 2 | 2010-07-05 | 2010-07-05 | 1 | 350.00 |
| 15 | 2 | 2010-07-06 | 2010-07-07 | 2 | 300.00 |
| 17 | 2 | 2010-07-08 | 2010-07-09 | 2 | 200.00 |
+----+---------+------------+------------+------+--------+
You can find a good explanation of the technique used in the above query here:
http://www.sqlteam.com/article/detecting-runs-or-streaks-in-your-data
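The run-detection idea can also be sketched outside SQL. The following Python example is an illustration only, not the technique from the linked article: it works on the (DayDate, A, B) rows from the question and uses itertools.groupby, which groups consecutive items with an equal key. The function name collapse_runs is hypothetical, and the sketch assumes no missing dates; a gap in the calendar would not end a run here.

```python
from datetime import date
from itertools import groupby

# (DayDate, A, B) rows copied from the question.
rows = [
    (date(2010, 7, 1), 200, 300),
    (date(2010, 7, 2), 200, 300),
    (date(2010, 7, 3), 150, 250),
    (date(2010, 7, 4), 150, 250),
    (date(2010, 7, 5), 150, 350),
    (date(2010, 7, 6), 200, 300),
    (date(2010, 7, 7), 200, 300),
    (date(2010, 7, 8), 100, 200),
    (date(2010, 7, 9), 100, 200),
]

def collapse_runs(rows):
    """Collapse consecutive days with identical (A, B) into (start, end, A, B)."""
    periods = []
    # Sorting the tuples orders them by date; groupby then splits the
    # sequence every time the (A, B) pair changes.
    for (a, b), run in groupby(sorted(rows), key=lambda r: (r[1], r[2])):
        run = list(run)
        periods.append((run[0][0], run[-1][0], a, b))
    return periods

for start, end, a, b in collapse_runs(rows):
    print(start, end, a, b)
```

A single-day run such as 2010-07-05 gets the same start and end date, because the first and last row of its group are the same row.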
My general approach is to join the table onto itself on DayDate = DayDate + 1 where the A or B values differ.
This finds the end date of each period (the days where the values are different on the following day).
The only problem is that this won't find an end date for the final period. To get around this, I select the max date from the table and union it into the list of end dates.
Once the list of end dates is defined, you can join it to the original table on the end date being greater than or equal to the original date, taking the earliest such end date for each row.
From this final list, select the minimum DayDate grouped by the other fields.
select
min(DayDate) as DayDate,EndDate,A,B from
(SELECT DayDate, A, B, min(ends.EndDate) as EndDate
FROM yourtable
LEFT JOIN
(SELECT max(DayDate) as EndDate FROM yourtable UNION
SELECT t1.DayDate as EndDate
FROM yourtable t1
JOIN yourtable t2
ON date_add(t1.DayDate, INTERVAL 1 DAY) = t2.DayDate
AND (t1.A<>t2.A OR t1.B<>t2.B)) ends
ON ends.EndDate>=DayDate
GROUP BY DayDate, A, B) x
GROUP BY EndDate,A,B
I think I have found a solution which does produce the table desired.
SELECT
a.DayDate AS StartDate,
( SELECT b.DayDate
FROM Dates AS b
WHERE b.DayDate > a.DayDate AND (b.B = a.B OR b.B IS NULL)
ORDER BY b.DayDate ASC LIMIT 1
) AS StopDate,
a.A as A,
a.B AS B
FROM Dates AS a
WHERE Coalesce(
(SELECT c.B
FROM Dates AS c
WHERE c.DayDate <= a.DayDate
ORDER BY c.DayDate DESC LIMIT 1,1
), -99999
) <> a.B
AND a.B IS NOT NULL
ORDER BY a.DayDate ASC;
This generates the following result:
StartDate StopDate A B
2010-07-01 2010-07-02 200 300
2010-07-03 2010-07-04 150 250
2010-07-05 NULL 150 350
2010-07-06 2010-07-07 200 300
2010-07-08 2010-07-09 100 200
But I still need a way to replace the NULL with the start date of the same row.