Getting the missing dates from a sql table - mysql

I have a MySQL table with the columns: ID, name, startDate, endDate.
As a rule, the dates should be consecutive, and I want to alert the user if an interval is missing.
Say I have these dates inserted:
2012-03-25 -> 2012-03-29
2012-04-02 -> 2012-04-05
I want to show a message like
"No dates found from 2012-03-29 to 2012-04-02. Please insert data for this interval"
Can this be done without looping through every row of the table in PHP?
Thanks!

SELECT t1.endDate AS gapStart, (SELECT MIN(t3.startDate) FROM `table` t3 WHERE t3.startDate > t1.endDate) AS gapEnd
FROM `table` t1
LEFT JOIN `table` t2
ON t1.endDate = t2.startDate
WHERE t2.startDate IS NULL

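This works, provided the ids are consecutive and follow date order (the query pairs each row with the row whose id is one higher). I include code for creating the table and inserting data for testing purposes.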
create table dates(
id int(11) not null auto_increment,
name varchar(16) not null,
startDate date,
endDate date,
primary key(id)
);
insert into dates (name,startDate,endDate)
values('personA', '2012-03-25', '2012-03-29'),
('PersonB','2012-04-02', '2012-04-05');
So here is the query:
select d1.endDate,d2.startDate
from dates d1, dates d2
where (d1.id+1) =d2.id and d1.endDate < d2.startDate;

Yes, it can be done without walking the whole table in PHP; instead, MySQL has to walk the whole table. A crude solution would be:
SELECT a.enddate AS start_of_gap,
(SELECT MIN(c.startdate)
FROM yourtable c
WHERE c.startdate>a.enddate) AS end_of_gap
FROM yourtable a
WHERE NOT EXISTS (
SELECT 1
FROM yourtable b
WHERE b.startdate=a.enddate + INTERVAL 1 DAY
);
I expect that if I thought about it some more, I would find a more efficient (but likely less obvious) method.
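The NOT EXISTS idea above can be tried out end to end with a small script. This is only a sketch: it uses Python's standard-library sqlite3 module as a stand-in for MySQL (so `date(x, '+1 day')` replaces `x + INTERVAL 1 DAY`), it reuses the `dates` table and sample rows from the earlier answer, and it assumes the stored ranges do not overlap. It also drops the row for the last range, whose "next start" would otherwise come back as NULL.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dates (id INTEGER PRIMARY KEY, name TEXT, "
    "startDate TEXT, endDate TEXT)"
)
conn.executemany(
    "INSERT INTO dates (name, startDate, endDate) VALUES (?, ?, ?)",
    [
        ("personA", "2012-03-25", "2012-03-29"),
        ("personB", "2012-04-02", "2012-04-05"),
    ],
)

# For every range, find the earliest range that starts after it ends.
# A gap is reported when that next start is more than one day later.
gap_sql = """
SELECT gap_start, gap_end
FROM (
    SELECT a.endDate AS gap_start,
           (SELECT MIN(c.startDate)
            FROM dates c
            WHERE c.startDate > a.endDate) AS gap_end
    FROM dates a
) AS g
WHERE gap_end IS NOT NULL
  AND gap_end > date(gap_start, '+1 day')
ORDER BY gap_start
"""
gaps = conn.execute(gap_sql).fetchall()

for gap_start, gap_end in gaps:
    print(
        f"No dates found from {gap_start} to {gap_end}. "
        "Please insert data for this interval"
    )
```

With the two sample rows this reports the single gap between 2012-03-29 and 2012-04-02.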

Related

SQL Find date range gaps in Table

Good day.
I seem to be struggling with what seems like a simple problem.
I have a table that has a value connected to a date (Monthly) for a finite number of ID's
ie. Table1
ID | Date ---| Value
01 | 2015-01 | val1
01 | 2015-02 | val2
02 | 2015-01 | val1
02 | 2015-03 | val2
So ID: 02 does not have a value for date 2015-02.
I would like to return all ID's and Dates that do not have a value.
Date range is: select distinct date from Table1
I can't seem to think outside the realms of selecting and joining on the same table.
I need to include the ID in my select so that I can somehow select the ID and the date range that exists for that ID and compare it to the entire date range, to get all the dates for each ID that aren't in the "entire" date range.
Please advise.
Thank you
Not very clear about your last two sentences. But you can play with the following query with different @max_days and @min_date:
-- DROP TABLE table1;
CREATE TABLE table1(ID int not null, `date` date not null, value varchar(64) not null);
INSERT table1(ID,`date`,value)
VALUES (1,'2015-01-01','v1'),(1,'2015-01-02','v2'),(2,'2015-01-01','v1'),(2,'2015-01-03','v2'),(4,'2015-01-01','v1'),(4,'2015-01-04','v2');
SELECT * FROM table1;
SET @day=0;
SET @max_days=5;
SET @min_date='2015-01-01';
SELECT i.ID,d.`date`
FROM (SELECT DISTINCT ID FROM table1) i
CROSS JOIN (
SELECT TIMESTAMPADD(DAY,@day,@min_date) AS `date`,@day:=@day+1 AS day_num
FROM table1 WHERE @day<@max_days) d
LEFT JOIN table1 t
ON t.ID=i.ID
AND t.`date`=d.`date`
WHERE t.`date` IS NULL
ORDER BY i.ID,d.`date`;
I now understand your requirement of dates being taken from the table; you want to find any gaps in the date ranges for each id.
This does what you need, but can probably be improved. Explanation below and you can view a working example.
DROP TABLE IF EXISTS Table1;
DROP TABLE IF EXISTS Year_Month_Calendar;
CREATE TABLE Table1 (
id INTEGER
,date CHAR(7)
,value CHAR(4)
);
INSERT INTO Table1
VALUES
(1,'2015-01','val1')
,(1,'2015-02','val2')
,(2,'2015-01','val1')
,(2,'2015-03','val1');
CREATE TABLE Year_Month_Calendar (
date CHAR(10)
);
INSERT INTO Year_Month_Calendar
VALUES
('2015-01')
,('2015-02')
,('2015-03');
SELECT ID_Year_Month.id, ID_Year_Month.date, Table1.id, Table1.date
FROM (
SELECT Distinct_ID.id, Year_Month_Calendar.date
FROM Year_Month_Calendar
CROSS JOIN
( SELECT DISTINCT id FROM Table1 ) AS Distinct_ID
WHERE Year_Month_Calendar.date >= (SELECT MIN(date) FROM Table1 WHERE id=Distinct_ID.ID)
AND Year_Month_Calendar.date <= (SELECT MAX(date) FROM Table1 WHERE id=Distinct_ID.ID)
) AS ID_Year_Month
LEFT JOIN Table1
ON ID_Year_Month.id = Table1.id AND ID_Year_Month.date = Table1.date
-- WHERE Table1.id IS NULL
ORDER BY ID_Year_Month.id, ID_Year_Month.date
Explanation
You need a calendar table which contains all dates (year/months) to cover the data you are querying.
CREATE TABLE Year_Month_Calendar (
date CHAR(10)
);
INSERT INTO Year_Month_Calendar
VALUES
('2015-01')
,('2015-02')
,('2015-03');
The inner select creates a table with all dates between the min and max date for each id.
SELECT Distinct_ID.id, Year_Month_Calendar.date
FROM Year_Month_Calendar
CROSS JOIN
( SELECT DISTINCT id FROM Table1 ) AS Distinct_ID
WHERE Year_Month_Calendar.date >= (SELECT MIN(date) FROM Table1 WHERE id=Distinct_ID.ID)
AND Year_Month_Calendar.date <= (SELECT MAX(date) FROM Table1 WHERE id=Distinct_ID.ID)
This is then LEFT JOINED to the original table to find the missing rows.
If you only want to return the missing rows (my query displays the whole table to show how it works), uncomment the WHERE clause so that the output is restricted to rows where no id and date are returned from Table1.
Original answer before comments
You can do this without a tally table, since you say
Date range is: select distinct date from Table1
I've slightly changed the field names to avoid reserved words in SQL.
SELECT id_table.ID, date_table.`year_month`, table1.val
FROM (SELECT DISTINCT ID FROM table1) AS id_table
CROSS JOIN
(SELECT DISTINCT `year_month` FROM table1) AS date_table
LEFT JOIN table1
ON table1.ID=id_table.ID AND table1.`year_month` = date_table.`year_month`
ORDER BY id_table.ID
I've not filtered the results, in order to show how the query is working. To return only the rows where a date is missing, add WHERE table1.year_month IS NULL to the outer query.
SQL Fiddle
You will need a tally table (or tables) or month/year tables, so that you can generate all of the potential combinations you want to test against. As for exactly how to use it, your example could use some expanding, such as last 12 months, last 3 months, etc., but here is an example that might help you understand what you are looking for:
http://rextester.com/ZDQS5259
CREATE TABLE IF NOT EXISTS Tbl (
ID INTEGER
,Date VARCHAR(10)
,Value VARCHAR(10)
);
INSERT INTO Tbl VALUES
(1,'2015-01','val1')
,(1,'2015-02','val2')
,(2,'2015-01','val1')
,(2,'2015-03','val1');
SELECT yr.YearNumber, mn.MonthNumber, i.Id
FROM
(
SELECT 2016 as YearNumber
UNION SELECT 2015
) yr
CROSS JOIN (
SELECT 1 MonthNumber
UNION SELECT 2
UNION SELECT 3
UNION SELECT 4
UNION SELECT 5
UNION SELECT 6
UNION SELECT 7
UNION SELECT 8
UNION SELECT 9
UNION SELECT 10
UNION SELECT 11
UNION SELECT 12
) mn
CROSS JOIN (
SELECT DISTINCT ID
FROM
Tbl
) i
LEFT JOIN Tbl t
ON yr.YearNumber = CAST(LEFT(t.Date,4) as UNSIGNED)
AND mn.MonthNumber = CAST(RIGHT(t.Date,2) AS UNSIGNED)
AND i.ID = t.ID
WHERE
t.ID IS NULL
The basic idea for determining what you don't know is to generate all possible combinations of what there could be, e.g. Year x Month x DISTINCT Id, and then join back to figure out what is missing.
Probably not the prettiest but this should work.
select distinct c.ID, c.Date, d.Value
from (select a.ID, b.Date
from (select distinct ID from Table1) as a, (select distinct Date from Table1) as b) as c
left outer join Table1 d on (c.ID = d.ID and c.Date = d.Date)
where d.Value is NULL
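The "generate every combination, then join back" idea shared by these answers can be checked with a short script. This is a sketch only: it runs the query through Python's standard-library sqlite3 module rather than MySQL, uses the sample rows from the question, and takes the date range from the distinct dates already present in the table, as the question specifies.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (ID INTEGER, Date TEXT, Value TEXT)")
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?, ?)",
    [
        (1, "2015-01", "val1"),
        (1, "2015-02", "val2"),
        (2, "2015-01", "val1"),
        (2, "2015-03", "val2"),
    ],
)

# Cross join every ID with every date, then keep only the pairs
# that have no matching row in the original table.
missing_sql = """
SELECT ids.ID, months.Date
FROM (SELECT DISTINCT ID FROM Table1) AS ids
CROSS JOIN (SELECT DISTINCT Date FROM Table1) AS months
LEFT JOIN Table1 AS t
       ON t.ID = ids.ID AND t.Date = months.Date
WHERE t.ID IS NULL
ORDER BY ids.ID, months.Date
"""
missing = conn.execute(missing_sql).fetchall()
print(missing)
```

Because the date range here is the set of all dates in the table, ID 1 is reported as missing 2015-03 as well as ID 2 missing 2015-02. Restricting each ID to its own min/max range, as the calendar-table answer does, would report only the second pair.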

MySQL get records between tables with conditions

I've got a big problem on my hands. I have the following SQL structure, where the contract tables are dynamically generated with random names, like _xyz, _xxx, etc.:
CREATE TABLE contract_xyz(
id INT(11) PRIMARY KEY NOT NULL AUTO_INCREMENT,
created_at DATETIME NOT NULL
);
CREATE TABLE contract_events(
id INT(11) PRIMARY KEY NOT NULL AUTO_INCREMENT,
id_contract INT(11) NOT NULL,
table_contract VARCHAR(255) NOT NULL,
created_at DATETIME NOT NULL
);
INSERT INTO contract_xyz (id,created_at) VALUES (1,'2016-11-01');
INSERT INTO contract_xyz (id,created_at) VALUES (2,'2016-10-21');
INSERT INTO contract_xyz (id,created_at) VALUES (3,'2016-11-04');
INSERT INTO contract_events(id,id_contract,table_contract,created_at) VALUES (1,1,'contract_xyz','2016-11-03');
INSERT INTO contract_events(id,id_contract,table_contract,created_at) VALUES (2,3,'contract_xyz','2016-11-04');
Each contract can have its own events. I need to solve the following issue:
Get all contracts that have had no new events in the last 2 days, or that have no events at all and were created over 2 days ago.
I've tried LEFT JOIN, but it didn't give the correct result. The nearest I got was the following query:
SELECT `contract_xyz`.*
FROM `contract_xyz`
WHERE EXISTS(SELECT 1
FROM `contract_events`
WHERE
`contract_events`.id_contract = `contract_xyz`.id AND `contract_events`.table_contract = 'contract_xyz'
AND DATEDIFF(CURDATE(), `contract_events`.created_at) >= 2
ORDER BY `contract_events`.created_at DESC
LIMIT 1)
OR (NOT EXISTS(SELECT 1
FROM `contract_events`
WHERE `contract_events`.id_contract = `contract_xyz`.id AND
`contract_events`.table_contract = 'contract_xyz') AND
DATEDIFF(CURDATE(), `contract_xyz`.created_at) >= 2);
But I still can't find the contracts that don't have any events and were created over 2 days ago.
I would create a subquery with the max event date for each contract. I would left join the contracts table on this subquery. You can filter based on the max event date and the created date fields to achieve the expected outcome:
select c.*
from contract_xyz c
left join
(select id_contract,
max(created_at) max_event_date
from contract_events
where table_contract = 'contract_xyz'
group by id_contract) t on c.id = t.id_contract
where
DATEDIFF(CURDATE(), t.max_event_date) >= 2
or (t.max_event_date is null and DATEDIFF(CURDATE(), c.created_at) >= 2)
Alternatively, you can skip the subquery, join the two tables directly with GROUP BY, and do the filtering in the HAVING clause.
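The GROUP BY / HAVING alternative could look roughly like the sketch below. It is run through Python's standard-library sqlite3 module as a stand-in for MySQL, so `julianday()` differences replace `DATEDIFF()`, and "today" is passed in as a fixed, hypothetical parameter (2016-11-05) instead of `CURDATE()` so that the result is reproducible with the question's sample rows.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contract_xyz (id INTEGER PRIMARY KEY, created_at TEXT NOT NULL);
CREATE TABLE contract_events (
    id INTEGER PRIMARY KEY,
    id_contract INTEGER NOT NULL,
    table_contract TEXT NOT NULL,
    created_at TEXT NOT NULL
);
INSERT INTO contract_xyz VALUES (1, '2016-11-01'), (2, '2016-10-21'), (3, '2016-11-04');
INSERT INTO contract_events VALUES
    (1, 1, 'contract_xyz', '2016-11-03'),
    (2, 3, 'contract_xyz', '2016-11-04');
""")

# Join contracts to their events directly. For each contract, take the
# latest event date, or fall back to the contract's creation date when
# there are no events, and keep it if that date is at least 2 days old.
stale_sql = """
SELECT c.id
FROM contract_xyz AS c
LEFT JOIN contract_events AS e
       ON e.id_contract = c.id
      AND e.table_contract = 'contract_xyz'
GROUP BY c.id, c.created_at
HAVING julianday(:today)
     - julianday(COALESCE(MAX(e.created_at), c.created_at)) >= 2
ORDER BY c.id
"""
# ':today' is a hypothetical fixed date used only to keep the example stable.
stale_ids = [row[0] for row in conn.execute(stale_sql, {"today": "2016-11-05"})]
print(stale_ids)
```

Contract 1 qualifies because its last event is two days old, contract 2 because it has no events and was created long ago, and contract 3 is excluded because its last event is only one day old.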
LEFT OUTER JOIN with an ON condition could help here:
select c.id, c.created_at,count(e.id) as contract_events_less_than_2_days_old
from contract_xyz c
left outer join contract_events e on e.id_contract = c.id
and e.table_contract = 'contract_xyz'
and e.created_at > now() - interval 2 day
where c.created_at < now() - interval 2 day
and e.id is null
group by c.id, c.created_at;
If you have any control over it I would advise against dynamically-generated table names!

MySQL - Get rows from 2 tables where time distance is 5 minutes, and put them in new table

I have two MySQL tables (tbl1 and tbl2), and I want to get the rows from tbl1 and tbl2 which have a time difference of 5 minutes from one another. I want to put the resulting rows in another table, called combined:
INSERT INTO combined ( news_id, col1, col2, col3, col4, quote_ID, quote_DATE, quote_TIME)
SELECT tbl1.news_id, tbl1.col1, tbl1.col2, tbl1.col3, tbl1.articleTime, tbl1.articleDate, tbl2.ID, tbl2.DATE, tbl2.TIME, tbl2.BID_PRICE, tbl2.BID_SIZE, tbl2.ASK_SIZE, tbl2.BID_YIELD, tbl2.ASK_YIELD
FROM tbl1 JOIN tbl2
WHERE tbl1.articleDate = tbl2.date
AND hour(tbl1.articleTime) = hour(tbl2.time)
AND minute(tbl1.articleTime) = minute(tbl2.time)+5;
articleDate and articleTime are varchar(11) and varchar(12) in tbl1; time is TIME and date is varchar(10) in tbl2.
Is my query right ? Can I do something better ? Thanks a lot !
Nobody answered (probably it was too easy or sth :D) but I worked out a query that works ! - wasn't that difficult after all
INSERT INTO combined ( news_id, tbl1Time, tbl1Date, id, date, time)
SELECT n.news_id, n.tbl1Time, n.tbl1Date, t.id, t.date, t.time
FROM tbl1 n
JOIN tbl2 t
ON n.tbl1Date= t.date AND hour(n.tbl1Time) = hour(t.time) AND minute(n.tbl1Time)+5 = minute(t.time);
Is there some way to speed up this query ? Thanks !
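One limitation of matching hour() and minute() separately is that pairs straddling an hour boundary (for example 09:58 and 10:03) are never matched. A possible alternative is to compare full timestamps. The sketch below illustrates this with Python's standard-library sqlite3 module as a stand-in for MySQL; the sample rows are hypothetical, the dates are assumed to be stored as YYYY-MM-DD text, and "5 minutes apart" is taken to mean exactly 300 seconds later, seconds included.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl1 (news_id INTEGER, articleDate TEXT, articleTime TEXT);
CREATE TABLE tbl2 (id INTEGER, date TEXT, time TEXT);
-- Hypothetical sample rows, not taken from the question.
INSERT INTO tbl1 VALUES
    (1, '2012-05-01', '09:58:00'),
    (2, '2012-05-01', '14:00:00');
INSERT INTO tbl2 VALUES
    (10, '2012-05-01', '10:03:00'),
    (11, '2012-05-01', '14:05:00'),
    (12, '2012-05-01', '16:30:00');
""")

# Convert date + time to epoch seconds on both sides and require the
# quote to be exactly 300 seconds after the article.
pairs_sql = """
SELECT n.news_id, t.id
FROM tbl1 AS n
JOIN tbl2 AS t ON t.date = n.articleDate
WHERE CAST(strftime('%s', t.date || ' ' || t.time) AS INTEGER)
    - CAST(strftime('%s', n.articleDate || ' ' || n.articleTime) AS INTEGER) = 300
ORDER BY n.news_id
"""
pairs = conn.execute(pairs_sql).fetchall()
print(pairs)
```

The pair (1, 10) crosses the hour boundary and is still found, which the hour/minute comparison would miss.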

avoiding creation of temporary or permanent tables

I want to manipulate a table, but the only solution I have found so far is to create some tables first and join them together to get the desired results.
I'm trying to avoid creating tables and dropping them at the end of my MySQL query which btw, I'm running on phpmyadmin page.
Here is the data: I have one table containing user_id, columnA_unixtime, columnB_unixtime -- meaning that for each user there are two unix_time stored in the database for two different events.
user_id eventA_join eventB_join
1 1321652009 1321652009
2 0 1321652257
3 0 1321668650
4 1321669261 0
what I want to have is a table showing how many users joined the two events for each day. Something like this (just a sample)
day eventA eventB
11/18/11 3 2
11/19/11 11 8
11/20/11 6 3
11/21/11 17 11
Here is the code I'm using so far:
CREATE TABLE table1(
day VARCHAR(256),
eventA_count INT);
INSERT INTO table1 (day, eventA_count)
(SELECT DATE(FROM_UNIXTIME(`eventA_join`)) AS `day`, COUNT(`user_id`) AS `eventA_count`
FROM org_table
WHERE `eventA_join` > 0
GROUP BY day);
CREATE TABLE table2(
day VARCHAR(256),
eventB_count INT);
INSERT INTO table2 (day, eventB_count)
(SELECT DATE(FROM_UNIXTIME(`eventB_join`)) AS `day`, COUNT(`user_id`) AS `eventB_count`
FROM org_table
WHERE `eventB_join` > 0
GROUP BY day);
SELECT t.day, t1.eventA_count, t2.eventB_count FROM
(SELECT DISTINCT day FROM table1
UNION
SELECT DISTINCT day FROM table2) t
LEFT JOIN table1 t1
ON t.day = t1.day
LEFT JOIN table2 t2
ON t.day = t2.day;
DROP TABLE table2;
DROP TABLE table1;
As far as I can tell, I can't use table variables in phpMyAdmin, and I couldn't use temporary tables either, because there was no way to refer to a temporary table multiple times (error #10327 Can't reopen temporary table) when I tried to join them together.
Is there any way to avoid creating tables but still get what I'm looking for? Any thoughts?
Edit: both tables are getting data from 'org_table' which is now corrected in the code.
This is pretty easy to do with a single query. You just need UNION ALL:
select date, sum(IsEventA) as EventA, sum(IsEventB) as EventB
from ((select user_id, DATE(FROM_UNIXTIME(eventA_join)) as date,
1 as IsEventA, 0 as IsEventB
from org_table
where eventA_join > 0
) union all
(select user_id, DATE(FROM_UNIXTIME(eventB_join)) as date,
0 as IsEventA, 1 as IsEventB
from org_table
where eventB_join > 0
)
) t
group by date
order by 1
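The UNION ALL approach can be verified against the four sample rows from the question. This is a sketch using Python's standard-library sqlite3 module as a stand-in for MySQL, so `date(x, 'unixepoch')` replaces `DATE(FROM_UNIXTIME(x))`; note that SQLite interprets the timestamps as UTC, whereas MySQL's FROM_UNIXTIME uses the session time zone.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE org_table (user_id INTEGER, eventA_join INTEGER, eventB_join INTEGER)"
)
conn.executemany(
    "INSERT INTO org_table VALUES (?, ?, ?)",
    [
        (1, 1321652009, 1321652009),
        (2, 0, 1321652257),
        (3, 0, 1321668650),
        (4, 1321669261, 0),
    ],
)

# Stack the two event columns into one list of (day, flag, flag) rows,
# then sum the flags per day. No intermediate tables are created.
counts_sql = """
SELECT day, SUM(is_a) AS eventA, SUM(is_b) AS eventB
FROM (
    SELECT date(eventA_join, 'unixepoch') AS day, 1 AS is_a, 0 AS is_b
    FROM org_table
    WHERE eventA_join > 0
    UNION ALL
    SELECT date(eventB_join, 'unixepoch') AS day, 0 AS is_a, 1 AS is_b
    FROM org_table
    WHERE eventB_join > 0
) AS t
GROUP BY day
ORDER BY day
"""
counts = conn.execute(counts_sql).fetchall()
print(counts)
```

A day that only has one kind of event still appears in the output, with 0 in the other column, which is what the LEFT JOIN over the UNION of days was meant to achieve.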

PHP SQL - Advanced delete query

I have a table with 3 columns: id, date and name. What I am looking for is to delete the records that have a duplicate name. The rule should be to keep the record that has the oldest date. For instance, in the example below there are 3 records with the name Paul. So I would like to keep the one that has the oldest date (id=1) and remove all the others (id = 4 and 6). I know how to write insert, update, etc. queries, but here I do not see how to make the trick work.
id, date, name
1, 2012-03-10, Paul
2, 2012-03-10, James
4, 2012-03-12, Paul
5, 2012-03-11, Ricardo
6, 2012-03-13, Paul
mysql_query(?);
The best suggestion I can give you is to create a unique index on name and avoid all the trouble.
Follow steps 2 to 3 as Peter Kiss described. Then do this:
ALTER TABLE tablename ADD UNIQUE INDEX name (name)
Then follow step 4: insert everything from the temporary table into the original.
All the new duplicate rows will be omitted.
1. Select all the records that you want to keep
2. Insert them into a temporary table
3. Delete everything from the original table
4. Insert everything from the temporary table into the original
Like Matt, but without the join:
DELETE FROM `table` WHERE `id` NOT IN (
SELECT `id` FROM (
SELECT `id` FROM `table` GROUP BY `name` ORDER BY `date`
) as A
)
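Without the first SELECT you will get "You can't specify target table 'table' for update in FROM clause". Note that GROUP BY followed by ORDER BY does not guarantee that the id returned for each name is the one with the oldest date.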
Something like this would work:
DELETE FROM tablename WHERE id NOT IN (
SELECT tablename.id FROM (
SELECT MIN(date) as dateCol, name FROM tablename GROUP BY name /*select the minimum date and name, for each name*/
) as MyInnerQuery
INNER JOIN tablename on MyInnerQuery.dateCol = tablename.date
and MyInnerQuery.name = tablename.name /*select the id joined on the minimum date and the name*/
) /*Delete everything which isn't in the list of ids which are the minimum date for each name*/
DELETE t
FROM tableX AS t
LEFT JOIN
( SELECT name
, MIN(date) AS first_date
FROM tableX
GROUP BY name
) AS grp
ON grp.name = t.name
AND grp.first_date = t.date
WHERE
grp.name IS NULL
DELETE FROM thetable tt
WHERE EXISTS (
SELECT *
FROM thetable tx
WHERE tx.thename = tt.thename
AND tx.thedate < tt.thedate
);
(note that "date" is a reserved word (type) in SQL, and "name" is a reserved word in some SQL implementations)
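The "delete every row for which an older row with the same name exists" rule can be checked against the sample data from the question. This is a sketch using Python's standard-library sqlite3 module as a stand-in; the table name `people` is hypothetical, and the tie-breaker on id for rows sharing the same date is an added assumption. MySQL itself rejects a subquery on the table being deleted from, so there the derived-table or join forms shown earlier would be needed.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, date TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [
        (1, "2012-03-10", "Paul"),
        (2, "2012-03-10", "James"),
        (4, "2012-03-12", "Paul"),
        (5, "2012-03-11", "Ricardo"),
        (6, "2012-03-13", "Paul"),
    ],
)

# Delete a row when another row with the same name is older.
# If two rows share the oldest date, the lower id wins (assumption).
conn.execute("""
DELETE FROM people
WHERE EXISTS (
    SELECT 1
    FROM people AS older
    WHERE older.name = people.name
      AND (older.date < people.date
           OR (older.date = people.date AND older.id < people.id))
)
""")
conn.commit()

remaining = conn.execute("SELECT id, name FROM people ORDER BY id").fetchall()
print(remaining)
```

Only ids 4 and 6 are removed, leaving one row per name.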