SQL query that returns all dates not used in a table - mysql

So let's say I have some records that look like:
2011-01-01 Cat
2011-01-02 Dog
2011-01-04 Horse
2011-01-06 Lion
How can I construct a query that will return 2011-01-03 and 2011-01-05, i.e. the unused dates? I postdate blogs into the future and I want a query that will show me the days I don't have anything posted yet. It would look from the current date to 2 weeks into the future.
Update:
I am not too excited about building a permanent table of dates. After thinking about it though it seems like the solution might be to make a small stored procedure that creates a temp table. Something like:
CREATE PROCEDURE MISSING_DATES()
BEGIN
CREATE TEMPORARY TABLE DATES (FUTURE DATETIME NULL);
INSERT INTO DATES (FUTURE) VALUES (CURDATE());
INSERT INTO DATES (FUTURE) VALUES (ADDDATE(CURDATE(), INTERVAL 1 DAY));
...
INSERT INTO DATES (FUTURE) VALUES (ADDDATE(CURDATE(), INTERVAL 14 DAY));
SELECT FUTURE FROM DATES WHERE FUTURE NOT IN (SELECT POSTDATE FROM POSTS);
DROP TEMPORARY TABLE DATES;
END
I guess it just isn't possible to select the absence of data.

You're right — SQL does not make it easy to identify missing data. The usual technique is to join your sequence (with gaps) against a complete sequence, and select those elements in the latter sequence without a corresponding partner in your data.
So, @BenHoffstein's suggestion to maintain a permanent date table is a good one.
Short of that, you can dynamically create that date range with an integers table. Assuming the integers table has a column i with numbers at least 0 – 13, and that your table has its date column named datestamp:
SELECT candidate_date AS missing
FROM (SELECT CURRENT_DATE + INTERVAL i DAY AS candidate_date
FROM integers
WHERE i < 14) AS next_two_weeks
LEFT JOIN my_table ON candidate_date = datestamp
WHERE datestamp is NULL;
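The left join against a generated sequence can be tried out end to end with a small script. This is only a sketch under assumptions: it uses Python's built-in sqlite3 module in place of MySQL, so SQLite's date() function stands in for CURRENT_DATE + INTERVAL i DAY, and the start date is passed in as a parameter so the result is repeatable. The table names my_table and integers come from the answer above; the sample rows are the four posts from the question.

```python
import sqlite3
from datetime import date

# In-memory SQLite database holding the four posts from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (datestamp TEXT, title TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [("2011-01-01", "Cat"), ("2011-01-02", "Dog"),
     ("2011-01-04", "Horse"), ("2011-01-06", "Lion")],
)

# Integers table with i = 0..13, as the answer assumes.
conn.execute("CREATE TABLE integers (i INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO integers VALUES (?)", [(i,) for i in range(14)])


def missing_dates(conn, start, days):
    """Return ISO dates in [start, start + days) that have no row in my_table."""
    rows = conn.execute(
        """
        SELECT candidate_date
        FROM (SELECT date(?, '+' || i || ' days') AS candidate_date
              FROM integers
              WHERE i < ?) AS candidates
        LEFT JOIN my_table ON candidate_date = datestamp
        WHERE datestamp IS NULL
        ORDER BY candidate_date
        """,
        (start.isoformat(), days),
    ).fetchall()
    return [r[0] for r in rows]


print(missing_dates(conn, date(2011, 1, 1), 6))
```

With the sample rows, a six-day window starting on 2011-01-01 should report the two gaps from the question. In MySQL itself the derived table would keep the CURRENT_DATE + INTERVAL i DAY form shown above.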

One solution would be to create a separate table with one column to hold all dates from now until eternity (or whenever you expect to stop blogging). For example:
CREATE TABLE Dates (dt DATE);
INSERT INTO Dates VALUES ('2011-01-01');
INSERT INTO Dates VALUES ('2011-01-02');
...etc...
INSERT INTO Dates VALUES ('2099-12-31');
Once this reference table is set up, you can simply outer join to determine the unused dates like so:
SELECT d.dt
FROM Dates d LEFT JOIN Blogs b ON d.dt = b.dt
WHERE b.dt IS NULL
If you want to limit the search to two weeks in the future, you could add this to the WHERE clause:
AND d.dt BETWEEN CURDATE() AND ADDDATE(CURDATE(), INTERVAL 14 DAY)

The way to extract rows from the mysql database is via SELECT. Thus you cannot select rows that do not exist.
What I would do is fill my blog table with all possible dates (for a year, then repeat the process)
create table blog (
thedate date not null,
thetext text null,
primary key (thedate));
Then loop over all the dates for 2011 in a program (here $mydate is the date you want to insert):
insert IGNORE into blog (thedate,thetext) values ($mydate, null);
(The IGNORE keyword avoids a duplicate-key error if thedate already exists, since thedate is the primary key.)
Then you insert the values normally
insert into blog (thedate,thetext) values ($mydate, "newtext")
on duplicate key update thetext="newtext";
Finally to select empty entries, you just have to
select thedate from blog where thetext is null;

You're probably not going to like this:
select '2011-01-03', count(*) from TABLE where postdate='2011-01-03'
having count(*)=0 union
select '2011-01-04', count(*) from TABLE where postdate='2011-01-04'
having count(*)=0 union
select '2011-01-05', count(*) from TABLE where postdate='2011-01-05'
having count(*)=0 union
... repeat for 2 weeks
OR
create a table with all days in 2011, then do a left join, like
select a.days_2011
from all_days_2011 a
left join TABLE on a.days_2011=TABLE.postdate
where a.days_2011 between date(now()) and date(date_add(now(), interval 2 week))
and TABLE.postdate is null;

Related

Mysql query period data to each month

I'm trying to create an SQL query to resolve my problem.
I use MySQL 5.7.
My Table:
|project_id|start |end |cost(per month)|
|1 |2018-05-01|2018-06-30|1000 |
|2 |2018-06-01|2018-07-31|2000 |
I want to generate date-columns by start and end columns.
like this:
|date |project_id|cost|
|2018-05|1 |1000|
|2018-06|1 |1000|
|2018-06|2 |2000|
|2018-07|2 |2000|
Create a table and populate it with first day of each month. You can programmatically do that or even use Excel to generate data and port it to MySQL.
create table dates (
start_date date
);
insert into dates values
('2018-04-01'),
('2018-05-01'),
('2018-06-01'),
('2018-07-01'),
('2018-08-01');
Then, you can run a query like so:
Query
select
date_format(start_date, '%Y-%m') as `Date`,
a.project_id,
a.cost
from projects a
inner join dates b on b.start_date between a.start and a.end;
Result
Date project_id cost
2018-05 1 1000
2018-06 1 1000
2018-06 2 2000
2018-07 2 2000
Example
http://rextester.com/JRIUZ98116
Alternative
The other alternative is to create a stored procedure that creates a temporary table containing dates so that you don't have to generate a table. Minimum start date and maximum end date from the table can be extracted to create the temporary table of dates.
Then, the stored procedure can do the same join as above to generate a resultset.
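As a rough illustration of populating the month table programmatically and running the same join, here is a sketch using Python's built-in sqlite3 module rather than MySQL. The helper month_starts is a hypothetical name introduced for this example, substr() replaces MySQL's date_format(), and the column names are quoted because end is a keyword in SQLite. The project rows are the ones from the question.

```python
import sqlite3
from datetime import date


def month_starts(first, last):
    """Yield the first day of every month from first's month through last's month."""
    year, month = first.year, first.month
    while (year, month) <= (last.year, last.month):
        yield date(year, month, 1)
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)


conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE projects (project_id INTEGER, "start" TEXT, "end" TEXT, cost INTEGER)'
)
conn.executemany(
    "INSERT INTO projects VALUES (?, ?, ?, ?)",
    [(1, "2018-05-01", "2018-06-30", 1000), (2, "2018-06-01", "2018-07-31", 2000)],
)

# Fill the dates table with one row per month instead of typing the values by hand.
conn.execute("CREATE TABLE dates (start_date TEXT)")
conn.executemany(
    "INSERT INTO dates VALUES (?)",
    [(d.isoformat(),) for d in month_starts(date(2018, 4, 1), date(2018, 8, 1))],
)

rows = conn.execute(
    """
    SELECT substr(b.start_date, 1, 7) AS month, a.project_id, a.cost
    FROM projects a
    INNER JOIN dates b ON b.start_date BETWEEN a."start" AND a."end"
    ORDER BY month, a.project_id
    """
).fetchall()

for row in rows:
    print(row)
```

One thing to keep in mind with this join: it matches on the first day of each month, so a project that starts mid-month would not get a row for its first month unless the join condition is adjusted.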
This is one of those places where a separate date table will make life much easier. If you have a table with something like this:
create table DateTable(ThisDate date, Month varchar(7))
adding whatever other columns you might need (isWeekday etc) and populate it in a loop then you will be able to re-use this for all sorts of things, including this query. For example you can create a view on it to get month, startdate, enddate, and then join from that back into your table looking for dates that are between the start and end date.
This and lots of other queries will become simple.
create table DateTable(ThisDate date, Month varchar(7));
-- populate the table just once, re-use for all future queries
create view MonthView as
select Month,
min(ThisDate) as StartOfMonth,
max(ThisDate) as EndOfMonth
from DateTable
group by Month;
select Month, ProjectID, Cost
from MonthView
join MyTab on MyTab.Start <= EndOfMonth and MyTab.End >= StartOfMonth;

List of months within multiple date ranges in mysql

I have a list of date ranges and I would like to get a list of all months that are within these date ranges. I can query my date ranges like so:
Select id, start, end
From date_range
And this query would give the following output:
1, 01-01-2016, 25-03-2016
2, 26-03-2016, 30-03-2016
3, 30-12-2016, 08-01-2017
Now I would like to find a MySQL query that just lists all months within these date ranges. So it should give the following output:
01-2016
02-2016
03-2016
12-2016
01-2017
There are already examples here on how to get a list of month between two dates, such as:
Creating a list of month names between two dates in MySQL
How to get a list of months between two dates in mysql
These examples cover a single date range, but I have multiple date ranges. It would be great if someone can find an SQL query for my problem.
Here is a solution:
#DROP TABLE IF EXISTS monthTest;
CREATE TABLE monthTest(id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY, `start` DATETIME, `end` DATETIME);
INSERT INTO monthTest(`start`, `end`) VALUES
('2016-01-01', '2016-03-25'),
('2016-03-26', '2016-03-30'),
('2016-12-30', '2017-01-08');
SELECT A.`start`, A.`end`, DATE_FORMAT(DATE_ADD(A.`start`, INTERVAL B.help_keyword_id MONTH), '%Y%m') FROM
monthTest A,
mysql.help_keyword B
WHERE PERIOD_DIFF(DATE_FORMAT(A.`end`, '%Y%m'), DATE_FORMAT(A.`start`, '%Y%m')) >= B.help_keyword_id
ORDER BY A.id;
Note that the second table in the join must contain more rows than the maximum number of months between any two dates, and the column used as the month offset must be an incrementing INTEGER starting from 0. This workaround is necessary because MySQL doesn't (yet) include a row generator.
Regards,
James
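To get each month only once across overlapping ranges, the same offset-table idea can be combined with a GROUP BY. The following is a sketch, not the answerer's query: it runs on Python's built-in sqlite3 module instead of MySQL, uses a hypothetical numbers table in place of mysql.help_keyword, and relies on SQLite's date() and strftime() where MySQL would use DATE_ADD and DATE_FORMAT. The three ranges are the ones listed in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE date_range (id INTEGER PRIMARY KEY, "start" TEXT, "end" TEXT)'
)
conn.executemany(
    "INSERT INTO date_range VALUES (?, ?, ?)",
    [(1, "2016-01-01", "2016-03-25"),
     (2, "2016-03-26", "2016-03-30"),
     (3, "2016-12-30", "2017-01-08")],
)

# Hypothetical helper table of month offsets; it must cover the longest range.
conn.execute("CREATE TABLE numbers (i INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO numbers VALUES (?)", [(i,) for i in range(24)])

months = [
    row[0]
    for row in conn.execute(
        """
        SELECT strftime('%m-%Y', month_start) AS label
        FROM (SELECT date(r."start", 'start of month', '+' || n.i || ' months')
                         AS month_start,
                     r."end" AS range_end
              FROM date_range r
              CROSS JOIN numbers n) AS candidates
        WHERE month_start <= range_end   -- keep months that begin inside the range
        GROUP BY month_start             -- collapse months shared by several ranges
        ORDER BY month_start
        """
    )
]
print(months)
```

Grouping on the first day of the month rather than on the formatted label keeps the chronological order, so December 2016 sorts before January 2017.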

MySQL: Need to Query for missing records from various start dates

I have two tables that are linked by an ID, and one table has a start date, and the child (linked) table has weekly entries of data. I need to be able to query and determine the ID's, that are missing a week's data, without knowing the actual dates.
Table1
ID INT
START_DATE DATE
Table2
ID INT (foreign Key to Table 1)
TRAN_DATE DATE
VALUE INT
Each ID might have a different start date, and the values are saved weekly (every Monday, Tuesday, etc., based on the start date).
Some IDs will have missed posting their value one week, and I need to look back historically for when a record is missing.
Assuming a Start_Date of Sept 9, 2013, the dates would be (9/9/2013, 9/16/2013, 9/23/2013, ...). I need to see if TRAN_DATE for ID 1 is 9/9/2013, then add 7 days (9/16/2013) and check for that record, then add 7 days (9/23/2013) and check for that record to exist. Then repeat for the different IDs. This would end with the current date, or any date into the future (if this is easier).
I can do this with a program simply enough, but I need to do this at a customer site and I cannot distribute code into the site, so I need to try to do it with a query.
The following query returns any gaps in table2:
select distinct id
from table2 t2
where t2.tran_date < now() - interval 7 day and
not exists (select 1
from table2 t2a
where t2a.id = t2.id and
datediff(t2a.tran_date, t2.tran_date) = 7
);
This assumes that the first transaction is not missing. Is that possible?
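If the first transaction can be missing too, one alternative is to generate the expected weekly dates from each START_DATE and anti-join them against Table2. The sketch below is a different technique from the query above, shown with Python's built-in sqlite3 module and a recursive common table expression (MySQL only gained WITH RECURSIVE in version 8.0, so this may not be available at the customer site). The sample rows and the function name missing_weeks are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, start_date TEXT);
    CREATE TABLE table2 (id INTEGER, tran_date TEXT, value INTEGER);
    -- Hypothetical sample data: ID 1 skipped 2013-09-23, ID 2 has no gaps.
    INSERT INTO table1 VALUES (1, '2013-09-09'), (2, '2013-09-10');
    INSERT INTO table2 VALUES
        (1, '2013-09-09', 5), (1, '2013-09-16', 7), (1, '2013-09-30', 3),
        (2, '2013-09-10', 1), (2, '2013-09-17', 2), (2, '2013-09-24', 4),
        (2, '2013-10-01', 6);
    """
)


def missing_weeks(conn, as_of):
    """List (id, expected_date) pairs up to as_of with no matching table2 row."""
    return conn.execute(
        """
        WITH RECURSIVE expected(id, due) AS (
            SELECT id, start_date FROM table1          -- first expected date per ID
            UNION ALL
            SELECT id, date(due, '+7 days') FROM expected
            WHERE date(due, '+7 days') <= :as_of       -- step weekly up to the cutoff
        )
        SELECT e.id, e.due
        FROM expected e
        LEFT JOIN table2 t ON t.id = e.id AND t.tran_date = e.due
        WHERE t.id IS NULL
        ORDER BY e.id, e.due
        """,
        {"as_of": as_of},
    ).fetchall()


print(missing_weeks(conn, "2013-10-01"))
```

Unlike the NOT EXISTS query, this reports the missing date itself, and it also catches an ID whose very first week was never recorded.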

Padding MYSQL data with missing dates when comparing year over year stats?

I have a table that tracks emails sent. It is pretty simple.
ID | DATETIME | E-MAIL | SUBJECT | MESSAGE
I have been collecting data for several years. Some days I don't have any entries in the table.
query1:
SELECT COUNT(ID) FROM emails
WHERE DATE(datetime) >= 'XXXX-XX-XX'
AND DATE(datetime) <= 'ZZZZ-ZZ-ZZ'
GROUP BY DATE(datetime)
I then use some PHP to get the dates one year prior for both XXXX and ZZZZ and run the second query, which is the same as the first:
query2:
SELECT COUNT(ID) FROM emails
WHERE DATE(datetime) >= 'XXXX-XX-XX'
AND DATE(datetime) <= 'ZZZZ-ZZ-ZZ'
GROUP BY DATE(datetime)
I am using a charting package to compare how many emails I got for a date range and then I overlay how many emails I got for the same range only one year prior. This is two queries right now and I chart the results.
The issue is where mysql does not have any emails for 2011 for a day in question, but has a few in 2012 for the same day.
Combining the results and graphing them skews the results since I am missing a date and a 0 value for last year for that day, effectively making all my values no longer match up.
2011-03-01 10 2012-03-01 4
2011-03-02 4 2012-03-02 2
2011-03-03 6 2012-03-04 1 <---- see where the two queries
end up diverging? (I had nothing
logged for 2012-03-03 so naturally
it was not in the results.)
Is there a way I can get MySQL to output the data I need, including dates where values appear in one year but not the other, OR where no values appear in either year (I still need the date and a 0), so my chart works?
I cannot seem to figure out how to do this...
Thanks!
There are a few different ways to get the results for a contiguous set of dates. My favourite one is to create the full set that is required using a dummy table or an existing contiguous set of ids from an auto-increment primary key. Something like this -
SELECT '2011-01-01' + INTERVAL (id -1) DAY
FROM dummy
WHERE id BETWEEN 1 AND 365
This will return a full set of days for 2011 which can then be LEFT JOINed to your emails table to get the counts -
SELECT `dates`.`date`, COUNT(emails.id)
FROM (
SELECT '2011-01-01' + INTERVAL (id - 1) DAY AS `date`, '2011-01-01 23:59:59' + INTERVAL (id - 1) DAY AS `end_of_day`
FROM dummy
WHERE id BETWEEN 1 AND 365
) `dates`
LEFT JOIN emails
ON `emails`.`datetime` BETWEEN `dates`.`date` AND `dates`.`end_of_day`
GROUP BY `dates`.`date`
To populate your dummy / seq table you can insert the first ten values manually and then use INSERT ... SELECT to add the rest -
CREATE TABLE dummy (id INTEGER NOT NULL PRIMARY KEY);
INSERT INTO dummy VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10);
SET @tmp := (SELECT MAX(id) FROM dummy) + 1;
INSERT INTO dummy
SELECT @tmp + id
FROM dummy;
You need to execute the SET query before each run of the INSERT ... SELECT query.
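The doubling step is easy to check outside MySQL. This sketch uses Python's built-in sqlite3 module, with a bound parameter playing the role of the user variable; the function name double_rows and the target of 366 rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dummy (id INTEGER NOT NULL PRIMARY KEY)")
# Seed rows 0..10, matching the manual INSERT in the answer.
conn.executemany("INSERT INTO dummy VALUES (?)", [(i,) for i in range(11)])


def double_rows(conn):
    """One doubling pass: append a shifted copy of every existing id."""
    # The offset is recomputed before every pass, like the SET statement above.
    (offset,) = conn.execute("SELECT MAX(id) + 1 FROM dummy").fetchone()
    conn.execute("INSERT INTO dummy SELECT ? + id FROM dummy", (offset,))


passes = 0
while conn.execute("SELECT COUNT(*) FROM dummy").fetchone()[0] < 366:
    double_rows(conn)
    passes += 1

count, low, high = conn.execute(
    "SELECT COUNT(*), MIN(id), MAX(id) FROM dummy"
).fetchone()
print(passes, count, low, high)
```

Because each pass shifts the copy by MAX(id) + 1, the ids stay contiguous, and the row count doubles every time, so a year of days needs only a handful of passes.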

SQL group by date, but get dates w/o records too

Is there an easy way to do a GROUP BY DATE(timestamp) that includes all days in a period of time, regardless of whether there are any records associated with that date?
Basically, I need to generate a report like this:
24 Dec - 0 orders
23 Dec - 10 orders
22 Dec - 8 orders
21 Dec - 2 orders
20 Dec - 0 orders
Assuming you have more orders than dates something like this could work:
select date, count(id) as orders
from
(
SELECT DATE_ADD('2008-01-01', INTERVAL @rn:=@rn+1 DAY) as date from (select @rn:=-1)t, `order` limit 365
) d left outer join `order` using (date)
group by date
One method is to create a calendar table and join against it.
I would create it permanently, and then create a task that will insert new dates, it could be done weekly, daily, monthly, etc.
Note, that I am assuming that you are converting your timestamp into a date.
Instead of using GROUP BY, make a table (perhaps a temporary table) which contains the specific dates you want, for example:
24 Dec
23 Dec
22 Dec
21 Dec
20 Dec
Then, join that table to the Orders table.
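A minimal sketch of that temporary-table approach, assuming an orders table with a timestamp column: it uses Python's built-in sqlite3 module rather than MySQL, and the names orders, created_at, report_days and orders_per_day, as well as the sample rows, are hypothetical. The key detail is counting a column from the orders side of the LEFT JOIN, so that days without orders come out as 0 instead of 1.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, created_at TEXT)")
# Hypothetical orders: two on 21 Dec, one on 22 Dec, none on 20 or 23 Dec.
conn.executemany(
    "INSERT INTO orders (created_at) VALUES (?)",
    [("2008-12-21 09:15:00",), ("2008-12-21 17:40:00",), ("2008-12-22 11:05:00",)],
)


def orders_per_day(conn, first, last):
    """Return (day, order_count) for every day in [first, last], newest first."""
    # Temporary table holding exactly the days the report should show.
    conn.execute("CREATE TEMP TABLE IF NOT EXISTS report_days (day TEXT PRIMARY KEY)")
    conn.execute("DELETE FROM report_days")
    n_days = (last - first).days + 1
    conn.executemany(
        "INSERT INTO report_days VALUES (?)",
        [((first + timedelta(days=k)).isoformat(),) for k in range(n_days)],
    )
    return conn.execute(
        """
        SELECT d.day, COUNT(o.id) AS orders   -- COUNT(o.id) ignores unmatched days
        FROM report_days d
        LEFT JOIN orders o ON date(o.created_at) = d.day
        GROUP BY d.day
        ORDER BY d.day DESC
        """
    ).fetchall()


for day, n in orders_per_day(conn, date(2008, 12, 20), date(2008, 12, 23)):
    print(day, "-", n, "orders")
```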
You need to generate an intermediate result set with all the dates in it that you want included in the output.
If you're doing this in a stored proc, you could create a temp table or table variable (I don't know MySQL's capabilities). Once you have all the dates in a table or result set of some kind,
just join from it to the real data using an outer join.
In SQL Server it would be like this
Declare @Dates Table (aDate DateTime Not Null)
Declare @StartDt DateTime Set @StartDt = 'Dec 1 2008'
Declare @EndDt DateTime Set @EndDt = 'Dec 31 2008'
While @StartDt <= @EndDt Begin
Insert @Dates(aDate) Values(@StartDt)
Set @StartDt = DateAdd(Day, 1, @StartDt)
End
Select D.aDate, Count(O.OrderDate) Orders
From @Dates D Left Join
OrderTable O On O.OrderDate = D.aDate
Group By D.aDate
In a data warehouse, the method taken is to create a table that contains all dates and create a foreign key between your data and the date table. I'm not saying that this is the best way to go in your case, just that it is the best practice in cases where large amounts of data need to be rolled up in numerous ways for reporting purposes.
If you are using a reporting layer over SQL Server, you could just write some logic to insert the missing dates within the range of interest after the data returns and before rendering your report.
If you are creating your reports directly from SQL Server and you do not already have a data warehouse and there isn't the time or need to create one right now, I would create a date table and join to it. The formatting necessary to do the join and get the output you want may be a bit wonky, but it will get the job done.
There's a pretty straightforward way to do this… except that I can't remember it. But I adapted this query from this thread:
SELECT
DISTINCT(LEFT(date_field,11)) AS `Date`,
COUNT(LEFT(date_field,11)) AS `Number of events`
FROM events_table
GROUP BY `Date`
It works in MySQL too