I've got a database with two tables that I want to combine. One of the tables contains "incidental events", which occur just once. The other contains "periodical events". Now I want to combine these two in a view.
The incidental one simply has two columns, one called changes, the other one called date. The periodical one has three columns: changes, startDate and endDate. The difference between startDate and endDate can be a maximum of 50 years, so manually typing out one case for every day is not going to work. Both tables also have an auto-increment ID. In the view I want to have a column date and a column changes.
To achieve this I want to unroll the periodical changes table, so that it shows one entry for every day in between the startDate and endDate. For instance:
incidental changes:
date | change
09/08/2015 | 5
11/08/2015 | 10
periodical changes:
startDate | endDate | change
09/08/2015 | 12/08/2015 | 7
These two I want combined into:
changes view:
date | change
09/08/2015 | 5
09/08/2015 | 7
10/08/2015 | 7
11/08/2015 | 10
11/08/2015 | 7
12/08/2015 | 7
My idea is to use something like this:
SELECT * FROM incidental_changes,(
SET @id = (SELECT min(ID) AS min FROM periodical_changes WHERE 1)
SET @maxID = (SELECT max(ID) AS max FROM periodical_changes WHERE 1)
WHILE (@id <= @maxID) DO
SET @firstDate = (SELECT startDate FROM periodical_changes WHERE id = @id)
SET @lastDate = (SELECT endDate FROM periodical_changes WHERE id = @id)
WHILE (@firstDate <= @lastDate) DO
SELECT @firstDate AS date, change FROM periodical_changes WHERE id = @id
@firstDate = @firstDate + INTERVAL 1 DAY
END
@id = @id + 1
END
) WHERE 1
This gives me an error,
CREATE ALGORITHM = UNDEFINED VIEW all_periodicals AS SELECT * FROM
incidental_changes,( SET @id = (SELECT min(ID) AS min FROM
periodical_changes WHERE 1) SET @maxID = (SELECT max(ID) AS max FROM
periodical_changes WHERE 1) WHILE (@id <= @maxID) DO SET @firstDate =
(SELECT startDate FROM periodical_changes WHERE id = @id) SET
@lastDate = (SELECT endDate FROM periodical_changes WHERE id = @id)
WHILE (@firstDate <= @lastDate) DO SELECT @firstDate AS date, change
FROM periodical_changes WHERE id = @id @firstDate = @firstDate +
INTERVAL 1 DAY END @id = @id + 1 END ) WHERE 1
#1064 - You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use
near 'SET @id = (SELECT min(ID) AS min FROM periodical_changes WHERE
1) SET @' at line 5
and I'm guessing that if I'd manage to fix this error there'd be more. So, is there any way to do this the way I want, or do I have to look for a different approach?
EDIT:
Okay, so far I have not found a way to do this in a view. So instead I am now using a routine. This routine has one parameter, account INT. The definition I am using so far is as follows:
BEGIN
DECLARE periodicalID int;
DECLARE v_finished INTEGER DEFAULT 0;
DECLARE periodicalCursor CURSOR
FOR SELECT periodicals.periodicalID FROM periodicals WHERE periodicals.accountID = account;
DECLARE CONTINUE HANDLER
FOR NOT FOUND SET v_finished = 1;
CREATE TEMPORARY TABLE results LIKE incidentials;
ALTER TABLE results DROP INDEX date;
SET @periodicalID = -1;
OPEN periodicalCursor;
allPeriodicals: LOOP
FETCH periodicalCursor INTO periodicalID;
IF (v_finished) THEN
LEAVE allPeriodicals;
END IF;
SELECT periodicals.startDate,periodicals.numberOfPeriods,periodicals.period,periodicals.endDate,periodicals.money FROM periodicals WHERE periodicals.periodicalID = periodicalID AND periodicals.accountID = account INTO @startDate, @numberOfPeriods, @period,@endDate,@money;
SET @intervalStatement = "SELECT ? + INTERVAL ? ";
SET @intervalStatement = CONCAT(@intervalStatement,@period," INTO @res");
PREPARE intervalStatement FROM @intervalStatement;
WHILE @startDate <= @endDate DO
EXECUTE intervalStatement USING @startDate,@numberOfPeriods;
SET @startDate = @res;
INSERT INTO results(accountID,date,money) VALUES (account,@startDate,@money);
END WHILE;
END LOOP allPeriodicals;
INSERT INTO results(accountID,date,money) SELECT accountID,date, money FROM incidentials WHERE incidentials.accountID = account;
SELECT * FROM results ORDER BY date;
END
This poses the problem of performance though. With only one periodical entry spread over a year this query already takes about 16 seconds. So even though this approach works, I either did something wrong causing it to take this long or this is not the right way to go.
Let me presume you have a numbers table. Then you can do:
select i.date, i.change
from incidental i
union all
select date_add(p.startDate, interval n.n - 1 day), p.change
from periodic p join
numbers n
on date_add(p.startDate, interval n.n - 1 day) <= p.endDate;
For a select query, you can generate the numbers using a subquery, if you know the maximum length. Something like:
select i.date, i.change
from incidental i
union all
select date_add(p.startDate, interval n.n - 1 day), p.change
from periodic p join
(select 1 as n union all select 2 union all select 3 union all select 4 union all
select 5 union all select 6 union all select 7
) n
on date_add(p.startDate, interval n.n - 1 day) <= p.endDate;
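This doesn't work in a view, however, because MySQL (before version 5.7.7) does not allow a subquery in the FROM clause of a view definition. For that, you really do need a numbers table.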
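A minimal sketch of the numbers-table route, assuming MySQL and the table and column names from the question (incidental_changes, periodical_changes, changes, date, startDate, endDate). The names digits, numbers, n and all_changes are made up for illustration. The table is filled once with 1..20000, which covers a 50-year span of days (about 18,263), and the view then expands each period with a plain join:

```sql
-- Hypothetical helper tables (one-off setup).
CREATE TABLE digits (d INT NOT NULL PRIMARY KEY);
INSERT INTO digits (d) VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);

CREATE TABLE numbers (n INT NOT NULL PRIMARY KEY);

-- Combine decimal digits into the integers 1..20000.
INSERT INTO numbers (n)
SELECT 1 + d0.d + 10 * d1.d + 100 * d2.d + 1000 * d3.d + 10000 * d4.d
FROM digits d0
CROSS JOIN digits d1
CROSS JOIN digits d2
CROSS JOIN digits d3
CROSS JOIN digits d4
WHERE d4.d < 2;

-- The view: incidental rows as they are, plus one row per day of each period.
CREATE VIEW all_changes AS
SELECT i.date AS date, i.changes AS changes
FROM incidental_changes i
UNION ALL
SELECT DATE_ADD(p.startDate, INTERVAL n.n - 1 DAY), p.changes
FROM periodical_changes p
JOIN numbers n
  ON n.n <= DATEDIFF(p.endDate, p.startDate) + 1;
```

The view can then be queried like any table, for example SELECT date, changes FROM all_changes ORDER BY date. Comparing n.n against DATEDIFF, rather than comparing computed dates, lets the join use the primary key on numbers.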
How can I fill date gaps in MySQL? Here is my query:
SELECT DATE(posted_at) AS date,
COUNT(*) AS total,
SUM(attitude = 'positive') AS positive,
SUM(attitude = 'neutral') AS neutral,
SUM(attitude = 'negative') AS negative
FROM `messages`
WHERE (`messages`.brand_id = 1)
AND (`messages`.`spam` = 0
AND `messages`.`duplicate` = 0
AND `messages`.`ignore` = 0)
GROUP BY date ORDER BY date
It returns the proper result set, but I want to fill the gaps between the start and end dates with zeros. How can I do this?
You'll need to create a helper table and fill it with all dates from start to end, then just LEFT JOIN with that table:
SELECT d.dt AS date,
COUNT(m.posted_at) AS total, -- counts matched messages only, so empty days give 0
COALESCE(SUM(m.attitude = 'positive'), 0) AS positive,
COALESCE(SUM(m.attitude = 'neutral'), 0) AS neutral,
COALESCE(SUM(m.attitude = 'negative'), 0) AS negative
FROM dates d
LEFT JOIN
messages m
ON m.posted_at >= d.dt
AND m.posted_at < d.dt + INTERVAL 1 DAY
AND m.brand_id = 1
AND m.spam = 0
AND m.duplicate = 0
AND m.`ignore` = 0
GROUP BY
d.dt
ORDER BY
d.dt
Basically, what you need here is a dummy rowsource.
MySQL versions before 8.0 lack a built-in way to generate it.
PostgreSQL implements a special function generate_series to do that, while Oracle and SQL Server can use recursion (CONNECT BY and recursive CTEs, respectively).
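On MySQL 8.0 or later, which added recursive CTEs, the dummy rowsource can be generated inline instead of being stored in a helper table. The sketch below assumes the messages table from the question; the CTE name dates, its column dt and the November 2009 range are illustrative choices, not part of the original query:

```sql
-- Requires MySQL 8.0+; the date range is an assumed example.
WITH RECURSIVE dates (dt) AS (
    SELECT DATE('2009-11-01')
    UNION ALL
    SELECT dt + INTERVAL 1 DAY
    FROM dates
    WHERE dt < '2009-11-30'
)
SELECT d.dt AS date,
       COUNT(m.posted_at) AS total,
       COALESCE(SUM(m.attitude = 'positive'), 0) AS positive,
       COALESCE(SUM(m.attitude = 'neutral'), 0) AS neutral,
       COALESCE(SUM(m.attitude = 'negative'), 0) AS negative
FROM dates d
LEFT JOIN messages m
       ON m.posted_at >= d.dt
      AND m.posted_at < d.dt + INTERVAL 1 DAY
      AND m.brand_id = 1
      AND m.spam = 0
      AND m.duplicate = 0
      AND m.`ignore` = 0
GROUP BY d.dt
ORDER BY d.dt;
```

For ranges longer than 1000 days the session variable cte_max_recursion_depth would need to be raised, since 1000 is its default limit.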
I don't know whether MySQL will support the following/similar syntax; but if not, then you could just create and drop a temporary table.
--Inputs
declare @FromDate datetime, /*Inclusive*/
@ToDate datetime /*Inclusive*/
set @FromDate = '20091101'
set @ToDate = '20091130'
--Query
declare @Dates table (
DateValue datetime NOT NULL
)
set NOCOUNT ON
while @FromDate <= @ToDate /*Inclusive*/
begin
insert into @Dates(DateValue) values(@FromDate)
set @FromDate = @FromDate + 1
end
set NOCOUNT OFF
select dates.DateValue,
Col1...
from @Dates dates
left outer join SourceTableOrView data on
data.DateValue >= dates.DateValue
and data.DateValue < dates.DateValue + 1 /*NB: Exclusive*/
where ...?
I need to find the most occurrences in a 10yr age range that can be Age 2 to 22, 15 to 25, 10 to 20, etc. in a table with name & age
I've created the SQL that returns the most common single age:
SELECT age, count(age)
FROM member
GROUP BY age
ORDER BY COUNT(age) DESC
LIMIT 1
Thanks for your help!
Create another table, ages, to hold the age ranges you are interested in, with fields age_lower and age_upper and a display name age_range such as '2 to 22'.
Join the tables with a condition that puts the age between the lower and upper bounds.
SELECT `age_range`, COUNT(`age`) AS age_count
FROM `member` INNER JOIN `ages`
ON age BETWEEN age_lower AND age_upper
GROUP BY age_range
ORDER BY COUNT(`age`) DESC, `age_range` ASC
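One way to populate ages so that every candidate window is covered, sketched under the assumption of MySQL and a span of 10 years: the busiest window can always be shifted up to start at an age that actually occurs in member without losing any rows, so one range per distinct age is enough. The column types and the span of 10 are assumptions.

```sql
-- Lookup table of candidate ranges; types are assumed.
CREATE TABLE ages (
    age_lower INT NOT NULL PRIMARY KEY,
    age_upper INT NOT NULL,
    age_range VARCHAR(20) NOT NULL
);

-- One range per distinct age in member, e.g. 15 gives '15 to 25'.
INSERT INTO ages (age_lower, age_upper, age_range)
SELECT DISTINCT age, age + 10, CONCAT(age, ' to ', age + 10)
FROM member
WHERE age IS NOT NULL;
```

With this table in place, adding LIMIT 1 to the join query returns only the busiest range.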
This might solve the problem. The only thing I added was a table to hold the values 1..x, where x is your bucket count. The @T can easily be replaced with your MySQL table name. The results are all the possible sets each age falls in; then count how many sets are equal.
--IGNORE BUILDING TEST DATA IN SQL SERVER
DECLARE @T TABLE(member INT,age INT)
DECLARE @X INT
SET @X=1
WHILE(@X<=100) BEGIN
INSERT INTO @T SELECT @X, CAST(RAND() * 100 AS INT)
SET @X=@X+1
END
DECLARE @MinAge INT=1
DECLARE @MaxAge INT=100
--YOUR SET TABLE. TO MAKE LIFE EASY YOU NEED A TABLE OF 1..X
DECLARE @SET TABLE (Value INT)
DECLARE @SET_COUNT INT =10
DECLARE @LOOP INT=1
WHILE(@LOOP<=@SET_COUNT) BEGIN
INSERT @SET SELECT @LOOP
SET @LOOP=@LOOP+1
END
SELECT
MinAge,
MaxAge,
SetCount=COUNT(CountFlag)
FROM
(
SELECT
MinAge=AgeMinusSetCount,
MaxAge=AgePlusSetCount,
CountFlag=1
FROM
(
SELECT DISTINCT
ThisAge,
AgeMinusSetCount=(AgeMinusSetCount-1) + Value,
AgePlusSetCount=CASE WHEN (AgeMinusSetCount-1) + Value + @SET_COUNT > @MaxAge THEN @MaxAge ELSE (AgeMinusSetCount-1) + Value + @SET_COUNT END
FROM
(
SELECT
ThisAge=age,
AgeMinusSetCount=CASE WHEN (age - @SET_COUNT) < @MinAge THEN @MinAge ELSE (age) - @SET_COUNT END
FROM
@T
)RANGES
LEFT OUTER JOIN (SELECT Value FROM @SET) AS FanLeft ON 1=1
)AS DETAIL
)AS Summary
GROUP BY
MinAge,
MaxAge
ORDER BY
COUNT(CountFlag) DESC
I'm trying to make a report that will display how many patients came in during a specific time frame for an age range. This is what I've got so far, but the numbers it's outputting are wrong, so I'm not sure what I missed. I've followed a couple of examples on here, but none have worked so far. Not sure if it's because I'm joining to a different table or what.
select COUNT (DISTINCT MPFILE.PATIENT_NO) as 'Male 0-4'
from ENHFILE
Join MPFILE
on MPFILE.PATIENT_NO = ENHFILE.PATIENT_NO
where ENHFILE.COSITE = '300'
and ENHFILE.VISIT_PURPOSE = '2'
and MPFILE.SEX = 'M'
and (DATEDIFF(hour,MPFILE.DOB,GETDATE())/8766) > 5
and ENHFILE.ENCOUNTER_DATE between (@StartDate) and (@EndDate)
select COUNT (DISTINCT MPFILE.PATIENT_NO) as 'FeMale 0-4'
from ENHFILE
Join MPFILE
on MPFILE.PATIENT_NO = ENHFILE.PATIENT_NO
where ENHFILE.COSITE = '300'
and ENHFILE.VISIT_PURPOSE = '2'
and MPFILE.SEX = 'F'
and (DATEDIFF(hour,MPFILE.DOB,GETDATE())/8766) > 5
and ENHFILE.ENCOUNTER_DATE between (@StartDate) and (@EndDate)
Here is something that should get you what you want.
--First i just created some test data
if object_id('tempdb..#temp') is not null drop table #temp
select '8/15/1995' as DOB into #temp union all --Note this person is 21
select '8/16/1995' union all --Note this person is 21 TODAY
select '8/17/1995' union all --Note this person is 21 TOMORROW
select '4/11/1996' union all
select '5/15/1997' union all
select '9/7/2001'
--set the years old you want to use here. Create another variable if you need to use ranges
declare @yearsOld int
set @yearsOld = 21
select
convert(date,DOB) as DOB,
--This figures out how old they are by seeing if they have had a birthday
--this year and calculating accordingly. It is what is used in the filter
--I only put it here so you can see the results
CASE
WHEN CONVERT(DATE, CONVERT(VARCHAR(4), YEAR(GetDate()))+ '-'+ CONVERT(VARCHAR(2),MONTH(DOB)) + '-' + CONVERT(VARCHAR(2),DAY(DOB))) <= GETDATE() THEN DATEDIFF(yy,DOB,GETDATE())
ELSE DATEDIFF(yy,DOB,GETDATE()) -1
END AS YearsOld
from #temp
where
--here is your filter. Feel free to change the >= to what ever you want, or combine it to make it a range.
CASE
WHEN CONVERT(DATE, CONVERT(VARCHAR(4), YEAR(GetDate()))+ '-'+ CONVERT(VARCHAR(2),MONTH(DOB)) + '-' + CONVERT(VARCHAR(2),DAY(DOB))) <= GETDATE() THEN DATEDIFF(yy,DOB,GETDATE())
ELSE DATEDIFF(yy,DOB,GETDATE()) -1
END >= @yearsOld
EDIT
This is your method, which doesn't account for whether they have had a birthday this year. I use some test data. Notice the person born on 8/18/1995. They turn 21 tomorrow, but using (DATEDIFF(hour,DOB,GETDATE())/8766) >= @yearsOld includes them when it shouldn't...
--First i just created some test data
if object_id('tempdb..#temp') is not null drop table #temp
select '8/15/1995' as DOB into #temp union all --Note this person is 21
select '8/16/1995' union all --Note this person is 21 TODAY
select '8/18/1995' union all --Note this person is 21 TOMORROW
select '4/11/1996' union all
select '5/15/1997' union all
select '9/7/2001'
--set the years old you want to use here. Create another variable if you need to use ranges
declare @yearsOld int
set @yearsOld = 21
select
convert(date,DOB) as DOB,
--This figures out how old they are by seeing if they have had a birthday
--this year and calculating accordingly. It is what is used in the filter
--I only put it here so you can see the results
CASE
WHEN CONVERT(DATE, CONVERT(VARCHAR(4), YEAR(GetDate()))+ '-'+ CONVERT(VARCHAR(2),MONTH(DOB)) + '-' + CONVERT(VARCHAR(2),DAY(DOB))) <= GETDATE() THEN DATEDIFF(yy,DOB,GETDATE())
ELSE DATEDIFF(yy,DOB,GETDATE()) -1
END AS YearsOld
from #temp
where
--here is your filter. Feel free to change the >= to what ever you want, or combine it to make it a range.
(DATEDIFF(hour,DOB,GETDATE())/8766) >= @yearsOld
RESULTS
DOB | YearsOld
1995-08-15 | 21
1995-08-16 | 21
1995-08-18 | 20 --this shouldn't be here...
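A more compact way to apply the same birthday check, sketched under the assumption of SQL Server 2008 or later: add the year difference back onto the date of birth and subtract one if that anniversary is still in the future. The sample dates and the names @Today, @yearsOld, t and a are illustrative only. Because no date string is assembled from parts, a 29 February birthday does not raise a conversion error in a non-leap year.

```sql
DECLARE @Today date = CAST(GETDATE() AS date);
DECLARE @yearsOld int = 21;

SELECT t.DOB, a.YearsOld
FROM (VALUES (CAST('19950815' AS date)),
             (CAST('19960229' AS date)),
             (CAST('20010907' AS date))) AS t(DOB)
CROSS APPLY (
    -- Year difference, minus 1 when this year's birthday has not happened yet.
    SELECT DATEDIFF(year, t.DOB, @Today)
           - CASE WHEN DATEADD(year, DATEDIFF(year, t.DOB, @Today), t.DOB) > @Today
                  THEN 1 ELSE 0 END AS YearsOld
) AS a
WHERE a.YearsOld >= @yearsOld;
```

For an age band such as 0 to 4, the filter would become a.YearsOld BETWEEN 0 AND 4.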
I've got two tables, a project table and a calendar table. The first contains a start date and the number of days required. The calendar table contains the usual date information, like date and dayofweek, plus a column workingday, which shows whether the day is a Saturday, Sunday, or bank holiday (value = 0) or a regular workday (value = 1).
For a certain report I need to write a stored procedure that calculates the predicted end date by adding the estimated number of workdays needed.
Example:
**Projects**
Name Start_Planned Work_days_Required
Project A 02.05.2016 6
Calendar (04.05 is a bank holiday)
Day Weekday Workingday
01.05.2016 7 0
02.05.2016 1 1
03.05.2016 2 1
04.05.2016 3 0
05.05.2016 4 1
06.05.2016 5 1
07.05.2016 6 0
08.05.2016 7 0
09.05.2016 1 1
10.05.2016 2 1
Let's say, the estimated number of days required is given as 6 (which leads to the predicted enddate of 10.05.2016). Is it possible to join the tables in a way, which allows me to put something like
select date as enddate_predicted
from calendar
join projects
where number_of_days = 6
I would post some more code, but I'm quite stuck on where to start.
Thanks!
You could get all working days after your first date, then apply ROW_NUMBER() to get the number of days for each date:
SELECT Date, DayNum = ROW_NUMBER() OVER(ORDER BY Date)
FROM Calendar
WHERE WorkingDay = 1
AND Date >= @StartPlanned
Then it would just be a case of filtering for the 6th day:
DECLARE @StartPlanned DATE = '20160502',
@Days INT = 6;
SELECT Date
FROM ( SELECT Date, DayNum = ROW_NUMBER() OVER(ORDER BY Date)
FROM Calendar
WHERE WorkingDay = 1
AND Date >= @StartPlanned
) AS c
WHERE c.DayNum = @Days;
It's not part of the question, but for future-proofing, this is easier to achieve in SQL Server 2012+ with OFFSET/FETCH:
DECLARE @StartPlanned DATE = '20160502',
@Days INT = 6;
SELECT Date
FROM dbo.Calendar
WHERE Date >= @StartPlanned
AND WorkingDay = 1
ORDER BY Date
OFFSET (@Days - 1) ROWS FETCH NEXT 1 ROWS ONLY;
ADDENDUM
I missed the part earlier about having another table, and the comment about putting it into a cursor has prompted me to amend my answer. I would add a new column to your calendar table called WorkingDayRank:
ALTER TABLE dbo.Calendar ADD WorkingDayRank INT NULL;
GO
UPDATE c
SET WorkingDayRank = wdr
FROM ( SELECT Date, WorkingDayRank, wdr = ROW_NUMBER() OVER(ORDER BY Date)
FROM dbo.Calendar
WHERE WorkingDay = 1
) AS c;
This can be done on the fly, but you will get better performance with it stored as a value, then your query becomes:
SELECT p.Name,
p.Start_Planned,
p.Work_days_Required,
EndDate = c2.Date
FROM Projects AS P
INNER JOIN dbo.Calendar AS c1
ON c1.Date = p.Start_Planned
INNER JOIN dbo.Calendar AS c2
ON c2.WorkingDayRank = c1.WorkingDayRank + p.Work_days_Required - 1;
This simply gets the working day rank of your start date and finds the day that is the specified number of working days ahead by joining on WorkingDayRank (-1 because you want the end date inclusive of the range).
This will fail if you ever plan to start your project on a non-working day, though, so a more robust solution might be:
SELECT p.Name,
p.Start_Planned,
p.Work_days_Required,
EndDate = c2.Date
FROM Projects AS P
CROSS APPLY
( SELECT TOP 1 c1.Date, c1.WorkingDayRank
FROM dbo.Calendar AS c1
WHERE c1.Date >= p.Start_Planned
AND c1.WorkingDay = 1
ORDER BY c1.Date
) AS c1
INNER JOIN dbo.Calendar AS c2
ON c2.WorkingDayRank = c1.WorkingDayRank + p.Work_days_Required - 1;
This uses CROSS APPLY to get the next working day on or after your project start date, then applies the same join as before.
This query returns a table with a predicted end date for each project:
select name,min(day) as predicted_enddate from (
select c.day,p.name from dbo.Calendar c
join dbo.Calendar c2 on c.day>=c2.day
join dbo.Projects p on p.start_planned<=c.day and p.start_planned<=c2.day
group by c.day,p.work_days_required,p.name
having sum(c2.workingday)=p.work_days_required
) a
group by name
--This gives me info about all projects
select p.projectname,p.Start_Planned,c.date
from calendar c
join
projects p
on c.date=dateadd(day,p.Work_days_Required,p.Start_Planned)
and c.isworkingday=1
Now you can use a CTE like the one below, or wrap this in a procedure:
;with cte
as
(
Select
p.projectname,
p.Start_Planned,
c.date,datediff(day,p.Start_Planned,c.date) as nooffdays
from calendar c
join
projects p
on c.date=dateadd(day,p.Work_days_Required,p.Start_Planned)
and c.isworkingday=1
)
select * from cte where nooffdays=6
Use the logic below:
CREATE TABLE #proj(Name varchar(50),Start_Planned date,
Work_days_Required int)
insert into #proj
values('Project A','02.05.2016',6)
CReATE TABLE #Calendar(Day date,Weekday int,Workingday bit)
insert into #Calendar
values('01.05.2016',7,0),
('02.05.2016',1,1),
('03.05.2016',2,1),
('04.05.2016',3,0),
('05.05.2016',4,1),
('06.05.2016',5,1),
('07.05.2016',6,0),
('08.05.2016',7,0),
('09.05.2016',1,1),
('10.05.2016',2,1)
DECLARE @req_day int = 3
DECLARE @date date = '02.05.2016'
--SELECT @req_day = Work_days_Required FROM #proj where Start_Planned = @date
--number the working days in ascending order, counting the start date itself
select *,row_number() over(order by [day]) as cnt
from #Calendar
where Workingday = 1
and [Day] >= @date
SELECT *
FROM
(
select *,row_number() over(order by [day]) as cnt
from #Calendar
where Workingday = 1
and [Day] >= @date
)a
where cnt = @req_day
I have table 1 with 3 columns id, startdate and enddate. With order id being the primary key how do I list the dates between the date range Startdate and Enddate?
What I have:
id Startdate EndDate
1 2/11/2014 2/13/2014
2 2/15/2014 2/17/2014
What I need:
id Date
1 2/11/2014
1 2/12/2014
1 2/13/2014
2 2/15/2014
2 2/16/2014
2 2/17/2014
How do I do this?
Use recursive CTE:
WITH tmp AS (
SELECT id, StartDate AS [Date], EndDate
FROM MyTable
UNION ALL
SELECT tmp.id, DATEADD(DAY,1,tmp.[Date]), tmp.EndDate
FROM tmp
WHERE tmp.[Date] < tmp.EndDate
)
SELECT tmp.ID, tmp.[Date]
FROM tmp
ORDER BY tmp.id, tmp.[Date]
OPTION (MAXRECURSION 0) -- For long intervals
If you have to use cursor/loop, most times you are doing it wrong.
If you do a one-off setup of an auxiliary calendar table as shown at Why should I consider using an auxiliary calendar table?, possibly omitting a lot of the columns if you don't need them, like this:
CREATE TABLE dbo.Calendar
(
dt SMALLDATETIME NOT NULL
PRIMARY KEY CLUSTERED,
Y SMALLINT,
M TINYINT,
D TINYINT
)
GO
SET NOCOUNT ON
DECLARE @dt SMALLDATETIME
SET @dt = '20000101'
WHILE @dt < '20300101'
BEGIN
INSERT dbo.Calendar(dt) SELECT @dt
SET @dt = @dt + 1
END;
UPDATE dbo.Calendar SET
Y = YEAR(dt),
M = MONTH(dt),
D = DAY(dt);
(You may well not need the Y, M, D columns at all, but I left those in to show that more data can be stored for fast access - the article I linked to shows how that could be used.)
Then if your table is named "so", your code would simply be
SELECT A.id, C.dt
FROM so AS A
JOIN Calendar AS C
ON C.dt >= A.StartDate AND C.dt<= A.EndDate
An advantage of using an auxiliary table like that is that your queries can be faster: the work done in setting one up is a one-time cost which doesn't happen during usage.
Instead of using a CTE (to avoid recursion and its performance cost when the date range is large), the query below can be used to get the list of dates between two dates.
DECLARE @StartDateSTR AS VARCHAR(32);
DECLARE @EndDateSTR AS VARCHAR(32);
DECLARE @EndDate AS DATE;
DECLARE @StartDate AS DATE;
SET @StartDateSTR = '01/01/1990';
SET @EndDateSTR = '03/31/2025';
SET @StartDate = CAST(@StartDateSTR AS date);
SET @EndDate = CAST(@EndDateSTR AS date);
SELECT DATEADD(DAY, n1.rn - 1, @StartDate) AS dt
FROM (SELECT rn = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.objects a
CROSS JOIN sys.objects b
CROSS JOIN sys.objects c
CROSS JOIN sys.objects d) AS n1
WHERE n1.[rn] <= DATEDIFF(dd, @StartDate, @EndDate) + 1;