How to create a loop in a case statement - mysql

I have the following SQL statement:
DECLARE @time datetime;
SELECT @time = (select min(CreationDate) from TABLE);
DECLARE @time2 int;
SELECT @time2 = 15;
select ColumnA,
(case when CreationDate between @time and DATEADD(MINUTE,@time2*1,@time)
then cast (@time2*1 as int)
when CreationDate between @time and DATEADD(MINUTE,@time2*2,@time)
then cast (@time2*2 as int)
when CreationDate between @time and DATEADD(MINUTE,@time2*3,@time)
then cast (@time2*3 as int)
when CreationDate between @time and DATEADD(MINUTE,@time2*4,@time)
then cast (@time2*4 as int)
else 0
end) as 'interval', count(1)
from TABLE
group by
ColumnA,
(case when CreationDate between @time and DATEADD(MINUTE,@time2*1,@time)
then cast (@time2*1 as int)
when CreationDate between @time and DATEADD(MINUTE,@time2*2,@time)
then cast (@time2*2 as int)
when CreationDate between @time and DATEADD(MINUTE,@time2*3,@time)
then cast (@time2*3 as int)
when CreationDate between @time and DATEADD(MINUTE,@time2*4,@time)
then cast (@time2*4 as int)
else 0
end)
1. How can I write the CASE statement in a loop so that the multiplier (the 1, 2, 3, 4 after @time2* in each branch) becomes a parameter?
2. I need the loop/function to generate as many CASE branches as needed, according to the parameter from question 1.
Thanks a lot!
Hello everyone,
Thanks for your response and comments.
I realized that maybe I did not explain my questions properly. Let me rephrase my questions again.
I have an ETL process that runs and fills a table with a column called "ColumnA", which holds codes, and a creation-time column called "CreationDate".
I want to group the results by creation time: sometimes in 15-minute intervals, sometimes in 20-minute intervals, or any other interval length.
So I set up a variable called "@time2" that holds the interval length.
The first problem: I do not know how long the ETL process will run, so I do not know how many branches to write in the CASE statement.
The second problem: the number of CASE branches also depends on the interval length I choose in "@time2". That is, if the process takes an hour and "@time2" is 15 minutes, I need 4 CASE branches, but if I set "@time2" to 10 minutes, I need 6 CASE branches.
How can I do this with parameters?
Thanks in advance for your time and efforts.
Regards,
Alan B.

Don't think of loops when working with SQL. Think of results you want to see.
If I interpret your request correctly, you are selecting all rows with a creation date between the given @time and the following hour. For these you determine the time slice. In your case a slice is 15 minutes long, and you want to know whether a creation date falls within the first 15 minutes (then you output 15), the second (then you output 30), and so on. Records with any other creation date, whether before or after the hour in question, get an output of 0.
So the only problem is to find the correct formula, which is more or less: get the time difference in minutes, divide by the minute slice, and the resulting full integer will tell you which slice it is, starting with 0 for the first slice in the hour.
That should be more or less (in MySQL syntax):
select columna, interval_end_minute, count(*)
from
(
select
columna,
case when creationdate >= @time and creationdate < date_add(@time, interval 1 hour) then
-- whole minutes since @time, split into slices; the +1 turns the slice index into the slice's end minute
(truncate(timestampdiff(minute, @time, creationdate) / @time2, 0) + 1) * @time2
else
0
end as interval_end_minute
from table
) data
group by columna, interval_end_minute;
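To check the slice arithmetic outside the database, here is a minimal sketch in plain Python (not SQL). The function name, the `window_minutes` parameter and the sample timestamps are made up for illustration; the formula is the one described above: whole minutes elapsed, integer-divided by the slice length, plus one, times the slice length.

```python
from datetime import datetime, timedelta


def interval_end_minute(created, start, slice_minutes, window_minutes=60):
    """Return the end minute of the slice containing `created`, or 0 outside the window.

    Hypothetical helper: `start` plays the role of @time, `slice_minutes` of @time2.
    """
    if not (start <= created < start + timedelta(minutes=window_minutes)):
        return 0
    # Whole minutes elapsed since the start of the window.
    elapsed_minutes = int((created - start).total_seconds() // 60)
    # Slice index (0-based) -> end minute of that slice.
    return (elapsed_minutes // slice_minutes + 1) * slice_minutes


start = datetime(2024, 1, 1, 10, 0)  # made-up sample value
for minute in (7, 15, 59):
    created = start + timedelta(minutes=minute)
    print(minute, interval_end_minute(created, start, 15))
```

Changing the slice length only changes the argument, not the number of branches, which is the point of replacing the repeated CASE rows with one formula.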

Related

SQL - add days to a date in a loop

I have a table with these fields:
Id int(11) pk,
Start_date date,
End_date date,
type tinyint(1),
Frequency int.
I want to select from this table where start_date + frequency = @date (a variable date), repeating until end_date (a loop).
How can I do this with SQL?
Thanks in advance.
EDIT:
The variable date is, for example:
SET @date = '2017-03-30'
type can be 0 or 1:
If type = 0, my query is:
select * from table
where type = 0 and start_date <= @date AND end_date>=@date
If type = 1, frequency is a field holding an integer number (an interval in days). So I have to check whether adding this value to start_date gives @date:
if yes, I have to return the current record
if no, I have to repeat the operation
Date current = start_date + interval of 'frequency' days
while(current < end_date){
if(current == @date)
(this is the record I want)
else
current+=frequency
}
The query for type 1 can return more than one record. Finally, I want to UNION the results for type 0 and type 1 into a single select.
Based on a comment/confirmation below the question:
So, put in terms less focused on the wrong solution: you want to determine whether the difference between start_date and @date, in days, is an integer multiple of frequency?
Looks like you want something like:
select * from table
where
start_date <= @date AND
end_date>=@date AND
(
type = 0 OR
(
type = 1 AND
mod(datediff(@date,start_date),frequency) = 0
)
)
Once the actual requirement was determined, as above, it was clear that we just need to find out whether one number is a multiple of another, and mod computes exactly that. The rest of the structure of the WHERE clause essentially follows the bullet-pointed section of the question.
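The same test can be sketched in plain Python to see why no loop is needed. The function name and the sample dates are hypothetical, and the sketch assumes frequency is a positive number of days (a frequency of 0 would fail on the modulo).

```python
from datetime import date


def row_matches(start_date, end_date, row_type, frequency, target):
    """Mirror of the WHERE clause above; `target` plays the role of @date."""
    if not (start_date <= target <= end_date):
        return False
    if row_type == 0:
        return True
    # type 1: target must lie a whole number of `frequency`-day steps after start_date
    return (target - start_date).days % frequency == 0


# Made-up sample row: starts 2017-03-01, repeats every 7 days.
print(row_matches(date(2017, 3, 1), date(2017, 4, 30), 1, 7, date(2017, 3, 29)))
print(row_matches(date(2017, 3, 1), date(2017, 4, 30), 1, 7, date(2017, 3, 30)))
```

Like the query, this also accepts a target equal to start_date (zero steps), which the loop in the question would skip; add a `target > start_date` condition if that case should be excluded.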

Efficiently selecting every nth row without ROW_NUMBER

I have a table consisting of about 20 million rows, totalling approximately 2 GB. I need to select every nth row, leaving me with only a few hundred rows. But I cannot for the life of me figure out how to do it without getting a timeout.
ROW_NUMBER is not available, and keeping track of the current row number with a variable (e.g. @row) causes a timeout. I presume this is because it is still iterating over every row, but I'm not too sure. There's no integer index for me to use either; a DATETIME field is used instead. This is an example query using @row:
SET @row = 0;
SELECT `field` FROM `table` WHERE (@row := @row + 1) % 1555200 = 0;
Is there anything else I haven't tried?
Thanks in advance!
It's a tricky one for sure. You could work out the minimum date and then use a DATEDIFF to get sequential values, but this probably isn't sargable (as below). For me, it took 18 seconds on a table with 16 million rows, but your mileage may vary.
EDIT: I should also add that this was with a nonclustered index scan against an index which included the date column (I'm pretty sure the scan is forced by the function around the date, but perhaps someone with more knowledge can expand on this). After creating an index on that column alone, I got 12 seconds.
Try it out and let me know how it goes :)
DECLARE @n INT = 5;
SELECT
DATEDIFF(DAY, first_date.min_date, DATE_COLUMN) AS ROWNUM
FROM
ss.YOUR_TABLE
OUTER APPLY
( SELECT
MIN(a.DATE_COLUMN) min_date
FROM ss.YOUR_TABLE a
) first_date
WHERE DATEDIFF(DAY, first_date.min_date, DATE_COLUMN) % @n = 0
Edit again:
Just noticed this has been accepted as an answer... In case anyone else comes across this, it probably shouldn't be. On review, this only works if your datetime field has one entry per day and the datetime is sequential (in that rows are added in the same order as the datetime, or if the datetime is the primary key).
Again, this only works per day with the above caveats. You can change the DATEDIFF to use any other unit (month, year, minute, etc.) if you have one row added per unit of time.
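The caveat is easy to see with a small sketch in plain Python. The function name and the sample data are invented for illustration; the filter is the same "day offset from the minimum date, modulo n" test as in the query.

```python
from datetime import date, timedelta


def rows_by_day_offset(dates, n):
    """Keep the rows whose day offset from the earliest date is a multiple of n."""
    first = min(dates)
    return [d for d in dates if (d - first).days % n == 0]


# Exactly one row per day: the filter behaves like "every 5th row".
one_per_day = [date(2024, 1, 1) + timedelta(days=i) for i in range(10)]
print(len(rows_by_day_offset(one_per_day, 5)))

# Three extra rows on the first day: all of them pass the filter too.
with_duplicates = one_per_day + [date(2024, 1, 1)] * 3
print(len(rows_by_day_offset(with_duplicates, 5)))
```

With one row per day the filter keeps 2 of 10 rows; with the duplicated first day it keeps 5 of 13, so it is no longer "every nth row".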

MySql - Calculating distance in time using 2 values from 1 column (Poor design workaround)

I was granted access to a legacy database in order to do some statistics work.
I've so far gotten everything I need out of it, except I am trying to calculate a distance in time, using 5 values, stored in 4 columns (ARGGGHHH)
Above is a subsection of the database.
As you can see, I have start and stop date and time.
I would like to calculate the distance in time from str_date + str_time to stp_date + stp_time
The issue I have is, the calculation should be performed differently depending on the second value in stp_time.
If the second value is "DUR", I can just take the first value ("01:04:51" in this scenario).
If the second value is anything else, stp_time represents a timecode and not a duration. The result must then be calculated as stp_time - str_time (accounting for the date if it is not the same date).
All data is 24 hour format. I have done work with conditional aggregation, but I have not figured this one out, and I have never worked with a malformed column like this before.
Any and all advice is welcome.
Thanks for reading
SELECT
CASE WHEN RIGHT(stp_time,3)="DUR"
THEN
TIMEDIFF(LEFT(stp_time,8), '00:00:00')
ELSE
TIMEDIFF(
STR_TO_DATE(CONCAT(stp_date," ",LEFT(stp_time,8)), '%d/%b/%Y %H:%i:%s'),
STR_TO_DATE(CONCAT(str_date," ",LEFT(str_time,8)), '%d/%b/%Y %H:%i:%s')
)
END AS diff
FROM so33289063
Try this out; you might want a WHERE condition for the subquery.
With LEFT and RIGHT:
-- dur = 1: stp_time already holds the duration, so return its time part
-- dur = 0: subtract the start timestamp from the stop timestamp
SELECT IF(dur, TIME(stp), TIMEDIFF(stp, str)) AS diff FROM(
SELECT STR_TO_DATE(CONCAT(str_date," ",LEFT(str_time,8)), '%d/%b/%Y %H:%i:%s') as str,
STR_TO_DATE(CONCAT(stp_date," ",LEFT(stp_time,8)), '%d/%b/%Y %H:%i:%s') as stp,
if(RIGHT(stp_time,3)="DUR",1,0) as dur
FROM my_table
) AS times
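For checking results against the database, the same branching can be sketched in plain Python. This assumes the layout implied by the queries above: the first 8 characters of str_time and stp_time are HH:MM:SS, a trailing "DUR" marks a duration, and dates look like 23/Oct/2015. The function name, the "TC" marker and the sample values are invented for illustration.

```python
from datetime import datetime, timedelta

# Python equivalent of the MySQL format '%d/%b/%Y %H:%i:%s'
DATE_TIME_FORMAT = "%d/%b/%Y %H:%M:%S"


def elapsed(str_date, str_time, stp_date, stp_time):
    """Return the elapsed time as a timedelta (hypothetical helper)."""
    stop_clock = stp_time[:8]
    if stp_time.endswith("DUR"):
        # stp_time is already a duration: use it directly.
        hours, minutes, seconds = (int(part) for part in stop_clock.split(":"))
        return timedelta(hours=hours, minutes=minutes, seconds=seconds)
    # Otherwise stp_time is a timecode: subtract full start from full stop.
    start = datetime.strptime(f"{str_date} {str_time[:8]}", DATE_TIME_FORMAT)
    stop = datetime.strptime(f"{stp_date} {stop_clock}", DATE_TIME_FORMAT)
    return stop - start


print(elapsed("23/Oct/2015", "10:00:00 TC", "23/Oct/2015", "01:04:51 DUR"))
print(elapsed("23/Oct/2015", "23:30:00 TC", "24/Oct/2015", "00:15:00 TC"))
```

Combining date and time before subtracting is what makes the second case work across midnight.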

No TimeDiff function in T-SQL?

I have created a new column called DesiredTimeOfFileCreation of type time(7); this will indicate at what time the data is to be extracted to an export file.
Let's suppose it is set to 6:00:00. I then have a SQL agent job scheduled at 6:00 (probably every 30 minutes), but it might run at 6:00:05 or even 6:01. I want to select all rows where the DesiredTimeOfFileCreation is less than 30 minutes ago.
Does someone already have a user-defined TimeDiff function? Or is there an easy alternative that I'm missing?
As Martin mentioned above, I need to handle the midnight wrap-around.
This seems overly complicated. The code below seems to work if one time is one hour before midnight, and one is within an hour after. Would be nice to make it more generic. I think the only way to do that would be to make up a dummy date, which I may experiment with next.
The reason I'm passing a date in the unit test is that I will be passing a casted version of GetUTCDate() as a parm:
ALTER FUNCTION TimeDiffMinutes
(
@FirstTime time(7),
@SecondTime time(7)
)
RETURNS int
AS
BEGIN
/*
Unit Test:
select dbo.TimeDiffMinutes('13:31',cast ('2013-06-10 13:35' as time)), -- simple test
dbo.TimeDiffMinutes('23:55',cast ('2013-06-10 00:05' as time)) -- test midnight wrap-around
*/
-- Declare the return variable here
DECLARE @resultMinutes int
DECLARE @Hour int
-- although we can compare two times, the problem is that if one time is 23:55 and the other is 00:05, we want to show 10 minutes difference.
-- We cannot add 24 hours to a time, because that would be an invalid value
Set @Hour = datePart(hour,@SecondTime)
if (@Hour <= 0)
begin
-- increase both times by an hour so we can compare them; 23:55 wraps around to 00:55 and 00:05 becomes 01:05
Set @FirstTime = DateAdd(hour,+1,@FirstTime)
Set @SecondTime = DateAdd(hour,+1,@SecondTime)
end
SET @resultMinutes = DATEDIFF(Minute,@FirstTime,@SecondTime)
-- Return the result of the function
RETURN @resultMinutes
END
NOTE: This code shows that you cannot go past 24 hours in a time; it just wraps back around (with no error!):
declare @FirstTime time(7)
SET @FirstTime = '23:05'
print @FirstTime
Set @FirstTime = DATEADD(HOUR,1,@FirstTime)
print @FirstTime
Improved version, using an arbitrary date.
ALTER FUNCTION TimeDiffMinutes
(
@FirstTime time(7),
@SecondTime time(7)
)
RETURNS int
AS
BEGIN
/*
Unit Test:
select dbo.TimeDiffMinutes('13:31',cast ('2013-06-10 13:35' as time)), -- simple test
dbo.TimeDiffMinutes('23:55',cast ('2013-06-10 00:05' as time)) -- test midnight wrap-around
*/
-- Declare the return variable here
DECLARE @resultMinutes int
DECLARE @Hour int
DECLARE @FirstDate datetime
DECLARE @SecondDate datetime
Set @FirstDate = CAST('2001-01-01 ' + Convert(varchar(12),@FirstTime) as DateTime)
Set @SecondDate = CAST('2001-01-01 ' + Convert(varchar(12),@SecondTime) as DateTime)
-- although we can compare two times, the problem is that if one time is 23:55 and the other is 00:05, we want to show 10 minutes difference.
-- We cannot add 24 hours to a time, because that would be an invalid value, so we work with datetimes on an arbitrary date instead
Set @Hour = datePart(hour,@SecondDate)
if (@Hour <= 0)
begin
-- the second time is in the hour after midnight, so move it to the next day before comparing
Set @SecondDate = DateAdd(day,+1,@SecondDate)
end
SET @resultMinutes = DATEDIFF(Minute,@FirstDate,@SecondDate)
-- Return the result of the function
RETURN @resultMinutes
END
This is how I will use the function. We store the local time that an airport wants an extract file in a table. Then we will use SQL agent or BizTalk to poll every 30 minutes looking for work to do. AirportCode is a column in the table, and we have our own crazy function that converts for timezones.
select *,
dbo.TimeDiffMinutes(
DesiredFileCreationTimeLocal,
cast(dbo.LocationLocalTimeFromAirportCode(AirportCode,GETUTCDATE()) as time)
) as 'MinutesAgo'
from TransactionExtractDistribution
where dbo.TimeDiffMinutes(
DesiredFileCreationTimeLocal,
cast(dbo.LocationLocalTimeFromAirportCode(AirportCode,GETUTCDATE()) AS time)
) < 30
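The arbitrary-date idea can also be sketched in plain Python for checking expected values. Note that this is a variant, not a translation: the T-SQL function above rolls the second time forward only when its hour is 0, while this sketch rolls it forward whenever it is earlier on the clock than the first time. The function name is hypothetical.

```python
from datetime import date, datetime, time, timedelta

ARBITRARY_DATE = date(2001, 1, 1)  # same dummy date as in the function above


def time_diff_minutes(first, second):
    """Minutes from `first` to `second`, treating an earlier `second` as next-day."""
    first_dt = datetime.combine(ARBITRARY_DATE, first)
    second_dt = datetime.combine(ARBITRARY_DATE, second)
    if second_dt < first_dt:
        # The clock wrapped past midnight: move the second time to the next day.
        second_dt += timedelta(days=1)
    return int((second_dt - first_dt).total_seconds() // 60)


print(time_diff_minutes(time(13, 31), time(13, 35)))  # simple case
print(time_diff_minutes(time(23, 55), time(0, 5)))    # midnight wrap-around
```

A consequence of this variant is that the result is never negative: a desired time that is still in the future shows up as a large number of minutes rather than a negative one, so a `< 30` filter would not pick it up early.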
This will probably work for me:
WHERE DATEDIFF(Minute,DesiredFileCreationTimeLocal,cast(GETDATE() as time)) < 30
I had to research what happened if you pass a Time as a variable to the DateDiff function.
It seems to work, the only trick is then how to pass two times to it.
My real-world scenario is more complex, because we are dealing with different locations in different time zones, so there will be some UTC conversions added to the above.

SQL Server 2008 CASE Logic in SELECT statement

Hello again SQL Server 2008 gurus.
I need to apply the following rules to the setting of a worker's start and end times for their work day (hourly employees) in a SELECT statement. I apologize in advance for my SQL ignorance.
The rule is to set their start time to a value stored in a table field for that worker, if they login on or before their start time (a time stored in the worker starttime column) and therefore get credit for starting at their start time.
If they log out within the 10 minutes before their end time (stored in a column for the worker), or any time after it, they get credit for their full day, another value stored in a column of the worker table. Otherwise they are penalized some fraction of an hour: their logout time is rounded down to the quarter hour at or before the time they logged out. For example, if they are set to log out at 4:30 and they log out at 4:18, their logout time is 4:15. If they log out at 4:20 and are set to log out at 4:30, their logout time is 4:30.
The first rule applies to all hourly employees where their workday hours is less than or equal to their expected workday value. The caveat is, for those where overtime is ok (a bit value set to 1). If overtime is permitted, the number of billable hours can exceed the full day value stored for them, and therefore the value of their logout - login time can exceed their fullday value.
The question is, can these rules be calculated in the SELECT statement and if so can I get some help with the code?
The columns containing the information are:
worker.startime (TIME)
worker.endtime (TIME)
worker.overtimeallowed (BIT)
worker.workdayhours (decimal (12,2))
worker.penaltyvalue (decimal (12,2))
If it requires a UDF or stored procedure (since I'm using the Telerik ReportViewer) I'm not sure it would be supported, but that's probably another question.
So far I've gotten some help with applying some CASE logic: calculating whether a worker gets credit for their 1/2 lunch. The code that was supplied works as promised. This, I believe, may be an extension to that logic, so I'll provide the code I have here:
-- for testing purposes only.
DECLARE @StartDate AS DateTime
SET @StartDate = CAST('03/25/2012' AS DATE)
DECLARE @EndDate AS DateTime
SET @EndDate = CAST('04/10/2012' AS DATE)
SELECT
w.Firstname
,w.Lastname
,wf.Login
,wf.logout
,ROUND(CAST(DATEDIFF(MI, wf.Login, wf.Logout) AS DECIMAL)/60,2)
- CASE
WHEN DATEDIFF(hour, wf.Login, wf.Logout) < w.MinimumHours THEN
w.LunchDeduction
ELSE
0
END AS [Hours Credited]
FROM Workers AS w
JOIN Workflow AS wf
ON wf.LoggedInWorkerid = w.ID
JOIN Roles AS r
ON w.RoleID = r.RoleID
WHERE (r.Descript = 'Hourly')
AND wf.Login >= @StartDate AND wf.Logout <= @EndDate
ORDER BY w.Lastname, w.Firstname
Here is a sample select dealing with the constraints you described. The CTEs create tables for testing purposes; the main query shows the calculations. You have worked with DATEDIFF and DATEADD, so there is no mystery there. If you haven't used % before, it is the modulo operator, used here to round the time down to 15 minutes.
;with worker (ID, overtime, startTime, endTime) as
(
select 1, 1, CAST ('08:30' as time), CAST ('16:30' as time)
union all
select 2, 0, CAST ('08:30' as time), CAST ('16:30' as time)
union all
select 3, 0, CAST ('08:30' as time), CAST ('16:30' as time)
),
-- Test table of workflows
wf (workerID, login, logout) as
(
select 1, CAST ('2012-03-11 08:20' as datetime), CAST ('2012-03-11 19:33' as datetime)
union all
select 2, CAST ('2012-03-11 08:50' as datetime), CAST ('2012-03-11 16:20' as datetime)
union all
select 3, CAST ('2012-03-11 08:22' as datetime), CAST ('2012-03-11 16:18' as datetime)
)
select wf.workerID,
wf.login,
wf.logout,
-- if starttime > login return startTime else login
case when DATEDIFF(MI, w.startTime, cast (wf.login as time)) < 0
then cast(CAST (wf.login AS date) AS datetime) + w.startTime
else wf.login
end roundedLogin,
case when w.overtime = 1 -- Round down to 15 minutes whenever finished
OR
-- Round down to 15 minutes if left more than ten minutes before endTime
DATEDIFF(MI, cast (wf.logout as time), dateadd (MI, -10, w.endTime)) > 0
then dateadd (MI, -(DATEPART (MI, wf.logout) % 15), wf.logout)
-- stop at endTime if overtime = 0 AND left job at appropriate time
else cast(CAST (wf.logout AS date) AS datetime) + w.endTime
end roundedLogout
from worker w
inner join wf
on w.ID = wf.workerID
There will be a problem with this approach. When you start to integrate the arithmetic into the original query, you will notice that you have to write the expressions evaluating roundedLogin and roundedLogout again to calculate billable hours. You cannot reuse an alias defined in the same scope, but you can create a derived table, a view, or even computed columns. A view returning the columns from workflows together with all the additional expressions would probably be the best option.
Using this view in other queries would simplify things by encapsulating logic at one place.
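To sanity-check the two rounding rules against the three test rows, here is a minimal sketch in plain Python. The function names and the `grace_minutes` parameter are made up for illustration; the logic follows the CASE expressions in the query above.

```python
from datetime import datetime, time, timedelta


def rounded_login(login, start_time):
    """Credit the scheduled start when the worker logs in early."""
    scheduled = datetime.combine(login.date(), start_time)
    return scheduled if login < scheduled else login


def rounded_logout(logout, end_time, overtime, grace_minutes=10):
    """Round down to the quarter hour, or credit the scheduled end time."""
    scheduled = datetime.combine(logout.date(), end_time)
    left_early = logout < scheduled - timedelta(minutes=grace_minutes)
    if overtime or left_early:
        # % 15 gives the minutes past the last quarter hour; subtract them.
        return logout - timedelta(minutes=logout.minute % 15)
    return scheduled


start, end = time(8, 30), time(16, 30)
print(rounded_login(datetime(2012, 3, 11, 8, 20), start))
print(rounded_logout(datetime(2012, 3, 11, 19, 33), end, overtime=True))
print(rounded_logout(datetime(2012, 3, 11, 16, 20), end, overtime=False))
print(rounded_logout(datetime(2012, 3, 11, 16, 18), end, overtime=False))
```

Keeping each rule in one named function is the same design choice as the view: the billable-hours calculation can reuse the rounded values without repeating the expressions.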