My salary table looks like this,
employeeId Salary salaryEffectiveFrom
19966 10000.00 2022-07-01
19966 20000.00 2022-07-15
My role/grades table looks like this,
employeeId grade roleEffectiveFrom
19966 grade 3 2022-07-01
19966 grade 2 2022-07-10
I am trying to work out the salary paid for each grade, taking into account the effective dates in both tables.
Grade 3 is effective from 1-July-2022. Grade 2 is effective from 10-July-2022 onwards, which implies grade 3 is effective until 9-July-2022, i.e. 9 days.
A salary of 10000 is effective from 1-July-2022 until 14-July-2022, because the salary of 20000 is effective from the 15th. Therefore grade 3 had a salary of 10000 for 9 days, grade 2 had a salary of 10000 for 5 days (the 10th to the 14th), and grade 2 has a salary of 20000 from the 15th onwards. The role effectiveFrom date takes precedence over the salary effectiveFrom date.
This query,
SELECT er.employeeId,
es.salary,
`grade`,
date(er.effectiveFrom) roleEffectiveFrom,
date(es.effectiveFrom) salaryEffectiveFrom,
DATEDIFF(LEAST(COALESCE(LEAD(er.effectiveFrom)
OVER (PARTITION BY er.employeeId ORDER By er.effectiveFrom),
DATE_ADD(LAST_DAY(er.effectiveFrom),INTERVAL 1 DAY)),
DATE_ADD(LAST_DAY(er.effectiveFrom),INTERVAL 1 DAY)),
er.effectiveFrom) as '#Days' ,
ROUND((salary * 12) / 365, 2) dailyRate
FROM EmployeeRole er
join EmployeeSalary es ON (es.employeeId = er.employeeId)
and er.employeeId = 19966
;
gives me the result set shown below,
employeeId Salary grade roleEffectiveFrom salaryEffectiveFrom Days dailyRate
19966 10000.00 grade 3 2022-07-01 2022-07-01 0 328.77
19966 20000.00 grade 3 2022-07-01 2022-07-15 9 657.53
19966 10000.00 grade 2 2022-07-10 2022-07-01 0 328.77
19966 20000.00 grade 2 2022-07-10 2022-07-15 22 657.53
Grade 3 is effective for 9 days in July, so I want the total salary for those 9 days as a separate column, computed from the daily rate: 328.77 * 9 = 2958.93. I am unable to do this because the day counts land on the wrong rows, i.e. 9 should be the result for the first row.
dbfiddle
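Merge the effective dates from the two tables, use LEAD to find where each period ends, then use correlated subqueries to look up the grade and salary in effect. The first query below closes the last period at now(); the second closes it at the end of the month of the latest effective date.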
with cte as
(
SELECT employeeid,effectivefrom from EMPLOYEEROLE
union
select employeeid,effectivefrom from employeesalary
)
,cte1 as
(select employeeid,effectivefrom,
coalesce(
date_sub(lead(effectivefrom) over (partition by employeeid order by effectivefrom),interval 1 day) ,
now()) nexteff
from cte
)
select *,
datediff(nexteff,effectivefrom) + 1 diff,
(select grade from employeerole e where e.effectivefrom <= cte1.effectivefrom order by e.effectivefrom desc limit 1) grade,
(select salary from employeesalary e where e.effectivefrom <= cte1.nexteff order by e.effectivefrom desc limit 1) salary
from cte1;
+------------+---------------------+---------------------+------+---------+--------+
| employeeid | effectivefrom | nexteff | diff | grade | salary |
+------------+---------------------+---------------------+------+---------+--------+
| 19966 | 2022-07-01 00:00:00 | 2022-07-09 00:00:00 | 9 | grade 3 | 10000 |
| 19966 | 2022-07-10 00:00:00 | 2022-07-14 00:00:00 | 5 | grade 2 | 10000 |
| 19966 | 2022-07-15 00:00:00 | 2022-10-08 08:51:49 | 86 | grade 2 | 20000 |
+------------+---------------------+---------------------+------+---------+--------+
3 rows in set (0.003 sec)
with cte as
(
SELECT employeeid,effectivefrom from EMPLOYEEROLE
union
select employeeid,effectivefrom from employeesalary
)
,cte1 as
(select cte.employeeid,effectivefrom,
coalesce(
date_sub(lead(effectivefrom) over (partition by employeeid order by effectivefrom),interval 1 day) ,
last_day(maxdt)) nexteff
from cte
JOIN (select cte.employeeid,max(effectivefrom) maxdt from cte group by employeeid) c1
on c1.employeeid = cte.employeeid
)
select *,
datediff(nexteff,effectivefrom) + 1 diff,
(select grade from employeerole e where e.effectivefrom <= cte1.effectivefrom order by e.effectivefrom desc limit 1) grade,
(select salary from employeesalary e where e.effectivefrom <= cte1.nexteff order by e.effectivefrom desc limit 1) salary
from cte1;
+------------+---------------------+---------------------+------+---------+--------+
| employeeid | effectivefrom | nexteff | diff | grade | salary |
+------------+---------------------+---------------------+------+---------+--------+
| 19966 | 2022-07-01 00:00:00 | 2022-07-09 00:00:00 | 9 | grade 3 | 10000 |
| 19966 | 2022-07-10 00:00:00 | 2022-07-14 00:00:00 | 5 | grade 2 | 10000 |
| 19966 | 2022-07-15 00:00:00 | 2022-07-31 00:00:00 | 17 | grade 2 | 20000 |
+------------+---------------------+---------------------+------+---------+--------+
3 rows in set (0.004 sec)
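As a cross-check of the logic outside the database, here is a minimal Python sketch of the same idea: merge the boundary dates, end each period the day before the next boundary, and look up the values in effect. The function names and the month-end cutoff are assumptions made for illustration; the data is the sample from the question.

```python
from datetime import date, timedelta

# Sample rows from the question: (effective_from, value), sorted by date.
roles = [(date(2022, 7, 1), "grade 3"), (date(2022, 7, 10), "grade 2")]
salaries = [(date(2022, 7, 1), 10000), (date(2022, 7, 15), 20000)]
period_end = date(2022, 7, 31)  # assumption: close the last period at month end


def in_effect(history, day):
    """Return the latest value whose effective date is on or before day."""
    current = None
    for effective_from, value in history:
        if effective_from <= day:
            current = value
    return current


def split_periods(roles, salaries, period_end):
    """Merge both sets of effective dates and emit one row per sub-period."""
    starts = sorted({d for d, _ in roles} | {d for d, _ in salaries})
    rows = []
    for i, start in enumerate(starts):
        # Each period ends the day before the next boundary (the LEAD step).
        if i + 1 < len(starts):
            end = starts[i + 1] - timedelta(days=1)
        else:
            end = period_end
        days = (end - start).days + 1
        rows.append((start, end, days, in_effect(roles, start), in_effect(salaries, start)))
    return rows


for row in split_periods(roles, salaries, period_end):
    print(row)
```

For the sample data this yields three periods of 9, 5 and 17 days, matching the second result set above.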
I think if it were me, I'd generate a list containing an entry for each day with the effective grade and salary, and then just aggregate at the end. Take a look at this fiddle:
https://dbfiddle.uk/4t2RW2M2
I've started with the aggregate query, just so we can see the output, then I break out pieces of the query to show intermediate outputs. Here is the query that generates the final output:
SELECT grade, gradeEffective, salary, salaryEffective,
min(dt) as startsOn, max(dt) as endsOn, count(*) as days,
dailyRate,
sum(dailyRate) as pay
FROM (
SELECT DISTINCT dt, grade, gradeEffective, salary, salaryEffective,
ROUND((salary * 12) / 365, 2) as dailyRate
FROM (
SELECT dts.dt,
first_value(r.grade) OVER w as grade,
first_value(r.effectiveFrom) OVER w as gradeEffective,
first_value(s.salary) OVER w as salary,
first_value(s.effectiveFrom) OVER w as salaryEffective
FROM (
WITH RECURSIVE dates(n) AS (SELECT 0 UNION SELECT n + 1 FROM dates WHERE n + 1 <= 30)
SELECT '2022-07-01' + INTERVAL n DAY as dt FROM dates
) dts
LEFT JOIN EmployeeSalary s ON dts.dt >= s.effectiveFrom
LEFT JOIN EmployeeRole r on dts.dt >= r.effectiveFrom
WINDOW w AS (
PARTITION BY dts.dt
ORDER BY r.effectiveFrom DESC, s.effectiveFrom DESC
ROWS UNBOUNDED PRECEDING
)
) z
) a GROUP BY grade, gradeEffective, salary, salaryEffective, dailyRate
ORDER BY min(dt);
Now, the first thing I've done is create a list of dates using a recursive CTE:
WITH RECURSIVE dates(n) AS (SELECT 0 UNION SELECT n + 1 FROM dates WHERE n + 1 <= 30)
SELECT '2022-07-01' + INTERVAL n DAY as dt FROM dates
which produces a list of dates from July 1st to July 31st.
Take that list of dates and left join both of your tables to it, like so:
SELECT *
FROM (
WITH RECURSIVE dates(n) AS (SELECT 0 UNION SELECT n + 1 FROM dates WHERE n + 1 <= 30)
SELECT '2022-07-01' + INTERVAL n DAY as dt FROM dates
) dts
LEFT JOIN EmployeeSalary s ON dts.dt >= s.effectiveFrom
LEFT JOIN EmployeeRole r on dts.dt >= r.effectiveFrom
with the dt greater than or equal to the effective dates. Notice that after the 9th you start to get duplicate rows for each date.
We'll create a window to get the first values for grade and salary for each date, and we'll order first by role effectiveFrom and then salary effectiveFrom, to fulfil your priority condition.
SELECT dts.dt,
first_value(r.grade) OVER w as grade,
first_value(r.effectiveFrom) OVER w as gradeEffective,
first_value(s.salary) OVER w as salary,
first_value(s.effectiveFrom) OVER w as salaryEffective
FROM (
WITH RECURSIVE dates(n) AS (SELECT 0 UNION SELECT n + 1 FROM dates WHERE n + 1 <= 30)
SELECT '2022-07-01' + INTERVAL n DAY as dt FROM dates
) dts
LEFT JOIN EmployeeSalary s ON dts.dt >= s.effectiveFrom
LEFT JOIN EmployeeRole r on dts.dt >= r.effectiveFrom
WINDOW w AS (
PARTITION BY dts.dt
ORDER BY r.effectiveFrom DESC, s.effectiveFrom DESC
ROWS UNBOUNDED PRECEDING
);
This is still going to leave us multiple entries for some dates, although they are duplicates, so let's use that output in a new query, using DISTINCT to leave us only one copy of each row and using the opportunity to add the daily rate field:
SELECT DISTINCT dt, grade, gradeEffective, salary, salaryEffective,
ROUND((salary * 12) / 365, 2) as dailyRate
FROM (
SELECT dts.dt,
first_value(r.grade) OVER w as grade,
first_value(r.effectiveFrom) OVER w as gradeEffective,
first_value(s.salary) OVER w as salary,
first_value(s.effectiveFrom) OVER w as salaryEffective
FROM (
WITH RECURSIVE dates(n) AS (SELECT 0 UNION SELECT n + 1 FROM dates WHERE n + 1 <= 30)
SELECT '2022-07-01' + INTERVAL n DAY as dt FROM dates
) dts
LEFT JOIN EmployeeSalary s ON dts.dt >= s.effectiveFrom
LEFT JOIN EmployeeRole r on dts.dt >= r.effectiveFrom
WINDOW w AS (
PARTITION BY dts.dt
ORDER BY r.effectiveFrom DESC, s.effectiveFrom DESC
ROWS UNBOUNDED PRECEDING
)
) z;
This produces the deduplicated daily data, and now all we have to do is use aggregation to pull out the sums for each combination of grade and salary, which is the query that I started off with.
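To sanity-check the numbers this approach should produce, here is a small Python sketch of the same per-day expansion followed by aggregation. It is only an illustration under stated assumptions: the function names are made up, the data is the July sample from the question, and every day is assumed to have a salary in effect. Decimal is used so the rounding matches ROUND((salary * 12) / 365, 2).

```python
from datetime import date, timedelta
from decimal import Decimal, ROUND_HALF_UP

roles = [(date(2022, 7, 1), "grade 3"), (date(2022, 7, 10), "grade 2")]
salaries = [(date(2022, 7, 1), Decimal("10000.00")), (date(2022, 7, 15), Decimal("20000.00"))]


def latest(history, day):
    """Pick the value with the greatest effective date that is <= day."""
    candidates = [(d, v) for d, v in history if d <= day]
    return max(candidates, key=lambda item: item[0])[1] if candidates else None


def daily_rate(salary):
    """Same formula as the query: ROUND((salary * 12) / 365, 2)."""
    return (salary * 12 / 365).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def pay_by_grade_and_salary(first_day, num_days):
    totals = {}  # dicts keep insertion order, so groups stay in date order
    for n in range(num_days):
        day = first_day + timedelta(days=n)  # one entry per calendar day
        grade = latest(roles, day)
        salary = latest(salaries, day)
        days, pay = totals.get((grade, salary), (0, Decimal("0")))
        totals[(grade, salary)] = (days + 1, pay + daily_rate(salary))
    return totals


for (grade, salary), (days, pay) in pay_by_grade_and_salary(date(2022, 7, 1), 31).items():
    print(grade, salary, days, pay)
```

With the sample data, grade 3 at 10000 covers 9 days for 2958.93, grade 2 at 10000 covers 5 days for 1643.85, and grade 2 at 20000 covers 17 days for 11178.01.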
Let me know if this is what you were looking for, or if anything is unclear.
Since the start and end conditions weren't fleshed out in the question, I just created the date list arbitrarily. It's not difficult to generate the list based on the first effectiveFrom in both tables; here is an example that runs from that start date until the current date:
WITH RECURSIVE dates(n) AS (
SELECT min(effectiveFrom) FROM (
select effectiveFrom from EmployeeRole UNION
select effectiveFrom from EmployeeSalary
) z
UNION SELECT n + INTERVAL 1 DAY FROM dates WHERE n + INTERVAL 1 DAY <= now()
)
SELECT n as dt FROM dates
I also didn't handle multiple employees, since only one was given and I would just be guessing at the shape of the actual data.
You can start by adding two new columns (tmpFrom and tmpTo), which give the dates needed to calculate the 9 days.
SELECT
er.employeeId,
es.salary,
`grade`,
date(er.effectiveFrom) roleEffectiveFrom,
date(es.effectiveFrom) salaryEffectiveFrom,
DATEDIFF(LEAST(COALESCE(LEAD(er.effectiveFrom)
OVER (PARTITION BY er.employeeId ORDER By er.effectiveFrom),
DATE_ADD(LAST_DAY(er.effectiveFrom),INTERVAL 1 DAY)),
DATE_ADD(LAST_DAY(er.effectiveFrom),INTERVAL 1 DAY)),
er.effectiveFrom) as '#Days' ,
ROUND((salary * 12) / 365, 2) dailyRate,
date(er.effectiveFrom) tmpFrom,
(select e2.effectiveFrom
from EmployeeRole e2
where e2.employeeId = er.employeeId and e2.effectiveFrom > er.effectiveFrom
order by e2.effectiveFrom
limit 1) as tmpTo
FROM EmployeeRole er
join EmployeeSalary es ON (es.employeeId = er.employeeId)
and er.employeeId = 19966
order by er.effectiveFrom
;
In the query above I used a sub-select, which might hurt performance. You can look into window functions and check whether one of them suits your needs better than this sub-query.
It's up to you to calculate the number of days between those two columns, but you also need to handle the NULL value in tmpTo, which should become the end of the month (though I am not sure I remember your problem correctly).
see: DBFIDDLE
Let's say I have the following table:
date | name | value
----------------------------
2020-09-01 | name1 | 10
2020-09-02 | name1 | 9
2020-09-03 | name1 | 12
2020-09-04 | name1 | 11
2020-09-05 | name1 | 11
I would like to identify names where the latest value >= 10 AND where over the last 5 days it has ever dropped below 10. In the example table above, name1 would be returned because the latest date has a value of 11 (which is > 10), and over the last 5 days it has dropped below 10 at least once.
Here is my SELECT statement, but it always returns zero rows:
SELECT
name,
count(value) as count
FROM table_name
WHERE
(date = @date AND value >= 10) AND
date BETWEEN date_sub(@date, interval 5 day) AND @date AND value < 10
GROUP BY name
HAVING count < 5
ORDER BY name
I understand why it's failing, but I don't know what to change.
In MySQL 8.0, you could use window functions and aggregation:
select name
from (
select t.*, row_number() over(partition by name order by date desc) rn
from mytable t
where date >= @date - interval 5 day and date <= @date
) t
group by name
having max(case when rn = 1 then value end) >= 10 and min(value) < 10
How about something like this:
SELECT Name, COUNT(*) AS Ct FROM
(SELECT A.*,B.mdate,
CASE WHEN A.date=B.mdate AND A.value >= 10 THEN 1
WHEN A.date >= B.mdate - INTERVAL 5 DAY AND A.date <> B.mdate AND A.value < 10 THEN 1
ELSE 0 END AS Chk
FROM table_name A
JOIN (SELECT Name,MAX(DATE) AS mdate FROM table_name GROUP BY Name) B ON A.Name=B.Name
HAVING Chk <> 0) V
GROUP BY Name
HAVING Ct >= 2 AND MAX(V.date = V.mdate) = 1
Here's a fiddle for reference: https://www.db-fiddle.com/f/jX4GktCdTrUbqHBf7ZQwdr/0
And here's a breakdown of what the query above is doing:
Join table_name with a sub-query on the same table that returns the MAX(DATE) per name, for comparison.
Use a CASE expression to check your conditions; a row that matches returns 1, otherwise 0. The HAVING clause excludes the rows where the CASE expression returned 0.
Turn that query into a sub-query (aliased as V), COUNT(*) the matching rows per name, then use HAVING again to keep any name with 2 or more matches, one of which must be the row for the latest date (otherwise two earlier values below 10 would qualify on their own).
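For comparison, the condition itself can be restated directly in a few lines of Python, checking the two parts separately instead of counting matches. This is only a sketch: the function name, its parameters and the choice to anchor the 5-day window on each name's latest date are assumptions; the rows are the sample from the question.

```python
from datetime import date, timedelta

# Sample rows from the question: (date, name, value).
rows = [
    (date(2020, 9, 1), "name1", 10),
    (date(2020, 9, 2), "name1", 9),
    (date(2020, 9, 3), "name1", 12),
    (date(2020, 9, 4), "name1", 11),
    (date(2020, 9, 5), "name1", 11),
]


def names_recovered(rows, threshold=10, window_days=5):
    """Names whose latest value is >= threshold but dipped below it in the window."""
    by_name = {}
    for day, name, value in rows:
        by_name.setdefault(name, []).append((day, value))

    result = []
    for name, history in sorted(by_name.items()):
        latest_day, latest_value = max(history)  # most recent row for this name
        window_start = latest_day - timedelta(days=window_days)
        # Any earlier row inside the window with a value below the threshold?
        dipped = any(
            window_start <= day < latest_day and value < threshold
            for day, value in history
        )
        if latest_value >= threshold and dipped:
            result.append(name)
    return result


print(names_recovered(rows))  # ['name1']
```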
SELECT detailsID,`Topic 1 Scores`, MAX(Date) as "Date"
FROM Information.scores
WHERE `Topic 1 Scores` IS NOT NULL
GROUP BY `detailsID`,`Topic 1 Scores`
This prints:
detailsID, Topic 1 Scores, MAX(Date)
2 0 26/09/2017
2 45 26/09/2017
2 100 26/09/2017
3 30 25/09/2017
3 80 14/10/2017
Rather than actually selecting the most recent date per detailsID which would be:
2 100 26/09/2017
3 80 14/10/2017
I want to retrieve the most recent non-null Topic 1 Scores value (by date) for each detailsID. There are only detailsID 2 and 3 here, so only two rows should be returned.
Solution 1 attempt
Inner subquery
You can do this:
SELECT t1.detailsID, t1.`Topic 1 Scores`, t1.date
FROM scores as t1
INNER JOIN
(
SELECT detailsID, MAX(date) as "LatestDate"
FROM scores
WHERE `Topic 1 Scores` IS NOT NULL
GROUP BY `detailsID`
) AS t2 ON t1.detailsID = t2.detailsID AND t1.date = t2.LatestDate
Demo
The subquery will give you the most recent date for each detailsID then in the outer query, there is a join with the original table to eliminate all the rows except those with the most recent date.
Update:
Some rows share the same latest date, which is why you get multiple rows with the same date and the same detailsID. To solve this, aggregate the score as well, so that you get only one row for each detailsID with the latest date and the highest score on that date:
SELECT t1.detailsID, MAX(t1.`Topic 1 Scores`) AS `Topic 1 Scores`, t1.date
FROM scores as t1
INNER JOIN
(
SELECT detailsID, MAX(date) as "LatestDate"
FROM scores
WHERE `Topic 1 Scores` IS NOT NULL
GROUP BY `detailsID`
) AS t2 ON t1.detailsID = t2.detailsID
AND t1.date = t2.LatestDate
-- take the max among rows on the latest date only; a separate MAX(score)
-- in the subquery could come from an older date and match no row
GROUP BY t1.detailsID, t1.date
updated demo
Results:
| detailsID | Topic 1 Scores | date |
|-----------|----------------|------------|
| 2 | 100 | 2017-09-26 |
| 3 | 80 | 2017-10-14 |
WITH MYCTE AS
(
SELECT detailsID, `Topic 1 Scores`, Date,
ROW_NUMBER() OVER (PARTITION BY detailsID ORDER BY Date DESC, `Topic 1 Scores` DESC) AS Num
FROM scores
WHERE `Topic 1 Scores` IS NOT NULL
)
SELECT * FROM MYCTE WHERE Num = 1;
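The rule all of these queries aim for (one row per detailsID, latest date first, highest score as the tie-breaker) can be sketched in Python as a sanity check. The raw rows below are an assumption, reconstructed from the grouped output in the question, and the function name is made up for illustration.

```python
from datetime import date

# Assumed raw rows (detailsID, score, date), reconstructed from the grouped
# output in the question; the real table may hold more rows.
scores = [
    (2, 0, date(2017, 9, 26)),
    (2, 45, date(2017, 9, 26)),
    (2, 100, date(2017, 9, 26)),
    (3, 30, date(2017, 9, 25)),
    (3, 80, date(2017, 10, 14)),
]


def latest_score_per_id(rows):
    """Keep one row per detailsID: latest date first, highest score on ties."""
    best = {}
    for details_id, score, day in rows:
        if score is None:  # mirrors WHERE `Topic 1 Scores` IS NOT NULL
            continue
        key = (day, score)
        if details_id not in best or key > best[details_id]:
            best[details_id] = key
    return [(i, s, d) for i, (d, s) in sorted(best.items())]


for row in latest_score_per_id(scores):
    print(row)
```

For the assumed rows this returns score 100 on 2017-09-26 for detailsID 2 and score 80 on 2017-10-14 for detailsID 3.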
I am working with MySQL. I am trying to get the nights of a booking that belong to each interval in a group of date intervals. Some intervals are preferred over others, so I want to assign as many nights as possible to the preferred intervals and fill the gaps with the non-preferred intervals. To illustrate:
Given the dates:
check in => 2016-01-16
check out => 2016-02-08
total nights => 24
Preferred | date_from | date_to | Nights
----------------------------------------------------
1 | 2016-01-15 | 2016-01-17 | 2
1 | 2016-02-03 | 2016-02-10 | 6
1 | 2016-01-20 | 2016-01-25 | 6
0 | 2016-01-20 | 2016-01-31 | 2 (2016-01-26 and 2016-01-31 because the other nights are covered by a preferred period)
1 | 2016-01-27 | 2016-01-30 | 4
0 | 2016-01-15 | 2016-01-17 | 0 (these dates are covered by the first interval, which is a preferred interval)
0 | 2016-02-01 | 2016-02-10 | 2 (just 2016-02-01 and 2016-02-02 because 03 - 08 are covered by the second interval which is a preferred interval)
0 | 2016-01-18 | 2016-01-19 | 2
How can I achieve this in MySQL?
Assuming you have a table with the columns preferred, date_from and date_to, and you are just trying to calculate the number of nights, you can try this query.
SET @checkin = '2016-01-16';
SET @checkout = '2016-02-08';
SELECT T0.preferred,T0.date_from,T0.date_to,IFNULL(NIGHTS.nights,0) as Nights
FROM YourTable T0
LEFT JOIN
(SELECT T1.preferred,T1.date_from,T1.date_to,COUNT(*) AS Nights
FROM YourTable AS T1
INNER JOIN
(SELECT (@checkin + INTERVAL n DAY) as singleday
FROM numbers
WHERE (@checkin + INTERVAL n DAY) <= @checkout)DAYS1
ON DAYS1.singleday BETWEEN T1.date_from AND T1.date_to
WHERE T1.preferred = 1
OR NOT EXISTS
(SELECT 1
FROM YourTable AS T
WHERE T.preferred = 1
AND DAYS1.singleday BETWEEN T.date_from AND T.date_to
)
GROUP BY T1.preferred,T1.date_from,T1.date_to
)NIGHTS
ON T0.preferred = NIGHTS.preferred
AND T0.date_from = NIGHTS.date_from
AND T0.date_to = NIGHTS.date_to
WHERE
T0.date_from <= @checkout
AND T0.date_to >= @checkin
;
http://sqlfiddle.com/#!9/d64344/10
You can replace the @checkout and @checkin occurrences with your actual check-in and check-out dates, and the YourTable occurrences with your actual table name.
In the sqlfiddle I have also included a table called numbers with a column n that contains the numbers from 0 counting upward to the maximum possible number of days of a stay. You need to create this table as well.
To create the numbers table, use the statement below:
CREATE TABLE numbers AS
SELECT a.n+b.n+c.n+d.n+e.n+f.n+g.n+h.n+i.n as n
FROM
(SELECT 0 as n UNION SELECT 1)a,
(SELECT 0 as n UNION SELECT 2)b,
(SELECT 0 as n UNION SELECT 4)c,
(SELECT 0 as n UNION SELECT 8)d,
(SELECT 0 as n UNION SELECT 16)e,
(SELECT 0 as n UNION SELECT 32)f,
(SELECT 0 as n UNION SELECT 64)g,
(SELECT 0 as n UNION SELECT 128)h,
(SELECT 0 as n UNION SELECT 256)i;
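To see why this statement produces a gap-free sequence, here is a short Python sketch of the same construction (an illustration only, not part of the fiddle): each derived table contributes either 0 or one power of two, and the comma joins form a cross product of all nine choices.

```python
from itertools import product

# Each derived table a..i contributes either 0 or one power of two.
choices = [(0, 2 ** bit) for bit in range(9)]  # (0, 1), (0, 2), ..., (0, 256)

# The comma joins form a cross product; summing each combination gives n.
numbers = sorted(sum(combo) for combo in product(*choices))

print(len(numbers), numbers[0], numbers[-1])  # 512 0 511
```

Every integer from 0 to 511 appears exactly once, so the table supports stays of up to 512 days.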
Explanation of the query:
1) The subquery DAYS1 returns every single date in the @checkin to @checkout range.
2) T1 is joined with DAYS1 where preferred is 1, or where no preferred row exists that covers the DAYS1 date.
3) Then we do a COUNT(*) with GROUP BY preferred, date_from, date_to to get the count of single days.
4) Then we call our result NIGHTS.
5) Then T0 is LEFT JOINed with NIGHTS so that rows with 0 nights are returned too.
6) Finally, only T0 rows that intersect our @checkin/@checkout range are returned.
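The steps above can be sketched in Python to check the expected counts against the table in the question. This is a rough illustration under assumptions: the function names are made up, and the sketch skips the final range filter of step 6 because every sample row already intersects the stay.

```python
from datetime import date, timedelta

checkin, checkout = date(2016, 1, 16), date(2016, 2, 8)

# (preferred, date_from, date_to) rows from the question.
intervals = [
    (1, date(2016, 1, 15), date(2016, 1, 17)),
    (1, date(2016, 2, 3), date(2016, 2, 10)),
    (1, date(2016, 1, 20), date(2016, 1, 25)),
    (0, date(2016, 1, 20), date(2016, 1, 31)),
    (1, date(2016, 1, 27), date(2016, 1, 30)),
    (0, date(2016, 1, 15), date(2016, 1, 17)),
    (0, date(2016, 2, 1), date(2016, 2, 10)),
    (0, date(2016, 1, 18), date(2016, 1, 19)),
]


def nights_per_interval(intervals, checkin, checkout):
    # Step 1: every single day of the stay (the DAYS1 subquery).
    stay = [checkin + timedelta(days=n) for n in range((checkout - checkin).days + 1)]

    def covered_by_preferred(day):
        return any(p == 1 and start <= day <= end for p, start, end in intervals)

    counts = []
    for preferred, start, end in intervals:
        # Steps 2-3: count a day if the row is preferred, or if no preferred
        # row covers that day (the NOT EXISTS test).
        nights = sum(
            1
            for day in stay
            if start <= day <= end and (preferred == 1 or not covered_by_preferred(day))
        )
        counts.append(nights)  # step 5: rows with 0 nights are kept
    return counts


print(nights_per_interval(intervals, checkin, checkout))  # [2, 6, 6, 2, 4, 0, 2, 2]
```

The counts add up to the 24 nights of the stay, in the same row order as the table in the question.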
UPDATE: If your table is too large, you can narrow down the subqueries to only the rows you are interested in, like this:
SET @checkin = '2016-01-16';
SET @checkout = '2016-02-08';
SELECT T0.preferred,T0.date_from,T0.date_to,IFNULL(NIGHTS.nights,0) as Nights
FROM (SELECT * FROM YourTable WHERE date_from <= @checkout AND date_to >= @checkin) T0
LEFT JOIN
(SELECT T1.preferred,T1.date_from,T1.date_to,COUNT(*) AS Nights
FROM (SELECT * FROM YourTable WHERE date_from <= @checkout AND date_to >= @checkin) AS T1
INNER JOIN
(SELECT (@checkin + INTERVAL n DAY) as singleday
FROM numbers
WHERE (@checkin + INTERVAL n DAY) <= @checkout)DAYS1
ON DAYS1.singleday BETWEEN T1.date_from AND T1.date_to
WHERE T1.preferred = 1
OR NOT EXISTS
(SELECT 1
FROM (SELECT * FROM YourTable WHERE date_from <= @checkout AND date_to >= @checkin) AS T
WHERE T.preferred = 1
AND DAYS1.singleday BETWEEN T.date_from AND T.date_to
)
GROUP BY T1.preferred,T1.date_from,T1.date_to
)NIGHTS
ON T0.preferred = NIGHTS.preferred
AND T0.date_from = NIGHTS.date_from
AND T0.date_to = NIGHTS.date_to
;
I have a table with numbers and dates (1 number each date and dates aren't necessarily at regular intervals).
I would like to get the count of dates when a number isn't in the table.
Where I am :
select *
from
(
select
date from nums
where chiffre=1
order by date desc
limit 2
) as f
I get this :
date
--------------
2014-09-07
--------------
2014-07-26
Basically, I want to build this query dynamically:
select * from nums where date between "2014-07-26" and "2014-09-07"
And then, as a second step, walk through the whole table (above I limited the query to the first 2 rows, but I want to compare rows 2 and 3, rows 3 and 4, etc.).
The goal is to get this:
date | actual_number_of_real_dates_between_two_given_dates
2014-09-07 - 2014-07-26 | 20
2014-04-02 - 2014-02-12 | 13
etc...
How can I do this? Thanks.
Edit:
What I have (just an example, dates and "chiffre" are more complex) :
date | chiffre
2014-09-30 | 2
2014-09-29 | 1
2014-09-28 | 2
2014-09-27 | 2
2014-09-26 | 1
2014-09-25 | 2
2014-09-24 | 2
etc...
What I need for the number "1":
actual_number_of_real_dates_between_two_given_dates
1
3
etc...
Edit 2:
My updated query thanks to Gordon Linoff
select count(n.id) as difference
from nums n inner join
(select min(date) as d1, max(date) as d2
from (select date from nums where chiffre=1 order by date desc limit 2) d
) dd
where n.date between dd.d1 and dd.d2
How can I test row 2 with 3? 3 with 4 etc... Not only last 2?
Should I use a loop? Or I can do it without?
Does this do what you want?
select count(distinct n.date) as numDates,
(datediff(dd.d2, dd.d1) + 1) as datesInPeriod,
(datediff(dd.d2, dd.d1) + 1 - count(distinct n.date)) as missingDates
from nums n cross join
(select date('2014-07-26') as d1, date('2014-09-07') as d2) dd
where n.date between dd.d1 and dd.d2;
EDIT:
If you just want the last two dates:
select count(distinct n.date) as numDates,
(datediff(dd.d2, dd.d1) + 1) as datesInPeriod,
(datediff(dd.d2, dd.d1) + 1 - count(distinct n.date)) as missingDates
from nums n cross join
(select min(date) as d1, max(date) as d2
from (select date from nums order by date desc limit 2) d
) dd
where n.date between dd.d1 and dd.d2;
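The arithmetic behind the three columns can be checked with a short Python sketch. The dates below are hypothetical, not taken from the question, and the function name is made up for illustration.

```python
from datetime import date

# Hypothetical dates present in the table; not taken from the question.
present = [date(2014, 7, 26), date(2014, 7, 28), date(2014, 7, 28), date(2014, 7, 31)]


def date_gap_summary(dates, d1, d2):
    """Return (numDates, datesInPeriod, missingDates) for the range d1..d2."""
    num_dates = len({d for d in dates if d1 <= d <= d2})  # count(distinct n.date)
    dates_in_period = (d2 - d1).days + 1                  # datediff(d2, d1) + 1
    return num_dates, dates_in_period, dates_in_period - num_dates


print(date_gap_summary(present, date(2014, 7, 26), date(2014, 7, 31)))  # (3, 6, 3)
```

Here three distinct dates fall inside a six-day range, so three dates are missing.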