How to combine overlapping date ranges in MySQL? - mysql

I have a table "lessons_holidays". It can contain several holiday date ranges, e.g.
"holiday_begin" "holiday_end"
2016-06-15 2016-06-15
2016-06-12 2016-06-16
2016-06-15 2016-06-19
2016-06-29 2016-06-29
I would like to combine entries whose date ranges overlap, so that the output looks like this:
"holiday_begin" "holiday_end"
2016-06-12 2016-06-19
2016-06-29 2016-06-29
The following SQL statement loads all rows, but now I am stuck.
SELECT lh1.holiday_begin, lh1.holiday_end
FROM local_lessons_holidays lh1
WHERE lh1.holiday_impact = '1' AND
(DATE_FORMAT(lh1.holiday_begin, '%Y-%m') <= '2016-06' AND DATE_FORMAT(lh1.holiday_end, '%Y-%m') >= '2016-06') AND
lh1.uid = '1'

This is a hard problem, made harder because you are using MySQL. Here is an idea. Find the beginning date of all holidays in each group. Then a cumulative sum of the flag for the "beginning" handles the groups. The rest is just aggregation.
The following should do what you want, assuming you have no duplicate records:
select min(holiday_begin) as holiday_begin, max(holiday_end) as holiday_end
from (select lh.*, (@grp := @grp + group_flag) as grp
      from (select lh.*,
                   (case when not exists (select 1
                                          from lessons_holidays lh2
                                          where lh2.holiday_begin <= lh.holiday_end and
                                                lh2.holiday_end > lh.holiday_begin and
                                                (lh2.holiday_begin <> lh.holiday_begin or
                                                 lh2.holiday_end < lh.holiday_end)
                                         )
                         then 1 else 0
                    end) as group_flag
            from lessons_holidays lh
           ) lh cross join
           (select @grp := 0) params
      order by holiday_begin, holiday_end
     ) lh
group by grp;
If you have duplicates, just use select distinct on the innermost references to the table.
Here is a SQL Fiddle.
EDIT:
This version appears to work better:
select min(holiday_begin) as holiday_begin, max(holiday_end) as holiday_end
from (select lh.*, (@grp := @grp + group_flag) as grp
      from (select lh.*,
                   (case when not exists (select 1
                                          from lessons_holidays lh2
                                          where lh2.holiday_begin <= lh.holiday_begin and
                                                lh2.holiday_end > lh.holiday_begin and
                                                lh2.holiday_begin <> lh.holiday_begin and
                                                lh2.holiday_end <> lh.holiday_end
                                         )
                         then 1 else 0
                    end) as group_flag
            from lessons_holidays lh
           ) lh cross join
           (select @grp := 0) params
      order by holiday_begin, holiday_end
     ) lh
group by grp;
As illustrated here.
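The flag-and-cumulative-sum idea can also be sketched outside the database. The following is a minimal Python sketch, not the SQL from the answer: it assumes inclusive date ranges, treats two ranges as overlapping when they share at least one day, and detects the start of a group with a running maximum of holiday_end instead of the NOT EXISTS subquery. The names merge_ranges and holidays are made up for illustration.

```python
from datetime import date
from itertools import accumulate


def merge_ranges(ranges):
    """Merge overlapping (begin, end) date ranges; both ends are inclusive."""
    ordered = sorted(set(ranges))  # drop duplicates, order by begin, then end
    flags = []
    running_end = None
    for begin, end in ordered:
        # A range starts a new group if it begins after all earlier ranges ended.
        is_start = running_end is None or begin > running_end
        flags.append(1 if is_start else 0)
        running_end = end if running_end is None else max(running_end, end)
    groups = list(accumulate(flags))  # cumulative sum of the flag = group number
    merged = {}
    for grp, (begin, end) in zip(groups, ordered):
        lo, hi = merged.get(grp, (begin, end))
        merged[grp] = (min(lo, begin), max(hi, end))  # the aggregation step
    return [merged[g] for g in sorted(merged)]


# Sample rows from the question.
holidays = [
    (date(2016, 6, 15), date(2016, 6, 15)),
    (date(2016, 6, 12), date(2016, 6, 16)),
    (date(2016, 6, 15), date(2016, 6, 19)),
    (date(2016, 6, 29), date(2016, 6, 29)),
]
result = merge_ranges(holidays)
for begin, end in result:
    print(begin, end)
```

Under these assumptions, ranges that merely touch (one ends on the 16th, the next begins on the 17th) stay separate; changing the comparison to allow a one-day gap would merge them.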

Related

Select the distinct latest 2 rows based on the timestamp column

I have a table like the one below. I want to extract the latest (based on time) 2 rows that have the same id. If an id has no second row, nothing should be returned for it. Then I want to subtract the value of the second-latest row from the value of the latest row and return a table with the id and the resulting value.
The table is shown below: the first column is the id, the second the value, and the third the time. The id is neither a primary key nor unique.
Id value time
3 2 2019-01-11 18:59:07.403
2 7 2019-01-10 18:58:40.400
4 5 2019-01-12 18:58:42.400
2 2 2019-01-11 18:59:23.147
5 -5 2019-01-12 18:58:42.400
3 8 2019-01-12 18:59:27.670
2 5 2019-01-12 18:59:43.777
The result should be
id value
2 3
3 6
One possible solution uses aggregation to get the IDs which occur more than once and correlated subqueries with ORDER BY and LIMIT to get the latest and second latest value.
SELECT x.id,
(SELECT t.value
FROM elbat t
WHERE t.id = x.id
ORDER BY t.time DESC
LIMIT 0, 1)
-
(SELECT t.value
FROM elbat t
WHERE t.id = x.id
ORDER BY t.time DESC
LIMIT 1, 1) value
FROM (SELECT t.id
FROM elbat t
GROUP BY t.id
HAVING count(*) > 1) x;
In MySQL 8+, you can use window functions and conditional aggregation:
select t.id,
       sum(case when seqnum = 1 then value else - value end) as diff
from (select t.*,
             row_number() over (partition by id order by time desc) as seqnum
      from elbat t
     ) t
where seqnum in (1, 2)
group by id
having count(*) = 2; -- keep only ids that have at least two rows
The same idea can be adapted to earlier versions, using variables:
select t.id,
       sum(case when seqnum = 1 then value else - value end) as diff
from (select t.*,
             (@rn := if(@i = id, @rn + 1,
                        if(@i := id, 1, 1)
                       )
             ) as seqnum
      from (select t.* from elbat t order by id, time desc) t cross join
           (select @i := -1, @rn := 0) params
     ) t
where seqnum in (1, 2)
group by id
having count(*) = 2; -- keep only ids that have at least two rows
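As a cross-check of the expected result, the underlying logic (latest value minus second-latest value, only for ids with at least two rows) can be sketched in plain Python. This is an illustration of the logic rather than a replacement for the queries; the function name latest_diffs is hypothetical, and the sketch assumes the timestamps are ISO-formatted strings so that string order equals chronological order.

```python
from collections import defaultdict

# Sample rows from the question: (id, value, time)
rows = [
    (3, 2, "2019-01-11 18:59:07.403"),
    (2, 7, "2019-01-10 18:58:40.400"),
    (4, 5, "2019-01-12 18:58:42.400"),
    (2, 2, "2019-01-11 18:59:23.147"),
    (5, -5, "2019-01-12 18:58:42.400"),
    (3, 8, "2019-01-12 18:59:27.670"),
    (2, 5, "2019-01-12 18:59:43.777"),
]


def latest_diffs(rows):
    """Return {id: latest value - second-latest value} for ids with 2+ rows."""
    by_id = defaultdict(list)
    for id_, value, time in rows:
        by_id[id_].append((time, value))
    diffs = {}
    for id_, items in by_id.items():
        if len(items) < 2:
            continue  # ids that occur only once are skipped
        items.sort(reverse=True)  # newest first
        diffs[id_] = items[0][1] - items[1][1]
    return diffs


diffs = latest_diffs(rows)
print(diffs)
```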

Check if a user was "active" in multiple rows - MySQL

How would I go about creating group_ids in the following example based on the area(s) the users are active in?
group_id rep_id area datebegin dateend
1 1000 A 1/1/15 1/1/16
1 1000 B 1/1/15 1/1/16
2 1000 C 1/2/16 12/31/99
In the table you can see that rep 1000 was active in both A and B between 1/15 and 1/16. How would I go about coding the group_id field to group by datebegin & dateend?
Thanks for any help.
You can use variables in order to enumerate groups of records having identical rep_id, datebegin, dateend values:
SELECT rep_id, datebegin, dateend,
       @rn := IF(@rep_id <> rep_id,
                 IF(@rep_id := rep_id, 1, 1),
                 @rn + 1) AS rn
FROM (
   SELECT rep_id, datebegin, dateend
   FROM mytable
   GROUP BY rep_id, datebegin, dateend) AS t
CROSS JOIN (SELECT @rep_id := 0, @rn := 0) AS v
ORDER BY rep_id, datebegin
Output:
rep_id, datebegin, dateend, rn
-----------------------------------
1000, 2015-01-01, 2016-01-01, 1
1000, 2016-02-01, 2099-12-03, 2
You can use the above query as a derived table and join it back to the original table. The rn field is the group_id you are looking for.
You can use variables to assign groups. As you said, two rows belong to the same group only if their date_begin and date_end match exactly; otherwise a new group starts.
select rep_id,area,date_begin,date_end
,case when @repid <> rep_id then @rn:=1 -- reset the group to 1 when rep_id changes
      when @repid=rep_id and @begin=date_begin and @end=date_end then @rn:=@rn -- if rep_id, date_begin and date_end match, reuse the @rn previously assigned
      else @rn:=@rn+1 -- else increment @rn by 1
 end as group_id
,@begin:=date_begin
,@end:=date_end
,@repid:=rep_id
from t
cross join (select @rn:=0,@begin:='',@end:='',@repid:=-1) r
order by rep_id,date_begin,date_end
The above query includes variables in the output. To only get the group_id use
select rep_id,area,date_begin,date_end,group_id
from (
  select rep_id,area,date_begin,date_end
  ,case when @repid <> rep_id then @rn:=1
        when @repid=rep_id and @begin=date_begin and @end=date_end then @rn:=@rn
        else @rn:=@rn+1
   end as group_id
  ,@begin:=date_begin
  ,@end:=date_end
  ,@repid:=rep_id
  from t
  cross join (select @rn:=0,@begin:='',@end:='',@repid:=-1) r
  order by rep_id,date_begin,date_end
) x
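The rule these variable-based queries implement (restart at 1 for each new rep_id, keep the number while the date range is unchanged, otherwise increment) can be sketched in Python as follows. This is only an illustration: the function name assign_group_ids is hypothetical, and the dates from the question are rewritten as ISO strings on the assumption that they are in month/day/year form.

```python
def assign_group_ids(rows):
    """rows: (rep_id, area, date_begin, date_end) tuples with ISO date strings."""
    result = []
    prev_rep = None
    prev_range = None
    group_id = 0
    # Same ordering as the queries: rep_id, then date_begin, then date_end.
    for rep_id, area, begin, end in sorted(rows, key=lambda r: (r[0], r[2], r[3])):
        if rep_id != prev_rep:
            group_id = 1               # new rep: restart numbering
        elif (begin, end) != prev_range:
            group_id += 1              # same rep, different range: next group
        # otherwise: same rep and same range, keep the current group_id
        prev_rep, prev_range = rep_id, (begin, end)
        result.append((group_id, rep_id, area, begin, end))
    return result


# Rows from the question; date interpretation (m/d/yy) is an assumption.
rows = [
    (1000, "A", "2015-01-01", "2016-01-01"),
    (1000, "B", "2015-01-01", "2016-01-01"),
    (1000, "C", "2016-01-02", "2099-12-31"),
]
grouped = assign_group_ids(rows)
for row in grouped:
    print(row)
```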

MYSQL - Total registrations per day

I have the following structure in my user table:
id(INT) registered(DATETIME)
1 2016-04-01 23:23:01
2 2016-04-02 03:23:02
3 2016-04-02 05:23:03
4 2016-04-03 04:04:04
I want to get the total (accumulated) user count per day, for all days in DB
So result should be something like
day total
2016-04-01 1
2016-04-02 3
2016-04-03 4
I tried some subqueries, but somehow I have no idea how to achieve this with, if possible, a single SQL statement. Of course I could group by day, count, and add the counts up programmatically, but I don't want to do that if possible.
You can use a GROUP BY that does all the counting, with nothing left to do programmatically. Have a look at this query:
select
d.dt,
count(*) as total
from
(select distinct date(registered) dt from table1) d inner join
table1 r on d.dt>=date(r.registered)
group by
d.dt
order by
d.dt
The first subquery returns all distinct dates; each date is then joined with all registrations on or before it, and the joined rows are counted, all in one query.
An alternative join condition that can give some improvements in performance is:
on d.dt + interval 1 day > r.registered
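To see the join-and-count approach produce the running totals, here is a small sketch that runs the same query shape against an in-memory SQLite database from Python. SQLite only stands in for MySQL here, on the assumption that its date() function behaves like MySQL's DATE() for these timestamp strings; the table name table1 is taken from the answer's query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, registered TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [
        (1, "2016-04-01 23:23:01"),
        (2, "2016-04-02 03:23:02"),
        (3, "2016-04-02 05:23:03"),
        (4, "2016-04-03 04:04:04"),
    ],
)

# Each distinct day is joined with every registration on or before that day,
# so COUNT(*) per day is the accumulated total.
query = """
SELECT d.dt, COUNT(*) AS total
FROM (SELECT DISTINCT date(registered) AS dt FROM table1) d
JOIN table1 r ON d.dt >= date(r.registered)
GROUP BY d.dt
ORDER BY d.dt
"""
totals = conn.execute(query).fetchall()
for day, total in totals:
    print(day, total)
conn.close()
```

Note that the join compares every day with every earlier row, so the work grows roughly with days times rows; that is fine for small tables but worth measuring on large ones.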
I am not sure why you would not just use GROUP BY, since things get more complicated without it. Anyway, try this:
select
    date_format(main.registered, '%Y-%m-%d') as `day`,
    main.total
from (
    select
        table1.*,
        @cnt := @cnt + 1 as total
    from table1
    cross join (select @cnt := 0) t
) main
inner join (
    select
        a.*,
        if(@param = date_format(registered, '%Y-%m-%d'), @rowno := @rowno + 1 ,@rowno := 1) as rowno,
        @param := date_format(registered, '%Y-%m-%d')
    from (select * from table1 order by registered desc) a
    cross join (select @param := null, @rowno := 0) tmp
    having rowno = 1
) sub on main.id = sub.id

Better optimized SELECT SQL query for 50,000+ records

I have a query which works great for 1000 records or less but now I need to optimize it for 50,000+ records and when I run it on that it just stalls...
Here is my code:
SELECT
b1.account_num,b1.effective_date as ed1,b1.amount as am1,
b2.effective_date as ed2,b2.amount as am2
FROM bill b1
left join bill b2 on (b1.account_num=b2.account_num)
where b1.effective_date = (select max(effective_date) from bill where account_num = b1.account_num)
and (b2.effective_date = (select max(effective_date) from bill where account_num = b1.account_num and effective_date < (select max(effective_date) from bill where account_num = b1.account_num)) or b2.effective_date is null)
ORDER BY b1.effective_date DESC
My objective is to get the latest two effective dates and amounts from one table with many records.
Here is a working answer from your SQL-Fiddle baseline
First, the inner preQuery gets the max date per account. That result is then joined to the bill table on the account, with the extra condition that the effective date is earlier than the max already found, which yields the second-latest date.
The result is then joined to the respective bills to get their amounts.
select
FB1.account_num,
FB1.effective_date as ed1,
FB1.amount as am1,
FB2.effective_date as ed2,
FB2.amount as am2
from
( select
pq1.account_num,
pq1.latestBill,
max( b2.effective_date ) as secondLastBill
from
( SELECT
b1.account_num,
max( b1.effective_date ) latestBill
from
bill b1
group by
b1.account_num ) pq1
LEFT JOIN bill b2
on pq1.account_num = b2.account_num
AND b2.effective_date < pq1.latestBill
group by
pq1.account_num ) Final
JOIN Bill FB1
on Final.Account_Num = FB1.Account_Num
AND Final.LatestBill = FB1.Effective_Date
LEFT JOIN Bill FB2
on Final.Account_Num = FB2.Account_Num
AND Final.secondLastBill = FB2.Effective_Date
ORDER BY
Final.latestBill DESC
In MySQL versions before 8.0, window (analytic) functions such as row_number are not available, so we can simulate them using variables.
The good thing is that the table is scanned only once with this approach.
A row number is assigned within each partition (one partition per account number, ordered by effective date descending), and only the first 2 rows of each partition are kept.
select account_num,
       max(case when row_number = 1 then effective_date end) as ed1,
       max(case when row_number = 1 then amount end) as am1,
       max(case when row_number = 2 then effective_date end) as ed2,
       max(case when row_number = 2 then amount end) as am2
from (
    select account_num, effective_date, amount,
           @num := if(@prevacct = account_num, @num + 1, 1) as row_number,
           @prevacct := account_num as dummy
    from bill, (select @num := 0, @prevacct := '') as var
    order by account_num, effective_date desc
) T
where row_number <= 2
group by account_num
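The shape of the result (the latest and second-latest bill per account, pivoted into one row) can be illustrated with a short Python sketch. The sample rows and the function name latest_two are hypothetical, since the question does not show any data; dates are assumed to be ISO strings.

```python
from itertools import groupby

# Hypothetical sample rows: (account_num, effective_date, amount)
bills = [
    ("A1", "2016-01-01", 10.0),
    ("A1", "2016-02-01", 12.5),
    ("A1", "2016-03-01", 11.0),
    ("B2", "2016-02-15", 40.0),
]


def latest_two(bills):
    """Return (account, ed1, am1, ed2, am2) per account; ed2/am2 may be None."""
    ordered = sorted(bills, key=lambda b: b[1], reverse=True)  # newest first
    ordered.sort(key=lambda b: b[0])  # stable sort: newest-first kept per account
    out = []
    for account, group in groupby(ordered, key=lambda b: b[0]):
        top = list(group)[:2]  # the rows that would get row number 1 and 2
        ed1, am1 = top[0][1:]
        ed2, am2 = top[1][1:] if len(top) > 1 else (None, None)
        out.append((account, ed1, am1, ed2, am2))
    return out


summary = latest_two(bills)
for row in summary:
    print(row)
```

Accounts with a single bill still appear, with empty second-bill columns, which matches what the max(case ...) pivot returns when no row has row number 2.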

Sum two rows (one previous row and one is current row in same table) in SQL

I have this table:
Date_on deposited withdrawal in_bank
2012-09-1 3000 2000 50000
2012-09-2 4000 0 54000
2013-09-3 3000 2000 55000
Now I want to execute a query that adds the deposited amount to, and subtracts the withdrawal from, the previous day's in_bank entry. How can I do that? Can anyone help me with that? This is my query:
select date_on, in_bank,((in_bank+deposited)-withdrawal)
from tablename where date_on > '2012-09-01' order by date_on
SELECT today.withdrawal, today.deposited, today.date_on,
(IFNULL(today.deposited, 0) - IFNULL(today.withdrawal, 0)) + IFNULL((SELECT in_bank FROM tablename AS yesterday WHERE yesterday.date_on = DATE_SUB(today.date_on, INTERVAL 1 DAY)), 0)
FROM tablename AS today
WHERE date_on BETWEEN '2012-09-01' AND '2012-09-02'
ORDER BY date_on ASC
The query assumes the balance was zero if no row for the previous day's date can be found.
Added a fiddle
http://sqlfiddle.com/#!2/d8261/14
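The arithmetic behind the correlated subquery (previous day's in_bank, or zero if there is no row for that day, plus today's deposit minus today's withdrawal) can be sketched in Python. The helper name expected_balance is hypothetical; the rows are the ones from the question, with the dates taken as written there.

```python
from datetime import date, timedelta

# Rows from the question, keyed by date_on.
rows = {
    date(2012, 9, 1): {"deposited": 3000, "withdrawal": 2000, "in_bank": 50000},
    date(2012, 9, 2): {"deposited": 4000, "withdrawal": 0, "in_bank": 54000},
    date(2013, 9, 3): {"deposited": 3000, "withdrawal": 2000, "in_bank": 55000},
}


def expected_balance(rows, day):
    """Previous day's in_bank (0 if missing) + deposited - withdrawal."""
    previous = rows.get(day - timedelta(days=1))
    opening = previous["in_bank"] if previous else 0
    today = rows[day]
    return opening + today["deposited"] - today["withdrawal"]


second_day = expected_balance(rows, date(2012, 9, 2))
first_day = expected_balance(rows, date(2012, 9, 1))
print(second_day, first_day)
```

Because the lookup is by exact calendar day, any gap in the dates falls back to an opening balance of zero, just as the IFNULL around the subquery does.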
Try this:
SELECT
  t1.date_on,
  t1.in_bank,
  (
    SELECT in_bank
    FROM
    (
      SELECT *, (@rownum2 := @rownum2 + 1) rank
      FROM Table1, (SELECT @rownum2 := 0) t
      ORDER BY date_on
    ) t2 WHERE t1.rank - t2.rank = 1
  ) - t1.withdrawal - deposited "total"
FROM
(
  SELECT *, (@rownum := @rownum + 1) rank
  FROM Table1, (SELECT @rownum := 0) t
  ORDER BY date_on
) t1;
SELECT
  date_on,
  in_bank,
  ((IFNULL(in_bank, 0) + IFNULL(deposited, 0)) - IFNULL(withdrawal, 0))
FROM tablename
WHERE date_on > '2012-09-01'
ORDER BY date_on
I am not very familiar with MySQL, but this might help:
select account.*,D.expectable from account left join
(
select deposited - withdrawal + in_bank as expectable,DATE_ADD(Date_on,INTERVAL 1 DAY) as nDate from account
)
D on account.Date_on = D.nDate