sql query for diff between rows - mysql

I have a table of dates, and I want to find the difference between the first date and the second, the second and the third, and so on, up to rows n-1 and n.
How do I write a query for this?
The table is called Random and the column name is date:
date
+------------+
| 2009-06-20 |
| 2010-02-12 |
| 2012-03-14 |
| 2013-09-10 |
| 2014-01-01 |
| 2015-04-10 |
| 2015-05-01 |
| 2016-01-01 |
+------------+

You need to get the next date. I would use a correlated subquery:
select t.date,
(select min(t2.date)
from t t2
where t2.date > t.date
) as next_date
from t;
You just need to use datediff() to get the difference in days.

Use ROW_NUMBER to number the rows, then use a subquery to calculate the difference (the three-argument DATEDIFF below is SQL Server syntax):
SELECT column_date
,DATEDIFF( D , column_date
,(SELECT column_date FROM
(
SELECT column_date , ROW_NUMBER() OVER ( ORDER BY column_date) AS RowMum
FROM table_Random AS tBL_1
) AS tbl_2
WHERE tbl_2.RowMum= tBL_1.RowMum-1
)
) DIFF
FROM
(
SELECT column_date , ROW_NUMBER() OVER ( ORDER BY column_date) AS RowMum
FROM table_Random
) AS tBL_1

I did not notice the mysql tag when I first wrote my answer, so I am updating it now with a link to a MySQL 8.0 fiddle:
https://www.db-fiddle.com/f/myUJYeFrMXmU1piQXAmnv4/0
/* tested against MySQL v8.0 */
WITH T(d) AS (
SELECT '2009-06-20' as d
UNION
SELECT '2010-02-12'
UNION
SELECT '2012-03-14'
UNION
SELECT '2013-09-10'
UNION
SELECT '2014-01-01'
UNION
SELECT '2015-04-10'
UNION
SELECT '2015-05-01'
UNION
SELECT '2016-01-01'
), LAGGED(d, next_d) AS (
SELECT d, LEAD(d) OVER (ORDER BY d ASC) AS next_d
FROM T
)
/* datediff args are in opposite order to SQL server. Also,
only day part is considered */
SELECT l.d, l.next_d, DATEDIFF(l.next_d, l.d) AS n_days
FROM LAGGED AS l
Here is my original answer that targeted SQL Server:
WITH T(d) AS (
SELECT d FROM (
VALUES
('2009-06-20'),
('2010-02-12'),
('2012-03-14'),
('2013-09-10'),
('2014-01-01'),
('2015-04-10'),
('2015-05-01'),
('2016-01-01')
) AS T1(d)
), LAGGED(d, next_d) AS (
SELECT d, LEAD(d) OVER (ORDER BY d ASC) AS next_d
FROM T
)
SELECT l.d, l.next_d, DATEDIFF(DAY, l.d, l.next_d) AS n_days
FROM LAGGED AS l
and produces this output (modulo the fussy hand-editing I have done):
d next_d n_days
2009-06-20 2010-02-12 237
2010-02-12 2012-03-14 761
2012-03-14 2013-09-10 545
2013-09-10 2014-01-01 113
2014-01-01 2015-04-10 464
2015-04-10 2015-05-01 21
2015-05-01 2016-01-01 245
2016-01-01 NULL NULL
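As a quick cross-check of the n_days column above, the same consecutive differences can be computed outside the database. This is only a minimal Python sketch, not part of the original answers; the names `dates` and `day_diffs` are made up for illustration.

```python
from datetime import date

# Sample dates from the question, already in ascending order.
dates = [
    date(2009, 6, 20), date(2010, 2, 12), date(2012, 3, 14), date(2013, 9, 10),
    date(2014, 1, 1), date(2015, 4, 10), date(2015, 5, 1), date(2016, 1, 1),
]

# Pair each date with the one after it (the LEAD() idea) and subtract.
day_diffs = [(nxt - cur).days for cur, nxt in zip(dates, dates[1:])]

for cur, nxt, n_days in zip(dates, dates[1:], day_diffs):
    print(cur, nxt, n_days)
```

The last date has no successor, so it produces no row here, whereas the SQL versions return NULL for it.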

Related

Group overlapping ranges of data in MySQL

Is there an easy way avoiding the usage of cursors to convert this:
+-------+------+-------+
| Group | From | Until |
+-------+------+-------+
| X | 1 | 3 |
+-------+------+-------+
| X | 2 | 4 |
+-------+------+-------+
| Y | 5 | 7 |
+-------+------+-------+
| X | 8 | 10 |
+-------+------+-------+
| Y | 11 | 12 |
+-------+------+-------+
| Y | 12 | 13 |
+-------+------+-------+
Into this:
+-------+------+-------+
| Group | From | Until |
+-------+------+-------+
| X | 1 | 4 |
+-------+------+-------+
| Y | 5 | 7 |
+-------+------+-------+
| X | 8 | 10 |
+-------+------+-------+
| Y | 11 | 13 |
+-------+------+-------+
So far I've tried to assign an ID to each row and GROUP BY that ID, but I can't get any closer without using cursors.
SELECT `Group`, `From`, `Until`
FROM ( SELECT `Group`, `From`, ROW_NUMBER() OVER (PARTITION BY `Group` ORDER BY `From`) rn
FROM test t1
WHERE NOT EXISTS ( SELECT NULL
FROM test t2
WHERE t1.`From` > t2.`From`
AND t1.`From` <= t2.`Until`
AND t1.`Group` = t2.`Group` ) ) t3
JOIN ( SELECT `Group`, `Until`, ROW_NUMBER() OVER (PARTITION BY `Group` ORDER BY `From`) rn
FROM test t1
WHERE NOT EXISTS ( SELECT NULL
FROM test t2
WHERE t1.`Until` >= t2.`From`
AND t1.`Until` < t2.`Until`
AND t1.`Group` = t2.`Group` ) ) t4 USING (`Group`, rn)
fiddle
This should work for any type of overlap (partially overlapping, adjacent, fully included).
It will not work if From and/or Until is NULL.
Could you add an explanation in English? – ysth
The 1st subquery finds the starts of the joined ranges (see the fiddle, where it is executed separately): it looks for a From value in a group that is not in the middle or at the end of any other range (equal start points are allowed).
The 2nd subquery does the same for the Until values of the joined ranges.
Both additionally number the values they find in ascending order.
The outer query simply joins each range start with its finish into one row.
If you are using MySQL version 8+, you can use row_number to get the desired result:
Demo
SELECT MIN(`FROM`) START,
MAX(`UNTIL`) END,
`GROUP` FROM (
SELECT A.*,
ROW_NUMBER() OVER(ORDER BY `FROM`) RN_FROM,
ROW_NUMBER() OVER(PARTITION BY `GROUP` ORDER BY `UNTIL`) RN_UNTIL
FROM Table_lag A) X
GROUP BY `GROUP`, (RN_FROM - RN_UNTIL)
ORDER BY START;
You can do this with window functions only, using a gaps-and-islands technique.
The idea is to build groups of consecutive records that have the same group and overlapping ranges, using lag() and a window sum(). You can then aggregate the groups:
select grp, min(c_from) c_from, max(c_until) c_until
from (
select
t.*,
sum(lag_c_until < c_from) over(partition by grp order by c_from) mygrp
from (
select
t.*,
lag(c_until, 1, c_until) over(partition by grp order by c_from) lag_c_until
from mytable t
) t
) t
group by grp, mygrp
The column names you chose conflict with SQL keywords (group, from), so I renamed them to grp, c_from and c_until.
Demo on DB Fiddle - with credits to ysth for creating the fiddle in the first place:
grp | c_from | c_until
:-- | -----: | ------:
X | 1 | 4
Y | 5 | 7
X | 8 | 10
Y | 11 | 13
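To make the gaps-and-islands idea easier to follow, here is a rough Python sketch of the same merging logic on the question's sample rows. It is an illustration only: `merge_ranges` is a hypothetical helper, and unlike the lag()-based query it compares each row against the running end of the open island (an assumption made here so that fully contained ranges are handled too).

```python
# Rows from the question: (group, from, until).
rows = [("X", 1, 3), ("X", 2, 4), ("Y", 5, 7),
        ("X", 8, 10), ("Y", 11, 12), ("Y", 12, 13)]


def merge_ranges(rows):
    """Merge overlapping or touching ranges within each group."""
    merged = []
    open_island = {}  # group -> index in `merged` of that group's open island
    for grp, start, until in sorted(rows, key=lambda r: (r[0], r[1])):
        idx = open_island.get(grp)
        if idx is not None and start <= merged[idx][2]:
            # Row overlaps the open island: extend the island's end if needed.
            g, s, u = merged[idx]
            merged[idx] = (g, s, max(u, until))
        else:
            # Gap before this row: it starts a new island.
            merged.append((grp, start, until))
            open_island[grp] = len(merged) - 1
    # Report the islands ordered by their start value.
    return sorted(merged, key=lambda r: r[1])


result = merge_ranges(rows)
print(result)
```

A row whose start equals the previous end (Y 12 after Y 11 to 12) is treated as overlapping, which matches the `lag_c_until < c_from` test in the query.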
I would use a recursive CTE for this:
with recursive intervals (`Group`, `From`, `Until`) as (
select distinct t1.Group, t1.From, t1.Until
from Table_lag t1
where not exists (
select 1
from Table_lag t2
where t1.Group=t2.Group
and t1.From between t2.From and t2.Until+1
and (t1.From,t1.Until) <> (t2.From,t2.Until)
)
union all
select t1.Group, t1.From, t2.Until
from intervals t1
join Table_lag t2
on t2.Group=t1.Group
and t2.From between t1.From and t1.Until+1
and t2.Until > t1.Until
)
select `Group`, `From`, max(`Until`) as Until
from intervals
group by `Group`, `From`
order by `From`, `Group`;
The anchor expression (select .. where not exists (...)) finds every group and from that won't combine with some earlier from (so it has one row for each row in our eventual output).
Then the recursive query adds rows for merged intervals for each of our rows.
Then just group by group and from (those are awful column names) to get the biggest interval for each starting group/from.
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=9efa508504b80e44b73c952572394b76
Alternatively, you can do it with a straightforward set of joins and subqueries, with no CTE or window functions needed:
select
interval_start_range.grp,
interval_start_range.start,
max(merged.finish) finish
from (
select
interval_start.grp,
interval_start.start,
min(later_interval_start.start) next_start
from (
select distinct t1.grp, t1.start, t1.finish
from Table_lag t1
where not exists (
select 1
from Table_lag t2
where t1.grp=t2.grp
and t1.start between t2.start and t2.finish+1
and (t1.start,t1.finish) <> (t2.start,t2.finish)
)
) interval_start
left join (
select distinct t1.grp, t1.start, t1.finish
from Table_lag t1
where not exists (
select 1
from Table_lag t2
where t1.grp=t2.grp
and t1.start between t2.start and t2.finish+1
and (t1.start,t1.finish) <> (t2.start,t2.finish)
)
) later_interval_start
on interval_start.grp=later_interval_start.grp
and interval_start.start < later_interval_start.start
group by interval_start.grp, interval_start.start
) as interval_start_range
join Table_lag merged
on merged.grp=interval_start_range.grp
and merged.start >= interval_start_range.start
and (interval_start_range.next_start is null or merged.start < interval_start_range.next_start)
group by interval_start_range.grp, interval_start_range.start
order by interval_start_range.start, interval_start_range.grp
(I have renamed the columns here to not need backticks.)
Here there's a select to get all the starts of the intervals we will report, joined to another similar select (you could use a CTE to avoid the redundancy) to find the following start of a reportable interval for the same group (if there is one). That's wrapped in a subquery to get the group, the start value, and the start value of the following reportable interval. Then it just needs to join all the other records that start within that range and pick the maximum ending value.
https://dbfiddle.uk/?rdbms=mysql_5.5&fiddle=151cc933489c299f7beefa99e1959549

How to show 0 when no data

I want to show 0 (or some other value I choose) when there is no data. This is my query:
SELECT `icDate`,IFNULL(SUM(`icCost`),0) AS icCost
FROM `incomp`
WHERE (`icDate` BETWEEN "2016-01-01" AND "2016-01-05")
AND `compID` = "DDY"
GROUP BY `icDate`
And this is the result of this query:
icDate | icCost
--------------------------
2016-01-01 | 1000.00
2016-01-02 | 2000.00
2016-01-03 | 3000.00
2016-01-04 | 4000.00
2016-01-05 | 5000.00
If every day I want to show has data, there is no problem. But some days have no data, and those days are not shown, like this:
icDate | icCost
--------------------------
2016-01-01 | 1000.00
2016-01-02 | 2000.00
2016-01-04 | 4000.00
2016-01-05 | 5000.00
But I want it to show data like this:
icDate | icCost
--------------------------
2016-01-01 | 1000.00
2016-01-02 | 2000.00
2016-01-03 | 0.00
2016-01-04 | 4000.00
2016-01-05 | 5000.00
How do I write a query to get this result? Thank you.
I ran a simulation but could not reproduce your problem. I created a test table, inserted data, and ran this select. The test result was normal!
SELECT icDate,
format(ifnull(sum(icCost), 0),2) as icCost,
count(icDate) as entries
FROM incomp
WHERE icDate BETWEEN '2016-01-01' AND '2016-01-05'
AND compID = 'DDY'
group by icDate;
This is the result of my test, exported to a CSV file:
icDate | icCost | entries
----------------------------------
2016-01-01 | 8,600.00 | 8
2016-01-02 | 5,600.00 | 4
2016-01-03 | 5,400.00 | 3
2016-01-04 | 0.00 | 1
2016-01-05 | 7,050.00 | 7
Is the icCost field set to NULL or to the number zero? Remember that in some cases NULL values behave differently from other empty values.
I found the answer. It worked with a calendar table.
SELECT tbd.`db_date`,
(SELECT IFNULL(SUM(icCost),0) AS icCost
FROM `incomp`
WHERE icDate = tbd.db_date
AND compID = "DDY"
)AS icCost
FROM tb_date AS tbd
WHERE (tbd.`db_date` BETWEEN "2016-01-01" AND "2016-01-05")
GROUP BY tbd.`db_date`
LIMIT 0,100
Simple, but it works.
OK, you can check whether your table is filled correctly for every day. First, create a temporary table like this:
CREATE TEMPORARY TABLE myCalendar (
CalendarDate date primary key not null
);
Next, fill this table with valid days. To do so, use this procedure:
DELIMITER $$
CREATE PROCEDURE doWhile()
BEGIN
# IF YOU WANT TO USE CURRENT MONTH
#SET @startCount = ADDDATE(LAST_DAY(SUBDATE(CURDATE(), INTERVAL 1 MONTH)), 1);
#SET @endOfCount = LAST_DAY(sysdate());
# USE TO SET A DATE
SET @startCount = '2016-01-01';
SET @endOfCount = '2016-01-30';
WHILE @startCount <= @endOfCount DO
INSERT INTO myCalendar (CalendarDate) VALUES (@startCount);
SET @startCount = date_add(@startCount, interval 1 day);
END WHILE;
END$$
DELIMITER ;
Run the procedure with this command:
CALL doWhile();
Now run the following:
SELECT format(ifnull(sum(t1.icCost), 0),2) as icCost,
ifnull(t1.icDate, 'Not found') as icDate,
t2.CalendarDate as 'For the day'
from incomp t1
right join myCalendar t2 ON
t2.CalendarDate = t1.icDate group by t2.CalendarDate;
I think this will help you find a solution, for example by showing whether a record exists for a given day.
I hope this helps!
Sorry for my earlier answer. I gave a MSSQL answer instead of a MySQL answer.
You need a calendar table to have a set of all dates in your range. This could be a permanent table or a temporary table. Either way, there are a number of ways to populate it. Here is one way (borrowed from here):
set @beginDate = '2016-01-01';
set @endDate = '2016-01-05';
create table DateSequence(Date Date);
insert into DateSequence
select * from
(select adddate('1970-01-01',t4.i*10000 + t3.i*1000 + t2.i*100 + t1.i*10 + t0.i) selected_date from
(select 0 i union select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9) t0,
(select 0 i union select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9) t1,
(select 0 i union select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9) t2,
(select 0 i union select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9) t3,
(select 0 i union select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9) t4) v
where selected_date between @beginDate and @endDate
Your best bet is probably to make a permanent table that has every possible date. That way you only have to populate it once and it's ready to go whenever you need it.
Now you can outer join the calendar table with your inComp table.
set @beginDate = '2016-01-01';
set @endDate = '2016-01-05';
select d.Date,
sum(ifnull(i.icCost, 0)) inComp
from DateSequence d
left outer join inComp i on i.icDate = d.Date
and i.compID = 'DDY' -- filter in the ON clause so dates without data are kept
where d.Date between @beginDate and @endDate
group by d.Date
order by d.Date;
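The effect of outer joining against a calendar can be illustrated outside SQL as well. The following Python sketch is not from the answers above; `totals` holds made-up per-day sums matching the question's sample (with 2016-01-03 missing), and `fill_missing_days` is a hypothetical helper name.

```python
from datetime import date, timedelta

# Per-day sums as the original GROUP BY query would return them;
# 2016-01-03 is absent because it has no rows.
totals = {
    date(2016, 1, 1): 1000.0,
    date(2016, 1, 2): 2000.0,
    date(2016, 1, 4): 4000.0,
    date(2016, 1, 5): 5000.0,
}


def fill_missing_days(totals, begin, end):
    """Walk every day in [begin, end] and use 0.0 where no total exists."""
    filled = []
    day = begin
    while day <= end:
        filled.append((day, totals.get(day, 0.0)))
        day += timedelta(days=1)
    return filled


filled = fill_missing_days(totals, date(2016, 1, 1), date(2016, 1, 5))
for day, cost in filled:
    print(day, f"{cost:.2f}")
```

The loop over days plays the role of the calendar table, and `totals.get(day, 0.0)` plays the role of the left join plus IFNULL.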

get totals each day based on a given timestamp

I have a simple table:
user | timestamp
===================
Foo | 1440358805
Bar | 1440558805
BarFoo | 1440559805
FooBar | 1440758805
I would like to get a view with total number of users each day:
date | total
===================
...
2015-08-23 | 1 //Foo
2015-08-24 | 1
2015-08-25 | 1
2015-08-26 | 3 //+Bar +BarFoo
2015-08-27 | 3
2015-08-28 | 4 //+FooBar
...
What I currently have is
SELECT From_unixtime(a.timestamp, '%Y-%m-%d') AS date,
Count(From_unixtime(a.timestamp, '%Y-%m-%d')) AS total
FROM thetable AS a
GROUP BY From_unixtime(a.timestamp, '%Y-%m-%d')
ORDER BY a.timestamp ASC
which counts only the user of a certain day:
date | total
===================
2015-08-23 | 1 //Foo
2015-08-26 | 2 //Bar +BarFoo
2015-08-28 | 1 //FooBar
I've prepared a sqlfiddle
EDIT
The solution by @splash58 returns this result:
date | @t:=coalesce(total, @t)
==================================
2015-08-23 | 1
2015-08-26 | 3
2015-08-28 | 4
2015-08-21 | 4
2015-08-22 | 4
2015-08-24 | 4
2015-08-25 | 4
2015-08-27 | 4
2015-08-29 | 4
2015-08-30 | 4
You can get the cumulative values by using variables:
SELECT date, total, (@cume := @cume + total) as cume_total
FROM (SELECT From_unixtime(a.timestamp, '%Y-%m-%d') as date, Count(*) AS total
FROM thetable AS a
GROUP BY From_unixtime(a.timestamp, '%Y-%m-%d')
) a CROSS JOIN
(SELECT @cume := 0) params
ORDER BY date;
This gives you the dates that are in your data. If you want additional dates (where no users start), then one way is a calendar table:
SELECT c.date, a.total, (@cume := @cume + coalesce(a.total, 0)) as cume_total
FROM Calendar c LEFT JOIN
(SELECT From_unixtime(a.timestamp, '%Y-%m-%d') as date, Count(*) AS total
FROM thetable AS a
GROUP BY From_unixtime(a.timestamp, '%Y-%m-%d')
) a
ON a.date = c.date CROSS JOIN
(SELECT @cume := 0) params
WHERE c.date BETWEEN '2015-08-23' AND '2015-08-28'
ORDER BY c.date;
You can also put the dates explicitly in the query (using a subquery), if you don't have a calendar table.
To preserve the order of the dates, I think we need to wrap the query in one more select:
select date, @n:=@n + ifnull(total,0) total
from
(select Calendar.date, total
from Calendar
left join
(select From_unixtime(timestamp, '%Y-%m-%d') date, count(*) total
from thetable
group by date) t2
on Calendar.date= t2.date
order by date) t3
cross join (select @n:=0) n
Demo on sqlfiddle
You can use the function
TIMESTAMPDIFF(DAY,`timestamp_field`, CURDATE())
You will not have to convert the timestamp to other field types.
drop table if exists thetable;
create table thetable (user text, timestamp int);
insert into thetable values
('Foo', 1440358805),
('Bar', 1440558805),
('BarFoo', 1440559805),
('FooBar', 1440758805);
DROP PROCEDURE IF EXISTS insertTEMP;
DELIMITER //
CREATE PROCEDURE insertTEMP (first date, last date) begin
drop table if exists Calendar;
CREATE TEMPORARY TABLE Calendar (date date);
WHILE first <= last DO
INSERT INTO Calendar Values (first);
SET first = first + interval 1 day;
END WHILE;
END //
DELIMITER ;
call insertTEMP('2015-08-23', '2015-08-28');
select Calendar.date, @t:=coalesce(total, @t)
from Calendar
left join
(select date, max(total) total
from (select From_unixtime(a.timestamp, '%Y-%m-%d') AS date,
@n:=@n+1 AS total
from thetable AS a, (select @n:=0) n
order by a.timestamp ASC) t1
group by date ) t2
on Calendar.date= t2.date,
(select @t:=0) t
result
date, @t:=coalesce(total, @t)
2015-08-23 1
2015-08-24 1
2015-08-25 1
2015-08-26 3
2015-08-27 3
2015-08-28 4
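The running total over a calendar can also be checked with a few lines of Python. This is a sketch under one loud assumption: days are taken in UTC, whereas FROM_UNIXTIME uses the MySQL session time zone, so results could shift by a day on a differently configured server. The variable names are illustrative only.

```python
from collections import Counter
from datetime import date, datetime, timedelta, timezone
from itertools import accumulate

# Unix timestamps from the question's sample table.
timestamps = [1440358805, 1440558805, 1440559805, 1440758805]

# New users per calendar day (UTC is an assumption of this sketch).
per_day = Counter(
    datetime.fromtimestamp(ts, tz=timezone.utc).date() for ts in timestamps
)

# The "calendar table": every day in the reporting range.
first, last = date(2015, 8, 23), date(2015, 8, 28)
calendar = [first + timedelta(days=i) for i in range((last - first).days + 1)]

# Running sum over the calendar; days with no new users add 0.
running = list(accumulate(per_day.get(day, 0) for day in calendar))
cumulative = list(zip(calendar, running))

for day, total in cumulative:
    print(day, total)
```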

Merging 2 Table and GROUP BY date

I need to merge multiple tables and group the counts by day.
Below is my table structure:
#table1
id date
1 2015-07-01 00:00:00
2 2015-07-02 00:00:00
3 2015-07-03 00:00:00
#table2
id date
1 2015-07-02 00:00:00
2 2015-07-02 00:00:00
3 2015-07-02 00:00:00
4 2015-07-10 00:00:00
What I want to achieve:
#query result
date t1_count t2_count
2015-07-01 1 NULL
2015-07-02 1 3
2015-07-03 1 NULL
2015-07-10 NULL 1
Below is my query, based on this link:
SELECT left(A.date,10) AS `day`
, COUNT(A.ID) AS `a_count`
, COUNT(B.ID) AS `b_count`
FROM table1 A
LEFT JOIN table2 B
ON LEFT(A.date,10) = LEFT(B.date,10)
GROUP BY LEFT(A.date,10)
UNION
SELECT left(B.date,10) AS `day`
, COUNT(A.ID) AS `a_count`
, COUNT(B.ID) AS `b_count`
FROM table1 A
RIGHT JOIN table2 B
ON LEFT(A.date,10) = LEFT(B.date,10)
GROUP BY LEFT(A.date,10);
but the result was
#query result
date t1_count t2_count
2015-07-01 1 0
2015-07-02 3 3
2015-07-03 1 0
2015-07-10 0 1
I've tried modifying the query and searching for other solutions such as UNION ALL, LEFT JOIN, etc., but I've had no luck solving this problem.
You can do this using union all and group by:
select date, sum(istable1) as numtable1, sum(istable2) as numtable2
from ((select date(date) as date, 1 as istable1, NULL as istable2
from table1
) union all
(select date(date) as date, NULL as istable1, 1 as istable2
from table2
)
) t
group by date
order by 1;
Under some circumstances, it can be faster to aggregate the data in the subqueries as well:
select date, sum(numtable1) as numtable1, sum(numtable2) as numtable2
from ((select date(date) as date, count(*) as numtable1, NULL as numtable2
from table1
group by date(date)
) union all
(select date(date) as date, NULL as numtable1, count(*) as numtable2
from table2
group by date(date)
)
) t
group by date
order by 1;
If you want 0 instead of NULL in the desired results, use 0 instead of NULL in the subqueries.
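The "count each table separately, then combine by day" idea behind the second query can be sketched in Python as follows. This is only an illustration on the question's sample data; `table1`, `table2`, and `merged` are names chosen here, and `None` stands in for SQL NULL.

```python
from collections import Counter

# The date columns of the two sample tables.
table1 = ["2015-07-01 00:00:00", "2015-07-02 00:00:00", "2015-07-03 00:00:00"]
table2 = ["2015-07-02 00:00:00", "2015-07-02 00:00:00",
          "2015-07-02 00:00:00", "2015-07-10 00:00:00"]

# Aggregate each table on its own; the first 10 characters are YYYY-MM-DD.
count1 = Counter(value[:10] for value in table1)
count2 = Counter(value[:10] for value in table2)

# Combine on the union of days; .get() returns None where a table has no rows.
merged = [
    (day, count1.get(day), count2.get(day))
    for day in sorted(set(count1) | set(count2))
]

for row in merged:
    print(row)
```

Because each table is counted before the two are combined, rows from one table cannot multiply the count of the other, which is what goes wrong in the join-based attempt (3 instead of 1 for 2015-07-02).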

Finding a previous, non-contiguous date using SQL

Suppose a table, tableX, like this:
| date | hours |
| 2014-07-02 | 10 |
| 2014-07-03 | 10 |
| 2014-07-07 | 20 |
| 2014-07-08 | 40 |
The dates are 'workdays' -- that is, no weekends or holidays.
I want to find the increase in hours between consecutive workdays, like this:
| date | hours |
| 2014-07-03 | 0 |
| 2014-07-07 | 10 |
| 2014-07-08 | 20 |
The challenge is dealing with the gaps. If there were no gaps, something like
SELECT t1.date1 AS 'first day', t2.date1 AS 'second day', (t2.hours - t1.hours)
FROM tableX t1
LEFT JOIN tableX t2 ON t2.date1 = DATE_add(t1.date1, INTERVAL 1 DAY)
ORDER BY t2.date1;
would get it done, but that doesn't work in this case as there is a gap between 2014-07-03 and 2014-07-07.
Just use a correlated subquery instead. You have two fields, so you can do this with two correlated subqueries, or a correlated subquery with a join back to the table. Here is the first version:
SELECT t.date1 as `first day`,
(select t2.date1
from tableX t2
where t2.date1 > t.date1
order by t2.date1 asc
limit 1
) as `next day`,
(select t2.hours
from tableX t2
where t2.date1 > t.date1
order by t2.date1 asc
limit 1
) - t.hours
FROM tableX t
ORDER BY t.date1;
Another alternative is to rank the data by date and then subtract the hours of the previous workday's date from the hours of the current workday's date.
SELECT
ranked_t1.date1 date,
ranked_t1.hours - ranked_t2.hours hours
FROM
(
SELECT t.*,
@rownum := @rownum + 1 AS rank
FROM (SELECT * FROM tableX ORDER BY date1) t,
(SELECT @rownum := 0) r
) ranked_t1
INNER JOIN
(
SELECT t.*,
@rownum2 := @rownum2 + 1 AS rank
FROM (SELECT * FROM tableX ORDER BY date1) t,
(SELECT @rownum2 := 0) r
) ranked_t2
ON ranked_t2.rank = ranked_t1.rank - 1;
SQL Fiddle demo
Note:
Obviously an index on tableX.date1 would speed up the query.
Instead of a correlated subquery, a join is used in the above query.
Reference:
Mysql rank function on SO
Unfortunately, MySQL doesn't (yet) have analytic functions which would allow you to access the "previous row" or the "next row" of the data stream. However, you can duplicate it with this:
select h2.LogDate, h2.Hours - h1.Hours as Added_Hours
from Hours h1
left join Hours h2
on h2.LogDate =(
select Min( LogDate )
from Hours
where LogDate > h1.LogDate )
where h2.LogDate is not null;
Check it out here. Note the index on the date field. If that field is not indexed, this query will take forever.
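For comparison, the "previous existing row" logic that all of these queries emulate looks like this in Python. It is a small sketch on the question's sample rows, with illustrative names, and is not a replacement for the SQL.

```python
# (date, hours) rows from the question; ISO date strings sort chronologically.
rows = [("2014-07-02", 10), ("2014-07-03", 10),
        ("2014-07-07", 20), ("2014-07-08", 40)]

rows_sorted = sorted(rows)

# Pair each row with the row before it, regardless of calendar gaps,
# and report the increase in hours on the later date.
increases = [
    (cur_date, cur_hours - prev_hours)
    for (_, prev_hours), (cur_date, cur_hours) in zip(rows_sorted, rows_sorted[1:])
]

for row in increases:
    print(row)
```

Pairing by position after sorting is what makes the gap between 2014-07-03 and 2014-07-07 irrelevant, the same role that `Min(LogDate) ... where LogDate > h1.LogDate` plays in the query above.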