calculate the differences between two rows in SQL - mysql

I have a SQL table in which each row holds the revenue for a specific day. I want to add a new column whose value is the incremental revenue (which could be positive or negative) between that day and the previous day. How can I implement this in SQL?
Here is an example,
original table,
...
Day1 100
Day2 200
Day3 150
...
new table (with the incremental column added at the end; the first row can be assigned zero),
Day1 100 0
Day2 200 100
Day3 150 -50
I am using MySQL/MySQL Workbench.
thanks in advance,
Lin

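SELECT a.day, a.revenue, a.revenue - COALESCE(b.revenue, a.revenue) AS incremental_rev
FROM DailyRevenue a
LEFT JOIN DailyRevenue b ON a.day = b.day + 1
The query assumes that day is numeric (for a DATE column, use b.day + INTERVAL 1 DAY) and that each day has exactly one record in the table. The join matches each row to the previous day's row, and the COALESCE makes the difference 0 when there is no previous day. If there can be more than one row per day, you need to create a view that sums the revenue grouped by day.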

If you're okay with re-ordering the columns slightly, something like this is pretty simple to understand:
SET @prev := 0;
SELECT day, revenue - @prev AS diff, @prev := revenue AS revenue
FROM revenue ORDER BY day ASC;
The trick is that we calculate the difference to the previous first, then set the previous to the current and display it as the current in one step.
Note, this depends on the order being correct since the calculations are done during the returning of the rows, so you need to make sure you have an ORDER BY clause that returns the days in the correct order.

Try;
select
    t.date_col, t.val_col,
    case when t1.val_col is null then 0
         else t.val_col - t1.val_col end diff
from (
    select t.*, @r := @r + 1 lev
    from tbl t,
         (select @r := 0) r
    order by t.date_col
) t
left join (
    select t.*, @r1 := @r1 + 1 lev
    from tbl t,
         (select @r1 := 1) r
    order by t.date_col
) t1
on t.lev = t1.lev
This will calculate value diff even if there is a missing date
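As an alternative to the self-join and user-variable answers above, the same day-over-day difference can be sketched with the LAG window function, which MySQL supports from version 8.0. The snippet below is only an illustration: it runs the query against an in-memory SQLite database through Python's built-in sqlite3 module so that it is self-contained, and it assumes a table named DailyRevenue with an integer day column, as in the first answer.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DailyRevenue (day INTEGER PRIMARY KEY, revenue INTEGER)")
conn.executemany(
    "INSERT INTO DailyRevenue VALUES (?, ?)",
    [(1, 100), (2, 200), (3, 150)],  # sample data from the question
)

# LAG(revenue) returns the previous row's revenue in day order, or NULL for
# the first row; COALESCE turns that NULL difference into 0.
query = """
SELECT day,
       revenue,
       COALESCE(revenue - LAG(revenue) OVER (ORDER BY day), 0) AS diff
FROM DailyRevenue
ORDER BY day
"""
diff_rows = conn.execute(query).fetchall()
for row in diff_rows:
    print(row)
```

Because LAG looks at the previous row rather than the previous calendar day, a missing date simply produces the difference to the last available day.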

Related

Grouping rows via two different columns in MYSQL

I just want to ask whether it is possible to group rows that share the same value when that value comes from different columns.
In my scenario, we should sum up the total minutes when the records form "continuous" transactions, that is, when a row's STARTDATETIME matches the ENDDATETIME of the previous row. See the image link below for reference.
Thanks guys.
I modified Gordon Linoff's solution (see my comment under the question):
SELECT
    c.employee_id
    ,MIN(c.start_date) AS start_date
    ,MAX(c.end_date) AS end_date
    ,COUNT(*) AS numcontracts
    ,TIMESTAMPDIFF(minute,MIN(c.start_date),MAX(c.end_date)) AS timediff
FROM
(
    SELECT
        c0.*
        ,(@rn := @rn + COALESCE(startflag, 0)) AS cumestarts
    FROM
        (SELECT c1.*,
                (NOT EXISTS (SELECT 1
                             FROM contracts c2
                             WHERE c1.employee_id = c2.employee_id AND
                                   c1.start_date = c2.end_date
                            )
                ) AS startflag
         FROM contracts c1
         ORDER BY employee_id, start_date
        ) c0 CROSS JOIN (SELECT @rn := 0) params
) c
GROUP BY c.employee_id, c.cumestarts
http://rextester.com/VOGMU19779
timediff contains the minutes passed in the combined interval.
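The same grouping of continuous intervals can be sketched without user variables by using window functions (available in MySQL 8.0 and later). This is a hedged illustration rather than the answerer's method: it runs on an in-memory SQLite database via Python's sqlite3, the sample rows are made up, and the minute calculation uses SQLite's strftime where MySQL would use TIMESTAMPDIFF. Note also that it compares each row only with the immediately preceding row of the same employee, whereas the NOT EXISTS version matches against any row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contracts (employee_id INTEGER, start_date TEXT, end_date TEXT)"
)
conn.executemany(
    "INSERT INTO contracts VALUES (?, ?, ?)",
    [
        # Hypothetical sample rows, not taken from the question.
        (1, "2017-01-01 08:00:00", "2017-01-01 09:00:00"),
        (1, "2017-01-01 09:00:00", "2017-01-01 09:30:00"),  # continues the row above
        (1, "2017-01-01 11:00:00", "2017-01-01 11:15:00"),  # starts a new group
    ],
)

query = """
WITH flagged AS (
    -- startflag is 0 when the row starts exactly where the previous one ended
    SELECT c.*,
           CASE WHEN start_date = LAG(end_date) OVER (
                         PARTITION BY employee_id ORDER BY start_date)
                THEN 0 ELSE 1 END AS startflag
    FROM contracts c
),
grouped AS (
    -- the running sum of start flags gives each continuous run its own number
    SELECT f.*,
           SUM(startflag) OVER (
               PARTITION BY employee_id ORDER BY start_date) AS cumestarts
    FROM flagged f
)
SELECT employee_id,
       MIN(start_date) AS start_date,
       MAX(end_date) AS end_date,
       COUNT(*) AS numcontracts,
       (CAST(strftime('%s', MAX(end_date)) AS INTEGER)
        - CAST(strftime('%s', MIN(start_date)) AS INTEGER)) / 60 AS timediff
FROM grouped
GROUP BY employee_id, cumestarts
ORDER BY employee_id, MIN(start_date)
"""
islands = conn.execute(query).fetchall()
for row in islands:
    print(row)
```

With the sample rows above, the first two intervals collapse into one 90-minute group and the third stays on its own.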

How to eliminate only continuous duplicates but not all duplicates in a select query (MySQL)?

I have a table like this:
01-Jul-17 100
02-Jul-17 100
03-Jul-17 300
04-Jul-17 300
05-Jul-17 500
06-Jul-17 500
07-Jul-17 300
08-Jul-17 400
09-Jul-17 100
10-Jul-17 100
What I want to output is (in this order) by eliminating the continuous duplicates but not all duplicates:
100
300
500
300
400
100
I cannot select Distinct, as it will eliminate the second instances of 300, 100. Is there a way to achieve this result in MySQL?
Thanks!
You want to get the previous value. If the dates really have no gaps or duplicates, just do:
select t.*
from t left join
t tprev
on t.col1 = date_add(tprev.col1, interval 1 day)
where tprev.col2 is null or tprev.col2 <> t.col2;
EDIT:
If the dates don't meet these conditions, then you can use variables:
select t.*
from (select t.*,
             (@rn := if(@v = col2, @rn + 1,
                        if(@v := col2, 1, 1)
                       )
             ) as rn
      from t cross join
           (select @v := 0, @rn := 0) params
      order by t.col1
     ) t
where rn = 1;
Note that MySQL does not guarantee the order of evaluation of expressions in the SELECT. So variables should not be assigned in one expression and then used in another -- they should be assigned in a single expression.
One way to handle this problem is by using session variables to track the changes of the values as ordered by your date column. In the query below, we keep track of the value, ordered by date, and assign a row number to each group of identical value. Then, only the first value in each group is retained. Note that this approach is robust to any number of duplicates. It is also robust with respect to there being gaps in your dates, so long as each record can be ordered by date.
SET @rn = 1;
SET @val = NULL;
SELECT t.val
FROM
(
    SELECT
        @rn:=CASE WHEN @val = val THEN @rn+1 ELSE 1 END rn,
        @val:=val AS val,
        dt
    FROM yourTable
    ORDER BY dt
) t
WHERE t.rn = 1
ORDER BY t.dt;
On MySQL 8.0 or later, you can make use of the lag and lead window functions.
select y from (select y, lag(y,1,0) over (order by x) as prev_y from t1) as t where y <> prev_y;
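To check the window-function approach against the sample data from the question, here is a small sketch. It assumes the table is called t with columns col1 (the date, stored as ISO text) and col2 (the value), borrowing the names from the first answer, and it runs on an in-memory SQLite database via Python's sqlite3 rather than on MySQL. It uses LAG without a default, so the first row is kept through an explicit NULL check instead of relying on a sentinel value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 TEXT, col2 INTEGER)")
sample = [
    ("2017-07-01", 100), ("2017-07-02", 100), ("2017-07-03", 300),
    ("2017-07-04", 300), ("2017-07-05", 500), ("2017-07-06", 500),
    ("2017-07-07", 300), ("2017-07-08", 400), ("2017-07-09", 100),
    ("2017-07-10", 100),
]
conn.executemany("INSERT INTO t VALUES (?, ?)", sample)

# Keep a row only when its value differs from the previous row's value
# (or when there is no previous row at all).
query = """
SELECT col2
FROM (
    SELECT col1, col2, LAG(col2) OVER (ORDER BY col1) AS prev_col2
    FROM t
) AS x
WHERE prev_col2 IS NULL OR col2 <> prev_col2
ORDER BY col1
"""
deduped = [row[0] for row in conn.execute(query)]
print(deduped)
```

This prints the values 100, 300, 500, 300, 400, 100, which is the output requested in the question.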

Mysql query get SUM() specific row?

Is it possible to get a specific row in a query using something like SUM?
Example:
id  tickets  range   running sum
1   10       1-10    10=10
2   35       11-45   10+35=45
3   45       46-90   10+35+45=90
4   110      91-200  10+35+45+110=200
Total: 200 tickets (as a SUM). I need to get the ID of the row that contains a given ticket number, for example 23. The output would be ID 2, because ID 2 covers tickets 11-45 in the running sum.
You can do it by defining a user variable in your select query (in the FROM clause), e.g.:
select id, @total := @total + tickets as seats
from test, (select @total := 0) t
You seem to want the row where "23" fits in. I think this does the trick:
select t.*
from (select t.*, (@total := @total + tickets) as running_total
      from t cross join
           (select @total := 0) params
      order by id
     ) t
where 23 > running_total - tickets and 23 <= running_total;
SELECT
d.id
,d.tickets
,CONCAT(
TRIM(CAST(d.RunningTotal - d.tickets + 1 AS CHAR(10)))
,'-'
,TRIM(CAST(d.RunningTotal AS CHAR(10)))
) as TicketRange
,d.RunningTotal
FROM
(
SELECT
id
,tickets
,@total := @total + tickets as RunningTotal
FROM
test
CROSS JOIN (select @total := 0) var
ORDER BY
id
) d
This is similar to Darshan's answer but there are a few key differences:
You shouldn't use implicit join syntax; explicit joins offer more functionality in the long run and have been a standard for more than 20 years.
ORDER BY makes a huge difference to a running total calculated with a variable. If you change the order, the total is calculated differently, so decide how you want the running total ordered (by date? by id?) and make sure you put that in the query.
Finally, I also calculated the range.
And here is how you can do it without using variables:
SELECT
d.id
,d.tickets
,CONCAT(
TRIM(d.LowRange)
,'-'
,TRIM(
CAST(RunningTotal AS CHAR(10))
)
) as TicketRange
,d.RunningTotal
FROM
(
SELECT
t.id
,t.tickets
,CAST(COALESCE(SUM(t2.tickets),0) + 1 AS CHAR(10)) as LowRange
,t.tickets + COALESCE(SUM(t2.tickets),0) as RunningTotal
FROM
test t
LEFT JOIN test t2
ON t.id > t2.id
GROUP BY
t.id
,t.tickets
) d
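On MySQL 8.0 or later, the running total can also be sketched with SUM() as a window function, which avoids both the user variable and the self-join. The example below is an illustration under assumptions: it uses the test table and sample data from the question, but runs on an in-memory SQLite database via Python's sqlite3, and the column aliases low_range and running_total are names chosen here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER PRIMARY KEY, tickets INTEGER)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?)",
    [(1, 10), (2, 35), (3, 45), (4, 110)],  # sample data from the question
)

# SUM(...) OVER (ORDER BY id) accumulates tickets in id order; subtracting the
# row's own tickets and adding 1 gives the first ticket number of its range.
query = """
SELECT id, low_range, running_total
FROM (
    SELECT id,
           SUM(tickets) OVER (ORDER BY id) - tickets + 1 AS low_range,
           SUM(tickets) OVER (ORDER BY id) AS running_total
    FROM test
) AS d
WHERE ? BETWEEN low_range AND running_total
"""
winner = conn.execute(query, (23,)).fetchone()
print(winner)
```

For ticket number 23 this returns the row with id 2, whose range is 11-45.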

Find gaps in mysql Time

I have a table "channel_001" with a timestamp column Time, with rows spaced about 10 minutes apart.
2013-01-01;00:10:04;
2013-01-01;00:20:00;
2013-01-01;00:30:02;
2013-01-01;00:40:04;
But some data is missing. How can I detect a missing row and then insert a row there?
For example:
2013-01-01;00:10:04;
2013-01-01;00:20:00;
2013-01-01;00:30:02;
2013-01-01;00:40:04;
2013-01-01;01:00:02;
then it would be missing:
2013-01-01;00:50:00;
I was thinking of joining the table to itself, but I'm new to SQL and too much of a novice to find the answer alone.
Any ideas?
You can find rows that don't have a "next" time with something like:
select c.*
from channel_001 c
where not exists (select 1
from channel_001 c2
where c2.timestamp > c.timestamp + interval 9 minute and
c2.timestamp < c.timestamp + interval 11 minute
);
If your table is large (tens of thousands of rows), you will probably want to use variables. The following code gets the previous timestamp:
select c.*,
       (case when (@tmp := @prevts) is null then null
             when (@prevts := timestamp) is null then null
             else @tmp
        end) as prev_timestamp
from channel_001 c cross join
     (select @prevts := 0, @tmp := 0) vars
order by timestamp;
You can use this as a subquery to get gaps that are outside your range.
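The gap detection described above can also be sketched with the LEAD window function (MySQL 8.0 and later), which pairs each row with the next timestamp directly. The snippet is a hedged illustration: it runs on an in-memory SQLite database via Python's sqlite3, the column name ts is a placeholder chosen here, the 11-minute threshold mirrors the tolerance used in the first query, and the strftime arithmetic is SQLite-specific (in MySQL, TIMESTAMPDIFF(SECOND, ts, next_ts) would play that role).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE channel_001 (ts TEXT)")
conn.executemany(
    "INSERT INTO channel_001 VALUES (?)",
    [
        ("2013-01-01 00:10:04",),
        ("2013-01-01 00:20:00",),
        ("2013-01-01 00:30:02",),
        ("2013-01-01 00:40:04",),
        ("2013-01-01 01:00:02",),  # the 00:50 sample is missing before this row
    ],
)

# For every row, look at the next timestamp and measure the distance in
# seconds; anything longer than 11 minutes is reported as a gap.
query = """
SELECT ts, next_ts, gap_seconds
FROM (
    SELECT ts,
           LEAD(ts) OVER (ORDER BY ts) AS next_ts,
           CAST(strftime('%s', LEAD(ts) OVER (ORDER BY ts)) AS INTEGER)
             - CAST(strftime('%s', ts) AS INTEGER) AS gap_seconds
    FROM channel_001
) AS g
WHERE gap_seconds > 11 * 60
ORDER BY ts
"""
gaps = conn.execute(query).fetchall()
for row in gaps:
    print(row)
```

Each returned row marks the last sample before a gap and the first sample after it, which is the information needed to decide which rows to insert.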

mysql percentile rank by group

I have a table containing date, id, and value, with about 1000 id rows per date. I need to calculate the percentile rank of each row, by date. I am using the following code for percentile rank for a single date, but with over 10 years of daily data this is very inefficient to run date-by-date. Seems that it should be able to be formulated in MySQL but I've not been able to make it work.
Date ID Value
date1 01 -7.2
date1 02 0.6
date2 01 1.2
date2 02 3.8
SELECT c.id, c.value, ROUND( (
(@rank - rank) / @rank ) *100, 2) AS rank
FROM (
SELECT * , @prev := @curr , @curr := a.value,
@nxtRnk := @nxtRnk + 1,
@rank := IF( @prev = @curr , @rank , @nxtRnk ) AS rank
FROM (
SELECT id, value
FROM temp
WHERE date = '2013-06-28'
) AS a, (
SELECT @curr := NULL , @prev := NULL , @rank :=0, @nxtRnk :=0
) AS b
ORDER BY value DESC
) AS c
So basically I want to SELECT DISTINCT(date), and then for each date perform the above SELECT, which is preceded by INSERT INTO table2( ... ) to write the results to table2.
Thanks for any help,
Hugh
I finally developed an acceptable solution by using a temporary table. Maybe not the optimum solution, but it works in about 5 sec on a million + record table.
My temporary table (t1) contains date and the count of rows for date.
The third select above is changed to
SELECT t1.date, t1.cnt, id, value FROM t1 LEFT JOIN temp ON(t1.date = temp.date)
Also, the calculations in the first SELECT above were changed to use c.cnt rather than @rank, and a @prevDate variable was created to reset the rank count on date changes.
Thanks to anyone who looked at this and tried to work up a solution.
I was trying to solve this for quite some time and then I found the following answer. Honestly brilliant. Also quite fast even for big tables (the table where I used it contained approx 5 mil records and needed a couple of seconds).
SELECT
CAST(SUBSTRING_INDEX(SUBSTRING_INDEX( GROUP_CONCAT(field_name ORDER BY
field_name SEPARATOR ','), ',', 95/100 * COUNT(*) + 1), ',', -1) AS DECIMAL)
AS `95th Per`
FROM table_name;
As you can imagine just replace table_name and field_name with your table's and column's names.
For further information check Roland Bouman's original post
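For the per-date percentile rank asked about in the question, MySQL 8.0 and later offer the PERCENT_RANK window function, which can rank every date in a single pass. The sketch below is an assumption-laden illustration: it runs on an in-memory SQLite database via Python's sqlite3, the table and column names (daily_values, dt) are placeholders chosen here, and one extra sample row was added to the question's data to make the ranks more visible. PERCENT_RANK is defined as (rank - 1) / (rows in partition - 1), so its values will not match the question's own formula exactly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_values (dt TEXT, id TEXT, value REAL)")
conn.executemany(
    "INSERT INTO daily_values VALUES (?, ?, ?)",
    [
        ("date1", "01", -7.2),
        ("date1", "02", 0.6),
        ("date1", "03", 2.0),  # extra illustrative row, not in the question
        ("date2", "01", 1.2),
        ("date2", "02", 3.8),
    ],
)

# PARTITION BY dt restarts the ranking for every date, so no per-date loop
# or reset variable is needed.
query = """
SELECT dt, id, value,
       ROUND(100.0 * PERCENT_RANK() OVER (PARTITION BY dt ORDER BY value), 2)
           AS pct_rank
FROM daily_values
ORDER BY dt, id
"""
ranked = conn.execute(query).fetchall()
for row in ranked:
    print(row)
```

The result of such a query could be written to another table with INSERT INTO ... SELECT, as the question intends.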