Find the difference between sequential rows in MySQL, no row ID

Objective: get the difference between the value in a row and the value in the next row (I'm using MySQL). Say we have the table "events":
step: timestamp:
Leave for store 1400000000
Buy hamburgers 1400000002
Big party 1400000005
So the result we'd expect is:
2
3
Complication 1: My table doesn't have an ID column, so I can't do this:
select (e2.timestamp - e1.timestamp)
from events e1, events e2
where (e1.id + 1) = e2.id
Complication 2: I'm using a database connection (Splunk) that won't allow me to create or alter temporary tables (otherwise I'd just add an id column). Am I hosed?
thank you!

Use a user variable to hold the timestamp from the previous line.
SELECT step, timestamp - @prevtime AS diff, @prevtime := timestamp
FROM events
CROSS JOIN (SELECT @prevtime := 0) AS x
ORDER BY timestamp
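If the server happens to be MySQL 8.0 or later (an assumption; the question does not state the version), the same difference can be sketched with the LAG() window function, which needs neither an id column nor a user variable:

```sql
-- Sketch, assuming MySQL 8.0+ where window functions are available.
-- LAG(timestamp) returns the timestamp of the preceding row in timestamp order.
SELECT step,
       timestamp - LAG(timestamp) OVER (ORDER BY timestamp) AS diff
FROM events
ORDER BY timestamp;
```

The first row has no predecessor, so its diff is NULL rather than a number; filter it out with an outer query if only the differences are wanted.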

Related

Find next or previous ID when query contains multiple cases

I am looking for the most efficient way to find the next or previous ID of the following query:
SELECT *
FROM transactions
ORDER
BY CASE order_status
WHEN 'order_accepted' THEN 1
WHEN 'processing_order' THEN 2
WHEN 'order_send_mailer' THEN 3
WHEN 'order_send' THEN 4
WHEN 'order_received' THEN 5
WHEN 'order_refunded' THEN 6
ELSE 7 END
, id DESC limit 1;
I tried adding a where id > '$id' or where id < '$id' clause to the query, but it didn't give me the next or previous ID I was looking for.
For those who need some explanation of what I am trying to do: the goal is to go to the next or previous order, in this CASE ordering, with a forward or backward button.
What it currently looks like:
-id- -order_status-
9399 order_accepted
9398 processing_order
9363 processing_order
9403 order_send_mailer
9318 order_send
9346 order_received
9345 order_received
9050 order_refunded
The next ID for example of 9403 would be 9363 and previous ID would be 9318
Change your order_status into an enum column. This will save disk space and make sorting by order_status simpler and faster.
-- Add a new version of the column using an enum.
-- These strings are aliases for ordered numbers.
-- 'order_accepted' is 1, 'processing_order' is 2, etc.
alter table transactions add column enum_order_status enum(
'order_accepted',
'processing_order',
'order_send_mailer',
'order_send',
'order_received',
'order_refunded'
) not null;
-- Copy the status into the new enum column.
-- MySQL will translate the string into the number for you.
update transactions
set enum_order_status = order_status;
-- Drop the old column.
alter table transactions drop column order_status;
-- Rename the new enum column.
alter table transactions rename column enum_order_status to order_status;
-- Index it.
create index transactions_order_status on transactions(order_status);
-- Enjoy your vastly simplified and much faster query.
select *
from transactions
order by order_status, id desc
That's not actually necessary, but it makes everything much simpler.
With that out of the way, use the window functions lead and lag to refer to the previous and next rows in a query.
select
id, order_status,
lead(id) over w, lead(order_status) over w,
lag(id) over w, lag(order_status) over w
from transactions
window w as (order by order_status, id desc);
Note, window functions were added in MySQL 8. If you're using an older version I recommend upgrading ASAP; MySQL 8 has many big improvements. Otherwise you can simulate it with correlated subqueries and self-joins.
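For servers older than MySQL 8, one possible self-join sketch is shown here. It assumes order_status is already the ENUM column created above, and it uses "+ 0" to read the enum's numeric index so that statuses compare in their declared order. The aliases cur and nxt are illustrative.

```sql
-- Sketch: find the row that comes right after id 9403
-- in (order_status, id DESC) order, without window functions.
SELECT nxt.id, nxt.order_status
FROM transactions AS cur
JOIN transactions AS nxt
  ON nxt.order_status + 0 > cur.order_status + 0
  OR (nxt.order_status = cur.order_status AND nxt.id < cur.id)
WHERE cur.id = 9403
ORDER BY nxt.order_status, nxt.id DESC
LIMIT 1;
```

With the sample rows from the question this picks 9318, the row listed immediately after 9403. For the row before it, flip both comparisons and reverse the ORDER BY (order_status DESC, id ASC).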
If you want the previous and next rows of a specific row, use the technique from this answer. We add row_numbers to the table in the desired order, and then fetch 9403 and its previous and next row by row number.
-- Add a row number to your table in the desired order.
with ordered_transactions as (
select
*, row_number() over w as rn
from transactions
window w as (order by order_status, id desc)
)
select *
from ordered_transactions
-- Find the row number for ID 9403, then add -1, 0, and 1.
-- If 9403 is row number 5 you'll fetch row numbers 4, 5, and 6.
where rn in (
select rn+i
from ordered_transactions ot
-- All this is doing is making us three "rows" where i = -1, 0, and 1.
cross join (SELECT -1 AS i UNION ALL SELECT 0 UNION ALL SELECT 1) cj
where ot.id = 9403
);

Calculating time difference of every other row from a table

Note: The data for my question is on SQLFiddle right here where you
can query it.
How the table is created
I have data from a table and put into a temp table using the below logic but the BETWEEN start and end date time stamps are dynamically generated based on other logic in the stored proc, etc.
SET @RowNum = 0;
DROP TEMPORARY TABLE IF EXISTS temp;
CREATE TEMPORARY TABLE temp AS
SELECT @RowNum := @RowNum + 1 RowNum
, TimeStr
, Value
FROM mytable
WHERE TimeStr BETWEEN '2018-01-31 06:15:56' AND '2018-01-31 19:27:09'
AND iQuality = 3 ORDER BY TimeStr;
This gives me a temp table with a row number that increments by one, ordered by TimeStr from oldest to newest, so the oldest record is RowNum 1.
Temp Table
The Data
You can get to this temp table data and play with the queries on the SQLFiddle I've created. You'll also see a few things I tried there, none of which give me what I need.
Attempt to Clarify Further
I need to get the time for each ON and OFF set based on the TimeStr values in each set and I can get this using the TIMEDIFF() function.
I'm having a hard time figuring out how to make it give me the result of each ON and OFF record. The records are always in order from oldest to newest and the row number always starts at 1 too.
I somehow need to give every two consecutive RowNum records a matching CycleNum, starting at 1 and incrementing by one for each ON and OFF cycle or set.
I can use TIMEDIFF(MAX(TimeStr), MIN(TimeStr)) as duration but I'm not sure how to best get it to group every two RowNum records in order as explained to give each set a subsequent CycleNum value that increments.
Expected Output
The expected output should look like the screenshot below for all ON and OFF cycles, that is, every two RowNum values grouped in sequence.
Output Clarification
I need the output to include each ON and OFF cycle's start time, end time, and the duration for the time between the start and stop.
If you can guarantee two things:
That the row numbers are strictly sequential with no gaps.
That the on/off flag is always alternating.
Then you can do this with a relatively simple join. The code looks like:
SELECT (@rn := @rn + 1) as cycle, t.*, tnext.timestr,
timediff(tnext.timestr, t.timestr)
FROM temp t JOIN
temp tnext
ON t.rownum = tnext.rownum - 1 and
t.value = 1 and
tnext.value = 0 cross join
(SELECT @rn := 0) params;
If these conditions are not true, then more complex logic is needed.
Here is a simpler one :
SELECT
t1.TimeStr AS StartTime,
t2.TimeStr AS EndTime,
TIMEDIFF(t2.TimeStr, t1.TimeStr) AS Duration
FROM temp t1
INNER JOIN temp t2 ON t2.RowNum = t1.RowNum + 1
WHERE
t2.Value = 0
AND t1.Value = 1
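Under the same guarantees (gap-free RowNum, strictly alternating ON/OFF), and additionally assuming RowNum 1 is an ON row, the cycle number can be derived arithmetically and combined with the TIMEDIFF(MAX, MIN) idea from the question. This is only a sketch of that grouping:

```sql
-- Rows 1 and 2 form cycle 1, rows 3 and 4 form cycle 2, and so on.
-- RowNum / 2 is decimal division in MySQL, so CEIL rounds 0.5 up to 1.
SELECT CEIL(RowNum / 2) AS CycleNum,
       MIN(TimeStr) AS StartTime,
       MAX(TimeStr) AS EndTime,
       TIMEDIFF(MAX(TimeStr), MIN(TimeStr)) AS Duration
FROM temp
GROUP BY CEIL(RowNum / 2)
ORDER BY CycleNum;
```

This form reads temp only once. That may matter here, because MySQL documents a restriction on referring to a TEMPORARY table more than once in the same query, which self-joins on temp can run into.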
A quick and dirty way to do it would be this:
SELECT
T1.TimeStr AS StartTime,
(SELECT T2.TimeStr FROM temp AS T2 WHERE T2.RowNum = T1.RowNum+1) AS StopTime,
TIMEDIFF((SELECT T2.TimeStr FROM temp AS T2 WHERE T2.RowNum = T1.RowNum+1),
T1.TimeStr) AS Duration
FROM temp AS T1
WHERE Value = 1;
Seems like there must be better ways to do this. Two subqueries will be slow.
You could do it in two steps:
CREATE TEMPORARY TABLE startstop AS
SELECT
T1.TimeStr AS StartTime,
(SELECT T2.TimeStr FROM temp AS T2 WHERE T2.RowNum = T1.RowNum+1) AS StopTime,
0 AS Duration
FROM temp AS T1
WHERE Value = 1;
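-- Subtracting DATETIME values directly does not yield seconds; use TIMESTAMPDIFF.
UPDATE startstop SET Duration = TIMESTAMPDIFF(SECOND, StartTime, StopTime);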
However I cannot test this in the Fiddle.

Select last two values from two IDs

I would like to select two specific values, the first value is the last inserted row where the ID_SENSOR is 1, and the second value is the last inserted row where the ID_SENSOR is 2.
My Database table:
My Query:
SELECT DATA FROM (SELECT * FROM registovalores WHERE ID_SENSOR = '1' OR ID_SENSOR = '2' ORDER BY ID_SENSOR DESC LIMIT 2) as r ORDER BY TIMESTAMP
My query prints only the last value from ID_SENSOR 1, which means I'm only getting the last inserted values, not the last inserted value from both IDs.
I would like to print my values like this:
ID_SENSOR 1 = 90
ID_SENSOR 2 = 800
What do I need to change on my Query?
Thank you.
One method uses a correlated subquery:
SELECT rv.*
FROM registovalores rv
WHERE rv.ID_SENSOR IN (1, 2) AND
rv.TIMESTAMP = (SELECT MAX(rv2.TIMESTAMP)
FROM registovalores rv2
WHERE rv.ID_SENSOR = rv2.ID_SENSOR
);
You have to have two separate queries, one per sensor.
select id_sensor, data
from the_table
where id_sensor = 'sensor_1'
order by timestamp desc -- the latest value is the first to come
limit 1; -- only pick the top (latest) row.
If you want to query for more than one value in a single database roundtrip, consider using union all between several such queries.
Please note that such a query may return one row or zero rows, since data for a particular sensor may not be available yet.
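A sketch of the union all variant, using the table and column names from the question (registovalores, ID_SENSOR, DATA, TIMESTAMP) and assuming ID_SENSOR holds the values 1 and 2:

```sql
-- Each branch is parenthesized so that its ORDER BY and LIMIT
-- apply to that branch only, not to the combined result.
(SELECT ID_SENSOR, DATA
 FROM registovalores
 WHERE ID_SENSOR = 1
 ORDER BY TIMESTAMP DESC
 LIMIT 1)
UNION ALL
(SELECT ID_SENSOR, DATA
 FROM registovalores
 WHERE ID_SENSOR = 2
 ORDER BY TIMESTAMP DESC
 LIMIT 1);
```

The result has at most two rows; a sensor with no data yet simply contributes no row.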

Getting previous row in MySQL

I'm stuck on a MySQL problem that I haven't been able to solve yet. I have the following query that gives me the month-year and the number of new users for each period on my platform:
select
u.period ,
u.count_new as new_users
from
(select DATE_FORMAT(u.registration_date,'%Y-%m') as period, count(distinct u.id) as count_new from users u group by DATE_FORMAT(u.registration_date,'%Y-%m')) u
order by period desc;
The result is the table:
period,new_users
2016-10,103699
2016-09,149001
2016-08,169841
2016-07,150672
2016-06,148920
2016-05,160206
2016-04,147715
2016-03,173394
2016-02,157743
2016-01,173013
So, for each month-year I need to calculate the difference between that period and the previous month-year. I need a result table like this:
period,new_users
2016-10,calculate(103699 - 149001)
2016-09,calculate(149001- 169841)
2016-08,calculate(169841- 150672)
2016-07,So on...
2016-06,...
2016-05,...
2016-04,...
2016-03,...
2016-02,...
2016-01,...
Any ideas: =/
Thankss
You should be able to use a similar approach as I posted in another S/O question. You are on a good track to start. You have your inner query get the counts and have it ordered in the final direction you need. By using inline mysql variables, you can have a holding column of the previous record's value, then use that as computation base for the next result, then set the variable to the new balance to be used for each subsequent cycle.
The JOIN to the SqlVars alias does not have any "ON" condition as the SqlVars would only return a single row anyhow and would not result in any Cartesian product.
select
u.period,
if( @prevCount = -1, 0, u.count_new - @prevCount ) as new_users,
@prevCount := u.count_new as HoldColumnForNextCycle
from
( select
DATE_FORMAT(u.registration_date,'%Y-%m') as period,
count(distinct u.id) as count_new
from
users u
group by
DATE_FORMAT(u.registration_date,'%Y-%m') ) u
JOIN ( select @prevCount := -1 ) as SqlVars
order by
u.period desc;
You may have to play with it a little, as there is no "starting" point in the counts, so the first entry in either sort direction may look strange. I start the @prevCount variable at -1, so the first record processed gets 0 in the "new_users" column. Then whatever the distinct new user count was for that record is assigned back to @prevCount as the basis for the next record processed. Yes, it is an extra column in the result set that can be ignored, but it is needed. Again, it is just a per-row placeholder, and you can see in the query result how it gets its value as each row progresses.
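On MySQL 8.0 or later (an assumption; the question does not say which version is in use), the same month-over-month difference can be sketched with LAG() instead of a user variable. The alias monthly and the column name diff_from_previous_month are illustrative.

```sql
-- LAG(count_new) looks one row back in ascending period order,
-- so each month is compared with the month before it.
SELECT period,
       count_new,
       count_new - LAG(count_new) OVER (ORDER BY period) AS diff_from_previous_month
FROM (
    SELECT DATE_FORMAT(registration_date, '%Y-%m') AS period,
           COUNT(DISTINCT id) AS count_new
    FROM users
    GROUP BY DATE_FORMAT(registration_date, '%Y-%m')
) AS monthly
ORDER BY period DESC;
```

For 2016-10 this evaluates 103699 - 149001, as in the requested output. The earliest month has no predecessor and gets NULL. A month with no registrations produces no row at all, so its neighbours are compared with each other.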
I would create a temp table with two columns and then fill it using a cursor that
does something like this (don't remember the exact syntax - so this is just a pseudo-code):
@val = CURSOR.col2 - (select col2 from OriginalTable t2 where t2.Period = (CURSOR.Period - 1))
INSERT tmpTable (Period, NewUsers) Values (CURSOR.Period, @val)

MySQL query index & performance improvements

I have created an application to track progress in League of Legends for me and my friends. For this purpose, I collect information about the current rank several times a day into my MySQL database. To fetch the results and show them in a graph, I use the following queries:
SELECT
lol_summoner.name as name, grid.series + ? as timestamp,
AVG(NULLIF(lol.points, 0)) as points
FROM
series_tmp grid
JOIN
lol ON lol.timestamp >= grid.series AND lol.timestamp < grid.series + ?
JOIN
lol_summoner ON lol.summoner = lol_summoner.id
GROUP BY
lol_summoner.name, grid.series
ORDER BY
name, timestamp ASC
SELECT
lol_summoner.name as name, grid.series + ? as timestamp,
AVG(NULLIF(lol.points, 0)) as points
FROM
series_tmp grid
JOIN
lol ON lol.timestamp >= grid.series AND lol.timestamp < grid.series + ?
JOIN
lol_summoner ON lol.summoner = lol_summoner.id
WHERE
lol_summoner.name IN (". str_repeat('?, ', count($names) - 1) ."?)
GROUP BY
lol_summoner.name, grid.series
ORDER BY
name, timestamp ASC
The first query is used when I want to retrieve all players saved in the database. The grid table is a temporary table that holds generated timestamps at a specific interval, so that information can be retrieved in chunks of this interval. The two variables in this query are the interval. The second query is used if I want to retrieve information for specific players only.
The grid table is produced by the following stored procedure, which is called with three parameters (n_first: first timestamp, n_last: last timestamp, n_increment: increment between two timestamps):
BEGIN
-- Create tmp table
DROP TEMPORARY TABLE IF EXISTS series_tmp;
CREATE TEMPORARY TABLE series_tmp (
series bigint
) engine = memory;
WHILE n_first <= n_last DO
-- Insert in tmp table
INSERT INTO series_tmp (series) VALUES (n_first);
-- Increment value by one
SET n_first = n_first + n_increment;
END WHILE;
END
The query works and finishes in reasonable time (~10 seconds) but I am thankful for any help to improve the query by either rewriting it or adding additional indexes to the database.
/Edit:
After reviewing @Rick James's answer, I modified the queries as follows:
SELECT lol_summoner.name as name, (lol.timestamp div :range) * :range + :half_range as timestamp, AVG(NULLIF(lol.points, 0)) as points
FROM lol
JOIN lol_summoner ON lol.summoner = lol_summoner.id
GROUP by lol_summoner.name, lol.timestamp div :range
ORDER by name, timestamp ASC
SELECT lol_summoner.name as name, (lol.timestamp div :range) * :range + :half_range as timestamp, AVG(NULLIF(lol.points, 0)) as points
FROM lol
JOIN lol_summoner ON lol.summoner = lol_summoner.id
WHERE lol_summoner.name IN (<NAMES>)
GROUP by lol_summoner.name, lol.timestamp div :range
ORDER by name, timestamp ASC
This improves the query execution time by a really good margin (finished way under 1s).
Problem 1 and Solution
You need a series of integers between two values? And they differ by 1? Or by some larger value?
First, create a permanent table of the numbers from 0 to some large enough value:
CREATE TABLE Num10 ( n INT );
INSERT INTO Num10 VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
CREATE TABLE Nums ( n INT, PRIMARY KEY(n))
SELECT a.n*1000 + b.n*100 + c.n*10 + d.n AS n -- alias must match the declared column
FROM Num10 AS a
JOIN Num10 AS b -- note "cross join"
JOIN Num10 AS c
JOIN Num10 AS d;
Now Nums has 0..9999. (Make it bigger if you might need more.)
To get a sequence of consecutive numbers from 123 through 234:
SELECT 123 + n FROM Nums WHERE n < 234-123+1;
To get a sequence of consecutive numbers from 12345 through 23456, in steps of 15:
SELECT 12345 + 15*n FROM Nums WHERE n < (23456-12345+1)/15;
JOIN to a SELECT like one of those instead of to series_tmp.
Barring other issues, that should significantly speed things up.
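A sketch of what that join might look like for the first query in the question. The user variables @first, @last and @step and their values are purely illustrative stand-ins for the stored procedure's parameters, and bucket_end is a hypothetical alias chosen to avoid shadowing lol.timestamp.

```sql
-- Illustrative values only: one day of hourly buckets.
SET @first = 1500000000, @last = 1500086400, @step = 3600;

SELECT s.name AS name,
       grid.series + @step AS bucket_end,
       AVG(NULLIF(lol.points, 0)) AS points
FROM (
    -- Generates @first, @first + @step, ... up to at most @last.
    SELECT @first + @step * n AS series
    FROM Nums
    WHERE n <= (@last - @first) / @step
) AS grid
JOIN lol
  ON lol.timestamp >= grid.series
 AND lol.timestamp < grid.series + @step
JOIN lol_summoner AS s ON lol.summoner = s.id
GROUP BY s.name, grid.series
ORDER BY name, bucket_end;
```

The derived table replaces series_tmp, so no temporary table or WHILE loop is needed. Nums must hold at least as many rows as there are buckets in the requested range.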
Problem 2
You are GROUPing BY series, but ORDERing by timestamp. They are related, so you might get the 'right' answer. But think about it.
Problem 3
You seem to be building "buckets" (called "series"?) from "timestamps". Is this correct? If so, let's work backwards -- Turn a "timestamp" into a "bucket" number:
bucket_number = floor((timestamp - start) / bucket_size)
By doing that throughout, you can avoid 'Problem 1' and eliminate my solution to it. That is, reformulate the entire queries in terms of buckets.