sql query to sum over deltas in a table - mysql

I have incoming sensor data that is stored in a table. Each record for a sensor has a few "counter" columns. Each counter is simply a snapshot in time. Example records are shown below.
id | sensor | counter 1 | counter 2 | timestamp
1  | 5      | 100       | 200       | 10:00 AM
2  | 5      | 125       | 210       | 10:01 AM
...
60 | 5      | 1000      | 800       | 11:00 AM
I have thousands of such sensors sending snapshots of their counter values over time. What I need to do is, given a time period, say between 10 AM and 11 AM, chart the delta between consecutive records. The delta is always taken relative to the previous record.
What is the SQL query to get the deltas between consecutive records?
My second question is: what is a good table design for storing such self-referencing data? Perhaps a linked list, with each record pointing to the previous record?

The generic SQL is:
select t.*,
       (select counter1
        from t t2
        where t2.sensor = t.sensor and t2.timestamp < t.timestamp
        order by timestamp desc
        limit 1
       ) prevCounter1
from t
The limit 1 depends on the database. It might be top 1 (SQL Server, Sybase) or where rownum = 1 (Oracle).
You need to do this for each counter.
If you are using Postgres, Oracle, or SQL Server 2012, you can use lag():
select t.*,
       lag(counter1) over (partition by sensor order by timestamp) as prevCounter1
from t
In MySQL versions before 8.0, which lack window functions such as lag(), you will be best off storing a prevID on each row. If not, an index on (sensor, timestamp) should give you reasonable performance.
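To make the lag() approach concrete, here is a minimal sketch that turns the previous-row lookup into an actual delta. It runs against an in-memory SQLite database (3.25 or later, for window functions) rather than MySQL, purely so that it is self-contained; the table name readings, the column ts, and the sample rows are assumptions made up for illustration.

```python
import sqlite3

# Hypothetical table and column names, modelled on the question's sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings (id INTEGER PRIMARY KEY, sensor INTEGER, counter1 INTEGER, ts TEXT)"
)
conn.executemany(
    "INSERT INTO readings (sensor, counter1, ts) VALUES (?, ?, ?)",
    [(5, 100, "10:00"), (5, 125, "10:01"), (5, 160, "10:02"),
     (7, 40, "10:00"), (7, 55, "10:01")],
)

# LAG() reads counter1 from the previous row of the same sensor (ordered by ts).
# The first row of each sensor has no predecessor, so its delta is NULL.
deltas = conn.execute(
    """
    SELECT sensor, ts,
           counter1 - LAG(counter1) OVER (PARTITION BY sensor ORDER BY ts) AS delta1
    FROM readings
    ORDER BY sensor, ts
    """
).fetchall()

for row in deltas:
    print(row)
```

The same SELECT should carry over to any database with window functions; the subtraction would be repeated for each counter column.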

Related

Get value difference between two dates in MySQL

I have a following table structure:
page_id view_count date
1 30 2018-08-30
1 33 2018-08-31
1 1 2018-09-01
1 5 2018-09-02
...
The view count is reset on the 1st of every month, and its current value is stored daily, so on 31 August it increased by 3 (33 - 30).
What I need is to retrieve the view count (difference) between two dates with an SQL query. For two dates in the same month this is simple: subtract the value at the earlier date from the value at the later date. What I'm not sure how to do is handle two dates in different months. To get the views between 2018-08-13 and 2018-09-13, I would have to take the difference between 2018-08-31 and 2018-08-13 and add it to the value for 2018-09-13.
Also, I would like to do it for every page_id at once, between the same dates, if possible in a single query.
Assuming that there is one counter per page and that a row for each page_id is inserted into the table daily, I think a solution like the following would work.
The dates are taken from the example and should be replaced by the relevant parameters.
SELECT
    v1.page_id,
    v1.view_count + eom.view_count - v2.view_count AS views
FROM
    view_counts v1
    INNER JOIN view_counts v2 ON v2.page_id = v1.page_id AND v2.`date` = '2018-08-13'
    INNER JOIN view_counts eom ON eom.page_id = v1.page_id AND eom.`date` = LAST_DAY(DATE_ADD(v1.`date`, INTERVAL -1 MONTH))
WHERE
    v1.`date` = '2018-09-13'
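As a quick check of the arithmetic, here is a small sketch of the same three-way self-join. It uses SQLite instead of MySQL so that it runs stand-alone, which means LAST_DAY is replaced by SQLite's date(..., 'start of month', '-1 day'). The column name day and the sample counts are assumptions for illustration, and like the query above it only covers a start date in the month immediately before the end date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE view_counts (page_id INTEGER, view_count INTEGER, day TEXT)")
conn.executemany(
    "INSERT INTO view_counts VALUES (?, ?, ?)",
    [
        (1, 10, "2018-08-13"), (1, 33, "2018-08-31"), (1, 7, "2018-09-13"),
        (2, 4, "2018-08-13"), (2, 20, "2018-08-31"), (2, 9, "2018-09-13"),
    ],
)

# v1 = row at the end date, v2 = row at the start date,
# eom = row for the last day of the month before the end date.
rows = conn.execute(
    """
    SELECT v1.page_id,
           v1.view_count + eom.view_count - v2.view_count AS views
    FROM view_counts v1
    JOIN view_counts v2  ON v2.page_id = v1.page_id AND v2.day = ?
    JOIN view_counts eom ON eom.page_id = v1.page_id
                        AND eom.day = date(v1.day, 'start of month', '-1 day')
    WHERE v1.day = ?
    ORDER BY v1.page_id
    """,
    ("2018-08-13", "2018-09-13"),
).fetchall()

print(rows)
```

For page 1 this works out as 7 + 33 - 10 = 30: the views accumulated in September plus the views gained between 13 and 31 August.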

SQL Query - Find out how many times a row changes from 0 to another value

I am using MySQL 8 and need to create a stored procedure.
I have a single table with a DATE field and a value field, which can be 0 or any other number. The value field represents the amount of rain for that day.
The table stores data from today up to 10 years ahead.
I need to find out how many periods of rain there will be in the next 10 years.
So, for example, if my table contains the following data:
Date - Value
2018-06-09 - 0
2018-06-10 - 50
2018-06-11 - 0
2018-06-12 - 15
2018-06-13 - 17
2018-06-14 - 0
2018-06-15 - 0
2018-06-16 - 12
2018-06-17 - 123
2018-06-18 - 17
Then the SP should return 3, because there were 3 periods of rain.
Any help in getting me closer to the answer will be appreciated!
You don't need a stored procedure for this.
Here is a solution using MySQL 8.0's LEAD function; it also supports dates with gaps.
The complete table needs to be scanned, but I don't think that is a huge problem with ~3650 records.
Query
SELECT
    SUM(filter_match = 1) AS number
FROM (
    SELECT
        ((t.value = 0) AND (LEAD(t.value) OVER (ORDER BY t.date ASC) != 0)) AS filter_match
    FROM
        t
) t
See demo: https://www.db-fiddle.com/f/sev4NqgLsFPgtNgwzruwy/2
By the way, would you mind expanding your answer to explain how LEAD and SUM work together?
LEAD(t.value) OVER (ORDER BY t.date ASC) simply means: get the value from the next record, ordered by date.
This demo shows it nicely: https://www.db-fiddle.com/f/sev4NqgLsFPgtNgwzruwy/6
SUM(filter_match = 1) is a conditional sum: a row is counted only when the alias filter_match is true.
To see what filter_match is, see this demo: https://www.db-fiddle.com/f/sev4NqgLsFPgtNgwzruwy/8
In MySQL, a comparison evaluates to 1 (true) or 0 (false), so an aggregate function can take an SQL expression such as 1 = 1 (always true, or 1) or 1 = 0 (always false, or 0).
The conditional sum therefore only adds up the rows where the condition is true.
See demo: https://www.db-fiddle.com/f/sev4NqgLsFPgtNgwzruwy/7
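For anyone who wants to try the LEAD plus conditional SUM combination without a MySQL 8 server, here is a minimal sketch against SQLite (3.25 or later), which supports the same window function. The table name rain and the columns day and mm are assumed names; the rows are the sample data from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rain (day TEXT, mm INTEGER)")
conn.executemany(
    "INSERT INTO rain VALUES (?, ?)",
    [
        ("2018-06-09", 0), ("2018-06-10", 50), ("2018-06-11", 0),
        ("2018-06-12", 15), ("2018-06-13", 17), ("2018-06-14", 0),
        ("2018-06-15", 0), ("2018-06-16", 12), ("2018-06-17", 123),
        ("2018-06-18", 17),
    ],
)

# filter_match is 1 on a dry day whose next day (by date) has rain,
# i.e. on every 0 -> non-zero transition. SUM() then counts those rows.
periods = conn.execute(
    """
    SELECT SUM(filter_match = 1)
    FROM (
        SELECT (mm = 0) AND (LEAD(mm) OVER (ORDER BY day) != 0) AS filter_match
        FROM rain
    )
    """
).fetchone()[0]

print(periods)
```

One thing to keep in mind with this technique: a rain period that is already under way on the very first row has no preceding 0, so it is not counted.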
Use MySQL join:
SELECT COUNT(*) Number_of_Periods
FROM yourTable A JOIN yourTable B
ON DATE(A.`DATE`)=DATE(B.`DATE` - INTERVAL 1 DAY)
AND A.`VALUE`=0 AND B.`VALUE`>0;
See Demo on DB Fiddle.

Selecting first value of every minute in table

I've been trying to work this one out for a while now, maybe my problem is coming up with the correct search query. I'm not sure.
Anyway, the problem I'm having is that I have a table of data that has a new row added every second (imagine the structure {id, timestamp(datetime), value}). I would like to do a single query for MySQL to go through the table and output only the first value of each minute.
I thought about doing this with multiple queries with LIMIT and datetime >= (beginning of minute) but with the volume of data I'm collecting that is a lot of queries so it would be nicer to produce the data in a single query.
Sample data:
id datetime value
1 2015-01-01 00:00:00 128
2 2015-01-01 00:00:01 127
3 2015-01-01 00:00:04 129
4 2015-01-01 00:00:05 127
...
67 2015-01-01 00:00:59 112
68 2015-01-01 00:01:12 108
69 2015-01-01 00:01:13 109
Where I would want the result to select the rows:
1 2015-01-01 00:00:00 128
68 2015-01-01 00:01:12 108
Any ideas?
Thanks!
EDIT: Forgot to add, the data, whilst every second, is not reliably on the first second of every minute - it may be :30 or :01 rather than :00 seconds past the minute
EDIT 2: A nice-to-have (definitely not required for answer) would be a query that is flexible to also take an arbitrary number of minutes (rather than one row each minute)
SELECT t2.* FROM
( SELECT MIN(`datetime`) AS dt
FROM tbl
GROUP BY DATE_FORMAT(`datetime`,'%Y-%m-%d %H:%i')
) t1
JOIN tbl t2 ON t1.dt = t2.`datetime`
SQLFiddle
Or
SELECT *
FROM tbl
WHERE dt IN ( SELECT MIN(dt) AS dt
FROM tbl
GROUP BY DATE_FORMAT(dt,'%Y-%m-%d %H:%i'))
SQLFiddle
SELECT t1.*
FROM tbl t1
LEFT JOIN (
SELECT MIN(dt) AS dt
FROM tbl
GROUP BY DATE_FORMAT(dt,'%Y-%m-%d %H:%i')
) t2 ON t1.dt = t2.dt
WHERE t2.dt IS NOT NULL
SQLFiddle
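The GROUP BY plus self-join variants above can be tried out with the following sketch. It uses SQLite so that it is self-contained, so DATE_FORMAT(dt, '%Y-%m-%d %H:%i') becomes strftime('%Y-%m-%d %H:%M', dt); the rows are a subset of the sample data from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER PRIMARY KEY, dt TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [
        (1, "2015-01-01 00:00:00", 128),
        (2, "2015-01-01 00:00:01", 127),
        (67, "2015-01-01 00:00:59", 112),
        (68, "2015-01-01 00:01:12", 108),
        (69, "2015-01-01 00:01:13", 109),
    ],
)

# Inner query: earliest timestamp within each calendar minute.
# Outer join: fetch the full row that carries that timestamp.
first_rows = conn.execute(
    """
    SELECT t2.id, t2.dt, t2.value
    FROM (
        SELECT MIN(dt) AS first_dt
        FROM tbl
        GROUP BY strftime('%Y-%m-%d %H:%M', dt)
    ) t1
    JOIN tbl t2 ON t2.dt = t1.first_dt
    ORDER BY t2.dt
    """
).fetchall()

print(first_rows)
```

If two rows can share the same timestamp, the join returns both of them, so a tie-breaker on id would be needed in that case.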
In MS SQL Server I would use CROSS APPLY, but as far as I know MySQL doesn't have it, so we can emulate it.
Make sure that you have an index on your datetime column.
Create a table of numbers, or in your case a table of minutes. If you have a table of numbers starting from 1 it is trivial to turn it into minutes in the necessary range.
SELECT
tbl.ID
,tbl.`dt`
,tbl.value
FROM
(
SELECT
MinuteValue
, (
SELECT tbl.id
FROM tbl
WHERE tbl.`dt` >= Minutes.MinuteValue
ORDER BY tbl.`dt`
LIMIT 1
) AS ID
FROM Minutes
) AS IDs
INNER JOIN tbl ON tbl.ID = IDs.ID
For each minute, find the first row whose timestamp is greater than or equal to the start of that minute. I don't know how to return the full row, rather than a single column, from the nested SELECT in MySQL, so I first build a derived table with two columns (the minute and the id from the original table) and then explicitly look up the rows in the original table by their IDs.
SQL Fiddle
I've created a table of Minutes in the SQL Fiddle with the necessary values to make example simple. In real life you would have a more generic table.
Here is SQL Fiddle that uses a table of numbers, just for illustration.
In any case, you do need to know in advance somehow the range of dates/numbers you are interested in.
It is trivial to make it work for any interval of minutes. If you need results every 5 minutes, just generate a table of minutes that has values not every 1 minute, but every 5 minutes. The main query would remain the same.
It may be more efficient, because here you don't join the big table to itself and you don't make calculations on the datetime column, so the server should be able to use the index on it.
The example that I made assumes that for each minute there is at least one row in the big table. If it is possible that there are some minutes that don't have any data at all you'd need to add extra check in the WHERE clause to make sure that the found row is still within that minute.
select * from table where timestamp LIKE "%-%-% %:%:00" could work.
This is similar to this question: Stack Overflow Date SQL Query Question
Edit: This probably would work better:
select *, date_format(timestamp, '%Y-%m-%d %H:%i') as the_minute, count(*)
from table
group by the_minute
order by the_minute
Similar to this question here: mysql select date format
I'm not really sure, but you could try this:
SELECT MIN(timestamp) FROM table WHERE YEAR(timestamp)=2015 GROUP BY DATE(timestamp), HOUR(timestamp), MINUTE(timestamp)

Get stats for each day in a month without ignoring days with no data

I want to get stats for each day in a given month. However, if a day has no rows in the table, it doesn't show up in the results. How can I include days with no data, and show all days until the current date?
This is the query I have now:
SELECT DATE_FORMAT(FROM_UNIXTIME(timestamp), '%d'), COUNT(*)
FROM data
WHERE EXTRACT(MONTH FROM FROM_UNIXTIME(timestamp)) = 6
GROUP BY EXTRACT(DAY FROM FROM_UNIXTIME(timestamp))
So if I have
Row 1 | 01-06
Row 2 | 02-06
Row 3 | 03-06
Row 4 | 05-06
Row 5 | 05-06
(I changed the timestamp values to day/month dates just to explain.)
It should output
01 | 1
02 | 1
03 | 1
04 | 0
05 | 2
06 | 0
...Instead of ignoring day 4 and today (day 6).
You will need a calendar table to do something of the form
SELECT c.`date`, COUNT(d.`date`)
FROM Input_Calendar c
LEFT JOIN Data d ON c.`date` = d.`date`
GROUP BY c.`date`
I keep a full copy of a calendar table in my database and used a WHILE loop to fill it, but you can populate one on the fly using one of the various solutions out there, such as http://crazycoders.net/2012/03/using-a-calendar-table-in-mysql/
In MySQL, you can use user variables (they act like in-line programming variables). You can set and manipulate them as needed.
select
      dayofmonth( DynamicCalendar.CalendarDay ) as `Day`,
      -- count( c.date ) rather than count(*), so days with no matching rows report 0
      count( c.date ) as Entries
   from
      ( select
              @startDate := date_add( @startDate, interval 1 day ) CalendarDay
           from
              ( select @startDate := '2013-05-31' ) sqlvars,
              AnyTableThatHasAsManyDaysYouExpectToReport
           limit
              6 ) DynamicCalendar
         LEFT JOIN Input_Calendar c
            on DynamicCalendar.CalendarDay = date( from_unixtime( c.date ))
   group by
      DynamicCalendar.CalendarDay
In the above sample, the inner query can join against as the name implies "Any Table" in your database that has at least X number of records you are trying to generate for... in this case, you are dealing with only the current month of June and only need 6 records worth... But if you wanted to do an entire year, just make sure the "Any Table" has 365 records(or more).
The inner query will start by setting "@startDate" to the day BEFORE June 1st (May 31). Then, just by having the other table in the FROM clause, every record is joined to this variable (which creates a simulated for/next loop), limited to 6 records (the days you are generating the report for). So now, as the records are being queried, the start date keeps adding 1 day: the first record results in June 1st, the next record in June 2nd, etc.
So now you have a simulated calendar with 6 records dated from June 1 to June 6. Take that and join it to your "data" table, and you are already qualifying your dates via the join and get only those dates of activity. I'm joining on the DATE() of the from-unixtime value since you care about anything that happened on June 1, and June 1 @ 12:00:00 AM is different from June 1 @ 8:45 AM, so by matching on the date-only portion, they should remain in the proper grouping.
You could expand this answer by changing the inner '2013-05-31' to some MySQL Date function to get the last day of the prior month, and the limit based on whatever day in the current month you are doing so these are not hard-coded.
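As an alternative way to build the calendar on the fly, here is a sketch that swaps the user-variable trick for a recursive CTE (MySQL 8.0 supports WITH RECURSIVE as well). It runs on SQLite so that it is self-contained, so from_unixtime() becomes date(ts, 'unixepoch'); the table name events, the column ts, and the sample rows are assumptions that mirror the question's example (days 1, 2, 3, 5, 5).

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (ts INTEGER)")  # unix timestamps, as in the question

# One row on June 1, 2 and 3, two rows on June 5 (all at 08:45 UTC).
days = [1, 2, 3, 5, 5]
conn.executemany(
    "INSERT INTO events VALUES (?)",
    [(int(datetime(2013, 6, d, 8, 45, tzinfo=timezone.utc).timestamp()),) for d in days],
)

# cal generates one row per day from June 1 to June 6.
# COUNT(e.ts) ignores the NULLs produced by the LEFT JOIN, so empty days yield 0.
stats = conn.execute(
    """
    WITH RECURSIVE cal(day) AS (
        SELECT '2013-06-01'
        UNION ALL
        SELECT date(day, '+1 day') FROM cal WHERE day < '2013-06-06'
    )
    SELECT strftime('%d', cal.day) AS dom, COUNT(e.ts) AS entries
    FROM cal
    LEFT JOIN events e ON date(e.ts, 'unixepoch') = cal.day
    GROUP BY cal.day
    ORDER BY cal.day
    """
).fetchall()

print(stats)
```

The two date literals play the role of the hard-coded '2013-05-31' and limit 6 above and could be parameterised in the same way.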
Create a Time dimension. This is a standard OLAP reporting trick. You don't need a cube in order to do OLAP tricks, though. Simply find a script on the internet to generate a Calendar table and join to that table.
Also, I think your query is missing a WHERE clause.
Other useful tricks include creating a "Tally" table that is a list of numbers from 1 to N where N is usually the max of the bigint on your database management system.
No code provided here, as I am not a MySQL guru.
Pseudo-code is:
Select * from TimeDimension left join Data on Data.date = TimeDimension.date

Returning rows of aggregate results from single SQL query

I have a MySQL table containing a column to store time and another to store a value associated with that time.
time | value
------------
1 | 0.5
3 | 1.0
4 | 1.5
.... | .....
The events are not periodic, i.e., the time values do not increase by a fixed interval.
As there is a large number of rows (> 100000), for the purpose of showing the values in a graph I would like to aggregate (take the mean of) the values over intervals of fixed size, across the entire span of time for which data is available. So basically the output should consist of pairs of interval and mean value.
Currently, I am splitting the total time span into fixed chunks, executing an individual aggregate query for each chunk, and collecting the results in application code (Java). Is there a way to do all of these steps in SQL? Also, I am currently using MySQL but am open to other databases that might support an efficient solution.
SELECT FLOOR(time / x) AS Inter, AVG(value) AS Mean
FROM `table`
GROUP BY Inter;
Where x is your interval of fixed size.
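To illustrate the bucketing, here is a small sketch with x = 4. It uses SQLite so that it runs stand-alone; there, dividing two integers already truncates, so t / x stands in for FLOOR(time / x) as long as the times are non-negative integers. The table name samples, the column names, and the fourth sample row are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (t INTEGER, value REAL)")
conn.executemany(
    "INSERT INTO samples VALUES (?, ?)",
    [(1, 0.5), (3, 1.0), (4, 1.5), (9, 2.0)],
)

x = 4  # interval size: bucket 0 covers t = 0..3, bucket 1 covers t = 4..7, and so on

# Integer division assigns each row to its bucket; AVG() is then taken per bucket.
means = conn.execute(
    """
    SELECT t / ? AS inter, AVG(value) AS mean
    FROM samples
    GROUP BY inter
    ORDER BY inter
    """,
    (x,),
).fetchall()

print(means)
```

Buckets that contain no rows simply do not appear in the output; joining against a generated list of bucket numbers would be needed to show them as gaps.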
I've usually solved this through a "period" table, with all the valid times in it, and an association with the period on which I report.
For instance:
time day week month year
1 1 1 1 2001
2 1 1 1 2001
....
999 7 52 12 2010
You can then join your time to the "period" table time, and use AVG.