MySQL - Grouping values based on consecutive differences

I have a table in my database that contains an ID and DATETIME column, here is some sample data:
ID | DATETIME
1 | 2014-05-06 01:12
1 | 2014-05-06 01:30
1 | 2014-05-06 01:45
1 | 2014-05-06 02:59
2 | 2014-05-06 01:17
2 | 2014-05-06 01:18
2 | 2014-05-06 01:19
2 | 2014-05-06 02:00
I need a query that returns the ID of the object with the longest span of DATETIME values in which no two consecutive values are more than 20 minutes apart.
For example, in the sample data, I would want to return 1, as it has DATETIME values from 01:12 to 01:45 with no consecutive difference of more than 20 minutes.
Thanks.

It looks like you will need a self-join, because if you had 10 entries for an ID, your 20-minute gap might be between entries 3-6, 1-4 or even 4-9. The second instance of the table in the join is on the same ID, with a datetime later than that of the primary entry but no more than 20 minutes later. The result can then be ordered by the time gap and limited to the one row you want. Something like:
select
YT.ID,
YT.DTColumn,
MAX( YT2.DTColumn ) as MaxDateWithin20Minutes
from
YourTable YT
JOIN YourTable YT2
ON YT.ID = YT2.ID
AND YT.DTColumn < YT2.DTColumn
AND YT2.DTColumn <= date_add( YT.DTColumn, INTERVAL 20 MINUTE )
group by
YT.ID,
YT.DTColumn
order by
timediff(MAX( YT2.DTColumn ), YT.DTColumn) DESC
limit
1

You need to get the next (or previous) value and get the time difference. I think the following does what you want:
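-- YourTable stands in for the real table name (`table` itself is a reserved word)
select t.*
from (select t.*,
(select t2.`datetime`
from YourTable t2
where t2.id = t.id and t2.`datetime` < t.`datetime`
order by t2.`datetime` desc
limit 1 -- a scalar subquery must return exactly one row
) prev_datetime
from YourTable t
) t
where `datetime` <= prev_datetime + interval 20 minute
order by timestampdiff(second, prev_datetime, `datetime`) desc
limit 1;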
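For comparison, the requirement in the question (the longest chain of values per ID in which no consecutive gap exceeds 20 minutes) can be sketched outside SQL. This is a minimal Python illustration of the logic only, not a replacement for either query above; the function name `longest_run` and the in-memory row list are assumptions made for the example.

```python
from datetime import datetime, timedelta
from itertools import groupby

# Sample rows from the question: (id, datetime)
rows = [
    (1, "2014-05-06 01:12"), (1, "2014-05-06 01:30"),
    (1, "2014-05-06 01:45"), (1, "2014-05-06 02:59"),
    (2, "2014-05-06 01:17"), (2, "2014-05-06 01:18"),
    (2, "2014-05-06 01:19"), (2, "2014-05-06 02:00"),
]

def longest_run(rows, max_gap=timedelta(minutes=20)):
    """Return (id, start, end) of the longest chain whose consecutive gaps are <= max_gap."""
    parsed = sorted((i, datetime.strptime(ts, "%Y-%m-%d %H:%M")) for i, ts in rows)
    best = None
    for obj_id, group in groupby(parsed, key=lambda r: r[0]):
        times = [t for _, t in group]
        start = prev = times[0]
        for t in times[1:] + [None]:  # the trailing None flushes the final run
            if t is None or t - prev > max_gap:
                # the run start..prev has ended; keep it if it is the longest so far
                if best is None or prev - start > best[2] - best[1]:
                    best = (obj_id, start, prev)
                start = t
            prev = t
    return best

result = longest_run(rows)
print(result[0], result[1].time(), result[2].time())
```

With the sample data this selects ID 1 and the run from 01:12 to 01:45, which is the result the question asks for.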

Related

Mysql: Subtraction between rows and sum with other table

I have two tables, both with a Time column as timestamp type which is filled by default when the row is created: Table1 is updated approximately every 10 seconds:
Time | Val_1a | Val_2a | Val_3a
2021-11-06 13:59:53 | 15 | 10 | 35
2021-11-06 14:00:02 | 12 | 15 | 34
.................
2021-11-06 14:05:25 | 11 | 13 | 35
2021-11-06 14:05:35 | 11 | 17 | 36
Table2 is updated every hour after mathematical operations on table1:
Time | Var_1b | Var_2b | Var_3b
2021-11-06 11:00:00 | 2 | 15 | 30
2021-11-06 12:00:00 | 8 | 12 | 32
2021-11-06 13:00:00 | 12 | 11 | 35
What I would like to get, but have not managed to do in any way, is:
1) Check that the last table1.Val_2a value is greater than the first table1.Val_2a value written at the beginning of the current hour (with the tables above, check whether 17 > 15). If this condition is not met, the entire query must return 0. Otherwise:
2a) If the last row in table2 refers to the previous day, the query result is simply the difference between the two table1.Val_2a values (17 - 15 = 2).
2b) Otherwise, their difference is calculated as in point 2a (17 - 15 = 2) and added to the table2.Var_1b value (2 + 12 = 14).
I hope I have explained this clearly and that it is all possible with a single query. Thanks, everyone, for the support.

Sorry for adding an answer, but I couldn't add the image to a comment.
This is the query I used to test the CASE clause:
SELECT t1.dtm, t1.Val_2a2, t1.Val_2a1,
CASE WHEN Val_2a2 > Val_2a1
THEN Val_2a2-Val_2a1 ELSE 0 END AS ValF
FROM
(SELECT DATE_FORMAT(time, '%Y-%m-%d %H:00:00') dtm,
SUBSTRING_INDEX(GROUP_CONCAT(Val_2a ORDER BY time),',',1) Val_2a1,
SUBSTRING_INDEX(GROUP_CONCAT(Val_2a ORDER BY time DESC),',',1) Val_2a2
FROM table1
GROUP BY dtm) t1
and this is the unexpected result:
[image: query result]

It is possible in a single query, but different people will have different methods of doing it. Whatever the method, I personally think the most important part is to keep the logic intact. From the details in your question, I assume this is the kind of query you're looking for:
SELECT t1.dtm, t1.Val_2a2, t1.Val_2a1, t2.Val_1b2,
CASE WHEN Val_2a2 > Val_2a1
THEN Val_2a2-Val_2a1+Val_1b2 ELSE 0 END AS ValF
FROM
(SELECT DATE_FORMAT(time, '%Y-%m-%d %H:00:00') dtm,
SUBSTRING_INDEX(GROUP_CONCAT(Val_2a ORDER BY time),',',1) Val_2a1 ,
SUBSTRING_INDEX(GROUP_CONCAT(Val_2a ORDER BY time DESC),',',1) Val_2a2
FROM table1
GROUP BY dtm) t1
LEFT JOIN
(SELECT DATE(time) dtm,
SUBSTRING_INDEX(GROUP_CONCAT(Val_1b ORDER BY time DESC),',',1) Val_1b2
FROM table2
GROUP BY dtm) t2
ON DATE(t1.dtm)=t2.dtm;
Demo fiddle

Hoping it can help someone else: after some more testing, this is the final query I arrived at, considering that I just need a value on the fly without storing it.
Of course, any comments from the experts are more than appreciated.
Thanks to all.
SELECT
CASE WHEN
(ABS(t1.Val_2a2) - ABS(t1.Val_2a1)) BETWEEN 0 AND 30
THEN t1.Val_2a2-t1.Val_2a1+t2.Val_1b2
ELSE t2.Val_1b2
END AS My_result
FROM
(SELECT DATE_FORMAT(Time, '%Y-%m-%d %H:00:00') dtm,
(SELECT Val_2a FROM table1 WHERE Time >= DATE_FORMAT(NOW(),"%Y-%m-%d %H:00:00") ORDER BY Time LIMIT 1) Val_2a1,
(SELECT Val_2a FROM table1 WHERE Time >= DATE_FORMAT(NOW(),"%Y-%m-%d %H:00:00") ORDER BY Time DESC LIMIT 1) Val_2a2
FROM table1
GROUP BY dtm
ORDER BY Time DESC LIMIT 1) t1
LEFT JOIN
(SELECT (Time) dtm,
(Val_1b) Val_1b2
FROM table2
GROUP BY dtm ORDER BY dtm DESC LIMIT 1) t2
ON DATE(t1.dtm)= DATE(t2.dtm)
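The rules stated in the question (return 0 unless the last value of the current hour exceeds the first; otherwise return the difference, adding the latest table2 value only when that row is from the current day) can be written out step by step. The following is a small Python sketch of that decision logic under the question's sample data, not a translation of any of the queries above; `hourly_result` and the tuple layout are assumptions made for the illustration.

```python
from datetime import datetime

def hourly_result(table1, table2, now):
    """table1: [(time, val_2a)], table2: [(time, var_1b)], both sorted by time."""
    hour_start = now.replace(minute=0, second=0, microsecond=0)
    current = [v for t, v in table1 if t >= hour_start]
    if not current or current[-1] <= current[0]:
        return 0  # rule 1: the last value must exceed the first value of the hour
    diff = current[-1] - current[0]
    if not table2 or table2[-1][0].date() < now.date():
        return diff  # rule 2a: the last table2 row is from a previous day
    return diff + table2[-1][1]  # rule 2b: add the latest table2 value

# Sample data from the question (only the Val_2a and Var_1b columns)
table1 = [
    (datetime(2021, 11, 6, 13, 59, 53), 10),
    (datetime(2021, 11, 6, 14, 0, 2), 15),
    (datetime(2021, 11, 6, 14, 5, 25), 13),
    (datetime(2021, 11, 6, 14, 5, 35), 17),
]
table2 = [
    (datetime(2021, 11, 6, 11, 0), 2),
    (datetime(2021, 11, 6, 12, 0), 8),
    (datetime(2021, 11, 6, 13, 0), 12),
]
now = datetime(2021, 11, 6, 14, 5, 40)  # assumed "current time" for the example

result = hourly_result(table1, table2, now)
print(result)
```

With these inputs the function takes the 2b branch, (17 - 15) + 12, matching the 14 worked out in the question.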

MySQL: select entries with a certain count within a certain period

I have a MySQL table with a datetime column. How can I find all groups with at least 5 entries within 10 minutes?
My only idea is to write a program (in whatever language) that loops over the timestamps, always checks 5 (..) successive entries, calculates the time span between the last and the first, and checks whether it is below the limit.
Can this be done using a single SQL query too?
(The scenario is simplified and the numbers are just examples.)
As requested, here comes an example:
id | timestamp | other_column
---|---------------------|-------------
3 | 2017-01-01 11:00:00 | thank
2 | 2017-01-01 11:01:00 | you
1 | 2017-01-01 11:02:00 | for
* 6 | 2017-01-01 11:20:00 | your
* 5 | 2017-01-01 11:21:00 | efforts
* 4 | 2017-01-01 11:22:00 | to
* 7 | 2017-01-01 11:23:00 | help
* 8 | 2017-01-01 11:24:00 | me
9 | 2017-01-01 11:40:00 | :
10 | 2017-01-01 11:41:00 | )
If the count limit is 5 and the timespan limit is 10 minutes, I'd like to get the entries marked with "*". The "id" column is the primary key of the table, but the order is not always the order of the timestamps. The "other_column" is used for a where clause. The table has about 1 million entries.

Try to break this down logically. Sorry for the rough code, I'm a little short on time.
select t1.id, t1.timestamp, t2.timestamp
from yourtable t1
inner join yourtable t2 on t2.timestamp >= t1.timestamp and t2.timestamp < t1.timestamp + interval 10 minute
This gives you a relatively giant list of every row joined to every row within the 10 minutes that follow it (including one row for itself). At this point I'm only picking out the first row of each group; it is easier to grab the 'header row' here by its timestamp plus 10 minutes and worry about the rest in the next step. If we group by the ID and time, we get a count of how many rows were within 10 minutes:
select t1.id, t1.timestamp, count(1)
from yourtable t1
inner join yourtable t2 on t2.timestamp >= t1.timestamp and t2.timestamp < t1.timestamp + interval 10 minute
group by t1.id, t1.timestamp
having count(1) > 4
This now gives you a list of every ID and its timestamp that has itself and 4 or more other records within the 10 minutes after that timestamp. From here it depends on how you want to group; if you want each of the 5 lines, we can use the query above as a subquery and join it back to the main table to get the rows you want returned.
-- distinct avoids duplicate rows when several header rows cover the same entry
select distinct t3.*
from
(select t1.id, t1.timestamp, count(1) as cnt
from yourtable t1
inner join yourtable t2
on t2.timestamp >= t1.timestamp and t2.timestamp < t1.timestamp + interval 10 minute
group by t1.id, t1.timestamp
having count(1) > 4) a
inner join yourtable t3 on t3.timestamp >= a.timestamp and t3.timestamp < a.timestamp + interval 10 minute
And that should give you IDs 4-8 and their info (order as you see fit).
My apologies that I don't have the time to test, but the logic should work.
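The program the question describes (walk the sorted timestamps and check whether enough entries fall inside the time limit) can be sketched with a sliding window. This is a minimal Python illustration using the question's limits of 5 entries and 10 minutes; the function name `dense_ids` and the half-open window (a row exactly 10 minutes after the first one is excluded) are assumptions made for the example.

```python
from datetime import datetime, timedelta

# Sample rows from the question: (id, timestamp)
rows = [
    (3, "2017-01-01 11:00:00"), (2, "2017-01-01 11:01:00"), (1, "2017-01-01 11:02:00"),
    (6, "2017-01-01 11:20:00"), (5, "2017-01-01 11:21:00"), (4, "2017-01-01 11:22:00"),
    (7, "2017-01-01 11:23:00"), (8, "2017-01-01 11:24:00"),
    (9, "2017-01-01 11:40:00"), (10, "2017-01-01 11:41:00"),
]

def dense_ids(rows, min_count=5, span=timedelta(minutes=10)):
    """Return the ids of rows inside any window of >= min_count rows shorter than span."""
    ordered = sorted((datetime.strptime(ts, "%Y-%m-%d %H:%M:%S"), i) for i, ts in rows)
    hits = set()
    end = 0
    for start in range(len(ordered)):
        # move `end` past every row that falls in [start time, start time + span)
        while end < len(ordered) and ordered[end][0] < ordered[start][0] + span:
            end += 1
        if end - start >= min_count:
            hits.update(i for _, i in ordered[start:end])
    return hits

found = dense_ids(rows)
print(sorted(found))
```

Because `end` only ever moves forward, the scan after sorting is linear in the number of rows, which matters for a table of about a million entries.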

MySQL get count of periods where date in row

I have a MySQL table, similar to this example:
c_id date value
66 2015-07-01 1
66 2015-07-02 777
66 2015-08-01 33
66 2015-08-20 200
66 2015-08-21 11
66 2015-09-14 202
66 2015-09-15 204
66 2015-09-16 23
66 2015-09-17 0
66 2015-09-18 231
What I need is the count of periods in which the dates are consecutive. I don't have a fixed start or end date; they can be anything.
For example: 2015-07-01 - 2015-07-02 is one period, 2015-08-01 is the second period, 2015-08-20 - 2015-08-21 is the third period and 2015-09-14 - 2015-09-18 is the fourth period. So in this example there are four periods.
SELECT
SUM(value) as value_sum,
... as period_count
FROM my_table
WHERE cid = 66
I haven't been able to figure this out all day. Thanks.

I don't have enough reputation to comment to the above answer.
If all you need is the NUMBER of splits, then you can simply reword your question: "How many entries have a date D, such that the date D - 1 DAY does not have an entry?"
In which case, this is all you need:
SELECT
COUNT(*) as PeriodCount
FROM
`periods`
WHERE
DATE_ADD(`date`, INTERVAL - 1 DAY) NOT IN (SELECT `date` from `periods`);
In your PHP, just select the "PeriodCount" column from the first row.
You had me working on some crazy stored procedure approach until that clarification :P

I should get deservedly flamed for this, but anyway, consider the following...
DROP TABLE IF EXISTS my_table;
CREATE TABLE my_table
(date DATE NOT NULL PRIMARY KEY
,value INT NOT NULL
);
INSERT INTO my_table VALUES
('2015-07-01',1),
('2015-07-02',777),
('2015-08-01',33),
('2015-08-20',200),
('2015-08-21',11),
('2015-09-14',202),
('2015-09-15',204),
('2015-09-16',23),
('2015-09-17',0),
('2015-09-18',231);
SELECT x.*
, SUM(y.value) total
FROM
( SELECT a.date start
, MIN(c.date) end
FROM my_table a
LEFT
JOIN my_table b
ON b.date = a.date - INTERVAL 1 DAY
LEFT
JOIN my_table c
ON c.date >= a.date
LEFT
JOIN my_table d
ON d.date = c.date + INTERVAL 1 DAY
WHERE b.date IS NULL
AND c.date IS NOT NULL
AND d.date IS NULL
GROUP
BY a.date
) x
JOIN my_table y
ON y.date BETWEEN x.start AND x.end
GROUP
BY x.start;
+------------+------------+-------+
| start | end | total |
+------------+------------+-------+
| 2015-07-01 | 2015-07-02 | 778 |
| 2015-08-01 | 2015-08-01 | 33 |
| 2015-08-20 | 2015-08-21 | 211 |
| 2015-09-14 | 2015-09-18 | 660 |
+------------+------------+-------+
4 rows in set (0.00 sec) -- <-- This is the number of periods

There is a simpler way of doing this; see this SQLfiddle:
SELECT min(date) start,max(date) end,sum(value) total FROM
(SELECT @i:=@i+1 i,
ROUND(Unix_timestamp(date)/(24*60*60))-@i diff,
date,value
FROM tbl, (SELECT @i:=0)n WHERE c_id=66 ORDER BY date) t
GROUP BY diff
This select groups rows by the difference between the date value (in days) and a sequential row number; that difference stays constant within a run of consecutive dates.
Edit
As Strawberry remarked quite rightly, there was a flaw in my approach when a period spans a month change or indeed a change into the next year. The unix_timestamp() function can cure this though: it returns the seconds since 1970-1-1, so by dividing this number by 24*60*60 you get the days since that particular date. The rest is simple ...
If you only need the count, as your last comment stated, you can do it even simpler:
SELECT count(distinct diff) period_count FROM
(SELECT @i:=@i+1 i,
ROUND(Unix_timestamp(date)/(24*60*60))-@i diff,
date,value
FROM tbl,(SELECT @i:=0)n WHERE c_id=66 ORDER BY date) t

Thanks. @cars10's solution worked in MySQL, but I could not get the period count to echo in PHP; it returned 0. I got it working thanks to @jarkinstall. So my final select looks something like this:
SELECT
sum(coalesce(count_tmp,coalesce(count_reserved,0))) as sum
,(SELECT COUNT(*) FROM my_table WHERE cid='.$cid.' AND DATE_ADD(date, INTERVAL - 1 DAY) NOT IN (SELECT date from my_table WHERE cid='.$cid.' AND coalesce(count_tmp,coalesce(count_reserved,0))>0)) as periods
,count(*) as count
,(min(date)) as min_date
,(max(date)) as max_date
FROM my_table WHERE cid=66
AND coalesce(count_tmp,coalesce(count_reserved,0))>0
ORDER BY date;
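The idea shared by the answers above (a date starts a new period exactly when the previous day has no entry) can be checked against the sample data with a short script. This is a minimal Python sketch of that idea, assuming one row per date for the chosen c_id; the function name `periods` is made up for the illustration.

```python
from datetime import date, timedelta

# Sample rows from the question for c_id 66: (date, value)
rows = [
    (date(2015, 7, 1), 1), (date(2015, 7, 2), 777), (date(2015, 8, 1), 33),
    (date(2015, 8, 20), 200), (date(2015, 8, 21), 11), (date(2015, 9, 14), 202),
    (date(2015, 9, 15), 204), (date(2015, 9, 16), 23), (date(2015, 9, 17), 0),
    (date(2015, 9, 18), 231),
]

def periods(rows):
    """Group rows into runs of consecutive days; return [(start, end, total), ...]."""
    out = []
    for d, value in sorted(rows):
        if out and d - out[-1][1] == timedelta(days=1):
            start, _, total = out[-1]
            out[-1] = (start, d, total + value)  # extend the current run
        else:
            out.append((d, d, value))  # no entry for the previous day: new run
    return out

result = periods(rows)
print(len(result))
for start, end, total in result:
    print(start, end, total)
```

For the sample data this yields four periods, with the same start dates, end dates and totals as the result table in the self-join answer.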

MySQL - Full outer join on same table using COUNT

I am trying to generate a table in the following format.
Proday | 2014-04-01 | 2014-03-01
--------------------------------
1 | 12 | 17
2 | 6 | 0
7 | 0 | 24
13 | 3 | 7
Prodays (the duration between two timestamps) is a calculated value, and the data for each month is a COUNT. I can output the data for a single month, but am having trouble joining queries for additional months. The index (prodays) may not match for each month, e.g. 2014-04-01 may not have any data for Proday 7, whereas 2014-03-01 may not have Proday 2. Missing values should be indicated with 0 or null.
I suspect FULL OUTER JOIN is what should do the trick, but I have read that it's not possible in MySQL?
This is the query to get data for a single month:
SELECT round((protime - createtime) / 86400) AS prodays, COUNT(id) AS '2014-04-01'
FROM `tbl_users` as t1
WHERE status = 1 AND DATE_FORMAT(FROM_UNIXTIME(createtime),'%Y-%m-%d') >= '2014-04-01'
AND DATE_FORMAT(FROM_UNIXTIME(createtime),'%Y-%m-%d') <= LAST_DAY('2014-04-01')
GROUP BY prodays
ORDER BY `prodays` ASC
How can I join/union an additional query to create a column for 2014-03-01?

You want to use conditional aggregation -- that is, move the filtering logic from the where clause to the select clause:
SELECT round((protime - createtime) / 86400) AS prodays,
sum(DATE_FORMAT(FROM_UNIXTIME(createtime),'%Y-%m-%d') >= '2014-04-01' AND
DATE_FORMAT(FROM_UNIXTIME(createtime),'%Y-%m-%d') <= LAST_DAY('2014-04-01')
) as `2014-04-01`,
sum(DATE_FORMAT(FROM_UNIXTIME(createtime),'%Y-%m-%d') >= '2014-03-01' AND
DATE_FORMAT(FROM_UNIXTIME(createtime),'%Y-%m-%d') <= LAST_DAY('2014-03-01')
) as `2014-03-01`
FROM `tbl_users` as t1
WHERE status = 1
GROUP BY prodays
ORDER BY `prodays` ASC;
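Conditional aggregation means one pass over the rows in which each row adds 1 to the column of the month it belongs to, so a prodays value that appears in only one month still gets a row with 0 in the other column. The following Python sketch shows the same pivoting logic; the rows are hypothetical (the question gives no sample data), and `ts` and `pivot` are helper names invented for the example.

```python
from collections import defaultdict
from datetime import datetime, timezone

DAY = 86400  # seconds per day, as in the query

def ts(y, m, d):
    """Hypothetical helper: Unix timestamp for midnight UTC on the given day."""
    return int(datetime(y, m, d, tzinfo=timezone.utc).timestamp())

# Hypothetical rows: (createtime, protime), all assumed to have status = 1
users = [
    (ts(2014, 4, 2), ts(2014, 4, 3)),    # created in April, 1 day
    (ts(2014, 4, 10), ts(2014, 4, 12)),  # created in April, 2 days
    (ts(2014, 3, 5), ts(2014, 3, 6)),    # created in March, 1 day
    (ts(2014, 3, 7), ts(2014, 3, 14)),   # created in March, 7 days
]

def pivot(users, months):
    """Count rows per prodays value, with one counter per (year, month) column."""
    table = defaultdict(lambda: dict.fromkeys(months, 0))
    for createtime, protime in users:
        prodays = round((protime - createtime) / DAY)
        created = datetime.fromtimestamp(createtime, tz=timezone.utc)
        row = table[prodays]  # creates the row even if no listed month matches
        key = (created.year, created.month)
        if key in row:
            row[key] += 1
    return dict(sorted(table.items()))

result = pivot(users, [(2014, 4), (2014, 3)])
for prodays, counts in result.items():
    print(prodays, counts[(2014, 4)], counts[(2014, 3)])
```

Note that Python's `round` and MySQL's `ROUND` treat exact halves differently, so the sketch uses whole-day differences only.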

Given a table with time periods, query for a list of sum per day

Let's say I have a table that says how many items of something are valid between two dates.
Additionally, there may be multiple such periods.
For example, given a table:
itemtype | count | start | end
A | 10 | 2014-01-01 | 2014-01-10
A | 10 | 2014-01-05 | 2014-01-08
This means that there are 10 items of type A valid 2014-01-01 - 2014-01-10 and additionally, there are 10 valid 2014-01-05 - 2014-01-08.
So, for example, the sum of valid items on 2014-01-06 is 20.
How can I query the table to get the sum per day? I would like a result such as
2014-01-01 10
2014-01-02 10
2014-01-03 10
2014-01-04 10
2014-01-05 20
2014-01-06 20
2014-01-07 20
2014-01-08 20
2014-01-09 10
2014-01-10 10
Can this be done with SQL? Either Oracle or MySQL would be fine.

The basic syntax you are looking for is as follows:
For my example below I've defined a new table called DateTimePeriods which has a column for StartDate and EndDate both of which are DATE columns.
SELECT
SUM(NumericColumnName)
, DateTimePeriods.StartDate
, DateTimePeriods.EndDate
FROM
TableName
INNER JOIN DateTimePeriods ON TableName.dateColumnName BETWEEN DateTimePeriods.StartDate and DateTimePeriods.EndDate
GROUP BY
DateTimePeriods.StartDate
, DateTimePeriods.EndDate
Obviously the above code won't work on your database but should give you a reasonable place to start. You should look into GROUP BY and Aggregate Functions. I'm also not certain of how universal BETWEEN is for each database type, but you could do it using other comparisons such as <= and >=.

There are several ways to go about this. First, you need a list of dense dates to query. Using a row generator statement can provide that:
select date '2014-01-01' + level -1 d
from dual
connect by level <= 15;
Then for each date, select the sum of inventory:
with
sample_data as
(select 'A' itemtype, 10 item_count, date '2014-01-01' start_date, date '2014-01-10' end_date from dual union all
select 'A', 10, date '2014-01-05', date '2014-01-08' from dual),
periods as (select date '2014-01-01' + level -1 d from dual connect by level <= 15)
select
periods.d,
(select sum(item_count) from sample_data where periods.d between start_date and end_date) available
from periods
where periods.d = date '2014-01-06';
You would need to dynamically set the number of date rows to generate.
If you only needed a single row, then a query like this would work:
with
sample_data as
(select 'A' itemtype, 10 item_count, date '2014-01-01' start_date, date '2014-01-10' end_date from dual union all
select 'A', 10, date '2014-01-05', date '2014-01-08' from dual)
select sum(item_count)
from sample_data
where date '2014-01-06' between start_date and end_date;
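The expected output in the question can be reproduced by expanding each period into its individual days and summing the counts per day, which is what the dense date list plus the BETWEEN condition achieve in the queries above. This is a small Python sketch of that expansion using the question's two rows; `sum_per_day` is a name chosen for the illustration.

```python
from collections import Counter
from datetime import date, timedelta

# Rows from the question: (itemtype, count, start, end)
periods = [
    ("A", 10, date(2014, 1, 1), date(2014, 1, 10)),
    ("A", 10, date(2014, 1, 5), date(2014, 1, 8)),
]

def sum_per_day(periods):
    """Return {day: total count valid on that day}, ordered by day."""
    totals = Counter()
    for _itemtype, count, start, end in periods:
        day = start
        while day <= end:  # both endpoints are inclusive, as with BETWEEN
            totals[day] += count
            day += timedelta(days=1)
    return dict(sorted(totals.items()))

daily = sum_per_day(periods)
for day, total in daily.items():
    print(day, total)
```

This sketch ignores itemtype because the sample has a single type; with several types the dictionary key would need to include it.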
