MySQL query to skip rows and check for status changes

I'm building a MySQL query but I'm stuck. (I'm logging each minute.)
I have 3 tables: logs, log_field and log_value.
logs -> id, create_time
log_value -> id, log_id, log_field_id, value
log_field -> id, name (two of the entries are status and username)
The values for status can be online, offline and idle.
What I would like to see from my query is:
When someone's status changes in my logs, I want a row with create_time, username, status.
So for a given user, I want my query to skip rows until a new status appears.
I also need to be able to set a time interval within which status changes are ignored.
Can someone please help?

One caveat up front: nothing in your post differentiates an actual "User" (such as a user ID), so consider what happens if you have two "John Smith" names.
First, an introduction to MySQL @variables. You can think of them as an inline program running while the query is processing rows. You create variables, then change them as each row gets processed, in the same order as the := assignments occur in the field selection, which is critical. I'll cover that shortly.
Next, an initial premise. You have a field table of all possible fields that can/do get logged. Two of them matter here: one is for the user's name, another for the status whose changes you are looking for. I don't know what those internal "ID" numbers are, but they would have to be fixed values per your existing table. In my scenario, I am assuming that field ID = 1 is the user's name and field ID = 2 is the status column. Otherwise, you would need two more joins to the field table just to confirm which field was the one you wanted. Obviously my "ID" field values will not match your production tables, so please change those accordingly.
Here's the query...
select FinalAlias.*
   from (
      select
            PQ.*,
            if( @lastUser = PQ.LogUser, 1, 0 ) as SameUser,
            @lastTime := if( @lastUser = PQ.LogUser, @lastTime, @ignoreTime ) as lastChange,
            if( PQ.create_time > @lastTime + interval 20 minute, 1, 0 ) as BeyondInterval,
            @lastTime := PQ.create_time as chgTime,
            @lastUser := PQ.LogUser as chgUser
         from
            ( select
                    ByStatus.id,
                    l.create_time,
                    ByStatus.Value LogStatus,
                    ByUser.Value LogUser
                 from
                    log_value as ByStatus
                       join logs l
                          on ByStatus.log_id = l.id
                       join log_value as ByUser
                          on ByStatus.log_id = ByUser.log_id
                         AND ByUser.log_field_id = 1
                 where
                    ByStatus.log_field_id = 2
                 order by
                    ByUser.Value,
                    l.create_time ) PQ,
            ( select @lastUser := '',
                     @lastTime := now(),
                     @ignoreTime := now() ) sqlvars
      ) FinalAlias
   where
          SameUser = 1
      and BeyondInterval = 1
Now, what's going on. The inner-most query (result alias PQ, representing "PreQuery") just asks for all log values where the field_id = 2 (status column). From that log entry, go to the logs table for its creation time. While we're at it, join AGAIN to the log value table on the same log ID, but this time look for field_id = 1 so we can get the user name.
Once that is done, get the log ID, creation time, status value and who it was for, all pre-sorted per user and in time order. This is the critical step. The data must be pre-organized by user/time to compare the "last" time for a given user to the "next" time their log status changed.
Now, the MySQL @variables. Join the prequery to another select of @variables, which is given the "sqlvars" query alias. This pre-initializes the variables @lastUser, @lastTime and @ignoreTime. Now, look at what I'm doing in the field list in this section:
   if( @lastUser = PQ.LogUser, 1, 0 ) as SameUser,
   @lastTime := if( @lastUser = PQ.LogUser, @lastTime, @ignoreTime ) as lastChange,
   if( PQ.create_time > @lastTime + interval 20 minute, 1, 0 ) as BeyondInterval,
   @lastTime := PQ.create_time as chgTime,
   @lastUser := PQ.LogUser as chgUser
This is like doing the following pseudo code in a loop for every record (which is already sequentially ordered by person and their respective log time):
FOR EACH ROW IN RESULT SET
   Set a flag "SameUser" = 1 if the value of @lastUser is the same
   as the person on the current record we are looking at
   if the current user is the same as on the previous record
      use the @lastTime value as the "lastChange" column
   else
      use the @ignoreTime value as the last change column
   Now, build another flag based on the current record create time
   and whatever the @lastTime value is, plus a 20 minute interval.
   Set it to 1 if MORE THAN the 20 minute interval has passed.
   Now the key to cycling to the next record:
   force @lastTime = current record create_time
   force @lastUser = current user
END FOR LOOP
So, if you have the following as a result of the prequery... (leaving date portion off)
create  status  user    sameuser  lastchange  20minFlag  carry to next row compare
07:34   online  Bill    0         09:05       0          07:34  Bill
07:52   idle    Bill    1         07:34       0          07:52  Bill
08:16   online  Bill    1         07:52       1          08:16  Bill
07:44   online  Mark    0         09:05       0          07:44  Mark
07:37   idle    Monica  0         09:05       0          07:37  Monica
08:03   online  Monica  1         07:37       1          08:03  Monica
Notice the first record for Bill. The SameUser flag = 0 since there was nobody before him. The last change was 9:05 (via the NOW() when creating the sqlvars variables), but then look at the "carry to next row compare" columns. This is @lastTime and @lastUser being set after the current row was done being compared.
Next row for Bill. It sees he is the same as the user on the previous row, so the SameUser flag is set to 1. We now know that we have a good "Last Time" to compare against the current record's "Create Time". From 7:34 to 7:52 is 18 minutes, which is LESS than our 20 minute interval, so the 20 minute flag is set to 0. Now, we retain the current 7:52 and Bill for the third row.
Third row for Bill. Still the same user (flag = 1), last change of 7:52 compared to now 8:16, and we have 24 minutes. So the 20 minute flag = 1. Retain 8:16 and Bill for the next row.
First row for Mark. SameUser = 0 since the last user was Bill. It uses the same 9:05 ignore time and we don't care about the 20 minute flag, but we now save 7:44 and Mark for the next row compare.
On to Monica. Different than Mark, so SameUser = 0, and so on, finishing the same way as Bill.
So, now we have all the pieces and rows considered. Take all of these and wrap them up as the "FinalAlias" of the query, and all we do is apply a WHERE clause for "SameUser = 1" AND the 20 minute flag (BeyondInterval) = 1.
You can strip down the final column list as needed, and remove the where clause to look at the results, but be sure to add an outer ORDER BY clause for name/create_time to see a similar pattern as I have here.
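The row-by-row walkthrough above can be replayed outside MySQL to check the flags by hand. This is only a minimal Python sketch of the same loop, not the query itself: the rows are the sample values from the table above, the date 2024-01-01 is made up (only the time of day matters), and `find_interval_rows` and `at` are hypothetical helper names.

```python
from datetime import datetime, timedelta

def find_interval_rows(rows, now, interval=timedelta(minutes=20)):
    """Replay the @lastUser / @lastTime logic over rows sorted by (user, create_time)."""
    last_user, last_time = "", now      # same seeds as the sqlvars subquery
    kept = []
    for create_time, status, user in rows:
        same_user = (last_user == user)
        # Fall back to the "ignore" time whenever the user changes.
        last_change = last_time if same_user else now
        beyond_interval = create_time > last_change + interval
        if same_user and beyond_interval:
            kept.append((create_time.strftime("%H:%M"), status, user))
        # Carry the current row forward for the next comparison.
        last_user, last_time = user, create_time
    return kept

def at(hhmm):
    # Hypothetical date; only the time of day comes from the sample table.
    return datetime.strptime("2024-01-01 " + hhmm, "%Y-%m-%d %H:%M")

sample = [
    (at("07:34"), "online", "Bill"),
    (at("07:52"), "idle", "Bill"),
    (at("08:16"), "online", "Bill"),
    (at("07:44"), "online", "Mark"),
    (at("07:37"), "idle", "Monica"),
    (at("08:03"), "online", "Monica"),
]

result = find_interval_rows(sample, now=at("09:05"))
print(result)
```

Only the two rows whose 20 minute flag is 1 in the table survive the final filter, which matches what the outer WHERE clause keeps.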

Related

Keep the newest one field value until it changes then keeping its newest field value

I have a few tables that have millions of records where a sensor was sending multiple 0 and 1 values and this data was logged to the table even though we only needed it to keep the very first 1 or 0 per each 1 to 0 or 0 to 1 change.
Adjustments have been made so we only now get the 1 and 0 values on each change and not every one second or whatever but I need to cleanup the unnecessary records from the tables.
I've done some research and testing and I'm having trouble figuring out what method to use here to delete the records not needed. I was trying to figure out how to retain the previous value record using variables and also created row numbers but it's not working as I need it to.
I created an SQLFiddle here and tried some logic per the example post MySQL - How To Select Rows Depending on Value in Previous Row (Remove Duplicates in Each Sequence). I keep getting back no results from this, and when I tried running it on a large local MySQL table I got an error, so I had to increase the MySQL Workbench read query timeout to 600 or it lost connection.
I also found the "MySql - How get value in previous row and value in next row?" post and tried some variations of it and also "How to get next/previous record in MySQL?" and I've come up with total failure getting the expected results.
The Data
The data in the tables has a TimeStr column and a Value column just as in the screen shot and on the SQLFiddle link I posted with a small sample of the data.
Each record will never have the same TimeStr value but I really only need to keep the very first record time wise when the sensor either turned ON or OFF if that clarifies.
I'm not sure if the records will need an incremental row number added to get the expected results since it only has the TimeStr and the Value records otherwise.
My Question
Can anyone help me determine a method that I can use on a couple of large tables to delete the records where there are subsequent and duplicate Value values, so the tables only have the very first 1 or 0 records where those actually change from a 1 to 0 or 0 to 1?
I will accept an answer that also results in just the records needed, but any that perform fast would be even more greatly appreciated.
I can easily put those into a temp table, drop the original table, and then create and insert the needed records only into the original table.
Expected Results
| TimeStr | Value |
|----------------------|-------|
| 2018-02-13T00:00:00Z | 0 |
| 2018-02-13T00:00:17Z | 1 |
| 2018-02-13T00:00:24Z | 0 |
| 2018-02-13T00:00:28Z | 1 |
Select t.timestr, t.value from (
    SELECT s.*, @pv x1, (@pv := s.value) x2
    FROM sensor s, (select @pv := -1) x
    ORDER BY TimeStr ) t
where t.x1 != t.x2
See http://sqlfiddle.com/#!9/8d0774/122
Try this:
SET @rownum = 0;
SET @rownum_x = 0;
SELECT b.rownum, b.TimeStr, b.Value
FROM
(
    SELECT @rownum := @rownum+1 as rownum, TimeStr, Value
    FROM sensor
    ORDER BY TimeStr
) b
LEFT JOIN (
    SELECT @rownum_x := @rownum_x+1 as rownum_x, TimeStr as TimeStr_x, Value as Value_x
    FROM sensor
    ORDER BY TimeStr
) x ON b.rownum = x.rownum_x + 1
where b.Value <> x.Value_x or x.Value_x is null
order by b.TimeStr
The result I got is
You want the first record for each value when it appears. This suggests variables. Here is one way that only involves sorting and no joining:
select t.*
from (select t.*,
             (case when value = @prev_value then value
                   when (@save_prev := @prev_value) = NULL then NULL
                   when (@prev_value := value) = NULL then NULL
                   else @save_prev
              end) as prev_value
      from (select t.*
            from sensor t
            order by timestr
           ) t cross join
           (select @prev_value := -1) params
     ) t
where prev_value <> value;
Notes:
The subquery for ordering only seems to be needed since MySQL 5.7.
The case is just a way to introduce serialized code. A variable should be assigned and read within a single expression, because MySQL does not guarantee the order in which separate expressions are evaluated.
This only requires one sort, and if you have an index on timestr, even that sort may not be needed.
Here is a SQL Fiddle.
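To make the "first record of each run" rule concrete, here is a minimal Python sketch of the same comparison against the previous value. It is not a replacement for the queries above. The function name `first_of_each_run` is hypothetical, and only the four rows listed under "Expected Results" come from the question; the repeated readings in between are made up for illustration.

```python
def first_of_each_run(rows):
    """Keep only rows whose value differs from the previous row (rows sorted by time)."""
    kept = []
    prev_value = None            # plays the role of @prev_value
    for time_str, value in rows:
        if value != prev_value:
            kept.append((time_str, value))
        prev_value = value
    return kept

# Made-up sample: the 00:00:05, 00:00:20 and 00:00:30 rows are invented repeats.
readings = [
    ("2018-02-13T00:00:00Z", 0),
    ("2018-02-13T00:00:05Z", 0),
    ("2018-02-13T00:00:17Z", 1),
    ("2018-02-13T00:00:20Z", 1),
    ("2018-02-13T00:00:24Z", 0),
    ("2018-02-13T00:00:28Z", 1),
    ("2018-02-13T00:00:30Z", 1),
]

# ISO-8601 strings in the same timezone sort chronologically.
changes = first_of_each_run(sorted(readings))
print(changes)
```

The rows that are dropped here are the ones a cleanup DELETE would target.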

Replace mysql user defined variable

I have a query which works great given that the result is only one number, but now I need to allow for multiple rows to be returned, and the query cannot handle that because it uses a user-defined variable. Here is the original procedure:
CREATE DEFINER=`root`@`%` PROCEDURE `MapRank`(pTankID bigint, pMapID int, pColor int(2))
BEGIN
    SET @RankNumber:=0;
    select RankNumber
    from
        (select
            TankID,
            @RankNumber:=@RankNumber+1 as RankNumber,
            MapID,
            Color
        from MAPDATA WHERE MapID = pMapID order by Rank DESC, TotalPP DESC) Query1
    where TankID = pTankID AND Color = pColor;
END
This returns a single number, essentially counting how many records down the tank is, which gives me its "row" location.
Now I need to change it to give me all rows, without the where for MapID and Color, so that I can see all ranks for every MapID/Color combination.
This is what I have, which currently does not work:
SET @RankNumber:=0;
select
    RankNumber, MapID, Color
from
    (select
        TankID,
        @RankNumber:=@RankNumber + 1 as RankNumber,
        MapID,
        Color
    from
        MAPDATA
    order by TotalPP DESC) Query1
where
    TankID = 18209 ORDER BY RankNumber
The query result (RankNumber, MapID, Color) looks like this:
1062 3 1
3544 3 0
6717 17 1
6752 17 3
7453 3 2
7860 17 0
7984 17 2
9220 3 3
If I manually run the FIRST query for, say, map id 3 and color 3, which the result above says has rank number 9220, I get this:
6022
I need this to work from multiple MySQL connections, so ideally it would be done without a temporary variable, since it's possible another person may come in and use it. Any help would be great.
After digging and experimenting more, I found the solution was to set the variable back to zero from within the outer select. Since user-defined variables are connection-level and I use pooling, we should never have an issue.
SET @RankNumber:=0;
select
    RankNumber, MapID, Color, @RankNumber:=0
from
    (select
        TankID,
        @RankNumber:=@RankNumber + 1 as RankNumber,
        MapID,
        Color
    from
        MAPDATA
    order by MapID, Rank DESC, TotalPP DESC ) Query1
where
    TankID = pTankID ORDER BY RankNumber;
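For comparison, here is a minimal Python sketch of what "rank within each map" would look like if the counter restarted every time MapID changes. Note this is a different technique from the query above, which resets @RankNumber in the outer select. It assumes the rank is meant to be per MapID, as the original procedure's WHERE MapID = pMapID suggests. The rows and the name `rank_within_map` are made up; only the column names come from the question.

```python
def rank_within_map(rows):
    """Number rows 1..n inside each MapID, ordered by Rank DESC, TotalPP DESC."""
    ordered = sorted(rows, key=lambda r: (r["MapID"], -r["Rank"], -r["TotalPP"]))
    ranked = []
    last_map, counter = None, 0
    for row in ordered:
        if row["MapID"] != last_map:
            counter = 0                  # restart the numbering for a new map
            last_map = row["MapID"]
        counter += 1
        ranked.append({**row, "RankNumber": counter})
    return ranked

# Hypothetical data, invented for illustration only.
map_data = [
    {"TankID": 18209, "MapID": 3,  "Color": 1, "Rank": 5, "TotalPP": 900},
    {"TankID": 42,    "MapID": 3,  "Color": 1, "Rank": 9, "TotalPP": 100},
    {"TankID": 18209, "MapID": 17, "Color": 0, "Rank": 7, "TotalPP": 300},
    {"TankID": 7,     "MapID": 17, "Color": 0, "Rank": 7, "TotalPP": 500},
    {"TankID": 99,    "MapID": 17, "Color": 2, "Rank": 1, "TotalPP": 50},
]

mine = [
    (r["MapID"], r["Color"], r["RankNumber"])
    for r in rank_within_map(map_data)
    if r["TankID"] == 18209
]
print(mine)
```

Because nothing here lives in connection-level state, two callers cannot interfere with each other's counter, which was the concern raised in the question.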

SQL - Find all down times and the lengths of the downtimes from MySQL data (set of rows with time stamps and status messages)

I have started monitoring my ISP's downtimes with a looping PHP script which checks the connection automatically every 5 seconds and stores the result in MySQL database. The scripts checks if it's able to reach a couple of remote websites and logs the result. The time and status of the check are always stored in the database.
The structure of the table is following:
id (auto increment)
time (time stamp)
status (varchar)
Now to my issue.
I have the data, but I don't know how to use it to achieve the result I would like to get. Basically I would like to find all the periods of time when the connection was down and for how long the connection was down.
For instance if we have 10 rows with following data
0 | 2012-07-24 22:23:00 | up
1 | 2012-07-24 22:23:05 | up
2 | 2012-07-24 22:23:10 | down
3 | 2012-07-24 22:23:16 | down
4 | 2012-07-24 22:23:21 | up
5 | 2012-07-24 22:23:26 | down
6 | 2012-07-24 22:23:32 | down
7 | 2012-07-24 22:23:37 | up
8 | 2012-07-24 22:23:42 | up
9 | 2012-07-24 22:23:47 | up
the query should return the periods (from 22:23:10 to 22:23:21, and from 22:23:26 to 22:23:37). So the query should find always the time between the first time the connection goes down, and the first time the connection is up again.
One method I thought could work was finding all the rows where the connection goes down or up, but how could I find these rows? And is there some better solution than this?
I really don't know what the query should look like, so the help would be highly appreciated.
Thank you, regards Lassi
Here's one approach.
Start by getting the status rows in order by timestamp (inline view aliased as s). Then use MySQL user variables to keep the values from previous rows, as you process through each row.
What we're really looking for is an 'up' status that immediately follows a sequence of 'down' status. And when we find that row with the 'up' status, what we really need is the earliest timestamp from the preceding series of 'down' status.
So, something like this will work:
SELECT d.start_down
     , d.ended_down
  FROM (SELECT @i := @i + 1 AS i
             , @start := IF(s.status = 'down' AND (@status = 'up' OR @i = 1), s.time, @start) AS start_down
             , @ended := IF(s.status = 'up' AND @status = 'down', s.time, NULL) AS ended_down
             , @status := s.status
          FROM (SELECT t.time
                     , t.status
                  FROM mydata t
                 WHERE t.status IN ('up','down')
                 ORDER BY t.time ASC, t.status ASC
               ) s
          JOIN (SELECT @i := 0, @status := 'up', @ended := NULL, @start := NULL) i
       ) d
 WHERE d.start_down IS NOT NULL
   AND d.ended_down IS NOT NULL
This works for the particular data set you show.
What this doesn't handle (what it doesn't return) is a 'down' period that is not yet ended, that is, a sequence of 'down' status with no following 'up' status.
To avoid a filesort operation to return the rows in order, you'll want a covering index on (time,status). This query will generate a temporary (MyISAM) table to materialize the inline view aliased as d.
NOTE: To understand what this query is doing, peel off that outermost query, and run just the query for the inline view aliased as d (you can add s.time to the select list.)
This query is getting every row with an 'up' or 'down' status. The "trick" is that it is assigning both a "start" and "end" time (marking a down period) on only the rows that end a 'down' period. (That is, the first row with an 'up' status following rows with a 'down' status.) This is where the real work is done, the outermost query just filters out all the "extra" rows in this resultset (that we don't need.)
SELECT @i := @i + 1 AS i
     , @start := IF(s.status = 'down' AND (@status = 'up' OR @i = 1), s.time, @start) AS start_down
     , @ended := IF(s.status = 'up' AND @status = 'down', s.time, NULL) AS ended_down
     , @status := s.status
     , s.time
  FROM (SELECT t.time
             , t.status
          FROM mydata t
         WHERE t.status IN ('up','down')
         ORDER BY t.time ASC, t.status ASC
       ) s
  JOIN (SELECT @i := 0, @status := 'up', @ended := NULL, @start := NULL) i
The purpose of inline view aliased as s is to get the rows ordered by timestamp value, so we can process them in sequence. The inline view aliased as i is just there so we can initialize some user variables at the start of the query.
If we were running on Oracle or SQL Server, we could make use of "analytic functions" or "ranking functions" (as they are named, respectively). MySQL before 8.0 doesn't provide anything like that, so we have to "roll our own".
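On servers that do have window functions (MySQL added LAG() and LEAD() in 8.0), the same periods can be found without user variables. The sketch below is only an illustration: it runs the SQL against an in-memory SQLite database from Python so that it is self-contained, and it is assumed, not verified, to carry over to MySQL 8.0 or later. The table and column names (mydata, time, status) follow the answer above, the sample rows are the ten rows from the question, and the CTE names `flagged` and `edges` are made up.

```python
import sqlite3

# Sample rows copied from the question: (id, time, status).
rows = [
    (0, "2012-07-24 22:23:00", "up"),
    (1, "2012-07-24 22:23:05", "up"),
    (2, "2012-07-24 22:23:10", "down"),
    (3, "2012-07-24 22:23:16", "down"),
    (4, "2012-07-24 22:23:21", "up"),
    (5, "2012-07-24 22:23:26", "down"),
    (6, "2012-07-24 22:23:32", "down"),
    (7, "2012-07-24 22:23:37", "up"),
    (8, "2012-07-24 22:23:42", "up"),
    (9, "2012-07-24 22:23:47", "up"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mydata (id INTEGER PRIMARY KEY, time TEXT, status TEXT)")
conn.executemany("INSERT INTO mydata VALUES (?, ?, ?)", rows)

query = """
WITH flagged AS (
    -- attach the previous row's status to every row
    SELECT time, status,
           LAG(status) OVER (ORDER BY time) AS prev_status
      FROM mydata
     WHERE status IN ('up', 'down')
),
edges AS (
    -- keep only rows where the status changed, and look ahead to the next change
    SELECT time, status,
           LEAD(time) OVER (ORDER BY time) AS next_change
      FROM flagged
     WHERE prev_status IS NULL OR prev_status <> status
)
SELECT time AS start_down, next_change AS ended_down
  FROM edges
 WHERE status = 'down' AND next_change IS NOT NULL
 ORDER BY start_down
"""

periods = conn.execute(query).fetchall()
print(periods)
```

As with the user-variable version, a 'down' period that has not ended yet is left out, because its next_change is NULL.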
I don't really have time to adapt this to work for your setup right now, but I'm doing pretty much the same thing on a web page to monitor when a computer was turned off, and when it was turned back on, then calculating the total time it was on for...
I also don't know if you have access to PHP, if not completely ignore this. If you do, you might be able to adapt something like this:
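$lasttype="OFF";
$ontime=0;
$totalontime=0;
$query2 = " SELECT
                log_unixtime,
                status
            FROM somefaketablename
            ORDER BY
                log_unixtime asc
            ;";
// Note: mysql_query() and mysql_fetch_array() were removed in PHP 7.0;
// mysqli_query() and mysqli_fetch_array() are the replacements.
$result2=mysql_query($query2);
while($row2=mysql_fetch_array($result2)){
    if($lasttype=="OFF" && $row2['status']=="ON"){
        // OFF -> ON: remember when this "on" block started
        $ontime = $row2['log_unixtime'];
    }elseif($lasttype=="ON" && $row2['status']=="OFF"){
        // ON -> OFF: add the length of the block that just ended
        $thisblockontime=$row2['log_unixtime']-$ontime;
        $totalontime+=($thisblockontime);
    }
    $lasttype=$row2['status'];
}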
Basically, you start out with a fake row that says the computer is off, then loop through each real row.
IF the computer was off, but is now on, set a variable to see when it was turned on, then keep looping...
Keep looping until the computer was ON, but is now OFF. When that happens, subtract the previously-stored time it was turned on from the current row's time. That shows how long it was on for, for that group of "ON's".
Like I said, you'll have to adapt that pretty heavily to get it to do what you want, but if you replace "computer on/off" with "connection up/down", it's essentially the same idea...
One thing that makes this work is that I'm storing dates as integers, as a unix timestamp. So you might have to convert your dates so the subtraction works.
I'm unsure if this works (if not, just comment).
What it does: select a row only if the row whose id is 1 smaller than the current id has a different status (therefore selecting the first entry of any period), and determine the end time through the >= comparison and the same status.
SELECT ou.id AS outerId,
       ou.timeColumn AS currentRowTime,
       ou.status AS currentRowStatus,
       ( SELECT max(i.timeColumn)
           FROM statusTable i
          WHERE i.timeColumn >= ou.timeColumn AND i.status = ou.status) AS endTime
  FROM statusTable ou
 WHERE ou.status !=
       (SELECT p.status
          FROM statusTable p
         WHERE p.id = (ou.id - 1))

Use mysql to work out in and out times of vehicle at customer, multiple stops and entries for each customer

I have a mysql table that contains data as per the screenshot below.
My requirement is to generate a mysql query that will show me the in and out time for each customer.
the issue I have is that I cannot use min or max as the vehicle might have visited the same customer two or three times within a period.
So the output I am looking for is:
Vehicle: RB10
Customer: Hulamin
In: 10:19
out: 10:35
Time Taken: 16 min
In: 11:14
out: 11:29
Time Taken: 15 min
ave time taken: 15.5 min
and the same for each of the other sites and vehicles as required.
How do I tell mysql to take the smallest in time before the corresponding out time and report?
Many thanks for the assistance.
You could do it using SQL variables to help control when the address changes, even IF the same address occurs multiple times. Without having MySQL readily available, I would approach it something like below. Start with an inner query that stamps a "GroupSeq" based on a change in vehicle and/or address. Keep the order sequential by date/time. For each row, @lastGroup is either left alone or has 1 added to it, and THEN @lastAddress and @lastVehicle are updated as the basis for comparing the NEXT record selected into the result set.
Per your example, the results for each customer would be (all for the same vehicle, so I'm not repeating that column):
Address   GroupSeq
Hulamin   1
SACD      2
UL        3
NP        4
Hulamin   5
SACD      6
After that, you can then properly do your MIN/MAX based on the GroupSeq assigned.
select
      PreQuery.Vehicle,
      PreQuery.Address,
      PreQuery.GroupSeq,
      MIN( PreQuery.`DateTime` ) as InTime,
      MAX( PreQuery.`DateTime` ) as OutTime
   from
      ( select
              YT.Vehicle,
              YT.Address,
              YT.`DateTime`,
              YT.Direction,
              @lastGroup := @lastGroup + if( @lastAddress = YT.Address
                                         AND @lastVehicle = YT.Vehicle, 0, 1 ) as GroupSeq,
              @lastVehicle := YT.Vehicle as justVarVehicleChange,
              @lastAddress := YT.Address as justVarAddressChange
           from
              YourTable YT,
              ( select @lastVehicle := '',
                       @lastAddress := '',
                       @lastGroup := 0 ) SQLVars
           order by
              YT.`DateTime` ) PreQuery
   Group By
      PreQuery.Vehicle,
      PreQuery.Address,
      PreQuery.GroupSeq
The above SHOULD result in something like
Vehicle  Address  GroupSeq  InTime  OutTime
RB10     Hulamin  1         10:19   10:35
RB10     SACD     2         10:37   10:40
RB10     UL       3         10:41   11:06
RB10     NP       4         11:07   11:14
RB10     Hulamin  5         11:14   11:28
RB10     SACD     6         11:29   12:21
Now, the above sample does not actually compute the total time taken per in-out, nor the average per vehicle/customer average time for what appears to be processing, but you can add those computations after you understand and get this part.
Please note, this is based on natural order as appears by date/time. It looks like one transaction from beginning to end can have many "IN"s, but ALWAYS ends with an "OUT" before proceeding to the next customer address. If this is an incorrect assumption, modifications would obviously need to be made.
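The GroupSeq idea can be sketched in a few lines of Python to see why a repeat visit to the same address gets its own group. This is an illustration under assumptions, not the query: the function name `visits` is hypothetical, and the input rows are made-up readings that reuse a few of the times from the result table above.

```python
def visits(rows):
    """Start a new group whenever vehicle or address changes; track first/last time per group."""
    groups = []                              # one dict per visit, in time order
    last_vehicle = last_address = None       # play the role of @lastVehicle / @lastAddress
    for vehicle, address, when in sorted(rows, key=lambda r: r[2]):
        if (vehicle, address) != (last_vehicle, last_address):
            groups.append({
                "Vehicle": vehicle, "Address": address,
                "GroupSeq": len(groups) + 1,
                "InTime": when, "OutTime": when,
            })
        else:
            # Rows arrive in time order, so the latest row is the MAX so far.
            groups[-1]["OutTime"] = when
        last_vehicle, last_address = vehicle, address
    return groups

# Hypothetical readings as (vehicle, address, "HH:MM").
readings = [
    ("RB10", "Hulamin", "10:19"),
    ("RB10", "Hulamin", "10:35"),
    ("RB10", "SACD", "10:37"),
    ("RB10", "SACD", "10:40"),
    ("RB10", "Hulamin", "11:14"),
    ("RB10", "Hulamin", "11:28"),
]

summary = [(g["Address"], g["GroupSeq"], g["InTime"], g["OutTime"]) for g in visits(readings)]
print(summary)
```

Like the query, this orders by time only, so it assumes readings from different vehicles are not interleaved.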

MySQL - self join optimization

I have a table of phone events by HomeId. Each row has an EventId (on hook, off hook, ring, DTMF, etc.), TimeStamp, Sequence (auto increment) and HomeId. I'm working on a query to find specific types of occurrences (i.e. inbound or outbound calls) and their duration.
I had planned on doing this using a multiple self-join on this table to pick out the sequences of events that usually indicate one type of occurrence or the other. E.g. inbound calls would be a period of inactivity followed by no DTMF, then ringing and caller id (possibly), then an off hook. I would find the next on-hook and thus have the duration.
My table is indexed by HomeId, EventId and Sequence and has ~60K records. When I do an 'explain' of my query it shows indexing and 75, 75, 1, 1, 748 for the row counts. Seems pretty doable. But when I run the query it's taking more than 10 minutes (at which point the MySQL query browser times out).
Query for outbound calls:
select pe0.HomeId, pe1.Stamp, pe1.mSec, timediff( pe4.Stamp, pe0.Stamp ) from Phone_Events pe0
join Phone_Events pe1 on pe0.HomeId = pe1.HomeId and pe1.Sequence = pe0.Sequence - 1 and abs(timediff( pe0.Stamp, pe1.Stamp )) > 10
join Phone_Events pe2 on pe0.HomeId = pe2.HomeId and pe2.Sequence = pe0.Sequence + 1 and pe2.EventId = 22
join Phone_Events pe4 on pe4.HomeId = pe0.HomeId and pe4.EventId = 30 and pe4.Stamp > pe0.Stamp
where pe0.eventId = 12 and pe0.HomeId = 111
AND
NOT EXISTS(SELECT * FROM Phone_Events pe3
WHERE pe3.HomeId = pe0.HomeId
AND pe3.EventId not in( 13, 22 )
AND pe3.Stamp > pe0.Stamp and pe3.Stamp < pe4.Stamp );
Is there something specific to self joining that makes this slow? Is there a better way to optimize this? The killer seems to be the 'not exists' portion - this part is there to make sure there are no events between the last 'on hook' and the current 'off hook'.
EDIT: EventId's as follows:
'1', 'device connection'
'2', 'device disconnection'
'3', 'device alarm'
'11', 'ring start'
'12', 'off hook'
'13', 'hang up(other end)'
'15', 'missed call'
'21', 'caller id'
'22', 'dtmf'
'24', 'device error'
'30', 'on hook'
'31', 'ring stop'
Complete rewrite based on new information. How I approached this was to start with an inner-most query to get all records we care about based exclusively on HomeID = 111, and make sure they came back pre-sorted by the sequence ID (have an index on HomeID, Sequence). As we all know, a phone call starts by picking up the phone (eventID = 12), getting dial tone (eventid = 22), dialing out, and someone answering, until the phone is back on the hook (eventid = 30). If it's a hangup (eventid = 13), we want to ignore it.
I don't know why you are looking at the sequence # PRIOR to the current call, and don't know if it really has any bearing. It looks like you are just trying to get completed calls and their duration. That said, I would remove the LEFT JOIN to Phone_Events and the corresponding WHERE clause. It may have been there while you were just trying to figure this out.
Anyhow, back to the logic. The inner-most query guarantees the call sequences are in order. You won't have two simultaneous calls. So by getting them in order first, I then join to the SQLVars (which creates the inline variable @NextCall for the query). The purpose of this is to identify every time a new call is about to begin (EventID = 12). If so, take whatever the sequence number is, and save it. This will remain the same until the next call, so all the other "event IDs" will have the same "starting sequence ID". In addition, I'm looking for the other events: an event = 22 at the starting sequence + 1, set as a flag. Then, the max time based on the start of the call (only set when eventid = 12) and the end of the call (eventid = 30), and finally a flag based on your check for a hang up (eventid = 13), i.e. don't consider the call if it was a hangup and no connection went through.
By doing a group by, I've in essence rolled up each call to its own line, grouped by the home ID and the sequence number used to initiate the actual phone call. Once THAT is done, I can then query the data and compute the call duration since the start/end time are on the same row, with no self-self-self joins involved.
Finally, the where clause: kick out any phone calls that HAD a HANG UP. Again, I don't know if you still need the element of how the starting call's time relates to the last ending event.
SELECT
      PreGroupedCalls.*,
      timediff( PreGroupedCalls.CallEndTime, PreGroupedCalls.CallStartTime ) CallDuration
   from
      ( SELECT
              Calls.HomeID,
              @NextCall := if( Calls.EventID = 12, Calls.Sequence, @NextCall ) as NextNewCall,
              MAX( if( Calls.EventID = 12, Calls.Stamp, 0 )) as CallStartTime,
              MAX( if( Calls.EventID = 30, Calls.Stamp, 0 )) as CallEndTime,
              MAX( if( Calls.EventID = 22 and Calls.Sequence = @NextCall + 1, 1, 0 )) as HadDTMFEntry,
              MAX( if( Calls.EventID = 13, 1, 0 )) as WasAHangUp
           from
              ( select pe.HomeId,
                       pe.Sequence,
                       pe.EventID,
                       pe.Stamp
                   from
                      Phone_Events pe
                   where
                      pe.HomeID = 111
                   order by
                      pe.Sequence ) Calls,
              ( select @NextCall := 0 ) SQLVars
           group by
              Calls.HomeID,
              NextNewCall ) PreGroupedCalls
      LEFT JOIN Phone_Events PriorCallEvent
         ON PriorCallEvent.Sequence = PreGroupedCalls.NextNewCall - 1
   where
          PreGroupedCalls.WasAHangUp = 0
      AND ( PriorCallEvent.Sequence IS NULL
         OR abs(timediff( PriorCallEvent.Stamp, PreGroupedCalls.CallStartTime )) > 10 )
COMMENT FROM FEEDBACK / ERROR reported
To try and fix the DOUBLE error, you obviously will need to make a slight change in the SQLVars select.. try the following
( select @NextCall := CAST( 0 as SIGNED ) ) SQLVars
Now, what the IF() is doing... Let's take a look.
@NextCall := if( Calls.EventID = 12, Calls.Sequence, @NextCall )
means take a look at the Event ID. If it is a 12 (i.e. off-hook), grab whatever the sequence number is for that entry. This will become the new "Starting Sequence" of another call. If not, just keep whatever the last value set was, as it's a continuation of a call in progress. Now, let's look at some simulated data to help better illustrate all the columns.
Original data                        Values that will ultimately be built into...
HomeID  Sequence  EventID  Stamp     @NextCall
111     1         12       8:00:00   1   beginning of a new call
111     2         22       8:00:01   1   not a new "12" event, keep last value
111     3         30       8:05:00   1   call ended, phone back on hook
111     4         12       8:09:00   4   new call, use the sequence of THIS entry
111     5         22       8:09:01   4   same call
111     6         13       8:09:15   4   same call, but a hang up
111     7         30       8:09:16   4   same call, phone back on hook
111     8         12       8:15:30   8   new call, get sequence ID
111     9         22       8:15:31   8   same call...
111     10        30       8:37:15   8   same call ending...
Now, the query SHOULD create something like this
HomeID  NextNewCall  CallStartTime  CallEndTime  HadDTMFEntry  WasAHangUp
111     1            8:00:00        8:05:00      1             0
111     4            8:09:00        8:09:16      1             1
111     8            8:15:30        8:37:15      1             0
As you can see, @NextCall keeps all the sequential entries for a given call "grouped" together, so you don't have to rely on greater-than or less-than span comparisons. A call is always going to follow a certain path of "events", so whichever event started the call is the basis for the rest of the events until the next call is started, and then THAT sequence is grabbed for THAT call's group.
Yup, it's a lot to grasp, but hopefully now more digestible for you :)
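To check the roll-up against the simulated data above, here is a minimal Python sketch of the same grouping: every event is filed under the Sequence of the most recent off-hook (EventID 12), and each call is summarised on one row. The function name `roll_up_calls` is hypothetical and the hang-up flag is set for any EventID 13 inside the call, which is an assumption taken from the expected output table rather than from the query text.

```python
def roll_up_calls(events):
    """Group events by the Sequence of the latest EventID 12 and summarise each call."""
    calls = {}          # start sequence -> summary dict (insertion-ordered)
    current = None      # plays the role of @NextCall
    for home_id, seq, event_id, stamp in sorted(events, key=lambda e: e[1]):
        if event_id == 12:                       # off hook: a new call begins
            current = seq
            calls[current] = {
                "HomeID": home_id, "NextNewCall": seq,
                "CallStartTime": stamp, "CallEndTime": None,
                "HadDTMFEntry": 0, "WasAHangUp": 0,
            }
        if current is None:
            continue                             # events before the first off-hook
        call = calls[current]
        if event_id == 30:                       # on hook: the call ends
            call["CallEndTime"] = stamp
        elif event_id == 22 and seq == current + 1:
            call["HadDTMFEntry"] = 1
        elif event_id == 13:                     # hang up (other end)
            call["WasAHangUp"] = 1
    return list(calls.values())

# The simulated rows from the table above: (HomeID, Sequence, EventID, Stamp).
events = [
    (111, 1, 12, "8:00:00"), (111, 2, 22, "8:00:01"), (111, 3, 30, "8:05:00"),
    (111, 4, 12, "8:09:00"), (111, 5, 22, "8:09:01"), (111, 6, 13, "8:09:15"),
    (111, 7, 30, "8:09:16"), (111, 8, 12, "8:15:30"), (111, 9, 22, "8:15:31"),
    (111, 10, 30, "8:37:15"),
]

calls = roll_up_calls(events)
completed = [c["NextNewCall"] for c in calls if c["WasAHangUp"] == 0]
print(completed)
```

Filtering on WasAHangUp == 0 corresponds to the final WHERE clause that kicks out calls with a hang up.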