Calculating time difference between activity timestamps in a query - ms-access

I'm reasonably new to Access and having trouble solving what should be (I hope) a simple problem - think I may be looking at it through Excel goggles.
I have a table named importedData into which I (not so surprisingly) import a log file each day. This log file is from a simple data-logging application on some mining equipment, and essentially it saves a timestamp and status for the point at which the current activity changes to a new activity.
A sample of the data looks like this:
This information is then filtered using a query to define the range I want to see information for, say from 29/11/2013 06:00:00 AM until 29/11/2013 06:00:00 PM.
Now the object of this is to take a status entry's timestamp and get the time difference between it and the record on the subsequent row of the query results. As the equipment works for a 12hr shift, I should then be able to build a picture of how much time the equipment spent doing each activity during that shift.
In the above example, the equipment was in status "START_SHIFT" for 00:01:00, in status "DELAY_WAIT_PIT" for 06:08:26, and so on. I would then build a unique list of the status entries for the period selected, and sum the total time for each status to get my shift summary.

You can use a correlated subquery to fetch the next timestamp for each row.
SELECT
i.status,
i.timestamp,
(
SELECT Min([timestamp])
FROM importedData
WHERE [timestamp] > i.timestamp
) AS next_timestamp
FROM importedData AS i
WHERE i.timestamp BETWEEN #2013-11-29 06:00:00#
AND #2013-11-29 18:00:00#;
Then you can use that query as a subquery in another query where you compute the duration between timestamp and next_timestamp. And then use that entire new query as a subquery in a third where you GROUP BY status and compute the total duration for each status.
Here's my version which I tested in Access 2007 ...
SELECT
sub2.status,
Format(Sum(Nz(sub2.duration,0)), 'hh:nn:ss') AS SumOfduration
FROM
(
SELECT
sub1.status,
(sub1.next_timestamp - sub1.timestamp) AS duration
FROM
(
SELECT
i.status,
i.timestamp,
(
SELECT Min([timestamp])
FROM importedData
WHERE [timestamp] > i.timestamp
) AS next_timestamp
FROM importedData AS i
WHERE i.timestamp BETWEEN #2013-11-29 06:00:00#
AND #2013-11-29 18:00:00#
) AS sub1
) AS sub2
GROUP BY sub2.status;
If you run into trouble or need to modify it, break out the innermost subquery, sub1, and test that by itself. Then do the same for sub2. I suspect you will want to change the WHERE clause to use parameters instead of hard-coded times.
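If you want to sanity-check the logic outside Access, the same three steps (find the next timestamp, subtract, sum per status) can be sketched in Python. This is only an illustration: the rows are hypothetical, chosen to match the durations mentioned in the question, and the status name DIGGING is invented.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical log rows (timestamp, status), already sorted by timestamp.
rows = [
    (datetime(2013, 11, 29, 6, 0, 0), "START_SHIFT"),
    (datetime(2013, 11, 29, 6, 1, 0), "DELAY_WAIT_PIT"),
    (datetime(2013, 11, 29, 12, 9, 26), "DIGGING"),  # invented status name
    (datetime(2013, 11, 29, 12, 30, 0), "DELAY_WAIT_PIT"),
    (datetime(2013, 11, 29, 13, 0, 0), "DIGGING"),
]


def summarize(rows):
    """Total time per status; each row lasts until the next row's timestamp."""
    totals = defaultdict(timedelta)
    # Pair every row with its successor. The final row has no successor,
    # so it contributes nothing (similar in spirit to Nz(duration, 0)).
    for (start, status), (next_start, _) in zip(rows, rows[1:]):
        totals[status] += next_start - start
    return dict(totals)


totals = summarize(rows)
for status, total in totals.items():
    print(status, total)
```

One difference to keep in mind: in the query, the subquery has no upper bound, so the last row inside the date range still picks up the first timestamp after the range if one exists. The sketch simply drops the last row.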
Note the query Format expression would not be appropriate if your durations exceed 24 hours. Here is an Immediate window session which illustrates the problem ...
' duration greater than one day:
? #2013-11-30 02:00# - #2013-11-29 01:00#
1.04166666667152
' this Format() makes the 25 hr. duration appear as 1 hr.:
? Format(#2013-11-30 02:00# - #2013-11-29 01:00#, "hh:nn:ss")
01:00:00
However, if you're dealing exclusively with data from 12 hr. shifts, this should not be a problem. Keep it in mind in case you ever need to analyze data which spans more than 24 hrs.
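If you ever do need totals beyond 24 hours, one option is to do the formatting yourself from the fractional-day value instead of relying on Format(). Here is a minimal Python sketch of that arithmetic; the function name format_duration is made up for illustration, and the same calculation could be ported to a VBA function.

```python
def format_duration(days: float) -> str:
    """Format a duration given in fractional days as total hours:minutes:seconds."""
    # A Date/Time difference is a number of days, so scale to whole seconds first.
    total_seconds = round(days * 86400)
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    # Hours are not wrapped at 24, so a 25 hr. duration stays 25.
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


print(format_duration(1.04166666667152))  # 25:00:00
```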
If subqueries are unfamiliar, see Allen Browne's page: Subquery basics. He discusses correlated subqueries in the section titled Get the value in another record.

Related

MySQL - group by interval query optimisation

Some background first. We have a MySQL database with a "live currency" table. We use an API to pull the latest currency values for different currencies, every 5 seconds. The table currently has over 8 million rows.
Structure of the table is as follows:
id (INT 11 PK)
currency (VARCHAR 8)
value (DECIMAL)
timestamp (TIMESTAMP)
Now we are trying to use this table to plot the data on a graph. We are going to have various different graphs, e.g. Live, Hourly, Daily, Weekly, Monthly.
I'm having a bit of trouble with the query. Using the Weekly graph as an example, I want to output data from the last 7 days, in 15-minute intervals. So here is how I have attempted it:
SELECT *
FROM currency_data
WHERE ((currency = 'GBP')) AND (timestamp > '2017-09-20 12:29:09')
GROUP BY UNIX_TIMESTAMP(timestamp) DIV (15 * 60)
ORDER BY id DESC
This outputs the data I want, but the query is extremely slow. I have a feeling the GROUP BY clause is the cause.
Also BTW I have switched off the sql mode 'ONLY_FULL_GROUP_BY' as it was forcing me to group by id as well, which was returning incorrect results.
Does anyone know of a better way of doing this query which will reduce the time taken to run the query?
You may want to create summary tables for each of the graphs you want to do.
If your data really is coming every 5 seconds, you can attempt something like:
SELECT *
FROM currency_data cd
WHERE currency = 'GBP' AND
timestamp > '2017-09-20 12:29:09' AND
UNIX_TIMESTAMP(timestamp) MOD (15 * 60) BETWEEN 0 AND 4
ORDER BY id DESC;
For both this query and your original query, you want an index on currency_data(currency, timestamp, id).
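To see why the MOD filter keeps one row per 15-minute interval, here is a small Python sketch of the arithmetic. It assumes samples arrive exactly every 5 seconds, which real data may not honour (a late or missing sample would leave a bucket empty or doubled); the starting timestamp is arbitrary.

```python
from collections import Counter

BUCKET = 15 * 60  # bucket width in seconds, as in the queries above
STEP = 5          # assumed sampling interval in seconds


def bucket_of(ts: int) -> int:
    """Same grouping key as UNIX_TIMESTAMP(timestamp) DIV (15 * 60)."""
    return ts // BUCKET


def keep(ts: int) -> bool:
    """Same filter as UNIX_TIMESTAMP(timestamp) MOD (15 * 60) BETWEEN 0 AND 4."""
    return 0 <= ts % BUCKET <= 4


# 2000 samples, exactly 5 seconds apart, from an arbitrary starting point.
timestamps = [1_000_003 + i * STEP for i in range(2000)]
kept = [ts for ts in timestamps if keep(ts)]
per_bucket = Counter(bucket_of(ts) for ts in kept)

print(len(timestamps), "rows reduced to", len(kept))
```

Because the window 0 to 4 is exactly one sampling step wide, each full bucket contributes exactly one row, and no GROUP BY is needed.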

setting up a mysql database to record tasks

I am trying to set up a MySQL database using phpMyAdmin. Before I get too far into it, I want advice on setting it up and querying it. I set the table up like this.
id: primary key
time_in: date
time_out: date
task: varchar (128)
business: varchar (128)
All I need it to do is keep track of how much time is spent on each task and for what business. Is this a good way of doing it, or is there a better way?
If this is correct, then I am trying to figure out how to query the time. This is what I have come up with as a query, but it is far from what I want.
SELECT `Task`,`Business`, (SELECT `Time-Out` - `Time-In`) as `total time` FROM `Sheet`
Is there a way to convert total time into a more readable format?
Unless you're tracking time in days, I'd recommend using TIME or DATETIME for the time_in and time_out columns.
Personally, I'd probably make time_out nullable, to allow tracking current activity (something I've started, but not yet finished).
There's no need to use a sub-select for subtracting the timestamps; you can subtract those two columns inline as well (just drop the SELECT keyword there). For formatting, you could use the TIMEDIFF function:
SELECT '12:00:00' - '10:45:00';
-> 2
SELECT TIMEDIFF('12:00:00', '10:45:00');
-> '01:15:00'
That would make your query:
SELECT `Task`, `Business`, TIMEDIFF(`Time-Out`, `Time-In`) as `total time` FROM `Sheet`
If you do make time_out (or Time-Out) nullable, you'll need to take that into account in your query:
SELECT TIMEDIFF(NULL, '10:45:00');
-> NULL
So ongoing tasks would give a total time of NULL. If you want to know how long you've been working already, you can wrap it in an IFNULL function and get the current time in that case:
SELECT TIMEDIFF(IFNULL(`Time-Out`, NOW()), `Time-In`);
-> '01:15:00' if `Time-Out` is 12:00:00 and `Time-In` is 10:45:00
-> '02:05:13' if `Time-Out` is NULL and it's currently 12:50:13 (server time)
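The fallback logic is easy to check outside the database. Below is a minimal Python sketch of the same rule (use time_out if it is set, otherwise the current time); the function name elapsed is made up for illustration, and the dates are arbitrary.

```python
from datetime import datetime, timedelta
from typing import Optional


def elapsed(time_in: datetime, time_out: Optional[datetime], now: datetime) -> timedelta:
    """Mirror TIMEDIFF(IFNULL(time_out, NOW()), time_in)."""
    # An unfinished task (time_out is None) is measured up to the current time.
    end = time_out if time_out is not None else now
    return end - time_in


day = datetime(2017, 1, 1)  # arbitrary date for the example
started = day.replace(hour=10, minute=45)
current = day.replace(hour=12, minute=50, second=13)

print(elapsed(started, day.replace(hour=12), current))  # 1:15:00
print(elapsed(started, None, current))                  # 2:05:13
```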
You will want to use TIMEDIFF(end_time, start_time) and TIME_TO_SEC(time) to convert the difference to a total number of seconds. You can then convert the seconds mathematically to whatever format you want.
So for the time of each task:
select ID
,task
,business
,time_to_sec(timediff(time_out, time_in)) as duration
from sheet
To aggregate by task and business:
select task
,business
,sum(time_to_sec(timediff(time_out,time_in))) as total_time
from sheet
group by task
,business

select minute( now()-ins_time ) returning null

I have a simple report that some users can run to check that we have received data from an external source. It counts the minutes since the last line was updated where INS_TIME is the time that the interface inserted the data.
Code below.
select minute( now()-ins_time ) as LastUpdateMinutes
from p_electro
order by ins_time desc limit 1
This works well and is used in a Crystal Report. However, I am trying to make another one for a different table with a little extra complexity. I copied the above code and added to it as below.
select minute( now()-ins_time ) as LastUpdateMinutes
from p_treats
where location like '%e5%' and creator is null
order by ins_time desc limit 1
These criteria should indicate that the row was inserted by the interface, and the query should then show the minutes since the last such row was entered. However, when I run it, it completes successfully but returns NULL (as below).
LastUpdateMinutes
-----------------
| NULL |
I was expecting to see thousands of minutes given we haven't received any data since 13.05.17 at 11am.
This works fine for the first table and has been in use for years.
To add some context, our data clerk runs this report twice a day and reports if the time in minutes is longer than 20 minutes. With the second table, we have not received data for 5 days (thanks to measures put in place after this cyber-attack) and it wasn't flagged up until the light bulb ignited in my head that this was probably affected.
I want to create another report for this table that we can add to the clerk's daily checks, so we learn of missing data sooner.
Does anyone have any advice on why the second example doesn't work but the first example does? And how to fix it?
I have tried several variations along the lines of:
SELECT
datediff(now(), ins_time) AS DiffDate
FROM
p_treats
WHERE
location LIKE 'e5%' and
creator is null
ORDER BY ins_time DESC
LIMIT 1
Which actually works to pull back the number of days and:
SELECT
timediff(now(), ins_time) AS DiffDate
FROM
p_hdtreatment
WHERE
hpwhere LIKE 'e5%' and
creator is null
ORDER BY ins_time DESC
LIMIT 1
Which returns the hours, and this might actually have to be what we use, but I wanted to make it a little more accurate.
Thanks in advance
Try this.
SELECT TIMESTAMPDIFF(MINUTE, ins_time, now()) AS LastUpdateMinutes
from p_treats
where location like '%e5%'
and creator is null
order by ins_time desc limit 1
See the MySQL documentation for TIMESTAMPDIFF.
Side note:
I believe minute(), as used in the first "working" example, is giving you wrong answers. minute() only extracts the minute part of a date/time value. E.g. SELECT MINUTE('2008-02-03 10:05:03') yields 5.
Assuming ins_time is of type TIMESTAMP or DATETIME, if the elapsed time is greater than 1 hour, the result of MINUTE(now() - ins_time) is wrong. I would need to test this more to be sure.
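As far as I understand it, MySQL turns a DATETIME into a number of the form YYYYMMDDHHMMSS when it is used in a numeric context, so now() - ins_time subtracts two such numbers rather than two points in time. The Python sketch below imitates that arithmetic and compares it with the real elapsed minutes. The helper names are made up and the timestamps are hypothetical; it illustrates the arithmetic only and does not reproduce how MINUTE() then interprets the number.

```python
from datetime import datetime


def packed(dt: datetime) -> int:
    """Imitate a DATETIME in numeric context: YYYYMMDDHHMMSS as an integer."""
    return int(dt.strftime("%Y%m%d%H%M%S"))


def true_minutes(earlier: datetime, later: datetime) -> int:
    """Whole minutes between two points in time, like TIMESTAMPDIFF(MINUTE, ...)."""
    return int((later - earlier).total_seconds() // 60)


ins_time = datetime(2017, 5, 13, 11, 0, 0)     # hypothetical last insert
current = datetime(2017, 5, 18, 11, 30, 0)     # hypothetical "now"

numeric_difference = packed(current) - packed(ins_time)
print(numeric_difference)               # 5003000, not a count of minutes
print(true_minutes(ins_time, current))  # 7230
```

The numeric difference is just the digits of the two dates subtracted, which is why a true time-difference function such as TIMESTAMPDIFF is the safer choice.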

How to return zero values if nothing was written in time interval?

I am using Graph Reports with the select below. The MySQL database only contains active records, so if there are no records between hour X and hour Y, the select returns nothing for that interval. In my case, I need the select to return zero values for PayPal as well, even when there was no activity in the database. I do not understand how to use UNION, or how to rewrite the select, to get zero values when nothing was recorded in a time interval. Could you please help?
select STR_TO_DATE ( DATE_FORMAT(`acctstarttime`,'%y-%m-%d %H'),'%y-%m-%d %H')
as '#date', count(*) as `Active Paid Accounts`
from radacct_history where `paymentmethod` = 'PayPal'
group by DATE_FORMAT(`#date`,'%y-%m-%d %H')
When I run the select the output is:
Current Output
But if there are no values between 2016-07-27 07:00:00 and 2016-07-28 11:00:00, it should show zero active accounts for every hour, like this:
Needed output with no values every hour
I created the select below, but it does not add a zero value for every hour as I need. It still shows a big gap between 12 Sep and 13 Sep, where there should be a zero value for every hour.
(select STR_TO_DATE ( DATE_FORMAT(acctstarttime,'%y-%m-%d %H'),'%y-%m-%d %H')
as '#date', count(paymentmethod) as Active Paid Accounts
from radacct_history where paymentmethod <> 'PayPal'
group by DATE_FORMAT(#date,'%y-%m-%d %H'))
union ALL
(select STR_TO_DATE ( DATE_FORMAT(acctstarttime,'%y-%m-%d %H'),'%y-%m-%d %H')
as '#date', 0 as Active Paid Accounts
from radacct_history where paymentmethod <> 'PayPal'
group by DATE_FORMAT(#date,'%y-%m-%d %H')) ;
I guess you want to return 0 if there are no matching rows in MySQL. Here is an example:
(SELECT Col1,Col2,Col3 FROM ExampleTable WHERE ID='1234')
UNION (SELECT 'Def Val' AS Col1,'none' AS Col2,'' AS Col3) LIMIT 1;
Update: judging by the desired output, you are trying to retrieve rows for dates that aren't present in the table. In that case you have to maintain a date table that supplies the dates missing from your data. It's a little bit tricky; please refer to: SQL query that returns all dates not used in a table
You need an artificial table containing all the necessary time intervals. E.g. if you need daily data, create a table and fill it with every date, say from 1970 to 2100.
Then you can select from that table and LEFT JOIN your radacct_history to it. That way every desired interval produces a group row (the GROUP BY should be based on the intervals table).
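The effect of joining against an intervals table can be sketched in a few lines of Python: generate every hour in the range, then look up the count for each one and default to zero. This is only an illustration of the idea, not the SQL itself; the function names and the sample counts are hypothetical.

```python
from datetime import datetime, timedelta


def hourly_buckets(start: datetime, end: datetime) -> list:
    """Every whole hour from start to end inclusive (the 'intervals table')."""
    current = start.replace(minute=0, second=0, microsecond=0)
    buckets = []
    while current <= end:
        buckets.append(current)
        current += timedelta(hours=1)
    return buckets


def zero_filled(counts: dict, start: datetime, end: datetime) -> list:
    """Like a LEFT JOIN from the intervals to the data: missing hours get 0."""
    return [(hour, counts.get(hour, 0)) for hour in hourly_buckets(start, end)]


# Hypothetical per-hour counts, as the original GROUP BY query would return them.
counts = {
    datetime(2016, 7, 27, 7): 3,
    datetime(2016, 7, 27, 10): 1,
}
filled = zero_filled(counts, datetime(2016, 7, 27, 7), datetime(2016, 7, 27, 11))
for hour, active in filled:
    print(hour, active)
```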

Aggregating/Grouping a set of rows/records in MySQL

I have a table, say "sample", which saves a new record every five minutes.
Users might ask for data collected for a specific sampling interval of, say, 10 min, 30 min or an hour.
Since I have a record every five minutes, when a user asks for data with an hour sample interval, I have to group every 12 (60/5) records into one record (already sorted by timestamp), and the criterion could be any of min/max/avg/last value.
I tried doing this in Java after fetching all the records, but performance is pretty bad because I have to iterate through the collection multiple times. I have read about alternatives such as jAgg and lambdaj, but wanted to check whether this is possible in SQL (MySQL) itself.
The sampling interval is dynamic, and the aggregation function (min/max/avg/last) is user-provided too.
Any pointers?
You can do this in SQL, but you have to carefully construct the statement. Here is an example by hour for all four aggregations:
select min(datetime) as datetime,
min(val) as minval, max(val) as maxval, avg(val) as avgval,
substring_index(group_concat(val order by datetime desc), ',', 1) as lastval
from sample t
group by floor(to_seconds(datetime) / (60*60));
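The grouping key in that query is just integer division of the time by the interval length, so making the interval and the aggregate dynamic is mostly a matter of parameterising those two pieces. Here is a rough Python sketch of the same idea for comparison; the function name resample and the sample data are made up, and timestamps are assumed to be plain seconds.

```python
from statistics import mean

# Aggregations a user may ask for; "last" relies on the input being time-sorted.
AGGREGATES = {
    "min": min,
    "max": max,
    "avg": mean,
    "last": lambda values: values[-1],
}


def resample(samples, interval_seconds, how):
    """Group (seconds, value) pairs by floor(seconds / interval) and aggregate."""
    groups = {}
    for ts, value in samples:
        groups.setdefault(ts // interval_seconds, []).append(value)
    aggregate = AGGREGATES[how]
    return [
        (bucket * interval_seconds, aggregate(values))
        for bucket, values in sorted(groups.items())
    ]


# Two hours of hypothetical data: one sample every five minutes.
samples = [(i * 300, i) for i in range(24)]
print(resample(samples, 60 * 60, "max"))
print(resample(samples, 30 * 60, "avg"))
```

This makes a single pass over the rows, which may also help if you decide to keep the aggregation in application code rather than SQL.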