Here is a sample table that I am using:
User_id  timestamp            action
1        2020-10-01 09:00:00  Opened page
1        2020-10-01 09:10:00  Closed page
2        2020-10-02 04:00:00  Signed up
3        2020-10-02 06:00:00  Opened page
3        2020-10-03 11:00:00  Made a booking
3        2020-10-03 09:30:00  Closed page
I need to write a SQL query to find the average time spent by a user on the page.
The expected answer is just a number which represents the average time spent by an average user on the page.
You can't use SQL to calculate how much time a user spends on different pages of your UI application. You need to implement this logic in your UI so that it runs whenever there is an event, such as the user navigating to another page or clicking a button. Capture the timestamps you need in the UI, then write them to the database from your server-side code (such as .NET, Java or Node.js) through a stored procedure call or a query.
Once you have captured the data from the UI, you can implement any kind of logic on that data in SQL, through a stored procedure, a function or something similar.
If you use TIMESTAMPDIFF() with SECOND as its unit argument, you get the difference between two datetime fields in a record as a number of seconds, which can be summed and divided. Documentation:
Returns datetime_expr2 − datetime_expr1, where datetime_expr1 and datetime_expr2 are date or datetime expressions.
Then use SUM() to sum up these values, and divide by the results of COUNT(). Documentation:
SUM(): Returns the sum of expr. If the return set has no rows, SUM() returns NULL.
COUNT(): Returns a count of the number of non-NULL values of expr in the rows retrieved by a SELECT statement.
Your code will then basically look like this. You may need to make some adjustments based on your database setup.
SELECT
    SUM(
        TIMESTAMPDIFF(SECOND, OrigDateTime, LastDateTime)
    ) / (SELECT COUNT(id) FROM yourTable)
    AS average
FROM yourTable;
This, of course, follows our standard formula for calculating an average:
sum(differences) / count(differences)
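The query above assumes each row already holds both a start and an end time. With the event table from the question, each "Opened page" row would first have to be paired with the next "Closed page" row for the same user. Here is a minimal sketch of that pairing, using Python's built-in sqlite3 module as a stand-in for the real database. The table name events and the function name are made up for illustration, and SQLite's strftime('%s', ...) takes the place of MySQL's TIMESTAMPDIFF():

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    (1, "2020-10-01 09:00:00", "Opened page"),
    (1, "2020-10-01 09:10:00", "Closed page"),
    (2, "2020-10-02 04:00:00", "Signed up"),
    (3, "2020-10-02 06:00:00", "Opened page"),
    (3, "2020-10-03 11:00:00", "Made a booking"),
    (3, "2020-10-03 09:30:00", "Closed page"),
]

# For every "Opened page" row, find the earliest later "Closed page" row
# of the same user, then average the differences in seconds.
QUERY = """
SELECT AVG(CAST(strftime('%s', closed_at) AS INTEGER)
         - CAST(strftime('%s', opened_at) AS INTEGER))
FROM (
    SELECT o.timestamp AS opened_at,
           (SELECT MIN(c.timestamp)
              FROM events AS c
             WHERE c.user_id = o.user_id
               AND c.action = 'Closed page'
               AND c.timestamp > o.timestamp) AS closed_at
      FROM events AS o
     WHERE o.action = 'Opened page'
)
WHERE closed_at IS NOT NULL
"""


def average_seconds_on_page(rows):
    """Return the average open-to-close time in seconds (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events (user_id INTEGER, timestamp TEXT, action TEXT)"
    )
    conn.executemany("INSERT INTO events VALUES (?, ?, ?)", rows)
    (average,) = conn.execute(QUERY).fetchone()
    conn.close()
    return average


print(average_seconds_on_page(ROWS))  # 49800.0
```

For the sample rows this finds two visits, 600 seconds for user 1 and 99000 seconds for user 3, so the average is 49800 seconds. User 2 never opened the page and is ignored. Whether an open without a close should count at all is a decision the question leaves open; this sketch simply skips it.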
Related
I would like to discuss the "best" way to store date periods in a database. Let's talk about SQL/MySQL, but this question may apply to any database. I have the feeling I have been doing something wrong for years...
In English, the information I have is:
- In year 2014, the value is 1000
- In year 2015, the value is 2000
- In year 2016, there is no value
- From year 2017 onward, the value is 3000
Someone may store as:
BeginDate   EndDate     Value
2014-01-01  2014-12-31  1000
2015-01-01  2015-12-31  2000
2017-01-01  NULL        3000
Others may store as:
Date        Value
2014-01-01  1000
2015-01-01  2000
2016-01-01  NULL
2017-01-01  3000
The validation rules for the first method look like mayhem to develop, since they have to prevent holes and overlaps.
In the second method, the problem seems to be filtering for a single point-in-time date inside a period.
What do my colleagues prefer? Any other suggestions?
EDIT: I used full years only as an example; my data usually changes with day granularity.
EDIT 2: I thought about using the stored "Date" as "BeginDate", ordering rows by Date, then taking the "EndDate" from the next (or previous) row. Storing "BeginDate" and "Interval" would lead to the same hole/overlap problem as method one, which needs a complex validation rule to avoid.
It mostly depends on the way you will be using this information - I'm assuming you do more than just store values for a year in your database.
Lots of guesses here, but I guess you have other tables with time-bounded data, and that you need to compare the dates to find matches.
For instance, in your current schema:
select *
from other_table ot
inner join year_table yt on ot.transaction_date between yt.year_start and yt.year_end
That should be an easy query to optimize - it's a straight date comparison, and if the table is big enough, you can add indexes to speed it up.
In your second schema suggestion, it's not as easy:
select *
from other_table ot
inner join year_table yt
    on ot.transaction_date >= yt.year_start
    and ot.transaction_date < yt.year_start + INTERVAL 1 YEAR
Crucially - this is harder to optimize, as every comparison needs to execute a scalar function. It might not matter - but with a large table, or a more complex query, it could be a bottleneck.
You can also store the year as an integer (as some of the commenters recommend).
select *
from other_table ot
inner join year_table yt on year(ot.transaction_date) = yt.year
Again - this is likely to have a performance impact, as every comparison requires a function to execute.
The purist in me doesn't like to store this as an integer - so you could also use MySQL's YEAR datatype.
So, assuming data size isn't an issue you're optimizing for, the solution really would lie in the way your data in this table relates to the rest of your schema.
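For the second storage method from the question (one row per begin date, with a NULL row marking a gap), looking up the value for a single date can be done by taking the latest row that starts on or before that date. Here is a minimal sketch of that lookup, using Python's built-in sqlite3 module as a stand-in database. The table name periods, the column names and both function names are assumptions made for illustration:

```python
import sqlite3


def build_periods():
    """Create an in-memory table holding the sample data from the question."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE periods (begin_date TEXT PRIMARY KEY, value INTEGER)"
    )
    conn.executemany(
        "INSERT INTO periods VALUES (?, ?)",
        [
            ("2014-01-01", 1000),
            ("2015-01-01", 2000),
            ("2016-01-01", None),  # explicit "no value" period
            ("2017-01-01", 3000),
        ],
    )
    return conn


def value_on(conn, day):
    """Return the value in force on `day` (ISO date string), or None."""
    row = conn.execute(
        "SELECT value FROM periods "
        "WHERE begin_date <= ? "
        "ORDER BY begin_date DESC LIMIT 1",
        (day,),
    ).fetchone()
    # No row at all means the date is before the first period.
    return row[0] if row else None


conn = build_periods()
print(value_on(conn, "2015-06-15"))  # 2000
print(value_on(conn, "2016-03-01"))  # None
```

Because each row implicitly ends where the next one begins, this layout cannot contain overlaps, and a gap has to be stored explicitly as a NULL row. The comparison is on the bare column, so an index on the begin date can be used.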
I have the following table in MySQL:
proj_id|hoursWorked|Date.
The Date field is of type DATE. In my Java-based web application, I want to retrieve all entries for a project that fall in a given week number. Please help me to achieve this.
I am unable to write a single query that does this.
Do not use something like WHERE WEEK(column)=something - this is a performance killer: it will calculate the week number for all rows, even those that don't match. In addition, it makes it impossible to use an index on this column.
Instead, calculate an absolute begin and end date (or point in time, depending on your data type), then use BETWEEN. This does no calculation on non-matching rows and allows the use of an index.
Rule of thumb: if you have the choice between a calculation on a constant and a calculation on a field, use the former.
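Here is a minimal sketch of the "calculate the begin and end date first" approach. It is written in Python rather than the asker's Java, and it assumes ISO 8601 week numbering (weeks start on Monday), which as far as I know corresponds to MySQL's WEEK(date, 3) rather than the default mode. The function names, the table name your_table and the %s placeholder style are assumptions for illustration:

```python
from datetime import date, timedelta


def iso_week_bounds(year, week):
    """Return (monday, sunday) of the given ISO week as date objects."""
    start = date.fromisocalendar(year, week, 1)  # 1 = Monday, Python 3.8+
    end = start + timedelta(days=6)
    return start, end


def week_query(year, week):
    """Build a parameterized query that compares the bare column to constants."""
    start, end = iso_week_bounds(year, week)
    sql = (
        "SELECT proj_id, hoursWorked, `Date` "
        "FROM your_table "
        "WHERE `Date` BETWEEN %s AND %s"
    )
    return sql, (start.isoformat(), end.isoformat())


sql, params = week_query(2013, 1)
print(params)  # ('2012-12-31', '2013-01-06')
```

The week calculation happens once, on constants, and the WHERE clause touches the column without wrapping it in a function. It also pins the year, which a bare week-number comparison does not.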
Use the MySQL WEEK() function.
SELECT WEEK(dateColumn)
FROM...
WHERE WEEK(dateColumn) = 1
WEEK(), from the MySQL docs:
This function returns the week number for date. The two-argument form of WEEK() enables you to specify whether the week starts on Sunday or Monday and whether the return value should be in the range from 0 to 53 or from 1 to 53.
Use WEEK
select * from your_table
where week(`Date`) = week('2012-12-01')
If you want to get only records from the current week you can do
select * from your_table
where week(`Date`) = week(curdate())
I'm reasonably new to Access and having trouble solving what should be (I hope) a simple problem - think I may be looking at it through Excel goggles.
I have a table named importedData into which I (not so surprisingly) import a log file each day. This log file is from a simple data-logging application on some mining equipment, and essentially it saves a timestamp and status for the point at which the current activity changes to a new activity.
A sample of the data looks like this:
This information is then filtered using a query to define the range I want to see information for, say from 29/11/2013 06:00:00 AM until 29/11/2013 06:00:00 PM.
Now the object of this is to take a status entry's timestamp and get the time difference between it and the record on the subsequent row of the query results. As the equipment works a 12-hour shift, I should then be able to build a picture of how much time the equipment spent on each activity during that shift.
In the above example, the equipment was in status "START_SHIFT" for 00:01:00, in status "DELAY_WAIT_PIT" for 06:08:26 and so on. I would then build a unique list of the status entries for the selected period, and sum the total time for each status to get my shift summary.
You can use a correlated subquery to fetch the next timestamp for each row.
SELECT
    i.status,
    i.timestamp,
    (
        SELECT Min([timestamp])
        FROM importedData
        WHERE [timestamp] > i.timestamp
    ) AS next_timestamp
FROM importedData AS i
WHERE i.timestamp BETWEEN #2013-11-29 06:00:00#
    AND #2013-11-29 18:00:00#;
Then you can use that query as a subquery in another query where you compute the duration between timestamp and next_timestamp. And then use that entire new query as a subquery in a third where you GROUP BY status and compute the total duration for each status.
Here's my version which I tested in Access 2007 ...
SELECT
    sub2.status,
    Format(Sum(Nz(sub2.duration,0)), 'hh:nn:ss') AS SumOfduration
FROM
    (
        SELECT
            sub1.status,
            (sub1.next_timestamp - sub1.timestamp) AS duration
        FROM
            (
                SELECT
                    i.status,
                    i.timestamp,
                    (
                        SELECT Min([timestamp])
                        FROM importedData
                        WHERE [timestamp] > i.timestamp
                    ) AS next_timestamp
                FROM importedData AS i
                WHERE i.timestamp BETWEEN #2013-11-29 06:00:00#
                    AND #2013-11-29 18:00:00#
            ) AS sub1
    ) AS sub2
GROUP BY sub2.status;
If you run into trouble or need to modify it, break out the innermost subquery, sub1, and test that by itself. Then do the same for sub2. I suspect you will want to change the WHERE clause to use parameters instead of hard-coded times.
Note the query Format expression would not be appropriate if your durations exceed 24 hours. Here is an Immediate window session which illustrates the problem ...
' duration greater than one day:
? #2013-11-30 02:00# - #2013-11-29 01:00#
1.04166666667152
' this Format() makes the 25 hr. duration appear as 1 hr.:
? Format(#2013-11-30 02:00# - #2013-11-29 01:00#, "hh:nn:ss")
01:00:00
However, if you're dealing exclusively with data from 12 hr. shifts, this should not be a problem. Keep it in mind in case you ever need to analyze data which spans more than 24 hrs.
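If totals beyond 24 hours ever do matter, the arithmetic is simple: the difference between two Date/Time values is a number of days, so the hours can be computed from the total seconds instead of being wrapped by a time format. Here is a rough sketch of that arithmetic in Python (not Access/VBA). The function name is made up, and non-negative durations are assumed:

```python
def format_duration(days):
    """Format a duration given in fractional days as total hh:mm:ss.

    Hours are not wrapped at 24, so 25 hours shows as 25:00:00.
    Assumes `days` is not negative.
    """
    total_seconds = round(days * 86400)  # 86400 seconds per day
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


# The 25-hour difference from the Immediate window session above:
print(format_duration(1.04166666667152))  # 25:00:00
```

The same divide-and-remainder steps could be written as an expression or a small function inside Access; this is only meant to show the calculation.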
If subqueries are unfamiliar, see Allen Browne's page: Subquery basics. He discusses correlated subqueries in the section titled Get the value in another record.
Does an IF condition in the WHERE clause of a MySQL query slow down execution drastically?
Here is a sample query:
select * from alert_details_v adv
where (if(day(last_day(now())) < DAY(adv.alert_date),
          day(last_day(now())),
          DAY(adv.alert_date)) - adv.alert_trigger_days) <= day(now());
Sample data:
alert_id  alert_date           alert_trigger_days
==================================================
1         2013-09-14 00:00:00  6
2         2013-09-13 00:00:00  5
alert_date: a date entered by the user
alert_trigger_days: the number of days before the actual date on which the alert should be triggered.
Brief explanation of the query logic:
Here I check whether the last day of the current month is earlier than the day of the alert_date (database column). Whichever day comes first is used.
Basically, this table stores alert information. If the user has chosen the 30th of some month and the alert recurs monthly, then in February there is no 30th, and without this check the record would not be shown.
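To make the condition easier to follow, here is the same check written out as a small Python function. This is only a sketch of the logic, not a replacement for the query. The function name is made up, and `today` stands in for now():

```python
import calendar
from datetime import date


def alert_is_due(alert_date, trigger_days, today):
    """Mirror the WHERE condition of the sample query for one row."""
    # day(last_day(now())): number of days in the current month
    last_day = calendar.monthrange(today.year, today.month)[1]
    # if(last_day < day(alert_date), last_day, day(alert_date))
    effective_day = min(last_day, alert_date.day)
    return effective_day - trigger_days <= today.day


# Row 1 of the sample data: alert on the 14th, triggered 6 days ahead.
print(alert_is_due(date(2013, 9, 14), 6, date(2013, 9, 8)))   # True
print(alert_is_due(date(2013, 9, 14), 6, date(2013, 9, 7)))   # False
# An alert on the 30th is clamped to the 28th in February 2014.
print(alert_is_due(date(2013, 1, 30), 2, date(2014, 2, 26)))  # True
```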
My question is: does a query with IF conditions in the WHERE clause (as in the sample query above) slow down execution drastically or only slightly, if there are hundreds of thousands of records in the table?
This depends entirely on your table and data. Sometimes it may improve performance, and sometimes it may degrade it.
I have a view hw02 created from the table call_details, which has the following structure:
pri_key | calling_no | called_no | answer_date_time | Duration
I have to find the total call duration for each subscriber per day.
I created the view as:
create view hw02 as
select calling_no, day(answer_date_time) as days,duration from call_details;
and I calculate the total duration for each subscriber per day as:
select a.calling_no,a.days,sum(b.duration)
from hw02 as a, hw02 as b
where a.calling_no=b.calling_no and a.days=b.days;
This query takes a long time to execute. How can I optimize it? (Data: around 150,000 rows)
Try this; it should be faster and serve your purpose:
SELECT
    calling_no,
    DATE(answer_date_time) as day,
    SUM( duration )
FROM
    call_details
GROUP BY
    calling_no,
    DATE(answer_date_time)
A self-join on the view is not needed in your case; all we want is the total duration of the calls each user makes, grouped by date.
I assume the duration of a particular call is the same in the calling and the called user's records. The day() function you used would not return the right results if you have data spanning multiple months, so I have used the date() function instead.
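To illustrate the difference between grouping by day of month and grouping by date, here is a small sketch using Python's built-in sqlite3 module. SQLite has no DAY() function, so strftime('%d', ...) stands in for it here. The two sample calls and the phone number are made up:

```python
import sqlite3

# Two hypothetical calls by the same subscriber, one month apart.
CALLS = [
    ("555-0100", "2013-09-14 10:00:00", 30),
    ("555-0100", "2013-10-14 11:00:00", 45),
]


def totals(group_expr):
    """Sum durations per subscriber, grouped by the given SQL expression.

    group_expr is a fixed expression chosen by the caller below;
    it is not meant to carry user input.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE call_details "
        "(calling_no TEXT, answer_date_time TEXT, duration INTEGER)"
    )
    conn.executemany("INSERT INTO call_details VALUES (?, ?, ?)", CALLS)
    rows = conn.execute(
        f"SELECT calling_no, {group_expr} AS d, SUM(duration) "
        f"FROM call_details GROUP BY calling_no, {group_expr} ORDER BY d"
    ).fetchall()
    conn.close()
    return rows


# Day of month only: both calls land in the same group ("14").
print(totals("strftime('%d', answer_date_time)"))
# Full date: one group per calendar day.
print(totals("date(answer_date_time)"))
```

With the day-of-month grouping, the September and October calls are merged into a single total of 75, while grouping by the full date keeps them apart as 30 and 45.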
More on date and time functions in MySQL: https://dev.mysql.com/doc/refman/4.1/en/date-and-time-functions.html