I have a view, hw02, created from the table call_details, which has the following structure:
pri_key | calling_no | called_no | answer_date_time | Duration
I have to find the total call duration for each subscriber per day.
I created the view as follows:
create view hw02 as
select calling_no, day(answer_date_time) as days,duration from call_details;
and I calculate the total duration for each subscriber per day with:
select a.calling_no,a.days,sum(b.duration)
from hw02 as a, hw02 as b
where a.calling_no=b.calling_no and a.days=b.days;
This query takes a long time to execute. How can I optimize it? (The data is around 150,000 rows.)
Try this; it should be faster and serve your purpose:
SELECT
calling_no,
DATE(answer_date_time) as day,
SUM( duration )
FROM
call_details
GROUP BY
calling_no,
DATE(answer_date_time)
A self-join on the view is not needed in your case. All we want is the total duration of each user's calls, grouped by date.
I assume the duration of a particular call is the same on the calling and the called records. The day() function you used returns only the day of the month, so it will not give the right results if your data spans multiple months; hence I have used the date() function instead.
More on date and time functions in MySQL: https://dev.mysql.com/doc/refman/4.1/en/date-and-time-functions.html
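To make the day() versus date() point concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module rather than MySQL so it can run anywhere, so `strftime('%d', ...)` stands in for MySQL's `DAY()`. The sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE call_details ("
    " pri_key INTEGER PRIMARY KEY, calling_no TEXT, called_no TEXT,"
    " answer_date_time TEXT, duration INTEGER)"
)
# Made-up rows: caller 111 phones on 5 January and again on 5 February.
conn.executemany(
    "INSERT INTO call_details (calling_no, called_no, answer_date_time, duration)"
    " VALUES (?, ?, ?, ?)",
    [
        ("111", "222", "2013-01-05 10:00:00", 60),
        ("111", "333", "2013-01-05 18:30:00", 40),
        ("111", "222", "2013-02-05 09:00:00", 30),
        ("999", "111", "2013-01-05 11:00:00", 15),
    ],
)

# Grouping on the full calendar date keeps the two months apart.
by_date = conn.execute(
    "SELECT calling_no, DATE(answer_date_time) AS call_date, SUM(duration)"
    " FROM call_details"
    " GROUP BY calling_no, DATE(answer_date_time)"
    " ORDER BY calling_no, call_date"
).fetchall()

# Grouping on the day of the month only (the SQLite stand-in for DAY())
# merges 5 January with 5 February into one bucket.
by_day_of_month = conn.execute(
    "SELECT calling_no, strftime('%d', answer_date_time) AS dom, SUM(duration)"
    " FROM call_details"
    " GROUP BY calling_no, dom"
    " ORDER BY calling_no, dom"
).fetchall()

print(by_date)
print(by_day_of_month)
```

No self-join is involved: a single pass over call_details with GROUP BY produces one row per caller per date.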
Related
I'm pretty new to SQL and I'm struggling with one of the questions on my exercise. How would I calculate average session length per daily active user? The table shown is just a sample of what the extended table is. Imagine loads more rows.
I simply used this query to calculate the daily active users:
SELECT COUNT (DISTINCT user_id)
FROM table1
Welcome to StackOverflow!
Now, your question:
How would I calculate average session length per daily active user?
You already have the session time, and using the AVG function you will get a simple average over all rows:
select AVG(session_length_seconds) avg from table_1
But you want it per day, so you need to group by day. How do you get the day? You have activity_date as a date column, so it's easy to extract the day, month and year from it, for example:
select
DAY(activity_date) day,
MONTH(activity_date) month,
YEAR(activity_date) year
from
table_1
This breaks the date field down into columns you can use.
Now, back to your question: it says daily active user, but all you have is sessions, and a user could have multiple sessions. From the context you have shared, I have no idea how you want to handle that, and computing an average for each individual session makes no sense. So, just to get you started, I'll assume that you want the average per day only.
Knowing how to get the average, let's put it all together in one query:
select
DAY(activity_date) day,
MONTH(activity_date) month,
YEAR(activity_date) year,
AVG(session_length_seconds) avg
from
table_1
group by
DAY(activity_date),
MONTH(activity_date),
YEAR(activity_date)
This outputs the average of session_length_seconds per day/month/year.
In the group by part, you need to list every field in the select that does not perform a calculation such as sum, count, etc. In our case avg performs a calculation, so we don't group by it, but we do group by the other 3 values, which gives us 3 columns with day, month and year. You can also use concat to join day, month and year into one string if you prefer.
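If "per daily active user" is read as total session time on a day divided by the number of distinct users active that day (an assumption, since the question does not define it), both averages can come out of one grouped query. A rough sketch using Python's built-in sqlite3 module; the rows are invented, and activity_date is assumed to hold a plain date so it can be grouped on directly:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_1 ("
    " user_id TEXT, activity_date TEXT, session_length_seconds INTEGER)"
)
# Invented sample: user u1 has two sessions on the first day.
conn.executemany(
    "INSERT INTO table_1 VALUES (?, ?, ?)",
    [
        ("u1", "2020-01-01", 100),
        ("u1", "2020-01-01", 200),
        ("u2", "2020-01-01", 300),
        ("u1", "2020-01-02", 50),
    ],
)

daily = conn.execute(
    "SELECT activity_date,"
    "       COUNT(DISTINCT user_id) AS daily_active_users,"
    "       AVG(session_length_seconds) AS avg_per_session,"
    # Multiply by 1.0 so the division is not done on integers.
    "       SUM(session_length_seconds) * 1.0 / COUNT(DISTINCT user_id)"
    "           AS avg_per_active_user"
    " FROM table_1"
    " GROUP BY activity_date"
    " ORDER BY activity_date"
).fetchall()

for row in daily:
    print(row)
```

On the first day the per-session average is 200 seconds, while the per-user figure is 300 seconds because u1's two sessions are counted against a single user.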
I am trying to write a single MySQL query which will tell me the total number of active users in the database in week-based intervals. The 2 returned values per row should be the date, and the total number of active users on that date. I was able to get this far:
SELECT from_days(to_days(cast(u.created as datetime)) - mod(to_days(cast(u.created as datetime)) - 1 - 1, 7)) AS date, COUNT(1) as count
FROM users u
WHERE u.active = 1
GROUP BY 1;
I believe this shows me the number of new active users in each given interval, but I can't figure out how to 'aggregate' those counts to show the total number of users increasing over each time interval. Any point in the right direction would be greatly appreciated.
It's hard to say without an example of your output, but I would start by making the whole thing a subquery and then applying an aggregate function or a calculation on top of it.
See this post:
MySQL Running Total with COUNT
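One way the subquery idea could look is sketched below. It runs on Python's built-in sqlite3 module, so the week bucketing uses SQLite date modifiers (Monday-based weeks) instead of the to_days/from_days arithmetic from the question. The users rows are invented, and the choice of week boundary is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, created TEXT, active INTEGER)")
# Invented rows; user 4 is inactive and must not be counted.
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [
        (1, "2020-01-01 10:00:00", 1),
        (2, "2020-01-03 12:00:00", 1),
        (3, "2020-01-07 09:00:00", 1),
        (4, "2020-01-08 09:00:00", 0),
        (5, "2020-01-20 08:00:00", 1),
    ],
)

running_totals = conn.execute(
    # Inner query: one row per week (Monday start) that has new active users.
    # Outer correlated subquery: count every active user created before the
    # end of that week, which gives the running total.
    "SELECT w.week_start,"
    "       (SELECT COUNT(*) FROM users u"
    "         WHERE u.active = 1"
    "           AND DATE(u.created) < DATE(w.week_start, '+7 days')) AS total"
    " FROM (SELECT DISTINCT DATE(created, 'weekday 0', '-6 days') AS week_start"
    "         FROM users WHERE active = 1) AS w"
    " ORDER BY w.week_start"
).fetchall()

for row in running_totals:
    print(row)
```

Weeks with no new active users do not appear in the output; filling those gaps would need a calendar table or similar.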
I am having difficulty writing a query that manipulates and displays data from ONE table.
From the same table, I need to find the number of services done per day (done, based on the count of ServiceId). I then need to calculate the OverallCharge (done; BasicCharge + AdditionalPartsCharge + AdditionalLabourCharge) and find the min, max and avg of these overall charges per day.
I need to display these charges per ServiceDate in the table.
My draft is the following, but it is telling me that ServiceId is not part of an aggregate function.
SELECT Service.ServiceDate, Service.NumServices , Min(OverallCharge) AS MinOverallCharge, Max(OverallCharge) AS MaxOverallCharge, Avg(OverallCharge) AS AverageOverallCharge
FROM (SELECT Service.ServiceId, Sum([BasicCharges]+[AdditionalLabourCharges]+[AdditionalPartCharges]) AS OverallCharge, Service.ServiceDate, Count (Service.ServiceId) AS NumServices
FROM Service
GROUP BY Service.ServiceDate, NumServices, MinOverallCharge, MaxOverallCharge, AvgerageOverallCharge);
Thanks
I have a table say "sample" which saves a new record each five minutes.
Users might ask for data collected for a specific sampling interval of say 10 min or 30 min or an hour.
Since I have a record every five minutes, when a user asks for data with an hour sample interval, I will have to club/group every 12 (60/5) records into one record (already sorted by timestamp), and the criterion could be min, max, avg or last value.
I was trying to do this in Java after fetching all the records, and I am seeing pretty bad performance because I have to iterate through the collection multiple times. I have read about alternatives like jAgg and lambdaj, but wanted to check whether this is possible in SQL (MySQL) itself.
The sampling interval is dynamic, and the aggregation function (min/max/avg/last) is also user-provided.
Any pointers?
You can do this in SQL, but you have to carefully construct the statement. Here is an example by hour for all four aggregations:
select min(datetime) as datetime,
min(val) as minval, max(val) as maxval, avg(val) as avgval,
substring_index(group_concat(val order by datetime desc), ',', 1) as lastval
from table t
group by floor(to_seconds(datetime) / (60*60));
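Since the interval and the aggregation are both user-provided, the statement has to be built at run time. Below is a minimal sketch of that idea using Python's built-in sqlite3 module, so epoch seconds come from `strftime('%s', ...)` rather than MySQL's to_seconds(). The table layout (`ts`, `val`), the `resample` helper and the whitelist are illustrative assumptions, and the "last" branch assumes timestamps are unique.

```python
import sqlite3
from datetime import datetime, timedelta

# Whitelist of aggregate expressions; user input is never pasted into the SQL.
AGGREGATES = {"min": "MIN(val)", "max": "MAX(val)", "avg": "AVG(val)"}

def resample(conn, interval_minutes, how):
    """Return (bucket_start, value) rows for the requested interval."""
    # Integer division of epoch seconds by the interval length gives a bucket id.
    bucket = "CAST(strftime('%s', ts) AS INTEGER) / ?"
    if how == "last":
        # Find the latest timestamp per bucket, then join back for its value.
        sql = f"""
            SELECT b.bucket_start, s.val
            FROM (SELECT MIN(ts) AS bucket_start, MAX(ts) AS last_ts
                  FROM sample GROUP BY {bucket}) AS b
            JOIN sample AS s ON s.ts = b.last_ts
            ORDER BY b.bucket_start"""
    else:
        sql = f"""
            SELECT MIN(ts) AS bucket_start, {AGGREGATES[how]}
            FROM sample GROUP BY {bucket}
            ORDER BY bucket_start"""
    return conn.execute(sql, (interval_minutes * 60,)).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sample (ts TEXT, val REAL)")
start = datetime(2020, 1, 1)
# Two hours of made-up five-minute samples with values 0.0 .. 23.0.
conn.executemany(
    "INSERT INTO sample VALUES (?, ?)",
    [((start + timedelta(minutes=5 * i)).strftime("%Y-%m-%d %H:%M:%S"), float(i))
     for i in range(24)],
)

hourly_avg = resample(conn, 60, "avg")
hourly_last = resample(conn, 60, "last")
half_hourly_max = resample(conn, 30, "max")
print(hourly_avg, hourly_last, half_hourly_max, sep="\n")
```

The interval is passed as a bound parameter and the aggregate is looked up in a fixed dictionary, so an unknown aggregation name fails with a KeyError instead of reaching the database.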
I'm reasonably new to Access and having trouble solving what should be (I hope) a simple problem - think I may be looking at it through Excel goggles.
I have a table named importedData into which I (not so surprisingly) import a log file each day. This log file is from a simple data-logging application on some mining equipment, and essentially it saves a timestamp and status for the point at which the current activity changes to a new activity.
A sample of the data looks like this:
This information is then filtered using a query to define the range I want to see information for, say from 29/11/2013 06:00:00 AM until 29/11/2013 06:00:00 PM
Now the object of this is to take a status entry's timestamp and get the time difference between it and the record on the subsequent row of the query results. As the equipment works for a 12hr shift, I should then be able to build a picture of how much time the equipment spent doing each activity during that shift.
In the above example, the equipment was in status "START_SHIFT" for 00:01:00, in status "DELAY_WAIT_PIT" for 06:08:26 and so on. I would then build a unique list of the status entries for the period selected, and sum the total time for each status to get my shift summary.
You can use a correlated subquery to fetch the next timestamp for each row.
SELECT
i.status,
i.timestamp,
(
SELECT Min([timestamp])
FROM importedData
WHERE [timestamp] > i.timestamp
) AS next_timestamp
FROM importedData AS i
WHERE i.timestamp BETWEEN #2013-11-29 06:00:00#
AND #2013-11-29 18:00:00#;
Then you can use that query as a subquery in another query where you compute the duration between timestamp and next_timestamp. And then use that entire new query as a subquery in a third where you GROUP BY status and compute the total duration for each status.
Here's my version which I tested in Access 2007 ...
SELECT
sub2.status,
Format(Sum(Nz(sub2.duration,0)), 'hh:nn:ss') AS SumOfduration
FROM
(
SELECT
sub1.status,
(sub1.next_timestamp - sub1.timestamp) AS duration
FROM
(
SELECT
i.status,
i.timestamp,
(
SELECT Min([timestamp])
FROM importedData
WHERE [timestamp] > i.timestamp
) AS next_timestamp
FROM importedData AS i
WHERE i.timestamp BETWEEN #2013-11-29 06:00:00#
AND #2013-11-29 18:00:00#
) AS sub1
) AS sub2
GROUP BY sub2.status;
If you run into trouble or need to modify it, break out the innermost subquery, sub1, and test that by itself. Then do the same for sub2. I suspect you will want to change the WHERE clause to use parameters instead of hard-coded times.
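For anyone who wants to try the same shape of query with parameters instead of hard-coded times, here is a rough sketch outside Access, using Python's built-in sqlite3 module. The nesting mirrors sub1 and sub2, but the durations are returned in seconds rather than as Access date values. The log rows are invented (the LOADING and END_SHIFT statuses are hypothetical).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE importedData (timestamp TEXT, status TEXT)")
# Invented log: each row marks the moment a new activity starts.
conn.executemany(
    "INSERT INTO importedData VALUES (?, ?)",
    [
        ("2013-11-29 06:00:00", "START_SHIFT"),
        ("2013-11-29 06:01:00", "DELAY_WAIT_PIT"),
        ("2013-11-29 12:09:26", "LOADING"),
        ("2013-11-29 12:30:00", "DELAY_WAIT_PIT"),
        ("2013-11-29 13:00:00", "END_SHIFT"),
    ],
)

SHIFT_SUMMARY_SQL = """
    SELECT sub.status,
           -- The last row has no successor, so its duration is NULL;
           -- COALESCE reports 0 when a status has only NULL durations.
           COALESCE(SUM(strftime('%s', sub.next_timestamp)
                        - strftime('%s', sub.timestamp)), 0) AS seconds
    FROM (
        SELECT i.status,
               i.timestamp,
               (SELECT MIN(n.timestamp)
                  FROM importedData AS n
                 WHERE n.timestamp > i.timestamp) AS next_timestamp
        FROM importedData AS i
        WHERE i.timestamp BETWEEN ? AND ?
    ) AS sub
    GROUP BY sub.status
    ORDER BY sub.status
"""

# The shift boundaries are bound parameters rather than literals in the SQL.
summary = conn.execute(
    SHIFT_SUMMARY_SQL, ("2013-11-29 06:00:00", "2013-11-29 18:00:00")
).fetchall()

for status, seconds in summary:
    print(status, seconds)
```

With this sample, DELAY_WAIT_PIT adds up to 23906 seconds: 06:08:26 from the first occurrence plus 00:30:00 from the second.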
Note that the Format expression in the query would not be appropriate if your durations exceed 24 hours. Here is an Immediate window session which illustrates the problem ...
' duration greater than one day:
? #2013-11-30 02:00# - #2013-11-29 01:00#
1.04166666667152
' this Format() makes the 25 hr. duration appear as 1 hr.:
? Format(#2013-11-30 02:00# - #2013-11-29 01:00#, "hh:nn:ss")
01:00:00
However, if you're dealing exclusively with data from 12 hr. shifts, this should not be a problem. Keep it in mind in case you ever need to analyze data which spans more than 24 hrs.
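If durations longer than a day ever need to be displayed, one option is to convert the day fraction to seconds and build the string by hand, so the hours field is not wrapped at 24. A small sketch in Python; `format_duration` is a hypothetical helper, not an Access function, and it assumes non-negative durations:

```python
def format_duration(days):
    """Format a duration given in days (an Access-style date difference)
    as hh:mm:ss, letting the hours field grow past 24."""
    total_seconds = round(days * 86400)  # 86400 seconds per day
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"

# The 25 hour difference from the Immediate window session above.
print(format_duration(1.04166666667152))
# A 12 hour shift.
print(format_duration(0.5))
```

Rounding to whole seconds absorbs the floating-point noise in values such as 1.04166666667152.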
If subqueries are unfamiliar, see Allen Browne's page: Subquery basics. He discusses correlated subqueries in the section titled Get the value in another record.