Is there a function to find the average time difference in the standard time format in MySQL?
You can use timestampdiff to find the difference between two times.
I'm not sure what you mean by "average," though. Average across the table? Average across a row?
If it's the table or a subset of rows:
select
avg(timestampdiff(SECOND, startTimestamp, endTimestamp)) as avgdiff
from
table
The avg function works like any other aggregate function, and will respond to group by. For example:
select
col1,
avg(timestampdiff(SECOND, startTimestamp, endTimestamp)) as avgdiff
from
table
group by col1
That will give you the average differences for each distinct value of col1.
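To make the grouped average concrete, here is a minimal Python sketch of the same arithmetic outside the database. The rows and the helper name avg_diff_by_group are hypothetical, made up for illustration; only the idea (seconds between start and end, averaged per distinct key) comes from the query above.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical rows: (col1, startTimestamp, endTimestamp)
rows = [
    ("a", datetime(2024, 1, 1, 10, 0, 0), datetime(2024, 1, 1, 10, 0, 30)),
    ("a", datetime(2024, 1, 1, 11, 0, 0), datetime(2024, 1, 1, 11, 1, 30)),
    ("b", datetime(2024, 1, 1, 12, 0, 0), datetime(2024, 1, 1, 12, 10, 0)),
]


def avg_diff_by_group(rows):
    """Average (end - start) in seconds for each distinct group key."""
    diffs = defaultdict(list)
    for key, start, end in rows:
        diffs[key].append((end - start).total_seconds())
    return {key: sum(values) / len(values) for key, values in diffs.items()}


avgdiff = avg_diff_by_group(rows)
print(avgdiff)
```

Group "a" averages its 30-second and 90-second differences, while group "b" has a single 600-second difference.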
Hopefully this gets you pointed in the right direction!
What I like to do is:
SELECT count(*), AVG(TIME_TO_SEC(TIMEDIFF(end,start)))
FROM
table
Gives the number of rows as well...
In order to get actual averages in the standard time format from mysql I had to convert to seconds, average, and then convert back:
SEC_TO_TIME(AVG(TIME_TO_SEC(TIMEDIFF(timeA, timeB))))
If you don't convert to seconds, you get an odd decimal representation of the minutes that doesn't really make any sense (to me).
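A small Python sketch of why the round trip through seconds matters. It assumes the odd decimal comes from each HH:MM:SS value being read as the plain number HHMMSS before averaging; the two sample durations and the helper names are hypothetical, and negative durations are not handled.

```python
def hhmmss_to_seconds(text):
    """Convert 'HH:MM:SS' to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def seconds_to_hhmmss(total):
    """Convert a non-negative number of seconds back to 'HH:MM:SS'."""
    hours, remainder = divmod(int(total), 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


durations = ["00:59:00", "01:01:00"]

# Naive approach: read each time as the number HHMMSS and average those.
# 5900 and 10100 average to 8000, which would read as "00:80:00".
naive = sum(int(d.replace(":", "")) for d in durations) / len(durations)

# Convert to seconds, average, convert back.
correct = seconds_to_hhmmss(
    sum(hhmmss_to_seconds(d) for d in durations) / len(durations)
)

print(naive, correct)
```

The naive figure has 80 in the minutes position, which is not a valid time, whereas the seconds-based average lands on exactly one hour.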
I was curious whether AVG() is accurate, since I had seen row counts reported with the note "this value is an approximation". After all, let's review the average formula: average = sum / count. So the result is only as accurate as the count.
After testing multiple combinations, it definitely seems like AVG() works and is a great approach. You can calculate yourself to see if it's working with...
SELECT
COUNT(id) AS count,
AVG(TIMESTAMPDIFF(SECOND, OrigDateTime, LastDateTime)) AS avg_average,
SUM(TIMESTAMPDIFF(SECOND, OrigDateTime, LastDateTime)) / (select COUNT(id) FROM yourTable) as calculated_average,
AVG(TIME_TO_SEC(TIMEDIFF(LastDateTime,OrigDateTime))) as timediff_average,
SEC_TO_TIME(AVG(TIME_TO_SEC(TIMEDIFF(LastDateTime, OrigDateTime)))) as date_display
FROM yourTable
Sample Results:
count: 441000
avg_average: 5045436.4376
calculated_average: 5045436.4376
timediff_average: 5045436.4376
date_display: 1401:30:36
Seems to be pretty accurate!
This will return:
count: The count.
avg_average: The average based on AVG(). (Thanks to Eric for their answer on this!)
calculated_average: The average based on SUM()/COUNT().
timediff_average: The average based on TIMEDIFF(). (Thanks to Andrew for their answer on this!)
date_display: A nicely-formatted display version. (Thanks to C S for their answer on this!)
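The same cross-check can be done by hand. The sketch below uses a hypothetical list of differences to confirm that a built-in mean matches sum / count, and then applies plain arithmetic (not MySQL's SEC_TO_TIME) to the sample average quoted above to see that it corresponds to the displayed hours, minutes, and seconds. The helper name format_duration is an assumption for illustration.

```python
from statistics import fmean


def format_duration(seconds):
    """Format a non-negative number of seconds as H:MM:SS, dropping fractions."""
    hours, remainder = divmod(int(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


# Hypothetical differences in seconds.
diffs = [120, 3600, 86400, 45]

avg_builtin = fmean(diffs)            # plays the role of AVG()
avg_manual = sum(diffs) / len(diffs)  # plays the role of SUM() / COUNT()

# The sample average from the results above, formatted by hand.
display = format_duration(5045436.4376)

print(avg_builtin, avg_manual, display)
```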
I have a table named "Table1" in MySQL. I need to find the sum of the mean and standard deviation of column "Open". I did it easily in Python, but I am unable to do it in SQL.
Select * from BANKNIFTY_cal_spread;
Date                 Current  Next      difference
2021-09-03 00:00:00  36914.8  37043.95  129.14999999999418
2021-09-06 00:00:00  36734    36869.15  135.15000000000146
2021-09-07 00:00:00  36572.9  36710.65  137.75
2021-09-08 00:00:00  36945    37065     120
2021-09-09 00:00:00  36770    36895.1   125.09999999999854
Python code:
nf_fut_mean = round(df['difference'].mean())
print(f"NF Future Mean: {nf_fut_mean}")
nf_fut_std = round(df['difference'].std())
print(f"NF Future Standard Deviation: {nf_fut_std}")
upper_range = round((nf_fut_mean + nf_fut_std))
lower_range = round((nf_fut_mean - nf_fut_std))
I searched for an SQL solution but could not find one. I tried building a query, but it does not show correct results in the query builder in Grafana alerting.
I have now added mean, standard deviation, upper_range, and lower_range columns using a Python dataframe and pushed them to the MySQL table.
@Booboo,
After removing Date from the SQL query, it shows correct results in two columns: average + std_deviation and average - std_deviation.
select average + std_deviation, average - std_deviation from (
select avg(difference) as average, stddev_pop(difference) as std_deviation from BANKNIFTY_cal_spread
) sq
It looks as though the sample you're using for the aggregations for MEAN, STDDEV, etc is the entire table - in which case you have to drop the DATE field from the query's result set.
You could also establish the baseline query using a CTE (Common Table Expression) using a WITH statement instead of a subquery, and then apply the subsequent processing:
WITH BN_CTE AS
(
select avg(difference) as average, stddev_pop(difference) as std_deviation from BANKNIFTY_cal_spread
)
select average + std_deviation, average - std_deviation from BN_CTE;
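For comparison with the Python in the question, here is a hedged sketch using only the standard library on the five difference values posted above. One thing worth checking is which standard deviation is wanted: stddev_pop is the population form, while pandas' .std() defaults to the sample form (MySQL also offers STDDEV_SAMP for that), so the two can round differently on a small table.

```python
from statistics import mean, pstdev, stdev

# The five "difference" values posted in the question.
difference = [
    129.14999999999418,
    135.15000000000146,
    137.75,
    120,
    125.09999999999854,
]

average = mean(difference)
std_pop = pstdev(difference)   # population form, like stddev_pop
std_samp = stdev(difference)   # sample form, like pandas' default .std()

# Same shape as the SQL: average + std_deviation, average - std_deviation
upper_range = round(average + std_pop)
lower_range = round(average - std_pop)

print(round(average), round(std_pop), round(std_samp), upper_range, lower_range)
```

On this sample the population and sample deviations round to different integers, which may explain small mismatches between a pandas result and a stddev_pop result.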
With the data you posted having only a single Open column value for any given Date column value, your standard deviation should be 0 (and the average just that single value).
I am having difficulty in understanding your SQL since I cannot see how it relates to finding the sum (and presumably the difference, which you also seem to want) of the average and standard deviation of column Open in table Table1. If I just go by your English-language description of what you are trying to do and your definition of table Table1, then the following should work. Note that since we want both the sum and difference of two values, which are not trivial to calculate, we should calculate those two values only once:
select Date, average + std_deviation, average - std_deviation from (
select Date, avg(Open) as average, stddev_pop(Open) as std_deviation from Table1
group by Date
) sq
order by Date
Note that I am using column aliases in the subquery that do not conflict with built-in MySQL function names.
SQL does not allow you to calculate something in the SELECT clause and then use it in that same clause. (Yes, @variables allow this in limited cases, but that won't work for aggregates in the way hinted at in the question.)
Either repeat the expressions:
SELECT AVG(difference) AS mean,
AVG(difference) - stddev_pop(difference) AS "mean-sigma",
AVG(difference) + stddev_pop(difference) AS "mean+sigma"
FROM BANKNIFTY_cal_spread;
Or use a subquery to call the functions only once:
SELECT mean, mean-sigma, mean+sigma
FROM ( SELECT
AVG(difference) AS mean,
stddev_pop(difference) AS sigma
FROM BANKNIFTY_cal_spread
) AS x;
I expect the timings to be similar.
And, as already mentioned, avoid using aliases that are identical to function names, etc.
I have a table with 3 columns:
id start_service stop_service
I have already managed to get the time difference between start_service and stop_service using this query:
SELECT
TIMEDIFF (stop_service, start_service) AS tempo
FROM
user_establishment ORDER BY id;
Now I need to add all the results of this query and divide by the number of records, so as to obtain the average time of all services.
The main problem is the conversion of the hours.
Can someone please help me?
Try this if you need to get your results in hours. You can omit the division by 3600 if you need to have it in seconds (this is what I would do and then manipulate it in my code afterwards).
SELECT
AVG(TIME_TO_SEC(TIMEDIFF(stop_service, start_service)))/3600 AS tempo
FROM
user_establishment;
Hope this helps!
If you want the average time difference in hours, rounded to two decimal places, you can try the query below.
select ROUND(AVG(TIME_TO_SEC(tempo)/3600), 2) as avg_time
from (
select TIMEDIFF(stop_service, start_service) as tempo
from user_establishment
) as timeline
You can modify the average time diff as per your need.
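As a quick sanity check of the hours arithmetic, this Python sketch averages three hypothetical service durations in seconds and then divides by 3600 and rounds to two decimals, mirroring the ROUND(AVG(TIME_TO_SEC(...)/3600), 2) idea. The sample timestamps are made up.

```python
from datetime import datetime

# Hypothetical (start_service, stop_service) pairs.
services = [
    (datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 10, 30)),   # 1.5 hours
    (datetime(2024, 1, 1, 11, 0), datetime(2024, 1, 1, 11, 45)),  # 0.75 hours
    (datetime(2024, 1, 1, 13, 0), datetime(2024, 1, 1, 15, 0)),   # 2 hours
]

# Difference of each service in seconds.
seconds = [(stop - start).total_seconds() for start, stop in services]

avg_seconds = sum(seconds) / len(seconds)
avg_hours = round(avg_seconds / 3600, 2)

print(avg_seconds, avg_hours)
```

Keeping the average in seconds and converting only for display makes it easy to switch between seconds, minutes, and hours later.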
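The following query returns the average time difference.
There is no need for an ORDER BY clause when using AVG() in a MySQL query; omitting it saves execution time. Convert the difference to seconds before averaging and back to a time afterwards, because averaging the TIMEDIFF() result directly treats it as a plain number.
SELECT SEC_TO_TIME(AVG(TIME_TO_SEC(TIMEDIFF(stop_service, start_service)))) AS tempo
FROM user_establishment;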
select
substr(insert_date, 1, 14),
device, count(1)
from
abc.xyztable
where
insert_date >= DATE_SUB(NOW(), INTERVAL 10 DAY)
group by
device, substr(insert_date, 1, 14) ;
Then I am trying to get the average of the row counts returned above.
SELECT
date, device, AVG(count)
FROM
(SELECT
substr(insert_date, 1, 14) AS date,
device,
COUNT(1) AS count
FROM
abc.xyztable
WHERE
insert_date >= DATE_SUB(NOW(), INTERVAL 10 DAY)
GROUP BY
device, substr(insert_date, 1, 14)) a
GROUP BY
device, date;
I found that both queries return the same results when I tried them on the last 10 days of data.
My goal is to get the average of the row counts for the last 10 days, which I get from the first query above.
I'm not entirely sure what you're asking. The "difference" between the two queries is that the first one is valid but the second does not appear to be, as per HoneyBadger's comment. They also seem to be trying to achieve two different goals.
However, I think what you are trying to do is produce a query based on the data from the first query, which returns the date, device, and an average of the count column. If so, I believe the following query would calculate this:
WITH
dataset AS (
select substr(insert_date, 1, 14) AS theDate, device, count(*) AS theCount
from abc.xyztable
where insert_date >= DATE_SUB(NOW(), INTERVAL 10 DAY)
group by device, substr(insert_date, 1, 14)
)
SELECT theDate, device,
(SELECT ROUND(AVG(CAST(theCount AS FLOAT)), 2) FROM dataset) AS Average
FROM dataset
GROUP BY theDate, device
I have referenced the accepted answers of this question to calculate the average: How to calculate average of a column and then include it in a select query in oracle?
And this question to tidy up the query: Formatting Clear and readable SQL queries
Without having a sample of your data, or any proper context, I can't see how this would be especially useful, so if it was not what you were looking for, please edit your question and clarify exactly what you need.
EDIT: Based on the extra information you have provided, I've tweaked my solution to increase the precision of the average column. It now calculates the average to two decimal places. You have stated that this returns the same result as your original query, but the two queries are not computing the same thing. If the count column is consistently the same number with little variation, the AVG function will round this, which in turn could produce results that look the same, especially if you only compare a small sample, so I have amended my answer to demonstrate this. Again, we'd all be able to help you much more easily if you provided more information, such as a sample of your data.
If you want an average you need to change the last GROUP BY
to get an average per device
GROUP BY device;
to get an average per date
GROUP BY date;
or remove it completely to get an average for all rows in the sub-query
Update
Below is a full example for getting the average per device
SELECT device, avg(count)
FROM (SELECT substr(insert_date,1,14) as date, device, count(1) as count
FROM abc.xyztable
WHERE insert_date >=DATE_SUB(NOW(), INTERVAL 10 DAY)
GROUP BY device,substr(insert_date,1,14)) a
GROUP BY device;
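The effect of the outer GROUP BY can be tried on a toy table. The sketch below uses Python's built-in sqlite3 module rather than MySQL, so the 10-day filter with DATE_SUB and NOW is left out, and the aliases day_hour and row_count are hypothetical stand-ins for date and count. It suggests why grouping the outer query by the same columns as the inner one returns the counts unchanged: every outer group then contains exactly one row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE xyztable (insert_date TEXT, device TEXT)")
conn.executemany(
    "INSERT INTO xyztable VALUES (?, ?)",
    [
        ("2024-01-01 10:05:00", "A"),
        ("2024-01-01 10:15:00", "A"),
        ("2024-01-01 11:20:00", "A"),
        ("2024-01-01 10:30:00", "B"),
    ],
)

# Inner query: rows per device per hour, as in the first query above.
hourly = """
    SELECT substr(insert_date, 1, 14) AS day_hour, device, COUNT(1) AS row_count
    FROM xyztable
    GROUP BY device, substr(insert_date, 1, 14)
"""

# Outer grouping by device only: a real average over that device's hours.
per_device = conn.execute(
    f"SELECT device, AVG(row_count) FROM ({hourly}) a "
    "GROUP BY device ORDER BY device"
).fetchall()

# Outer grouping by device and hour: one row per group, so AVG changes nothing.
per_group = conn.execute(
    f"SELECT day_hour, device, AVG(row_count) FROM ({hourly}) a "
    "GROUP BY device, day_hour ORDER BY device, day_hour"
).fetchall()

print(per_device)
print(per_group)
```

Device A has two rows in one hour and one row in the next, so its per-device average differs from both hourly counts, while the per-group result simply repeats the counts.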
I have a table that tracks the activity in several websites. Each row is of the following form: (Date, Hour, Website, Hits)
The Hour field is a number between 0 and 23 and represents an entire hour (for example, 22 is for any hits between 22:00 and 22:59).
I want to find the overall slowest hour for each website, meaning the output should be something like (Website, Hour).
In order to do that, I was thinking I should have a nested query to find the minimum hits for each website on each day, and then count the values of Hour (again, for each website on each day), and see which value is the maximal.
I'm still new to SQL so I'm having difficulties using the min() function properly, to find the minimal value only for a specific date and website. Then I have the same problem with using count() for a specific website.
I'm also curious if I can get not just the most common slowest hour, but maybe the 3 slowest, but at least to me it seems like it's really complicating the problem.
For the first nested query, I considered something like this:
SELECT DISTINCT Date Date_t, Website Website_t, Hour,
(SELECT min(Hits) from HITS_TABLE WHERE Date=Date_t and Website=Website_t) as MinHits
FROM HITS_TABLE
Not only does it take an abnormally long time to calculate, it also gives me multiple entries of (Date_t, Website_t, Hour, min(Hits)) for each value of Hour, so I take it that I'm not doing it in the smartest or most efficient way.
Thanks in advance for any help!
You can get the minimum hour using a trick in MySQL:
select website, substring_index(group_concat(hour order by hits), ',', 1) as minhour
from table t
group by website;
For each website, this constructs a comma-delimited list of hours, ordered by the number of hits. The function substring_index() then returns the first element of that list, which is the hour with the fewest hits.
This is something of a hack. In most other databases, you would use window/analytic functions, but these were not available in MySQL before version 8.0.
EDIT:
You can do this in standard SQL as well:
select t.*
from table t
where not exists (select 1
from table t2
where t2.website = t.website and
t2.hits < t.hits
);
This is interpreted as: "Get me all rows from the table where there are no other rows with the same website and a lower number of hits." This is a round-about way of saying: "Get me the hour with the minimum value for each website." Note that this will return multiple rows when there are ties.
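The question also asks about the three slowest hours. Here is a rough Python sketch of one way to frame that, under an assumption that differs from the queries above: "overall slowest" is taken to mean the lowest total hits per hour summed across all dates. The sample rows and the function name slowest_hours are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows: (Date, Hour, Website, Hits)
rows = [
    ("2014-01-01", 3, "a.com", 5),
    ("2014-01-01", 4, "a.com", 9),
    ("2014-01-02", 3, "a.com", 7),
    ("2014-01-02", 4, "a.com", 2),
    ("2014-01-01", 3, "b.com", 8),
    ("2014-01-01", 22, "b.com", 1),
]


def slowest_hours(rows, n=1):
    """For each website, return the n hours with the fewest total hits."""
    totals = defaultdict(lambda: defaultdict(int))
    for _date, hour, website, hits in rows:
        totals[website][hour] += hits
    return {
        # Sort hours by total hits; ties are broken by the earlier hour.
        website: sorted(by_hour, key=lambda h: (by_hour[h], h))[:n]
        for website, by_hour in totals.items()
    }


print(slowest_hours(rows))
print(slowest_hours(rows, n=3))
```

Passing n=3 returns up to three hours per website, so the "top 3 slowest" case is the same computation with a longer slice.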
I have 4 tables for different dates; the tables look like this:
What I'm trying to do is find the maximum tps for each (service_name, function_name) pair per hour across all four days. For example, in the figure I posted, the first row has service_name BatchItemService with function_name getItemAvailability at date 13-06-12 01. The same service_name and function_name appear in the other 3 tables for the same hour "01" but on different days, such as the 13th, 14th, and 15th. I want to find the maximum tps for this (service_name, function_name) pair for hour "01" across all four days.
I tried this, but it gives me an incorrect result.
SELECT
t.service_name,
t.function_name,
t.date,
max(t.tps)
FROM
(SELECT
service_name, function_name, date, tps
FROM
trans_per_hr_2013_06_12
UNION ALL
SELECT
service_name, function_name, date,tps
FROM
trans_per_hr_2013_06_13
GROUP BY service_name,function_name,date
UNION ALL
SELECT
service_name, function_name,date, tps
FROM
trans_per_hr_2013_06_14
UNION ALL
SELECT
service_name, function_name, date, tps
FROM
trans_per_hr_2013_06_15
UNION ALL
SELECT
service_name, function_name,date, tps
FROM
trans_per_hr_2013_06_16
) t
GROUP BY t.service_name,t.function_name,hour(t.Date);
Thanks a lot...
Your query looks like it should be returning what you want.
One possible issue is the type of the date column. As shown in the output, this looks like it might be stored as a character string rather than a date. If so, the following would work for the GROUP BY clause (assuming the format is as shown: DD-MM-YY H).
GROUP BY t.service_name,t.function_name, right(t.Date, 2);
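To illustrate what grouping on the last two characters does, here is a minimal Python sketch. It assumes the date column is a string ending in a two-digit hour, as in the example value 13-06-12 01 from the question; the tps numbers and the function name max_tps_by_hour are hypothetical.

```python
# Hypothetical rows gathered from all the daily tables:
# (service_name, function_name, date string, tps)
rows = [
    ("BatchItemService", "getItemAvailability", "13-06-12 01", 10.0),
    ("BatchItemService", "getItemAvailability", "13-06-13 01", 14.5),
    ("BatchItemService", "getItemAvailability", "13-06-13 02", 3.0),
]


def max_tps_by_hour(rows):
    """Maximum tps per (service_name, function_name, hour) across all days."""
    best = {}
    for service, function, date_text, tps in rows:
        # Last two characters are the hour, the same idea as right(t.Date, 2).
        key = (service, function, date_text[-2:])
        if key not in best or tps > best[key]:
            best[key] = tps
    return best


result = max_tps_by_hour(rows)
print(result)
```

Rows from different days that share hour "01" collapse into one group, so the larger tps of the two is kept.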
As Bohemian says in the comment, this is not a good data structure. You have parallel tables and you are storing the date both in the table name and in a column. You should learn about table partitioning. This is a way that you can store different days in different files, but still have MySQL interpret them as one table. It would probably greatly simplify your using this data.