Time Frequency of Inserts - mysql

I have data being inserted into a MySQL table of the form
id  companyid  date
How would I find the average time between inserts for each companyid?
Some companies send data daily, some weekly, some every 10 days, etc.
I would like a result of the form
companyid  average frequency of inserts
2          every 5 days
3          every 10 days
4          every 2 days

One definition of average would be the difference between the maximum and minimum values divided by one less than the count. Something like this might be what you are looking for:
select companyid,
(case when max(date) <> min(date)
then datediff(max(date), min(date)) / (count(*) - 1)
end) as average_frequency
from table t
group by companyid;
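To sanity-check the formula above, here is a minimal sketch that runs the same idea against an in-memory SQLite database from Python. SQLite only stands in for MySQL here, so julianday() replaces datediff(); the table name inserts and the sample rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inserts (id INTEGER PRIMARY KEY, companyid INTEGER, date TEXT)"
)

# Hypothetical sample data.
rows = [
    (2, "2024-01-01"), (2, "2024-01-06"), (2, "2024-01-11"),  # every 5 days
    (3, "2024-01-01"), (3, "2024-01-11"),                     # every 10 days
    (4, "2024-01-05"),                                        # a single insert
]
conn.executemany("INSERT INTO inserts (companyid, date) VALUES (?, ?)", rows)

# (max - min) / (count - 1); NULL when a company has only one distinct date.
query = """
SELECT companyid,
       CASE WHEN MAX(date) <> MIN(date)
            THEN (julianday(MAX(date)) - julianday(MIN(date))) / (COUNT(*) - 1)
       END AS average_frequency
FROM inserts
GROUP BY companyid
ORDER BY companyid
"""
result = conn.execute(query).fetchall()
print(result)
```

A company with a single insert comes back as NULL (None in Python), because there is no gap to average.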

Related

Calculate average session length per daily active user?

I'm pretty new to SQL and I'm struggling with one of the questions on my exercise. How would I calculate average session length per daily active user? The table shown is just a sample of what the extended table is. Imagine loads more rows.
I simply used this query to calculate the daily active users:
SELECT COUNT(DISTINCT user_id)
FROM table1
Hi, and welcome to StackOverflow!
Now, your question:
How would I calculate average session length per daily active user?
You already have the session time, and the AVG function gives you a simple average over all rows:
select AVG(session_length_seconds) avg from table_1
But you want it per day, so you need to group by day. How do you get the day? You have activity_date as a Date column, and it's easy to extract the day, month and year from it, for example:
select
DAY(activity_date) day,
MONTH(activity_date) month,
YEAR(activity_date) year
from
table_1
This breaks the date field down into columns you can use.
Now, back to your question. It says daily active user, but all you have is sessions, and a user could have multiple sessions. From the context you have shared I can't tell how you want to handle that, and an average for each individual session makes no sense as data to retrieve. So, just to get you started, I'll assume that you want the average per day only.
Knowing how to get the average, let's create a query that puts it all together:
select
DAY(activity_date) day,
MONTH(activity_date) month,
YEAR(activity_date) year,
AVG(session_length_seconds) avg
from
table_1
group by
DAY(activity_date),
MONTH(activity_date),
YEAR(activity_date)
This will output the average of session_length_seconds per day/month/year.
About the group by part: it needs to list every field in the select that does not do any calculation (sum, count, etc.). In our case avg does a calculation, so we don't group by that value, but we do group by the other 3 values, which gives us 3 columns with day, month and year. You can also use concat to join day, month and year into just one string if you prefer.
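If "per daily active user" is meant literally, one possible reading is total session time per day divided by the number of distinct users that day. The sketch below shows both that and the plain per-session average side by side. It uses Python with an in-memory SQLite database as a stand-in for your real database; the user_id column is taken from the query in the question, and the sample rows are invented.

```python
import sqlite3

# In-memory SQLite database as a stand-in (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_1 ("
    "user_id INTEGER, activity_date TEXT, session_length_seconds INTEGER)"
)

# Hypothetical sample sessions: user 1 has two sessions on the first day.
conn.executemany(
    "INSERT INTO table_1 VALUES (?, ?, ?)",
    [
        (1, "2024-03-01", 100),
        (1, "2024-03-01", 200),
        (2, "2024-03-01", 300),
        (1, "2024-03-02", 50),
    ],
)

# avg_per_session: plain AVG over sessions of the day.
# avg_per_active_user: total seconds divided by distinct users of the day.
per_day = conn.execute(
    """
    SELECT activity_date,
           AVG(session_length_seconds) AS avg_per_session,
           SUM(session_length_seconds) * 1.0 / COUNT(DISTINCT user_id)
               AS avg_per_active_user
    FROM table_1
    GROUP BY activity_date
    ORDER BY activity_date
    """
).fetchall()
print(per_day)
```

The two numbers only differ on days where some user has more than one session, which is exactly the ambiguity mentioned above.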

Select where last activity 3 months ago

I have a table of cellular invoices; the relevant columns are Cellular_Account_id (INT), billing_end_date (DATE), and data_usage_GB.
There is a separate row for each account every month. I'm trying to get a list of accounts that have had no data usage for each of the past three months.
I'm pretty new to databases in general, so I'm not really even sure what syntax I should be searching for, or what approach I should be taking.
I can, of course, select WHERE data_usage_GB = 0.000 AND MONTH(billing_end_date) = month(current_date()) -1 but that only gives me the info in 1 month's range. I'm not sure how to group together the results where data_usage_GB = 0.000 for each of the last three months.
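I'd keep only the rows that have usage, group by the account, get the maximum date for each and then filter them using a having clause, so that you get the accounts whose last usage is more than three months old:
SELECT cellular_account_id
FROM invoices
WHERE data_usage_GB > 0
GROUP BY cellular_account_id
HAVING MAX(billing_end_date) < DATE_SUB(CURRENT_DATE, INTERVAL 3 MONTH)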
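Another way to read the question is "every invoice in the last three months shows zero usage". A minimal sketch of that reading follows, using Python with an in-memory SQLite database in place of MySQL. The fixed today value, the sample invoices and the "at least three invoices in the window" check are assumptions made for the example; in MySQL the cutoff would be DATE_SUB(CURRENT_DATE, INTERVAL 3 MONTH) instead of SQLite's date(?, '-3 months').

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoices ("
    "cellular_account_id INTEGER, billing_end_date TEXT, data_usage_GB REAL)"
)

# Hypothetical invoices.
conn.executemany(
    "INSERT INTO invoices VALUES (?, ?, ?)",
    [
        (1, "2024-04-30", 0.0), (1, "2024-05-31", 0.0), (1, "2024-06-30", 0.0),
        (2, "2024-04-30", 0.0), (2, "2024-05-31", 1.5), (2, "2024-06-30", 0.0),
        (3, "2024-01-31", 2.0), (3, "2024-06-30", 0.0),
    ],
)

today = "2024-07-15"  # fixed so the example is reproducible

# Look only at invoices inside the three-month window, then keep accounts
# that have at least three of them and never used any data in that window.
query = """
SELECT cellular_account_id
FROM invoices
WHERE billing_end_date >= date(?, '-3 months')
GROUP BY cellular_account_id
HAVING COUNT(*) >= 3 AND MAX(data_usage_GB) = 0
ORDER BY cellular_account_id
"""
idle_accounts = [row[0] for row in conn.execute(query, (today,))]
print(idle_accounts)
```

Account 2 drops out because of its May usage, and account 3 drops out because it has only one invoice inside the window.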

MySQL: Aggregate weekly statistics with substatistics (subqueries)

I would like to gather weekly statistics on a MySQL table.
The table itself has the following structure:
user_id action_id created
0 123 2017-01-01 00:00:00
0 124 ...
1 123 ...
... ... ...
I would like to aggregate the weekly statistics for:
How many users were active per week
This is rather simple:
SELECT
YEARWEEK(created) as week,
COUNT(DISTINCT user_id) AS count
FROM data
GROUP BY YEARWEEK(created);
Additionally I could apply a sorting.
The result looks like:
week count
201701 2
201702 3
How many users were active per week for the very first time
I thought about solving it by using a subquery
SELECT
YEARWEEK(created) as week,
COUNT(DISTINCT user_id) AS count,
(
SELECT
COUNT(DISTINCT d2.user_id)
FROM data d2
WHERE YEARWEEK(d2.created) = week
AND NOT EXISTS (SELECT 1 FROM data d3
WHERE YEARWEEK(d3.created) < week AND d2.user_id = d3.user_id)
) as countNewUsers
FROM data d1
GROUP BY YEARWEEK(created);
How many junior users were active per week
Junior users were active between 1 and 10 times before the related week
Similar to the one above, but with a different subquery
How many power users were active per week
Power users were active more than 10 times before the related week
This works as expected, but has a rather poor performance, since the subquery is evaluated before the grouping happens. With millions of rows in a table, this takes ages.
Does anybody have a better solution for this query, ideally returning all values in single result set?
I think all of your queries could be derived from one 'intermediate' summary table. It would contain (yearweek, userid, count), with one row per user per week.
Users active per week: Pretty much the same query, but faster against this table.
Active for the first time: Self-join ON userid, comparing the desired week against MIN(yearweek)
Uses before the target week: ... SUM(count) WHERE ... < week GROUP BY userid
Use the above to determine which userids are Junior or Power users.
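The steps above can be sketched roughly as follows. This is only an illustration in Python against an in-memory SQLite database, not a tuned MySQL solution: strftime('%Y%W', ...) stands in for YEARWEEK() (the week numbering differs slightly), and the table name weekly, the column cnt and the sample actions are made up for the example.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (user_id INTEGER, action_id INTEGER, created TEXT)")

# Hypothetical actions: user 1 is very active in the first week.
actions = (
    [(0, "2017-01-02"), (0, "2017-01-09"), (0, "2017-01-16")]
    + [(1, "2017-01-02")] * 11
    + [(1, "2017-01-16")]
    + [(2, "2017-01-09")]
)
conn.executemany("INSERT INTO data (user_id, created) VALUES (?, ?)", actions)

# Intermediate table: one row per (week, user) with the number of actions.
conn.execute(
    """
    CREATE TABLE weekly AS
    SELECT strftime('%Y%W', created) AS week, user_id, COUNT(*) AS cnt
    FROM data
    GROUP BY week, user_id
    """
)

# For every (week, user), sum the actions of all earlier weeks, then
# classify: 0 prior actions = new, 1 to 10 = junior, more than 10 = power.
stats = conn.execute(
    """
    SELECT w.week,
           COUNT(*)                      AS active,
           SUM(w.prior = 0)              AS new_users,
           SUM(w.prior BETWEEN 1 AND 10) AS junior,
           SUM(w.prior > 10)             AS power
    FROM (
        SELECT cur.week, cur.user_id, COALESCE(SUM(prev.cnt), 0) AS prior
        FROM weekly cur
        LEFT JOIN weekly prev
               ON prev.user_id = cur.user_id AND prev.week < cur.week
        GROUP BY cur.week, cur.user_id
    ) AS w
    GROUP BY w.week
    ORDER BY w.week
    """
).fetchall()
print(stats)
```

The self-join now runs over one row per user per week instead of one row per action, which is where the speed-up would come from on a table with millions of rows.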

DB, how to select data based on time and particular interval

I have a table in my database, and my program inserts data into that table every 10 minutes.
The table has a field recording the insert date and time.
Now I want to retrieve that data, but I don't want hundreds of rows to come back.
I want to get 1 record for every half hour based on the insert timestamp (so fewer than 50 in total per day).
That 1 record can be either a random pick or the average of each interval.
Sorry for the ambiguity, I just want to figure out how to select by intervals.
Let's say,
Table name: network_speed
ID    Speed   Insert_time
1     10      10:02am
2     12      10:12am
...
...
...
123   17      9:23am
I want to get them all, but the output must be the average of the records in each half hour.
How can I write a query to achieve this?
Here is a query that calculates half hour intervals on a specific day (2013-09-04).
SELECT ID, Speed, Insert_time,
FLOOR(TIMESTAMPDIFF(MINUTE, '2013-09-04', Insert_time)/30) AS 'interval'
FROM network_speed
WHERE DATE(Insert_time) = '2013-09-04';
Use that in a nested query to get stats on the records in the intervals.
SELECT IT.interval, COUNT(ID), MIN(Insert_time), MAX(Insert_time), AVG(Speed)
FROM
(SELECT ID, Speed, Insert_time,
FLOOR(TIMESTAMPDIFF(MINUTE, '2013-09-04', Insert_time)/30) AS 'interval'
FROM network_speed
WHERE DATE(Insert_time) = '2013-09-04') AS IT
GROUP BY IT.interval;
Here it is used to get the first record in each interval.
SELECT NS.*
FROM
(SELECT IT.interval, MIN(ID) AS 'first_id'
FROM
(SELECT ID, Speed, Insert_time,
FLOOR(TIMESTAMPDIFF(MINUTE, '2013-09-04', Insert_time)/30) AS 'interval'
FROM network_speed
WHERE DATE(Insert_time) = '2013-09-04') AS IT
GROUP BY IT.interval) AS MI,
network_speed AS NS
WHERE MI.first_id = NS.ID;
Hope that helps.
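For anyone who wants to try the half-hour bucketing without a MySQL server, here is a minimal sketch in Python using an in-memory SQLite database. SQLite has no TIMESTAMPDIFF, so the bucket number is derived from the hour and minute instead (minutes since midnight divided by 30, as integer division); the alias bucket and the sample rows are assumptions for the example.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE network_speed ("
    "ID INTEGER PRIMARY KEY, Speed REAL, Insert_time TEXT)"
)

# Hypothetical readings, one every 10 minutes, plus one on another day.
conn.executemany(
    "INSERT INTO network_speed VALUES (?, ?, ?)",
    [
        (1, 10, "2013-09-04 10:02:00"),
        (2, 12, "2013-09-04 10:12:00"),
        (3, 14, "2013-09-04 10:22:00"),
        (4, 20, "2013-09-04 10:32:00"),
        (5, 30, "2013-09-04 10:42:00"),
        (6, 99, "2013-09-05 10:02:00"),
    ],
)

# bucket = minutes since midnight / 30 (integer division), so 0..47 per day.
query = """
SELECT (CAST(strftime('%H', Insert_time) AS INTEGER) * 60
        + CAST(strftime('%M', Insert_time) AS INTEGER)) / 30 AS bucket,
       COUNT(*)   AS n,
       AVG(Speed) AS avg_speed
FROM network_speed
WHERE date(Insert_time) = ?
GROUP BY bucket
ORDER BY bucket
"""
result = conn.execute(query, ("2013-09-04",)).fetchall()
print(result)
```

Bucket 20 covers 10:00 to 10:29 and bucket 21 covers 10:30 to 10:59, so the three early readings and the two later ones are averaged separately.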
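Is this what you need?
SELECT HOUR(ts) as hr, fld1, fld2 from tbl group by hr
This query extracts only the hour from the timestamp and then groups the result by that hour, so you get 1 row for each hour. Note that fld1 and fld2 are not aggregated, so MySQL rejects the query when ONLY_FULL_GROUP_BY is enabled; wrap them in an aggregate such as AVG() or MIN() in that case.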

How to get the average price for the X most recent rows based on date?

I am looking to calculate moving averages over variable dates.
My database is structured:
id int
date date
price decimal
For example, I'd like to find out if the average price going back 19 days ever gets greater than the average price going back 40 days within the past 5 days. Each of those time periods is variable.
What I am getting stuck on is selecting a specific number of rows for a subquery.
Select * from table
order by date
LIMIT 0 , 19
Knowing that there will only be 1 input per day, can I use the above as a subquery? After that the problem seems trivial....
If you only have one input per day, you don't need id; date can be your primary key. Am I missing something? Then use a sum over the rows you want and divide it by the number of days:
SELECT SUM(price) AS totalPrice FROM table Order by date desc Limit (most recent date),(furthest back date)
totalPrice/(total days)
I may not understand your question.
Yes, you can use that as a subquery like this:
SELECT
AVG(price)
FROM
(SELECT * FROM t ORDER BY date DESC LIMIT 10) AS t1;
This calculates the average price for the latest 10 rows.
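To connect this back to the 19-day versus 40-day comparison in the question, here is a hedged sketch in Python with an in-memory SQLite database standing in for MySQL. The helper recent_average, the as_of parameter and the steadily rising sample prices are all made up for illustration; the subquery inside it is the same ORDER BY date DESC LIMIT n pattern as above.

```python
import sqlite3
from datetime import date, timedelta

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, date TEXT, price REAL)")

# Hypothetical data: one row per day for 60 days, prices 1.0 .. 60.0.
start = date(2024, 1, 1)
rows = [((start + timedelta(days=i)).isoformat(), float(i + 1)) for i in range(60)]
conn.executemany("INSERT INTO t (date, price) VALUES (?, ?)", rows)


def recent_average(conn, n, as_of):
    """Average price over the n most recent rows dated on or before as_of."""
    sql = """
        SELECT AVG(price)
        FROM (SELECT price FROM t WHERE date <= ? ORDER BY date DESC LIMIT ?) AS t1
    """
    return conn.execute(sql, (as_of, n)).fetchone()[0]


last_day = rows[-1][0]
short_avg = recent_average(conn, 19, last_day)
long_avg = recent_average(conn, 40, last_day)

# Did the 19-row average exceed the 40-row average on any of the last 5 days?
crossed = any(
    recent_average(conn, 19, day) > recent_average(conn, 40, day)
    for day, _ in rows[-5:]
)
print(short_avg, long_avg, crossed)
```

Because there is exactly one row per day, "the last 19 rows" and "the last 19 days" are the same thing here; with gaps in the data you would filter on the date range instead of using LIMIT.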