I have two columns in a table: time and volume.
The time column has a resolution of seconds (format: YYMMDDHHmmss) and the volume column holds traffic volumes. I want to write a script that calculates a time series of total volumes over 5-minute windows, i.e. a time series with 5-minute bins.
How can I do that? Group by?
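Yes, grouping on a truncated timestamp does it. A minimal sketch, assuming the table is called `traffic`, the columns are named `time` and `volume` (both names are assumptions), and `time` is stored as a string:

-- Parse YYMMDDHHmmss into a DATETIME, then truncate the epoch seconds
-- down to a 5-minute (300-second) boundary and sum per bin.
SELECT
    FROM_UNIXTIME(
        UNIX_TIMESTAMP(STR_TO_DATE(`time`, '%y%m%d%H%i%s')) DIV 300 * 300
    ) AS bin_start,
    SUM(volume) AS total_volume
FROM traffic
GROUP BY bin_start
ORDER BY bin_start;

Each row of the result is labeled with the start time of its 5-minute bin.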
I have a table which contains rows of data for 215 VMs. Each row has the following information: VM name, datestamp, disk, capacity, throughput, IOPS, etc.
Each VM has one or more disks, which means there may be more than one row of data per VM for each datestamp.
What I need are results with the total disk capacity per VM. It doesn't matter which datestamp I get the data from, as the capacity generally stays the same from one datestamp to the next. However, not all VMs exist at every datestamp.
So here is what I have tried.
Query attempt 1:
SELECT DISTINCT `VM`, SUM(CapacityGB) AS cap FROM stats GROUP BY `VM`
The issue here is that I get results for each VM, but instead of the sum of capacity for a single datestamp, it sums ALL datestamps for that VM, which means the capacity is WAY too high.
Query attempt 2:
SELECT DISTINCT `VM`, `datestamp`, SUM(CapacityGB) AS cap FROM stats GROUP BY `VM`, `datestamp`
This query gives me the proper capacity results, but I get a row for each datestamp for each VM (I have 215 VMs and it returns 394,701 rows, because I have about two weeks' worth of stats at 10-minute intervals).
So what I would like is a hybrid: one row of results per VM. Any pointers?
You could just wrap your second query up as a subquery and select the MAX value of cap (which should be the same as any of the others), grouping by VM:
SELECT VM, MAX(cap)
FROM (SELECT `VM`, `datestamp`, SUM(CapacityGB) AS cap
      FROM stats
      GROUP BY `VM`, `datestamp`) s
GROUP BY VM
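If you'd rather not aggregate twice, a variant of the same idea (a sketch only, using the same `stats` table and the same assumption that capacity is stable across datestamps) is to pick one datestamp per VM first and sum only that snapshot's rows:

-- Pick the most recent datestamp per VM, then sum that datestamp's disks.
-- This also sidesteps VMs that don't exist at every datestamp.
SELECT s.VM, SUM(s.CapacityGB) AS cap
FROM stats s
JOIN (SELECT VM, MAX(datestamp) AS datestamp
      FROM stats
      GROUP BY VM) latest
  ON latest.VM = s.VM AND latest.datestamp = s.datestamp
GROUP BY s.VM;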
I am currently trying to calculate the retention rate (the percentage of customers who returned to a webpage) over 3 days, across an entire table spanning 14 days. To do so, I am trying to count the total users (visitorId) who returned to the page between specific dates; I would then average those counts together to get the average retention rate for the 14 days. Currently I am using this code, but it does not seem to work.
SELECT
pageviews.pageType,
pageviews.pageviewDate,
sessions.sessionDate,
sessions.deviceType,
sessions.visitorId
AVG(COUNT(sessions.visitor > 1 BETWEEN sessions.sessionDate '2018-04-26' AND '2018-04-29')
# There would be multiple of these dates
FROM sessions
INNER JOIN pageviews
ON sessions.visitorId = pageviews.visitorId AND
pageviews.pageviewDate = sessions.sessionDate
WHERE
pageviews.pageType = 'Game' AND sessions.deviceType = 'Desktop';
To be more specific, the desired result would be a single number stating the average number of customers who returned to a specific page (in this case, Game) using Desktop devices. Can anyone help? Please let me know if more clarification is needed. Note: for simplicity, I did not list all the dates for which I would calculate the retention rate, as there would be many.
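For what it's worth, one way to structure this is a per-date subquery averaged by an outer query. This is a sketch only: the exact retention definition, whether the return visit must also be a Game/Desktop visit, and the end date '2018-05-09' (start date plus 14 days) are all assumptions.

-- Per start date, count distinct visitors who viewed the 'Game' page in a
-- Desktop session and came back within the next 3 days, then average
-- those daily counts across the 14-day span.
SELECT AVG(returned) AS avg_3day_retention
FROM (
    SELECT s1.sessionDate,
           COUNT(DISTINCT s2.visitorId) AS returned
    FROM sessions s1
    JOIN pageviews p
      ON p.visitorId = s1.visitorId
     AND p.pageviewDate = s1.sessionDate
     AND p.pageType = 'Game'
    JOIN sessions s2
      ON s2.visitorId = s1.visitorId
     AND s2.sessionDate > s1.sessionDate
     AND s2.sessionDate <= DATE_ADD(s1.sessionDate, INTERVAL 3 DAY)
    WHERE s1.deviceType = 'Desktop'
      AND s1.sessionDate BETWEEN '2018-04-26' AND '2018-05-09'
    GROUP BY s1.sessionDate
) per_day;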
I'm afraid I'm stuck with this situation:
I have a MySQL table with just 3 columns: ID, CREATED, TOTAL_VALUE.
A new TOTAL_VALUE is recorded roughly every 60 seconds, so about 1440 times a day.
I am using PHP to generate some CanvasJS code that plots the MySQL records into a line graph, so that I can see how TOTAL_VALUE changes over time.
It works great for displaying one day's worth of data, but when doing one week (7 × 1440 = 10,080 plot points) things get really slow.
And a date range of, for example, 1-JAN-2016 to 1-SEP-2016 just leads to timeouts in the PHP script.
How can I write some MySQL that still selects records between a date range but limits the rows returned to, say, a maximum of 1000?
I need to optimize this by limiting the number of data points that need to be plotted.
Can MySQL do some clever stuff where it decides to skip every so many rows and return 1000 averaged values, so that my line graph would still be approximately correct but use fewer data points?
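Grouping by a bucket number derived from the timestamp does exactly this. A minimal sketch, where the table name `readings` is an assumption (CREATED, TOTAL_VALUE, and the date range come from the question):

-- Bucket the rows into at most ~1000 groups across the requested range
-- and return one averaged value per bucket.
SELECT
    MIN(CREATED) AS bucket_start,
    AVG(TOTAL_VALUE) AS avg_value
FROM readings
WHERE CREATED >= '2016-01-01' AND CREATED < '2016-09-01'
GROUP BY UNIX_TIMESTAMP(CREATED) DIV
         ((UNIX_TIMESTAMP('2016-09-01') - UNIX_TIMESTAMP('2016-01-01')) DIV 1000)
ORDER BY bucket_start;

Each bucket is span/1000 seconds wide, so the query returns at most about 1000 rows regardless of the date range; you can compute the bucket width in PHP if you want to parameterize it.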
I need to calculate a 10-minute average over my MySQL table, but the default average is not what I want. I have code for the "default" average:
SELECT Date, CONVERT((MIN(Time) DIV 1000) * 1000, TIME) AS Time, ROUND(AVG(Value), 2) FROM RawData
GROUP BY Date, Time DIV 1000
How can I calculate an average like this:
http://prntscr.com/8k4slp
And one more thing: I need to prevent the average from being calculated over incomplete intervals. How can I do this?
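The screenshot isn't reproduced here, so this only addresses the incomplete-interval part: filter out groups that don't contain a full interval's worth of rows with HAVING. A sketch assuming one row per minute, so a complete 10-minute interval has 10 rows; adjust the threshold to your actual sampling rate:

-- Same grouping as the query above (Time stored as HHMMSS, so
-- Time DIV 1000 yields one value per 10-minute block), plus a HAVING
-- clause that keeps only complete intervals.
SELECT Date,
       CONVERT((MIN(Time) DIV 1000) * 1000, TIME) AS Time,
       ROUND(AVG(Value), 2) AS AvgValue
FROM RawData
GROUP BY Date, Time DIV 1000
HAVING COUNT(*) = 10;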
I need to build the backend for a chart which needs a fixed number of data points; let's assume 10 for this example. I need to get all entries in a table, split them into 10 chunks (by their respective date column), and show how many entries there were in each date interval.
I have managed to do sort of the opposite (I can get the entries for a fixed interval, with a variable number of data points), but now I need a fixed number of data points and a variable date interval.
What I was thinking (which didn't work) was to take the difference between the min and max dates in the table, divide it by 10 (the number of data points), then divide each row's date column by that result and group by it. Either I messed up the query somewhere or my logic is faulty, because it didn't work.
Something along these lines:
SELECT (UNIX_TIMESTAMP(created_at) DIV
        (SELECT (MAX(UNIX_TIMESTAMP(created_at)) - MIN(UNIX_TIMESTAMP(created_at))) / 10
         FROM user)) x
FROM user
GROUP BY x;
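A sketch of the same idea that seems closer to the goal: bucket each row relative to the earliest timestamp, so the result is always buckets 0 through 9, and add the missing count (the `user` table and `created_at` column are the names from the question):

-- Compute each row's bucket as a 0..9 offset from the earliest timestamp
-- and count rows per bucket. The +1 in the divisor keeps the newest row
-- out of a phantom 11th bucket.
SELECT
    (UNIX_TIMESTAMP(u.created_at) - b.min_ts) * 10 DIV (b.max_ts - b.min_ts + 1) AS bucket,
    COUNT(*) AS entries
FROM user u
CROSS JOIN (
    SELECT MIN(UNIX_TIMESTAMP(created_at)) AS min_ts,
           MAX(UNIX_TIMESTAMP(created_at)) AS max_ts
    FROM user
) b
GROUP BY bucket
ORDER BY bucket;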