I have a great SQL query (provided by the very helpful FoxyGen) which returns a specific subset of information from a large set of metrics:
SELECT SUM(metric_activeSessions),
MAX(metric_timeStamp) as max_time,
MIN(metric_timeStamp) as min_time
FROM metrics_tbl
WHERE metric_timeStamp > NOW() - INTERVAL 60 MINUTE
GROUP BY unix_timestamp(metric_timeStamp) div 300
I'm trying to figure out if I can append something to this existing query that will return only adjacent results which have a major negative difference in value.
For example, if the results are:
1. 2334 # 12:01
2. 2134 # 12:05
In this scenario, result 2 has changed by -200 compared with result 1.
Thanks in advance for any guidance.
Mitch
You can calculate the difference using variables in MySQL:
SELECT s, max_time, min_time,
-- comparing with "= NULL" is never true, so both assignments always run
(case when (@x := @sprev) = NULL then NULL
when (@sprev := s) = NULL then NULL
else @x
end) as sprev
FROM (SELECT SUM(metric_activeSessions) as s, MAX(metric_timeStamp) as max_time,
MIN(metric_timeStamp) as min_time
FROM metrics_tbl
WHERE metric_timeStamp > NOW() - INTERVAL 60 MINUTE
GROUP BY unix_timestamp(metric_timeStamp) div 300
) t CROSS JOIN
(SELECT @sprev := NULL) vars
order by max_time;
You can then test for whatever difference you like using a having clause placed before the order by. For a drop of at least 100 between adjacent results:
having sprev - s >= 100
or whatever.
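On MySQL 8.0 or later, the same comparison can be written with the LAG() window function instead of user variables (assigning user variables inside a SELECT is deprecated in MySQL 8.0). This is only a minimal sketch: it assumes the table and columns from the question, and the threshold of 100 and the aliases buckets, with_prev and diff are illustrative.

```sql
SELECT s, max_time, min_time, sprev, s - sprev AS diff
FROM (
    SELECT s, max_time, min_time,
           -- s from the previous 5-minute bucket (NULL for the first bucket)
           LAG(s) OVER (ORDER BY max_time) AS sprev
    FROM (
        SELECT SUM(metric_activeSessions) AS s,
               MAX(metric_timeStamp) AS max_time,
               MIN(metric_timeStamp) AS min_time
        FROM metrics_tbl
        WHERE metric_timeStamp > NOW() - INTERVAL 60 MINUTE
        GROUP BY UNIX_TIMESTAMP(metric_timeStamp) DIV 300
    ) buckets
) with_prev
-- keep only buckets that dropped by at least 100 (assumed threshold);
-- the first bucket has sprev = NULL and is filtered out
WHERE sprev - s >= 100
ORDER BY max_time;
```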
set @cumulativeSum := 0;
SELECT
z.hour+1 as time,
(@cumulativeSum := @cumulativeSum + z.enquiries) as target
FROM
( SELECT
y.hour as hour,
IFNULL(ROUND(AVG(y.enquiries), 0), 0) as enquiries
FROM
( SELECT DAY(o.date_created),
HOUR(o.date_created) as hour,
COUNT(DISTINCT o.phone) as enquiries
FROM orders o
WHERE phone IS NOT NULL
AND name NOT LIKE 'test%'
AND o.email NOT LIKE 'jawor%'
AND o.email NOT LIKE 'test%'
AND o.email NOT LIKE '%moneyshake%'
AND o.date_created < CURRENT_DATE()
AND o.date_created >= DATE(NOW()) + INTERVAL -7 DAY
GROUP BY HOUR(o.date_created), DAY(o.date_created) ) y
GROUP BY hour ) z
This query is meant to give an average number of enquiries by hour for the last 7 days. I've done it this way to exclude duplicates of o.phone only within each hour of each day, rather than across all days or all hours.
It outputs:
time | target
1 | 1
2 | 2
3 | 3
5 | 4
etc.
I want it to include a row for 4am, even if the value for target doesn't change (because the AVG for 4am is 0)
Please let me know if more info is needed!
Credit to Akina's comment for solving
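Akina's comment is not quoted here, so the following is only one possible approach, not necessarily the one that was suggested. The sketch assumes MySQL 8.0 or later: it generates all 24 hours with a recursive CTE, left joins the hourly averages onto them so that hours without data count as 0, and replaces the user variable with a running SUM() OVER. The CTE names hours, per_day and per_hour and the column names hr and day_date are made up for illustration.

```sql
WITH RECURSIVE hours (hr) AS (
    SELECT 0
    UNION ALL
    SELECT hr + 1 FROM hours WHERE hr < 23
),
per_day AS (
    -- distinct phones per hour of each day, as in the original inner query
    SELECT DATE(o.date_created) AS day_date,
           HOUR(o.date_created) AS hr,
           COUNT(DISTINCT o.phone) AS enquiries
    FROM orders o
    WHERE o.phone IS NOT NULL
      AND o.name NOT LIKE 'test%'
      AND o.email NOT LIKE 'jawor%'
      AND o.email NOT LIKE 'test%'
      AND o.email NOT LIKE '%moneyshake%'
      AND o.date_created < CURRENT_DATE()
      AND o.date_created >= CURRENT_DATE() - INTERVAL 7 DAY
    GROUP BY DATE(o.date_created), HOUR(o.date_created)
),
per_hour AS (
    SELECT hr, ROUND(AVG(enquiries), 0) AS enquiries
    FROM per_day
    GROUP BY hr
)
SELECT h.hr + 1 AS time,
       -- hours with no enquiries contribute 0, so the running total repeats
       SUM(COALESCE(p.enquiries, 0)) OVER (ORDER BY h.hr) AS target
FROM hours h
LEFT JOIN per_hour p ON p.hr = h.hr
ORDER BY h.hr;
```

Because the row set is driven by the hours CTE rather than by the orders table, every hour appears in the output even when its average is 0.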
I have a MySQL database with each row containing an activate and a deactivate date. This refers to the period of time when the object the row represents was active.
activate deactivate id
2015-03-01 2015-05-10 1
2013-02-04 2014-08-23 2
I want to find the number of rows that were active at any time during each month. Ex.
Jan: 4
Feb: 2
Mar: 1
etc...
I figured out how to do this for a single month, but I'm struggling with how to do it for all 12 months in a year in a single query. The reason I would like it in a single query is for performance, as information is used immediately and caching wouldn't make sense in this scenario. Here's the code I have for a month at a time. It checks if the activate date comes before the end of the month in question and that the deactivate date was not before the beginning of the period in question.
SELECT * from tblName WHERE activate <= DATE_SUB(NOW(), INTERVAL 1 MONTH)
AND deactivate >= DATE_SUB(NOW(), INTERVAL 2 MONTH)
If anybody has any idea how to change this and do grouping such that I can do this for an indefinite number of months I'd appreciate it. I'm at a loss as to how to group.
If you have a table of months that you care about, you can do:
select m.*,
(select count(*)
from tblName t
where t.activate <= m.month_end and
t.deactivate >= m.month_start
) as Actives
from months m;
If you don't have such a table handy, you can create one on the fly:
select m.*,
(select count(*)
from tblName t
where t.activate <= m.month_end and
t.deactivate >= m.month_start
) as Actives
from (select date('2015-01-01') as month_start, date('2015-01-31') as month_end union all
select date('2015-02-01') as month_start, date('2015-02-28') as month_end union all
select date('2015-03-01') as month_start, date('2015-03-31') as month_end union all
select date('2015-04-01') as month_start, date('2015-04-30') as month_end
) m;
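Writing out one union all branch per month gets tedious for longer ranges. On MySQL 8.0 or later, the month list can instead be generated with a recursive CTE. This is a sketch, assuming the table tblName and the columns activate and deactivate from the question; the 2015 date range and the CTE name months are illustrative.

```sql
WITH RECURSIVE months (month_start) AS (
    SELECT DATE('2015-01-01')
    UNION ALL
    SELECT month_start + INTERVAL 1 MONTH
    FROM months
    WHERE month_start < DATE('2015-12-01')
)
SELECT m.month_start,
       LAST_DAY(m.month_start) AS month_end,
       -- rows whose active period overlaps the month at any point
       (SELECT COUNT(*)
        FROM tblName t
        WHERE t.activate <= LAST_DAY(m.month_start)
          AND t.deactivate >= m.month_start) AS actives
FROM months m
ORDER BY m.month_start;
```

LAST_DAY() computes the month end, so February and leap years need no special handling.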
EDIT:
A potentially faster way is to calculate a cumulative sum of activations and deactivations and then take the maximum per month:
select year(d), month(d), max(cumes)
from (select d, (@s := @s + inc) as cumes
from (select activate as d, 1 as inc from tblName union all
select deactivate, -1 as inc from tblName
) t cross join
(select @s := 0) param
order by d
) s
group by year(d), month(d);
I have a single table with a list of hits/downloads, every row has of course a date.
I was able to sum all the rows grouped by day.
Do you think it's possible to also calculate the change in percentage of every daily sum compared to the previous day using a single query, starting from the entire list of hits?
I tried to do this
select *, temp1.a-temp2.b/temp1.a*100 as percentage from
(select DATE(date), count(id_update) as a from vas_updates group by DATE(date)) as table1
UNION
(select DATE_ADD(date, INTERVAL 1 DAY), count(id_update) as b from vas_updates group by DATE(date)) as table2, vas_updates
but it won't work (100% CPU + crash).
Of course I can't JOIN them directly, because the two temp tables are offset by one day and so share no common value.
The table looks like this, nothing fancy.
id_updates | date
1 2014-07-06 12:45:21
2 2014-07-06 12:46:10
3 2014-07-07 10:16:10
and I want
date | sum a | sum b | percentage
2014-07-07 2 1 -50%
It can be either positive or negative, obviously.
select DATE(v.date) as dt, count(v.id_update) as a, q2.b,
(count(v.id_update) - q2.b) / count(v.id_update) * 100 as Percentage
from vas_updates v
Left Join (select DATE(DATE_ADD(date, INTERVAL 1 DAY)) as d2, count(id_update) as b
from vas_updates group by d2) as q2
ON DATE(v.date) = q2.d2
group by DATE(v.date), q2.b
The sum by day is:
select DATE(date), count(id_update) as a
from vas_updates
group by DATE(date);
In MySQL, the easiest way to get the previous value is by using variables, which looks something like this:
select DATE(u.date), count(u.id_update) as cnt,
@prevcnt as prevcnt, count(u.id_update) / @prevcnt * 100,
@prevcnt := count(u.id_update)
from vas_updates u cross join
(select @prevcnt := 0) vars
group by DATE(u.date)
order by date(u.date);
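This will generally work in practice, but MySQL doesn't guarantee the order in which the expressions in a select list are evaluated, so the variable may be assigned before it is read. A safer approach reads and assigns the variable inside a single expression: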
select dt, cnt, prevcnt, (case when prevcnt > 0 then 100 * cnt / prevcnt end)
from (select DATE(u.date) as dt, count(u.id_update) as cnt,
(case when (@tmp := @prevcnt) is null then null
when (@prevcnt := count(u.id_update)) is null then null
else @tmp
end) as prevcnt
from vas_updates u cross join
(select @prevcnt := 0, @tmp := 0) vars
group by DATE(u.date)
order by date(u.date)
) t;
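If MySQL 8.0 or later is available, LAG() avoids the evaluation-order question entirely. This is a sketch, assuming the vas_updates table and the date and id_update columns from the question; the aliases daily, with_prev and pct_change are illustrative. Note that LAG() looks at the previous day that has rows, which is not necessarily the previous calendar day.

```sql
SELECT dt, cnt, prevcnt,
       -- change relative to the previous day's count; NULL when there is none
       CASE WHEN prevcnt > 0
            THEN 100 * (cnt - prevcnt) / prevcnt
       END AS pct_change
FROM (
    SELECT dt, cnt,
           LAG(cnt) OVER (ORDER BY dt) AS prevcnt
    FROM (
        SELECT DATE(u.date) AS dt, COUNT(u.id_update) AS cnt
        FROM vas_updates u
        GROUP BY DATE(u.date)
    ) daily
) with_prev
ORDER BY dt;
```

With the three sample rows from the question (two hits on 2014-07-06, one on 2014-07-07), the row for 2014-07-07 has cnt = 1, prevcnt = 2 and a change of -50.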
How can I write a SQL query that returns all the records in a table except those that are less than 2 seconds apart from another record?
Example - Consider these 5 records:
16:24:00
16:24:10
16:24:11
16:24:12
16:24:30
The query should return:
16:24:00
16:24:10
16:24:30
Any help would be massively appreciated
I had a similar problem a while ago.
Here is what I came up with, which works quite nicely.
I. This query groups all results from 'your_table' (referred to here as x) into 5-second buckets.
It outputs the count of rows that fall inside each 5-second time frame.
SELECT count(x.id), x.created_at FROM your_table AS x
GROUP BY ( 60 * 60 * HOUR( x.created_at ) +
60 * FLOOR( MINUTE( x.created_at )) +
FLOOR( SECOND( x.created_at ) / 5))
II. This query groups all results from 'your_table' (referred to here as x) into 1-minute buckets.
Like the one above, it outputs the count of rows that fall inside each 1-minute time frame.
SELECT count(x.id), x.created_at FROM your_table AS x
GROUP BY ( 60 * HOUR( x.created_at ) +
FLOOR( MINUTE( x.created_at ) / 1))
Example output for query I and your input.
count(x.id), created_at
1 16:24:00
3 16:24:10
1 16:24:30
Hope this helps.
The best solution is to number and sort your timestamps, then self-join consecutive records on the condition that they are 3 seconds or more apart. Each derived table needs its own counter variable:
select a.* from (
select timestamp,
@rowA := @rowA + 1 AS row_num
from your_table
JOIN (SELECT @rowA := 0) r
order by timestamp
)a inner join (
select timestamp,
@rowB := @rowB + 1 AS row_num
from your_table
JOIN (SELECT @rowB := 0) r
order by timestamp
)b on time_to_sec(a.timestamp)<time_to_sec(b.timestamp)-2 and
a.row_num=b.row_num-1
To get no more than one row per 3 seconds, break the timestamps into 3-second intervals, group by the interval, and take the lowest existing value in each with min():
select min(timestamp)
from your_table
group by concat(left(timestamp, 6),3*round(right(timestamp, 2)/3))
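For the rule as stated in the question (drop a record when it follows another one by less than 2 seconds), a LAG()-based filter on MySQL 8.0 or later is another option. This is a sketch, assuming a DATETIME column named created_at in a table named your_table; both names are placeholders, as are the aliases ts, prev_ts and ordered.

```sql
SELECT ts
FROM (
    SELECT t.created_at AS ts,
           -- timestamp of the immediately preceding row, NULL for the first row
           LAG(t.created_at) OVER (ORDER BY t.created_at) AS prev_ts
    FROM your_table t
) ordered
WHERE prev_ts IS NULL
   OR TIMESTAMPDIFF(SECOND, prev_ts, ts) >= 2
ORDER BY ts;
```

For the five sample times this keeps 16:24:00, 16:24:10 and 16:24:30. Each row is compared with the immediately preceding row, not with the last row that was kept, so a long run of records spaced 1 second apart is reduced to its first record only.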
I have a database that's set up like this:
(Schema Name)
Historical
-CID int UQ AI NN
-ID Int PK
-Location Varchar(255)
-Status Varchar(255)
-Time datetime
So an entry might look like this
433275 | 97 | MyLocation | OK | 2013-08-20 13:05:54
My question is, if I'm expecting 5 minute interval data from each of my sites, how can I determine how long a site has been down?
Example, if MyLocation didn't send in the 5 minute interval data from 13:05:54 until 14:05:54 it would've missed 60 minutes worth of intervals, how could I find this downtime and report on it easily?
Thanks,
*Disclaimer: I'm assuming that your time column determines the order of the entries in your table and that you can't easily (and without heavy performance loss) self-join the table on auto_increment column since it can contain gaps.*
Either you create a table containing simply datetime values and do a
FROM datetime_table d
LEFT JOIN your_table y ON DATE_FORMAT(d.datetimevalue, '%Y-%m-%d %H:%i:00') = DATE_FORMAT(y.`time`, '%Y-%m-%d %H:%i:00')
WHERE y.some_column IS NULL
(date_format() function is used here to get rid of the seconds part in the datetime values).
Or you use user defined variables.
SELECT * FROM (
SELECT
y.*,
TIMESTAMPDIFF(MINUTE, @prevDT, `Time`) AS timedifference,
@prevDT := `Time`
FROM your_table y ,
(SELECT @prevDT:=(SELECT MIN(`Time`) FROM your_table)) vars
ORDER BY `Time`
) sq
WHERE timedifference > 5
EDIT: I thought you wanted to scan the whole table (or parts of it) for rows where the time difference to the previous row is greater than 5 minutes. To check a specific ID (still under the assumptions in the disclaimer) you'd have to take a different approach:
SELECT
TIMESTAMPDIFF(MINUTE, (SELECT `Time` FROM your_table sy WHERE sy.ID < y.ID ORDER BY ID DESC LIMIT 1), `Time`) AS timedifference
FROM your_table y
WHERE ID = whatever
EDIT 2:
When you say "if the ID is currently down" is there already an entry in your table or not? If not, you can simply check this via
SELECT TIMESTAMPDIFF(MINUTE, (SELECT MAX(`Time`) FROM your_table WHERE ID = whatever), NOW());
So I assume you are going to have some sort of cron job running to check this table. If that is the case, you can simply take the highest time value for each ID/location and compare it against the current time to flag any IDs whose most recent time is older than the specified threshold. You can do that like this:
SELECT id, location, MAX(time) as most_recent_time
FROM Historical
GROUP BY id, location
HAVING most_recent_time < DATE_SUB(NOW(), INTERVAL 5 MINUTE)
Something like this:
SELECT h1.ID, h1.location, h1.time, min(h2.time)
FROM Historical h1 LEFT JOIN Historical h2
ON (h1.ID = h2.ID AND h2.CID > h1.CID)
WHERE now() > h1.time + INTERVAL 301 SECOND
GROUP BY h1.ID, h1.location, h1.time
HAVING min(h2.time) IS NULL
OR min(h2.time) > h1.time + INTERVAL 301 SECOND
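The gap checks above can also be written with a window function on MySQL 8.0 or later, which reports each outage together with its length. This is a sketch, assuming the Historical table from the question with ID identifying the site (as the answers above do); the aliases prev_time, gap_start, gap_end and minutes_between are illustrative.

```sql
SELECT ID, Location,
       prev_time AS gap_start,
       `Time` AS gap_end,
       TIMESTAMPDIFF(MINUTE, prev_time, `Time`) AS minutes_between
FROM (
    SELECT ID, Location, `Time`,
           -- previous report from the same site, NULL for its first report
           LAG(`Time`) OVER (PARTITION BY ID ORDER BY `Time`) AS prev_time
    FROM Historical
) h
-- more than one 5-minute interval elapsed between consecutive reports
WHERE TIMESTAMPDIFF(MINUTE, prev_time, `Time`) > 5
ORDER BY ID, gap_start;
```

For the example in the question, a site that reported at 13:05:54 and next at 14:05:54 would appear with minutes_between = 60. A site that is still down has no later row, so it is not caught here; the MAX(time) check against NOW() covers that case.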