SQL - Comparing difference between values in same column - mysql

I need some help with how to compare values in the same column LogNum to find 'unusual' entries. For example, in my table below LogTbl we can see that on ID number 4 the LogNum entry jumps massively compared to the previous pattern of entries.
How can I compare these LogNum entries and identify/output any that have increased by say more than 5% from the previous entry, using LogDate to age the entries?
ID  LogDate                  LogNum
1   2006-05-26 00:00:00.000  112
2   2006-07-19 00:00:00.000  145
3   2006-09-08 00:00:00.000  162
4   2006-11-01 00:00:00.000  1787
Thanks.

There are no formal criteria for your words 'massive' and 'unusual'. However, you could try selecting the records where LogNum lies within the bounds (LogNum >= AVG(LogNum) - 2 * STDDEV(LogNum)) AND (LogNum <= AVG(LogNum) + 2 * STDDEV(LogNum)); rows that fall outside those bounds are the unusual ones.

Your requirements can be interpreted in several ways. One possible idea would be to identify a threshold using an average or standard deviation and filter rows that exceed the threshold.
with a as (
    select *, avg(lognum) over() threshold
    from t
)
select *
from a
where lognum > threshold
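As a rough, runnable sketch of the average-threshold idea above, the same query can be tried against the sample data from the question. It is wrapped in Python's standard-library sqlite3 module purely so it can be executed without a MySQL server; the table name t comes from the answer, while the function name and the SAMPLE list are made up for illustration. It assumes SQLite 3.25 or newer (window functions), and the SQL itself should carry over to MySQL 8.

```python
import sqlite3

# Sample rows from the question: (ID, LogDate, LogNum).
SAMPLE = [
    (1, "2006-05-26", 112),
    (2, "2006-07-19", 145),
    (3, "2006-09-08", 162),
    (4, "2006-11-01", 1787),
]


def ids_above_average(rows):
    """Return the IDs whose LogNum is greater than the table-wide average."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (ID INTEGER, LogDate TEXT, LogNum INTEGER)")
    con.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
    query = """
        WITH a AS (
            -- AVG(...) OVER () attaches the overall average to every row.
            SELECT *, AVG(LogNum) OVER () AS threshold
            FROM t
        )
        SELECT ID
        FROM a
        WHERE LogNum > threshold
        ORDER BY ID
    """
    ids = [row[0] for row in con.execute(query)]
    con.close()
    return ids


print(ids_above_average(SAMPLE))  # [4]
```

The average of the four sample values is 551.5, so only ID 4 exceeds it. On larger tables a plain average may be too blunt a threshold.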
If you are interested only in differences between adjacent rows, you could use lead, i.e., find rows where the next value is more than 25% above the current value (note that this returns the row just before each jump):
select Id, LogDate, Lognum
from (
    select *, Lead(lognum) over(order by logdate) nxt
    from t
) t
where nxt > lognum * 1.25
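If the goal is to return the row that jumped rather than the row before it, as the question asks with its 5% figure, a LAG-based variant is one option. The sketch below is an illustration under assumptions, not the answerer's query: it runs in Python's standard-library sqlite3 module (SQLite 3.25+ for window functions), and the names jumped_ids, prev_lognum and SAMPLE are hypothetical.

```python
import sqlite3

# Sample rows from the question: (ID, LogDate, LogNum).
SAMPLE = [
    (1, "2006-05-26", 112),
    (2, "2006-07-19", 145),
    (3, "2006-09-08", 162),
    (4, "2006-11-01", 1787),
]


def jumped_ids(rows, factor):
    """Return IDs whose LogNum exceeds the previous entry's LogNum times factor."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (ID INTEGER, LogDate TEXT, LogNum INTEGER)")
    con.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
    query = """
        SELECT ID
        FROM (
            -- LAG fetches the LogNum of the previous row in LogDate order.
            SELECT ID, LogNum,
                   LAG(LogNum) OVER (ORDER BY LogDate) AS prev_lognum
            FROM t
        ) AS x
        -- The first row has no predecessor, so its prev_lognum is NULL.
        WHERE prev_lognum IS NOT NULL
          AND LogNum > prev_lognum * ?
        ORDER BY ID
    """
    ids = [row[0] for row in con.execute(query, (factor,))]
    con.close()
    return ids


print(jumped_ids(SAMPLE, 1.05))  # [2, 3, 4]
print(jumped_ids(SAMPLE, 1.25))  # [2, 4]
```

With the sample data every entry after the first grows by more than 5%, so a 5% threshold flags IDs 2, 3 and 4; a larger factor is needed to isolate ID 4.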

Related

How to get previous row value that is a mathematical argument with alias in mysql query?

I am having trouble getting the previous row's value in a query. The query has a field percent that is computed as ROUND(COUNT(DISTINCT userSteps.userid) / (SELECT COUNT(*) FROM userSteps) * 100.0, 2) AS percent
The idea is that, for the percent field, I take the difference between the current row and the previous one, and that becomes the value of the delta field. On the 1st row it should be 0, since there is no row before it. Any ideas?
I was planning to have (percent - <previous of percent value>) as delta
I tried to use LAG(), but it gave me an unknown column error.
SET @startDate='';
SET @endDate='';
SET @version='';
WITH userSteps AS
(SELECT ... )
SELECT steps.version as version,
steps.step as step,
steps.stepDesc as stepDesc,
COUNT(DISTINCT userSteps.userid) AS numUsers,
ROUND(COUNT(DISTINCT userSteps.userid) / (SELECT COUNT(*) FROM userSteps) * 100.0, 2) AS percent,
LAG(percent, 1) OVER (
PARTITION BY steps.version
ORDER BY steps.step
) delta
FROM
(SELECT ...) steps,
userSteps
WHERE userSteps.version = steps.version AND userSteps.maxStep >= steps.step
GROUP BY steps.version, steps.step;
This is the part I added that causes the error:
LAG(percent, 1) OVER (
PARTITION BY steps.version
ORDER BY steps.step
) delta
I expect that it will look like this one.
version step stepdesc numUsers percent delta
1 1 .. 10 100 0
1 2 .. 5 98 2
1 3 .. 3 92 6
1 4 .. 4 90 2
1 5 .. 8 80 10
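One possible way to get a delta like the one in the expected output is to compute percent in a CTE first and apply LAG one level further out, since a column alias cannot be referenced from another expression in the same SELECT list. The sketch below only illustrates that pattern under assumptions: it runs in Python's standard-library sqlite3 module (SQLite 3.25+), the table step_counts, the CTE name stats and the user counts are invented so that the percent column matches the expected output, and delta is taken as previous minus current because that is what the expected output shows.

```python
import sqlite3

# Hypothetical per-step user counts: (version, step, numUsers).
STEP_COUNTS = [(1, 1, 50), (1, 2, 49), (1, 3, 46), (1, 4, 45), (1, 5, 40)]


def step_deltas(step_counts, total_users):
    """Return (version, step, percent, delta) rows, with delta 0 on the first step."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE step_counts (version INTEGER, step INTEGER, numUsers INTEGER)"
    )
    con.executemany("INSERT INTO step_counts VALUES (?, ?, ?)", step_counts)
    query = """
        WITH stats AS (
            -- Inner level: compute the aliased expression.
            SELECT version, step,
                   ROUND(numUsers * 100.0 / ?, 2) AS percent
            FROM step_counts
        )
        -- Outer level: percent is now an ordinary column that LAG can read.
        SELECT version, step, percent,
               COALESCE(
                   LAG(percent) OVER (PARTITION BY version ORDER BY step) - percent,
                   0
               ) AS delta
        FROM stats
        ORDER BY version, step
    """
    rows = con.execute(query, (total_users,)).fetchall()
    con.close()
    return rows


for row in step_deltas(STEP_COUNTS, 50):
    print(row)
```

COALESCE turns the NULL that LAG yields on the first row of each version into 0.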

Mysql single column result to multiple column result

I have a problem with a MySQL query, the problem is I have the following table:
id, rep, val dates
1 rep1 200 06/01/2014
2 rep2 300 06/01/2014
3 rep3 400 06/01/2014
4 rep4 500 06/01/2014
5 rep5 100 06/01/2014
6 rep1 200 02/06/2014
7 rep2 300 02/06/2014
8 rep3 900 02/06/2014
9 rep4 700 02/06/2014
10 rep5 600 02/06/2014
and I want a result like this:
rep 01/06/2014 02/06/2014
rep1 200 200
rep2 300 300
rep3 400 900
rep4 500 700
rep5 100 600
thank you very much!
You seem to want the most recent row for each rep. Here is an approach that often performs well:
select t.*
from table t
where not exists (select 1
from table t2
where t2.repid = t.repid and
t2.id > t.id
);
This transforms the problem to: "Get me the rows in table t where there is no other row with the same repid and a larger id." That is the same logic as getting the last one, just convoluted a bit to help the database know what to do.
For performance reasons, an index on t(repid, id) is helpful.
You seem to want the val for each of the dates.
Assuming the dates you are interested in are fixed, you can do that as follows. For each output date column, you check whether the row matches the date for that column. If so, you use the value of val; if not, you just use 0. Then you sum all the resulting values, grouping by rep. I have assumed a fixed date format.
SELECT rep, SUM(IF(dates='2014/06/01', val, 0)) AS '2014/06/01', SUM(IF(dates='2014/06/02', val, 0)) AS '2014/06/02'
FROM sometable
GROUP BY rep
Or if you just wanted the highest val for each day
SELECT rep, MAX(IF(dates='2014/06/01', val, 0)) AS '2014/06/01', MAX(IF(dates='2014/06/02', val, 0)) AS '2014/06/02'
FROM sometable
GROUP BY rep
If the number of dates is variable, then there is not really a direct way to do it (as the number of resulting columns would vary). It would be easiest to do this mainly in your calling script, based on the following query, which gives you one row per rep / possible date with the sum of the values of val for that rep / date combination:
SELECT rep, sub0.dates, SUM(IF(sometable.dates=sub0.dates, val, 0))
FROM sometable
CROSS JOIN
(
SELECT DISTINCT dates
FROM sometable
) sub0
GROUP BY rep, sub0.dates
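The fixed-date pivot described above can be sketched as follows. This is a hedged illustration rather than the answerer's exact query: it runs in Python's standard-library sqlite3 module, uses CASE WHEN in place of MySQL's IF(condition, a, b) because CASE is understood by both engines, and assumes the dates are stored as ISO strings ('2014-06-01', '2014-06-02'). The column aliases d_2014_06_01 and d_2014_06_02 and the function name are made up.

```python
import sqlite3

# Rows from the question, with dates normalised to ISO format (an assumption).
ROWS = [
    (1, "rep1", 200, "2014-06-01"), (2, "rep2", 300, "2014-06-01"),
    (3, "rep3", 400, "2014-06-01"), (4, "rep4", 500, "2014-06-01"),
    (5, "rep5", 100, "2014-06-01"), (6, "rep1", 200, "2014-06-02"),
    (7, "rep2", 300, "2014-06-02"), (8, "rep3", 900, "2014-06-02"),
    (9, "rep4", 700, "2014-06-02"), (10, "rep5", 600, "2014-06-02"),
]


def pivot_two_dates(rows):
    """Return one row per rep with val summed into one column per date."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE sometable (id INTEGER, rep TEXT, val INTEGER, dates TEXT)"
    )
    con.executemany("INSERT INTO sometable VALUES (?, ?, ?, ?)", rows)
    query = """
        SELECT rep,
               -- Each column only adds up val for rows on its own date.
               SUM(CASE WHEN dates = '2014-06-01' THEN val ELSE 0 END) AS d_2014_06_01,
               SUM(CASE WHEN dates = '2014-06-02' THEN val ELSE 0 END) AS d_2014_06_02
        FROM sometable
        GROUP BY rep
        ORDER BY rep
    """
    result = con.execute(query).fetchall()
    con.close()
    return result


for row in pivot_two_dates(ROWS):
    print(row)  # first row: ('rep1', 200, 200)
```

Each extra date needs another SUM(CASE ...) column, which is why a variable number of dates is easier to handle in the calling script.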

Checking consecutive values at a MySQL query

I have a MySQL table like this:
ID - Time - Value
And I'm getting every pair of ID, Time (grouped by ID) where Value is greater than a certain threshold. So basically, I'm getting every ID that has, at least once, a value greater than the threshold. The query looks like this:
SELECT ID, Time FROM mydb.MYTABLE
WHERE Value>%s AND Time>=%s AND Time<=%s
GROUP BY ID
EDIT: The Time checks allow to operate in a time range of my choice between all the data which is into the table; it has nothing else to do with what I am asking.
It works perfectly, but now I want to add some filtering: I want to discard the times when the value is greater than the threshold (let's call them alarms) unless an alarm also happened at the Time just before or just after. I mean: if the alarm occurs at a single, isolated instant of time instead of at two consecutive instants, I'll consider it a false alarm and exclude it from the query response.
Of course I can do this with a call for each Id to check for this, but I'd like to do this in a single query to make it faster. I guess I could use conditionals, but I don't have that expertise at MySQL.
Any help?
EDIT2: Example for Threshold = 10
ID - Time - Value
1 - 2004 - 9
1 - 2005 - 11
1 - 2006 - 8
2 - 2107 - 12
2 - 2109 - 13
3 - 3402 - 11
3 - 3403 - 12
In this example, only ID 3 should be a valid alarm, since 2 consecutive time values for this ID have their value > threshold. ID 1 has a single, isolated alarm, so it should be filtered out. For ID 2 there are 2 alarms, but they are not consecutive, so it should also be filtered out.
Something like this:
10 - is a threshold
0 - minimum of the time period
100000 - maximum of the time period
select ID, min(Time)
from
(
SELECT ID, Time,
(select max(time) from t
where Time<t1.Time
and Id=t1.Id
and Value>10) LAG_G,
(select max(time) from t
where Time<t1.Time
and Id=t1.Id
and Value<=10) LAG_L,
(select min(time) from t
where Time>t1.Time
and Id=t1.Id
and Value>10) LEAD_G,
(select min(time) from t
where Time>t1.Time
and Id=t1.Id
and Value<=10) LEAD_L
FROM t as t1
WHERE Value>10 AND Time>=0 AND Time<=100000
) t3
where ifnull(LAG_G,0)>ifnull(LAG_L,0)
OR
ifnull(LEAD_G,100000)<ifnull(LEAD_L,100000)
GROUP BY ID
This query looks at the nearest neighbouring records, whatever their Time values are.
If you need to search records by Time (+1, -1), as you mentioned in the comment, try this query:
select ID, min(Time) from t as t1
where Value>10
AND Time>=%s2 AND Time<=%s1
and
(
Exists(select 1 from t where Value>10
and Id=t1.Id
and Time=t1.Time-1)
OR
Exists(select 1 from t where Value>10
and Id=t1.Id
and Time=t1.Time+1)
)
group by ID
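To check the Time +1 / -1 variant against the example data from the question, here is a small runnable sketch. It wraps the same EXISTS logic in Python's standard-library sqlite3 module; the table name t is taken from the answer, while the function name, the named parameters (:thr, :lo, :hi) and the ROWS list are illustrative assumptions.

```python
import sqlite3

# Example data from the question: (ID, Time, Value).
ROWS = [
    (1, 2004, 9), (1, 2005, 11), (1, 2006, 8),
    (2, 2107, 12), (2, 2109, 13),
    (3, 3402, 11), (3, 3403, 12),
]


def consecutive_alarms(rows, threshold, t_min, t_max):
    """Return (ID, first Time) for IDs with alarms at two adjacent Time values."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (ID INTEGER, Time INTEGER, Value INTEGER)")
    con.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
    query = """
        SELECT ID, MIN(Time)
        FROM t AS t1
        WHERE Value > :thr AND Time >= :lo AND Time <= :hi
          AND (
              -- Keep the alarm only if the same ID also alarmed at Time - 1 ...
              EXISTS (SELECT 1 FROM t
                      WHERE Value > :thr AND ID = t1.ID AND Time = t1.Time - 1)
              -- ... or at Time + 1.
              OR EXISTS (SELECT 1 FROM t
                         WHERE Value > :thr AND ID = t1.ID AND Time = t1.Time + 1)
          )
        GROUP BY ID
        ORDER BY ID
    """
    params = {"thr": threshold, "lo": t_min, "hi": t_max}
    result = con.execute(query, params).fetchall()
    con.close()
    return result


print(consecutive_alarms(ROWS, 10, 0, 100000))  # [(3, 3402)]
```

ID 1 is dropped because its only alarm is isolated, and ID 2 is dropped because Time 2107 and 2109 are not adjacent, which matches the behaviour asked for in the question.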
Do you mean an alarm like this?
SELECT ID, Time, sum(if(Value > %threshold, 1, 0)) alert_active
FROM mydb.MYTABLE
WHERE Value>%s3 AND Time>=%s2 AND Time<=%s1
GROUP BY ID;
I don't understand this part exactly:
In this example, only ID 3 should be a valid alarm, since 2
consecutive time values for this ID have their value > threshold. ID 1
has a single, isolated alarm, so it should be filtered out. For ID 2 there
are 2 alarms, but not consecutive, so it should also be filtered out.
I guess that you want to filter the alerts:
SELECT ID, Time
FROM mydb.MYTABLE
WHERE Value>%s3 AND Time>=%s2 AND Time<=%s1
GROUP BY ID
having Value < %threshold;

Counting rows in mysql database

I want to count from the row with the least value to the row with a specific value.
For example,
Name / Point
--------------------
Pikachu / 7
Voltorb / 1
Abra / 4
Sunflora / 3
Squirtle / 8
Snorlax / 12
I want to count to the 7, so I get the returned result of '4' (counting the rows with values 1, 3, 4, 7)
I know I should use count() or mysql_num_rows() but I can't think of the specifics.
Thanks.
I think you want this :
select count(*) from mytable where Point<=7;
Count(*) counts all rows in a set.
If you're working with MySQL, then you could ORDER BY Point (the ordering does not change the count, and the comparison needs to be <= so that the row with value 7 is included):
SELECT count(*) FROM table WHERE Point <= 7 ORDER BY Point ASC
If you want to know all about ORDER BY, check out the w3schools page: http://www.w3schools.com/sql/sql_orderby.asp
Just in case you want to only count the rows based on the Point values (this returns one count per distinct Point value):
SELECT count(*) FROM table WHERE Point <= 7 GROUP BY Point
This may help you to get rows falling between range of values :
select count(*) from table where Point >= least_value and Point<= max_value
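For a quick check against the example data from the question, the counting queries above can be sketched like this. It uses Python's standard-library sqlite3 module so it runs without a MySQL server; the table name mytable comes from the first answer, and the function name and ROWS list are assumptions for illustration.

```python
import sqlite3

# Example data from the question: (Name, Point).
ROWS = [
    ("Pikachu", 7), ("Voltorb", 1), ("Abra", 4),
    ("Sunflora", 3), ("Squirtle", 8), ("Snorlax", 12),
]


def count_up_to(rows, limit):
    """Count the rows whose Point is less than or equal to limit."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE mytable (Name TEXT, Point INTEGER)")
    con.executemany("INSERT INTO mytable VALUES (?, ?)", rows)
    # COUNT(*) already covers every matching row, so no ORDER BY is needed.
    (count,) = con.execute(
        "SELECT COUNT(*) FROM mytable WHERE Point <= ?", (limit,)
    ).fetchone()
    con.close()
    return count


print(count_up_to(ROWS, 7))  # 4
```

The four rows counted are those with Point values 1, 3, 4 and 7. A strict Point < 7 comparison would return 3 instead.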

Break Numbers List Into Min and Max Ranges

Brain is not working today and my google skills are failing me.
I have a column of numbers ranging from 1 - 1000. I want to dump the min and max values for 100 (or whatever I choose) record ranges into a temp table. The plan is to use this temp table to process ranges of records (in this example 100 at a time) in a larger table.
Swear I have done this before with a CTE but then I had something to group on. Here I just want to break up a single list of numbers into ranges of X.
The output from the temp table should look like:
Min Max
0 99
100 199
200 299
300 399
etc.
Thanks!
You can use this trick from Stuart Ainsworth:
http://codegumbo.com/index.php/2009/01/25/building-ranges-using-a-dynamically-generated-numbers-table/
Numbers tables are awesome, but he uses a dynamically generated numbers table, which is even awesome...r.
If you know all numbers are present in the source table, you can use a recursive CTE to generate the number ranges:
; with numbers as
(
select 0 as a
, 99 as b
union all
select a+100
, b+100
from numbers
where a < 900
)
select *
from numbers
If the source table is sparsely populated, you can limit it to numbers that are actually present like:
... insert CTE from above here ...
select min(ot.NumberColumn)
, max(ot.NumberColumn)
from numbers
left join
OtherTable ot
on ot.NumberColumn between numbers.a and numbers.b
group by
numbers.a
I have been having a play with a CTE after you posted this and came up with the following. I would be interested to hear if it works for you at all.
DECLARE @segment int = 100
;
WITH _CTE
(rowNum, value)
AS
(
    SELECT ROW_NUMBER() OVER(ORDER BY col01) -1, col01
    FROM dbo.testTable
)
SELECT rowNum/@segment AS Bucket, MIN(Value) AS MinVal, MAX(Value) AS MaxVal
FROM _CTE
group by rowNum/@segment
ORDER BY Bucket
;
col01 in this case is the column that you want the min/max range values from, and testTable is the table that contains it.
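The ROW_NUMBER bucketing idea can be tried out with the sketch below. It is an illustration under assumptions: it runs in Python's standard-library sqlite3 module (SQLite 3.25+ for window functions), the table is called testTable without the dbo schema, and the function name bucket_ranges is made up. It relies on SQLite performing integer division when both operands are integers; in MySQL the DIV operator would be needed instead of /.

```python
import sqlite3


def bucket_ranges(values, segment):
    """Return (bucket, min, max) for consecutive groups of `segment` rows."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE testTable (col01 INTEGER)")
    con.executemany("INSERT INTO testTable VALUES (?)", ((v,) for v in values))
    query = """
        WITH numbered AS (
            -- Zero-based position of every value in sorted order.
            SELECT ROW_NUMBER() OVER (ORDER BY col01) - 1 AS rowNum,
                   col01 AS value
            FROM testTable
        )
        -- Integer division maps positions 0..segment-1 to bucket 0, and so on.
        SELECT rowNum / ? AS Bucket, MIN(value) AS MinVal, MAX(value) AS MaxVal
        FROM numbered
        GROUP BY Bucket
        ORDER BY Bucket
    """
    result = con.execute(query, (segment,)).fetchall()
    con.close()
    return result


ranges = bucket_ranges(range(1, 1001), 100)
print(ranges[0], ranges[-1])  # (0, 1, 100) (9, 901, 1000)
print(bucket_ranges([5, 17, 42, 300, 301, 999], 2))
```

Because the buckets are built from row positions rather than from the values themselves, every bucket holds the same number of rows even when the source numbers are sparse.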