Selecting multiple rows, where a difference in value is greater than x% - mysql

I'm facing the following problem...
Given this data:
table : votes
=========
value
=========
10
25
38
90
92
93
98
100
120
I would like to return a value only if the difference between it and the previously accepted value is bigger than 10% of the previously accepted one:
if abs(int(a)-int(b))*100/int(a) < 10:
    return True
So the end list should be (I have added the % difference in parentheses):
==========
result
==========
10 ()
25 (150%)
38 (52%)
90 (136%)
100 (11%)
120 (20%)
The query should also sort those values first.
I'm able to do it with code (as shown above), but I haven't come even close to doing it in a direct query.
MySQL v.8.0.19

You don't mention what version of MySQL you are using, so I'll assume it's a modern one (8.x). You can use LAG(). For example:
select
  concat('', value,
    case when prev_value is null then ''
         else concat('', 100 * (value - prev_value) / prev_value, '%')
    end
  ) as result
from (
  select
    value,
    lag(value) over (order by value) as prev_value
  from t
) x
where prev_value is null or value > prev_value * 1.1
order by value
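For comparison with the code in the question, here is a minimal Python sketch of what a LAG()-based filter computes: each value is tested against the row immediately before it in sorted order, whether or not that row was kept. The function name filter_against_previous_row is made up for illustration, and integer arithmetic stands in for the `* 1.1` comparison.

```python
def filter_against_previous_row(values):
    """Keep a value if it is the first one or exceeds 110% of the row just before it."""
    kept = []
    prev = None  # plays the role of lag(value) over (order by value)
    for v in sorted(values):
        # 10 * v > 11 * prev is the same test as v > prev * 1.1, without floats
        if prev is None or 10 * v > 11 * prev:
            kept.append(v)
        prev = v  # always the adjacent row, not the last kept one
    return kept


votes = [10, 25, 38, 90, 92, 93, 98, 100, 120]
print(filter_against_previous_row(votes))
```

On the question's data this keeps 10, 25, 38, 90 and 120 but not 100, because 100 is only about 2% above 98, the row right before it.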

In MySQL 8.0, you can do this with lag(). Assuming that you want to sort rows by value, that would be:
select value
from (
  select
    value,
    lag(value, 1, 0) over(order by value) lag_value
  from mytable t
) t
where value > lag_value * 1.10
If you want to use a different ordering column, then you can change the order by clause to use the relevant column.
In earlier versions, one option is a correlated subquery:
select value
from mytable t
where value > 1.10 * coalesce(
  (
    select t1.value
    from mytable t1
    where t1.value < t.value
    order by t1.value desc
    limit 1
  ),
  0
)
To use another ordering column here, you need to change the where clause and the order by clause of the subquery.
On the other hand, if you want to select the next row according to the ratio against the previously selected row, then that's a different question. You need some kind of iterative process: in SQL, one approach is a recursive query:
with recursive
  data as (
    select value, row_number() over(order by value) rn
    from mytable t
  ),
  cte as (
    select 1 is_valid, value, rn from data where rn = 1
    union all
    select
      (d.value > 1.1 * c.value),
      case when d.value > 1.1 * c.value then d.value else c.value end,
      d.rn
    from cte c
    inner join data d on d.rn = c.rn + 1
  )
select value from cte where is_valid order by value
The query enumerates the values, then walks the dataset sequentially while keeping track of the last selected value, and setting flags on records that should appear in the final resultset.
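The same walk can be sketched in a few lines of Python, which may help when checking the query against the expected output. This is only an illustration: the function name filter_against_last_kept is hypothetical, the values are assumed to be positive integers, and the percentage is truncated to a whole number as in the question's listing.

```python
def filter_against_last_kept(values):
    """Return (value, percent) pairs; percent is the truncated increase over the last kept value."""
    kept = []
    last = None  # the last value that was accepted
    for v in sorted(values):
        if last is None:
            kept.append((v, None))  # first row is always accepted
            last = v
        elif 10 * v > 11 * last:  # same test as v > 1.1 * last, without floats
            kept.append((v, (v - last) * 100 // last))
            last = v
        # otherwise the row is skipped and `last` stays unchanged
    return kept


votes = [10, 25, 38, 90, 92, 93, 98, 100, 120]
for value, percent in filter_against_last_kept(votes):
    print(value, "" if percent is None else f"({percent}%)")
```

For the question's data this yields 10, 25 (150%), 38 (52%), 90 (136%), 100 (11%) and 120 (20%), matching the requested result.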

Related

SQL Query about percentage selection

I am trying to write a query for a condition:
If >=80 percent (4 or more rows as 4/5*100=80%) of the top 5 recent rows(by Date Column), for a KEY have Value =A or =B, then change the flag from fail to pass for the entire KEY.
Here is the input and output sample:
I have highlighted recent rows with green colour in the sample.
Can someone help me with this?
I got as far as finding the top 5 recent rows with the following code:
select * from(
select *, row_number() over (partition by "KEY") as 'RN' FROM (
select * from tb1
order by date desc))
where "RN"<=5
I couldn't figure out what to do after this.
Test this:
WITH
-- enumerate rows per key group
cte1 AS ( SELECT *,
                 ROW_NUMBER() OVER (PARTITION BY `key` ORDER BY `date` DESC) rn
          FROM sourcetable ),
-- take 5 recent rows only, check there are at least 4 rows with A/B
cte2 AS ( SELECT `key`
          FROM cte1
          WHERE rn <= 5
          GROUP BY `key`
          HAVING ( SUM(`value` = 'A') >= 4
                OR SUM(`value` = 'B') >= 4 )
--           AND SUM(rn = 5)
        )
-- update rows with found key values
UPDATE sourcetable
JOIN cte2 USING (`key`)
SET flag = 'PASS';
5.7 version – Ayn76
Convert CTEs to subqueries. Emulate ROW_NUMBER() using user-defined variable.
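To make the rule easier to test outside the database, here is a rough Python sketch of the same selection. Everything in it is illustrative: the function name keys_to_pass, the sample rows, and the ISO date strings are assumptions, and it counts A and B together toward the four required rows, which is one reading of the question (the SQL above requires four A rows or four B rows).

```python
from collections import defaultdict


def keys_to_pass(rows, top_n=5, needed=4, good=("A", "B")):
    """rows: iterable of (key, date, value); dates must sort chronologically."""
    by_key = defaultdict(list)
    for key, date, value in rows:
        by_key[key].append((date, value))

    passing = set()
    for key, items in by_key.items():
        # most recent rows first, then keep only the top_n of them
        recent = sorted(items, key=lambda item: item[0], reverse=True)[:top_n]
        hits = sum(1 for _date, value in recent if value in good)
        if hits >= needed:
            passing.add(key)
    return passing


# Hypothetical sample data, not taken from the question
rows = [
    ("k1", "2021-01-01", "C"),
    ("k1", "2021-01-02", "A"),
    ("k1", "2021-01-03", "B"),
    ("k1", "2021-01-04", "A"),
    ("k1", "2021-01-05", "C"),
    ("k1", "2021-01-06", "A"),
    ("k2", "2021-01-01", "A"),
    ("k2", "2021-01-02", "C"),
    ("k2", "2021-01-03", "C"),
    ("k2", "2021-01-04", "B"),
    ("k2", "2021-01-05", "A"),
]
print(keys_to_pass(rows))
```

With this sample only k1 qualifies: four of its five most recent rows are A or B, while k2 has three.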

How to eliminate only continuous duplicates but not all duplicates in a select query (MySQL)?

I have a table like this:
01-Jul-17 100
02-Jul-17 100
03-Jul-17 300
04-Jul-17 300
05-Jul-17 500
06-Jul-17 500
07-Jul-17 300
08-Jul-17 400
09-Jul-17 100
10-Jul-17 100
What I want to output is (in this order) by eliminating the continuous duplicates but not all duplicates:
100
300
500
300
400
100
I cannot select Distinct, as it will eliminate the second instances of 300, 100. Is there a way to achieve this result in MySQL?
Thanks!
You want to get the previous value. If the dates really have no gaps or duplicates, just do:
select t.*
from t left join
t tprev
on t.col1 = date_add(tprev.col1, interval 1 day)
where tprev.col2 is null or tprev.col2 <> t.col2;
EDIT:
If the dates don't meet these conditions, then you can use variables:
select t.*
from (select t.*,
             (@rn := if(@v = col2, @rn + 1,
                        if(@v := col2, 1, 1)
                       )
             ) as rn
      from t cross join
           (select @v := 0, @rn := 0) params
      order by t.col1
     ) t
where rn = 1;
Note that MySQL does not guarantee the order of evaluation of expressions in the SELECT. So variables should not be assigned in one expression and then used in another -- they should be assigned in a single expression.
One way to handle this problem is by using session variables to track the changes of the values as ordered by your date column. In the query below, we keep track of the value, ordered by date, and assign a row number to each group of identical value. Then, only the first value in each group is retained. Note that this approach is robust to any number of duplicates. It is also robust with respect to there being gaps in your dates, so long as each record can be ordered by date.
SET @rn = 1;
SET @val = NULL;
SELECT t.val
FROM
(
    SELECT
        @rn:=CASE WHEN @val = val THEN @rn+1 ELSE 1 END rn,
        @val:=val AS val,
        dt
    FROM yourTable
    ORDER BY dt
) t
WHERE t.rn = 1
ORDER BY t.dt;
You can make use of the lag and lead functions. Note that the derived table needs an alias in MySQL:
select y from (select y, lag(y,1,0) over (order by x) as prev_y from t1) t where y <> prev_y;
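As a reference for the expected output, the "keep the first row of each run" idea can be sketched in Python with itertools.groupby, which groups only adjacent equal items. The function name drop_consecutive_duplicates is made up for illustration, and the values are assumed to be already in date order.

```python
from itertools import groupby


def drop_consecutive_duplicates(values):
    """Collapse each run of equal adjacent values into a single value."""
    # groupby starts a new group whenever the value changes, so a value
    # that reappears later (e.g. the second 300) forms its own group
    return [key for key, _run in groupby(values)]


# Values from the question, in date order (01-Jul-17 to 10-Jul-17)
amounts = [100, 100, 300, 300, 500, 500, 300, 400, 100, 100]
print(drop_consecutive_duplicates(amounts))
```

For the question's data this gives 100, 300, 500, 300, 400, 100.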

SQL OR clause priority

I have a table with user awards, which can be of various different types.
For example, here are the records for the qualification ID 94:
So as you can see, there are 2 users, one has records for the award type of "average", "min", "max" and "final", the other has the same but no "final" award.
What I want is to get only 1 row per user. If they have an award of type "final" I want that, otherwise I want the "average" one, I don't want "min" or "max" at all.
So as an example, here is the query with just a simple IN clause:
So based on that, what I want the result to be is for the user 34562 I want the row with the "final" award, and for the user 6256 I want the row with their "average" award, since they don't have a "final" record.
I'm sure this should be fairly simple, but I'm failing miserably this morning.
I think I should be able to select the final record, then do a UNION ALL, but I can't seem to work it out in my head. Can anyone point me in the right direction?
I should point out that whilst this is MySQL for me, it needs to be compatible with other database platforms.
Thanks.
An easy way would be to check in the where clause whether a final entry exists, and take the average row only when it does not:
SELECT * FROM Table t
WHERE qualid = 94
  and (type = 'average' AND
       not exists(SELECT * FROM Table t2
                  WHERE t.qualid=t2.qualid AND t.userid=t2.userid AND t2.type = 'final')
       OR type = 'final')
You can accomplish this using MySQL user defined variables which will be more scalable.
SELECT
    t.*
FROM
(
    SELECT
        *,
        IF(@sameUser = userid, @rn := @rn + 1,
            IF(@sameUser := userid, @rn := 1, @rn := 1)
        ) AS row_number
    FROM moodle.mdl_bcgt_user_qual_awards
    CROSS JOIN (SELECT @sameUser := -1, @rn := 1) AS var
    WHERE qualid = 94
    AND type IN ('final','average')
    ORDER BY userid,
        CASE WHEN type = 'final' THEN 0 ELSE 1 END
) AS t
WHERE t.row_number <= 1
ORDER BY t.userid
EDIT:
using NOT EXISTS & UNION ALL
SELECT
*
FROM your_table
WHERE qualid = 94
AND type = 'final'
UNION ALL
SELECT
*
FROM your_table A
WHERE qualid = 94
AND type = 'average'
AND NOT EXISTS (
SELECT 1 FROM your_table B WHERE A.qualid = B.qualid AND B.userid = A.userid AND B.type = 'final'
)
You can try this for SQL Server:
Select * from (
    SELECT *, ROW_NUMBER() over(PARTITION BY userid ORDER BY type DESC) rowNo
    FROM mdl_bcgt_user_qual_awards
    WHERE qualid = 94
    AND type in ('final','average')
) as t
Where rowNo=1
Here the ORDER BY on type, compared as text, happens to meet the requirement, because 'final' sorts after 'average' and so comes first in descending order. For any other set of values, add a CASE expression and sort on that instead.
Here is a pure MySQL solution, which should also be generally ANSI compliant across most RDBMS:
SELECT t1.*
FROM yourTable t1
INNER JOIN
(
    SELECT qualid, userid,
           CASE WHEN MAX(CASE WHEN type = 'final'   THEN 100
                              WHEN type = 'average' THEN 10
                              ELSE 1
                         END) = 100 THEN 'final'
                WHEN MAX(CASE WHEN type = 'final'   THEN 100
                              WHEN type = 'average' THEN 10
                              ELSE 1
                         END) = 10 THEN 'average'
                ELSE NULL
           END AS type
    FROM yourTable
    GROUP BY qualid, userid
) t2
    ON t1.qualid = t2.qualid AND
       t1.userid = t2.userid AND
       t1.type   = t2.type
This query employs a trick, which is to take the maximum of a per-row score within each (qualid, userid) group, based on whether final, average, or neither is present. I assign a value of 100 for final, 10 for average, and 1 for neither. This allows us to assign a type to each group. Grouping and joining on userid as well as qualid is what yields one row per user.
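The preference rule the question asks for (final if present, otherwise average, never min or max) can also be written as a short Python sketch, which is handy for checking any of the queries above against sample rows. The function name pick_award_per_user and the dictionary layout are assumptions for illustration; only the user ids and award types come from the question.

```python
def pick_award_per_user(rows, priority=("final", "average")):
    """rows: iterable of dicts with 'userid' and 'type'. Returns {userid: chosen row}."""
    rank = {award_type: i for i, award_type in enumerate(priority)}
    best = {}
    for row in rows:
        if row["type"] not in rank:
            continue  # 'min' and 'max' are never wanted
        current = best.get(row["userid"])
        # a lower rank means a more preferred award type
        if current is None or rank[row["type"]] < rank[current["type"]]:
            best[row["userid"]] = row
    return best


rows = [
    {"userid": 34562, "type": "average"},
    {"userid": 34562, "type": "min"},
    {"userid": 34562, "type": "max"},
    {"userid": 34562, "type": "final"},
    {"userid": 6256, "type": "average"},
    {"userid": 6256, "type": "min"},
    {"userid": 6256, "type": "max"},
]
for userid, row in pick_award_per_user(rows).items():
    print(userid, row["type"])
```

Here user 34562 ends up with the "final" row and user 6256 with the "average" row.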

Simplify CASE expression used multiple times

For readability, I would like to modify the below statement. Is there a way to extract the CASE statement, so I can use it multiple times without having to write it out every time?
select
mturk_worker.notes,
worker_id,
count(worker_id) answers,
count(episode_has_accepted_imdb_url) scored,
sum( case when isnull(imdb_url) and isnull(accepted_imdb_url) then 1
when imdb_url = accepted_imdb_url then 1
else 0 end ) correct,
100 * ( sum( case when isnull(imdb_url) and isnull(accepted_imdb_url) then 1
when imdb_url = accepted_imdb_url then 1
else 0 end)
/ count(episode_has_accepted_imdb_url) ) percentage
from
mturk_completion
inner join mturk_worker using (worker_id)
where
timestamp > '2015-02-01'
group by
worker_id
order by
percentage desc,
correct desc
You can actually eliminate the case statements. MySQL will interpret boolean expressions as integers in a numeric context (with 1 being true and 0 being false):
select mturk_worker.notes, worker_id, count(worker_id) answers,
count(episode_has_accepted_imdb_url) scored,
sum(imdb_url = accepted_imdb_url or imdb_url is null and accepted_imdb_url is null) as correct,
(100 * sum(imdb_url = accepted_imdb_url or imdb_url is null and accepted_imdb_url is null) / count(episode_has_accepted_imdb_url)
) as percentage
from mturk_completion inner join
mturk_worker
using (worker_id)
where timestamp > '2015-02-01'
group by worker_id
order by percentage desc, correct desc;
If you like, you can simplify it further by using the null-safe equals operator:
select mturk_worker.notes, worker_id, count(worker_id) answers,
count(episode_has_accepted_imdb_url) scored,
sum(imdb_url <=> accepted_imdb_url) as correct,
(100 * sum(imdb_url <=> accepted_imdb_url) / count(episode_has_accepted_imdb_url)
) as percentage
from mturk_completion inner join
mturk_worker
using (worker_id)
where timestamp > '2015-02-01'
group by worker_id
order by percentage desc, correct desc;
This isn't standard SQL, but it is perfectly fine in MySQL.
Otherwise, you would need to use a subquery, and there is additional overhead in MySQL associated with subqueries.
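To illustrate why the null-safe operator lets the `is null` branch be dropped, here is a small Python model of the two comparisons, with None standing in for NULL. The helper names and the sample pairs are invented for this sketch; it is not how MySQL implements the operators.

```python
def sql_equals(a, b):
    """Model of SQL '=': the result is unknown (None) when either side is NULL."""
    if a is None or b is None:
        return None
    return a == b


def null_safe_equals(a, b):
    """Model of MySQL '<=>': NULL <=> NULL is true, NULL <=> value is false."""
    return a == b  # Python's None == None is already True


# (imdb_url, accepted_imdb_url) pairs, made up for the example
pairs = [("u1", "u1"), ("u1", "u2"), (None, None), (None, "u3")]

# SUM() skips NULLs, so unknown comparisons contribute nothing
correct_plain = sum(1 for a, b in pairs if sql_equals(a, b))
correct_null_safe = sum(1 for a, b in pairs if null_safe_equals(a, b))
print(correct_plain, correct_null_safe)
```

The plain comparison counts 1 correct answer and the null-safe one counts 2, because only the latter treats the pair of NULLs as a match.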

mysql sorting cells with letters and words

I have cells that might contain the following data (number of errors):
1
2
3
PASS
NoFileFound
NoLog
99
10
2
I would like to sort in ascending and descending order, where PASS is treated as a value of 0 and any other text-based value is treated as a value of 1 error. As of now, these cells are stored as 'text' in the MySQL database. How can this be done in MySQL? What changes do I need to make?
Try this one:
SELECT Number FROM (
SELECT IF(valueField='PASS',0,1) as Number FROM TableMix
WHERE concat('',valueField * 1) <> valueField ) A
UNION ALL
SELECT Number FROM (
SELECT CAST(valueField as UNSIGNED) as Number FROM TableMix
WHERE concat('',valueField * 1) = valueField ) B
ORDER BY Number
And this one with original values included:
SELECT Number, valueField FROM (
SELECT IF(valueField='PASS',0,1) as Number, valueField FROM TableMix
WHERE concat('',valueField * 1) <> valueField ) A
UNION ALL
SELECT Number, valueField FROM (
SELECT CAST(valueField as UNSIGNED) as Number, valueField FROM TableMix
WHERE concat('',valueField * 1) = valueField ) B
ORDER BY Number
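The mapping itself (PASS counts as 0, other text counts as 1, numbers count as themselves) can be sketched in Python as a sort key, which makes the intended ordering easy to verify. The function name error_count is hypothetical, and the sketch assumes numeric cells are non-negative whole numbers.

```python
def error_count(cell):
    """Map a text cell to the number of errors it represents."""
    if cell == "PASS":
        return 0
    if cell.isdigit():
        return int(cell)  # numeric cells count as their own value
    return 1  # any other text (NoFileFound, NoLog, ...) counts as one error


cells = ["1", "2", "3", "PASS", "NoFileFound", "NoLog", "99", "10", "2"]
ascending = sorted(cells, key=error_count)
descending = sorted(cells, key=error_count, reverse=True)
print(ascending)
print(descending)
```

Sorting this way puts PASS first in ascending order, followed by the cells worth one error, and puts 99 first in descending order.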