MySql Query to retrieve a set number from each group - mysql

If I have the table:
------------------
| Provider | ID |
------------------
| X | 125 |
------------------
| X | 133 |
------------------
| X | 342 |
------------------
| X | 327 |
------------------
| Y | 123 |
------------------
| Y | 853 |
------------------
| Y | 123 |
------------------
| Z | 853 |
------------------
| Z | 533 |
------------------
| Z | 174 |
------------------
I want to get 2 random entries from each of the providers X and Y (ignoring Z) to produce
X id
X id
Y id
Y id
I've tried several queries including
select id, provider from tableName a where (SELECT COUNT(*) FROM tableName b where b.provider = a.provider) = 2;
Any ideas?

Order by rand() and limit to 2, with a where clause for X and for Y, then union all the results. It is plainly obvious what you're trying to do, and it is easy to maintain. Note that MySQL requires each SELECT to be wrapped in parentheses when it carries its own ORDER BY or LIMIT inside a UNION.
(SELECT Provider, ID
FROM tableName
WHERE provider = 'X'
ORDER BY Rand()
LIMIT 2)
UNION ALL
(SELECT Provider, ID
FROM tableName
WHERE provider = 'Y'
ORDER BY Rand()
LIMIT 2)

How about you try this query:
select provider, ID, Rank from
(
select *, @row_num := IF(@prev_value=provider,@row_num+1,1) AS rank, @prev_value := provider from
(
select provider, ID, rand() as SortingField from yourTable
order by provider, SortingField
) t1,
(SELECT @row_num := 0) x, (SELECT @prev_value := '') y
) src
where Rank <= 2 and Provider <> 'Z'
Okay, so here's how it works. First, you need random entries, so you need a "random" sorting field, which I added using MySQL's rand() function. Since it gives a new random number for every row, the rows within each provider come out in a different order every time the query runs.
This sets us up for the next part of your requirement, which is the two random entries. Since the rows are sorted randomly within each provider, we just pick the first two of each group. To do that, we need a row counter that resets every time the grouping field changes value (in your case, 'provider'). That is what the variables are for:
@row_num is your row counter. It is incremented by 1 for every row with the same provider as the previous row, and reset to 1 otherwise.
@prev_value is your value checker, used to determine when the counter needs to be reset. Notice that it's set after the @row_num expression, which is critical: if you set it before, you'd be comparing against the current row's value, which would be pointless.
Finally, all you have to do is select the desired fields (provider, ID, Rank), keep the rows whose rank is less than or equal to 2, and ignore the 'Z' provider, which is what the last line of the query does.
If you have questions, don't hesitate to ask, but I hope my explanation was clear enough.
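To make the counter-reset idea above easier to follow, here is a minimal Python sketch of the same three steps (random sort key, per-group counter, filter on rank). It is only an illustration of the logic, not a replacement for the query; the function name pick_per_group and its parameters are made up for this example, and the rows are the ones from the question.

```python
import random

# Sample rows taken from the question's table: (provider, id).
ROWS = [
    ("X", 125), ("X", 133), ("X", 342), ("X", 327),
    ("Y", 123), ("Y", 853), ("Y", 123),
    ("Z", 853), ("Z", 533), ("Z", 174),
]


def pick_per_group(rows, per_group=2, excluded=("Z",), rng=random):
    """Return up to `per_group` randomly chosen rows for each provider.

    Hypothetical helper that mirrors the variable-based query.
    """
    # Step 1: attach a random sort key and order by (provider, key),
    # like "ORDER BY provider, SortingField" in the inner query.
    keyed = sorted(
        ((provider, rng.random(), row_id) for provider, row_id in rows),
        key=lambda r: (r[0], r[1]),
    )

    picked = []
    prev_value = None  # plays the role of @prev_value
    row_num = 0        # plays the role of @row_num
    for provider, _key, row_id in keyed:
        # Step 2: increment while the provider repeats, otherwise reset to 1.
        row_num = row_num + 1 if provider == prev_value else 1
        prev_value = provider  # updated only after the comparison
        # Step 3: keep ranks <= per_group and skip excluded providers.
        if row_num <= per_group and provider not in excluded:
            picked.append((provider, row_id))
    return picked


result = pick_per_group(ROWS)
print(result)  # two X rows and two Y rows, different on each run
```

Updating prev_value before the comparison would make every row look like a repeat of itself, which is the ordering issue the explanation warns about.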

Related

Calculate unique items seen by users via sql

I need help to resolve the next case.
The data which users want to see is accessible by pagination requests and later these requests are stored in the database in the next form:
+----+---------+-------+--------+
| id | user id | first | amount |
+----+---------+-------+--------+
| 1 | 1 | 0 | 5 |
| 2 | 1 | 10 | 10 |
| 3 | 1 | 10 | 5 |
| 4 | 1 | 15 | 10 |
| 5 | 2 | 0 | 10 |
| 6 | 2 | 0 | 5 |
| 7 | 2 | 10 | 5 |
+----+---------+-------+--------+
The table is ordered by user id asc, first asc, amount desc.
The task is to write an SQL statement which calculates the total number of unique items each user has seen.
For the first user total amount must be 20, since the request with id=1 returned first 5 items, with id=2 returned another 10 items. Request with id=3 returns data already 'seen' by request with id=2. Request with id=4 intersects with id=2, but still returns 5 'unseen' pieces of data.
For the second user total amount must be 15.
As a result of SQL statement, I should get the next output:
+---------+-------+
| user id | total |
+---------+-------+
| 1 | 20 |
+---------+-------+
| 2 | 15 |
+---------+-------+
I am using MySQL 5.7, so window functions are not available to me. I have been stuck on this task for a day already and still cannot get the desired output. If it is not possible with this setup, I will end up calculating the results in the application code. I would appreciate any suggestions or help with resolving this task, thank you!
This is a type of gaps and islands problem. In this case, use a cumulative max to determine if one request intersects with a previous request. If not, that is the beginning of an "island" of adjacent requests. A cumulative sum of the beginnings assigns an "island", then an aggregation counts each island.
So, the islands look like this:
select userid, min(first), max(first + amount) as last
from (select t.*,
sum(case when prev_last >= first then 0 else 1 end) over
(partition by userid order by first) as grp
from (select t.*,
max(first + amount) over (partition by userid order by first range between unbounded preceding and 1 preceding) as prev_last
from t
) t
) t
group by userid, grp;
You then want this summed by userid, so that is one more level of aggregation:
with islands as (
select userid, min(first) as first, max(first + amount) as last
from (select t.*,
sum(case when prev_last >= first then 0 else 1 end) over
(partition by userid order by first) as grp
from (select t.*,
max(first + amount) over (partition by userid order by first range between unbounded preceding and 1 preceding) as prev_last
from t
) t
) t
group by userid, grp
)
select userid, sum(last - first) as total
from islands
group by userid;
Here is a db<>fiddle.
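If the calculation ends up in application code after all, the island logic described above can be sketched in a few lines of Python. This is a rough illustration under the assumption that each request covers the half-open range from first to first + amount; the function name unique_seen is invented for the example, and the rows are the ones from the question.

```python
from collections import defaultdict

# Requests from the question: (id, user_id, first, amount).
REQUESTS = [
    (1, 1, 0, 5), (2, 1, 10, 10), (3, 1, 10, 5), (4, 1, 15, 10),
    (5, 2, 0, 10), (6, 2, 0, 5), (7, 2, 10, 5),
]


def unique_seen(requests):
    """Return {user_id: number of distinct items seen} (hypothetical helper)."""
    by_user = defaultdict(list)
    for _id, user_id, first, amount in requests:
        by_user[user_id].append((first, first + amount))

    totals = {}
    for user_id, ranges in by_user.items():
        total = 0
        island_start = island_end = None
        for first, last in sorted(ranges):
            if island_end is None or first > island_end:
                # The request does not touch anything seen so far:
                # close the previous island and start a new one.
                if island_end is not None:
                    total += island_end - island_start
                island_start, island_end = first, last
            else:
                # The request overlaps or is adjacent: extend the island.
                island_end = max(island_end, last)
        if island_end is not None:
            total += island_end - island_start
        totals[user_id] = total
    return totals


print(unique_seen(REQUESTS))  # {1: 20, 2: 15}
```

The running island_end plays the role of the cumulative max (prev_last) in the query, and each closed island contributes last - first to the user's total.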
This logic is similar to Gordon's, but runs on older releases of MySQL, too.
select userid
-- overall length minus gaps
,max(maxlast)-min(minfirst) + sum(gaplen) as total
from
(
select userid
,prevlast
,min(first) as minfirst -- first of group
,max(last) as maxlast -- last of group
-- if there was a gap, calculate length of gap
,min(case when prevlast < first then prevlast - first else 0 end) as gaplen
from
(
select t.*
,first + amount as last -- last value in range
,( -- maximum end of all previous rows
select max(first + amount)
from t as t2
where t2.userid = t.userid
and t2.first < t.first
) as prevlast
from t
) as dt
group by userid, prevlast
) as dt
group by userid
order by userid
See fiddle

SQL Query sort and update by row number (SQLFiddle example)

I have a sports database where I want to sort the data by a custom field ('Rating') and update the field ('Ranking') with the row number.
I have tried the following code to sort the data by my custom field 'Rating'. It works when I sort it by a normal field, but not with a custom/calculated field. When the sorting has been done, I want it to update the field 'Ranking' with the row number.
I.e. the fighter with the highest 'Rating' should have the value '1' as 'Ranking'.
SELECT id,lastname, wins, Round(((avg(indrating)*13) + (avg(Fightrating)*5) * 20) / 2,2) as Rating,
ROW_NUMBER() OVER (ORDER BY 'Rating' DESC) AS num
from fighters
JOIN fights ON fights.fighter1 = fighters.id
GROUP BY id
The code above isn't sorting the Rating accurately. It sorts by row number, but the highest Rating isn't rated as #1. It seems a bit random.
SQL Fiddle: http://sqlfiddle.com/#!9/aa1fca/1 (This example is correctly sorted, but I want it to update the "Ranking" column by row number, meaning the highest rated fighter (by 'Rating') gets '1' in the Ranking column, the second highest rated fighter gets '2' in the Ranking column, etc.).
Also I would like to be able to add WHERE clause in the fighters table (where fighters.organization = 'UFC') for example.
First, let's fix your query so it runs on MySQL < 8.0. This requires doing the computing and sorting in a subquery, then using a variable to compute the rank:
select
id,
rating,
@rnk := @rnk + 1 ranking
from
(select @rnk := 0) r
cross join (
select
fighter1 id,
round(((avg(indrating)*13) + (avg(fightrating)*5) * 20) / 2,2) as rating
from fights
group by fighter1
order by rating desc
) x
Now we use the update ... join ... set ... syntax to update the fighters table:
update fighters f
inner join (
select
id,
rating,
@rnk := @rnk + 1 ranking
from
(select @rnk := 0) r
cross join (
select
fighter1 id,
round(((avg(indrating)*13) + (avg(fightrating)*5) * 20) / 2,2) as rating
from fights
group by fighter1
order by rating desc
) x
) y on y.id = f.id
set f.ranking = y.ranking;
Demo in a MySQL 5.6 fiddle based on the fiddle you provided in the comments.
The select query returns:
| id | rating | ranking |
| --- | ------ | ------- |
| 3 | 219.5 | 1 |
| 4 | 213 | 2 |
| 1 | 169.5 | 3 |
| 2 | 156.5 | 4 |
And here is the content of the fighters table after the update:
| id | lastname | ranking |
| --- | ---------- | ------- |
| 1 | Gustafsson | 3 |
| 2 | Cyborg | 4 |
| 3 | Jones | 1 |
| 4 | Sonnen | 2 |
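As a sanity check on the numbers above, the ranking step itself (sort by rating descending, then number the rows) can be reproduced with a small Python sketch. The ratings are copied from the select output; the function name assign_rankings is just an illustrative name, not part of the query.

```python
# Ratings as returned by the select query above: fighter id -> rating.
RATINGS = {1: 169.5, 2: 156.5, 3: 219.5, 4: 213.0}


def assign_rankings(ratings):
    """Map each fighter id to its 1-based position when sorted by rating desc."""
    ordered = sorted(ratings.items(), key=lambda item: item[1], reverse=True)
    return {
        fighter_id: rank
        for rank, (fighter_id, _rating) in enumerate(ordered, start=1)
    }


rankings = assign_rankings(RATINGS)
print(rankings)  # {3: 1, 4: 2, 1: 3, 2: 4}
```

This matches the ranking column written to the fighters table by the update.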

MySQL: how to assign same ID for records with close timestamp

I have a MySQL table with a timestamp column t. I need to create another integer column (groupId) which will have the same value for records whose timestamps are less than 3 seconds apart. My version of MySQL has no window function support. This is the expected output in the 2nd column:
+---------------------+--------+
| t | groupId|
+---------------------+--------+
| 2017-06-17 18:15:13 | 1 |
| 2017-06-17 18:15:14 | 1 |
| 2017-06-17 20:30:06 | 2 |
| 2017-06-17 20:30:07 | 2 |
| 2017-06-17 22:44:58 | 3 |
| 2017-06-17 22:44:59 | 3 |
| 2017-06-17 23:59:50 | 4 |
| 2017-06-17 23:59:51 | 4 |
I tried to use a self-join and TIMESTAMPDIFF(SECOND,t1,t2) <3
but I do not know how to generate the unique groupId.
P.S.
It is guaranteed by the nature of the data that there is no continuous range which spans more than 3 sec.
You can do this using variables.
select tm
,@diff:=timestampdiff(second,@prev,tm)
,@prev:=tm
,@grp:=case when @diff<3 or @diff is null then @grp else @grp+1 end as groupID
from t
cross join (select @prev:='',@diff:=0,@grp:=1) r
order by tm
For this, I believe that you need to create a stored procedure that first sorts your table by the column t (timestamp) and then goes through it, grouping rows and assigning the groupId accordingly. In this case you can use your own counter as the groupId.
What is important here is how you split the time into frames of 2 seconds; you could end up with different results depending on your point of reference.
This query puts every record in the same group when the previous record is just 3 seconds before:
UPDATE t
JOIN (
SELECT
t.*
, @gid := IF(TIMESTAMPDIFF(SECOND, @prev, t) > 3, @gid + 1, @gid) AS gid
, @prev := t
FROM t
, (SELECT @prev := NULL, @gid := 1) v
ORDER BY t
) sq ON t.t = sq.t
SET t.groupId = sq.gid;
see it working live in an sqlfiddle
learn more about user-defined variables here
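For comparison, the "walk through the sorted rows and bump a counter" idea described above looks like this in Python. It is only a sketch of the logic: it follows the question's rule (same group when the gap to the previous row is less than 3 seconds), the function name assign_group_ids and the max_gap_seconds parameter are assumptions made for the example, and the timestamps are the ones from the expected output.

```python
from datetime import datetime

# Timestamps from the question's expected output.
TIMESTAMPS = [
    "2017-06-17 18:15:13", "2017-06-17 18:15:14",
    "2017-06-17 20:30:06", "2017-06-17 20:30:07",
    "2017-06-17 22:44:58", "2017-06-17 22:44:59",
    "2017-06-17 23:59:50", "2017-06-17 23:59:51",
]


def assign_group_ids(timestamps, max_gap_seconds=3):
    """Return [(timestamp, group_id)] with a new id whenever the gap is too large."""
    parsed = sorted(datetime.strptime(t, "%Y-%m-%d %H:%M:%S") for t in timestamps)
    result = []
    prev = None   # plays the role of @prev
    group_id = 1  # plays the role of @grp / @gid
    for ts in parsed:
        # Compare with the previous row only; the first row has no predecessor.
        if prev is not None and (ts - prev).total_seconds() >= max_gap_seconds:
            group_id += 1
        result.append((ts, group_id))
        prev = ts
    return result


groups = assign_group_ids(TIMESTAMPS)
print([gid for _ts, gid in groups])  # [1, 1, 2, 2, 3, 3, 4, 4]
```

Because each row is compared only with its immediate predecessor, a long chain of closely spaced rows would all share one group, which is why the guarantee in the question's P.S. matters.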
This query will work in Oracle sql:
select *
from (
select e.*,
rank() over (partition by trunc(hiredate,'mi') order by trunc(hiredate,'mi') desc) MINu
from emp e
)

Get Average value for each X rows in SQL

Let's say I have the following table
+----+-------+
| Id | Value |
+----+-------+
| 1 | 2.0 |
| 2 | 8.0 |
| 3 | 3.0 |
| 4 | 9.0 |
| 5 | 1.0 |
| 6 | 4.0 |
| 7 | 2.5 |
| 8 | 6.5 |
+----+-------+
I want to plot these values, but since my real table has thousands of values, I thought about getting an average for each X rows. Is there any way for me to do so for, e.g., each 2 or 4 rows, like below:
2
+-----+------+
| 1-2 | 5.0 |
| 3-4 | 6.0 |
| 5-6 | 2.5 |
| 7-8 | 4.5 |
+-----+------+
4
+-----+------+
| 1-4 | 5.5 |
| 5-8 | 3.5 |
+-----+------+
Also, is there any way to make this X value dynamic, based on the total number of rows in my table? Something like, if I have 1000 rows, the average will be calculated based on each 200 rows (1000/5), but if I have 20, calculate it based on each 4 rows (20/5).
I know how to do that programmatically, but is there any way to do so using pure SQL?
EDIT: I need it to work on mysql.
Depending on your DBMS, something like this will work:
SELECT
ChunkStart = Min(Id),
ChunkEnd = Max(Id),
Value = Avg(Value)
FROM
(
SELECT
Chunk = NTILE(5) OVER (ORDER BY Id),
*
FROM
YourTable
) AS T
GROUP BY
Chunk
ORDER BY
ChunkStart;
This creates 5 groups or chunks no matter how many rows there are, as you requested.
If you have no windowing functions you can fake it:
SELECT
ChunkStart = Min(Id),
ChunkEnd = Max(Id),
Value = Avg(Value)
FROM
YourTable
GROUP BY
(Id - 1) / (((SELECT Count(*) FROM YourTable) + 4) / 5)
;
I made some assumptions here, such as Id beginning with 1 and there being no gaps, and that you would want the last group to be too small rather than too big if things didn't divide evenly. I also assumed that integer division behaves as it does in MS SQL Server.
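To see what the grouping expression does under those assumptions (Ids starting at 1, no gaps, integer division), here is a small Python sketch using the question's sample data. The function names chunk_averages and dynamic_chunk_size are made up for illustration.

```python
# Sample data from the question: (Id, Value).
VALUES = [
    (1, 2.0), (2, 8.0), (3, 3.0), (4, 9.0),
    (5, 1.0), (6, 4.0), (7, 2.5), (8, 6.5),
]


def dynamic_chunk_size(row_count, chunks=5):
    """Rows per chunk so that about `chunks` groups result (rounded up)."""
    return (row_count + chunks - 1) // chunks


def chunk_averages(rows, chunk_size):
    """Average Value over each run of `chunk_size` consecutive Ids."""
    groups = {}
    for row_id, value in rows:
        # Same key as the GROUP BY expression: (Id - 1) / chunk_size, truncated.
        groups.setdefault((row_id - 1) // chunk_size, []).append((row_id, value))
    result = []
    for key in sorted(groups):
        ids = [row_id for row_id, _ in groups[key]]
        avg = sum(value for _, value in groups[key]) / len(groups[key])
        result.append((f"{min(ids)}-{max(ids)}", avg))
    return result


print(chunk_averages(VALUES, 2))  # [('1-2', 5.0), ('3-4', 6.0), ('5-6', 2.5), ('7-8', 4.5)]
print(chunk_averages(VALUES, 4))  # [('1-4', 5.5), ('5-8', 3.5)]
print(dynamic_chunk_size(1000), dynamic_chunk_size(20))  # 200 4
```

The rounding-up in dynamic_chunk_size corresponds to the "+ 4) / 5" part of the query, which is what makes the last group smaller rather than larger when the row count does not divide evenly.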
You can use the modulo operator to act on every Nth row of the table. This example would get the average value over every 10th row:
select avg(Value) from some_table where id % 10 = 0;
You could then do a count of the rows in the table, apply some factor to that, and use that value as a dynamic interval:
select avg(Value) from some_table where id % (select round(count(*)/1000) from some_table) = 0;
You'll need to figure out the best interval based on the actual number of rows you have in the table of course.
EDIT:
Rereading your post, I realize this is getting an average of every Nth row, and not of each sequential N rows. I'm not sure if this would suffice, or if you specifically need sequential averages.
Look at the NTILE function (as in quartile, quintile, decile, percentile). You can use it to split your data evenly into a number of buckets - in your case it seems you would like five.
Then you can use AVG to calculate an average for each bucket.
NTILE is in SQL-99 so most DBMSes should have it.
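As a rough illustration of how NTILE splits rows, here is a Python sketch that numbers the question's sample rows into five buckets and averages each one. It assumes the usual NTILE behaviour, where buckets differ in size by at most one and the larger buckets come first; the function name ntile is simply borrowed from the SQL function for this example.

```python
# Sample data from the question: (Id, Value), already ordered by Id.
VALUES = [
    (1, 2.0), (2, 8.0), (3, 3.0), (4, 9.0),
    (5, 1.0), (6, 4.0), (7, 2.5), (8, 6.5),
]


def ntile(rows, buckets):
    """Return [(bucket_number, row)] numbering rows like NTILE(buckets)."""
    size, remainder = divmod(len(rows), buckets)
    numbered = []
    index = 0
    for bucket in range(1, buckets + 1):
        # The first `remainder` buckets receive one extra row each.
        count = size + (1 if bucket <= remainder else 0)
        for row in rows[index:index + count]:
            numbered.append((bucket, row))
        index += count
    return numbered


def bucket_averages(rows, buckets):
    """Average the Value column per bucket, skipping empty buckets."""
    sums = {}
    for bucket, (_row_id, value) in ntile(rows, buckets):
        total, count = sums.get(bucket, (0.0, 0))
        sums[bucket] = (total + value, count + 1)
    return {bucket: total / count for bucket, (total, count) in sums.items()}


print(bucket_averages(VALUES, 5))  # {1: 5.0, 2: 6.0, 3: 2.5, 4: 2.5, 5: 6.5}
```

With 8 rows and 5 buckets, the first three buckets hold two rows each and the last two hold one row each, so the bucket sizes are not all equal when the row count is not a multiple of the bucket count.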
You can try that
CREATE TABLE #YourTable
(
ID int
,[Value] float
)
INSERT #YourTable (ID, [Value]) VALUES
(1,2.0)
,(2,8.0)
,(3,3.0)
,(4,9.0)
,(5,1.0)
,(6,4.0)
,(7,2.5)
,(8,6.5)
SELECT
ID = MIN(ID) + '-' + MAX(ID)
,[Value] = AVG([Value])
FROM
(
SELECT
GRP = ((ROW_NUMBER() OVER(ORDER BY ID) -1) / 2) + 1
,ID = CONVERT(VARCHAR(10), ID)
,[Value]
FROM
#YourTable
) GrpTable
GROUP BY
GRP
DROP TABLE #YourTable

MySQL Rank Not Matching High Score in Table

While making a game the MySQL call to get the top 10 is as follows:
SELECT username, score FROM userinfo ORDER BY score DESC LIMIT 10
This seems to work decently enough, but when paired with a call to get an individual player's rank, the numbers may be different if the player has a tied score with other players. The call to get the player's rank is as follows:
SELECT (SELECT COUNT(*) FROM userinfo ui WHERE (ui.score, ui.username) >= (uo.score, uo.username)) AS rank FROM userinfo uo WHERE username='boddie';
Example results from first call:
+------------+-------+
| username | score |
+------------+-------+
| admin | 4878 |
| test3 | 3456 |
| char | 785 |
| test2 | 456 |
| test1 | 253 |
| test4 | 78 |
| test7 | 0 |
| boddie | 0 |
| Lz | 0 |
| char1 | 0 |
+------------+-------+
Example results from second call
+------+
| rank |
+------+
| 10 |
+------+
As can be seen, the first call ranks the player at number 8 on the list, but the second call puts him at number 10. What changes or what can I do to make these calls give matching results?
Thank you in advance for any help!
In the first query you need to:
ORDER BY
score DESC,
username DESC
This way boddie ends up at position 10 in the list. The mismatch is due to the username comparison in the second query:
(ui.score, ui.username) >= (uo.score, uo.username)
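To check that the two calls agree once the tie-break is added, here is a small Python sketch using the ten rows from the question. It assumes the username comparison is case-insensitive (as with a typical MySQL collation), which is why the names are lowercased before comparing; the helper names are invented for the example.

```python
# Rows from the question's top-10 output, in the order shown: (username, score).
USERS = [
    ("admin", 4878), ("test3", 3456), ("char", 785), ("test2", 456),
    ("test1", 253), ("test4", 78), ("test7", 0), ("boddie", 0),
    ("Lz", 0), ("char1", 0),
]


def sort_key(user):
    """(score, username) pair; lowercasing is an assumed case-insensitive collation."""
    name, score = user
    return (score, name.lower())


def rank_by_count(users, username):
    """Mirror of the second call: count rows with (score, username) >= the player's."""
    target = next(user for user in users if user[0] == username)
    return sum(1 for user in users if sort_key(user) >= sort_key(target))


def rank_by_position(users, username):
    """Position in the list ordered by score DESC, username DESC."""
    ordered = sorted(users, key=sort_key, reverse=True)
    return [name for name, _score in ordered].index(username) + 1


print(rank_by_count(USERS, "boddie"))                     # 10
print(rank_by_position(USERS, "boddie"))                  # 10
print([name for name, _ in USERS].index("boddie") + 1)    # 8, the position as originally listed
```

Both functions use the same (score, username) pair, so they cannot disagree on ties; the original listing had no tie-break on username, which is where the 8 came from.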
I would change your Order by to include username, so that you always get the same order. So it would look like:
... ORDER BY score DESC, username ASC ...
Another way of doing it:-
SELECT UsersRank
FROM
(
SELECT userinfo.score, userinfo.username, @Rank:=@Rank+1 AS UsersRank
FROM userinfo
CROSS JOIN (SELECT (@Rank:=0)) Sub1
ORDER BY userinfo.score DESC, userinfo.username
) Sub2
WHERE username = 'boddie'
SELECT
uo.username,
(SELECT COUNT(*)+1 FROM userinfo ui WHERE ui.score>uo.score) AS rank
FROM userinfo uo
WHERE uo.username='boddie'
Or if you need to get username, score and ranks:
SELECT
uo.username,
uo.score,
(@row := @row + 1) AS rank
FROM userinfo uo
JOIN (SELECT @row := 0) r
ORDER BY uo.score DESC, uo.username ASC
You could add
HAVING uo.username = 'boddie'
to get only one user
Try this.
SELECT (COUNT(*)+1) AS rank
FROM userinfo ui
WHERE ui.score > (SELECT score
FROM userinfo
WHERE username='boddie');