MySQL modifying order by rand() to other methods - mysql

I am trying to pick a random value within each group of rows, where each row's chance of being picked follows its weight. For example, I have a table (DemoTable) like this:
http://sqlfiddle.com/#!9/23470/3/0
Name    State  Grade  Weight
John    NY     100    1
Liam    NY     90     2
Olivia  NY     90     3
Emma    NY     80     4
James   CA     10     1
Henry   CA     20     1
Mia     NJ     50     1
Ava     NJ     30     4
For State = 'NY', there are four rows with grade array: [100, 90, 90, 80] and the weight [1, 2, 3, 4], respectively. So 80 has the largest chance to be picked while 100 has the least within its State group.
I made a query for it:
SELECT a.*,
(SELECT b.Grade FROM DemoTable b WHERE a.State = b.State
ORDER BY RAND() * -b.Weight LIMIT 1) AS 'random_val' FROM DemoTable a;
and it worked with the result:
Name    State  Grade  Weight  random_val
John    NY     100    1       80
Liam    NY     90     2       80
Olivia  NY     90     3       80
Emma    NY     80     4       90
James   CA     10     1       20
Henry   CA     20     1       10
Mia     NJ     50     1       30
Ava     NJ     30     4       30
However, I would like to know whether there is another method, such as a join or a union, instead of relying on ORDER BY RAND() alone.
Is there any other way to write my MySQL query that gives the same kind of result?
I've searched for a solution all day but couldn't find a proper way to do it, which is why I'm asking here.
I would sincerely appreciate any advice.

My first attempt using analytic functions, though I suspect yours is faster over larger datasets...
WITH
ranged AS
(
SELECT
*,
SUM(weight) OVER (PARTITION BY state ORDER BY id) - weight AS weight_range_lower,
SUM(weight) OVER (PARTITION BY state ORDER BY id) AS weight_range_upper,
SUM(weight) OVER (PARTITION BY state ) * rand() AS rand_threshold
FROM
DemoTable
)
SELECT
ranged.*,
lookup.grade AS random_grade
FROM
ranged
INNER JOIN
ranged AS lookup
ON lookup.state = ranged.state
AND lookup.weight_range_lower <= ranged.rand_threshold
AND lookup.weight_range_upper > ranged.rand_threshold
ORDER BY
ranged.id
Or, if you want all members of the same state to be given the same random_grade...
SELECT
*,
FIRST_VALUE(grade) OVER (PARTITION BY state ORDER BY weight * rand() DESC)
FROM
DemoTable
ORDER BY
id
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=133f9e86b013a477ac342d0295132dd5
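To make the range lookup in the first query easier to follow, here is a minimal Python sketch of the same idea, not the MySQL implementation itself. It uses the NY rows from the question; the function names (weight_ranges, pick_grade, random_grade) are made up for illustration. Each row gets a half-open range [lower, upper) of cumulative weight, and a threshold drawn from [0, total weight) lands in exactly one range.

```python
import random
from itertools import accumulate

# Rows for State = 'NY' from the question: (name, grade, weight).
ny_rows = [("John", 100, 1), ("Liam", 90, 2), ("Olivia", 90, 3), ("Emma", 80, 4)]

def weight_ranges(rows):
    """Return (grade, lower, upper) per row, mirroring weight_range_lower/upper."""
    uppers = list(accumulate(weight for _, _, weight in rows))
    return [(grade, upper - weight, upper)
            for (_, grade, weight), upper in zip(rows, uppers)]

def pick_grade(rows, threshold):
    """Return the grade whose range [lower, upper) contains the threshold."""
    for grade, lower, upper in weight_ranges(rows):
        if lower <= threshold < upper:
            return grade
    raise ValueError("threshold must be in [0, total weight)")

def random_grade(rows, rng=random):
    """Draw a threshold the way SUM(weight) * rand() does, then look it up."""
    total = sum(weight for _, _, weight in rows)
    return pick_grade(rows, rng.random() * total)

print(weight_ranges(ny_rows))
print(random_grade(ny_rows))
```

Because Emma's range covers 4 of the 10 weight units and John's covers 1, grade 80 from Emma's row is four times as likely to be picked as grade 100.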

Related

Can this be done with a subquery?

I have a table looking like this:
Team Name Points
A Peter 26
A John 18
A Carl 20
A Robert 32
A Mike 10
B Tom 22
B Michael 28
B Tina 18
B Donald 35
B Jeff 20
I want to get a result from the query that will give me the best 3 users from a team and the SUM of the point from the 3 highest users.
For team A the 3 highest scores are Robert (32), Peter (26) and Carl (20), which is a total of 78 points.
For team B the 3 highest scores are Donald (35), Michael (28) and Tom (22), which is a total of 85 points.
So it must be something like this:
Place Team Points
1 B 85
2 A 78
I have tried something like this:
SELECT points, user FROM `table` ORDER BY points DESC LIMIT 3
That will give me the 3 users with the highest points but I want also the SUM of these 3 users and these records must be per team.
I think it must be done with a subquery, is this correct?
SELECT team,SUM(points) sp FROM
(
(
SELECT * FROM `table` WHERE team = "A" ORDER BY points DESC LIMIT 3
)
UNION ALL
(
SELECT * FROM `table` WHERE team = "B" ORDER BY points DESC LIMIT 3
)
)
AS dbx
GROUP BY team
ORDER BY sp DESC
Explanation:
Get the highest 3 rows from each team (A and B) in separate subqueries, combine them with UNION ALL, and then sum the points per team.
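As a cross-check of the expected totals, the same "top 3 per team, then sum" logic can be sketched in Python. This is only an illustration of the logic using the question's data, not a replacement for the SQL; the function name top_n_totals is hypothetical.

```python
from collections import defaultdict

# (team, name, points) rows from the question.
scores = [
    ("A", "Peter", 26), ("A", "John", 18), ("A", "Carl", 20),
    ("A", "Robert", 32), ("A", "Mike", 10),
    ("B", "Tom", 22), ("B", "Michael", 28), ("B", "Tina", 18),
    ("B", "Donald", 35), ("B", "Jeff", 20),
]

def top_n_totals(rows, n=3):
    """Sum the n highest point values per team, best team first."""
    by_team = defaultdict(list)
    for team, _name, points in rows:
        by_team[team].append(points)
    totals = {team: sum(sorted(points, reverse=True)[:n])
              for team, points in by_team.items()}
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

print(top_n_totals(scores))  # [('B', 85), ('A', 78)]
```

Unlike the UNION ALL query, this sketch does not need one branch per team, which is the same advantage a window-function version would have in SQL.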

Need help to figure out sorting sql query

I'm trying to get the total number of levels gained or lost from this sort of table:
id name level timestamp
1 Rex 15 10:25
2 Rex 15 10:26
3 Rex 15 10:27
4 Rex 14 10:28
5 Rex 13 10:29
6 Rex 13 10:30
7 Rex 13 10:31
8 Rex 13 10:29
9 Xer 44 10:30
10 Xer 44 10:31
11 Xer 45 10:32
12 Xer 45 10:33
13 Xer 45 10:34
Currently I'm running
SELECT id, name, level, timestamp, MAX(level) - MIN(level) AS gained
FROM log
GROUP BY name
But the problem with this query is that both gained and lost levels count as gained. It would be perfect if I could get a negative int in the gained column when the user has lost levels.
The output I want from the data above is:
id name level timestamp gained
8 Rex 13 10:29 -2
13 Xer 45 10:34 1
If you need to respect the timeline, then try something like this:
SELECT MAX(id) id, name,
( SELECT level FROM log l0 WHERE l.name = l0.name ORDER BY timestamp DESC LIMIT 1 ) level,
MAX(timestamp) timestamp,
-- last entry for the name
( SELECT level FROM log l1 WHERE l.name = l1.name ORDER BY timestamp DESC LIMIT 1 ) -
-- first entry for the name
( SELECT level FROM log l2 WHERE l.name = l2.name ORDER BY timestamp ASC LIMIT 1 ) gained
FROM log l
GROUP BY name
I used LAG in a subquery to get the changes and then summed those changes in an outer subquery. To get the last row I use yet another subquery to find the max timestamp for each name. Maybe not the most efficient query, but it works:
SELECT l.id, l.name, l.level, l.timestamp, sg.gain
FROM log l
JOIN (SELECT name, SUM(gain) gain
FROM (SELECT name, level - COALESCE(LAG(level) OVER w, level) as gain
FROM log
WINDOW w AS (PARTITION BY name ORDER BY timestamp)) as g
GROUP BY name) as sg ON sg.name = l.name
JOIN (SELECT name, MAX(timestamp) max_t
FROM log
GROUP BY name) mt ON mt.name = l.name AND mt.max_t = l.timestamp
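Both answers arrive at the same number because the sum of row-to-row changes telescopes to "last level minus first level". A small Python sketch with the question's data shows this; it assumes rows are ordered by timestamp with id as a tie-breaker (two Rex rows share 10:29), and the name gained_by_name is made up for illustration.

```python
from itertools import groupby

# (id, name, level, timestamp) rows from the question.
log = [
    (1, "Rex", 15, "10:25"), (2, "Rex", 15, "10:26"), (3, "Rex", 15, "10:27"),
    (4, "Rex", 14, "10:28"), (5, "Rex", 13, "10:29"), (6, "Rex", 13, "10:30"),
    (7, "Rex", 13, "10:31"), (8, "Rex", 13, "10:29"),
    (9, "Xer", 44, "10:30"), (10, "Xer", 44, "10:31"), (11, "Xer", 45, "10:32"),
    (12, "Xer", 45, "10:33"), (13, "Xer", 45, "10:34"),
]

def gained_by_name(rows):
    """Sum consecutive level changes per name (the LAG approach)."""
    ordered = sorted(rows, key=lambda r: (r[1], r[3], r[0]))  # name, timestamp, id
    result = {}
    for name, group in groupby(ordered, key=lambda r: r[1]):
        levels = [row[2] for row in group]
        diffs = [cur - prev for prev, cur in zip(levels, levels[1:])]
        # The summed changes equal last level minus first level.
        assert sum(diffs) == levels[-1] - levels[0]
        result[name] = sum(diffs)
    return result

print(gained_by_name(log))  # {'Rex': -2, 'Xer': 1}
```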

Grouping by to find average differences for specific indexes in SQL

I have the following table:
person_index score year
3 76 2003
3 86 2004
3 86 2005
3 87 2006
4 55 2005
4 91 2006
I want to group by person_index, getting the average score difference between consecutive years, such that I end up with one row per person, indicating the average increase/decrease:
person_index avg(score_diff)
3 3.67
4 36
So for person with index 3 - there were changes over 3 years, one was 10pt, one was 0, and one was 1pt. Therefore, their average score_diff is 3.67.
EDIT: to clarify, scores can also decrease. And years aren't necessarily consecutive (one person might not get a score at a certain year, so could be 2013 followed by 2015).
The simplest way is to use LAG (MySQL 8.0+):
WITH cte AS (
SELECT *, score - LAG(score) OVER(PARTITION BY person_index ORDER BY year) AS diff
FROM tab
)
SELECT person_index, AVG(diff) AS avg_diff
FROM cte
GROUP BY person_index;
Output:
+--------------+----------+
| person_index | avg_diff |
+--------------+----------+
|            3 |   3.6667 |
|            4 |  36.0000 |
+--------------+----------+
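For readers who want to check the numbers by hand, here is a rough Python equivalent of the LAG query, using the question's rows. It is a sketch of the logic only; the function name avg_score_diff is hypothetical. The first row per person has no previous score, which corresponds to the NULL diff that AVG ignores.

```python
from collections import defaultdict

# (person_index, score, year) rows from the question.
scores = [
    (3, 76, 2003), (3, 86, 2004), (3, 86, 2005), (3, 87, 2006),
    (4, 55, 2005), (4, 91, 2006),
]

def avg_score_diff(rows):
    """Average difference between consecutive scores per person, ordered by year."""
    by_person = defaultdict(list)
    for person, score, _year in sorted(rows, key=lambda r: (r[0], r[2])):
        by_person[person].append(score)
    result = {}
    for person, person_scores in by_person.items():
        diffs = [cur - prev for prev, cur in zip(person_scores, person_scores[1:])]
        # A person with a single row has no differences (AVG would return NULL).
        result[person] = sum(diffs) / len(diffs) if diffs else None
    return result

print(avg_score_diff(scores))
```

Sorting by year before taking differences also handles gaps such as 2013 followed by 2015, since only the order of the rows matters.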
If the scores only increase -- as in your example -- you can simply do the following. The consecutive differences add up to the last score minus the first, so divide by the number of differences (rows minus one):
select person_index,
       ( max(score) - min(score) ) / nullif(count(*) - 1, 0)
from t
group by person_index;
If they do not only increase, it is a bit trickier because you have to calculate the first and last scores:
select p.person_index,
       (tmax.score - tmin.score) / nullif(p.cnt - 1, 0)
from (select person_index, min(year) as miny, max(year) as maxy, count(*) as cnt
      from t
      group by person_index
     ) p join
     t tmin
     on tmin.person_index = p.person_index and tmin.year = p.miny join
     t tmax
     on tmax.person_index = p.person_index and tmax.year = p.maxy;

How to GROUP BY in groups of N rows?

I have a table called PEOPLE with the following data:
MYNAME AGE MYDATE
==========================
MARIO 20 2015/02/03
MARIA 10 2015/02/02
PEDRO 40 2015/02/01
JUAN 15 2015/01/03
PEPE 20 2015/01/02
JULIA 30 2015/01/01
JUANI 50 2014/02/03
MARTIN 10 2014/02/03
NASH 21 2014/02/03
Then I want to get the average of age grouping the people in groups of 3 ordering by MYDATE descending.
I mean, the result that I'm looking for would be something like:
23,3
21,6
27
Where 23,3 is the average of the age of:
MARIO 20 2015/02/03
MARIA 10 2015/02/02
PEDRO 40 2015/02/01
And 21,6 is the average of the age of:
JUAN 15 2015/01/03
PEPE 20 2015/01/02
JULIA 30 2015/01/01
And 27 is the average of the age of:
JUANI 50 2014/02/03
MARTIN 10 2014/02/03
NASH 21 2014/02/03
How could I handle this? I know how to use GROUP BY but only to group for a particular field of the table.
SQL tables are inherently unordered, so I assume that you want to order by mydate descending. You can enumerate the rows using variables, use arithmetic to define the groups, and then get the average:
select avg(age)
from (select t.*, (@rn := @rn + 1) as seqnum
      from people t cross join
           (select @rn := 0) vars
      order by mydate desc
     ) t
group by floor((seqnum - 1) / 3);
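The grouping arithmetic, floor((seqnum - 1) / 3), maps sequence numbers 1-3 to group 0, 4-6 to group 1, and so on. A short Python sketch with the question's rows illustrates it; this only mirrors the logic, and the function name group_averages is made up for the example. Rows with equal dates keep their listed order here, which the SQL does not guarantee.

```python
# (myname, age, mydate) rows from the question.
people = [
    ("MARIO", 20, "2015/02/03"), ("MARIA", 10, "2015/02/02"), ("PEDRO", 40, "2015/02/01"),
    ("JUAN", 15, "2015/01/03"), ("PEPE", 20, "2015/01/02"), ("JULIA", 30, "2015/01/01"),
    ("JUANI", 50, "2014/02/03"), ("MARTIN", 10, "2014/02/03"), ("NASH", 21, "2014/02/03"),
]

def group_averages(rows, size=3):
    """Average age per block of `size` rows, ordered by date descending."""
    ordered = sorted(rows, key=lambda r: r[2], reverse=True)
    groups = {}
    for seqnum, (_name, age, _date) in enumerate(ordered, start=1):
        group_key = (seqnum - 1) // size  # same as floor((seqnum - 1) / 3)
        groups.setdefault(group_key, []).append(age)
    return [sum(ages) / len(ages) for _key, ages in sorted(groups.items())]

print([round(avg, 2) for avg in group_averages(people)])  # [23.33, 21.67, 27.0]
```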
Try this (SQL Server syntax). You can then also use Grp to see which three people each average relates to:
select MYNAME, AGE, MYDATE, RN / 3 As Grp into #x
from
(select MYNAME, AGE , MYDATE, ROW_NUMBER() over(order by MyDate) + 2 as RN
from MYdata)x
select Grp, AVG(Age) as AvgAge
From #x
Group By Grp

Query: Count Alphabetical wise name

I have a voter table which contains a large amount of data, like this:
Voter_id name age
1 san 24
2 dnyani 20
3 pavan 23
4 ddanial 19
5 sam 20
6 pickso 38
I need to show all voter names alphabetically and count them, like this:
name
san
sam
s...
s...
dnyani
ddanial
d...
pavan
pickso
p..
p..
I tried using count(voter_name) and GROUP BY, but neither worked for me.
Suppose the table contains details for 50 voters, where the number of names starting with A is 15, B is 2, C is 10, Y is 3, and so on.
How can I count them and show the first 15 records for 'A', then the next 2 records for 'B', and so on?
Any reference or hint would be appreciated.
Thanks in advance.
It is as simple as this,
SELECT SUBSTRING(name,1,1) as ALPHABET, COUNT(name) as COUNT
FROM voter GROUP BY SUBSTRING(name,1,1);
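To show what the GROUP BY on the first letter produces, here is a small Python sketch using the six sample rows from the question. It only illustrates the grouping logic; the function name first_letter_counts is hypothetical.

```python
from collections import Counter

# (voter_id, name, age) rows from the question.
voters = [
    (1, "san", 24), (2, "dnyani", 20), (3, "pavan", 23),
    (4, "ddanial", 19), (5, "sam", 20), (6, "pickso", 38),
]

def first_letter_counts(rows):
    """Count names per first letter, like GROUP BY SUBSTRING(name, 1, 1)."""
    counts = Counter(name[:1] for _voter_id, name, _age in rows)
    return sorted(counts.items())  # alphabetical by letter

print(first_letter_counts(voters))  # [('d', 2), ('p', 2), ('s', 2)]
```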
This only orders the names:
SELECT `name` FROM `voter` ORDER BY `name` ASC
This counts each occurrence of the first letter and groups them together,
e.g.:
Letter COUNT
------ -------
A 15
B 2
C 10
y 3
SELECT SUBSTR(`name`,1,1) GRP, COUNT(`name`) FROM `voter` WHERE
SUBSTR(`name`,1,1)=SUBSTR(`name`,1,1) GROUP BY GRP ORDER BY GRP ASC
Here you go!
If you need names and their counts in ascending order, then you can use:
SELECT
name, COUNT(*) AS name_count
FROM
voter
GROUP BY
name
ORDER BY
name ASC
Which will give the output like
name name_count
------------------
albert 15
baby 6
...
If you need to display all records along with their counts, then you may use this:
SELECT
voter_id, name, age, name_count
FROM
(
SELECT
name, COUNT(name) AS name_count
FROM
voter
GROUP BY
name
) counts
JOIN voter
USING (name)
ORDER BY
name
and you get the output as:
voter_id name age name_count
------------------------------------
6 abraham 26 2
24 abraham 36 2
2 albert 19 1
4 babu 24 4
15 babu 53 4
99 babu 28 4
76 babu 43 4
...
Check the SUBSTRING function of MySQL here
http://dev.mysql.com/doc/refman/5.5/en/string-functions.html#function_substring
And we can use a sub-query to achieve our result.
So using that, how about this
SELECT first_letter, COUNT(*) AS name_count
FROM
(SELECT voter_id, name, age, SUBSTRING(name, 1, 1) AS first_letter FROM voter)
AS voter
GROUP BY first_letter
ORDER BY first_letter ASC