I am using the SQL below to retrieve the last 20 rows from the table, ordered by date. I would like to limit it so that within each post_day group only the top 10 rows by votes DESC are selected.
SELECT *, DATE(timestamp) as post_day
FROM stories
ORDER BY post_day DESC, votes DESC
LIMIT 0, 20
This is what the table looks like:
STORYID TIMESTAMP VOTES
1 2015-03-10 1
2 2015-03-10 2
3 2015-03-9 5
4 2015-03-9 3
Schema
create table stories
( storyid int auto_increment primary key,
theDate date not null,
votes int not null
);
insert stories(theDate,votes) values
('2015-03-10',1),
('2015-03-10',2),
('2015-03-09',5),
('2015-03-09',3),
('2015-03-10',51),
('2015-03-10',26),
('2015-03-09',75),
('2015-03-09',2),
('2015-03-10',12),
('2015-03-10',32),
('2015-03-09',51),
('2015-03-09',63),
('2015-03-10',1),
('2015-03-10',11),
('2015-03-09',5),
('2015-03-09',21),
('2015-03-10',1),
('2015-03-10',2),
('2015-03-09',5),
('2015-03-09',3),
('2015-03-10',51),
('2015-03-10',26),
('2015-03-09',75),
('2015-03-09',2),
('2015-03-10',12),
('2015-03-10',44),
('2015-03-09',11),
('2015-03-09',7),
('2015-03-10',19),
('2015-03-10',7),
('2015-03-09',51),
('2015-03-09',79);
The Query
set @rn := 0, @thedate := '';
select theDate, votes
from
(
    select storyid, theDate, votes,
    @rn := if(@thedate = theDate, @rn + 1, 1) as rownum,
    @thedate := theDate as not_used
    from stories
    order by theDate, votes desc
) A
where A.rownum <= 10;
The Results
+------------+-------+
| theDate | votes |
+------------+-------+
| 2015-03-09 | 79 |
| 2015-03-09 | 75 |
| 2015-03-09 | 75 |
| 2015-03-09 | 63 |
| 2015-03-09 | 51 |
| 2015-03-09 | 51 |
| 2015-03-09 | 21 |
| 2015-03-09 | 11 |
| 2015-03-09 | 7 |
| 2015-03-09 | 5 |
| 2015-03-10 | 51 |
| 2015-03-10 | 51 |
| 2015-03-10 | 44 |
| 2015-03-10 | 32 |
| 2015-03-10 | 26 |
| 2015-03-10 | 26 |
| 2015-03-10 | 19 |
| 2015-03-10 | 12 |
| 2015-03-10 | 12 |
| 2015-03-10 | 11 |
+------------+-------+
20 rows in set, 1 warning (0.00 sec)
Usually you would use ROW_NUMBER() per group to number the records inside each group and then select the records with ROW_NUMBER <= 10. MySQL versions before 8.0 have no ROW_NUMBER() window function, but you can emulate it with user-defined variables:
select storyId, post_day, votes
from (
    select storyId,
    DATE(timestamp) as post_day,
    votes,
    @num := if(@grp = DATE(timestamp), @num + 1, 1) as row_number,
    @grp := DATE(timestamp) as dummy
    from stories, (select @num := 0, @grp := null) as T
    order by DATE(timestamp) DESC, votes DESC
) as x where x.row_number <= 10;
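On MySQL 8.0 and later the same top-N-per-group selection can be written with the ROW_NUMBER() window function instead of variables. The following is a minimal sketch, not the answer's own code: it runs the query through Python's sqlite3 module (an assumption made only so the example is self-contained; it needs SQLite 3.25 or newer), uses the first eight rows of the sample data, and keeps the top 2 rows per day rather than 10 to keep the output short.

```python
import sqlite3

# In-memory SQLite stands in for MySQL 8+ (assumption for a runnable sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stories ("
    "storyid INTEGER PRIMARY KEY, theDate TEXT NOT NULL, votes INTEGER NOT NULL)"
)

# First eight rows of the sample data from the schema above.
rows = [
    ("2015-03-10", 1), ("2015-03-10", 2), ("2015-03-09", 5), ("2015-03-09", 3),
    ("2015-03-10", 51), ("2015-03-10", 26), ("2015-03-09", 75), ("2015-03-09", 2),
]
conn.executemany("INSERT INTO stories (theDate, votes) VALUES (?, ?)", rows)

TOP_N = 2  # the question asks for 10; 2 keeps the result readable

query = """
SELECT theDate, votes
FROM (
    SELECT theDate, votes,
           -- number the rows of each day, highest votes first
           ROW_NUMBER() OVER (PARTITION BY theDate ORDER BY votes DESC) AS rn
    FROM stories
) ranked
WHERE rn <= ?
ORDER BY theDate DESC, votes DESC
"""
top_per_day = conn.execute(query, (TOP_N,)).fetchall()
print(top_per_day)
```

The inner SELECT is plain SQL, so it should carry over to MySQL 8+ with DATE(timestamp) in place of theDate if the column is a timestamp.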
SQLFiddle demo
Also look at:
How to select the first/least/max row per group in SQL
Is there any way to fetch all entities from a table, grouped by a common property, in a while loop?
Table storage looks like this
id | product_id | category_id
-----------+-----------------------+--------------------------
1 | 1 | 15
2 | 2 | 17
3 | 3 | 18
4 | 4 | 17
5 | 5 | 15
6 | 6 | 17
7 | 7 | 18
and the final result is supposed to look like this
id | product_id | category_id
-----------+-----------------------+--------------------------
1 | 1 | 15
2 | 2 | 17
3 | 3 | 18
4 | 5 | 15
5 | 4 | 17
6 | 7 | 18
7 | 6 | 15
What I want is this:
Select every record, grouped by category id. That is, if the table has 3200 rows, I need to select all 3200 records, grouped by category id in ASC order.
You seem to want the values interleaved. You can use row_number() in the order by:
select s.*
from storage s
order by row_number() over (partition by category_id order by id),
         category_id;
Here is a db<>fiddle.
EDIT:
In older versions of MySQL, you can assign a sequential number within each group using variables and then use that for ordering:
select s.*
from (select s.*,
             (@rn := if(@cid = category_id, @rn + 1,
                        if(@cid := category_id, 1, 1)
                       )
             ) as seqnum
      from (select s.* from storage s order by category_id, id) s cross join
           (select @rn := 0, @cid := -1) params
     ) s
order by seqnum, id;
The SQL Fiddle has both methods.
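To see the interleaving on the sample rows, here is a small sketch that numbers the rows within each category and sorts by that number first. It runs through Python's sqlite3 module rather than MySQL (an assumption for portability; window functions need SQLite 3.25 or newer) and computes the sequence number in a subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE storage (id INTEGER PRIMARY KEY, product_id INTEGER, category_id INTEGER)"
)
# The seven rows from the question.
conn.executemany(
    "INSERT INTO storage VALUES (?, ?, ?)",
    [(1, 1, 15), (2, 2, 17), (3, 3, 18), (4, 4, 17),
     (5, 5, 15), (6, 6, 17), (7, 7, 18)],
)

query = """
SELECT product_id, category_id
FROM (
    SELECT s.*,
           -- 1 for the first row of each category, 2 for the second, ...
           ROW_NUMBER() OVER (PARTITION BY category_id ORDER BY id) AS seqnum
    FROM storage s
) numbered
ORDER BY seqnum, category_id
"""
interleaved = conn.execute(query).fetchall()
print(interleaved)
```

All first rows of every category come out before any second row, which gives the round-robin order 15, 17, 18, 15, 17, 18, 17 for this data.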
I'm trying to compute a user ranking from each user's highest performance in every beatmap.
I take the user's highest performance in every beatmap (only the top 5 performances) and add them together, but it fails when the highest performance in one beatmap is repeated, because it is counted twice.
I based this on this solution, but it doesn't work well for me.
Using MySQL 5.7
What am I doing wrong?
Fiddle
Using this code:
SET group_concat_max_len := 1000000;
SELECT @i:=@i+1 rank, x.userID, x.totalperformance FROM (SELECT r.userID, SUM(r.performance) as totalperformance
FROM
(SELECT Rankings.*
FROM Rankings INNER JOIN (
SELECT userID, GROUP_CONCAT(performance ORDER BY performance DESC) grouped_performance
FROM Rankings
GROUP BY userID) group_max
ON Rankings.userID = group_max.userID
AND FIND_IN_SET(performance, grouped_performance) <= 5
ORDER BY
Rankings.userID, Rankings.performance DESC) as r
GROUP BY userID) x
JOIN
(SELECT @i:=0) vars
ORDER BY x.totalperformance DESC
Expected result:
+------+--------+------------------+
| rank | userID | totalperformance |
+------+--------+------------------+
| 1 | 1 | 450 |
+------+--------+------------------+
| 2 | 2 | 250 |
+------+--------+------------------+
| 3 | 5 | 140 |
+------+--------+------------------+
| 4 | 3 | 50 |
+------+--------+------------------+
| 5 | 75 | 10 |
+------+--------+------------------+
| 6 | 45 | 0 | --
+------+--------+------------------+
| 7 | 70 | 0 | ----> This order is not relevant
+------+--------+------------------+
| 8 | 76 | 0 | --
+------+--------+------------------+
Actual Result:
+------+--------+------------------+
| rank | userID | totalperformance |
+------+--------+------------------+
| 1 | 1 | 520 |
+------+--------+------------------+
| 2 | 2 | 350 |
+------+--------+------------------+
| 3 | 5 | 220 |
+------+--------+------------------+
| 4 | 3 | 100 |
+------+--------+------------------+
| 5 | 75 | 10 |
+------+--------+------------------+
| 6 | 45 | 0 | --
+------+--------+------------------+
| 7 | 70 | 0 | ----> This order is not relevant
+------+--------+------------------+
| 8 | 76 | 0 | --
+------+--------+------------------+
Since you mention that you pick only the top 5 performances per user across beatmaps, you can try it this way:
select @i:=@i+1, userid, performance from (
    select userid, sum(performance) as performance from (
        select
            @row_number := CASE WHEN @last_category <> t1.userID THEN 1 ELSE @row_number + 1 END AS row_number,
            @last_category := t1.userid,
            t1.userid,
            t1.beatmapid,
            t1.performance
        from (
            select
                userid, beatmapid,
                max(performance) as performance
            from Rankings
            group by userid, beatmapid
        ) t1
        CROSS JOIN (SELECT @row_number := 0, @last_category := null) t2
        ORDER BY t1.userID, t1.performance desc
    ) t3
    where row_number <= 5
    group by userid
)
t4 join (SELECT @i := 0) t5
order by performance desc
The query above first reduces each beatmap to the user's single best performance, so a repeated score is not counted twice, and then sums only the top 5 of those values per user.
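The same three steps (best score per beatmap, top 5 per user, sum and rank) can be sketched with window functions, which MySQL 5.7 lacks but MySQL 8.0+ provides. The example below is illustrative only: the Rankings rows are made up for this sketch, the column names are assumed from the queries above, and the SQL runs through Python's sqlite3 module (SQLite 3.25 or newer) rather than MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Rankings (userID INTEGER, beatmapID INTEGER, performance INTEGER)")

# Hypothetical data: user 1 has a repeated best score on beatmap 1.
conn.executemany("INSERT INTO Rankings VALUES (?, ?, ?)", [
    (1, 1, 100), (1, 1, 100), (1, 2, 90), (1, 3, 80),
    (1, 4, 70), (1, 5, 60), (1, 6, 50),
    (2, 1, 30), (2, 1, 40), (2, 2, 20),
])

query = """
SELECT ROW_NUMBER() OVER (ORDER BY totalperformance DESC) AS rnk,
       userID, totalperformance
FROM (
    SELECT userID, SUM(performance) AS totalperformance
    FROM (
        SELECT userID, performance,
               ROW_NUMBER() OVER (PARTITION BY userID
                                  ORDER BY performance DESC) AS rn
        FROM (
            -- step 1: one row per (user, beatmap), keeping the best score
            SELECT userID, beatmapID, MAX(performance) AS performance
            FROM Rankings
            GROUP BY userID, beatmapID
        ) best
    ) ranked
    WHERE rn <= 5          -- step 2: top 5 beatmaps per user
    GROUP BY userID        -- step 3: sum them
) totals
ORDER BY rnk
"""
ranking = conn.execute(query).fetchall()
print(ranking)
```

User 1 totals 100 + 90 + 80 + 70 + 60 = 400: the duplicate 100 on beatmap 1 is collapsed by the GROUP BY before the top 5 are chosen.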
DEMO
I have the following column names:
customer_email
increment_id
other_id (pseudo name)
created_at
increment_id and other_id will be unique; customer_email will have duplicates. As the results are returned, I want to know which occurrence of the email each row is.
For each row, I want to know how many times the customer_email value has shown up so far. There will be an order by clause at the end on the created_at field, and I also plan to add a where clause of where occurrences < 2
I am querying > 5 million rows but performance isn't too important because I'll be running this as a report on a read-replica database from production. In my use case, I will sacrifice performance for robustness.
| customer_email | increment_id | other_id | created_at          | occurrences <- I want this |
|----------------|--------------|----------|---------------------|----------------------------|
| joe@test.com   | 1            | 81       | 2019-11-00 00:00:00 | 1                          |
| sue@test.com   | 2            | 82       | 2019-11-00 00:01:00 | 1                          |
| bill@test.com  | 3            | 83       | 2019-11-00 00:02:00 | 1                          |
| joe@test.com   | 4            | 84       | 2019-11-00 00:03:00 | 2                          |
| mike@test.com  | 5            | 85       | 2019-11-00 00:04:00 | 1                          |
| sue@test.com   | 6            | 86       | 2019-11-00 00:05:00 | 2                          |
| joe@test.com   | 7            | 87       | 2019-11-00 00:06:00 | 3                          |
You can use variables in earlier versions of MySQL:
select t.*,
       (@rn := if(@ce = customer_email, @rn + 1,
                  if(@ce := customer_email, 1, 1)
                 )
       ) as occurrences
from (select t.*
      from t
      order by customer_email, created_at
     ) t cross join
     (select @ce := '', @rn := 0) params;
In MySQL 8+, I would recommend row_number():
select t.*,
row_number() over (partition by customer_email order by created_at) as occurrences
from t;
If you are running MySQL 8.0, you can just do a window count:
select
t.*,
count(*) over(partition by customer_email order by created_at) occurences
from mytable t
You don't need an order by clause at the end of the query for this to work (but you need one if you want to order the results).
If you need to filter on the results of the window count, an additional level is needed, since window functions cannot be used in the where clause of a query:
select *
from (
select
t.*,
count(*) over(partition by customer_email order by created_at) occurences
from mytable t
) t
where occurences < 2
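As a quick check of the window-function approach against the sample rows, here is a small sketch that numbers each email's rows and then filters on that number in an outer query. It runs through Python's sqlite3 module instead of MySQL 8 (an assumption for portability; SQLite 3.25 or newer is needed), and the created_at values use day 01 instead of the question's day 00, which is also an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (customer_email TEXT, increment_id INTEGER, "
    "other_id INTEGER, created_at TEXT)"
)
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", [
    ("joe@test.com", 1, 81, "2019-11-01 00:00:00"),
    ("sue@test.com", 2, 82, "2019-11-01 00:01:00"),
    ("bill@test.com", 3, 83, "2019-11-01 00:02:00"),
    ("joe@test.com", 4, 84, "2019-11-01 00:03:00"),
    ("mike@test.com", 5, 85, "2019-11-01 00:04:00"),
    ("sue@test.com", 6, 86, "2019-11-01 00:05:00"),
    ("joe@test.com", 7, 87, "2019-11-01 00:06:00"),
])

# Inner query: running occurrence number per email, oldest row first.
numbered = """
SELECT increment_id, created_at,
       ROW_NUMBER() OVER (PARTITION BY customer_email
                          ORDER BY created_at) AS occurrences
FROM t
"""

all_rows = conn.execute(
    f"SELECT increment_id, occurrences FROM ({numbered}) AS n ORDER BY created_at"
).fetchall()

# The filter has to sit outside the subquery that computes the window function.
first_only = conn.execute(
    f"SELECT increment_id FROM ({numbered}) AS n "
    "WHERE occurrences < 2 ORDER BY created_at"
).fetchall()

print(all_rows)
print(first_only)
```

The first result reproduces the wanted occurrences column (1, 1, 1, 2, 1, 2, 3); the second keeps only each customer's first row.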
This is an example of my table:
+-----+-----+------------+--------+-------------+--------------+
| LID | AID | Created | TypeID | PaymentDate | PaymentValue |
+-----+-----+------------+--------+-------------+--------------+
| 1 | 529 | 2017-05-12 | 1 | 2017-05-12 | 100 |
+-----+-----+------------+--------+-------------+--------------+
| 2 | 529 | 2018-04-10 | 4 | 2018-04-10 | 200 |
+-----+-----+------------+--------+-------------+--------------+
| 3 | 441 | 2014-01-23 | 3 | 2014-01-23 | 300 |
+-----+-----+------------+--------+-------------+--------------+
| 4 | 324 | 2017-09-14 | 1 | 2017-09-14 | 400 |
+-----+-----+------------+--------+-------------+--------------+
| 5 | 111 | 2018-05-12 | 0 | 2018-05-12 | 340 |
+-----+-----+------------+--------+-------------+--------------+
| 6 | 529 | 2018-05-12 | 1 | 2018-05-12 | 100 |
+-----+-----+------------+--------+-------------+--------------+
| 7 | 529 | 2018-06-12 | 1 | 2018-05-12 | 100 |
+-----+-----+------------+--------+-------------+--------------+
| 8 | 529 | 2018-07-12 | 1 | 2018-05-12 | 100 |
+-----+-----+------------+--------+-------------+--------------+
| 9 | 529 | 2018-08-12 | 1 | 2018-05-12 | 100 |
+-----+-----+------------+--------+-------------+--------------+
| 10 | 529 | 2018-09-12 | 1 | 2018-05-12 | 100 |
+-----+-----+------------+--------+-------------+--------------+
| 11 | 529 | 2018-01-12 | 1 | 2018-05-12 | 100 |
+-----+-----+------------+--------+-------------+--------------+
| 12 | 529 | 2018-05-14 | 1 | 2018-05-12 | 100 |
+-----+-----+------------+--------+-------------+--------------+
| 13 | 529 | 2018-05-21 | 1 | 2018-05-12 | 100 |
+-----+-----+------------+--------+-------------+--------------+
| 14 | 529 | 2018-03-12 | 1 | 2018-05-12 | 100 |
+-----+-----+------------+--------+-------------+--------------+
Here is another table:
+-----+-------+
| ID |caption|
+-----+-------+
| 0 | bad |
+-----+-------+
| 1 | good |
+-----+-------+
I need to get the 10 latest records per AID. If there are fewer than 10 records for some AID, I still need ten rows, with "No payment date" in the PaymentDate and Created fields, NULL in TypeID and 0 in PaymentValue. I can get the 10 or fewer latest records with
select *
from (select *,
             (@rn := if(@c = AID, @rn + 1,
                        if(@c := AID, 1, 1)
                       )
             ) as rn
      from history cross join
           (select @rn := 0, @c := -1) params
      order by AID, Created desc
     ) t
having rn <= 10;
But I don't know how to force MySQL to output 10 rows for each AID. Please help.
The result should be in the form
AID,TypeId,Created,Caption
I have done it.
This query needs a table of 10 row numbers combined with the distinct AID values in the table. I show the result for the payment value and the created date and leave the rest to you, since you will get the idea.
The critical part is to build a table with 10 rows for each distinct AID, so 40 rows in table r for the sample data. Then do a left join to table t, which is similar to what you have done. Table t assigns a rank to at most 10 records per AID. Any rank up to 10 that is missing from t is filled in by table r. Coalesce assigns the default values, such as 0 for the amount and 'no create date' for the date.
http://sqlfiddle.com/#!9/855c21/2
SELECT coalesce(r.aid, t.aid) as aid,
coalesce(t.paymentvalue, 0) as paymentvalue,
coalesce(cast(t.created as char), 'no create date') as created
FROM (select * from (
select 1 as rw union
select 2 union select 3
union select 4 union select 5
union select 6 union select 7
union select 8 union select 9
union select 10) u
cross join (select distinct aid
from history) h
) as r
LEFT JOIN (
SELECT a.aid, a.paymentvalue,
a.created, count(*) rn
FROM history a
JOIN history b
ON a.aid = b.aid
AND a.created <= b.created
GROUP BY a.aid, a.created
HAVING COUNT(*) <= 10) t
on r.rw=t.rn and r.aid=t.aid
order by aid, created;
I have added a RIGHT JOIN to bring in the null rows that top up each AID to 10 (or n) rows. Initially I used SELECT 1 UNION SELECT 2 ... to generate the 10 rows. To make it easier to increase the number of rows (say to 100), I am trying this idea of a generate_series equivalent for MySQL. For this to work, the number of rows in the history table must be greater than or equal to the number of rows required per AID.
select t1.lid
      ,t2.aid
      ,coalesce(t1.created, "no created date") as created
      ,t1.typeID
      ,coalesce(t1.paymentdate, "no payment date") as paymentDate
      ,coalesce(t1.paymentvalue, 0) as paymentValue
      ,t2.rn
from
(
    select *,
           (@rn := if(@c = AID, @rn + 1,
                      if(@c := AID, 1, 1)
                     )
           ) as rn
    from history cross join
         (select @rn := 0, @c := -1) params
    order by AID, Created desc
) t1
right join
( select *
  from (select distinct aid from history ) h1
  cross join
  (select rn -- generate table with n rows numbered from 1 to n
   from
     (select
        @num := 0) init
   cross join
     (select @num := @num + 1 rn
      from history ) t -- assume history has at least 10 rows
   limit
     10 ) h2 -- n = 10; change it to the number of rows per aid required
) t2
on t1.aid = t2.aid and t1.rn = t2.rn
order by t2.aid, t2.rn
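Where recursive CTEs are available (MySQL 8.0+), the number table does not have to be borrowed from history, which removes the requirement that history has at least n rows. The sketch below swaps in that technique, so it is not either answer's method: it generates 1..n with WITH RECURSIVE, cross joins it with the distinct AIDs, and left joins the ranked rows. It runs through Python's sqlite3 module (SQLite 3.25 or newer) on five rows taken from the sample table, with n = 3 instead of 10 to keep the output short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE history (LID INTEGER, AID INTEGER, Created TEXT, "
    "TypeID INTEGER, PaymentValue INTEGER)"
)
# A subset of the sample rows: AID 529 has four records, AID 441 has one.
conn.executemany("INSERT INTO history VALUES (?, ?, ?, ?, ?)", [
    (1, 529, "2017-05-12", 1, 100),
    (2, 529, "2018-04-10", 4, 200),
    (3, 441, "2014-01-23", 3, 300),
    (6, 529, "2018-05-12", 1, 100),
    (7, 529, "2018-06-12", 1, 100),
])

query = """
WITH RECURSIVE seq(rn) AS (          -- numbers 1..n
    SELECT 1
    UNION ALL
    SELECT rn + 1 FROM seq WHERE rn < :n
),
ranked AS (                          -- latest record of each AID gets rn = 1
    SELECT AID, Created, PaymentValue,
           ROW_NUMBER() OVER (PARTITION BY AID ORDER BY Created DESC) AS rn
    FROM history
)
SELECT a.AID, seq.rn,
       COALESCE(r.Created, 'No payment date') AS Created,
       COALESCE(r.PaymentValue, 0) AS PaymentValue
FROM (SELECT DISTINCT AID FROM history) a
CROSS JOIN seq
LEFT JOIN ranked r ON r.AID = a.AID AND r.rn = seq.rn
ORDER BY a.AID, seq.rn
"""
padded = conn.execute(query, {"n": 3}).fetchall()
for row in padded:
    print(row)
```

Every AID comes back with exactly n rows: real records first, then placeholder rows produced by the unmatched side of the left join.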
I have the following ranking table:
CREATE TABLE IF NOT EXISTS ranking(
user_id int(11) unsigned NOT NULL,
create_date date NOT NULL,
score int(8),
PRIMARY KEY (user_id, create_date)
);
I want to get each user's maximum number of consecutive days during which the score is greater than or equal to 10. For example, if the table contains the following entries, the expected output (user_id, max_number) is listed below. How do I write this query in MySQL?
user_id | create_date | score
1 | 2017-03-08 | 40
1 | 2017-03-07 | 50
1 | 2017-03-06 | 60
1 | 2017-03-05 | 0
1 | 2017-03-04 | 70
1 | 2017-03-03 | 80
1 | 2017-03-02 | 0
2 | 2017-03-10 | 20
2 | 2017-03-09 | 30
2 | 2017-03-08 | 40
2 | 2017-03-07 | 50
2 | 2017-03-06 | 0
2 | 2017-03-05 | 60
2 | 2017-03-04 | 70
Output:
user_id | max_number
1 | 3
2 | 4
You can use user variables for this task:
select user_id, max(cnt) max_cnt
from (
    select user_id, date_group, count(*) cnt
    from (
        select t.*, date_sub(create_date, interval(@rn := @rn + 1) day) date_group
        from your_table t, (select @rn := 0) x
        where score >= 10
        order by user_id, create_date
    ) t
    group by user_id, date_group
) t
group by user_id;
Produces:
user_id max_cnt
1 3
2 4
Demo: Rextester
How it works:
We generate a sequence number in the order of user_id and create_date (both increasing) and subtract that many days from create_date. Within a run of consecutive dates, the date and the sequence number both advance by one per row, so the difference stays constant and identifies the group; a gap in the dates changes it. Counting the rows per group and taking the maximum per user gives the result.
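The date-minus-sequence-number trick can be checked on the question's sample rows. This is a sketch under assumptions: it uses ROW_NUMBER() instead of a user variable and SQLite's julianday() instead of MySQL's date_sub(), and it runs through Python's sqlite3 module (SQLite 3.25 or newer), so it illustrates the idea rather than reproducing the MySQL query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ranking (user_id INTEGER, create_date TEXT, score INTEGER, "
    "PRIMARY KEY (user_id, create_date))"
)
conn.executemany("INSERT INTO ranking VALUES (?, ?, ?)", [
    (1, "2017-03-08", 40), (1, "2017-03-07", 50), (1, "2017-03-06", 60),
    (1, "2017-03-05", 0), (1, "2017-03-04", 70), (1, "2017-03-03", 80),
    (1, "2017-03-02", 0),
    (2, "2017-03-10", 20), (2, "2017-03-09", 30), (2, "2017-03-08", 40),
    (2, "2017-03-07", 50), (2, "2017-03-06", 0), (2, "2017-03-05", 60),
    (2, "2017-03-04", 70),
])

query = """
SELECT user_id, MAX(cnt) AS max_number
FROM (
    SELECT user_id, COUNT(*) AS cnt
    FROM (
        SELECT user_id,
               -- day number minus row number is constant inside a streak
               julianday(create_date)
                 - ROW_NUMBER() OVER (PARTITION BY user_id
                                      ORDER BY create_date) AS date_group
        FROM ranking
        WHERE score >= 10
    ) g
    GROUP BY user_id, date_group
) c
GROUP BY user_id
ORDER BY user_id
"""
streaks = conn.execute(query).fetchall()
print(streaks)
```

For user 1 the qualifying dates 03-03, 03-04, 03-06, 03-07, 03-08 fall into two groups of sizes 2 and 3, so the maximum is 3. A user with no day scoring at least 10 would not appear in the result at all.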