MySQL: ordering partitions randomly

table1 has 3 columns in my database: id, category, timestamp. I need to query the newest 3 rows from each category:
WITH ranked_rows AS
(SELECT t.*, ROW_NUMBER() OVER (PARTITION BY category ORDER BY t.timestamp DESC) AS rn
FROM table1 AS t)
SELECT ranked_rows.* FROM ranked_rows WHERE rn<=3
Now I need to randomly select 10 partitions from the result (note that each partition has 3 rows). How can I do that?

There are various methods. One is:
WITH ranked_rows AS (
SELECT t.*,
ROW_NUMBER() OVER (PARTITION BY category ORDER BY t.timestamp DESC) AS seqnum,
DENSE_RANK() OVER (ORDER BY MD5(category)) as catnum
FROM table1 t
)
SELECT ranked_rows.*
FROM ranked_rows
WHERE seqnum <= 3 AND catnum <= 10;
The md5() ordering only makes the selection look random: it is deterministic, so the same 10 categories come back on every run.

If you want a truly random choice of categories, here is one way:
with categorycte as (
select category , rand() randomcatid
from table1
group by category
),ranked_rows AS
(
SELECT t.*
, ROW_NUMBER() OVER (PARTITION BY category ORDER BY t.timestamp DESC) AS rn
, dense_rank() over (order by randomcatid) catnum
FROM table1 AS t
join categorycte c on t.category = c.category
)
SELECT ranked_rows.* FROM ranked_rows
WHERE rn<=3 and catnum <= 10;
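
A variant of the same idea, sketched with Python's built-in sqlite3 module so it can be run without a MySQL server. Instead of DENSE_RANK(), it picks the 10 categories with ORDER BY random() LIMIT 10 in a CTE and joins that back to the ranked rows. The table contents, the integer timestamps and the helper names build_demo_db and newest_three_of_ten_random are made up for illustration, and MySQL spells the function RAND() rather than random().

```python
import sqlite3

QUERY = """
WITH picked AS (              -- 10 categories chosen at random
    SELECT category
    FROM table1
    GROUP BY category
    ORDER BY random()         -- SQLite spelling; MySQL uses RAND()
    LIMIT 10
),
ranked_rows AS (              -- number rows newest-first within each category
    SELECT t.*,
           ROW_NUMBER() OVER (PARTITION BY t.category
                              ORDER BY t.timestamp DESC) AS rn
    FROM table1 AS t
)
SELECT r.id, r.category, r.timestamp
FROM ranked_rows AS r
JOIN picked AS p ON p.category = r.category
WHERE r.rn <= 3
ORDER BY r.category, r.timestamp DESC
"""


def build_demo_db():
    """Create an in-memory table1 with 15 made-up categories of 5 rows each."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE table1 (id INTEGER PRIMARY KEY, category TEXT, timestamp INTEGER)"
    )
    rows = [(f"cat{c:02d}", ts) for c in range(15) for ts in range(1, 6)]
    conn.executemany("INSERT INTO table1 (category, timestamp) VALUES (?, ?)", rows)
    return conn


def newest_three_of_ten_random(conn):
    """Return (id, category, timestamp) for the newest 3 rows of 10 random categories."""
    return conn.execute(QUERY).fetchall()


if __name__ == "__main__":
    for row in newest_three_of_ten_random(build_demo_db()):
        print(row)
```

Running it twice should usually print a different set of 10 categories, while each category always contributes its 3 newest rows.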

Related

I need to get the last created eligible rider ids and pinged rider ids according to an orderId using a SQL query

I need to get my data set into the shape of this table.
I am getting the eligible set like this, and I also need to group_concat the pinged set:
x.id IN (SELECT MAX(x.id) FROM x WHERE ping rider id IS NULL GROUP BY orderId)
You can assign a group based on the cumulative number of non-empty values in eligible_riders. Then aggregate per group and keep the last group for each order:
select og.*
from (select order_id, grp, max(eligible_riders) as eligible_riders,
group_concat(rider_id) as riders,
row_number() over (partition by order_id order by min(id) desc) as seqnum
from (select t.*,
sum(eligible_riders <> '') over (partition by order_id order by id) as grp
from t
) t
group by order_id, grp
) og
where seqnum = 1;
Hmmm . . . You could also do this with a correlated subquery, which might look a bit simpler:
select order_id, max(eligible_riders) as eligible_riders,
group_concat(rider_id) as riders
from t
where t.id >= (select max(t2.id)
from t t2
where t2.order_id = t.order_id and
t2.eligible_riders <> ''
)
group by order_id;
For performance, you want an index on (order_id, eligible_riders).
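
To make the grouping step concrete, here is a small sketch using Python's sqlite3 module. The original table is not shown, so the layout is an assumption: a row with a non-empty eligible_riders starts a new round for the order, and the rows after it record the pinged rider_id values. The sample rows and the helper names build_demo_db, groups and last_round are hypothetical.

```python
import sqlite3

# Assumed layout: (id, order_id, rider_id, eligible_riders).
# A non-empty eligible_riders marks the start of a new round for that order.
ROWS = [
    (1, 100, None, "5,6,7"),
    (2, 100, 5, ""),
    (3, 100, 6, ""),
    (4, 100, None, "8,9"),
    (5, 100, 8, ""),
    (6, 100, 9, ""),
    (7, 200, None, "1,2"),
    (8, 200, 1, ""),
]

GROUPS = """
SELECT id, order_id,
       SUM(eligible_riders <> '') OVER (PARTITION BY order_id ORDER BY id) AS grp
FROM t
ORDER BY id
"""

LAST_ROUND = """
SELECT order_id,
       MAX(eligible_riders)   AS eligible_riders,
       GROUP_CONCAT(rider_id) AS riders
FROM t
WHERE t.id >= (SELECT MAX(t2.id)
               FROM t AS t2
               WHERE t2.order_id = t.order_id
                 AND t2.eligible_riders <> '')
GROUP BY order_id
ORDER BY order_id
"""


def build_demo_db():
    """Create an in-memory table t holding the hypothetical rows above."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t (id INTEGER PRIMARY KEY, order_id INTEGER, "
        "rider_id INTEGER, eligible_riders TEXT NOT NULL DEFAULT '')"
    )
    conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", ROWS)
    return conn


def groups(conn):
    """Show the running count of round-start rows, i.e. the group of each row."""
    return conn.execute(GROUPS).fetchall()


def last_round(conn):
    """Return the last eligible set and its pinged riders for each order."""
    return conn.execute(LAST_ROUND).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    print(groups(conn))
    print(last_round(conn))
```

The running sum only increases on rows that start a round, so every row of a round shares one grp value, and the correlated subquery keeps just the rows from the latest round start onward.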

only max(value) from union by several columns

I want to retrieve one result for each [Group] by the highest Time.
Result of current code:
SELECT [Group], ArticleNumber, max(TimeTrue) as Time
FROM PerformanceOpc (NOLOCK) WHERE ([Group]='Pack2' OR [Group]='70521-030')
GROUP BY [Group], ArticleNumber
UNION
SELECT [Group], ArticleNumber, max(StopTime) as Time
FROM StoppageOpc (NOLOCK) WHERE ([Group]='Pack2' OR [Group]='70521-030')
GROUP BY [Group], ArticleNumber
ORDER BY Time DESC
The result should be only two records (csv):
Group,ArticleNumber,Time
70521-030,,2021-03-15 13:50:15
Pack2,183026,2021-03-15 13:47:39
Hmmm . . . you would seem to want to union all before aggregating:
SELECT [Group], ArticleNumber, max(Time) as Time
FROM ((SELECT [Group], ArticleNumber, TimeTrue as Time
FROM PerformanceOpc
WHERE [Group] IN ('Pack2', '70521-030')
) UNION ALL
(SELECT [Group], ArticleNumber, StopTime as Time
FROM StoppageOpc
WHERE [Group] IN ('Pack2', '70521-030')
)
) g
GROUP BY [Group], ArticleNumber;
This returns one row per group and article, which seems to be what your query is doing.
If you really want only one row per group, then you want ROW_NUMBER() and not aggregation:
SELECT g.*
FROM (SELECT g.*, ROW_NUMBER() OVER (PARTITION BY [Group] ORDER BY time DESC) as seqnum
FROM ((SELECT [Group], ArticleNumber, TimeTrue as Time
FROM PerformanceOpc
WHERE [Group] IN ('Pack2', '70521-030')
) UNION ALL
(SELECT [Group], ArticleNumber, StopTime as Time
FROM StoppageOpc
WHERE [Group] IN ('Pack2', '70521-030')
)
) g
) g
WHERE seqnum = 1;
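
A runnable sketch of the ROW_NUMBER() variant, using Python's sqlite3 module in place of SQL Server. The two expected rows come from the question; every other row and the helper names build_demo_db and latest_per_group are invented for illustration. SQLite accepts the [Group] bracket quoting but does not allow parentheses around the members of a UNION ALL, so those are dropped here.

```python
import sqlite3

QUERY = """
SELECT [Group], ArticleNumber, Time
FROM (SELECT g.*,
             ROW_NUMBER() OVER (PARTITION BY [Group] ORDER BY Time DESC) AS seqnum
      FROM (SELECT [Group], ArticleNumber, TimeTrue AS Time
            FROM PerformanceOpc
            WHERE [Group] IN ('Pack2', '70521-030')
            UNION ALL
            SELECT [Group], ArticleNumber, StopTime AS Time
            FROM StoppageOpc
            WHERE [Group] IN ('Pack2', '70521-030')
           ) AS g
     ) AS g2
WHERE seqnum = 1
ORDER BY Time DESC
"""


def build_demo_db():
    """Create both tables in memory with a few hypothetical rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE PerformanceOpc ([Group] TEXT, ArticleNumber TEXT, TimeTrue TEXT)")
    conn.execute("CREATE TABLE StoppageOpc ([Group] TEXT, ArticleNumber TEXT, StopTime TEXT)")
    conn.executemany(
        "INSERT INTO PerformanceOpc VALUES (?, ?, ?)",
        [
            ("Pack2", "183026", "2021-03-15 13:47:39"),
            ("Pack2", "183026", "2021-03-15 12:00:00"),
            ("70521-030", "183020", "2021-03-15 09:10:00"),
            ("Other", "1", "2021-03-15 14:00:00"),  # filtered out by the WHERE clause
        ],
    )
    conn.executemany(
        "INSERT INTO StoppageOpc VALUES (?, ?, ?)",
        [
            ("70521-030", "", "2021-03-15 13:50:15"),
            ("Pack2", "183026", "2021-03-15 13:00:00"),
        ],
    )
    return conn


def latest_per_group(conn):
    """Return the single newest row per [Group] across both tables."""
    return conn.execute(QUERY).fetchall()


if __name__ == "__main__":
    for row in latest_per_group(build_demo_db()):
        print(row)
```

The timestamps are stored as ISO-formatted text in this sketch, which sorts correctly as strings; a real SQL Server table would use a datetime type.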
Try select top 1 with order by into temporary tables and then query them with union:
SELECT top 1 [Group], ArticleNumber, max(TimeTrue) as Time into #tmp1
FROM PerformanceOpc (NOLOCK) WHERE ([Group]='Pack2' OR [Group]='70521-030')
GROUP BY [Group], ArticleNumber
order by Time desc
SELECT top 1 [Group], ArticleNumber, max(StopTime) as Time into #tmp2
FROM StoppageOpc (NOLOCK) WHERE ([Group]='Pack2' OR [Group]='70521-030')
GROUP BY [Group], ArticleNumber
ORDER BY Time DESC
select * from #tmp1
union
select * from #tmp2
drop table #tmp1
drop table #tmp2

Find the most frequent value in MySQL, display all in case of a tie

For example, 1, 2 and 3 are all tied as the most frequent value. How can I return all of them when there is a tie?
id
1
1
1
2
2
2
3
3
3
4
You could try:
SELECT id
FROM yourTable
GROUP BY id
HAVING COUNT(*) = (SELECT COUNT(*) FROM yourTable
GROUP BY id ORDER BY COUNT(*) DESC LIMIT 1);
On MySQL 8+, we can use RANK here:
WITH cte AS (
SELECT id, RANK() OVER (ORDER BY COUNT(*) DESC) rnk
FROM yourTable
GROUP BY id
)
SELECT id
FROM cte
WHERE rnk = 1;
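
Both queries can be checked against the sample data from the question with a short script. This is only a sketch that runs them in SQLite through Python's sqlite3 module rather than in MySQL; the helper names build_demo_db, most_frequent_having and most_frequent_rank are made up, and an ORDER BY id is added so the output order is stable.

```python
import sqlite3

HAVING_QUERY = """
SELECT id
FROM yourTable
GROUP BY id
HAVING COUNT(*) = (SELECT COUNT(*) FROM yourTable
                   GROUP BY id ORDER BY COUNT(*) DESC LIMIT 1)
ORDER BY id
"""

RANK_QUERY = """
WITH cte AS (
    SELECT id, RANK() OVER (ORDER BY COUNT(*) DESC) AS rnk
    FROM yourTable
    GROUP BY id
)
SELECT id
FROM cte
WHERE rnk = 1
ORDER BY id
"""


def build_demo_db():
    """Load the ids from the question: 1, 2 and 3 three times each, 4 once."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE yourTable (id INTEGER)")
    ids = [1, 1, 1, 2, 2, 2, 3, 3, 3, 4]
    conn.executemany("INSERT INTO yourTable VALUES (?)", [(i,) for i in ids])
    return conn


def most_frequent_having(conn):
    """Ids whose count equals the highest count (HAVING version)."""
    return [row[0] for row in conn.execute(HAVING_QUERY)]


def most_frequent_rank(conn):
    """Ids ranked first by count (window-function version)."""
    return [row[0] for row in conn.execute(RANK_QUERY)]


if __name__ == "__main__":
    conn = build_demo_db()
    print(most_frequent_having(conn))
    print(most_frequent_rank(conn))
```

RANK() gives tied counts the same rank, which is why all three ids survive the rnk = 1 filter; ROW_NUMBER() would keep only one of them.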

How can I filter out repeated rows in a MySQL query

I have a MySQL table like the one below. It is used to store documents with versioning.
I want to select the latest row for each docid (the one with the highest major version and minor version). All other rows with the same docid should be eliminated, so that only the document with the highest major_version and minor_version is fetched. So I want the result below.
In MySQL 8.0, you can filter with row_number():
select *
from (
select
t.*,
row_number() over(partition by id, docid order by major_version desc, minor_version desc) rn
from mytable t
) t
where rn = 1
In earlier versions, you can filter with a correlated subquery. Assuming that you have a primary key in the table, say column pk, you can do:
select t.*
from mytable t
where t.pk = (
select t1.pk
from mytable t1
where t1.id = t.id and t1.docid = t.docid
order by t1.major_version desc, t1.minor_version desc
limit 1
)
For performance, consider an index on (id, docid, major_version, minor_version).
Without a unique column that can be used as primary key, it is a bit more complicated. One way to do it is to use not exists:
select t.*
from mytable t
where not exists (
select 1
from mytable t1
where
t1.id = t.id
and t1.docid = t.docid
and (
t1.major_version > t.major_version
or (t1.major_version = t.major_version and t1.minor_version > t.minor_version)
)
)
One method uses row_number():
select t.*
from (select t.*,
row_number() over (partition by docid order by major_version desc, minor_version desc) as seqnum
from t
) t
where seqnum = 1;
This is a pain in earlier versions. Probably the simplest and most efficient method is to use variables:
select t.*
from (select t.*,
(@rn := if(@d = docid, @rn + 1,
if(@d := docid, 1, 1)
)
) as rn
from (select t.*
from t
order by docid, major_version desc, minor_version desc
) t cross join
(select @rn := 0, @d := '') params
) t
where rn = 1;
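
The row_number() and not exists approaches can be compared on a small data set. The sketch below runs both in SQLite through Python's sqlite3 module; the table contents are hypothetical (the original table is not shown), both queries partition by docid only, and the helper names build_demo_db, latest_row_number and latest_not_exists are made up.

```python
import sqlite3

ROW_NUMBER_QUERY = """
SELECT docid, major_version, minor_version
FROM (SELECT t.*,
             ROW_NUMBER() OVER (PARTITION BY docid
                                ORDER BY major_version DESC, minor_version DESC) AS seqnum
      FROM mytable AS t) AS ranked
WHERE seqnum = 1
ORDER BY docid
"""

NOT_EXISTS_QUERY = """
SELECT t.docid, t.major_version, t.minor_version
FROM mytable AS t
WHERE NOT EXISTS (
    SELECT 1
    FROM mytable AS t1
    WHERE t1.docid = t.docid
      AND (t1.major_version > t.major_version
           OR (t1.major_version = t.major_version
               AND t1.minor_version > t.minor_version))
)
ORDER BY t.docid
"""


def build_demo_db():
    """Create mytable with a few made-up document versions."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mytable (pk INTEGER PRIMARY KEY, docid TEXT, "
        "major_version INTEGER, minor_version INTEGER)"
    )
    rows = [
        ("A", 1, 0), ("A", 1, 2), ("A", 2, 1), ("A", 2, 0),
        ("B", 1, 0), ("B", 1, 10),
    ]
    conn.executemany(
        "INSERT INTO mytable (docid, major_version, minor_version) VALUES (?, ?, ?)", rows
    )
    return conn


def latest_row_number(conn):
    """Latest version per docid using ROW_NUMBER()."""
    return conn.execute(ROW_NUMBER_QUERY).fetchall()


def latest_not_exists(conn):
    """Latest version per docid using NOT EXISTS (no window functions needed)."""
    return conn.execute(NOT_EXISTS_QUERY).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    print(latest_row_number(conn))
    print(latest_not_exists(conn))
```

Both orderings must be descending: with the default ascending order, seqnum = 1 would select the oldest version instead of the latest.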

Group by date and take the last one

This is my table:
What I'm trying to do is take the latest availability (id_dispo) of a user, per caserne. For example, I should get this result:
id id_user id_caserne id_dispo created_at
31 21 12 1 2019-10-24 01:21:46
33 21 13 1 2019-10-23 20:17:21
I've tried this SQL, but it does not seem to work all the time:
SELECT * FROM
( SELECT id, id_dispo, id_user, id_caserne, MAX(created_at)
FROM disponibilites GROUP BY id_user, id_caserne, id_dispo
ORDER BY created_at desc ) AS sub
GROUP BY id_user, id_caserne
What am I doing wrong?
I would simply use filtering in the where clause using a correlated subquery:
select d.*
from disponibilites d
where d.created_at = (select max(d2.created_at)
from disponibilites d2
where d2.id_user = d.id_user
);
EDIT:
Based on your comments:
select d.*
from disponibilites d
where d.created_at = (select max(d2.created_at)
from disponibilites d2
where d2.id_user = d.id_user and
d2.id_caserne = d.id_caserne and
date(d2.created_at) = date(d.created_at)
);
You can use a correlated subquery, as demonstrated by Gordon Linoff, or a window function if your RDBMS supports it:
select * from (
select
t.*,
rank() over(partition by id_caserne, id_user order by created_at desc) rn
from disponibilites t
) x
where rn = 1
Another option is to use a correlated subquery without aggregation, only with a sort and limit:
select *
from mytable t
where created_at = (
select created_at
from mytable t1
where t1.id_user = t.id_user and t1.id_caserne = t.id_caserne
order by created_at desc
limit 1
)
With an index on (id_user, id_caserne, created_at), this should be a very efficient option.
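
As a quick check, the window-function query and the sort-and-limit subquery can be run side by side. This sketch uses SQLite through Python's sqlite3 module; the two expected rows are taken from the question, while the older rows (ids 29 and 30) and the helper names build_demo_db, latest_window and latest_sort_limit are invented for illustration.

```python
import sqlite3

WINDOW_QUERY = """
SELECT id, id_user, id_caserne, id_dispo, created_at
FROM (SELECT t.*,
             RANK() OVER (PARTITION BY id_caserne, id_user
                          ORDER BY created_at DESC) AS rn
      FROM disponibilites AS t) AS x
WHERE rn = 1
ORDER BY id
"""

SORT_LIMIT_QUERY = """
SELECT id, id_user, id_caserne, id_dispo, created_at
FROM disponibilites AS t
WHERE created_at = (SELECT t1.created_at
                    FROM disponibilites AS t1
                    WHERE t1.id_user = t.id_user
                      AND t1.id_caserne = t.id_caserne
                    ORDER BY t1.created_at DESC
                    LIMIT 1)
ORDER BY id
"""


def build_demo_db():
    """Create disponibilites with the two expected rows plus two older, made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE disponibilites (id INTEGER PRIMARY KEY, id_user INTEGER, "
        "id_caserne INTEGER, id_dispo INTEGER, created_at TEXT)"
    )
    rows = [
        (29, 21, 12, 2, "2019-10-23 18:00:00"),  # hypothetical older row
        (30, 21, 13, 3, "2019-10-23 10:00:00"),  # hypothetical older row
        (31, 21, 12, 1, "2019-10-24 01:21:46"),
        (33, 21, 13, 1, "2019-10-23 20:17:21"),
    ]
    conn.executemany("INSERT INTO disponibilites VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def latest_window(conn):
    """Latest row per (id_user, id_caserne) using RANK()."""
    return conn.execute(WINDOW_QUERY).fetchall()


def latest_sort_limit(conn):
    """Latest row per (id_user, id_caserne) using a correlated ORDER BY ... LIMIT 1."""
    return conn.execute(SORT_LIMIT_QUERY).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    print(latest_window(conn))
    print(latest_sort_limit(conn))
```

Both versions return more than one row for a user and caserne if two rows share the same latest created_at; use row_number() with a tie-breaking column if exactly one row is required.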
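You can join the max(created_at) of each user and caserne back to your original table:
select t1.* from disponibilites t1
inner join
(select max(created_at) as max_created_at, id_user, id_caserne
from disponibilites
group by id_user, id_caserne) t2
on t2.id_user = t1.id_user
and t2.id_caserne = t1.id_caserne
and t2.max_created_at = t1.created_at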