Group By value RAND() - mysql

Is it possible to get a random value from each group of a GROUP BY?
----------------
nID | val
---------------
A | XXX
A | YYY
B | L
B | M
B | N
B | P
----------------
With this SQL:
SELECT nID, VAL FROM T1 GROUP BY nID
My result is always:
nID val
--------
A XXX
B L
But I want a different result for every nID. Like:
nID val
--------
A YYY
B N
or
nID val
--------
A XXX
B P
Is this possible?
http://sqlfiddle.com/#!2/357b8/3

Use a sub-query.
SELECT r.nID,
(SELECT r1.val FROM T1 r1 WHERE r.nID=r1.nID ORDER BY rand() LIMIT 1) AS 'val' FROM T1 r
GROUP BY r.nID
http://sqlfiddle.com/#!2/357b8/18

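You can order the rows by RAND() in a derived table and then group the result.
Be aware that this relies on MySQL returning the first row of each group, which is not guaranteed: newer versions may ignore the inner ORDER BY, and ONLY_FULL_GROUP_BY rejects the query.
Like: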
SELECT nID, VAL FROM (
SELECT nID, VAL
FROM T1
ORDER BY RAND()
)AS subquery
GROUP BY nID

SELECT
t1.nID,
(SELECT
t2.val
FROM your_table t2
WHERE t1.nID = t2.nID ORDER BY rand() LIMIT 1
) AS val
FROM your_table t1
GROUP BY t1.nID ;

Try this:
SELECT nID, VAL
FROM (select nID, VAL from T1 order by rand()) as T
group by nID

The following solution is similar in spirit to those from xdazz or jonnyynnoj. But instead of SELECT FROM T1 GROUP BY nID I use a subquery to select all distinct IDs. I believe there is a chance that the performance might differ, so give this one a try as well.
SELECT nID,
(SELECT VAL
FROM T1
WHERE T1.nID = ids.nID
ORDER BY RAND()
LIMIT 1
) AS VAL
FROM (SELECT DISTINCT nID FROM T1) AS ids
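The correlated-subquery approach can be tried without a MySQL server. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in, so RANDOM() replaces MySQL's RAND(); the helper name random_val_per_group is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T1 (nID TEXT, val TEXT)")
conn.executemany(
    "INSERT INTO T1 VALUES (?, ?)",
    [("A", "XXX"), ("A", "YYY"), ("B", "L"), ("B", "M"), ("B", "N"), ("B", "P")],
)

def random_val_per_group(conn):
    # One row per distinct nID; the scalar subquery picks a random val
    # from that nID's rows each time the query runs.
    return conn.execute("""
        SELECT ids.nID,
               (SELECT t.val
                FROM T1 AS t
                WHERE t.nID = ids.nID
                ORDER BY RANDOM()   -- SQLite spelling; MySQL uses RAND()
                LIMIT 1) AS val
        FROM (SELECT DISTINCT nID FROM T1) AS ids
        ORDER BY ids.nID
    """).fetchall()

picks = random_val_per_group(conn)
print(picks)  # two rows, one for A and one for B; the val differs between runs
```

Because the subquery sorts only the rows of one nID at a time, an index on nID should keep each lookup small, though that has not been measured here.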

rand + rownum
SELECT t.*
, @rownum := @rownum+1 AS rowNum
FROM(
SELECT nID, VAL
FROM T1
ORDER BY RAND()
) AS t, (SELECT @rownum :=0) AS R
GROUP BY nID
ORDER BY nID, rowNum

Related

get the range of sequence values in table column

I have a list of values in a column and want to query the ranges of consecutive values.
Eg. If values are 1,2,3,4,5,9,11,12,13,14,17,18,19
I want to display
1-5,9,11-14,17-19
Assuming that each value is stored on a separate row, you can use some gaps-and-island technique here:
select case when min(val) <> max(val)
then concat(min(val), '-', max(val))
else min(val)
end val_range
from (select val, row_number() over(order by val) rn from mytable) t
group by val - rn
order by min(val)
The idea is to build groups of consecutive values by taking the difference between the value and an incrementing rank, which is computed using row_number() (available in MySQL 8.0):
Demo on DB Fiddle:
| val_range |
| :-------- |
| 1-5 |
| 9 |
| 11-14 |
| 17-19 |
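To see why grouping by val - rn works, it helps to look at the intermediate values. This is a small sketch using Python's built-in sqlite3 module instead of MySQL (it assumes the bundled SQLite is 3.25 or newer, which is when window functions were added); string concatenation uses || rather than concat().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (val INTEGER)")
values = [1, 2, 3, 4, 5, 9, 11, 12, 13, 14, 17, 18, 19]
conn.executemany("INSERT INTO mytable VALUES (?)", [(v,) for v in values])

# Step 1: val - rn stays constant inside a run of consecutive values
# and jumps whenever there is a gap.
islands = conn.execute("""
    SELECT val, rn, val - rn AS grp
    FROM (SELECT val, ROW_NUMBER() OVER (ORDER BY val) AS rn FROM mytable)
    ORDER BY val
""").fetchall()

# Step 2: group on that constant and report min-max per island.
ranges = [row[0] for row in conn.execute("""
    SELECT CASE WHEN MIN(val) <> MAX(val)
                THEN MIN(val) || '-' || MAX(val)
                ELSE CAST(MIN(val) AS TEXT)
           END AS val_range
    FROM (SELECT val, ROW_NUMBER() OVER (ORDER BY val) AS rn FROM mytable)
    GROUP BY val - rn
    ORDER BY MIN(val)
""")]

print(sorted({grp for _, _, grp in islands}))  # [0, 3, 4, 6]
print(ranges)                                  # ['1-5', '9', '11-14', '17-19']
```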
In earlier versions, you can emulate row_number() with a correlated subquery, or a user variable. The second option goes like:
select case when min(val) <> max(val)
then concat(min(val), '-', max(val))
else min(val)
end val_range
from (select @rn := 0) x
cross join (
select val, @rn := @rn + 1 rn
from (select val from mytable order by val) t
) t
group by val - rn
order by min(val)
As a complement to other answers:
select dn.val as dnval, min(up.val) as upval
from mytable up
join mytable dn
on dn.val <= up.val
where not exists (select 1 from mytable a where a.val = up.val + 1)
and not exists (select 1 from mytable b where b.val = dn.val - 1)
group by dn.val
order by dn.val;
1 5
9 9
11 14
17 19
Needless to say, using an OLAP function as @GNB does is orders of magnitude more efficient.
A short article on how to mimic OLAP functions in MySQL < 8 can be found at:
mysql-row_number
Fiddle
EDIT:
If another dimension is introduced (in this case p), something like:
select dn.p, dn.val as dnval, min(up.val) as upval
from mytable up
join mytable dn
on dn.val <= up.val
and dn.p = up.p
where not exists (select 1 from mytable a where a.val = up.val + 1 and a.p = up.p)
and not exists (select 1 from mytable b where b.val = dn.val - 1 and b.p = dn.p)
group by dn.p, dn.val
order by dn.p, dn.val;
can be used, see Fiddle2

Group database rows by one of two columns

The goal
I am trying to write a query to find duplicate rows. A row is duplicate when either Column A or Column B is the same.
Writing it so that both need to be the same is easy; just a simple GROUP BY A, B.
However, filtering by just one of the two is proving to be a bit more difficult. How would one go about doing this?
I've tried the following:
select distinct a as col_a,
b as col_b,
(
select count(*)
from table_name
where a = col_a
or b = col_b
) as duplicate_count
from table_name
having duplicate_count > 1;
but it does not feel like the right way to go about this, and with 84,000 rows it is also very slow.
Example
With the following table:
+----+------------------------+---+---------+
| id | name | a | b |
+----+------------------------+---+---------+
| 1 | Lorem ipsum | 1 | Donec |
+----+------------------------+---+---------+
| 2 | dolor sit | 2 | rhoncus |
+----+------------------------+---+---------+
| 3 | amet | 3 | rhoncus |
+----+------------------------+---+---------+
| 4 | consectetur adipiscing | 1 | primis |
+----+------------------------+---+---------+
| 5 | vulputate cursus | 4 | Aliquam |
+----+------------------------+---+---------+
Either result 1 or 4 (same A) and either result 2 or 3 (same B) should be returned, both with a duplicate_count of 2.
Which one of the two "duplicates" is returned does not matter.
Versions
On my local machine I use MySQL 5.7.24.
I just checked the live server, it uses 10.1.43-MariaDB.
You already know that this query:
select a, b
from tablename
group by a, b
having count(*) > 1
returns duplicates with both a and b equal.
You can get the rest of the duplicates for your requirement with EXISTS:
select t.a, t.b
from tablename t
where exists (
select 1 from tablename
where (a = t.a and b <> t.b) or (a <> t.a and b = t.b)
)
Or if you want them all use UNION ALL:
select a, b
from tablename
group by a, b
having count(*) > 1
union all
select t.a, t.b
from tablename t
where exists (
select 1 from tablename
where (a = t.a and b <> t.b) or (a <> t.a and b = t.b)
)
Update:
If you have an ID column then use EXISTS like this:
select t.*
from tablename t
where exists (
select 1 from tablename
where id <> t.id and (a = t.a or b = t.b)
)
Or if you want just 1 of the duplicates use id > t.id instead of id <> t.id.
See the demo.
Or with a self join:
select t.*
from tablename t inner join tablename tt
on (tt.a = t.a or tt.b = t.b) and tt.id <> t.id
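The difference between id <> t.id and id > t.id in the EXISTS variant is easy to check against the example table from the question. The sketch below uses Python's built-in sqlite3 module rather than MySQL or MariaDB; the helper name duplicate_ids and the inner alias d are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tablename (id INTEGER PRIMARY KEY, name TEXT, a INTEGER, b TEXT)"
)
conn.executemany("INSERT INTO tablename VALUES (?, ?, ?, ?)", [
    (1, "Lorem ipsum", 1, "Donec"),
    (2, "dolor sit", 2, "rhoncus"),
    (3, "amet", 3, "rhoncus"),
    (4, "consectetur adipiscing", 1, "primis"),
    (5, "vulputate cursus", 4, "Aliquam"),
])

def duplicate_ids(comparison):
    # comparison is "<>" (every row that has a duplicate)
    # or ">" (only the lowest id of each duplicate pair).
    sql = f"""
        SELECT t.id
        FROM tablename AS t
        WHERE EXISTS (
            SELECT 1 FROM tablename AS d
            WHERE d.id {comparison} t.id AND (d.a = t.a OR d.b = t.b)
        )
        ORDER BY t.id
    """
    return [row[0] for row in conn.execute(sql)]

all_duplicates = duplicate_ids("<>")
one_per_pair = duplicate_ids(">")
print(all_duplicates)  # [1, 2, 3, 4]
print(one_per_pair)    # [1, 2]
```

Row 5 never appears because neither its a nor its b value occurs in another row.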
Following solution works :
Another demo with a line that has duplication in a and b
CREATE TEMPORARY TABLE ab_duplicates (
a INTEGER
) AS
SELECT a, count(*) as cnt
FROM tablename
group by a, b
Having cnt > 1;
ALTER TABLE ab_duplicates ADD INDEX (a);
-- Select duplicates for a, but not for a and b
SELECT id, name, a, b
FROM (SELECT x.*, t.id, t.name, t.a, t.b,
@rn := IF(t.a = @a, @rn + 1, 1) rn,
@a := t.a,
ab.a as ab_exists
FROM (select @a := null, @rn := 0) x,
tablename t
LEFT JOIN ab_duplicates ab on ab.a = t.a
ORDER BY a
) a_duplicates
where rn = 2 and ab_exists is null
UNION
-- union duplicates for b, including duplicates for a and b
SELECT id, name, a, b
FROM (SELECT x.*, t.id, t.name, t.a, t.b,
@rn := IF(t.b = @b, @rn + 1, 1) rn,
@b := t.b
FROM (select @b := null, @rn := 0) x,
tablename t
ORDER BY b
) b_and_ab_duplicates
where rn = 2;
Previous solutions that only worked in some edge cases
Using group by and count() :
First finding ids with duplicates for a :
SELECT min(id) id, count(*) cnt from tablename t group by a having cnt > 1
-- this will work better if you have an index starting with a
Same with b :
SELECT min(id) id, count(*) cnt from tablename t group by b having cnt > 1
-- this will work better if you have an index starting with b
First solution :
UNION gives you the ids where there are duplicates for a or b (requires 2 indices)
SELECT min(id) id, count(*) cnt from tablename t group by a having cnt > 1
UNION
SELECT min(id) id, count(*) cnt from tablename t group by b having cnt > 1
Use the ids to filter the table, if you need more data from the table :
SELECT tablename.*
FROM (
SELECT min(id) id, count(*) cnt from tablename t group by a having cnt > 1
UNION
SELECT min(id) id, count(*) cnt from tablename t group by b having cnt > 1
) as ids
JOIN tablename on tablename.id = ids.id
Now this might not use an index, but you can use a temporary table to have one :
First solution, using a temporary table (might be faster) :
-- using a temporary table to set an index
CREATE TEMPORARY TABLE ids (
-- adds an index on id, for the JOIN in the result query
`id` INTEGER PRIMARY KEY
) as
SELECT id
FROM (
-- duplicates on a, requires an index (a) on tablename
SELECT min(id) id, count(*) cnt from tablename t group by a having cnt > 1
-- removes duplicates between both part of the UNION : this might be slow
-- if there cannot be duplicates on a and b at the same time, consider using UNION ALL
UNION
-- duplicates on b, requires an index (b) on tablename
SELECT min(id) id, count(*) cnt from tablename t group by b having cnt > 1
) tempids;
SELECT tablename.*
FROM ids -- using the temporary table, MUST be in the same database connection, will filter duplicates
JOIN tablename on tablename.id = ids.id;
I do not know if setting the index on the temporary table up front is better than adding one after populating the data :
-- you might want to postpone the index after the ids are set
-- using a temporary table to set an index
CREATE TEMPORARY TABLE ids2 (
`id` INTEGER
) as
SELECT id
FROM (
-- duplicates on a, requires an index (a) on tablename
SELECT min(id) id, count(*) cnt from tablename t group by a having cnt > 1
-- removes duplicates between both part of the UNION : this might be slow
-- if there cannot be duplicates on a and b at the same time, consider using UNION ALL
UNION
-- duplicates on b, requires an index (b) on tablename
SELECT min(id) id, count(*) cnt from tablename t group by b having cnt > 1
) tempids;
ALTER TABLE ids2 ADD INDEX (id);
SELECT tablename.*
FROM ids2 -- using the temporary table, MUST be in the same database connection, will filter duplicates
JOIN tablename on tablename.id = ids2.id;
With mariadb 10.2, or mysql 8 you could use window function (I guess).
Another solution : using vars :
SELECT id, name, a, b, rn
FROM (SELECT *,
@rn := IF(a = @a, @rn + 1, 1) rn,
@a := a
FROM (select @a := null, @rn := 0) x,
tablename
ORDER BY a
) a_duplicates
where rn = 2
UNION
SELECT id, name, a, b, rn
FROM (SELECT *,
@rn := IF(b = @b, @rn + 1, 1) rn,
@b := b
FROM (select @b := null, @rn := 0) x,
tablename
ORDER BY b
) b_duplicates
where rn = 2
Demo : with some extra steps to understand
Edit : this only works if you don't have lines where both a and b are duplicated. Which is the case in the example.

Group by date and take the last one

This is my table :
What I'm trying to do is get the latest availability of each user, per caserne. For example, I should get this result:
id id_user id_caserne id_dispo created_at
31 21 12 1 2019-10-24 01:21:46
33 21 13 1 2019-10-23 20:17:21
I've tried this SQL, but it does not seem to work all the time:
SELECT * FROM
( SELECT id, id_dispo, id_user, id_caserne, MAX(created_at)
FROM disponibilites GROUP BY id_user, id_caserne, id_dispo
ORDER BY created_at desc ) AS sub
GROUP BY id_user, id_caserne
What am I doing wrong ?
I would simply use filtering in the where clause using a correlated subquery:
select d.*
from disponibilites d
where d.created_at = (select max(d2.created_at)
from disponibilites d2
where d2.id_user = d.id_user
);
EDIT:
Based on your comments:
select d.*
from disponibilites d
where d.created_at = (select max(d2.created_at)
from disponibilites d2
where d2.id_user = d.id_user and
d2.id_caserne = d.id_caserne and
date(d2.created_at) = date(d.created_at)
);
You can use a correlated subquery, as demonstrated by Gordon Linoff, or a window function if your RDBMS supports it:
select * from (
select
t.*,
rank() over(partition by id_caserne, id_user order by created_at desc) rn
from disponibilites t
) x
where rn = 1
Another option is to use a correlated subquery without aggregation, only with a sort and limit:
select *
from mytable t
where created_at = (
select created_at
from mytable t1
where t1.id_user = t.id_user and t1.id_caserne = t.id_caserne
order by created_at desc
limit 1
)
With an index on (id_user, id_caserne, created_at), this should be a very efficient option.
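The correlated-subquery and window-function answers can be compared side by side. The following sketch runs both with Python's built-in sqlite3 module instead of MySQL (window functions assume SQLite 3.25 or newer). The original table was not preserved, so rows 30 and 32 are hypothetical older entries; rows 31 and 33 are the expected result from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE disponibilites (
        id INTEGER PRIMARY KEY, id_user INTEGER, id_caserne INTEGER,
        id_dispo INTEGER, created_at TEXT)
""")
conn.executemany("INSERT INTO disponibilites VALUES (?, ?, ?, ?, ?)", [
    (30, 21, 12, 2, "2019-10-23 10:00:00"),  # hypothetical older row
    (31, 21, 12, 1, "2019-10-24 01:21:46"),  # from the question
    (32, 21, 13, 2, "2019-10-23 08:00:00"),  # hypothetical older row
    (33, 21, 13, 1, "2019-10-23 20:17:21"),  # from the question
])

# Keep a row only if its created_at is the maximum for its user and caserne.
latest_by_subquery = [row[0] for row in conn.execute("""
    SELECT d.id
    FROM disponibilites AS d
    WHERE d.created_at = (SELECT MAX(d2.created_at)
                          FROM disponibilites AS d2
                          WHERE d2.id_user = d.id_user
                            AND d2.id_caserne = d.id_caserne)
    ORDER BY d.id
""")]

# Rank rows inside each (user, caserne) partition, newest first.
latest_by_window = [row[0] for row in conn.execute("""
    SELECT id
    FROM (SELECT t.*,
                 RANK() OVER (PARTITION BY id_caserne, id_user
                              ORDER BY created_at DESC) AS rn
          FROM disponibilites AS t) AS x
    WHERE rn = 1
    ORDER BY id
""")]

print(latest_by_subquery)  # [31, 33]
print(latest_by_window)    # [31, 33]
```

Both variants return several rows for a group if two entries share the same latest created_at.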
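You can join the max(created_at) of each user and caserne back to your original table:
select t1.* from disponibilites t1
inner join
(select max(created_at) as max_created_at, id_caserne, id_user
from disponibilites
group by id_caserne, id_user) t2
on t2.id_user = t1.id_user
and t2.id_caserne = t1.id_caserne
and t2.max_created_at = t1.created_at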

How to get the maximum count, and ID from the table SQL

**castID**
nm0000116
nm0000116
nm0000116
nm0000116
nm0000116
nm0634240
nm0634240
nm0798899
This is my table (created as a view). I want to list the castID with the highest count (here nm0000116) together with how many times it occurs in the table (it should be 5), and I'm not sure which query to use.
try
Select CastId, count(*) countOfCastId
From table
Group By CastId
Having count(*)
= (Select Max(cnt)
From (Select count(*) cnt
From table
Group By CastId) z)
SELECT
MAX(E),
castId
FROM
(SELECT COUNT(castId) AS E,castId FROM [directors winning movies list] GROUP BY castId) AS t
You could just return the topmost count using LIMIT:
SELECT castID,
COUNT(*) AS Cnt
FROM atable
GROUP BY castID
ORDER BY Cnt DESC
LIMIT 1
;
However, if there can be ties, the above query would return only one row. If you want all the "winners", you could take the count from the above query as a scalar result and compare it against all the counts to return only those that match:
SELECT castID,
COUNT(*) AS Cnt
FROM atable
GROUP BY castID
HAVING COUNT(*) = (
SELECT COUNT(*)
FROM atable
GROUP BY castID
ORDER BY COUNT(*) DESC
LIMIT 1
)
;
(Basically, same as Charles Bretana's approach, it just derives the top count differently.)
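The tie-safe HAVING query can be checked against the castID values from the question. This is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL; atable is the placeholder table name used in the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE atable (castID TEXT)")
cast_ids = ["nm0000116"] * 5 + ["nm0634240"] * 2 + ["nm0798899"]
conn.executemany("INSERT INTO atable VALUES (?)", [(c,) for c in cast_ids])

# The scalar subquery yields the single highest count; HAVING keeps
# every castID whose count equals it, so ties would all be returned.
winners = conn.execute("""
    SELECT castID, COUNT(*) AS Cnt
    FROM atable
    GROUP BY castID
    HAVING COUNT(*) = (SELECT COUNT(*)
                       FROM atable
                       GROUP BY castID
                       ORDER BY COUNT(*) DESC
                       LIMIT 1)
    ORDER BY castID
""").fetchall()

print(winners)  # [('nm0000116', 5)]
```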
Alternatively, you could use a variable to rank all the counts and then return only those that have the ranking of 1:
SELECT castID,
Cnt
FROM (
SELECT castID,
COUNT(*) AS Cnt,
@r := IFNULL(@r, 0) + 1 AS r
FROM atable
GROUP BY castID
ORDER BY Cnt DESC
) AS s
WHERE r = 1
;
Please note that with the above method the variable must either not exist or be pre-initialised with a 0 or NULL prior to running the query. To be on the safe side, you could initialise the variable directly in your query:
SELECT s.castID,
s.Cnt
FROM (SELECT @r := 0) AS x
CROSS JOIN
(
SELECT castID,
COUNT(*) AS Cnt,
@r := @r + 1 AS r
FROM atable
GROUP BY castID
ORDER BY Cnt DESC
) AS s
WHERE s.r = 1
;

Finding the most common entry in a column - SQL

I have a table called MyTable like so
A B
101 Dog
209 Cat
209 Cat
209 Dog
193 Cow
193 Dog
101 Dog
193 Dog
193 Cow
And I want to pull out the most common B for each A so it would end up being like this (note that there can be ties)
A B
101 Dog
209 Cat
193 Dog
193 Cow
How could I write sql to do this?
Alternatively, you can use HAVING clause instead of JOIN.
SELECT A, B
FROM table1 o
GROUP BY A, B
HAVING COUNT(*) =
(
SELECT MAX(totalCOunt)
FROM
(
SELECT A, B, COUNT(*) totalCount
FROM table1
GROUP BY A,B
) x
WHERE o.A = x.A
GROUP BY x.A
)
SQLFiddle Demo
You could use a filtering join to list the (A,B) combination with the highest rowcount:
select src.*
from (
select A
, B
, count(*) cnt
from YourTable
group by
A
, B
) src
join (
select A
, max(cnt) as maxcnt
from (
select A
, B
, count(*) cnt
from YourTable
group by
A
, B
) comb
group by
A
) maxab
on maxab.A = src.A
and maxab.maxcnt = src.cnt
Example at SQL Fiddle.
If your database supports windowing functions, you can use dense_rank(), like:
select *
from (
select dense_rank() over (
partition by A
order by cnt desc) as rn
, t1.*
from (
select A
, B
, count(*) cnt
from YourTable
group by
A
, B
) t1
) t2
where rn = 1
Window function example at SQL Fiddle. Windowing functions are available on recent versions of SQL Server, Oracle and PostgreSQL.
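The dense_rank() query keeps ties, which matters for A = 193 in the question's data. The sketch below runs it with Python's built-in sqlite3 module (assuming SQLite 3.25 or newer for window functions); the table is named MyTable as in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (A INTEGER, B TEXT)")
conn.executemany("INSERT INTO MyTable VALUES (?, ?)", [
    (101, "Dog"), (209, "Cat"), (209, "Cat"), (209, "Dog"), (193, "Cow"),
    (193, "Dog"), (101, "Dog"), (193, "Dog"), (193, "Cow"),
])

# Count each (A, B) pair, rank the counts within each A, keep rank 1.
# DENSE_RANK gives tied counts the same rank, so both 193 rows survive.
most_common = conn.execute("""
    SELECT A, B
    FROM (SELECT t1.A, t1.B,
                 DENSE_RANK() OVER (PARTITION BY t1.A
                                    ORDER BY t1.cnt DESC) AS rn
          FROM (SELECT A, B, COUNT(*) AS cnt
                FROM MyTable
                GROUP BY A, B) AS t1) AS t2
    WHERE rn = 1
    ORDER BY A, B
""").fetchall()

print(most_common)  # [(101, 'Dog'), (193, 'Cow'), (193, 'Dog'), (209, 'Cat')]
```

Swapping DENSE_RANK for ROW_NUMBER would return exactly one row per A and drop one of the tied 193 entries.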
select g3.A,g3.B
from
(
select A,Max(C) MC
from
(
select A,B,count(*) C
from (<your entire select query>) tbl
group by A,B
) g1
group by A
) g2
join
(
select A,B,count(*) C
from (<your entire select query>) tbl
group by A,B
) g3 on g2.A=g3.A and g3.C=g2.MC
SQL FIDDLE Example
select
A, B
from
(
select
A, B, row_number() over (partition by A order by cnt desc) as RowNum
from
(
select
T.A, T.B, count(*) over (partition by T.A, T.B) as cnt
from T
) as A
) as B
where RowNum = 1