How to select N random groups within a table - mysql

My groups are split across rows like so:
Row Group
1 A
2 A
3 A
4 B
5 B
6 C
7 C
8 C
9 C
How can I select all rows for any 2 randomly chosen groups?

Sort random, pick 2 groups.
SELECT *
FROM your_table AS t
WHERE `Group` IN (
SELECT `Group` FROM (
SELECT `Group`
FROM your_table
GROUP BY `Group`
ORDER BY RAND()
LIMIT 2) q);
db<>fiddle here

To select rows of any two groups
select *
from t_groups
group by Group
having Group in ['A', 'B'];
To randomly select any two groups
select *
from t_groups
group by Group
having Group in (
SELECT Group FROM t_groups
ORDER BY RAND()
LIMIT 2);

Related

Random elements inside JOIN

I have this code here
INSERT INTO Directory.CatalogTaxonomy (`CatalogId`, `TaxonomyId`, `TaxonomyTypeId`, `IsApprovalRelevant`)
SELECT cat.CatalogId, dep.Id, #department_type, false
FROM Directory.Catalog cat
JOIN (SELECT * FROM (
SELECT * FROM Taxonomy.Department LIMIT 10
) as dep_tmp ORDER BY RAND() LIMIT 3) AS dep
WHERE cat.CatalogId NOT IN (SELECT CatalogId FROM Directory.CatalogTaxonomy WHERE TaxonomyTypeId = #department_type)
AND cat.UrlStatus = #url_status_green
AND (cat.StatusId = #status_published
OR cat.StatusId = #status_review_required);
And the problem is that, it should for each catalog take the first 10 elements from Department and randomly choose 3 of them, then add to CatalogDepartment 3 rows, each containing the catalog id and a taxonomy id. But instead it randomly chooses 3 Department elements and then adds those 3 elements to each catalog.
The current result looks like this:
1 000de9d7-af8b-4bac-bdbd-e6e361e5bc5e
1 001d4060-2924-4c75-b304-d780454f261b
1 001bc4b8-c1bc-498d-9aee-3825a40587d5
2 000de9d7-af8b-4bac-bdbd-e6e361e5bc5e
2 001d4060-2924-4c75-b304-d780454f261b
2 001bc4b8-c1bc-498d-9aee-3825a40587d5
3 000de9d7-af8b-4bac-bdbd-e6e361e5bc5e
3 001d4060-2924-4c75-b304-d780454f261b
3 001bc4b8-c1bc-498d-9aee-3825a40587d5
As you can see, there are only 3 departments chosen and repeated for every catalog
If you think that the query:
SELECT * FROM (
SELECT * FROM Taxonomy.Department LIMIT 10
) as dep_tmp
ORDER BY RAND() LIMIT 3
that you join to Directory.Catalog returns 3 different departments for each catalog then you are wrong.
This query is executed only once and returns 3 random departments which are joined (always the same 3) to Directory.Catalog.
What you can do is after you CROSS JOIN 10 departments to Directory.Catalog, choose randomly 3 of them for each catalog.
Try this:
INSERT INTO Directory.CatalogTaxonomy (`CatalogId`, `TaxonomyId`, `TaxonomyTypeId`, `IsApprovalRelevant`)
WITH cte AS (
SELECT cat.CatalogId, dep.Id AS TaxonomyId, #department_type AS TaxonomyTypeId, false AS IsApprovalRelevant
FROM Directory.Catalog AS cat
CROSS JOIN (SELECT * FROM Taxonomy.Department LIMIT 10) AS dep
WHERE cat.CatalogId NOT IN (SELECT CatalogId FROM Directory.CatalogTaxonomy WHERE TaxonomyTypeId = department_type)
AND cat.UrlStatus = #url_status_green
AND (cat.StatusId = #status_published OR cat.StatusId = #status_review_required);
)
SELECT t.CatalogId, t.TaxonomyId, t.TaxonomyTypeId, t.IsApprovalRelevant
FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY CatalogId ORDER BY RAND()) rn
FROM cte
) t
WHERE t.rn <= 3
Note that this:
SELECT * FROM Taxonomy.Department LIMIT 10
does not guarantee that you get the first 10 elements from Department because a table is not ordered.

How do I find duplicate values across multiple columns in Mysql?

I have a table like this
I want to check the all rows in Column A with column B and get the count of duplicates.
For example, I want to get the
count of 12 as 3(2 times in A+1 time in B)
count of 11 as 2(2 times in A+0 time in B)
count of 13 as 2(1 time in A+0 time in B)
How can I acheive it?
You can calculate the total occurrences from a union all. A where clause can show only the values that occur in the A column:
select nr
, count(*)
from (
select A as nr
from YourTable
union all
select B
from YourTable
) sub
where nr in -- only values that occur at least once in the A column
(
select A
from YourTable
)
group by
nr
having count(*) > 1 -- show only duplicates
You can combine all values in A and B then do the group by.
Then only select those values found in column A.
Select A, count(A) as cnt
From (
Select A
from yourTable
Union All
Select B
from yourTable) t
Where t.A in
(select distinct A from yourTable)
Group by t.A
Order by t.A;
Result:
A cnt
11 2
12 3
13 1
See demo: http://sqlfiddle.com/#!9/9fcfe9/3

How to i order values depending on its numbers of hits

I want to retrieve the table value in order of from most to least. For example, if we had a single column table like the following,
selected_val
2
3
3
2
1
3
1
1
1
4
I need a SQL that will return the values in the order of 1, 3, 2, 4, because there are four 1's, three 3's, two 2's, and one 4 in the table. I want 1,1,1,1,3,3,3,2,2,4. Is this possible?
ANd if i add more colums that
selected_val Rack
2 A
3 A
3 B
2 B
1 C
3 C
1 A
1 A
1 B
4 C
Thanks for the help
you can use count and group by
select selected_val
from my_table
group by selected_val
order by count(*) DESC
for the secondo part of the question you can use group concat if you need the rack related to selected_vale
select selected_val, group_concat(rack)
from my_table
group by selected_val
order by count(*) DESC
or add the column rack to group by if you need the relative counting and order
select selected_val, rack
from my_table
group by selected_val, rack
order by count(*) DESC
TRY THIS for the desired output and you can select other columns as well:
select t.selected_val
from test t
inner join (
select t1.selected_val, count(1) ord
from test t1
group by t1.selected_val) t2 on t2.selected_val = t.selected_val
order by t2.ord desc
OUTPUT:
1
1
1
1
3
3
3
2
2
4
Output is not actually in row but you can use GROUP_CONCAT to retrieve output in a row.
Use subquery to count, then LEFT JOIN result to the subquery's result and order by count.
select t1.selected_val from table t1
left join (
SELECT
t2.selected_val,
count(*) AS count
FROM table t2
GROUP BY selected_val
) t3 on t1.selected_val = t3.selected_val
ORDER BY t3.count DESC;
Using Cross Join.
select A.C,A.Rack from
(select selected_val,Rack,count(*) c from #TableName group by selected_val,Rack)a
,(select selected_val,Rack,count(*) c from #TableName group by selected_val,Rack)b
where B.c <= A.selected_val
ORDER BY A.selected_val DESC
OutPut :

mysql select top n max values

How can you select the top n max values from a table?
For a table like this:
column1 column2
1 foo
2 foo
3 foo
4 foo
5 bar
6 bar
7 bar
8 bar
For n=2, the result needs to be:
3
4
7
8
The approach below selects only the max value for each group.
SELECT max(column1) FROM table GROUP BY column2
Returns:
4
8
For n=2 you could
SELECT max(column1) m
FROM table t
GROUP BY column2
UNION
SELECT max(column1) m
FROM table t
WHERE column1 NOT IN (SELECT max(column1)
WHERE column2 = t.column2)
for any n you could use approaches described here to simulate rank over partition.
EDIT:
Actually this article will give you exactly what you need.
Basically it is something like this
SELECT t.*
FROM
(SELECT grouper,
(SELECT val
FROM table li
WHERE li.grouper = dlo.grouper
ORDER BY
li.grouper, li.val DESC
LIMIT 2,1) AS mid
FROM
(
SELECT DISTINCT grouper
FROM table
) dlo
) lo, table t
WHERE t.grouper = lo.grouper
AND t.val > lo.mid
Replace grouper with the name of the column you want to group by and val with the name of the column that hold the values.
To work out how exactly it functions go step-by-step from the most inner query and run them.
Also, there is a slight simplification - the subquery that finds the mid can return NULL if certain category does not have enough values so there should be COALESCE of that to some constant that would make sense in the comparison (in your case it would be MIN of domain of the val, in article it is MAX).
EDIT2:
I forgot to mention that it is the LIMIT 2,1 that determines the n (LIMIT n,1).
If you are using mySQl, why don't you use the LIMIT functionality?
Sort the records in descending order and limit the top n i.e. :
SELECT yourColumnName FROM yourTableName
ORDER BY Id desc
LIMIT 0,3
Starting from MySQL 8.0/MariaDB support window functions which are designed for this kind of operations:
SELECT *
FROM (SELECT *,ROW_NUMBER() OVER(PARTITION BY column2 ORDER BY column1 DESC) AS r
FROM tab) s
WHERE r <= 2
ORDER BY column2 DESC, r DESC;
DB-Fiddle.com Demo
This is how I'm getting the N max rows per group in MySQL
SELECT co.id, co.person, co.country
FROM person co
WHERE (
SELECT COUNT(*)
FROM person ci
WHERE co.country = ci.country AND co.id < ci.id
) < 1
;
how it works:
self join to the table
groups are done by co.country = ci.country
N elements per group are controlled by ) < 1 so for 3 elements - ) < 3
to get max or min depends on: co.id < ci.id
co.id < ci.id - max
co.id > ci.id - min
Full example here:
mysql select n max values per group/
mysql select max and return multiple values
Note: Have in mind that additional constraints like gender = 0 should be done in both places. So if you want to get males only, then you should apply constraint on the inner and the outer select

Get all the rows for the most recent 3 groups

I googled a bit and looked on SO but I didn't find anything that helped me.
I have a working MySQL query that selects some columns (accross three tables, with two JOIN statements) and I am looking to do something extra on the result set.
I would like to SELECT all rows from the 3 most recent groups. (I can only assume I have to use a GROUP BY on that column) I'm having a hard time explaining this clearly so I'll use an example:
id | group
--------------
1 | 1
2 | 2
3 | 2
4 | 2
5 | 3
6 | 3
7 | 4
8 | 4
Of course, I dumbed it down a lot for the sake of simplicity (and my current query doesn't include an id column).
Right now my ideal query would return, in order (that's the id field):
8, 7, 6, 5, 4, 3, 2
If I were to add the following 9th element:
id | group
--------------
9 | 5
My ideal query would then return, in order:
9, 8, 7, 6, 5
Because these are all the rows from the most 3 recent groups. Also, when two rows have the same group (and are still in the results set), I would like to ORDER them BY another field (which I have not included in my dumbed down example).
In my search I only found how to do actions on elements of GROUPS (MAX of each, AVG of group elements, etc.) and not GROUPS themselves (first 3 groups ordered by a field).
Thank you in advance for your help!
Edit: Here is what my real query looks like.
SELECT t1.f1, t1.f2, t2.f1, t2.f2, t2.f3, t3.f1, t3.f2, t3.f3, t3.f4
FROM t1
LEFT JOIN t2 ON t2.f1=t1.f3
LEFT JOIN t3 ON t2.f1=t3.f5
WHERE t1.f4='some_constant' AND t2.f4='some_other_constant'
ORDER BY t1.f2 DESC
SELECT `table`.* FROM
(SELECT DISTINCT `group`
FROM `table`
ORDER BY `group` DESC LIMIT 3) t1
INNER JOIN `table` ON `table`.`group` = t1.`group`
the subquery should return the three groups with the largest value, the INNER JOIN will ensure no rows are included which do not have these group values.
assuming t1.f2 is your group column:
SELECT a,b,c,d,e,f,g,h,i
FROM
(
SELECT t1.f1 as a, t1.f2 as b, t2.f1 as c, t2.f2 as d, t2.f3 as e, t3.f1 as f, t3.f2 as g, t3.f3 as h, t3.f4 as i
FROM t1
LEFT JOIN t2 ON t2.f1=t1.f3
LEFT JOIN t3 ON t2.f1=t3.f5
WHERE t1.f4='some_constant' AND t2.f4='some_other_constant'
ORDER BY t1.f2 DESC
) first_table
INNER JOIN
(
SELECT DISTINCT `f2`
FROM `t1`
ORDER BY `f2` DESC LIMIT 3
) second_table
ON first_table.b = second_table.f2
Note that this may be very inefficient depending on your table structure, but is the best I can do without more information.
how about this way... (i use groupId instead of 'group'
[QUERY] => something like (SELECT id, groupId from tables.....) (your query with 2 joins).
-- with this query you have the last thre groups.
[QUERY2] => SELECT distinct(groupId) as groupId FROM ([QUERY]) ORDER BY groupId DESC LIMIT 0,3
and finally you will have:
SELECT id, groupId from tables----...... WHERE groupId in ([QUERY2]) order by groupId DESC, id DESC