mysql random selection in inner join - mysql

The question "Mysql Random Row Query on Inner Join" is much the same as mine, but it was never answered.
I have a master table m and a slave table s. s contains one or more rows for each row in m. I would like a query that selects every master row joined to exactly one randomly chosen slave row.
If the table schemas were:
M
---
id
S
---
id
mid
then, in pseudo code the query would be:
select * from m inner join s on m.id = s.mid where s.id is one randomly chosen from the values that exist
Can this be translated into real SQL?

I think the following query does the job, but with a subquery rather than an inner join:
SELECT *, (SELECT id FROM S WHERE S.mid = M.id ORDER BY RAND() LIMIT 1) AS S_id
FROM M
Here is a link to test it.
Hope it helps.

This can be solved using the Row_Number() concept. We need to randomly assign row numbers within each partition of mid in table s, then join from m to s on mid with row_number = 1. This picks a single random row every time.
In MySQL versions below 8, we can use user-defined variables to emulate Row_Number(). To understand how this works, you may check this answer for the explanation: https://stackoverflow.com/a/53465139/2469308
Note that this technique will be more efficient on large tables than using a subquery (in the SELECT clause), as it sorts the whole table only once.
View on DB Fiddle
create table m (id int, m_nm varchar(10));
create table s (id int,
mid int references m(id),
s_nm varchar(10));
insert into m values(1, "a");
insert into m values(2, "b");
insert into m values(3, "c");
insert into s values(1, 1, "aa");
insert into s values(2, 1, "aa");
insert into s values(3, 2, "bb");
insert into s values(4, 2, "bbb");
insert into s values(5, 2, "bbbb");
insert into s values(6, 3, "cc");
insert into s values(7, 3, "ccc");
Query
SELECT
m.*, s_dt.id, s_dt.mid, s_dt.s_nm
FROM
m
JOIN
(
SELECT
@rn := IF(@m = dt.mid, @rn+1, 1) AS row_num,
@m := dt.mid AS mid,
dt.id,
dt.s_nm
FROM
(
SELECT
id, mid, s_nm, RAND() as rand_num
FROM s
ORDER BY mid, rand_num ) AS dt
CROSS JOIN (SELECT @rn:=0, @m:=0) AS user_vars
) AS s_dt
ON s_dt.mid = m.id AND
s_dt.row_num = 1;
Result (Run #1)
| id | m_nm | id | mid | s_nm |
| --- | ---- | --- | --- | ---- |
| 1 | a | 2 | 1 | aa |
| 2 | b | 5 | 2 | bbbb |
| 3 | c | 7 | 3 | ccc |
Result (Run #2)
| id | m_nm | id | mid | s_nm |
| --- | ---- | --- | --- | ---- |
| 1 | a | 1 | 1 | aa |
| 2 | b | 4 | 2 | bbb |
| 3 | c | 6 | 3 | cc |
Result (Run #3)
| id | m_nm | id | mid | s_nm |
| --- | ---- | --- | --- | ---- |
| 1 | a | 1 | 1 | aa |
| 2 | b | 3 | 2 | bb |
| 3 | c | 7 | 3 | ccc |
MySQL 8.0.2+ / MariaDB 10.3+ solution would be simply the following:
SELECT
m.*, s_dt.id, s_dt.mid, s_dt.s_nm
FROM
m
JOIN
(
SELECT
s.*,
ROW_NUMBER() OVER w AS row_num
FROM s
WINDOW w AS (PARTITION BY mid
ORDER BY RAND())
) AS s_dt
ON s_dt.mid = m.id AND
s_dt.row_num = 1
View on DB Fiddle
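To try the window-function variant without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in. It assumes a SQLite build with window functions (3.25 or later) and uses SQLite's RANDOM() where MySQL would use RAND(); the helper name pick_random_children is made up for this illustration.

```python
import sqlite3

def pick_random_children(conn):
    """Return one randomly chosen s row per m row as (m_id, s_id, s_mid)."""
    return conn.execute(
        """
        SELECT m.id, s_dt.id, s_dt.mid
        FROM m
        JOIN (
            -- number the rows of each mid partition in random order
            SELECT s.*,
                   ROW_NUMBER() OVER (PARTITION BY mid ORDER BY RANDOM()) AS row_num
            FROM s
        ) AS s_dt
          ON s_dt.mid = m.id AND s_dt.row_num = 1
        ORDER BY m.id
        """
    ).fetchall()

# In-memory SQLite database (an assumption: stand-in for the MySQL fiddle)
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE m (id INT, m_nm TEXT);
    CREATE TABLE s (id INT, mid INT, s_nm TEXT);
    INSERT INTO m VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO s VALUES (1, 1, 'aa'), (2, 1, 'aa'), (3, 2, 'bb'),
                         (4, 2, 'bbb'), (5, 2, 'bbbb'), (6, 3, 'cc'), (7, 3, 'ccc');
    """
)

rows = pick_random_children(conn)
for row in rows:
    print(row)  # one line per master row; the s id varies between runs
```

Each call should return exactly three rows, one per master, with the slave id changing from run to run.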

Related

Aggregate "once-only" whether 1 or 2 rows in join

I'm trying to run an aggregate query where a join can find 0, 1 or 2 rows in the join table.
I want to aggregate "once-only" regardless of whether the join finds 1 or 2 matching rows.
Minimal example.
+--------------+--------+-----------+
| container_id | thing | alternate |
+--------------+--------+-----------+
| 1 | box | 0 |
| 1 | box | 1 |
| 1 | hat | 0 |
| 2 | monkey | 0 |
| 3 | monkey | 1 |
| 3 | chair | 1 |
+--------------+--------+-----------+
+--------------+------+
| container_id | uses |
+--------------+------+
| 1 | 3 |
| 2 | 1 |
| 3 | 2 |
+--------------+------+
You can see that 'box' is associated with container_id number 1 twice. Once with alternate=0 and once with alternate=1.
SELECT
thing, COUNT(DISTINCT ct.container_id) AS occurrencs, SUM(uses) AS uses
FROM
container_thing AS ct
INNER JOIN
container_usage AS cu ON cu.container_id = ct.container_id
GROUP BY
thing
gives:
+--------+------------+------+
| thing | occurrencs | uses |
+--------+------------+------+
| box | 1 | 6 |
| chair | 1 | 2 |
| hat | 1 | 3 |
| monkey | 2 | 3 |
+--------+------------+------+
but what I really want is:
+--------+------------+------+
| thing | occurrencs | uses |
+--------+------------+------+
| box | 1 | 3 |
| chair | 1 | 2 |
| hat | 1 | 3 |
| monkey | 2 | 3 |
+--------+------------+------+
I want 3 as the value for uses in the first row because 'box' was in containers that were used a total of three times. Because of the 'alternate' column I get 6 for that value. Can I either join differently or group by differently or express in the SUM expression to only SUM once for each distinct thing regardless of the value of alternate?
(Note that a thing can appear in a container with alternate, without alternate or both.)
SQL necessary to set up the minimal example:
-- Set up db
CREATE DATABASE sumtest;
USE sumtest;
-- Set up tables
CREATE TABLE container (id INT PRIMARY KEY);
CREATE TABLE container_thing (container_id INT, thing NVARCHAR(10), alternate BOOLEAN);
CREATE TABLE container_usage (container_id INT, uses INT);
-- Insert data
INSERT INTO container (id) VALUES (1), (2), (3);
INSERT INTO container_thing (container_id, thing, alternate) VALUES (1, 'box', FALSE), (1, 'box', TRUE), (1, 'hat', FALSE), (2, 'monkey', FALSE), (3, 'monkey', TRUE), (3, 'chair', TRUE);
INSERT INTO container_usage VALUES (1, 3), (2, 1), (3, 2);
-- Query
SELECT thing, COUNT(DISTINCT ct.container_id) AS occurrencs, SUM(uses) AS uses FROM container_thing AS ct INNER JOIN container_usage AS cu ON cu.container_id = ct.container_id GROUP BY thing;
You can work around this by only selecting DISTINCT values of container_id and thing from container_thing in a derived table and JOINing that to container_usage:
SELECT thing, COUNT(ct.container_id) AS occurrences, SUM(uses) AS uses
FROM (SELECT DISTINCT container_id, thing
FROM container_thing) AS ct
INNER JOIN container_usage AS cu ON cu.container_id = ct.container_id
GROUP BY thing;
Output
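+--------+-------------+------+
| thing | occurrences | uses |
+--------+-------------+------+
| box | 1 | 3 |
| chair | 1 | 2 |
| hat | 1 | 3 |
| monkey | 2 | 3 |
+--------+-------------+------+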
Demo on dbfiddle
If you only want the uses, you should not perform the SUM over the join, because the join produces N x M rows for each match of the ON clause,
where N is the number of matching rows from table1 and M is the number of matching rows from table2. In the case of box you get 2 x 1 rows, each with the value 3, so the sum is 6.
To avoid this, join container_usage with a subquery that aggregates container_thing first:
select t.thing, t.count_container, cu.uses
from (
SELECT thing, container_id, COUNT(DISTINCT container_id) count_container
FROM container_thing
GROUP BY thing, container_id
) t
inner join container_usage AS cu ON cu.container_id = t.container_id
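The row multiplication described above is easy to see on the question's sample data. Below is a small sketch that runs both versions of the aggregate in an in-memory SQLite database (an assumption: SQLite stands in for MySQL here, and the variable names naive and deduped are only for illustration).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE container_thing (container_id INT, thing TEXT, alternate INT);
    CREATE TABLE container_usage (container_id INT, uses INT);
    INSERT INTO container_thing VALUES
        (1, 'box', 0), (1, 'box', 1), (1, 'hat', 0),
        (2, 'monkey', 0), (3, 'monkey', 1), (3, 'chair', 1);
    INSERT INTO container_usage VALUES (1, 3), (2, 1), (3, 2);
    """
)

# Plain join: 'box' matches container 1 twice, so its uses are summed twice.
naive = dict(conn.execute(
    """
    SELECT thing, SUM(uses)
    FROM container_thing AS ct
    JOIN container_usage AS cu ON cu.container_id = ct.container_id
    GROUP BY thing
    """
).fetchall())

# De-duplicate (container_id, thing) pairs first, then join and sum.
deduped = dict(conn.execute(
    """
    SELECT thing, SUM(uses)
    FROM (SELECT DISTINCT container_id, thing FROM container_thing) AS ct
    JOIN container_usage AS cu ON cu.container_id = ct.container_id
    GROUP BY thing
    """
).fetchall())

print(naive["box"], deduped["box"])  # 6 3
```

Only box changes between the two results, since it is the only thing that appears in the same container with both values of alternate.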

SubQuery returns one row when getting data from main query comma separated ids?

SELECT
e.*,
(
SELECT GROUP_CONCAT(topic_name)
FROM topic
WHERE id IN (e.topic_ids)) AS topics
FROM exam e
result :
topics = xyz topic
This query returns only a single topic name as the result, but when I use this:
SELECT
e.*,
(
SELECT GROUP_CONCAT(topic_name)
FROM topic
WHERE id IN (1,4)) AS topics
FROM exam e
result :
topics = xyz topic,abc topic
That works fine, and the exam table has the same value in the DB (comma-separated topic ids = 1,4) in a varchar field.
Is there an issue with the datatype of the field?
First, let me lecture you about how bad CSV in a field is.
| id | topic_ids |
|----|-----------|
| 1 | a,b,c |
| 2 | a,b |
This is what Satan looks like in a relational DB. Probably the worst, just after
"let's put columns as lines and use a recursive join to get everything back."
How should it be?
exam
| id |
|----|
| 1 |
| 2 |
exam_topic
| exam_id | topic_id |
|---------|----------|
| 1 | a |
| 1 | b |
| 1 | c |
| 2 | a |
| 2 | b |
topic
| id |
|----|
| a |
| b |
| c |
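Now, as awful as it may be, this is the "dynamic" alternative, using FIND_IN_SET(), which treats the column as a comma-separated list (IN (e.topic_ids) only sees a single string value):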
SELECT
e.*,
(
SELECT GROUP_CONCAT(topic_name)
FROM topic
WHERE FIND_IN_SET(id, e.topic_ids) > 0
) AS topics
FROM exam e
SQL Fiddle
MySQL 5.6 Schema Setup:
CREATE TABLE exam
(`id` int, `topic_ids` varchar(5))
;
INSERT INTO exam
(`id`, `topic_ids`)
VALUES
(1, 'a,b,c'),
(2, 'a,b'),
(3, 'b,c,d'),
(4, 'd')
;
CREATE TABLE topic
(`id` varchar(1), `topic_name` varchar(4))
;
INSERT INTO topic
(`id`, `topic_name`)
VALUES
('a', 'topA'),
('b', 'topB'),
('c', 'topC'),
('d', 'topD')
;
Query 1:
SELECT
e.*,
(
SELECT GROUP_CONCAT(topic_name)
FROM topic
WHERE FIND_IN_SET(id, e.topic_ids) > 0
) AS topics
FROM exam e
Results:
| id | topic_ids | topics |
|----|-----------|----------------|
| 1 | a,b,c | topA,topB,topC |
| 2 | a,b | topA,topB |
| 3 | b,c,d | topB,topC,topD |
| 4 | d | topD |
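For comparison, here is a rough sketch of the normalized layout with an exam_topic junction table, filled with the same data as the fiddle. It runs in an in-memory SQLite database through Python's sqlite3 module (an assumption: a stand-in for MySQL; GROUP_CONCAT exists in both, but SQLite does not guarantee the order of the concatenated names, so the hypothetical helper topics_per_exam sorts them).

```python
import sqlite3

def topics_per_exam(conn):
    """Map each exam id to the sorted list of its topic names."""
    rows = conn.execute(
        """
        SELECT e.id, GROUP_CONCAT(t.topic_name)
        FROM exam AS e
        JOIN exam_topic AS et ON et.exam_id = e.id
        JOIN topic AS t ON t.id = et.topic_id
        GROUP BY e.id
        """
    ).fetchall()
    return {exam_id: sorted(names.split(",")) for exam_id, names in rows}

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE exam (id INT);
    CREATE TABLE topic (id TEXT, topic_name TEXT);
    CREATE TABLE exam_topic (exam_id INT, topic_id TEXT);
    INSERT INTO exam VALUES (1), (2), (3), (4);
    INSERT INTO topic VALUES ('a', 'topA'), ('b', 'topB'), ('c', 'topC'), ('d', 'topD');
    -- one row per (exam, topic) pair instead of a CSV string
    INSERT INTO exam_topic VALUES
        (1, 'a'), (1, 'b'), (1, 'c'),
        (2, 'a'), (2, 'b'),
        (3, 'b'), (3, 'c'), (3, 'd'),
        (4, 'd');
    """
)

topics = topics_per_exam(conn)
print(topics[3])  # ['topB', 'topC', 'topD']
```

With this layout the lookup is an ordinary join on indexed columns, and no string parsing is needed.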

Get last mysql record only from a column

This is my existing table
| id | name | version |
| 1 | a | 1.1 |
| 2 | b | 2.1 |
| 3 | c | 3.1 |
| 4 | d | 1.2 |
| 5 | e | 4.1 |
How can I write a query that returns all records, but keeps only the last record for each major version in the version column, like below?
| id | name | version |
| 4 | d | 1.2 |
| 2 | b | 2.1 |
| 3 | c | 3.1 |
| 5 | e | 4.1 |
If you prefer a slightly less laborious solution...
SELECT x.*
FROM t x
JOIN
( SELECT MAX(grade) grade
FROM t
GROUP
BY FLOOR(grade)
) y
ON y.grade = x.grade
http://sqlfiddle.com/#!9/f17db1/16
This is a bit laborious but it can be done
SELECT
SUBSTRING_INDEX(GROUP_CONCAT(id ORDER BY REPLACE(grade,'.','')*1 DESC),',',1) as id,
SUBSTRING_INDEX(GROUP_CONCAT(letter ORDER BY REPLACE(grade,'.','')*1 DESC),',',1) as letter,
MAX(grade) as grade
FROM
t
GROUP BY SUBSTRING_INDEX(grade,'.',1)
ORDER BY REPLACE(grade,'.','')*1
Assuming the last column is float you can use ORDER BY lastcol directly
FIDDLE
CREATE TABLE t
(`id` int, `letter` varchar(7), `grade` varchar(55))
;
INSERT INTO t
VALUES
(1, 'a', '1.1'),
(2, 'b', '2.1'),
(3, 'c', '3.1'),
(4, 'd', '1.2'),
(5, 'e', '4.1')
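As a quick check of the join-on-MAX approach against the sample data, here is a sketch that runs it in an in-memory SQLite database via Python's sqlite3 module. This is an assumption-laden stand-in for MySQL: the explicit CASTs replace MySQL's implicit string-to-number conversion, and they also keep MAX() numeric, since MAX on a varchar column compares strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE t (id INT, letter TEXT, grade TEXT);
    INSERT INTO t VALUES
        (1, 'a', '1.1'), (2, 'b', '2.1'), (3, 'c', '3.1'),
        (4, 'd', '1.2'), (5, 'e', '4.1');
    """
)

# For each integer part of grade, keep only the row with the highest grade.
rows = conn.execute(
    """
    SELECT x.id, x.letter, x.grade
    FROM t AS x
    JOIN (
        SELECT MAX(CAST(grade AS REAL)) AS grade
        FROM t
        GROUP BY CAST(grade AS INTEGER)
    ) AS y ON y.grade = CAST(x.grade AS REAL)
    ORDER BY CAST(x.grade AS REAL)
    """
).fetchall()

for row in rows:
    print(row)
```

Row 1 (grade 1.1) drops out because row 4 has the higher grade 1.2 in the same group.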

Complex SQL query with group by and having in condition

Suppose I have the table test below:
------------------------------
id | active| record
------------------------------
3 | O | 2015-10-16
3 | O | 2015-10-15
3 | N | 2015-10-14
4 | N | 2015-10-15
4 | O | 2015-10-14
I want to do an update on the table on the lines with:
- An id having the column active = 'O' more than once.
- Among these rows having active = 'O' more than once, the update shall change the value of active to 'N', except for the one with max(record), which will stay with active = 'O'.
In my example, the id having the column active = 'O' more than once is id = 3.
id |active | record
------------------------------
3 | O | 2015-10-16
3 | O | 2015-10-15
3 | N | 2015-10-14
I want to have this result:
id |active | record
------------------------------
3 | O | 2015-10-16
3 | N | 2015-10-15
3 | N | 2015-10-14
I tried this query, but there is an error:
update test as t1,
(select id
from test
where active = 'O'
group by id
having count(*) > 1) as t2
set t1.actif = 'N'
where t1.record != max(t2.record);
Thanks in advance!
Given this sample data:
CREATE TABLE t
(`id` int, `active` varchar(1), `record` date)
;
INSERT INTO t
(`id`, `active`, `record`)
VALUES
(3, 'O', '2015-10-16'),
(3, 'O', '2015-10-15'),
(3, 'N', '2015-10-14'),
(4, 'N', '2015-10-15'),
(4, 'O', '2015-10-14')
;
This query
UPDATE
t
JOIN (
SELECT
id, MAX(record) AS max_record
FROM
t
WHERE active = 'O'
GROUP BY id
HAVING COUNT(*) > 1
) sq ON t.id = sq.id
SET t.active = IF(t.record = sq.max_record, 'O', 'N');
produces this result:
+------+--------+------------+
| id | active | record |
+------+--------+------------+
| 3 | O | 2015-10-16 |
| 3 | N | 2015-10-15 |
| 3 | N | 2015-10-14 |
| 4 | N | 2015-10-15 |
| 4 | O | 2015-10-14 |
+------+--------+------------+
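The same rule (keep 'O' only on the latest 'O' row per id) can be sketched without UPDATE ... JOIN, using a correlated subquery instead. The example below runs in an in-memory SQLite database through Python's sqlite3 module, so it is a swapped-in technique on a stand-in engine and not the MySQL statement above; dates are stored as ISO strings, which compare in date order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE t (id INT, active TEXT, record TEXT);
    INSERT INTO t VALUES
        (3, 'O', '2015-10-16'), (3, 'O', '2015-10-15'), (3, 'N', '2015-10-14'),
        (4, 'N', '2015-10-15'), (4, 'O', '2015-10-14');
    """
)

# Demote every 'O' row that is older than the newest 'O' row of the same id.
# An id with a single 'O' row is untouched, since that row is its own maximum.
conn.execute(
    """
    UPDATE t
    SET active = 'N'
    WHERE active = 'O'
      AND record < (SELECT MAX(t2.record)
                    FROM t AS t2
                    WHERE t2.id = t.id AND t2.active = 'O')
    """
)

result = conn.execute(
    "SELECT id, active, record FROM t ORDER BY id, record DESC"
).fetchall()
for row in result:
    print(row)
```

On the sample data this gives the same five rows as the result table above.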
Can you try with something like this
select ID,
count(*) Counted,
max(record) record
into #TempTable from Table
where Active = 'O'
group by ID
Update tab
set tab.Active = 'N'
from Table tab
join #tempTable temp on tab.ID = temp.ID
where temp.Counted > 1 and
tab.record != temp.record
drop table #tempTable
Basically, you count the Os per ID while grabbing the ID and max record into a temp table, and after that you run the update. Note that SELECT ... INTO #TempTable and UPDATE ... FROM are SQL Server syntax, and this code might need some changes, as I only took a glance to point you in the direction I would take.

I need to get the average for every 3 records in one table and update column in separate table

Table Mytable1
Id | Actual
1 | 10020
2 | 12203
3 | 12312
4 | 12453
5 | 13211
6 | 12838
7 | 10129
Using the following syntax:
SELECT AVG(Actual), CEIL((@rank:=@rank+1)/3) AS rank FROM mytable1 Group BY rank;
Produces the following type of result:
| AVG(Actual) | rank |
+-------------+------+
| 12835.5455 | 1 |
| 12523.1818 | 2 |
| 12343.3636 | 3 |
I would like to take AVG(Actual) column and UPDATE a second existing table Mytable2
Id | Predict |
1 | 11133
2 | 12312
3 | 13221
I would like to get the following where the Actual value matches the ID as RANK
Id | Predict | Actual
1 | 11133 | 12835.5455
2 | 12312 | 12523.1818
3 | 13221 | 12343.3636
IMPORTANT REQUIREMENT
I need to set an offset much like the following syntax:
SELECT @rank := @rank + 1 AS Id , Mytable2.Actual FROM Mytable LIMIT 3 OFFSET 4;
PLEASE NOTE THE AVERAGE NUMBER ARE MADE UP IN EXAMPLES
You can join your existing query in the UPDATE statement:
UPDATE Table2 T2
JOIN (
SELECT AVG(Actual) as AverageValue,
CEIL((@rank:=@rank+1)/3) AS rank
FROM Table1, (select @rank:=0) t
Group BY rank )T1
on T2.id = T1.rank
SET Actual = T1.AverageValue
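Here is a rough, runnable sketch of the same idea in an in-memory SQLite database via Python's sqlite3 module. Several things are assumptions: SQLite has no user variables, so ROW_NUMBER() (SQLite 3.25 or later) replaces the @rank counter; the integer expression (row number + 2) / 3 plays the role of CEIL(rank / 3); only the first six Actual values from the question are loaded; and the OFFSET requirement is not covered.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE mytable1 (id INT, actual REAL);
    CREATE TABLE mytable2 (id INT, predict REAL, actual REAL);
    INSERT INTO mytable1 VALUES
        (1, 10020), (2, 12203), (3, 12312),
        (4, 12453), (5, 13211), (6, 12838);
    INSERT INTO mytable2 (id, predict) VALUES (1, 11133), (2, 12312), (3, 13221);
    """
)

# Group mytable1 into consecutive blocks of three rows and copy each block's
# average into the mytable2 row whose id equals the block number.
conn.execute(
    """
    UPDATE mytable2
    SET actual = (
        SELECT AVG(g.actual)
        FROM (
            SELECT actual,
                   (ROW_NUMBER() OVER (ORDER BY id) + 2) / 3 AS grp
            FROM mytable1
        ) AS g
        WHERE g.grp = mytable2.id
    )
    """
)

averages = dict(conn.execute("SELECT id, actual FROM mytable2").fetchall())
print(averages)
```

With six source rows there is no third block, so the row with id 3 keeps a NULL Actual in this sketch.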