How to JOIN the same table with itself using ORDER BY RAND()? - mysql

I want to get random combinations of id of a table with itself.
SELECT id FROM t1
SELECT id as id2 FROM t1 ORDER BY RAND()
SELECT id as id3 FROM t1 ORDER BY RAND()
How can I JOIN these queries to get
SELECT id,id2,id3
1 random_id random_id
2 random_id random_id
3 random_id random_id
4 random_id random_id
5 random_id random_id
In other words, what condition can I JOIN on to simply place these three SELECTs side by side?
A unique combination would be ideal, but with the queries above, ORDER BY RAND() can assign the same id to both id2 and id3. Unique combinations are preferred, but repeats are acceptable for me.

If you are using MySQL 8+, then ROW_NUMBER might work here:
WITH cte AS (
SELECT id,
ROW_NUMBER() OVER (ORDER BY id) rn1,
ROW_NUMBER() OVER (ORDER BY RAND(UNIX_TIMESTAMP())) rn2,
ROW_NUMBER() OVER (ORDER BY RAND(UNIX_TIMESTAMP()+1)) rn3
FROM t1
)
SELECT
t1.id,
t2.id AS id2,
t3.id AS id3
FROM cte t1
INNER JOIN cte t2
ON t1.rn1 = t2.rn2
INNER JOIN cte t3
ON t1.rn1 = t3.rn3;
The demo for this answer used a sample table containing the id values from 1 to 10 inclusive.
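As a rough illustration of what the ROW_NUMBER query above does, here is a minimal Python sketch of the same idea outside the database: one column is ordered by id, two columns are independent random orderings, and the three are matched up by position. The function name `random_side_by_side` and the `seed` parameter are hypothetical, chosen only for this example.

```python
import random

def random_side_by_side(ids, seed=None):
    """Pair each id with two independent random permutations of the ids."""
    rng = random.Random(seed)
    base = sorted(ids)                    # plays the role of rn1 (ORDER BY id)
    perm2 = rng.sample(base, len(base))   # plays the role of rn2 (ORDER BY RAND(...))
    perm3 = rng.sample(base, len(base))   # plays the role of rn3 (a second random order)
    # "Joining on row number" is just matching the three lists by position
    return list(zip(base, perm2, perm3))

rows = random_side_by_side(range(1, 11), seed=42)
for row in rows:
    print(row)
```

Because each random column is a permutation, every id appears exactly once in id2 and once in id3, although the same id may still land in both columns of a single row.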

If you truly want random, then repeats are allowed. That suggests:
select t.*,
(select t2.id
from t t2
order by rand()
limit 1
) as id2,
(select t3.id
from t t3
order by rand()
limit 1
) as id3
from t;
If you want permutations in older versions of MySQL, then variables are handy:
select t.id, t1.id, t2.id
from (select t.id, (@rn := @rn + 1) as seqnum
from t cross join
(select @rn := 0) params
) t join
(select t.id, (@rn1 := @rn1 + 1) as seqnum
from (select t.* from t order by rand()) t cross join
(select @rn1 := 0) params
) t1
using (seqnum) join
(select t.id, (@rn2 := @rn2 + 1) as seqnum
from (select t.* from t order by rand()) t cross join
(select @rn2 := 0) params
) t2
using (seqnum);
In MySQL 8+, Tim's answer is the best approach.

For this sample data of sequential ids from 1 to n, this query, which uses string functions, will work and will return each of the ids exactly once in each of the columns id2 and id3:
select t1.id,
find_in_set(t1.id, t2.ids2) id2,
find_in_set(t1.id, t3.ids3) id3
from tablename t1
cross join (
select group_concat(id order by rand()) ids2
from tablename
) t2
cross join (
select group_concat(id order by rand()) ids3
from tablename
) t3
Results (like):
| id | id2 | id3 |
| --- | --- | --- |
| 1 | 5 | 4 |
| 2 | 2 | 5 |
| 3 | 1 | 2 |
| 4 | 4 | 3 |
| 5 | 3 | 1 |
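To see why this trick depends on the ids being sequential from 1 to n, here is a small Python sketch that mimics it. FIND_IN_SET returns the 1-based position of the id inside the shuffled list, not an id taken from the table. The helper name `shuffled_positions` is hypothetical.

```python
import random

def shuffled_positions(ids, rng):
    """Mimic FIND_IN_SET(id, GROUP_CONCAT(id ORDER BY RAND())) for every id."""
    shuffled = rng.sample(ids, len(ids))   # GROUP_CONCAT(id ORDER BY RAND())
    # FIND_IN_SET yields the 1-based position of the id in the shuffled list
    return {i: shuffled.index(i) + 1 for i in ids}

rng = random.Random(0)
sequential = shuffled_positions([1, 2, 3, 4, 5], rng)
gappy = shuffled_positions([10, 20, 30], rng)
print(sequential)  # values are a permutation of 1..5, so they are also valid ids
print(gappy)       # values are positions 1..3, which are not ids of this table
```

With ids such as 10, 20, 30 the result columns would contain 1, 2, 3, so this approach should only be used when the ids really are 1 to n without gaps.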

Related

Incrementing count ONLY for duplicates in MySQL

Here is my MySQL table. I updated the question by adding an 'id' column to it (as instructed in the comments by others).
id data_id
1 2355
2 2031
3 1232
4 9867
5 2355
6 4562
7 1232
8 2355
I want to add a new column called row_num to assign an incrementing number ONLY for duplicates, as shown below. Order of the results does not matter.
id data_id row_num
3 1232 1
7 1232 2
2 2031 null
1 2355 1
5 2355 2
8 2355 3
6 4562 null
4 9867 null
I followed this answer and came up with the code below. However, it also assigns a count of 1 to non-duplicate values. How can I modify it to add a count only for duplicates?
select data_id,row_num
from (
select data_id,
@row:=if(@prev=data_id,@row,0) + 1 as row_num,
@prev:=data_id
from my_table
)t
If you are running MySQL 8.0, you can do this more efficiently with window functions only:
select
data_id,
case when count(*) over(partition by data_id) > 1
then row_number() over(partition by data_id order by data_id)
end as row_num
from my_table
When the window count returns more than 1, you know that the current data_id has duplicates, in which case you can use row_number() to assign the incrementing number.
Note that, in the absence of an ordering column that uniquely identifies each record within a group sharing the same data_id, it is undefined which record will actually get each number.
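A quick way to check this logic against the sample data is to run it in SQLite through Python's standard sqlite3 module. This is only a sketch: it assumes SQLite 3.25 or later (for window functions) as a stand-in for MySQL 8.0, and it orders by id inside each partition, which is an assumption made here so that the numbering is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, data_id INTEGER)")
conn.executemany(
    "INSERT INTO my_table (id, data_id) VALUES (?, ?)",
    [(1, 2355), (2, 2031), (3, 1232), (4, 9867),
     (5, 2355), (6, 4562), (7, 1232), (8, 2355)],
)

# COUNT(*) OVER detects duplicates; ROW_NUMBER() numbers them within each data_id.
# A CASE without ELSE yields NULL for the non-duplicated rows.
rows = conn.execute("""
    SELECT id, data_id,
           CASE WHEN COUNT(*) OVER (PARTITION BY data_id) > 1
                THEN ROW_NUMBER() OVER (PARTITION BY data_id ORDER BY id)
           END AS row_num
    FROM my_table
    ORDER BY data_id, id
""").fetchall()

for row in rows:
    print(row)
```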
I am assuming that id is the column that defines the order on the rows.
In MySQL 8 you can use row_number() to get the number of each data_id and a CASE with EXISTS to exclude the rows which have no duplicate.
SELECT t1.data_id,
CASE
WHEN EXISTS (SELECT *
FROM my_table t2
WHERE t2.data_id = t1.data_id
AND t2.id <> t1.id) THEN
row_number() OVER (PARTITION BY t1.data_id
ORDER BY t1.id)
END row_num
FROM my_table t1;
In older versions you can use a subquery counting the rows with the same data_id but smaller id. With an EXISTS in a HAVING clause you can exclude the rows that have no duplicate.
SELECT t1.data_id,
(SELECT count(*)
FROM my_table t2
WHERE t2.data_id = t1.data_id
AND t2.id < t1.id
HAVING EXISTS (SELECT *
FROM my_table t2
WHERE t2.data_id = t1.data_id
AND t2.id <> t1.id)) + 1 row_num
FROM my_table t1;
Join with a query that returns the number of duplicates.
select t1.data_id, IF(t2.dups > 1, row_num, '') AS row_num
from (
select data_id,
@row:=if(@prev=data_id,@row,0) + 1 as row_num,
@prev:=data_id
from my_table
order by data_id
) AS t1
join (
select data_id, COUNT(*) AS dups
FROM my_table
GROUP BY data_id
) AS t2 ON t1.data_id = t2.data_id
If you want to have the old "order" of the old table, you need much more code
SELECT
data_id, IF (row_num = 1 AND cntid = 1, NULL,row_num)
FROM
(SELECT
@row:=IF(@prev = t1.data_id, @row, 0) + 1 AS row_num,
cntid,
@prev:=t1.data_id data_id
FROM
(SELECT
*
FROM
my_table
ORDER BY data_id) t1
INNER JOIN (SELECT Count(*) cntid,data_id FROM my_table GROUP BY data_id)t2
ON t1.data_id = t2.data_id) t2
data_id | IF (row_num = 1 AND cntid = 1, NULL,row_num)
------: | -------------------------------------------:
1232 | 1
1232 | 2
2031 | null
2355 | 1
2355 | 2
2355 | 3
4562 | null
9867 | null

MySQL join on row number (first with first, second with second etc)

Let's say I have 2 simple tables
Table t1 Table t2
+------+ +------+
| i | | j |
+------+ +------+
| 42 | | a |
| 1 | | b |
| 5 | | c |
+------+ +------+
How can I have an output of the 2 tables, joined without any condition except the row number?
I would like to avoid the creation of another index if possible.
I am using MySQL 5.7
With this example, the output would be :
Table output
+------+------+
| i | j |
+------+------+
| 42 | a |
| 1 | b |
| 5 | c |
+------+------+
What you ask can be done, assuming that your comment is true;
"Even if table i and j are subqueries (containing order by)?"
Schema (MySQL v5.7)
CREATE TABLE table_1 ( i INT );
CREATE TABLE table_2 ( j VARCHAR(4) );
INSERT INTO table_1
VALUES (3),(5),(1);
INSERT INTO table_2
VALUES ('c'), ('b'),('a');
Query
SELECT t1.i, t2.j
FROM (SELECT t1.i
, @rownum1 := @rownum1 + 1 AS rownum
FROM (SELECT table_1.i
FROM table_1
ORDER BY ?) t1
CROSS JOIN (SELECT @rownum1 := 0) v) t1
JOIN (SELECT t2.j
, @rownum2 := @rownum2 + 1 AS rownum
FROM (SELECT table_2.j
FROM table_2
ORDER BY ?) t2
CROSS JOIN (SELECT @rownum2 := 0) v) t2 ON t2.rownum = t1.rownum;
However, this approach is a) not efficient, and b) indicative of questionable design. You probably want to look for something that actually relates your two tables or, if nothing exists, create something. If there is really nothing that relates the two tables, then you'll have trouble with the ORDER BY clauses anyway.
If the tables do not necessarily have the same number of rows, then use union all and group by -- along with variables:
select max(t.i) as i, max(t.j) as j
from ((select (@rn1 := @rn1 + 1) as seqnum, t1.i, null as j
from t1 cross join
(select @rn1 := 0) params
) union all
(select (@rn2 := @rn2 + 1) as seqnum, null as i, t2.j
from t2 cross join
(select @rn2 := 0) params
)
) t
group by seqnum;
Note: The results in each column are in an arbitrary and indeterminate order. The order might vary on different runs on the query.
You don't provide enough information to ensure the ordering.
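The effect of the union all plus group by query is easiest to picture as a positional zip that pads the shorter side. A minimal Python sketch, assuming the rows arrive in the order shown in the question (which SQL itself does not guarantee) and using a hypothetical helper name `join_by_position`:

```python
from itertools import zip_longest

def join_by_position(left, right):
    """Pair the n-th element of left with the n-th element of right."""
    # zip_longest pads the shorter input with None, which corresponds to the
    # NULL that MAX() returns when a seqnum group has a row from one table only
    return list(zip_longest(left, right))

rows = join_by_position([42, 1, 5], ["a", "b"])
print(rows)
```

Here the third row has no partner in the second list, so its second column is None, just as the SQL version would return NULL for j.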
You can try this code:
select t1.i,t2.j
from
(SELECT i,@row_num:=@row_num+1 as row_num FROM t1, (SELECT @row_num:= 0) AS sl) t1
join
(SELECT j,@row_num:=@row_num+1 as row_num FROM t2, (SELECT @row_num:= 0) AS sl) t2
on t1.row_num=t2.row_num

SQL query to join two tables with no repeated values?

Table 1
ID | NAME | WARD_ID|
1 A 1
2 B 1
3 C 2
4 D 2
Table 2
ID | MONTH1 | MONTH2 | WARD_ID|
1 9 10 1
2 6 11 1
3 5 12 2
4 13 14 2
I want to join this two table and produce the following output:
ID | NAME | MONTH1 | MONTH2 | WARD_ID|
1 A 9 10 1
2 B 6 11 1
3 C 5 12 2
4 D 13 14 2
In the ON condition of the query I have to keep WARD_ID equal for both tables. I have not been able to figure out the solution. Does anyone have experience with a query like this?
I think you want something like this:
select t1.*, t2.*
from (select t1.*,
(@rn1 := if(@w1 = ward_id, @rn1 + 1,
if(@w1 := ward_id, 1, 1)
)
) as rn
from (select t1.* from table1 t1 order by ward_id, id ) t1 cross join
(select @w1 := -1, @rn1 := -1) params
) t1 join
(select t2.*,
(@rn2 := if(@w2 = ward_id, @rn2 + 1,
if(@w2 := ward_id, 1, 1)
)
) as rn
from (select t2.* from table2 t2 order by ward_id, id ) t2 cross join
(select @w2 := -1, @rn2 := -1) params
) t2
on t2.ward_id = t1.ward_id and t2.rn = t1.rn;
The subqueries enumerate the rows in each table. The join then uses the enumeration.
This is much simpler in MySQL 8.0, using row_number().
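For reference, a row_number() version might look like the sketch below. It is run here against SQLite (3.25 or later, an assumption) through Python's sqlite3 module rather than MySQL 8.0, and the lowercase table and column names are assumptions based on the question's sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, name TEXT, ward_id INTEGER);
    CREATE TABLE table2 (id INTEGER, month1 INTEGER, month2 INTEGER, ward_id INTEGER);
    INSERT INTO table1 VALUES (1,'A',1),(2,'B',1),(3,'C',2),(4,'D',2);
    INSERT INTO table2 VALUES (1,9,10,1),(2,6,11,1),(3,5,12,2),(4,13,14,2);
""")

# Enumerate the rows inside each ward in both tables, then join on (ward_id, rn)
rows = conn.execute("""
    SELECT t1.id, t1.name, t2.month1, t2.month2, t1.ward_id
    FROM (SELECT t.*, ROW_NUMBER() OVER (PARTITION BY ward_id ORDER BY id) AS rn
          FROM table1 t) t1
    JOIN (SELECT t.*, ROW_NUMBER() OVER (PARTITION BY ward_id ORDER BY id) AS rn
          FROM table2 t) t2
      ON t2.ward_id = t1.ward_id AND t2.rn = t1.rn
    ORDER BY t1.id
""").fetchall()

for row in rows:
    print(row)
```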
I'm assuming here that ID is intended to be the same from both tables. If so, I think you can do a multi-condition join:
select * from table1 t1
inner join table2 t2
on t1.ID=t2.ID and t1.WARD_ID=t2.WARD_ID
You can do something like:
SET @rn:=0;
SET @rn2:=0;
SELECT *
FROM (
SELECT @rn:=@rn+1 AS rn1, t1.ID, t1.NAME, t1.WARD_ID
FROM t1
GROUP BY t1.WARD_ID, t1.NAME
ORDER BY t1.WARD_ID, t1.NAME
) s1
INNER JOIN (
SELECT @rn2:=@rn2+1 AS rn2, t2.ID, t2.MONTH1, t2.MONTH2, t2.WARD_ID
FROM t2
GROUP BY t2.WARD_ID, t2.MONTH1,t2.MONTH2
ORDER BY t2.WARD_ID, t2.MONTH1,t2.MONTH2
) s2 ON s1.WARD_ID = s2.WARD_ID
AND s1.rn1 = s2.rn2
But it really doesn't reliably sort the tables to join the same rows every time. I still think there isn't a reliable/repeatable way to join the two tables the same every time.
If ID can be used to reliably sort the two tables, you can use it in the ORDER BYs. I've added it in this Fiddle (http://sqlfiddle.com/#!9/aa2db0/1) and included rows in the setup that would sort before the existing records and potentially change the ordering. The setup also has more records in Table 2 than in Table 1, which could otherwise produce duplicated rows. These extra rows are ignored since they can't be matched between the tables.

join multiple columns with single column in mysql

I have 2 tables as follows
t1:
id code field1 field2
1 1000 a1111 a2222
2 2000 b1111 b2222
3 1000 a3333 a4444
4 2000 b3333 b4444
5 2000 b5555 b6666
6 3000 c1111 c2222
7 3000 c3333 c4444
8 3000 c5555 c6666
t2:
t2id t1_code var1
1 1000 xxxx
2 2000 yyyyy
3 3000 zzz
4 3000 mmm
I want the result table as:
code field1 field2 t1_code var1
1000 a3333 a4444 1000 xxxx
1000 a1111 a2222 null null
2000 b3333 b4444 2000 yyyyy
2000 b5555 b6666 null null
2000 b1111 b2222 null null
3000 c1111 c2222 3000 mmm
3000 c3333 c4444 3000 zzz
3000 c3333 c4444 null null
I tried:
SELECT t1.code, t1.field1, t1.field2, t2.t1_code, t2.var1
FROM t2, t1
WHERE t1_code = code
ORDER BY code
But this is not giving me the expected result.
Please help.
Okay, you want the join to go to the "first" matching row in t1 (where "first" means lowest id).
Here is almost a way to do it:
SELECT t1.code, t1.field1, t1.field2, t2.t1_code, t2.var1
FROM t1 left outer join
(select t1.code, min(id) as minid
from t1
group by t1.code
) t1min
on t1.id = t1min.minid left outer join
t2
on t2.t1_code = t1min.code;
What this does is join the original table to another table to find the minimum id for each code. By joining on the ids, the t1min.code will only have values at the min ids. The final join just brings in the code for these.
EDIT:
The actual problem is a bit more complicated. I think the best way is simply to enumerate the values in each table and do the join with two keys:
select t1.code, t1.field1, t1.field2, t2.t1_code, t2.var1
from (select t1.*,
@rn1 := if(@code = code, @rn1 + 1, 1) as rn, @code := code
from t1 cross join
(select @rn1 := 0, @code := '') const
order by code, id
) t1 left outer join
(select t2.*,
@rn2 := if(@code = t1_code, @rn2 + 1, 1) as rn, @code := t1_code
from t2 cross join
(select @rn2 := 0, @code := '') const
order by t1_code, t2id
) t2
on t1.code = t2.t1_code and t1.rn = t2.rn
order by t1.id;
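The same two-key idea can be tried out with window functions instead of variables. The sketch below is not the answer's query: it swaps the variables for ROW_NUMBER() and runs on SQLite (3.25 or later, an assumption) via Python's sqlite3 module, using the sample data from the question. The aliases `a` and `b` are hypothetical. The LEFT JOIN keeps every t1 row and leaves NULLs where t2 has no row with the same code and position.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER, code INTEGER, field1 TEXT, field2 TEXT);
    CREATE TABLE t2 (t2id INTEGER, t1_code INTEGER, var1 TEXT);
    INSERT INTO t1 VALUES
        (1,1000,'a1111','a2222'),(2,2000,'b1111','b2222'),
        (3,1000,'a3333','a4444'),(4,2000,'b3333','b4444'),
        (5,2000,'b5555','b6666'),(6,3000,'c1111','c2222'),
        (7,3000,'c3333','c4444'),(8,3000,'c5555','c6666');
    INSERT INTO t2 VALUES (1,1000,'xxxx'),(2,2000,'yyyyy'),(3,3000,'zzz'),(4,3000,'mmm');
""")

# Number the rows per code in both tables, then match on (code, position)
rows = conn.execute("""
    SELECT a.code, a.field1, a.field2, b.t1_code, b.var1
    FROM (SELECT t1.*, ROW_NUMBER() OVER (PARTITION BY code ORDER BY id) AS rn
          FROM t1) a
    LEFT JOIN (SELECT t2.*, ROW_NUMBER() OVER (PARTITION BY t1_code ORDER BY t2id) AS rn
               FROM t2) b
      ON b.t1_code = a.code AND b.rn = a.rn
    ORDER BY a.id
""").fetchall()

for row in rows:
    print(row)
```

Each t2 row is used at most once, so codes with more t1 rows than t2 rows end up with NULLs in the remaining rows.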

SQL DELETE all rows apart from last N rows for each unique value

Here's a tough one,
How would I delete all but the last, say 3 rows, for each unique value in a different field?
Here's a visual of the problem:
id | otherfield
---------------
1 | apple <- DELETE
2 | banana <- KEEP
3 | apple <- DELETE
4 | apple <- KEEP
5 | carrot <- KEEP
6 | apple <- KEEP
7 | apple <- KEEP
8 | banana <- KEEP
How would I accomplish this in SQL?
Untested, but something along these lines might work:
DELETE t.*
FROM `table` t JOIN (
SELECT id,
@rowNum := IF(@otherfield <> otherfield, 1, @rowNum + 1) rn,
@otherfield := otherfield otherfield
FROM (
SELECT id, otherfield
FROM `table`
ORDER BY otherfield, id DESC
) t, (SELECT @otherfield := NULL, @rowNum := 0) dm
) rs ON t.id = rs.id
WHERE rs.rn > 3
Delete MyTable
Where Id In (
Select Id
From (
Select Id
, (Select COUNT(*)
From MyTable As T2
Where T2.OtherField = T.OtherField
And T2.Id > T.Id) As Rnk
From MyTable As T
) As Z
Where Z.Rnk > 2
)
Another version which might be a bit faster:
Delete MyTable
Where Id In (
Select T.Id
From MyTable As T
Left Join MyTable As T2
On T2.OtherField = T.OtherField
And T2.Id > T.Id
Group By T.Id
Having Count(T2.Id) > 2
)
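On engines with window functions, the same deletion can also be expressed with ROW_NUMBER(). This is a different technique from the answers above, shown only as a sketch: it runs on SQLite (3.25 or later, an assumption) through Python's sqlite3 module, and the constant `KEEP` is a hypothetical name for N.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (Id INTEGER PRIMARY KEY, OtherField TEXT)")
conn.executemany(
    "INSERT INTO MyTable (Id, OtherField) VALUES (?, ?)",
    [(1, "apple"), (2, "banana"), (3, "apple"), (4, "apple"),
     (5, "carrot"), (6, "apple"), (7, "apple"), (8, "banana")],
)

KEEP = 3  # number of most recent rows to keep per OtherField value

# Rank rows per OtherField from newest (highest Id) to oldest,
# then delete everything ranked beyond the first KEEP rows
conn.execute("""
    DELETE FROM MyTable
    WHERE Id IN (
        SELECT Id
        FROM (
            SELECT Id,
                   ROW_NUMBER() OVER (PARTITION BY OtherField ORDER BY Id DESC) AS rn
            FROM MyTable
        ) AS ranked
        WHERE rn > ?
    )
""", (KEEP,))

remaining = [r[0] for r in conn.execute("SELECT Id FROM MyTable ORDER BY Id")]
print(remaining)
```

With the sample data from the question, only the two oldest apple rows (ids 1 and 3) are removed.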