MySQL RAND() function causes unexpected multi-row results

When I try to get a random row from a table by id using the RAND() function, I get unexpected, unstable results. The following query (where the id column is the primary key) returns 1, 2 or even more rows:
I tried the following variant as well, which produces the same result:
SELECT id, word FROM words WHERE id = FLOOR(RAND() * 1000)
I found another solution for my task:
SELECT id, word FROM words ORDER BY RAND() LIMIT 1
But I want to know why MySQL behaves so unexpectedly with such elementary functionality. It scares me.
I experimented in different IDEs with the same results.

The behavior is not unexpected. The RAND() function is evaluated per-row:
SELECT RAND() FROM sometable LIMIT 10
+----------------------+
| RAND() |
+----------------------+
| 0.7383128467372738 |
| 0.6141578719151746 |
| 0.8558508500976961 |
| 0.4367806654766022 |
| 0.6163508078235674 |
| 0.7714120734216757 |
| 0.0080079743713214 |
| 0.7258036823252251 |
| 0.6049945192458057 |
| 0.8475615799869984 |
+----------------------+
Keeping this in mind, this query:
SELECT * FROM words WHERE id = FLOOR(RAND() * 1000)
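means that a new random value is compared against every row, so each row with an id between 0 and 999 independently has a 1/1000 probability of being selected. The query can therefore return zero, one, or several rows.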
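The per-row evaluation described above can be imitated outside the database. The following is a minimal Python sketch, not MySQL itself: the table is assumed to hold ids 0 to 999, and the function names and the seed are made up for illustration. It compares drawing a fresh random value for every row with drawing the target id once.

```python
import random

def rows_matched_per_row_eval(ids, rng):
    """Imitate WHERE id = FLOOR(RAND() * 1000): a fresh random value for every row."""
    return [i for i in ids if i == int(rng.random() * 1000)]

def rows_matched_single_eval(ids, rng):
    """Draw the target id once, then compare every row against that one value."""
    target = int(rng.random() * 1000)
    return [i for i in ids if i == target]

rng = random.Random(2024)  # arbitrary seed, only so the run is repeatable
ids = range(1000)          # hypothetical table with ids 0..999

# Number of rows returned by each approach over 200 simulated queries.
per_row_counts = [len(rows_matched_per_row_eval(ids, rng)) for _ in range(200)]
single_counts = [len(rows_matched_single_eval(ids, rng)) for _ in range(200)]

print("per-row evaluation, distinct row counts:", sorted(set(per_row_counts)))
print("single evaluation, distinct row counts:", sorted(set(single_counts)))
```

The single-evaluation variant always matches exactly one row here because every id from 0 to 999 exists. In MySQL the equivalent would be to compute the random id once, for example into a user variable, before running the SELECT.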

Related

How do I check whether the result row of given mysql query exceeds a certain number without using count()

My problem is to find out whether a MySQL query will return more than a certain number of rows (for example 5000). I know I can use select * ... limit 5001 instead of count() to save time, but that still returns 5001 rows, which are useless in my scenario because all I want is a simple yes/no answer. Is there a better approach? Big thanks! ^_^
The accepted answer in the link provided by Devsi Odedra is substantially correct, but if you don't want a big result set, select a column into a user-defined variable and limit the result to 1 row.
For example:
MariaDB [sandbox]> select * from dates limit 7;
+----+------------+
| id | dte |
+----+------------+
| 1 | 2018-01-02 |
| 2 | 2018-01-03 |
| 3 | 2018-01-04 |
| 4 | 2018-01-05 |
| 5 | 2018-01-06 |
| 6 | 2018-01-07 |
| 7 | 2018-01-08 |
+----+------------+
SELECT SQL_CALC_FOUND_ROWS ID INTO @ID FROM DATES WHERE ID < 5 LIMIT 1;
SELECT FOUND_ROWS();
+--------------+
| FOUND_ROWS() |
+--------------+
| 4 |
+--------------+
1 row in set (0.001 sec)
SELECT 1 FROM tbl
WHERE ... ORDER BY ...
LIMIT 5000, 1;
will give you either a row or no row, which indicates whether or not there are more than 5000 rows. Wrapping it in EXISTS( ... ) turns that into "true" or "false": essentially the same effort, but perhaps clearer syntax.
Caution: If the WHERE and ORDER BY cannot be handled by an INDEX, the query may still read the entire table before getting to the 5001st row.
When paginating, I recommend
LIMIT 11
to fetch the 10 rows to display, plus an 11th row that indicates that there are more rows.
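The offset probe above can be tried end to end with a small script. This is a sketch using Python's built-in sqlite3 module rather than MySQL, with a hypothetical table `tbl` and a made-up helper name; SQLite's `LIMIT 1 OFFSET n` plays the role of MySQL's `LIMIT n, 1`.

```python
import sqlite3

def has_more_than(conn, threshold):
    """Return True if tbl holds more than `threshold` rows.

    Skips `threshold` rows and asks for one more; a returned row means
    the table is larger than the threshold. Only one row is sent back.
    """
    row = conn.execute(
        "SELECT 1 FROM tbl ORDER BY id LIMIT 1 OFFSET ?", (threshold,)
    ).fetchone()
    return row is not None

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER PRIMARY KEY)")
# 5001 rows, so the table is just over the 5000-row threshold.
conn.executemany("INSERT INTO tbl (id) VALUES (?)", [(i,) for i in range(1, 5002)])

print(has_more_than(conn, 5000))  # table has 5001 rows
print(has_more_than(conn, 5001))
```

The caller only ever receives zero or one row, which gives the yes/no answer without transferring the 5001 rows themselves.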

Sql query with mixed selected rows order

I have a table that looks like below:
| id | group_id | title |
-------------------------
| 1 | 1 | Hello |
| 2 | 1 | World |
| 3 | 2 | Foo |
| 4 | 2 | Bar |
My query may look like below to return the results above:
SELECT * FROM my_table ORDER BY id
Question
How can I order this table so that the group ids appear to be random, but the order is still the same every time the query is executed?
Possible result example
This result looks to be in a random order. If I run the same query a week later, I want to see the exact same order which means it's not really random.
| id | group_id | title |
-------------------------
| 2 | 1 | World |
| 4 | 2 | Bar |
| 1 | 1 | Hello |
| 3 | 2 | Foo |
Appears to be random from a group_id perspective. It's no longer ordered by group_id like 1 1 2 2, but 1 2 1 2. It could also have been 2 1 1 2 or something that does not increase.
Should return the same results every time, not random each time.
I could order by title but if a title should change that row will be reordered. So the order needs to be made with the id I guess.
I want to avoid file or database caching if possible.
Is it possible?
How about using the modulo function to your advantage?
SELECT * FROM my_table ORDER BY id % 3,id
Define a value to use with the modulo function (in my example 3) and order your table by the modulo of the id.
This should return the same order every time you run the query, and the order looks pseudo-random.
Since the modulo function can return the same value for different ids, you also need to order by the original id to get a defined, reproducible order.
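To see what ORDER BY id % 3, id does to the four sample rows from the question, the same sort key can be applied in Python. This is only an illustration of the ordering; the function name is made up, and the rows are copied from the question's table.

```python
# (id, group_id, title) rows copied from the question's sample table.
rows = [(1, 1, "Hello"), (2, 1, "World"), (3, 2, "Foo"), (4, 2, "Bar")]

def order_by_modulo(rows, modulus=3):
    """Sort by (id % modulus, id), mirroring ORDER BY id % 3, id."""
    return sorted(rows, key=lambda r: (r[0] % modulus, r[0]))

ordered = order_by_modulo(rows)
for row in ordered:
    print(row)
# ids come out as 3, 1, 4, 2, so the group ids read 2, 1, 2, 1
```

The result depends only on the ids, so it is identical on every run, and changing a title does not move a row.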
order this table so that the group ids appears to be random
Only ORDER BY RAND() may provide really random ordering.
but still the same every time the query is executed
Create a separate static ordering table, fill it randomly with the source table's ids, join it, and order by it.
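A static ordering table of this kind might look as follows. This is a sketch using Python's built-in sqlite3 module instead of MySQL; the table name `my_table_order` and its `position` column are assumptions made for the example, and the shuffle happens once, when the ordering table is filled.

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (id INTEGER PRIMARY KEY, group_id INTEGER, title TEXT)"
)
conn.executemany(
    "INSERT INTO my_table (id, group_id, title) VALUES (?, ?, ?)",
    [(1, 1, "Hello"), (2, 1, "World"), (3, 2, "Foo"), (4, 2, "Bar")],
)

# One-time step: shuffle the ids and store each id with its position.
ids = [r[0] for r in conn.execute("SELECT id FROM my_table ORDER BY id")]
random.shuffle(ids)
conn.execute(
    "CREATE TABLE my_table_order (id INTEGER PRIMARY KEY, position INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO my_table_order (id, position) VALUES (?, ?)",
    [(row_id, pos) for pos, row_id in enumerate(ids)],
)

def fetch_ordered(conn):
    """Join the source table to the stored positions and order by them."""
    return conn.execute(
        "SELECT t.id, t.group_id, t.title "
        "FROM my_table t JOIN my_table_order o ON o.id = t.id "
        "ORDER BY o.position"
    ).fetchall()

first_run = fetch_ordered(conn)
second_run = fetch_ordered(conn)
print(first_run)
```

Because the random positions are stored rather than recomputed, every later query returns the same order. Rows added to the source table afterwards would need a position assigned as well, since the inner join otherwise leaves them out.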
I did not solve the problem with the solution from @Kylro, but I found another way which works great.
SELECT * FROM my_table ORDER BY COS(id), id
COS is sometimes positive and sometimes negative, so the result looks almost random. It works perfectly for this problem.

How to sample rows in MySQL using RAND(seed)?

I need to fetch a repeatable random set of rows from a table using MySQL. I implemented this using the MySQL RAND function, with the bigint primary key of the row as the seed. Interestingly, this produces numbers that don't look random at all. Can anyone tell me what's going on here and how to get it to work properly?
select id from foo where rand(id) < 0.05 order by id desc limit 100
In one example, not a single one of 600 rows was returned. I changed the select to include "id, rand(id)" and removed the rand clause from the where. This is what I got:
| 163345 | 0.315191733944408 |
| 163343 | 0.814825518815616 |
| 163337 | 0.313726862253367 |
| 163334 | 0.563177533972242 |
| 163333 | 0.312994424545201 |
| 163329 | 0.312261986837035 |
| 163327 | 0.811895771708242 |
| 163322 | 0.560980224573035 |
| 163321 | 0.310797115145994 |
| 163319 | 0.810430896291911 |
| 163318 | 0.560247786864869 |
| 163317 | 0.310064677437828 |
Look how many 0.31xxx lines there are. Not at all random.
PS: I know this is slow but in my app the where clause limits the number of rows to a few 1000.
Use the same seed for all the rows to do that, like:
select id from foo where rand(42) < 0.05 order by id desc limit 100
See the rand() docs for why it works that way. Change the seed if you want another set of values.
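The idea of seeding once and then drawing one value per row can be sketched in Python. This uses Python's own generator, not MySQL's, so the selected ids will differ from what rand(42) picks; the id range and function name are made up for the example.

```python
import random

def sample_ids(ids, fraction, seed):
    """Keep each id with probability `fraction`, using one generator seeded once."""
    rng = random.Random(seed)  # seeded a single time, like rand(42)
    return [i for i in ids if rng.random() < fraction]

# 600 hypothetical ids, visited in a fixed descending order.
ids = list(range(163345, 162745, -1))

sample_a = sample_ids(ids, 0.05, 42)
sample_b = sample_ids(ids, 0.05, 42)  # same seed, same row order
sample_c = sample_ids(ids, 0.05, 43)  # different seed, another sample

print(len(sample_a), sample_a == sample_b)
```

In this sketch the sample is repeatable only because the ids are visited in the same order each time. The same caveat likely applies in MySQL: if rows are read in a different order, the same seed can select different rows.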
Multiply the decimal number returned by id:
select id from foo where rand() * id < 5 order by id desc limit 100

Order the rows of a MySQL result based on a "next_id" field

I'm currently working with a database table that is structured as follows:
______________________________
| id | content | next_id |
|------|-----------|-----------|
| 1 | (value) | 4 |
| 2 | (value) | 1 |
| 3 | (value) | (NULL) |
| 4 | (value) | 3 |
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
The value of the next_id field defines the id of the row of data that should follow it. A value of NULL means that no row follows it.
Is there a way I can query the database so that the resulting rows are ordered using this method? For example, in the case I gave above, the rows should be returned ordered so that the ids are in this order: 2, 1, 4, 3. I'm looking for a solution that can do this regardless of the number of rows in the sequence.
I know that it is possible to reorder the results after retrieving them from the database (using the programming language I'm working with), but I'm hoping that there is a way that I can do it in SQL.
I can't see a solution without as many self-joins as you have rows. Instead, I would build a nested set out of it in a temp table using a push-down stack algorithm and then retrieve the full tree.
I've got something that's close.
/*one select to init the @next variable to the first row*/
select @next:= id from table1 order by isnull(next_id) asc, next_id asc limit 1;
select distinct a.id, a.next_id from table1 b
inner join
(
select @rank:= id as id, @next:= next_id as next_id from table1
where id = @next
) a
on (b.id = b.id);
This outputs
+----+---------+
| id | next_id |
+----+---------+
| 2 | 1 |
| 1 | 4 |
+----+---------+
And then stops. If only I could find a way for it to continue....
Anyway this sort of force feeding values into a query is dodgy enough when doing ranking, let alone this sort of stuff, so maybe I'm going down a dead end.
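A different technique from the user-variable attempt above is a recursive common table expression, which follows the next_id links one row at a time. The sketch below runs it through Python's built-in sqlite3 module; it assumes a server with recursive CTE support (MySQL 8.0 or later has WITH RECURSIVE, which may not match the asker's version), and the CTE name `chain` and the `pos` column are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (id INTEGER PRIMARY KEY, content TEXT, next_id INTEGER)"
)
# Sample rows from the question; content is a placeholder.
conn.executemany(
    "INSERT INTO table1 (id, content, next_id) VALUES (?, ?, ?)",
    [(1, "(value)", 4), (2, "(value)", 1), (3, "(value)", None), (4, "(value)", 3)],
)

CHAIN_QUERY = """
WITH RECURSIVE chain(id, next_id, pos) AS (
    -- Start at the head: the row that no other row points to.
    SELECT id, next_id, 1
    FROM table1
    WHERE id NOT IN (SELECT next_id FROM table1 WHERE next_id IS NOT NULL)
    UNION ALL
    -- Follow the link from the current row to its successor.
    SELECT t.id, t.next_id, c.pos + 1
    FROM table1 t
    JOIN chain c ON t.id = c.next_id
)
SELECT id FROM chain ORDER BY pos
"""

ordered_ids = [row[0] for row in conn.execute(CHAIN_QUERY)]
print(ordered_ids)  # [2, 1, 4, 3]
```

The recursion stops on its own when it reaches the row whose next_id is NULL, so it works for any chain length. It assumes the links form a single chain without cycles.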

MySQL Sorting Results takes a long time

Lately I've been getting MySQL to hang on specific queries. I have a table with 500,000+ records. Here is the query being run:
SELECT * FROM items WHERE (itemlist_id = 115626) ORDER BY tableOrder DESC LIMIT 1
Here is the explain:
| 1 | SIMPLE | items | ALL | NULL | NULL | NULL | NULL | 587113 | Using where; Using filesort |
And here is the process_list entry:
| 252996 | root | localhost | itemdb | Query | 0 | Sorting result | SELECT * FROM items WHERE (itemlist_id = 115642) ORDER BY tableOrder DESC LIMIT 1 |
Any idea what could be causing this query to take 10 minutes to process? When I run it manually it's done quickly. (1 row in set (0.86 sec))
Thanks
You need to create an index on items (itemlist_id, tableOrder) and rewrite the query a little:
SELECT *
FROM items
WHERE itemlist_id = 115626
ORDER BY
itemlist_id DESC, tableOrder DESC
LIMIT 1
The first condition in ORDER BY may seem to be redundant, but it helps MySQL to choose the correct plan (which does not sort).
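The effect of the composite index can be tried on a small scale. This sketch uses Python's built-in sqlite3 module, whose query planner is not MySQL's, so it only illustrates the idea; the index name `idx_items_list_order` and the sample data are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, itemlist_id INTEGER, tableOrder INTEGER)"
)
# Three hypothetical lists with five items each.
conn.executemany(
    "INSERT INTO items (itemlist_id, tableOrder) VALUES (?, ?)",
    [(list_id, order) for list_id in (1, 2, 3) for order in range(1, 6)],
)

# Composite index: the filter column first, then the sort column.
conn.execute("CREATE INDEX idx_items_list_order ON items (itemlist_id, tableOrder)")

QUERY = (
    "SELECT * FROM items WHERE itemlist_id = ? "
    "ORDER BY itemlist_id DESC, tableOrder DESC LIMIT 1"
)

# The last column of each EXPLAIN QUERY PLAN row holds the description.
plan_details = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + QUERY, (2,))]
top_row = conn.execute(QUERY, (2,)).fetchone()

print(plan_details)
print(top_row)
```

With the index in place, the planner can jump to the entries for one itemlist_id and read them in tableOrder sequence instead of scanning and sorting the whole table. On MySQL, running EXPLAIN on the rewritten query is the way to confirm that the "Using filesort" step is gone.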