MySQL limit work around - mysql

I need to limit records based on a percentage, but MySQL's LIMIT does not accept one. I need 10 percent of the User Ids, based on count(User Id)/max(Total_Users_bynow).
My code is as follows:
select * from flavia.TableforThe_top_10percent_of_the_user where `User Id` in (select distinct(`User Id`) from flavia.TableforThe_top_10percent_of_the_user group by `User Id` having count(distinct(`User Id`)) <= round((count(`User Id`)/max(Total_Users_bynow))*0.1)*count(`User Id`));
Kindly help.

Consider splitting your problem in pieces. You can use user variables to get what you need. Quoting from this question's answers:
You don't have to solve every problem in a single query.
So... let's get this done. I'll not put your full query, but some examples:
-- Step 1. Get the total number of rows in your dataset
set @nrows = (select count(*) from (select ...) as a);
-- --------------------------------------^^^^^^^^^^
-- The full original query (or, if possible, a simpler version of it) goes here
-- Step 2. Calculate how many rows you want to retrieve
-- You may use "round()", "ceiling()" or "floor()", whichever fits your needs
set @limrows = round(@nrows * 0.1);
-- Step 3. Run your query:
select ...
limit @limrows;
After checking, I found this post, which says that the approach above won't work: LIMIT does not accept a user variable in a plain query. There is, however, an alternative:
-- Step 1. Get the total number of rows in your dataset
set @nrows = (select count(*) from (select ...) as a);
-- --------------------------------------^^^^^^^^^^
-- The full original query (or, if possible, a simpler version of it) goes here
-- Step 2. Calculate how many rows you want to retrieve
-- You may use "round()", "ceiling()" or "floor()", whichever fits your needs
set @limrows = round(@nrows * 0.1);
-- Step 3. (UPDATED) Run your query.
-- You'll need to add a "rownumber" column to make this work.
select *
from (select @rownum := @rownum+1 as rownumber
, ... -- The rest of your columns
from (select @rownum := 0) as init
, ... -- The rest of your FROM definition
order by ... -- Be sure to order your data
) as a
where rownumber <= @limrows
Hope this helps (I think it will work without a quirk this time)
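On MySQL 8.0 or later, the same 10 percent cut can probably be expressed with window functions instead of user variables. A minimal sketch, assuming a hypothetical table my_table with a column score that defines what "top" means (neither name comes from the question):

```sql
-- Assumed names: my_table and score are placeholders, not from the question.
SELECT *
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (ORDER BY t.score DESC) AS rn,  -- position in the ranking
           COUNT(*) OVER () AS total_rows                    -- size of the whole result
    FROM my_table AS t
) AS ranked
WHERE rn <= CEILING(total_rows * 0.1);  -- keep the first 10 percent
```

CEILING() rounds up, so at least one row comes back for any non-empty table; swap in ROUND() or FLOOR() if that fits your needs better.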

Related

Find next or previous ID when query contains multiple cases

I am looking for the most efficient way to find the next or previous ID of the following query:
SELECT *
FROM transactions
ORDER
BY CASE order_status
WHEN 'order_accepted' THEN 1
WHEN 'processing_order' THEN 2
WHEN 'order_send_mailer' THEN 3
WHEN 'order_send' THEN 4
WHEN 'order_received' THEN 5
WHEN 'order_refunded' THEN 6
ELSE 7 END
, id DESC limit 1;
I tried adding a where id > '$id' or where id < '$id' clause to the query, but it didn't give me the next or previous ID I was looking for.
For those who need some explanation of what I am trying to do: it's to go to the next or previous order, by case, with a forward or backward button.
What it currently looks like:
-id- -order_status-
9399 order_accepted
9398 processing_order
9363 processing_order
9403 order_send_mailer
9318 order_send
9346 order_received
9345 order_received
9050 order_refunded
The next ID for example of 9403 would be 9363 and previous ID would be 9318
Change your order_status into an enum column. This will save disk space and make sorting by order_status simpler and faster.
-- Add a new version of the column using an enum.
-- These strings are aliases for ordered numbers.
-- 'order_accepted' is 1, 'processing_order' is 2, etc.
alter table transactions add column enum_order_status enum(
'order_accepted',
'processing_order',
'order_send_mailer',
'order_send',
'order_received',
'order_refunded'
) not null;
-- Copy the status into the new enum column.
-- MySQL will translate the string into the number for you.
update transactions
set enum_order_status = order_status;
-- Drop the old column.
alter table transactions drop column order_status;
-- Rename the new enum column.
alter table transactions rename column enum_order_status to order_status;
-- Index it.
create index transactions_order_status on transactions(order_status);
-- Enjoy your vastly simplified and much faster query.
select *
from transactions
order by order_status, id desc
That's not actually necessary, but it makes everything much simpler.
With that out of the way, use the window functions lead and lag to refer to the previous and next rows in a query.
select
id, order_status,
lead(id) over w, lead(order_status) over w,
lag(id) over w, lag(order_status) over w
from transactions
window w as (order by order_status, id desc);
Note, window functions were added in MySQL 8. If you're using an older version I recommend upgrading ASAP; MySQL 8 has many big improvements. Otherwise you can simulate it with correlated subqueries and self-joins.
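As a rough sketch of that fallback for older versions, the row that follows a given row in the sort order can be found with a self-join. This assumes order_status has already been converted to the enum above; adding 0 to an enum yields its numeric index, which keeps the comparison on the declared order rather than on the strings.

```sql
-- Row that comes right after id 9403 when sorting by order_status, id desc.
SELECT t2.id, t2.order_status
FROM transactions AS t1
JOIN transactions AS t2
  ON t2.order_status + 0 > t1.order_status + 0                  -- a later status
  OR (t2.order_status = t1.order_status AND t2.id < t1.id)      -- same status, lower id
WHERE t1.id = 9403
ORDER BY t2.order_status, t2.id DESC
LIMIT 1;
```

With the sample data from the question this should return 9318. For the row on the other side, flip both comparisons and both sort directions. Which of the two counts as "next" depends on the direction your buttons walk the list.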
If you want the previous and next rows of a specific row, use the technique from this answer. We add row_numbers to the table in the desired order, and then fetch 9403 and its previous and next row by row number.
-- Add a row number to your table in the desired order.
with ordered_transactions as (
select
*, row_number() over w as rn
from transactions
window w as (order by order_status, id desc)
)
select *
from ordered_transactions
-- Find the row number for ID 9403, then add -1, 0, and 1.
-- If 9403 is row number 5 you'll fetch row numbers 4, 5, and 6.
where rn in (
select rn+i
from ordered_transactions ot
-- All this is doing is making us three "rows" where i = -1, 0, and 1.
cross join (SELECT -1 AS i UNION ALL SELECT 0 UNION ALL SELECT 1) cj
where ot.id = 9403
);

Getting previous row in MySQL

I'm stuck on a MySQL problem that I have not been able to solve yet. I have the following query that gives me the month-year and the number of new users for each period on my platform:
select
u.period ,
u.count_new as new_users
from
(select DATE_FORMAT(u.registration_date,'%Y-%m') as period, count(distinct u.id) as count_new from users u group by DATE_FORMAT(u.registration_date,'%Y-%m')) u
order by period desc;
The result is the table:
period,new_users
2016-10,103699
2016-09,149001
2016-08,169841
2016-07,150672
2016-06,148920
2016-05,160206
2016-04,147715
2016-03,173394
2016-02,157743
2016-01,173013
So, for each month-year, I need to calculate the difference between that period and the previous month-year. I need a result table like this:
period,new_users
2016-10,calculate(103699 - 149001)
2016-09,calculate(149001- 169841)
2016-08,calculate(169841- 150672)
2016-07,So on...
2016-06,...
2016-05,...
2016-04,...
2016-03,...
2016-02,...
2016-01,...
Any ideas?
Thanks
You should be able to use a similar approach as I posted in another S/O question. You are on a good track to start. You have your inner query get the counts and have it ordered in the final direction you need. By using inline mysql variables, you can have a holding column of the previous record's value, then use that as computation base for the next result, then set the variable to the new balance to be used for each subsequent cycle.
The JOIN to the SqlVars alias does not have any "ON" condition as the SqlVars would only return a single row anyhow and would not result in any Cartesian product.
select
u.period,
if( @prevCount = -1, 0, u.count_new - @prevCount ) as new_users,
-- Assign from u.count_new; the new_users alias cannot be referenced in the same select list
@prevCount := u.count_new as HoldColumnForNextCycle
from
( select
DATE_FORMAT(u.registration_date,'%Y-%m') as period,
count(distinct u.id) as count_new
from
users u
group by
DATE_FORMAT(u.registration_date,'%Y-%m') ) u
JOIN ( select @prevCount := -1 ) as SqlVars
order by
u.period desc;
You may have to play with it a little, as there is no "starting" point in the counts, so the first entry in either sort direction may look strange. I start the @prevCount variable at -1, so the first record processed gets a new user count of 0 in the "new_users" column. Then, whatever the distinct new user count was for that record is assigned back to @prevCount as the basis for the next record processed. Yes, it is an extra column in the result set that can be ignored, but it is needed. Again, it is just a per-row placeholder, and you can see in the query result how it gets its value as each row is processed.
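If MySQL 8.0 or later is available, LAG() may be a simpler way to reach the previous period than a user variable. A sketch that reuses the users table and column names from the question; the alias names monthly and diff_from_previous_month are made up for illustration:

```sql
SELECT period,
       count_new,
       -- Current month minus the chronologically previous month.
       count_new - LAG(count_new) OVER (ORDER BY period) AS diff_from_previous_month
FROM (
    SELECT DATE_FORMAT(registration_date, '%Y-%m') AS period,
           COUNT(DISTINCT id) AS count_new
    FROM users
    GROUP BY DATE_FORMAT(registration_date, '%Y-%m')
) AS monthly
ORDER BY period DESC;
```

For 2016-10 in the sample data that works out to 103699 - 149001 = -45302. The earliest month has no previous row, so LAG() returns NULL there and the difference is NULL as well.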
I would create a temp table with two columns and then fill it using a cursor that
does something like this (don't remember the exact syntax - so this is just a pseudo-code):
@val = CURSOR.col2 - (select col2 from OriginalTable t2 where (t2.Period = (CURSOR.Period-1)))
INSERT tmpTable (Period, NewUsers) Values ( CURSOR.Period, @val)

Reorder a MYSQL table

I have a MySQL table with an 'Order' field, but when a record gets deleted a gap appears.
How can I update my 'Order' field so that it is sequential again?
If possible, in one query.
id.........order
1...........1
5...........2
4...........4
3...........6
5...........8
to
id.........order
1...........1
5...........2
4...........3
3...........4
5...........5
I could do this record by record:
run a SELECT ordered by Order and change the Order field row by row,
but to be honest I don't like it.
thanks
Extra info :
I also would like to change it this way :
id.........order
1...........1
5...........2
4...........3
3...........3.5
5...........4
to
id.........order
1...........1
5...........2
4...........3
3...........4
5...........5
In MySQL you can do this:
update t join
(select t.*, (@rn := @rn + 1) as rn
from t cross join
(select @rn := 0) const
order by t.`order`
) torder
on t.id = torder.id
set `order` = torder.rn;
In most databases, you can also do this with a correlated subquery. But this might be a problem in MySQL because it doesn't allow the table being updated to be referenced in a subquery:
update t
set `order` = (select count(*)
from t t2
where t2.`order` < t.`order` or
(t2.`order` = t.`order` and t2.id <= t.id)
);
There is no need to re-number or re-order. The table just gives you all your data. If you need it presented a certain way, that is the job of a query.
You don't even need to change the order value in the query either, just do:
SELECT * FROM MyTable WHERE mycolumn = 'MyCondition' ORDER BY `order`;
The above answer is excellent but it took me a while to grok it so I offer a slight rewrite which I hope brings clarity to others faster:
update
originalTable
join (select originalTable.ID,
(@newValue := @newValue + 10) as newValue
from originalTable
cross join (select @newValue := 0) newTable
order by originalTable.Sequence)
originalTable_reordered
on originalTable.ID = originalTable_reordered.ID
set originalTable.Sequence = originalTable_reordered.newValue;
Note that originalTable.* is NOT required - only the field used for the final join.
My example assumes the field to be updated is called Sequence (perhaps clearer in intent than order but mainly sidesteps the reserved keyword issue)
What took me a while to get was that "const" in the original answer was not a MySQL keyword. (I'm never a fan of abbreviations for that reason: they can be interpreted many ways, especially at the very times when it is best they not be misinterpreted. It makes for verbose code, I know, but clarity always trumps convenience in my book.)
Not quite sure what the select @newValue := 0 is for, but I think it is there because the variable has to be initialized before it can be used later on.
The value of this update is, of course, that it changes all the rows in question in a single statement rather than pulling the data and updating single rows one by one programmatically.
My next question, which should not be difficult to ascertain (though I've learned that SQL can be a tricky beast at the best of times), is whether this can be safely done on a subset of the data, where some originalTable.parentID is a set value.
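On MySQL 8.0 or later, ROW_NUMBER() can stand in for the user variable, and a WHERE clause inside the derived table might be one way to handle the subset question. A sketch reusing originalTable, ID, Sequence and parentID from above; the value 42 is only a placeholder:

```sql
UPDATE originalTable
JOIN (
    SELECT ID,
           -- Number the rows 10, 20, 30, ... within each parent, in Sequence order.
           ROW_NUMBER() OVER (PARTITION BY parentID ORDER BY Sequence, ID) * 10 AS newValue
    FROM originalTable
    WHERE parentID = 42          -- placeholder: restrict the renumbering to one parent
) AS reordered
  ON originalTable.ID = reordered.ID
SET originalTable.Sequence = reordered.newValue;
```

Rows with other parentID values never appear in the derived table, so the join leaves them untouched. Dropping the WHERE clause would renumber every parent's rows, each group starting again at 10.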

What is the best way to pick a random row from a table in MySQL? [duplicate]

What is a fast way to select a random row from a large mysql table?
I'm working in php, but I'm interested in any solution even if it's in another language.
Grab all the id's, pick a random one from it, and retrieve the full row.
If you know the id's are sequential without holes, you can just grab the max and calculate a random id.
If there are holes here and there but mostly sequential values, and you don't care about a slightly skewed randomness, grab the max value, calculate an id, and select the first row with an id equal to or above the one you calculated. The reason for the skewing is that id's following such holes will have a higher chance of being picked than ones that follow another id.
If you order by random, you're going to have a terrible table-scan on your hands, and the word quick doesn't apply to such a solution.
Don't do that, nor should you order by a GUID, it has the same problem.
I knew there had to be a way to do it in a single query in a fast way. And here it is:
A fast way without involvement of external code, kudos to
http://jan.kneschke.de/projects/mysql/order-by-rand/
SELECT name
FROM random AS r1 JOIN
(SELECT (RAND() *
(SELECT MAX(id)
FROM random)) AS id)
AS r2
WHERE r1.id >= r2.id
ORDER BY r1.id ASC
LIMIT 1;
MediaWiki uses an interesting trick (for Wikipedia's Special:Random feature): the table with the articles has an extra column with a random number (generated when the article is created). To get a random article, generate a random number and get the article with the next larger or smaller (don't recall which) value in the random number column. With an index, this can be very fast. (And MediaWiki is written in PHP and developed for MySQL.)
This approach can cause a problem if the resulting numbers are badly distributed; IIRC, this has been fixed on MediaWiki, so if you decide to do it this way you should take a look at the code to see how it's currently done (probably they periodically regenerate the random number column).
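A minimal sketch of the random-column idea, not MediaWiki's actual schema: the table name articles, the column rand_key and the index name are all assumptions made for illustration.

```sql
-- One-time setup: store a random value per row and index it.
ALTER TABLE articles ADD COLUMN rand_key DOUBLE NOT NULL DEFAULT 0;
UPDATE articles SET rand_key = RAND();
CREATE INDEX articles_rand_key ON articles (rand_key);

-- Per lookup: pick a threshold, then take the first row at or above it.
SET @r = RAND();
SELECT * FROM articles WHERE rand_key >= @r ORDER BY rand_key LIMIT 1;

-- If that returns nothing (threshold above every stored value), wrap around:
SELECT * FROM articles ORDER BY rand_key LIMIT 1;
```

New rows would also need rand_key set to RAND() when they are inserted. Rows that sit after a large gap in rand_key are picked more often, which is the distribution problem mentioned above.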
Here's a solution that runs fairly quickly, and it gets a better random distribution without depending on id values being contiguous or starting at 1.
-- FLOOR keeps the offset in 0..COUNT-1; ROUND could yield COUNT, which returns no row
SET @r := (SELECT FLOOR(RAND() * (SELECT COUNT(*) FROM mytable)));
SET @sql := CONCAT('SELECT * FROM mytable LIMIT ', @r, ', 1');
PREPARE stmt1 FROM @sql;
EXECUTE stmt1;
Maybe you could do something like:
SELECT * FROM table
WHERE id=
(FLOOR(RAND() *
(SELECT COUNT(*) FROM table)
)
);
This is assuming your ID numbers are all sequential with no gaps.
Add a column containing a calculated random value to each row, and use that in the ordering clause, limiting to one result upon selection. This works out faster than the table scan that ORDER BY RAND() causes.
Update: You still need to calculate some random value prior to issuing the SELECT statement upon retrieval, of course, e.g.
SELECT * FROM `foo` WHERE `foo_rand` >= {some random value} ORDER BY `foo_rand` LIMIT 1
There is another way to produce random rows using only a query and without order by rand().
It involves User Defined Variables.
See how to produce random rows from a table
To find random rows from a table, don't use ORDER BY RAND(), because it forces MySQL to do a full file sort and only then retrieve the number of rows the LIMIT asks for. To avoid this full file sort, use the RAND() function only in the WHERE clause. The query will stop as soon as it reaches the required number of rows.
See
http://www.rndblog.com/how-to-select-random-rows-in-mysql/
If you don't delete rows from this table, the most efficient way is:
(if you already know the minimum id, skip fetching it)
SELECT MIN(id) AS minId, MAX(id) AS maxId FROM table WHERE 1
$randId=mt_rand((int)$row['minId'], (int)$row['maxId']);
SELECT id,name,... FROM table WHERE id=$randId LIMIT 1
I see a lot of solutions here. One or two seem OK, but the others have some constraints. The following solution will work in all situations:
select a.* from random_data a, (select max(id)*rand() randid from random_data) b
where a.id >= b.randid limit 1;
Here, id doesn't need to be sequential. It can be any primary key, unique, or auto-increment column. Please see the following: Fastest way to select a random row from a big MySQL table
Thanks
Zillur
- www.techinfobest.com
For selecting multiple random rows from a given table (say 'words'), our team came up with this beauty:
SELECT * FROM
`words` AS r1 JOIN
(SELECT MAX(`WordID`) as wid_c FROM `words`) as tmp1
WHERE r1.WordID >= (SELECT (RAND() * tmp1.wid_c) AS id) LIMIT n
The classic "SELECT id FROM table ORDER BY RAND() LIMIT 1" is actually OK.
See the following excerpt from the MySQL manual:
If you use LIMIT row_count with ORDER BY, MySQL ends the sorting as soon as it has found the first row_count rows of the sorted result, rather than sorting the entire result.
With an ORDER BY you will do a full table scan.
It's best to do a select count(*) and then fetch a random row at an offset between 0 and the last row.
An easy but slow way would be (good for smallish tables)
SELECT * from TABLE order by RAND() LIMIT 1
In pseudo code:
sql "select id from table"
store result in list
n = random(size of list)
sql "select * from table where id=" + list[n]
This assumes that id is a unique (primary) key.
Take a look at this link by Jan Kneschke or this SO answer as they both discuss the same question. The SO answer goes over various options also and has some good suggestions depending on your needs. Jan goes over all the various options and the performance characteristics of each. He ends up with the following for the most optimized method by which to do this within a MySQL select:
SELECT name
FROM random AS r1 JOIN
(SELECT (RAND() *
(SELECT MAX(id)
FROM random)) AS id)
AS r2
WHERE r1.id >= r2.id
ORDER BY r1.id ASC
LIMIT 1;
HTH,
-Dipin
I'm a bit new to SQL but how about generating a random number in PHP and using
SELECT * FROM the_table WHERE primary_key >= $randNr
this doesn't solve the problem with holes in the table.
But here's a twist on lassevks suggestion:
SELECT primary_key FROM the_table
Use mysql_num_rows() in PHP to create a random number based on the above result:
SELECT * FROM the_table WHERE primary_key = rand_number
On a side note just how slow is SELECT * FROM the_table:
Creating a random number based on mysql_num_rows() and then moving the data pointer to that point mysql_data_seek(). Just how slow will this be on large tables with say a million rows?
I ran into the problem where my IDs were not sequential. What I came up with is this:
SELECT * FROM products WHERE RAND()<=(5/(SELECT COUNT(*) FROM products)) LIMIT 1
The rows returned are approximately 5, but I limit it to 1.
If you want to add another WHERE clause it becomes a bit more interesting. Say you want to search for products on discount.
SELECT * FROM products WHERE RAND()<=(100/(SELECT COUNT(*) FROM pt_products)) AND discount<.2 LIMIT 1
What you have to do is make sure you are returning enough results, which is why I have it set to 100. Having a WHERE discount<.2 clause in the subquery was 10x slower, so it's better to return more results and limit.
Use the below query to get the random row
SELECT user_firstname ,
COUNT(DISTINCT usr_fk_id) cnt
FROM userdetails
GROUP BY usr_fk_id
ORDER BY cnt ASC
LIMIT 1
In my case my table has an id as primary key, auto-increment with no gaps, so I can use COUNT(*) or MAX(id) to get the number of rows.
I made this script to test the fastest operation:
logTime();
query("SELECT COUNT(id) FROM tbl");
logTime();
query("SELECT MAX(id) FROM tbl");
logTime();
query("SELECT id FROM tbl ORDER BY id DESC LIMIT 1");
logTime();
The results are:
Count: 36.8418693542479 ms
Max: 0.241041183472 ms
Order: 0.216960906982 ms
Answer with the order method:
SELECT FLOOR(RAND() * (
SELECT id FROM tbl ORDER BY id DESC LIMIT 1
)) n FROM tbl LIMIT 1
...
SELECT * FROM tbl WHERE id = $result;
I have used this and the job was done
the reference from here
SELECT * FROM myTable WHERE RAND()<(SELECT ((30/COUNT(*))*10) FROM myTable) ORDER BY RAND() LIMIT 30;
Creating a function to do this is most likely the best and fastest answer here!
Pros - Works even with Gaps and extremely fast.
<?php
$sqlConnect = mysqli_connect('localhost','username','password','database');
function rando($data = '', $find = 0, $max = 0){
    global $sqlConnect; // mysqli connection created outside the function
    if($data == 's1'){
        // Fetch the single row at offset $find
        $find = (int)$find;
        $query = mysqli_query($sqlConnect, "SELECT * FROM `yourtable` ORDER BY `id` DESC LIMIT {$find},1");
        if($query && mysqli_num_rows($query) > 0){
            return mysqli_fetch_assoc($query);
        }else{
            return rando('', 0, $max); // Nothing at that offset: start over with a new random offset
        }
    }else{
        if($max == 0){
            // First call: use the highest id as the upper bound for the offset
            $query = mysqli_query($sqlConnect, "SELECT `id` FROM `yourtable` ORDER BY `id` DESC LIMIT 0,1");
            $fetched_data = mysqli_fetch_assoc($query);
            if(!$fetched_data){
                return null; // Empty table: nothing to pick
            }
            $max = (int)$fetched_data['id'];
        }
        $irand = rand(0, $max);
        return rando('s1', $irand, $max); // Try the random offset; returns the row if it exists
    }
}
$your_data = rando(); // Returns a random row as an associative array
?>
Please keep in mind that this code has not been tested, but it is a working concept for returning random entries even with gaps, as long as the gaps are not large enough to cause a load-time issue.
Quick and dirty method:
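SET @COUNTER = (SELECT COUNT(*) FROM your_table);
SET @OFFSET = FLOOR(RAND() * @COUNTER);
-- LIMIT/OFFSET only accept a variable through a prepared-statement placeholder
PREPARE stmt FROM 'SELECT PrimaryKey FROM your_table LIMIT 1 OFFSET ?';
EXECUTE stmt USING @OFFSET;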
The complexity of the first query is O(1) for MyISAM tables.
The second query requires a full table scan. Complexity = O(n)
Dirty and quick method:
Keep a separate table for this purpose only. You should also insert the same rows to this table whenever inserting to the original table. Assumption: No DELETEs.
CREATE TABLE Aux(
MyPK INT AUTO_INCREMENT PRIMARY KEY,
PrimaryKey INT
);
SET @MaxPK = (SELECT MAX(MyPK) FROM Aux);
SET @RandPK = FLOOR(1 + RAND() * @MaxPK);
SET @PrimaryKey = (SELECT PrimaryKey FROM Aux WHERE MyPK = @RandPK);
If DELETEs are allowed,
SET @delta = FLOOR(@RandPK / 10);
SET @PrimaryKey = (SELECT PrimaryKey
FROM Aux
WHERE MyPK BETWEEN @RandPK - @delta AND @RandPK + @delta
LIMIT 1);
The overall complexity is O(1).
SELECT DISTINCT * FROM yourTable WHERE 4 = 4 LIMIT 1;
