Is there a query to select random row from a set of returned rows from a SELECT statement in sql? - mysql

Suppose a SELECT query returns 10 rows. Is there a one-line query, such as the one below (which I tried, but it did not work), that selects one random row from the results of a SELECT query?
select name from (select * from my_table where age > 10 ORDER BY age ASC AS rows)
ORDER BY RAND() LIMIT 1;

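One approach is to pick a random offset and fetch a single row at that offset with LIMIT offset, 1. Note that MySQL's LIMIT accepts only integer constants (or prepared-statement placeholders), not an expression such as RAND(), so the offset has to be computed beforehand.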
Another approach is to add a new column to the table, initialize it with RAND(), and then select from it, ordering by the random column with LIMIT 1. You may even be able to do this on the fly, by JOINing your original table with another table consisting of a single column that takes values from RAND().

Your logic is correct, you just have the syntax wrong.
select name from (
select *
FROM my_table
where age > 10) AS rows
ORDER BY RAND()
LIMIT 1;
The alias for the subquery goes outside the parentheses, right after the closing one. The inner ORDER BY age ASC can also be dropped, since the outer ORDER BY RAND() reorders the rows anyway.
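For anyone who wants to try the corrected query shape without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows are made up, SQLite's RANDOM() stands in for MySQL's RAND(), and the alias is t instead of rows because, as far as I know, ROWS is a reserved word in MySQL 8.0.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [("ann", 8), ("bob", 12), ("cat", 30), ("dan", 10), ("eve", 45)],  # made-up rows
)

# Same shape as the corrected query; RANDOM() is SQLite's counterpart of RAND().
query = """
    SELECT name
    FROM (SELECT * FROM my_table WHERE age > 10) AS t
    ORDER BY RANDOM()
    LIMIT 1
"""
picked = conn.execute(query).fetchone()[0]
print(picked)  # one of bob, cat, eve; varies between runs
```

Only rows that pass the inner WHERE can ever be picked, so ann and dan never show up.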

Related

Outer query is very slow when inner query returns no results

I'm trying to fetch a row from a table called export with random weights. It should then fetch one row from another table export_chunk which references the first row. This is the query:
SELECT * FROM export_chunk
WHERE export_id=(
SELECT id FROM export
WHERE schedulable=1
ORDER BY -LOG(1 - RAND())/export.weight LIMIT 1)
AND status='PENDING'
LIMIT 2;
The export table can have 1000 rows while the export_chunk table can have millions of rows.
The query is very fast when the inner query returns a row. However, if there are no rows with schedulable=1, the outer query performs a full table scan on export_chunk. Why does this happen and is there any way to prevent it?
EDIT: Trying COALESCE()
Akina suggested in the comments using COALESCE, i.e.:
SELECT * FROM export_chunk
WHERE export_id=COALESCE((
SELECT id FROM export
WHERE schedulable=1
ORDER BY -LOG(1 - RAND())/export.weight LIMIT 1)
,-1)
AND status='PENDING'
LIMIT 2;
This should work. When I run:
SELECT COALESCE((SELECT id FROM export WHERE schedulable=1 ORDER BY -LOG(1-RAND())/export.weight LIMIT 1), -1) FROM export;
It does return -1 for each row, as Akina predicted. And if I manually search for -1 instead of the inner query, it returns no rows very quickly. However, when I use COALESCE around the inner query, it is still really slow. I do not understand why.
Test this:
SELECT export_chunk.*
FROM export_chunk
JOIN ( SELECT id
FROM export
WHERE schedulable=1
ORDER BY -LOG(1 - RAND())/export.weight
LIMIT 1 ) AS random_row ON export_chunk.export_id=random_row.id
WHERE export_chunk.status='PENDING'
LIMIT 2;
Does this match the logic you need? In particular, when the subquery returns no matching rows, do you want no output rows (as now) or any 2 rows?
PS. LIMIT without ORDER BY in the outer query is strange, because which rows come back is then not defined.
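To check what the JOIN version returns in both cases, here is a small sketch in Python with the built-in sqlite3 module. It only illustrates the result semantics, not how MySQL's optimizer plans the query or how fast it runs. The sample data is invented, and RAND() and LOG() are registered as Python functions so the query text can stay as written.

```python
import math
import random
import sqlite3

conn = sqlite3.connect(":memory:")
# Register MySQL-style RAND() and natural-log LOG() so the query runs unchanged.
conn.create_function("RAND", 0, random.random)
conn.create_function("LOG", 1, math.log)

# Invented sample data: exports 1 and 2 are schedulable, export 3 is not.
conn.executescript("""
    CREATE TABLE export (id INTEGER PRIMARY KEY, schedulable INTEGER, weight REAL);
    CREATE TABLE export_chunk (id INTEGER PRIMARY KEY, export_id INTEGER, status TEXT);
    INSERT INTO export VALUES (1, 1, 1.0), (2, 1, 3.0), (3, 0, 5.0);
    INSERT INTO export_chunk (export_id, status) VALUES
        (1, 'PENDING'), (1, 'DONE'),
        (2, 'PENDING'), (2, 'PENDING'), (2, 'PENDING'),
        (3, 'PENDING');
""")

query = """
    SELECT export_chunk.*
    FROM export_chunk
    JOIN ( SELECT id
           FROM export
           WHERE schedulable=1
           ORDER BY -LOG(1 - RAND())/export.weight
           LIMIT 1 ) AS random_row ON export_chunk.export_id=random_row.id
    WHERE export_chunk.status='PENDING'
    LIMIT 2
"""

rows = conn.execute(query).fetchall()
print(rows)  # up to 2 PENDING chunks of one randomly chosen schedulable export

# With no schedulable export, the derived table is empty and so is the join.
conn.execute("UPDATE export SET schedulable = 0")
rows_when_empty = conn.execute(query).fetchall()
print(rows_when_empty)  # []
```

An inner join against an empty derived table yields no rows, which matches the behaviour of the original query when nothing is schedulable.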

Why does SQL LIMIT clause return random rows for every query?

It is a very simple query. For every query, I get a different result. Similar things happen when I used TOP 1. I would like a random sub-sample and it works. But am I missing something? Why does it return a different value every time?
SELECT DISTINCT user_id FROM table1
where day_id>="2009-01-09" and day_id<"2011-02-16"
LIMIT 1;
There's no guarantee that you will get a random result with your query. It's quite likely you'll get the same result each time (although the actual result returned will be indeterminate). To guarantee that you get a random, unique user_id, you should SELECT a random value from the list of DISTINCT values:
SELECT user_id
FROM (SELECT DISTINCT user_id
FROM table1
WHERE day_id >= "2009-01-09" AND day_id < "2011-02-16"
) u
ORDER BY RAND()
LIMIT 1
SQL result sets are unordered unless you ask for an order. To get the same row every time, add an ORDER BY clause such as:
...
ORDER BY user_id
LIMIT 1
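The two options above (a deterministic pick with ORDER BY, and a random pick from the DISTINCT list) can be compared side by side with a rough sketch like this one. It uses Python's built-in sqlite3 module with invented rows, and SQLite's RANDOM() in place of RAND().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (user_id INTEGER, day_id TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [  # invented rows; users 3, 4 and 7 fall inside the date range
        (7, "2009-03-01"), (3, "2010-06-15"), (7, "2010-01-01"),
        (5, "2011-02-16"), (9, "2008-12-31"), (4, "2011-01-20"),
    ],
)

where = "day_id >= '2009-01-09' AND day_id < '2011-02-16'"

# Deterministic: the ORDER BY fixes which row LIMIT 1 keeps.
deterministic = (
    f"SELECT DISTINCT user_id FROM table1 WHERE {where} ORDER BY user_id LIMIT 1"
)
# Random: shuffle the distinct ids, then keep one (RANDOM() replaces RAND()).
random_pick = (
    f"SELECT user_id FROM (SELECT DISTINCT user_id FROM table1 WHERE {where}) u "
    "ORDER BY RANDOM() LIMIT 1"
)

first = [conn.execute(deterministic).fetchone()[0] for _ in range(5)]
sampled = conn.execute(random_pick).fetchone()[0]
print(first)    # the same id on every run of the deterministic query
print(sampled)  # one of the ids in range; varies between runs
```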

How to get the number of rows returned from a query?

Is there a way that I can get the COUNT(*) of what a query will return? For example:
SELECT * FROM table LIMIT 10; -- Query
SELECT COUNT(*) FROM table LIMIT 10; -- Query count
This would actually ignore the limit (see "MySQL COUNT with LIMIT"). While this might be fine and 'correct' within SQL, I need the exact number of rows the query is returning. How would this be done?
If you actually want to take the LIMIT into account, you can use:
select count(*) from (SELECT * FROM table LIMIT 10) as t
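A quick way to see the difference is the sketch below, which uses Python's built-in sqlite3 module as a stand-in for MySQL. The table name items and its 25 rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY)")  # hypothetical table
conn.executemany("INSERT INTO items (id) VALUES (?)", [(i,) for i in range(1, 26)])

# LIMIT applies to the single row produced by COUNT(*), so it changes nothing.
total = conn.execute("SELECT COUNT(*) FROM items LIMIT 10").fetchone()[0]

# Counting over the limited subquery gives the number of rows actually returned.
limited = conn.execute(
    "SELECT COUNT(*) FROM (SELECT * FROM items LIMIT 10) AS t"
).fetchone()[0]

# When the table has fewer rows than the limit, the count is the table size.
capped = conn.execute(
    "SELECT COUNT(*) FROM (SELECT * FROM items LIMIT 40) AS t"
).fetchone()[0]

print(total, limited, capped)
```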

SELECT SUM(CRC32(column)) LIMIT doesn't work

I need to get the checksum of a specific column and specific number of rows using LIMIT.
SELECT SUM(CRC32(column)) FROM table LIMIT 0, 100;
But, the value returned is the checksum of the entire table. Why does that happen? How can I use LIMIT to get the checksum of only a specific number of rows?
The following works:
SELECT SUM(CRC32(column)) FROM table WHERE id > 0 AND id < 101;
But I don't want to use this method because of potential jumps in the auto_increment value.
Why does that happen?
LIMIT gets applied after aggregate functions.
When used without GROUP BY, SUM aggregates over the whole table, leaving you with only one record, which falls under the LIMIT.
Use this:
SELECT SUM(CRC32(column))
FROM (
SELECT column
FROM mytable
ORDER BY
id
LIMIT 100
) q
You can do
SELECT SUM(CRC32(column)) FROM
(SELECT column FROM table LIMIT 0, 100) AS tmp;
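Here is a small sketch that shows both behaviours, written with Python's built-in sqlite3 module. SQLite has no CRC32(), so zlib.crc32 is registered under that name as a stand-in. The table contents are invented, the ids deliberately have gaps, and the column is called col because column is a reserved word.

```python
import sqlite3
import zlib

conn = sqlite3.connect(":memory:")
# Stand-in for MySQL's CRC32(); zlib.crc32 returns an unsigned 32-bit value.
conn.create_function("CRC32", 1, lambda s: zlib.crc32(s.encode("utf-8")))

conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, col TEXT)")
ids = list(range(1, 601, 3))  # 200 ids with gaps, like a sparse auto_increment
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)", [(i, f"value-{i}") for i in ids]
)

# LIMIT is applied after SUM has collapsed the table into one row.
whole = conn.execute("SELECT SUM(CRC32(col)) FROM mytable LIMIT 100").fetchone()[0]

# Limiting inside the subquery restricts the rows before they are summed.
first_100 = conn.execute(
    "SELECT SUM(CRC32(col)) FROM (SELECT col FROM mytable ORDER BY id LIMIT 100) q"
).fetchone()[0]

expected_all = sum(zlib.crc32(f"value-{i}".encode("utf-8")) for i in ids)
expected_100 = sum(zlib.crc32(f"value-{i}".encode("utf-8")) for i in ids[:100])
print(whole == expected_all, first_100 == expected_100)
```

The ORDER BY id inside the subquery matters: without it, which 100 rows get summed is not defined.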

Select most common value from a field in MySQL

I have a table with a million rows. How do I select the most common value (the value that appears most often in the table) from a field?
You need to group by the interesting column and for each value, select the value itself and the number of rows in which it appears.
Then it's a matter of sorting (to put the most common value first) and limiting the results to only one row.
In query form:
SELECT column, COUNT(*) AS magnitude
FROM table
GROUP BY column
ORDER BY magnitude DESC
LIMIT 1
This thread should shed some light on your issue.
Basically, use COUNT() with a GROUP BY clause:
SELECT foo, COUNT(foo) AS fooCount
FROM table
GROUP BY foo
ORDER BY fooCount DESC
And to get only the first result (most common), add
LIMIT 1
to the end of your query.
In case you don't need to return the frequency of the most common value, you could use:
SELECT foo
FROM table
GROUP BY foo
ORDER BY COUNT(foo) DESC
LIMIT 1
This has the additional benefit of only returning one column and therefore working in subqueries.
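As a rough illustration of the queries above, here is a sketch using Python's built-in sqlite3 module. The table name items and its six rows are made up, and the last query shows the single-column form used as a scalar subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (foo TEXT)")  # hypothetical table
conn.executemany(
    "INSERT INTO items VALUES (?)", [(v,) for v in ["a", "b", "b", "c", "b", "a"]]
)

# Value together with its frequency.
value, magnitude = conn.execute(
    "SELECT foo, COUNT(*) AS magnitude FROM items "
    "GROUP BY foo ORDER BY magnitude DESC LIMIT 1"
).fetchone()

# Value only, which makes the query usable as a scalar subquery.
most_common = "SELECT foo FROM items GROUP BY foo ORDER BY COUNT(foo) DESC LIMIT 1"
only_value = conn.execute(most_common).fetchone()[0]

# Example use as a subquery: count the rows holding the most common value.
matching = conn.execute(
    f"SELECT COUNT(*) FROM items WHERE foo = ({most_common})"
).fetchone()[0]

print(value, magnitude, only_value, matching)
```

If two values are tied for first place, which one is returned is not defined unless you add a tie-breaker to the ORDER BY.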