SELECT *
FROM users
ORDER BY highscore DESC
LIMIT 5 OFFSET 5
and
SELECT *
FROM users
ORDER BY highscore DESC
LIMIT 5 OFFSET 10
returning the same result. And they are different than the records when I omit the LIMIT clause! I searched the community. There are similar questions, but they are of no help.
EDIT: Here are the table data-
Presumably, the issue is that you have ties for highscore. When you have ties, then MySQL orders the rows with the same value in an arbitrary and indeterminate way. Even two runs of the same query can result in different orderings.
Why? The reason is simple. There is no "natural" order for sorting keys with the same value. SQL tables represent unordered sets.
To make the sorting stable, include a unique id as the last key in the ORDER BY:
SELECT u.*
FROM users u
ORDER BY u.highscore DESC, u.userId
LIMIT 5 OFFSET 5;
Then, when you fetch the next 5 rows, they will be different.
Related
It is a very simple query. For every query, I get a different result. Similar things happen when I used TOP 1. I would like a random sub-sample and it works. But am I missing something? Why does it return a different value every time?
SELECT DISTINCT user_id FROM table1
where day_id>="2009-01-09" and day_id<"2011-02-16"
LIMIT 1;
There's no guarantee that you will get a random result with your query. It's quite likely you'll get the same result each time (although the actual result returned will be indeterminate). To guarantee that you get a random, unique user_id, you should SELECT a random value from the list of DISTINCT values:
SELECT user_id
FROM (SELECT DISTINCT user_id
FROM table1
WHERE day_id >= "2009-01-09" AND day_id < "2011-02-16"
) u
ORDER BY RAND()
LIMIT 1
SQL statements represent unordered sets, add order by clause such as
...
ORDER BY user_id
LIMIT 1
I need to select say 2000000 records at random from a very large database. I looked at previous questions. So please do not mark this question as duplicate. I need clarification. Most answers suggest using ORDER BY RAND() function. So my query will be:
SELECT DISTINCT no
FROM table
WHERE name != "null"
ORDER BY RAND()
LIMIT 2000000;
I want each record to be selected at random. I am not sure if I understand the ORDER BY RAND() effect here. But I am afraid it will select a random record, say 3498 and will continue selection from there, say, the next records will be: 3499, 3500, 3501, etc.
I want each recor to be random, not to start the order from a random record.
How can I select 2000000 random record where each record is selected at random? Can you simplify what exactly ORDER BY RAND() does?
Note that I use Google BigQuery so the performance issue should not be a big problem here. I just want to achieve the requirement of selecting random 2000000 records.
SELECT x
FROM T
ORDER BY RAND()
is equivalent to
SELECT x
FROM (
SELECT x, RAND() AS r
FROM T
)
ORDER BY r
The query generates a random value for each row, then uses that random value to order the rows. If you include a limit:
SELECT x
FROM T
ORDER BY RAND()
LIMIT 10
This randomly selects 10 rows from the table.
I'm creating a list for best movies which are based on the users votes like imdb. I have done the list with this sql query:
SELECT data_id, COUNT(point), SUM(point)
FROM voting_table
WHERE data_type='1'
GROUP BY data_id
order by SUM(point)/COUNT(point)
DESC limit 100
This works well but i want also the number of votes(point) affect the order. I mean if a movie gets average of 8 with 10 votes and another movie gets average of 8 but with 5 votes. The one which has more votes should be listed higher than the other one. How can i do it? And is the query i wrote is efficent for server performance?
There is function AGV, I suggest you use that.
sort by avg, then by count or sum
...
ORDER BY AVG(point) DESC, COUNT(point) DESC
...
As of performance, there is not much you can do wihout complicating data structure.
It should be fine as it is unless your site si going to be as popular as imdb.
If your voting table grows past the point where speedup is needed then you need to start precalculating stuff (for real time updates using triggers that update score in movies table or some other intermediate table dedicated for that, other methods)
Just add the second order you want separated by comma .try this
SELECT data_id, COUNT(point), SUM(point)
FROM voting_table
WHERE data_type='1'
GROUP BY data_id
order by SUM(point)/COUNT(point) ,COUNT(point)
DESC limit 100
Try changing your order by clause to be:
order by SUM(point)/COUNT(point) desc, COUNT(point) desc
As it stands, your query appears to be efficient.
I was under the impression that using an ORDER BY in an SQL query would not affect which records were selected for the result set. I thought that ORDER BY would only affect the presentation of the result set.
Recently, however, I was getting unexpected results from a query until I used an ORDER BY clause. This suggests that either a) ORDER BY can affect which records are included in the result set, or b) I have some other bug which I need to work on.
Which is it?
Here's the query: SELECT node_id FROM users ORDER BY node_id LIMIT 100
(node_id is both a primary key and foreign key).
As you can see, the query includes a LIMIT clause. It seems that if I use the ORDER BY, the records are ordered before the top 100 are selected. I had expected it to select 100 records based on natural order, then order them according to node_id.
I've looked for info on ORDER BY but as yet, the only info I can find suggests that it affects presentation only... I am using MySQL.
ORDER BY reflects the order of all of the records before the LIMIT Clause. To get the result you want you will need this:
select u.node_id
from users u
join
(
SELECT node_id
FROM users
LIMIT 100
) us ON u.node_id = us.node_id
ORDER BY u.node_id
This way you will use the limit clause first and get the top 100 records and then you will sort the result of that. The join clause is faster than a double Select statement especially if you are working with many records.
You can use a nested query:
SELECT node_id FROM
(
SELECT node_id FROM users LIMIT 100
) u
ORDER BY node_id
I have a query that looks like this:
SELECT article FROM table1 ORDER BY publish_date LIMIT 20
How does ORDER BY work? Will it order all records, then get the first 20, or will it get 20 records and order them by the publish_date field?
If it's the last one, you're not guaranteed to really get the most recent 20 articles.
It will order first, then get the first 20. A database will also process anything in the WHERE clause before ORDER BY.
The LIMIT clause can be used to constrain the number of rows returned by the SELECT statement. LIMIT takes one or two numeric arguments, which must both be nonnegative integer constants (except when using prepared statements).
With two arguments, the first argument specifies the offset of the first row to return, and the second specifies the maximum number of rows to return. The offset of the initial row is 0 (not 1):
SELECT * FROM tbl LIMIT 5,10; # Retrieve rows 6-15
To retrieve all rows from a certain offset up to the end of the result set, you can use some large number for the second parameter. This statement retrieves all rows from the 96th row to the last:
SELECT * FROM tbl LIMIT 95,18446744073709551615;
With one argument, the value specifies the number of rows to return from the beginning of the result set:
SELECT * FROM tbl LIMIT 5; # Retrieve first 5 rows
In other words, LIMIT row_count is equivalent to LIMIT 0, row_count.
All details on: http://dev.mysql.com/doc/refman/5.0/en/select.html
Just as #James says, it will order all records, then get the first 20 rows.
As it is so, you are guaranteed to get the 20 first published articles, the newer ones will not be shown.
In your situation, I recommend that you add desc to order by publish_date, if you want the newest articles, then the newest article will be first.
If you need to keep the result in ascending order, and still only want the 10 newest articles you can ask mysql to sort your result two times.
This query below will sort the result descending and limit the result to 10 (that is the query inside the parenthesis). It will still be sorted in descending order, and we are not satisfied with that, so we ask mysql to sort it one more time. Now we have the newest result on the last row.
select t.article
from
(select article, publish_date
from table1
order by publish_date desc limit 10) t
order by t.publish_date asc;
If you need all columns, it is done this way:
select t.*
from
(select *
from table1
order by publish_date desc limit 10) t
order by t.publish_date asc;
I use this technique when I manually write queries to examine the database for various things. I have not used it in a production environment, but now when I bench marked it, the extra sorting does not impact the performance.
You could add [asc] or [desc] at the end of the order by to get the earliest or latest records
For example, this will give you the latest records first
ORDER BY stamp DESC
Append the LIMIT clause after ORDER BY
If there is a suitable index, in this case on the publish_date field, then MySQL need not scan the whole index to get the 20 records requested - the 20 records will be found at the start of the index. But if there is no suitable index, then a full scan of the table will be needed.
There is a MySQL Performance Blog article from 2009 on this.
You can use this code
SELECT article FROM table1 ORDER BY publish_date LIMIT 0,10
where 0 is a start limit of record & 10 number of record
LIMIT is usually applied as the last operation, so the result will first be sorted and then limited to 20. In fact, sorting will stop as soon as first 20 sorted results are found.
Could be simplified to this:
SELECT article FROM table1 ORDER BY publish_date DESC FETCH FIRST 20 ROWS ONLY;
You could also add many argument in the ORDER BY that is just comma separated like: ORDER BY publish_date, tab2, tab3 DESC etc...