MySQL: Optimising ORDER BY primary key on a large table with LIMIT

Sorry for a cryptic title... My issue:
I have a MySQL query which in its most simplified form would look like this:
SELECT * FROM table
WHERE SOME_CONDITIONS
ORDER BY `id` DESC
LIMIT 50
Without the LIMIT clause the query would return around 50,000 rows, however I am only ever interested in the first 50 rows. My understanding is that because of the ORDER BY clause, MySQL has to create a temporary table, load all the results into it, sort the 50,000 results, and only then return the first 50.
When I compare the performance of this query against the same query without ORDER BY, I get a staggering difference: 1.8 seconds vs 0.02 seconds.
Given that id is an auto-incrementing primary key, I thought there should be an elegant workaround for my problem. Is there any?

Are the SOME_CONDITIONS such that you could give the query an ID range? At the very least, that would limit the number of rows being added into the temporary table before the sort.
For example:
SELECT * FROM table
WHERE SOME_CONDITIONS
AND id BETWEEN 1 AND 50;
Alternatively, if the SOME_CONDITIONS prevent you from making this sort of assumption about the range of IDs in your result, a nested query could supply the min and max IDs.
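Since the original query wants the newest rows (ORDER BY id DESC), one way to apply that range idea is to bound the scan to a recent window of IDs. A minimal sketch, where the table name table_name and the window size of 10000 are assumptions:
SELECT *
FROM table_name
WHERE SOME_CONDITIONS
AND id > (SELECT MAX(id) FROM table_name) - 10000
ORDER BY id DESC
LIMIT 50;
If fewer than 50 matching rows fall inside the window, widen the window and retry.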
If the performance is really that important, I would de-normalize the data by creating another table or cache of the first 50 results and keeping it updated separately.
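If you do go the denormalisation route, a minimal sketch of such a cache table (the names latest_50 and table_name are assumptions, and the refresh schedule is up to you):
CREATE TABLE latest_50 AS
SELECT * FROM table_name
WHERE SOME_CONDITIONS
ORDER BY id DESC
LIMIT 50;
-- refresh periodically, e.g. from a scheduled job or application code:
TRUNCATE TABLE latest_50;
INSERT INTO latest_50
SELECT * FROM table_name
WHERE SOME_CONDITIONS
ORDER BY id DESC
LIMIT 50;
Reads then hit the tiny latest_50 table instead of sorting 50,000 rows on every request.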

Related

Optimize LIMIT: the number of rows to be SELECTed in SQL

Consider a table Test having 1000 rows
Test table:
id    name  desc
1     Adi   test1
2     Sam   test2
3     Kal   test3
...
1000  Jil   test1000
If I need to fetch only, say, 100 rows (i.e. a small subset), I use the LIMIT clause in my query:
SELECT * FROM test LIMIT 100;
My assumption is that this query first fetches all 1000 rows and then returns 100 of them.
Can this be optimised, so that the DB engine queries only 100 rows and returns them (instead of fetching all 1000 rows first and then returning 100)?
The reason for the above supposition is that the logical order of processing is:
FROM
WHERE
SELECT
ORDER BY
LIMIT
You can combine LIMIT row_count with an ORDER BY. This causes MySQL to stop sorting as soon as it has found the first row_count rows of the sorted result.
Hope this helps. If you need any clarification, just drop a comment.
The query you wrote will fetch only 100 rows, not 1000. But, if you change that query in any way, my statement may be wrong.
GROUP BY and ORDER BY are likely to incur a sort, which is arguably even slower than a full table scan. And that sort must be done before seeing the LIMIT.
Well, not always...
SELECT ... FROM t ORDER BY x LIMIT 100;
together with INDEX(x) -- This may use the index and fetch only 100 rows from the index. BUT... then it has to reach into the data 100 times to find the other columns that you ask for. UNLESS you only ask for x.
Etc, etc.
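To make the covering-index case above concrete, a minimal sketch (the table t, columns x and y, and the index name are assumptions):
CREATE INDEX idx_x ON t (x);
SELECT x FROM t ORDER BY x LIMIT 100;    -- can be served entirely from the index
SELECT x, y FROM t ORDER BY x LIMIT 100; -- may need 100 extra lookups into the table rows for y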
And here's another wrinkle. A lot of questions on this forum are "Why isn't MySQL using my index?" Back to your query: if there are "only" 1000 rows in your table, my example with the ORDER BY x won't use the index, because it is faster to simply read through the table, tossing 90% of the rows. On the other hand, if there were 9999 rows, it would use the index. (The transition is somewhere around 20%, but that figure is imprecise.)
Confused? Fine. Let's discuss one query at a time. I can [probably] discuss the what and why of each one you throw at me. Be sure to include SHOW CREATE TABLE, the full query, and EXPLAIN SELECT... That way, I can explain what EXPLAIN tells you (or does not).
Did you know that having both a GROUP BY and ORDER BY may cause the use of two sorts? EXPLAIN won't point that out. And sometimes there is a simple trick to get rid of one of the sorts.
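As an illustration of the two-sort case, a hedged sketch (table and column names are assumptions) where grouping happens on one expression and ordering on another:
SELECT user_id, COUNT(*) AS cnt
FROM t
GROUP BY user_id    -- first sort (or temporary table) to form the groups
ORDER BY cnt DESC   -- second sort, on the aggregate
LIMIT 10;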
There are a lot of tricks up MySQL's sleeve.

Understanding MySQL LIMIT with a non-indexed column

I have this query, which is very simple, but I don't want to use an index here due to some constraints.
So my worry is how to avoid a huge load on the server when the WHERE clause filters on a non-indexed column.
The solution, I feel, will be LIMIT.
I am sure the data I need is within 1000 rows, so if I use LIMIT I can check the available values.
SELECT *
from tableA
where status='1' and student='$student_no'
order by id desc
limit 1000
Here the student column is not indexed, so my worry is that this will cause a huge load on the server.
I tried EXPLAIN and it seems to be OK, but the problem is that the table has few rows at the moment, and as you know MySQL behaves very differently with more data, like millions of rows.
So what are my options?
Should I add an index for student?
If I add an index then I don't need 1000 rows in the LIMIT; one row would be sufficient. But, as I said, the table is going to grow to several million rows, so an index would require a lot of space. That is why I was thinking of avoiding an index on the student column, hoping that a 1000-row query ordered descending would not cause load on the server, since id is indexed.
Any help will be great.
You say:
but I don't want to use an index here due to some constraints...
and also say:
how to avoid a huge load on the server...
If you don't use an index, you'll produce "huge load" on the server. If you want this query to be less resource intensive, you need to add an index. For the aforementioned query the ideal index is:
CREATE INDEX idx_student_status_id ON tableA (student, status, id);
This index should make your query very fast, even with millions of rows.
LIMIT 100 doesn't restrict the database to searching only the first 100 rows.
It just stops searching after 100 matches are found.
So, on its own, it is not a performance guarantee.
In the query below
SELECT *
from tableA
where status='1' and student='$student_no'
order by id desc
limit 1000
The query will run until it finds 1000 matches.
It does not search only the first 1000 rows of the table; it keeps scanning until it has 1000 matching rows (or runs out of rows).
So the behaviour of the above query is roughly:
int nb_rows_matched = 0;
/* scan rows until 1000 matches are found or the table is exhausted */
while (nb_rows_matched < 1000 && rows_remain()) {
    if (next_row_matches()) {
        nb_rows_matched++;
    }
}

MySQL slow query when results are fewer than the LIMIT

I have a table with 550,000 records.
SELECT * FROM logs WHERE user = 'user1' ORDER BY date DESC LIMIT 0, 25
This query takes 0.0171 sec; without the LIMIT, there are 3537 results.
SELECT * FROM logs WHERE user = 'user2' ORDER BY date DESC LIMIT 0, 25
This query takes 3.0868 sec; without the LIMIT, there are 13 results.
table keys are:
PRIMARY KEY (`id`),
KEY `date` (`date`)
when using "LIMIT 0,25" if there are less records than 25, the query slows down. How can I solve this problem?
Using LIMIT 25 allows the query to stop once it has found 25 rows.
If you have 3537 matching rows out of 550,000, then on average, assuming an equal distribution, it will have found 25 rows after examining 550,000/3537*25 = 3887 rows, whether it reads a list that is ordered by date (the index on date) or a list that is not ordered at all.
If you have only 13 matching rows out of 550,000, LIMIT 25 can never be satisfied, so the query has to examine all 550,000 rows (141 times as many rows as before), and we would expect roughly 0.0171 sec * 141 = 2.4 s. There are obviously other factors that determine runtime too, but the order of magnitude fits.
There is an additional effect. Unfortunately the index on date does not contain the value of user, so MySQL has to look that value up in the original table, jumping back and forth in that table (because the data itself is ordered by the primary key). This is slower than reading the unordered table directly.
So not using an index at all can actually be faster than using one, if you have a lot of rows to read. You can force MySQL not to use the index with e.g. FROM logs IGNORE INDEX (date), but then it has to read the whole table in absolutely every case: the last row could be the newest and would then have to be in the result set, because you ordered by date. So this might slow down your first query - reading all 550,000 rows fast can be slower than reading 3887 rows slowly by jumping back and forth. (MySQL doesn't know this beforehand either, so it made a choice - for your second query, obviously the wrong one.)
So how do you get faster results?
Have an index that is ordered by user. Then the query for 'user2' can stop after 13 rows, because it knows there are no more matches. And it will now be faster than the query for 'user1', which has to look through 3537 rows and then order them by date afterwards.
The best index for your query is therefore (user, date): the query then knows when to stop looking for further rows AND the list is already ordered the way you want it (and it would beat your 0.0171 s in all cases).
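In SQL that might look like the following (the index name is an assumption):
CREATE INDEX idx_user_date ON logs (`user`, `date`);
-- both queries can now read exactly the matching rows, already in date order:
SELECT * FROM logs WHERE user = 'user2' ORDER BY date DESC LIMIT 0, 25;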
Indexes require some resources too (e.g. disk space, and time to update the index whenever you update your table), so adding the perfect index for every single query can sometimes be counterproductive for the system as a whole.

Fastest way to SELECT through MySQL table records backwards from a certain row?

With a table over 18 million rows:
SELECT * FROM tbl WHERE id > 10000000 LIMIT 30
Took 0.0724 sec.
SELECT * FROM tbl WHERE id < 10000000 ORDER BY id DESC LIMIT 30
Took 0.0565 sec.
Is this the fastest way to SELECT a certain number of records before a certain row in MySQL?
It seems good enough, but doesn't MySQL have to first order those 10 million rows in descending order before SELECTing the 30 rows?
I'm asking because I'm not so sure about this query I came up with. It does seem to work, and fast enough, but looking at the grammatical semantics I'm not so sure.
Is MySQL intelligent enough to know that it doesn't have to order all those 10 million rows?
Or is there any better way to achieve this?
It won't have to order the rows at all if you have an index on the id column (or have it as the initial portion of the primary key).
It will simply use that index, giving already sorted data, and just grab the first thirty elements.
Yes, if it can't efficiently retrieve the rows in sorted order, it likely will have to get the lot and then sort. But that would be rather bad DBMS design.
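To keep paging backwards with the same pattern, feed the smallest id from the previous batch into the next query. A minimal sketch, where :last_seen_id is a placeholder for that value:
SELECT * FROM tbl
WHERE id < :last_seen_id
ORDER BY id DESC
LIMIT 30;
Each page then costs only one seek into the primary-key index plus 30 row reads, no matter how deep into the table you are.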

How to speed up a MySQL query when I have millions of records and have to fetch records in slabs of 50?

Say I've a table with 100,000 rows, and I want to fetch rows in slabs of 50 with a particular WHERE clause.
The standard way of doing this: SELECT * FROM table WHERE userid=5 LIMIT 50 OFFSET 90500;
This runs awfully slow.
Cause: all 100,000 rows are analyzed first, and the LIMIT is applied only at the last stage.
Any thoughts on how to speed this up? Anyone?
Putting an index on "userid" should really help.
Use a primary key in the ORDER BY, as galz says, e.g. ORDER BY id, assuming your table has an id column.
Alternatively, try using an indexed field (a generalization of 1).
You may also use partitions.
1 - Using ORDER BY you can improve, but not by much;
2 - Enabling the cache and then selecting from the cache may improve the query;
3 - Setting an index helps, but not by much either.
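Combining the suggestions above, a hedged sketch of the "seek" pagination pattern that avoids the large OFFSET entirely (the table name, index name, and the :last_seen_id placeholder are assumptions):
CREATE INDEX idx_userid_id ON mytable (userid, id);
-- first slab
SELECT * FROM mytable WHERE userid = 5 ORDER BY id LIMIT 50;
-- each later slab seeks past the last id already delivered,
-- instead of rescanning everything before the offset
SELECT * FROM mytable WHERE userid = 5 AND id > :last_seen_id ORDER BY id LIMIT 50;
Unlike LIMIT 50 OFFSET 90500, the seek form reads only 50 index entries per slab.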