I have a question about MySQL's LIMIT.
Let's say I have a table with 100 rows.
After running a query (SELECT, WHERE, and so on),
I limit the result size with LIMIT 10.
In this case, does MySQL retrieve all 100 rows first and then cut the result down to 10, or does it count rows as it goes and stop retrieving once it has 10?
Let's think about this logically, and maybe the answer will become evident. Imagine you are using the following query:
SELECT someCol
FROM yourTable
ORDER BY someCol
LIMIT 10
It should be intuitive that MySQL has to know the ordinal position of every record in the result set in order to be able to guarantee that the 10 records returned are in fact the first 10 records of what the entire result set would be.
If MySQL were to just take the first 10 records which it hit during the scan, then in general it could not guarantee that the records returned respect the ordering you specified.
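As a rough illustration of the point above, here is a minimal sketch in Python. It uses the standard-library sqlite3 module as a stand-in for MySQL (an assumption made only so the example runs on its own), and the data is invented. It compares the first 10 rows in storage order with what ORDER BY ... LIMIT 10 has to return.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the table and column names come
# from the query above, the data is made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (someCol INTEGER)")

# Values 100 down to 1, so the smallest values are inserted last.
stored_order = list(range(100, 0, -1))
conn.executemany("INSERT INTO yourTable (someCol) VALUES (?)",
                 [(v,) for v in stored_order])

# What a scan would produce if it simply stopped after the first 10 rows it met.
first_ten_encountered = stored_order[:10]

# What ORDER BY ... LIMIT 10 must return.
ordered_top_ten = [row[0] for row in conn.execute(
    "SELECT someCol FROM yourTable ORDER BY someCol LIMIT 10")]

print(first_ten_encountered)
print(ordered_top_ten)
```

The two lists do not overlap: the first holds 100 down to 91, the second holds 1 to 10, which sit at the very end of the stored data.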
If I have a mysql limited query:
SELECT * FROM my_table WHERE date > '2020-12-12' LIMIT 1,16;
Is there a faster way to check and see how many results are left after my limit?
I was trying to do a count with limit, but that wasn't working, i.e.
SELECT count(ID) AS count FROM my_table WHERE date > '2020-12-12' LIMIT 16,32;
The ultimate goal here is just to determine if there ARE any other rows to be had beyond the current result set, so if there is another faster way to do this that would be fine too.
It's best to do this by counting the rows:
SELECT count(*) AS count FROM my_table WHERE date > '2020-12-12'
That tells you how many total rows match the condition. Then you can compare that to the size of the result you got with your query using LIMIT. It's just arithmetic.
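The arithmetic could look roughly like this. It is a sketch in Python with the standard-library sqlite3 module standing in for MySQL. The helper name rows_left, the ORDER BY ID and the sample data are assumptions for illustration, not part of the original question.

```python
import sqlite3

def rows_left(conn, offset, page_size):
    """Return how many matching rows remain after the page starting at offset."""
    # Query 1: total number of rows matching the condition.
    total = conn.execute(
        "SELECT COUNT(*) FROM my_table WHERE date > '2020-12-12'"
    ).fetchone()[0]
    # Query 2: the limited page itself.
    page = conn.execute(
        "SELECT * FROM my_table WHERE date > '2020-12-12' "
        "ORDER BY ID LIMIT ? OFFSET ?",
        (page_size, offset),
    ).fetchall()
    # Remaining = total minus everything up to the end of this page.
    return max(total - (offset + len(page)), 0)

# Invented data: 40 rows that match the date condition, 5 that do not.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (ID INTEGER PRIMARY KEY, date TEXT)")
conn.executemany("INSERT INTO my_table (date) VALUES (?)",
                 [("2021-01-01",)] * 40 + [("2020-01-01",)] * 5)

print(rows_left(conn, 0, 16))
print(rows_left(conn, 32, 16))
```

With this data the first call reports 24 rows left and the second reports 0, because the last page holds only 8 rows.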
MySQL has a function FOUND_ROWS(), used together with the SQL_CALC_FOUND_ROWS query modifier, which reports how many rows would have matched if you didn't use LIMIT. But it turned out this had worse performance than running two queries, one to count rows and one to do your limit. So the feature was deprecated (as of MySQL 8.0.17).
For details read:
https://www.percona.com/blog/2007/08/28/to-sql_calc_found_rows-or-not-to-sql_calc_found_rows/
https://dev.mysql.com/worklog/task/?id=12615
(You probably want OFFSET 0, not 1.)
It's simple to test whether there ARE more rows. Assuming you want 16 rows, use 1 more:
SELECT ... WHERE ... ORDER BY ... LIMIT 0,17
Then programmatically see whether it returned only 16 rows (no more available) or 17 (there ARE more).
Because it is piggybacking on the fetch you are already doing and not doing much extra work, it is very efficient.
The second 'page' would use LIMIT 16, 17; 3rd: LIMIT 32,17, etc. Each time, you are potentially getting and tossing an extra row.
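A minimal sketch of this "fetch one extra row" check, in Python with the standard-library sqlite3 module standing in for MySQL. The function name fetch_page, the zero-based page numbering and the 40-row sample table are assumptions made for illustration.

```python
import sqlite3

PAGE_SIZE = 16

def fetch_page(conn, page_number):
    """Return (rows, has_more) for a zero-based page number."""
    offset = page_number * PAGE_SIZE
    # Ask for one row more than the page needs: LIMIT offset, PAGE_SIZE + 1.
    rows = conn.execute(
        "SELECT id FROM my_table ORDER BY id LIMIT ?, ?",
        (offset, PAGE_SIZE + 1),
    ).fetchall()
    # If the extra row came back, there is at least one more row after this page.
    has_more = len(rows) > PAGE_SIZE
    # Toss the extra row before handing the page to the caller.
    return rows[:PAGE_SIZE], has_more

# Invented data: 40 rows, i.e. two full pages and one partial page.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO my_table (id) VALUES (?)",
                 [(i,) for i in range(1, 41)])

for page_number in range(3):
    rows, has_more = fetch_page(conn, page_number)
    print(page_number, len(rows), has_more)
```

Pages 0 and 1 come back with 16 rows and has_more set; page 2 comes back with 8 rows and has_more unset.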
I discuss this and other tricks where I point out the evils of OFFSET: Pagination
COUNT(x) checks x for being NOT NULL. This is [usually] unnecessary. The pattern COUNT(*) (or COUNT(1)) simply counts rows; the * or 1 has no significance.
SELECT COUNT(*) FROM t is not free. It will actually do a full index scan, which is slow for a large table. WHERE and ORDER BY are likely to add to that slowness. LIMIT is useless since the result is always 1 row. (That is, the LIMIT is applied to the result, not to the counting.)
Consider a table Test having 1000 rows
Test Table
id     name   desc
1      Adi    test1
2      Sam    test2
3      Kal    test3
.
.
1000   Jil    test1000
If I need to fetch only a small subset, say 100 rows, I use a LIMIT clause in my query:
SELECT * FROM test LIMIT 100;
I assume this query first fetches all 1000 rows and then returns 100 of them.
Can this be optimised so that the DB engine reads only 100 rows and returns them
(instead of fetching all 1000 rows first and then returning 100)?
The reason for this supposition is that the order of processing is:
FROM
WHERE
SELECT
ORDER BY
LIMIT
You can combine LIMIT row_count with ORDER BY. This causes MySQL to stop sorting as soon as it has found the first row_count rows of the sorted result.
Hope this helps. If you need any clarification, just drop a comment.
The query you wrote will fetch only 100 rows, not 1000. But, if you change that query in any way, my statement may be wrong.
GROUP BY and ORDER BY are likely to incur a sort, which is arguably even slower than a full table scan. And that sort must be done before seeing the LIMIT.
Well, not always...
SELECT ... FROM t ORDER BY x LIMIT 100;
together with INDEX(x) -- This may use the index and fetch only 100 rows from the index. BUT... then it has to reach into the data 100 times to find the other columns that you ask for. UNLESS you only ask for x.
Etc, etc.
And here's another wrinkle. A lot of questions on this forum are "Why isn't MySQL using my index?" Back to your query. If there are "only" 1000 rows in your table, my example with the ORDER BY x won't use the index because it is faster to simply read through the table, tossing 90% of the rows. On the other hand, if there were 9999 rows, then it would use the index. (The transition is somewhere around 20%, but that is imprecise.)
Confused? Fine. Let's discuss one query at a time. I can [probably] discuss the what and why of each one you throw at me. Be sure to include SHOW CREATE TABLE, the full query, and EXPLAIN SELECT... That way, I can explain what EXPLAIN tells you (or does not).
Did you know that having both a GROUP BY and ORDER BY may cause the use of two sorts? EXPLAIN won't point that out. And sometimes there is a simple trick to get rid of one of the sorts.
There are a lot of tricks up MySQL's sleeve.
Can someone explain exactly how the combination GROUP BY + HAVING + LIMIT works? MySQL query:
SELECT
    id,
    avg(sal)
FROM
    StreamData
WHERE
    ...
GROUP BY
    id
HAVING
    avg(sal) >= 10.0
    AND avg(sal) <= 50.0
LIMIT 100
Without the LIMIT and HAVING clauses the query runs for about 7 seconds. With LIMIT it returns instantly if the condition covers a large amount of data, and takes ~7 seconds otherwise.
The documentation says that LIMIT is applied after HAVING, which is applied after GROUP BY, which would mean the query should always run for ~7 seconds. Please help me figure out what exactly the LIMIT clause limits.
Using LIMIT 100 simply tells MySQL to return only the first 100 records from your result set. Assuming that you are measuring the query time as the round trip from Java, then one component of the query time is the network time needed to move the result set from MySQL across the network. This can take a considerable time for a large result set, and using LIMIT 100 should reduce this time to zero or near zero.
Things are logically applied in a certain pipeline in SQL:
Table expressions are generated and executed (FROM, JOIN)
Rows filtered (WHERE)
Projections and aggregations applied (column list, aggregates, GROUP BY)
Aggregations filtered (HAVING)
Results limited (LIMIT, OFFSET)
Now these may be composed into a different execution order by the planner if that is safe but you always get the proper data out if you think through them in this order.
So group by groups, then these are filtered with having, then the results of that are truncated.
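The logical order (group, then filter the groups, then truncate) can be sketched on a tiny made-up data set. This uses Python with the standard-library sqlite3 module standing in for MySQL. The sal values are invented, and the ORDER BY id is an addition so that the truncated result is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE StreamData (id INTEGER, sal REAL)")
# Invented data; per-id averages are 5, 20, 100, 45 and 15.
conn.executemany("INSERT INTO StreamData (id, sal) VALUES (?, ?)", [
    (1, 5.0), (1, 5.0),
    (2, 10.0), (2, 30.0),
    (3, 100.0),
    (4, 40.0), (4, 50.0),
    (5, 15.0),
])

# GROUP BY builds five groups, HAVING keeps ids 2, 4 and 5,
# and LIMIT 2 then truncates that filtered list.
result = conn.execute("""
    SELECT id, avg(sal)
    FROM StreamData
    GROUP BY id
    HAVING avg(sal) >= 10.0 AND avg(sal) <= 50.0
    ORDER BY id
    LIMIT 2
""").fetchall()

print(result)
```

The result contains ids 2 and 4. Id 5 also passes the HAVING filter but is cut off by LIMIT, while ids 1 and 3 never reach the LIMIT step at all.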
As soon as MySQL has sent the required number of rows to the client,
it aborts the query unless you are using SQL_CALC_FOUND_ROWS. The
number of rows can then be retrieved with SELECT FOUND_ROWS(). See
Section 13.14, “Information Functions”.
http://dev.mysql.com/doc/refman/5.7/en/limit-optimization.html
This effectively means that if your table has a rather hefty number of rows, the server doesn't need to look at all of them. It can stop as soon as it has found 100, because it knows that's all you need.
I am testing my database design under load and I need to retrieve only a fixed number of rows (5000)
I can specify a LIMIT to achieve this, however it seems that the query builds the result set of all rows that match and then returns only the number of rows specified in the limit. Is that how it is implemented?
Is there a way for MySQL to read one row, then another, and simply stop when it retrieves the 5000th matching row?
MySQL is smart in that if you specify a LIMIT 5000 in your query, and it is possible to produce that result without generating the whole result set first, then it will not build the whole result.
For instance, the following query:
SELECT * FROM table ORDER BY column LIMIT 5000
This query will need to scan the whole table unless there is an index on column, in which case it does the smart thing and uses the index to find the rows with the smallest values of column.
SELECT * FROM `your_table` LIMIT 0, 5000
This will display the first 5000 results from the database.
SELECT * FROM `your_table` LIMIT 1001, 5000
This will show records from 1001 to 6000 (counting from 0).
Complexity of such query is O(LIMIT) (unless you specify order by).
It means that if 10000000 rows will match your query, and you specify limit equal to 5000, then the complexity will be O(5000).
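To check the offset arithmetic of the LIMIT 1001, 5000 example above, here is a small sketch in Python with the standard-library sqlite3 module standing in for MySQL (SQLite accepts the same LIMIT offset, count form). The column n and the 7000 sample rows are assumptions. Each row stores its own zero-based position so the result is easy to read.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (n INTEGER PRIMARY KEY)")
# Invented data: row values 0..6999 equal their zero-based positions.
conn.executemany("INSERT INTO your_table (n) VALUES (?)",
                 [(i,) for i in range(7000)])

# LIMIT 1001, 5000 means: skip 1001 rows, then return the next 5000.
page = [row[0] for row in conn.execute(
    "SELECT n FROM your_table ORDER BY n LIMIT 1001, 5000")]

print(len(page), page[0], page[-1])
```

The page holds 5000 rows, starting at position 1001 and ending at position 6000, counting from 0.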
@Jarosław Gomułka is right.
If you use LIMIT with ORDER BY, MySQL ends the sorting as soon as it has found the first row_count rows of the sorted result, rather than sorting the entire result. If ordering is done by using an index, this is very fast. In either case, after the initial rows have been found, there is no need to sort any remainder of the result set, and MySQL does not do so.
If the set is not sorted, MySQL terminates the SELECT operation as soon as it has enough rows for the result set.
The exact plan the query optimizer uses depends on your query (what fields are being selected, the LIMIT amount and whether there is an ORDER BY) and your table (keys, indexes, and number of rows in the table). Selecting an unindexed column and/or ordering by a non-key column is going to produce a different execution plan than selecting a column and ordering by the primary key column. The latter will not even touch the table, and will only process the number of rows specified in your LIMIT.
How you limit the result set size depends on the database you are using, since each database defines its own syntax for it.
While the SQL:2008 specification defines a standard syntax for limiting a SQL query, MySQL 8 does not support it.
Therefore, on MySQL, you need to use the LIMIT clause to restrict the result set to the Top-N records:
SELECT
    title
FROM
    post
ORDER BY
    id DESC
LIMIT 50
Notice that we are using an ORDER BY clause since, otherwise, there is no guarantee which are the first records to be included in the returning result set.
In MySQL, how can I retrieve ALL rows in a table, starting from row X? For example, starting from row 6:
LIMIT 5,0
This returns nothing, so I tried this:
LIMIT 5,ALL
Still no results (sql error).
I'm not looking for pagination functionality, just retrieving all rows starting from a particular row. LIMIT 5,2000 seems like overkill to me. Somehow Google doesn't seem to get me some answers. Hope you can help.
Thanks
According to the documentation:
To retrieve all rows from a certain offset up to the end of the result set, you can use some large number for the second parameter. This statement retrieves all rows from the 96th row to the last:
SELECT * FROM tbl LIMIT 95, 18446744073709551615;
This is the maximum rows a MyISAM table can hold, 2^64-1.
There is a limit of 2^32 (~4.295E+09) rows in a MyISAM table. If you build MySQL with the --with-big-tables option, the row limitation is increased to (2^32)^2 (1.844E+19) rows. See Section 2.16.2, “Typical configure Options”. Binary distributions for Unix and Linux are built with this option.
If you're looking to get the last x rows, the easiest thing to do is ORDER BY ... DESC and LIMIT to the first x rows. Granted, the sort will slow your query down. But if you're opposed to setting an arbitrarily large number as the second LIMIT argument, then that's the way to do it.
The only solution I am aware of currently is to do as you say and give a ridiculously high number as the second argument to LIMIT. I do not believe there is any difference in performance between specifying a low number and a high number. MySQL will simply stop returning rows at the end of the result set, or when it hits your limit.
I think you don't need to enter the maximum value to select everything with LIMIT. It is enough to get the row count of the table and then use it as the second LIMIT argument.
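A rough sketch of that count-based variant, in Python with the standard-library sqlite3 module standing in for MySQL. The 10-row table and the ORDER BY id are assumptions. Note that the count and the select are two separate statements, so rows inserted in between could be missed unless both run in one transaction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER PRIMARY KEY)")
# Invented data: ids 1..10.
conn.executemany("INSERT INTO tbl (id) VALUES (?)",
                 [(i,) for i in range(1, 11)])

# Step 1: get the table's row count.
total = conn.execute("SELECT COUNT(*) FROM tbl").fetchone()[0]

# Step 2: use it as the second LIMIT argument; offset 5 skips the first five rows.
tail = [row[0] for row in conn.execute(
    "SELECT id FROM tbl ORDER BY id LIMIT ?, ?", (5, total))]

print(tail)
```

Here the result is every row from the 6th to the last, i.e. ids 6 through 10.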
The following query should work too, and is in my opinion more efficient:
SELECT * FROM mytbl WHERE id != 1 ORDER BY id ASC
Because of the ordering, the query finds that id immediately and skips it, so for the following rows it no longer needs to check whether id = 1.