MySQL / PHP query performance? - mysql

Im in a dilema on which one of these methods are most efficient.
Suppose you have a query joining multiple tables and querying thousand of records. Than, you gotta get the total to paginate throughout all these results.
Is it faster to?
1) Do a complete select (suppose you have to select 50's columns), count the rows and than run another query with limits? (Will the MySQL cache help this case already selecting all the columns you need on the first query used to count?)
2) First do the query using COUNT function and than do the query to select the results you need.
3) Instead of using MySQL COUNT function, do the query selecting the ID's for example and use the PHP function mysql_num_rows?
I think the number 2 is the best option, using MySQL built in COUNT function, but I know MySQL uses cache, so, selecting all the results on first query gonna be faster?
Thanks,

Have a look at Found_Rows()
A SELECT statement may include a LIMIT clause to restrict the number
of rows the server returns to the client. In some cases, it is
desirable to know how many rows the statement would have returned
without the LIMIT, but without running the statement again. To obtain
this row count, include a SQL_CALC_FOUND_ROWS option in the SELECT
statement, and then invoke FOUND_ROWS() afterward:
mysql> SELECT SQL_CALC_FOUND_ROWS * FROM tbl_name WHERE id > 100 LIMIT 10;
mysql> SELECT FOUND_ROWS();
The second SELECT returns a number indicating how many rows the first SELECT`
would have returned had it been written without the LIMIT clause.

My guess is number 2, but the truth is that it will depend entirely on data size, tables, indexing, MySql version etc.
The only way of finding the answer to this is to try each one and measure how long they take. But like I say, my hunch would be number 2.

Related

How to get count of rows and rows in one query?

Let's say I have a black box query that I don't really understand how it works, something along the lines of:
SELECT ... FROM ... JOIN ... = A (denoted as A)
Let's say A returns 500 rows.
I want to get the count of the number of rows (500 in this case), and then only return a limit of 50.
How can I wrote a query built around A that would return the number '500' and 50 rows of data?
You can use window functions (available in MySQL 8.0 only) and a row-limiting clause:
select a.*, count(*) over() total_rows
from ( < your query >) a
order by ??
limit 50
Note that I added an order by clause to the query. Although this is not technically required, it is a best practice: without an order by clause where the column (or set of columns) uniquely identifies each row, it is undefined which 50 rows the database will return, and the results may not be consistent over consecutive executions of the same query.
This is what SELECT SQL_CALC_FOUND_ROWS is intended to do.
SELECT SQL_CALC_FOUND_ROWS * FROM tbl_name WHERE id > 100 LIMIT 10;
SELECT FOUND_ROWS();
The first query returns the limited set of rows.
The second query calls FOUND_ROWS() which returns an integer number of how many rows matched the most recent query, the number of rows which would have been returned if that query had not used LIMIT.
See https://dev.mysql.com/doc/refman/8.0/en/information-functions.html#function_found-rows
However, keep in mind that using SQL_CALC_FOUND_ROWS incurs a significant performance cost. Benchmarks show that it's usually faster to just run two queries:
SELECT COUNT(*) FROM tbl_name WHERE id > 100; -- the count of matching rows
SELECT * FROM tbl_name WHERE id > 100 LIMIT 10; -- the limited result
See https://www.percona.com/blog/2007/08/28/to-sql_calc_found_rows-or-not-to-sql_calc_found_rows/
There are a few ways you can do this (assuming that I am understanding your question correctly). You can open run two queries (and point a cursor to each) and then open and return both cursors, or you can run a stored procedure in which the count query is ran first, the result is stored into a variable, then it is used in another query.
Let me know if you would like an example of either of these

Two selects in one query

I have a product table.
I'm going to create a pagination on my frontend.
So I need to do this select:
select * from `product` limit 20 offset 0
I also need the total number of records in the table of products (I'll show the number of pages). I have this query:
select count(*) as all from `product`
I wanted to run these two queries in just one query.
Something like:
select *, count(*) as total from `hospedes` limit 20 offset 0
the above query does not work. is there any way to do this?
I don't believe you can run this as one query unless you are willing to give up the "LIMIT" and just retrieve all records, do a count on the result set, and throw away all but the first 20.
I recommend you run two queries. One to get the total COUNT(*) from the table, and then run a separate query for your pagination.
In theory you can do this:
mysql> SELECT SQL_CALC_FOUND_ROWS * FROM tbl_name
-> WHERE id > 100 LIMIT 10;
mysql> SELECT FOUND_ROWS();
This promises to let you get the information of how many tables match the query as if you had not used LIMIT. Read https://dev.mysql.com/doc/refman/5.7/en/information-functions.html#function_found-rows for details
It does give you the correct count, but using this method usually causes a high performance cost.
Read https://www.percona.com/blog/2007/08/28/to-sql_calc_found_rows-or-not-to-sql_calc_found_rows/ for a test that shows it is a better strategy to run separate queries. Admittedly, that blog is from 2007 and the test was performed on MySQL 5.0. You can try to reproduce their test on the current version of MySQL to see if it has improved since then.

Get total number of rows when using LIMIT? [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Find total number of results in mySQL query with offset+limit
I have a very complex sql query which returns results that are paginated. The problem is to get the total row count before LIMIT I have to run the sql query twice. The first time without the limit clause to get the total row count. The sql query is really complex and I think they must be a better way of doing this without running the query twice.
Luckily since MySQL 4.0.0 you can use SQL_CALC_FOUND_ROWS option in your query which will tell MySQL to count total number of rows disregarding LIMIT clause. You still need to execute a second query in order to retrieve row count, but it’s a simple query and not as complex as your query which retrieved the data.
Usage is pretty simple. In you main query you need to add SQL_CALC_FOUND_ROWS option just after SELECT and in second query you need to use FOUND_ROWS() function to get total number of rows. Queries would look like this:
SELECT SQL_CALC_FOUND_ROWS name, email FROM users WHERE name LIKE 'a%' LIMIT 10;
SELECT FOUND_ROWS();
The only limitation is that you must call second query immediately after the first one because SQL_CALC_FOUND_ROWS does not save number of rows anywhere.
Although this solution also requires two queries it’s much faster, as you execute the main query only once.
You can read more about SQL_CALC_FOUND_ROWS and FOUND_ROWS() in MySQL docs.
EDIT: You should note that in most cases running the query twice is actually faster than SQL_CALC_FOUND_ROWS. see here
EDIT 2019:
The SQL_CALC_FOUND_ROWS query modifier and accompanying FOUND_ROWS() function are deprecated as of MySQL 8.0.17 and will be removed in a future MySQL version.
https://dev.mysql.com/doc/refman/8.0/en/information-functions.html#function_found-rows
It's recommended to use COUNT instead
SELECT * FROM tbl_name WHERE id > 100 LIMIT 10;
SELECT COUNT(*) WHERE id > 100;

Effective use of pagination and MySQL queries

At the moment I run two MySQL queries to handle my pagination...
Query 1 selects all rows from a table so I know how many pages I need to list.
Query 2 selects the rows for the current page (e.g: rows 0 to 19 (LIMIT 0, 19) for page 1, rows 20-39 for page two etc etc).
It seems like a waste of two duplicate queries with the only difference being the LIMIT part.
What would be a better way to do this?
Should I use PHP to filter the results after one query has been run?
Edit: Should I run one query and use something like array_slice() to only list the rows I want?
The best & fastest way is to use 2 MYSQL queries for pagination (as you are already using), to avoid over headache you must simplify the query used to find out the total number of rows by selecting only one column (say the primary key) that's enough.
SELECT * FROM sampletable WHERE condition1>1 AND condition2>2
for paginating such a query you may use these two queries
SELECT * FROM sampletable WHERE condition1>1 AND condition2>2 LIMIT 0,20
SELECT id FROM sampletable WHERE condition1>1 AND condition2>2
Use FOUND_ROWS()
A SELECT statement may include a LIMIT clause to restrict the number of rows the server returns to the client. In some cases, it is desirable to know how many rows the statement would have returned without the LIMIT, but without running the statement again. To obtain this row count, include a SQL_CALC_FOUND_ROWS option in the SELECT statement, and then invoke FOUND_ROWS() afterward:
mysql> SELECT SQL_CALC_FOUND_ROWS * FROM tbl_name
-> WHERE id > 100 LIMIT 10;
mysql> SELECT FOUND_ROWS();
The second SELECT returns a number indicating how many rows the first SELECT would have returned had it been written without the LIMIT clause.
Note that this puts additional strain on the database, because it has to find out the size of the full result set every time. Use SQL_CALC_FOUND_ROWS only when you need it.

get total for limit in mysql using same query?

I am making a pagination method, what i did was:
First query will count all results and the second query will do the normal select with LIMIT
Is there technically any way to do this what I've done, but with only one query?
What I have now:
SELECT count(*) from table
SELECT * FROM table LIMIT 0,10
No one really mentions this, but the correct way of using the SQL_CALC_FOUND_ROWS technique is like this:
Perform your query: SELECT SQL_CALC_FOUND_ROWS * FROM `table` LIMIT 0, 10
Then run this query directly afterwards: SELECT FOUND_ROWS(). The result of this query contains the full count of the previous query, i.e. as if you hadn't used the LIMIT clause. This second query is instantly fast, because the result has already been cached.
You can do it with a subquery :
select
*,
(select count(*) from mytable) as total
from mytable LIMIT 0,10
But I don't think this has any kind of advantage.
edit: Like Ilya said, the total count and the rows have a totally different meaning, there's no real point in wanting to retrieve these data in the same query. I'll stick with the two queries. I just gave this answer for showing that this is possible, not that it is a good idea.
While i've seen some bad approaches to this, when i have looked into this previously there were two commonly accepted solutions:
Running your query and then running the same query with a count as you have done in your question.
Run your query and then run it again with the SQL_CALC_FOUND_ROWS keyword.
eg. SELECT SQL_CALC_FOUND_ROWS * FROM table LIMIT 0,10
This second approach is how phpMyAdmin does it.
You can run the first query and then the second query, and so you'll get both the count and the first results.
A query returns a set of records. The count is definitely not one of the records a "SELECT *" query can return, because there is only one count for the entire result set.
Anyway, you didn't say what programming language you run these SQL queries from and what interface you're using. Maybe this option exists in the interface.
SELECT SQL_CALC_FOUND_ROWS
your query here
Limit ...
Without running any other queries or destroying the session then run
SELECT FOUND_ROWS();
And you will get the total count of rows
The problem with 2 queries is consistency. In the time between the 2 queries the data may be changed. If your logic depends on what is counted has to be returned then your only approach would be to use SQL_CALC_FOUND_ROWS.
As of Mysql 8.0.17 SQL_CALC_FOUND_ROWS and FOUND_ROWS() will be deprecated. I don't know of a consistent solution after this functionality has been removed.