Can i use SQL_CALC_FOUND_ROWS with DISTINCT? - mysql

Hello and have a good time.
I have such query:
SELECT DISTINCT SQL_CALC_FOUND_ROWS SUBSTRING(color, 1, 3) as color FROM colors WHERE color LIKE '$color%' AND ottenok LIKE '$ottenok%';
And after it:
SELECT FOUND_ROWS();
for getting all records, like it has no DISTINCT. But after SELECT FOUND_ROWS() i am getting only unique records.
Have not found in search engines information about using SQL_CALC_FOUND_ROWS with DISTINCT. So i am looking for advice, how can i edit my query to have an ability to get both massives - unique, and non-unique.
p.s. of course, i know that i can remove "distinct" from my query, and get a unique list AFTER my sql query, using php function array_unique(), but this does not suite me, i need to solve this task using only sql...

In short: found_rows() will not get you the count without distinct, you need to count those records separately.
Background: as mysql documentation on found_rows() says (emphasis is mine):
A SELECT statement may include a LIMIT clause to restrict the number of rows the server returns to the client. In some cases, it is desirable to know how many rows the statement would have returned without the LIMIT, but without running the statement again. To obtain this row count, include an SQL_CALC_FOUND_ROWS option in the SELECT statement, and then invoke FOUND_ROWS() afterward:
So, this technique simply allows you to get the count, had no limit been applied to the same statement. Distinct modifier is applied before limit, therefore this technique cannot be used to get the count without distinct.
You can use a separate count(*) query without any distinct modifier to get the desired count.

Related

Finding Total rows for query which uses Limit

I have a function which return me X rows, where X is user selected parameter. I know I can use SQLCALCFOUND_ROWS in query, but I have to use select foundrows(); immediate after. If use select foundrows() after the main query inside my cfquery tag, only the values of total rows are returned. If I use it in another cfquery, it is possible that there is another query in mysql thread and my results are not available. What would be the better way to handle this.
Ok, so I implemented it by wrapping my queries in a cftransaction. Please NOTE: I have to resort to this method only when my primary query is huge and/or running that as subquery to get total record count is not an option. MySql states that SELECT FOUND_ROWS() should run immediately after the query with SQL_CALC_FOUND_ROWS. If I run it in same cfquery tag, I cannot access data from my primary query. If I run another cfquery there is risk that query connection will be returned to pool. #Leigh mentioned that running both queries within a cftransaction ensures that the connection is retained.
Not the best solution, but much better than running a huge query twice.
<cftransaction>
<cfquery name="qry1" datasource="dsn">
select SQL_CALC_FOUND_ROWS col1,col2 from someTable
some complex joins
where a=b
</cfquery>
<cfquery name="qry2" datasource="dsn">
select FOUND_ROWS() as TotalRows;
</cfquery>
</cftransaction>
This answer depends very much on the complexity and time cost of your query. But in a simple example, you can just build a count into the initial query as a subselect. If you have a table of countries for example, you can do the following to get the top 10 and also the total number
select top 10
country_name
,(select COUNT(*) from country) as total
from
country
You could just do a query of the query object that's returned. Feed the name of the query as the datasource and you can run SQL on it. I believe you should be able to do a select count(*) and count the rows that are returned from the first query.

Get total number of rows when using LIMIT? [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Find total number of results in mySQL query with offset+limit
I have a very complex sql query which returns results that are paginated. The problem is to get the total row count before LIMIT I have to run the sql query twice. The first time without the limit clause to get the total row count. The sql query is really complex and I think they must be a better way of doing this without running the query twice.
Luckily since MySQL 4.0.0 you can use SQL_CALC_FOUND_ROWS option in your query which will tell MySQL to count total number of rows disregarding LIMIT clause. You still need to execute a second query in order to retrieve row count, but it’s a simple query and not as complex as your query which retrieved the data.
Usage is pretty simple. In you main query you need to add SQL_CALC_FOUND_ROWS option just after SELECT and in second query you need to use FOUND_ROWS() function to get total number of rows. Queries would look like this:
SELECT SQL_CALC_FOUND_ROWS name, email FROM users WHERE name LIKE 'a%' LIMIT 10;
SELECT FOUND_ROWS();
The only limitation is that you must call second query immediately after the first one because SQL_CALC_FOUND_ROWS does not save number of rows anywhere.
Although this solution also requires two queries it’s much faster, as you execute the main query only once.
You can read more about SQL_CALC_FOUND_ROWS and FOUND_ROWS() in MySQL docs.
EDIT: You should note that in most cases running the query twice is actually faster than SQL_CALC_FOUND_ROWS. see here
EDIT 2019:
The SQL_CALC_FOUND_ROWS query modifier and accompanying FOUND_ROWS() function are deprecated as of MySQL 8.0.17 and will be removed in a future MySQL version.
https://dev.mysql.com/doc/refman/8.0/en/information-functions.html#function_found-rows
It's recommended to use COUNT instead
SELECT * FROM tbl_name WHERE id > 100 LIMIT 10;
SELECT COUNT(*) WHERE id > 100;

MySQL / PHP query performance?

Im in a dilema on which one of these methods are most efficient.
Suppose you have a query joining multiple tables and querying thousand of records. Than, you gotta get the total to paginate throughout all these results.
Is it faster to?
1) Do a complete select (suppose you have to select 50's columns), count the rows and than run another query with limits? (Will the MySQL cache help this case already selecting all the columns you need on the first query used to count?)
2) First do the query using COUNT function and than do the query to select the results you need.
3) Instead of using MySQL COUNT function, do the query selecting the ID's for example and use the PHP function mysql_num_rows?
I think the number 2 is the best option, using MySQL built in COUNT function, but I know MySQL uses cache, so, selecting all the results on first query gonna be faster?
Thanks,
Have a look at Found_Rows()
A SELECT statement may include a LIMIT clause to restrict the number
of rows the server returns to the client. In some cases, it is
desirable to know how many rows the statement would have returned
without the LIMIT, but without running the statement again. To obtain
this row count, include a SQL_CALC_FOUND_ROWS option in the SELECT
statement, and then invoke FOUND_ROWS() afterward:
mysql> SELECT SQL_CALC_FOUND_ROWS * FROM tbl_name WHERE id > 100 LIMIT 10;
mysql> SELECT FOUND_ROWS();
The second SELECT returns a number indicating how many rows the first SELECT`
would have returned had it been written without the LIMIT clause.
My guess is number 2, but the truth is that it will depend entirely on data size, tables, indexing, MySql version etc.
The only way of finding the answer to this is to try each one and measure how long they take. But like I say, my hunch would be number 2.

Effective use of pagination and MySQL queries

At the moment I run two MySQL queries to handle my pagination...
Query 1 selects all rows from a table so I know how many pages I need to list.
Query 2 selects the rows for the current page (e.g: rows 0 to 19 (LIMIT 0, 19) for page 1, rows 20-39 for page two etc etc).
It seems like a waste of two duplicate queries with the only difference being the LIMIT part.
What would be a better way to do this?
Should I use PHP to filter the results after one query has been run?
Edit: Should I run one query and use something like array_slice() to only list the rows I want?
The best & fastest way is to use 2 MYSQL queries for pagination (as you are already using), to avoid over headache you must simplify the query used to find out the total number of rows by selecting only one column (say the primary key) that's enough.
SELECT * FROM sampletable WHERE condition1>1 AND condition2>2
for paginating such a query you may use these two queries
SELECT * FROM sampletable WHERE condition1>1 AND condition2>2 LIMIT 0,20
SELECT id FROM sampletable WHERE condition1>1 AND condition2>2
Use FOUND_ROWS()
A SELECT statement may include a LIMIT clause to restrict the number of rows the server returns to the client. In some cases, it is desirable to know how many rows the statement would have returned without the LIMIT, but without running the statement again. To obtain this row count, include a SQL_CALC_FOUND_ROWS option in the SELECT statement, and then invoke FOUND_ROWS() afterward:
mysql> SELECT SQL_CALC_FOUND_ROWS * FROM tbl_name
-> WHERE id > 100 LIMIT 10;
mysql> SELECT FOUND_ROWS();
The second SELECT returns a number indicating how many rows the first SELECT would have returned had it been written without the LIMIT clause.
Note that this puts additional strain on the database, because it has to find out the size of the full result set every time. Use SQL_CALC_FOUND_ROWS only when you need it.

get total for limit in mysql using same query?

I am making a pagination method, what i did was:
First query will count all results and the second query will do the normal select with LIMIT
Is there technically any way to do this what I've done, but with only one query?
What I have now:
SELECT count(*) from table
SELECT * FROM table LIMIT 0,10
No one really mentions this, but the correct way of using the SQL_CALC_FOUND_ROWS technique is like this:
Perform your query: SELECT SQL_CALC_FOUND_ROWS * FROM `table` LIMIT 0, 10
Then run this query directly afterwards: SELECT FOUND_ROWS(). The result of this query contains the full count of the previous query, i.e. as if you hadn't used the LIMIT clause. This second query is instantly fast, because the result has already been cached.
You can do it with a subquery :
select
*,
(select count(*) from mytable) as total
from mytable LIMIT 0,10
But I don't think this has any kind of advantage.
edit: Like Ilya said, the total count and the rows have a totally different meaning, there's no real point in wanting to retrieve these data in the same query. I'll stick with the two queries. I just gave this answer for showing that this is possible, not that it is a good idea.
While i've seen some bad approaches to this, when i have looked into this previously there were two commonly accepted solutions:
Running your query and then running the same query with a count as you have done in your question.
Run your query and then run it again with the SQL_CALC_FOUND_ROWS keyword.
eg. SELECT SQL_CALC_FOUND_ROWS * FROM table LIMIT 0,10
This second approach is how phpMyAdmin does it.
You can run the first query and then the second query, and so you'll get both the count and the first results.
A query returns a set of records. The count is definitely not one of the records a "SELECT *" query can return, because there is only one count for the entire result set.
Anyway, you didn't say what programming language you run these SQL queries from and what interface you're using. Maybe this option exists in the interface.
SELECT SQL_CALC_FOUND_ROWS
your query here
Limit ...
Without running any other queries or destroying the session then run
SELECT FOUND_ROWS();
And you will get the total count of rows
The problem with 2 queries is consistency. In the time between the 2 queries the data may be changed. If your logic depends on what is counted has to be returned then your only approach would be to use SQL_CALC_FOUND_ROWS.
As of Mysql 8.0.17 SQL_CALC_FOUND_ROWS and FOUND_ROWS() will be deprecated. I don't know of a consistent solution after this functionality has been removed.