Does MySQL block nested loop optimizer switch affect query results? - mysql

If I run SET optimizer_switch='block_nested_loop=off', as suggested here, can I be 100% certain that I get the same result with the option on and off?
I want to change this option to off, because it increases query performance in my case from 56s to 1s.
What are the pros and cons for this optimizer switch, is it safe?

Yes, it is safe. The optimizer_switch tells MySQL how to search for the answer to the query. Regardless of how optimizer_switch is set, it will generate the same result for your query (unless there are bugs in MySQL).
The only disadvantage to using set optimizer_switch='block_nested_loop=off' is that other queries might become slower, so you might want to set it back to on after executing your query.
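One way to limit the side effect on other queries is to change the switch only for the current session and restore the previous value afterwards. This is a minimal sketch: the variable name @old_switch is made up for illustration, and SELECT 1 stands in for the slow query.

```sql
-- Remember the current session value so it can be restored exactly.
SET @old_switch := @@SESSION.optimizer_switch;

-- Disable block nested loop for this session only; other connections are unaffected.
SET SESSION optimizer_switch = 'block_nested_loop=off';

-- Stand-in for the query that benefits from the change.
SELECT 1;

-- Put the session back the way it was.
SET SESSION optimizer_switch = @old_switch;
```

Saving and restoring the whole string avoids assuming that the flag was on before the query ran.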

Related

Explain MySQL query without changing row_count() from previous query

In an application, sometimes queries are slow, and I "explain" them after the fact (if they are slow) and log them, so I can improve the application over time.
However, if I run an "explain " after, row_count() no longer reflects the number of rows affected by the original query, which I don't want. Is there a way to run an explain query (or perhaps any query), and not change row_count()?
Note: What I am currently doing is to open a separate link to the database, and explain using that link. That works, but I am unable to explain queries using temporary tables in that way. I am looking for a different solution that will preserve row_count() and work with temporary tables.
Capture row_count() into a variable, if you need it later. You should probably be doing this anyway, since the scope of validity of this value is very limited.
The value is tied to the specific connection, and is reset with each query you execute... and EXPLAIN ... is a query.
There's not a way to change this behavior.
Rearrange your code...
SELECT ...
get ROW_COUNT ...
EXPLAIN SELECT ...
Note also that the "rows" column in the EXPLAIN output is an estimate; it will rarely match ROW_COUNT().
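The "capture it into a variable" advice could look roughly like this. It is only a sketch: the temporary table row_count_demo and the variable @affected are invented for the example, and EXPLAIN on an UPDATE assumes MySQL 5.6.3 or later.

```sql
-- Hypothetical temporary table, so the EXPLAIN has to run on the same connection.
CREATE TEMPORARY TABLE row_count_demo (id INT PRIMARY KEY, status VARCHAR(10));
INSERT INTO row_count_demo VALUES (1, 'new'), (2, 'new'), (3, 'done');

-- The statement whose affected-row count we care about.
UPDATE row_count_demo SET status = 'done' WHERE status = 'new';

-- Capture the count immediately, before any other statement replaces it.
SET @affected := ROW_COUNT();

-- Explaining afterwards no longer matters for the saved value.
EXPLAIN UPDATE row_count_demo SET status = 'done' WHERE status = 'new';

-- The application reads the saved value instead of calling ROW_COUNT() again.
SELECT @affected;
```

Because everything happens on one connection, the temporary table stays visible to the EXPLAIN.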

How to check performance of mysql query?

I have been learning about query optimization and how to increase query performance, but in general, when we write a query, how can we know whether it is a good one?
I know we can see the execution time, but that time does not give a clear indication without a good amount of data. And usually, when we write a new query, we don't have much data to test with.
I have learned about the performance of individual clauses and commands. But is there anything with which we can check the performance of a query? Performance here does not mean execution time; it means whether a query is "ok" or not, independent of the data.
We cannot create as much data as there would be in the live database.
General performance of a query can be checked using the EXPLAIN command in MySQL. See https://dev.mysql.com/doc/refman/5.7/en/using-explain.html
It shows you how MySQL engine plans to execute the query and allows you to do some basic sanity checks i.e. if the engine will use keys and indexes to execute the query, see how MySQL will execute the joins (i.e. if foreign keys aren't missing) and many more.
You can find some general tips about how to use EXPLAIN for optimizing queries here (along with some nice samples): http://www.sitepoint.com/using-explain-to-write-better-mysql-queries/
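As a rough illustration of such a sanity check, here is an EXPLAIN on a join. The tables orders and customers and their columns are hypothetical.

```sql
-- Hypothetical schema: orders(id, customer_id, created_at), customers(id, name).
EXPLAIN
SELECT o.id, c.name
FROM orders AS o
JOIN customers AS c ON c.id = o.customer_id
WHERE o.created_at >= '2024-01-01';

-- Columns worth checking in the output:
--   type  : "ALL" means a full table scan; "ref", "eq_ref" or "range" mean an index is used
--   key   : the index actually chosen (NULL if none)
--   rows  : the optimizer's estimate of rows examined for that table
--   Extra : notes such as "Using filesort" or "Using temporary"
```

The plan depends on table statistics, so the same query can be explained differently on a small development table than on production data.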
As mentioned above, the right query is always data-dependent. Up to a point, you can use the methods below to check performance:
You can use EXPLAIN to understand the query execution plan, which may help you correct some things. For more information,
refer to the documentation: Optimizing Queries with EXPLAIN.
You can use a query analyzer. Refer to MySQL Query Analyzer.
I like to throw my cookbook at Newbies because they often do not understand how important INDEXes are, or don't know some of the subtleties.
When experimenting with multiple choices of query/schema, I like to use
FLUSH STATUS;
SELECT ...;
SHOW SESSION STATUS LIKE 'Handler%';
That counts low level actions, such as "read next record". It essentially eliminates caching issues, disk speed, etc, and is very reproducible. Often there is a counter in that output (or multiple counters) that match the number of rows in the table (sometimes +/-1) -- that tells me there are table scan(s). This is usually not as good as if some INDEX were being used. If the query has a LIMIT, that value may show up in some Handler.
A really bad query, such as a CROSS JOIN, would show a value of N*M, where N and M are the row counts for the two tables.
I used the Handler technique to 'prove' that virtually all published "get me a random row" techniques require a table scan. Then I could experiment with small tables and Handlers to come up with a list of faster random routines.
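The Handler technique can be tried on a tiny table, comparing the counters before and after adding an index. This is only a sketch: the table handler_demo and index idx_val are invented for the example, and the exact counter values depend on the server version and the plan chosen.

```sql
-- Small hypothetical table with no index on val.
CREATE TABLE handler_demo (
  id  INT PRIMARY KEY AUTO_INCREMENT,
  val INT NOT NULL
);
INSERT INTO handler_demo (val) VALUES (1), (2), (3), (4), (5), (6), (7), (8);

-- Without an index: expect a Handler_read_rnd_next count near the table size (a table scan).
FLUSH STATUS;
SELECT * FROM handler_demo WHERE val = 5;
SHOW SESSION STATUS LIKE 'Handler_read%';

-- With an index: the scan counter should drop and Handler_read_key should appear instead.
ALTER TABLE handler_demo ADD INDEX idx_val (val);
FLUSH STATUS;
SELECT * FROM handler_demo WHERE val = 5;
SHOW SESSION STATUS LIKE 'Handler_read%';
```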
Another tip when timing: turn off the query cache (or use SELECT SQL_NO_CACHE). The query cache was removed in MySQL 8.0, so this only matters on older versions.

Preventing a single query from appearing in slow query log

Is there a way to prevent a single query from appearing in the MySQL slow query log?
One may actually disable logging before executing the query (by setting a global variable) and enable it back after the query, but this would prevent logging in other threads as well, which is not desirable.
Do you have any ideas?
In MySQL 5.1 and later, you can make runtime changes to the time threshold for which queries are logged in the slow query log. Set it to something ridiculously high and the query is not likely to be logged.
SET SESSION long_query_time = 20000;
SELECT ...whatever...
SET SESSION long_query_time = 2;
Assuming 2 is the normal threshold you use.
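If you would rather not hard-code the normal threshold, a variation is to save the session value first and restore it afterwards. This is a sketch: the variable name @old_lqt is made up, and SELECT SLEEP(3) stands in for the query that should stay out of the log.

```sql
-- Remember whatever threshold this session currently uses.
SET @old_lqt := @@SESSION.long_query_time;

-- Raise the threshold so high that the next query will not qualify as slow.
SET SESSION long_query_time = 20000;

-- Stand-in for the query that should not be logged.
SELECT SLEEP(3);

-- Restore the original threshold for the rest of the session.
SET SESSION long_query_time = @old_lqt;
```

Only the current session is affected, so slow queries from other threads are still logged.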
I don't know if you can prevent a single query from appearing in the slow query log, but you could grep the output of the log instead. Having said that, if I remember correctly, every slow query is dumped as multiple lines, so it would not be easy to grep it out, but not impossible.
mysqldumpslow has a "-g pattern" option to "Consider only queries that match the (grep-style) pattern." which may help in your situation.
I hope this helps.
Cheers
Tymek

Sphinx indexer slows down the database: How to give it low priority?

All my tables use InnoDB, and I have set sphinx sql_range_step to the minimum, which is 128. This improved the performance a lot, but it is still very slow if you make a request right after a new step begins.
I'm sure it would work just fine if I could reduce the range step to 10 or something, but someone found that the min value is hardcoded and there is no way to change it (other than editing the source).
So I was wondering if there was a way to deal with this directly from MySQL. When I am indexing a database, the other databases aren't affected, so it's not the whole server which has been slowed down, but only the database I am indexing.
Is there a way to give less priority to a user or query, or something?
First, try to optimize your SQL queries, sql_query and sql_query_range.
Also, you can throttle queries while indexing via sql_ranged_throttle. For example, set it to 1000, to get 1 second (1000 ms) delay before each ranged query.

Is there an effect on the speed of a query when using SQL_CALC_FOUND_ROWS in MySQL?

The other day I found the FOUND_ROWS() function (here) in MySQL and its corresponding SQL_CALC_FOUND_ROWS option. The latter looks especially useful (instead of running a second query to get the row count).
I'm wondering what speed impact there is by adding SQL_CALC_FOUND_ROWS to a query?
I'm guessing it will be much faster than running a second query to count the rows, but will the difference be large? Also, I have found that limiting a query makes it much faster (for example, when you get the first 10 rows of 1000). Will adding SQL_CALC_FOUND_ROWS to a query with a small limit cause the query to run much slower?
I know I can test this, but I'm wondering about general practices here.
When I was at the MySQL Conference in 2008, part of one session was dedicated to exactly this - benchmarks between SQL_CALC_FOUND_ROWS and doing a separate SELECT.
I believe the result was that there was no benefit to SQL_CALC_FOUND_ROWS - it wasn't faster, in fact it may have been slower. There was also a 3rd way.
Additionally, you don't always need this information, so I would go the extra query route.
I'll try to find the slides...
Edit: Hrm, google tells me that I actually liveblogged from that session: http://beerpla.net/2008/04/16/mysql-conference-liveblogging-mysql-performance-under-a-microscope-the-tobias-and-jay-show-wednesday-200pm/. Google wins when memory fails.
To compute the value for SQL_CALC_FOUND_ROWS, the query is executed as if no LIMIT were set, but the result set sent to the client still obeys the LIMIT.
Update: for COUNT(*) operations which would be using only the index, SQL_CALC_FOUND_ROWS is slower (reference).
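The two approaches being compared look roughly like this. The table articles and its columns are hypothetical, and SQL_CALC_FOUND_ROWS is deprecated as of MySQL 8.0.17, so the first variant is shown for comparison only.

```sql
-- Variant 1: one limited query plus FOUND_ROWS().
-- The server still has to find every matching row to know the total.
SELECT SQL_CALC_FOUND_ROWS id, title
FROM articles
WHERE published = 1
ORDER BY id
LIMIT 10;
SELECT FOUND_ROWS();

-- Variant 2: the limited query and a separate count.
-- The count can often be answered from an index on (published) alone.
SELECT id, title
FROM articles
WHERE published = 1
ORDER BY id
LIMIT 10;
SELECT COUNT(*) FROM articles WHERE published = 1;
```

Which variant is faster depends on the indexes and data, so benchmarking both on your own schema is still the safest guide.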
I assume it would be slightly faster for queries where you need to know the number of rows, but it would incur an overhead for queries where you don't.
The best advice I could give is to try it out on your development server and benchmark the difference. Every setup is different.
I would advise using as few proprietary SQL extensions as possible when developing an application (or actually not using SQL queries at all). Doing a separate query is portable, and I don't think MySQL could do better at getting the actual information than re-querying. By the way, as the page mentions, the command also has some drawbacks when used in replicated environments.