How to check performance of mysql query? - mysql

I have been learning query optimization, increase query performance and all but in general if we create a query how can we know if this is a wise query.
I know we can see the execution time below, But this time will not give a clear indication without a good amount of data. And usually, when we create a new query we don't have much data to check.
I have learned about clauses and commands performance. But is there is anything by which we can check the performance of the query? Performance here is not execution time, it means that whether a query is "ok" or not, without data dependency.
As we cannot create that much data that would be in live database.

General performance of a query can be checked using the EXPLAIN command in MySQL. See https://dev.mysql.com/doc/refman/5.7/en/using-explain.html
It shows you how MySQL engine plans to execute the query and allows you to do some basic sanity checks i.e. if the engine will use keys and indexes to execute the query, see how MySQL will execute the joins (i.e. if foreign keys aren't missing) and many more.
You can find some general tips about how to use EXPLAIN for optimizing queries here (along with some nice samples): http://www.sitepoint.com/using-explain-to-write-better-mysql-queries/

As mentioned above, Right query is always data-dependent. Up to some level you can use the below methods to check the performance
You can use Explain to understand the Query Execution Plan and that may help you to correct some stuffs. For more info :
Refer Documentation Optimizing Queries with EXPLAIN
You can use Query Analyzer. Refer MySQL Query Analyzer

I like to throw my cookbook at Newbies because they often do not understand how important INDEXes are, or don't know some of the subtleties.
When experimenting with multiple choices of query/schema, I like to use
FLUSH STATUS;
SELECT ...;
SHOW SESSION STATUS LIKE 'Handler%';
That counts low level actions, such as "read next record". It essentially eliminates caching issues, disk speed, etc, and is very reproducible. Often there is a counter in that output (or multiple counters) that match the number of rows in the table (sometimes +/-1) -- that tells me there are table scan(s). This is usually not as good as if some INDEX were being used. If the query has a LIMIT, that value may show up in some Handler.
A really bad query, such as a CROSS JOIN, would show a value of N*M, where N and M are the row counts for the two tables.
I used the Handler technique to 'prove' that virtually all published "get me a random row" techniques require a table scan. Then I could experiment with small tables and Handlers to come up with a list of faster random routines.
Another tip when timing... Turn off the Query_cache (or use SELECT SQL_NO_CACHE).

Related

Join order differs for between two instances of the same Mysql DB

There is a query that I want to optimize. To make some tests, I took a snapshot of the production database and create a new test instance of this database. Using the explain clause, I can see that the order of the joins differ between the two databases. The two databases have the same version (MySQL 5.6.19a), the same engine (InnoDB), the same schema, the same indexes, the same data, and are executed on the same material. The only difference, is that the production database use more memory (obviously) because it has more connections to it.
What may cause the join order to be different?
The memory usage?
The indexes are still building in the test instance?
The indexes of the production database are fragmented?
This is rare but quite feasible. InnoDB has "statistics" about each index on each table; it uses them to decide what it the best way to perform the query, including what order to look at the tables.
The statistics used to come from 8 'random' dives into the BTree to get a crude feel for the number of rows and the distribution of the data. The timing of the dives, the number '8', and the randomness have all been criticized, and gradually they have been improved. Only some improvements exist in 5.6.19.
Also the "cost" model of deciding how to perform the query has recently had an overhaul (5.7 / 8.0). 8.0 and MariaDB 10.0 have "histograms", which should lead to better query plan choices. Not yet implemented (as of 8.0.0): Noticing which blocks are already cached; this could picking a 'worse' index because more of it is cached, hence faster.
Because of the complexity of the optimization problem and the huge number of possibilities, there are even some cases where a newer version picks a worse query plan.
Even if you are running the same query on the same machine, the query plan could be different.
I presume you already knew that changing a constant in the query can change the query plan -- and do it for the better. I have seen the same query come up with 6 different query plans, presumably due to different constants. This can be annoying if you are doing EXPLAIN on a query found in the slowlog -- you can't be sure that that query plan was used when it was "slow".
We simply have to live with all this.
You could do ANALYZE TABLE to recompute the statistics. But that can make things worse or better, depending on the phase of the moon. It might even (coincidentally) make your two instances perform the query the same.
The real question is "did one server run your query significantly faster than the other?" (After accounting for caching, other activity, etc, etc.)
When both of two tables in a JOIN are being filtered (something in WHERE), it is very difficult for the Optimizer to decide. If there is also ORDER BY and LIMIT, it becomes even harder to decide.
If you would like to provide your SELECT, its EXPLAIN, and SHOW CREATE TABLE, we can discuss details. (But start a new question.)

SQL - Query Performance

How to find the inefficient queries in mysql database ? I want to do performance tuning on my queries , but i coudn't find where my queries are located ? Please suggest me where can i find mysql queries for my tables .
Thanks
Prabhakaran.R
You can enable the general log and slow query logs.
Enabling general query log will log all the queries and might be heavy if you have many reads/writes. In slow query log, you can mention a threshold and only queries taking time beyond some time will be logged. Post that, you can manually analyze it or you can use tools provided( Percona has great tools)
Have you analyzed your queries with explain plans? You may be able to find a query that will return the result set you wish that isn't as heavy a load on the query engine. Remember to only select the columns you actually need (try to avoid SELECT *), well planned indexing and to use inner/outer joins in favour of a huge list of WHERE clause filters.
Good luck.
RP
In addition to what the others said, use pt-query-digest (from percona.com) to summarize the slowlog. That will tell you the worst queries.
Performance tuning often involves knowing what indexes to put on tables. My index cookbook is one place to learn how to build an INDEX for a given SELECT.

How to find slower MySQL queries from many small queries

I'm wondering if anyone has a suggestion for my situation:
I have a process that runs many tens of thousands of queries. The whole process takes between 5 and 10 minutes. I want to know which queries are running slower than the rest, but I know that none of them are running for more than say, 5 seconds (with this many queries, that would be very noticeable in my logs). How should I find out which ones are taking the most time, and are the ones that, if optimized, would provide the best results?
MORE DETAILS:
My queries run single-threaded and synchronous, and I'd say 70% SELECT and 30% INSERT/UPDATE. I'd have to get some heads together and determine if the work can be split up into different units that can be run simultaneously - I'm not sure...
All the queries are either simple INSERT statements, single-property UPDATE statements, or SELECT statements on either a primary or foreign key or a two-field ANDed restriction.
DESCRIPTION OF THE ISSUE:
What I'm doing is basically copying a complex directed graph structure, in its entirety. Nodes are database entries, and adjacencies represent essentially foreign keys, but not strictly-speaking (they could be a two-field combination, where the first says what table the second is the id for).
Take a look at MySQL's slow query log. You can configure the threshold of what is regarded as "slow".
Basically, you track time from query start to query end, e.g.:
function getmicrotime() {
list($usec, $sec) = explode(" ",microtime());
return ((float)$usec + (float)$sec);
}
$query_start = getmicrotime();
$query = 'SELECT ...';
mysql_query($query, $connect);
$query_end = getmicrotime();
if ($query_end - $query_start > 2) {
// add query to log
your_slow_queries_logger($query);
}
I'd sugest to make some kind of a wrapper for mysql_query function that would take care of logging slow queries.
I, personally, log my slow queries using syslog
If you serious in optimizing MySQL, have a read at this book
High Performance MySQL: Optimization, Backups, Replication, and More
Some of the contents may be outdated for the newest MySQL versions, but should be sufficient as it covers MySQL 5.1. It also has an extensive chapter on benchmarking along with techniques and methods and guiding you to create a benchmark plan that suits your needs.
It also has chapters on Index and Query optimizations which would be very helpful to you if you need to optimize the slow queries identified with the benchmarks.

Optimizing MySQL queries/database

I have two tables TABLE A and TABLE B.
TABLE A contain 1 million (1,000,000) records and 4 fields while TABLE 2 contain 60,000 and 3 fields.
I am running a query which joins these two tables and usees WHERE clause to find specific products like WHERE product like '%Bags%' and product like 'Bags%' e.t.c.
When I run the query directly in phpMyAdmin then it returns records in around 1 or 2 seconds. But when they are being used on website, they are sometime taking 9 or 10 seconds according to MySQL 'slow query' log. Actually my website response was very slow at times so upon investigation I found out it is due to MySQL as I came to know about 'slow query log'.
The slow query log consists of all SQL statements that took more than long_query_time seconds to execute and required at least min_examined_row_limit rows to be examined.
So according to that log "query_time" for above query was 13 seconds while in some cases they even had "query_time" exceeding 50 seconds.
Both my tables are using PRIMARY keys as well as INDEXES. So I want to know how can I optimize them more or is there any way I can optimize MySQL settings in general?
This slowness of website doesn't happen all the time but sometimes (may be once in a week) and lasts for around 1 or 2 minutes. It gets decent amount of traffic and there are many other queries too, the above I posted was just one example.
Thanks
For all things MySQL and performance related, check out http://www.mysqlperformanceblog.com/
Check your queries with EXPLAIN, see here and here for info on how to use EXPLAIN as query diagnostic tool.
It's not enough to just have indexes. Are you indexing the fields searched in the WHERE clause? Also do you have indexes for the fields used in the WHERE clause (including the fields you mention in ORDER BY, GROUP BY, and HAVING clauses as well as JOINs)? If you have grouped fields in a single index, that index won't be hit unless you have a query that searches all those fields together. If you group fields in an index make sure they the index will actually be used in your query (EXPLAIN is your friend).
That said, it could be many other things as well: poorly configured MySQL server, poorly tuned server, bad schema. But your queries and your indexes are good place to start your investigation.
Here is a nice summary of performance best practices from Jay Pipes of MySQL.
like '%Bags%' query cannot be optimized using indexes.
The only way to improve performance here is to use fulltext indexes or get sphinx to search.
Its because of some other queries are run at the time when you are going to refresh the page of your website. so if for example your website going to run 8-10 queries at time of page refresh then it will take some more time than you run single query in phpmyadmin. and if its take 1-1.5 min to execute then its may not the query problem but it may have prob with the server speed also.
and you also can use MATCH() AGAINST() statement for optimize this type of search queries.
Otherwise you are already using PRIMARY KEY, INDEXES and JOINS so there is no need to worry about other things.
just check it out.
Thanks.
There are many ways to optimize Databases and queries. My method is the following.
Look at the DB Schema and see if it makes sense
Most often, Databases have bad designs and are not normalized. This can greatly affect the speed of your Database. As a general case, learn the 3 Normal Forms and apply them at all times. The normal forms above 3rd Normal Form are often called de-normalization forms but what this really means is that they break some rules to make the Database faster.
What I suggest is to stick to the 3rd normal form except if you are a DBA (which means you know subsequent forms and know what you're doing). Normalization after the 3rd NF is often done at a later time, not during design.
Only query what you really need
Filter as much as possible
Your Where Clause is the most important part for optimization.
Select only the fields you need
Never use "Select *" -- Specify only the fields you need; it will be faster and will use less bandwidth.
Be careful with joins
Joins are expensive in terms of time. Make sure that you use all the keys that relate the two tables together and don't join to unused tables -- always try to join on indexed fields. The join type is important as well (INNER, OUTER,... ).
Optimize queries and stored procedures (Most Run First)
Queries are very fast. Generally, you can retrieve many records in less than a second, even with joins, sorting and calculations. As a rule of thumb, if your query is longer than a second, you can probably optimize it.
Start with the Queries that are most often used as well as the Queries that take the most time to execute.
Add, remove or modify indexes
If your query does Full Table Scans, indexes and proper filtering can solve what is normally a very time-consuming process. All primary keys need indexes because they makes joins faster. This also means that all tables need a primary key. You can also add indexes on fields you often use for filtering in the Where Clauses.
You especially want to use Indexes on Integers, Booleans, and Numbers. On the other hand, you probably don't want to use indexes on Blobs, VarChars and Long Strings.
Be careful with adding indexes because they need to be maintained by the database. If you do many updates on that field, maintaining indexes might take more time than it saves.
In the Internet world, read-only tables are very common. When a table is read-only, you can add indexes with less negative impact because indexes don't need to be maintained (or only rarely need maintenance).
Move Queries to Stored Procedures (SP)
Stored Procedures are usually better and faster than queries for the following reasons:
Stored Procedures are compiled (SQL Code is not), making them faster than SQL code.
SPs don't use as much bandwidth because you can do many queries in one SP. SPs also stay on the server until the final results are returned.
Stored Procedures are run on the server, which is typically faster.
Calculations in code (VB, Java, C++, ...) are not as fast as SP in most cases.
It keeps your DB access code separate from your presentation layer, which makes it easier to maintain (3 tiers model).
Remove unneeded Views
Views are a special type of Query -- they are not tables. They are logical and not physical so every time you run select * from MyView, you run the query that makes the view and your query on the view.
If you always need the same information, views could be good.
If you have to filter the View, it's like running a query on a query -- it's slower.
Tune DB settings
You can tune the DB in many ways. Update statistics used by the optimizer, run optimization options, make the DB read-only, etc... That takes a broader knowledge of the DB you work with and is mostly done by the DBA.
****> Using Query Analysers****
In many Databases, there is a tool for running and optimizing queries. SQL Server has a tool called the Query Analyser, which is very useful for optimizing. You can write queries, execute them and, more importantly, see the execution plan. You use the execution to understand what SQL Server does with your query.

Does MySQL record how often it uses indices?

I've got a table (InnoDB) with a fair number of indices. It would be great to know if one (or more) of these was never actually used. I don't care as much about the disk space, but insertion speed is sometimes an issue.
Does MySQL record any statistics on how often it has used each index when running queries?
There is a way to find unused indexes with a small patch to MySQL.
I'd suggest turning on profiling and doing some bottleneck analysis.
set profiling=1;
After which point you can let some of your heavy queries run for awhile. Eventually, you turn it off and than examine the queries that ran to see which were heaviest in execution time. If you don't see any
Some other commands to note are:
show profiles;
select sum(duration) from information_schema.profiling where query_id=<ID you want to look at>;
show profile for query <you want to look at>;
Finally if you wish to see unused indexes and you have userstats enabled
SELECT DISTINCT s.TABLE_SCHEMA, s.TABLE_NAME, s.INDEX_NAME
FROM information_schema.statistics `s` LEFT JOIN information_schema.index_statistics INDXS
ON (s.TABLE_SCHEMA = INDXS.TABLE_SCHEMA AND
s.TABLE_NAME=INDXS.TABLE_NAME AND
s.INDEX_NAME=INDXS.INDEX_NAME)
WHERE INDXS.TABLE_SCHEMA IS NULL;
AFAIK, it only records total count of index usages.
SHOW STATUS; has the Select_scan column.
The plans for queries don't vary much - turn on the (slow) query logging with a threshold of 0 seconds, then write a bit of code in the language of your choice to parse the log files and strip all the literals out the queries, then do explain plans for any query executed more then 'N' times.
C.