Is there any software that can analyze a MySQL query and suggest a specific index to create?
I know it's best to do it by hand, but I need something that can save some time.
Many thanks,
The MySQL Enterprise Monitor has a Query Analyzer feature. But MEM is not free.
Percona Toolkit is a free, open-source software product that gives you most of the information to do the analysis yourself.
pt-query-digest --explain analyzes the top queries that appear in your query log and shows you their current execution plan.
pt-index-usage analyzes the queries in your query log and shows you how they are using indexes (and also shows you indexes that it considered but decided not to use).
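For example (a minimal sketch; the slow-log path and the connection credentials are assumptions, adjust them for your server):

```
# Summarize the slow query log and rank queries by total response time.
pt-query-digest /var/log/mysql/mysql-slow.log

# Same, but also connect to the server and print the EXPLAIN plan
# for a sample of each top query.
pt-query-digest --explain h=localhost,u=me,p=secret /var/log/mysql/mysql-slow.log

# Replay the logged queries and report how they use indexes, including
# indexes that exist but are never used.
pt-index-usage --host localhost --user me --password secret /var/log/mysql/mysql-slow.log
```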
Full disclosure: I work for Percona.
Well, this won't automate the process, but it explains how you can figure out what the best INDEX is for a given SELECT: Index Cookbook
I agree with Bill that pt-query-digest (together with the slowlog) is an excellent way to identify the "worst" queries. (Disclosure: I do not work for Percona.)
We use the Percona monitoring tool to monitor a MySQL DB, but generating a slow-log-file report does not give us instant results. Is there a better approach, using metrics/PromQL or Query Analytics etc., to get the min, max, and average times of critical queries?
min, max, average -- these don't make sense until you have enough samples to take the min, max, and average over. Rethink the need for "instant results".
pt-query-digest could be run daily or hourly (or whatever) to get results for the "recent past".
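For example (a sketch; the log path and report directory are assumptions):

```
# Summarize only the last hour of the slow log; schedule this from cron
# to get a rolling view of the "recent past".
pt-query-digest --since 1h /var/log/mysql/mysql-slow.log \
  > /var/reports/digest-$(date +%F-%H).txt
```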
A lot of metrics can be graphed by the various monitoring tools available from Percona, MariaDB, and Oracle, plus others. Some cost money. Some come "close" to "instant results" even for slow queries.
Please describe your goal in different words; we may be able to better direct you.
Metrics like SHOW GLOBAL STATUS LIKE 'Threads_running'; (or a graph monitoring that) can spot a spike in real time. But there is nothing actionable in merely knowing that there is a spike.
I prefer looking at the slowlog afterward. The "worst" queries are readily identified by pt-query-digest. Spikes and bottlenecks can be identified, but not until they are "finished".
Deadlock information comes from the hard-to-parse SHOW ENGINE InnoDB STATUS;, but only one deadlock at a time, and after the fact.
In a not-well-tuned system, the first entry in pt-query-digest is (sometimes) a query that consumes over 50% of the system resources. Fixing that one query makes a big difference. Very cost-effective.
I want to check the performance of my database in MySQL. I googled and came across commands like SHOW FULL PROCESSLIST, but they are not very clear to me. I just want to measure the performance of the database in terms of how much heap memory it is taking and other such metrics.
Is there any way to assess the performance of the database, so that I can optimize and improve it?
Thanks in advance
The basic tool is MySQL Workbench which will work with any recent version of MySQL. It's not as powerful as the enterprise version, but is a great place to start.
The configuration can be exposed with SHOW VARIABLES, and the current state of the system is exposed through SHOW STATUS. These status numbers are what end up being graphed in most tools.
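For example (a minimal sketch; these particular names are just common starting points):

```sql
-- Configuration: how big is the InnoDB buffer pool?
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';

-- Current state: thread activity and total statements executed.
SHOW GLOBAL STATUS LIKE 'Threads%';
SHOW GLOBAL STATUS LIKE 'Questions';
```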
Don't forget that you can do a lot of monitoring on the application side, by turning on database logs for instance. Barring that, you can enable the "slow query" log in MySQL to check which queries are having the most impact. These can then be diagnosed with EXPLAIN.
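A minimal sketch (the log path and the example query are made up):

```sql
-- Enable the slow query log at runtime and capture anything over 1 second.
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;
SET GLOBAL slow_query_log_file = '/var/log/mysql/mysql-slow.log';

-- Then diagnose a captured query's plan (hypothetical table).
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;
```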
Download the MySQL Enterprise tools. They will allow you to monitor load on the server as well as the performance of individual queries.
You can use the open-source Percona Toolkit, which offers useful tools that can help you efficiently archive rows, find duplicate indexes, summarize MySQL servers, analyze queries from logs and tcpdump, and collect vital system information when problems occur.
You can also try experimenting with the performance_schema tables, available in MySQL v5.6 onwards, which can give detailed query and database statistics.
http://www.markleith.co.uk/2012/07/04/mysql-performance-schema-statement-digests/
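For example (a minimal sketch; the wait timers in these tables are in picoseconds):

```sql
-- Top 10 statement digests by total execution time (MySQL 5.6+).
SELECT digest_text,
       count_star              AS executions,
       sum_timer_wait / 1e12   AS total_secs,
       avg_timer_wait / 1e12   AS avg_secs,
       min_timer_wait / 1e12   AS min_secs,
       max_timer_wait / 1e12   AS max_secs
FROM performance_schema.events_statements_summary_by_digest
ORDER BY sum_timer_wait DESC
LIMIT 10;
```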
The question is about best practice.
How do you perform a reliable SQL query test?
That is, the question is about optimization of the DB structure and of the SQL query itself, not about system and DB performance, buffers, and caches.
When you have a complicated query with a lot of joins etc., one day you need to understand how to optimize it, and you come to the EXPLAIN command (mysql::explain, postgresql::explain) to study the execution plan.
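For example (a sketch with made-up tables):

```sql
-- MySQL: show the chosen plan without running the query.
EXPLAIN
SELECT o.id
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE c.country = 'DE';

-- PostgreSQL: EXPLAIN ANALYZE also runs the query and reports actual
-- row counts and timings alongside the planner's estimates.
EXPLAIN ANALYZE
SELECT o.id
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE c.country = 'DE';
```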
After tuning the DB structure, you execute the query to see any performance change, but here you are at the mercy of multiple levels of optimization, buffering, and caching. How do you avoid this? I need the pure time for the query execution, and to be sure it is not affected.
If you know of different practices for different servers, please specify explicitly: MySQL, PostgreSQL, MSSQL, etc.
Thank you.
For Microsoft SQL Server you can use DBCC FREEPROCCACHE (to drop compiled query plans) and DBCC DROPCLEANBUFFERS (to purge the data cache) to ensure that you are starting from a completely uncached state. Then you can profile both uncached and cached performance, and determine your performance accurately in both cases.
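For example (a sketch for a test server only; never run this against production):

```sql
-- Flush dirty pages so DROPCLEANBUFFERS can empty the whole buffer pool,
-- then clear the data cache and the plan cache for a cold-start measurement.
CHECKPOINT;
DBCC DROPCLEANBUFFERS;
DBCC FREEPROCCACHE;

-- Report compile/execution times and logical/physical reads for the test query.
SET STATISTICS TIME ON;
SET STATISTICS IO ON;
SELECT * FROM dbo.Orders WHERE CustomerID = 42;  -- hypothetical query
```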
Even so, a lot of the time you'll get different results at different times depending on how complex your query is and what else is happening on the server. It's usually wise to test performance multiple times in different operating scenarios to be sure you understand what the full performance profile of the query is.
I'm sure many of these general principles apply to other database platforms as well.
In the PostgreSQL world you need to flush the database cache as well as the OS cache as PostgreSQL leverages the OS caching system.
See this link for some discussions.
http://archives.postgresql.org/pgsql-performance/2010-08/msg00295.php
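In practice that usually means something like this (a Linux sketch; the service name and the need for root access are assumptions that vary by distribution):

```
# Stop PostgreSQL so shared_buffers is released, drop the OS page cache,
# then start it again for a truly cold run.
sudo systemctl stop postgresql
sync                                         # flush dirty pages to disk first
echo 3 | sudo tee /proc/sys/vm/drop_caches   # drop page cache, dentries, inodes
sudo systemctl start postgresql
```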
Why do you need the pure execution time? It depends on so many factors that it is almost meaningless on a live server. I would recommend collecting some statistics from the live server and analyzing query execution times using the pgFouine tool (it's for PostgreSQL), then making decisions based on that. The report will show you exactly what you need to tune and how effective your changes were.
I'm struggling with MySQL index optimization for some queries that should be simple but are taking forever. Rather than post the specific problem, I wanted to ask if there is an automated way of dealing with these.
I searched around but couldn't find anything. Surely, if query/index optimization is just following a set of steps, then someone must have written an app to automate it for a given query... or am I not appreciating the complexities involved?
Well, I can offer a SQL indexing tutorial. Let us know if you succeed with automation ;)
Not so sure about MySQL, but there are tools for Oracle and SQL Server. They cover the trivial cases, but they tend to give a false sense of safety regarding non-trivial cases. Nor do they consider the overall workload very well; they are usually limited to suggesting indexes for particular statements.
If it were that simple, you'd have an automated index builder within MySQL.
Actually, there is a query optimizer built into MySQL, and it transparently rewrites your queries into what it considers the most optimal form before executing them. It doesn't always work all that well, though, and it has its own quirks. Knowing these helps avoid some common pitfalls (like using dependent subqueries).
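For example (made-up tables): before MySQL 5.6's semijoin optimizations, an IN subquery like the first form below was often re-executed as a dependent subquery for every outer row, while the equivalent JOIN was not:

```sql
-- Subquery form: older MySQL versions re-evaluate this per row of orders.
SELECT *
FROM orders o
WHERE o.customer_id IN (SELECT c.id FROM customers c WHERE c.country = 'DE');

-- Equivalent JOIN, which the optimizer handles more predictably.
SELECT o.*
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE c.country = 'DE';
```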
There are tools that, given the query log, can show you which indexes are not used; and by enabling logging of queries that do not use an index, you can see which ones need an index. The problem is that indexes are expensive and you cannot just index everything, and which indexes you need depends on your queries.
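MySQL can do the second part for you (a minimal sketch):

```sql
-- Send queries that don't use an index to the slow query log.
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL log_queries_not_using_indexes = 'ON';

-- Optional: ignore full scans of tiny tables to cut down the noise.
SET GLOBAL min_examined_row_limit = 1000;
```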
Why don't databases automatically index tables based on query frequency? Do any tools exist to analyze a database and the queries it is receiving, and automatically create indexes, or at least suggest which indexes to create?
I'm specifically interested in MySQL, but I'd be curious for other databases as well.
That is one of the best questions I have seen on Stack Overflow. Unfortunately I don't have an answer. Google's BigTable does automatically index the right columns, but BigTable doesn't allow arbitrary joins, so the problem space is much smaller.
The only answer I can give is this:
One day someone asked, "Why can't the computer just analyze my code and compile & statically type the pieces of code that run most often?"
People are solving this problem today (e.g. Tamarin in FF3.1), and I think "auto-indexing" relational databases is the same class of problem, but it isn't as much a priority. A decade from now, manually adding indexes to a database will be considered a waste of time. For now, we are stuck with monitoring slow queries and running optimizers.
There are database optimizers that can be enabled or attached to databases to suggest (and in some cases perform) indexes that might help things out.
However, it's not actually a trivial problem, and when these aids first came out users sometimes found it actually slowed their databases down due to inferior optimizations.
Lastly, there's a LOT of money in the industry for database architects, and they prefer the status quo.
Still, databases are becoming more intelligent. If you use SQL Server Profiler with Microsoft SQL Server, you'll find ways to speed your server up. Other databases have similar profilers, and there are third-party utilities to do this work.
But if you're the one writing the queries, hopefully you know enough about what you're doing to index the right fields. If not then having the right indexes is likely the least of your problems...
-Adam
MS SQL 2005 also maintains an internal reference of suggested indexes to create based on usage data. It's not as complete or accurate as the Tuning Advisor, but it is automatic. Research dm_db_missing_index_groups for more information.
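For example (a minimal sketch; note that these DMVs are cleared when the server restarts):

```sql
-- Suggested indexes from the missing-index DMVs, ranked by a rough
-- impact estimate (seek count times estimated improvement).
SELECT d.statement AS table_name,
       d.equality_columns,
       d.inequality_columns,
       d.included_columns,
       s.user_seeks,
       s.avg_user_impact
FROM sys.dm_db_missing_index_details d
JOIN sys.dm_db_missing_index_groups g
  ON d.index_handle = g.index_handle
JOIN sys.dm_db_missing_index_group_stats s
  ON g.index_group_handle = s.group_handle
ORDER BY s.user_seeks * s.avg_user_impact DESC;
```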
There is a script, I think on an MS SQL blog, for suggesting indexes in SQL 2005, but I can't find the exact script right now! It's just the thing, going by the description, as I recall. Here's a link to some more info: http://blogs.msdn.com/bartd/archive/2007/07/19/are-you-using-sql-s-missing-index-dmvs.aspx
P.S. This is just for SQL Server 2005+.
There are tools out there for this.
For MS SQL, use the SQL Profiler (to record activity against the database), and the Database Engine Tuning Advisor (SQL 2005) or the Index Tuning Wizard (SQL 2000) to analyze the activities and recommend indexes or other improvements.
Yes, some engines DO support automatic indexing. One such example for MySQL is Infobright: their engine does not support "conventional" indexes and instead implicitly indexes everything; it is a column-based storage engine.
The behaviour of such engines tends to be very different from what developers expect. (And yes, you need to be a DEVELOPER to even be thinking about using Infobright; it is not a plug-in replacement for a standard engine.)
I agree with what Adam Davis says in his comment. I'll add that if such a mechanism existed to create indexes automatically, the most common reaction to this feature would be, "That's nice... How do I turn it off?"
Part of the reason may be that indexes don't just give a small speedup. If you don't have a suitable index on a large table, queries can run so slowly that the application is entirely unusable, and if it is interacting with other software it may simply not work. So you really need the indexes to be right before you start trying to use the application.
Also, rather than building an index in the background, and slowing things down further while it's being built, it is better to have the index defined before you start adding significant amounts of data.
I'm sure we'll get more tools that take sample queries and work out what indexes are necessary; also probably we will eventually get databases that do as you suggest and monitor performance and add indexes they think are necessary, but I don't think they will be a replacement for starting off with the right indexes.
It seems that MySQL doesn't have a user-friendly profiler. Maybe you want to try something like this: a PHP class based on the MySQL profiler.
Amazon's SimpleDB has automatic indexing on all columns based on your usage:
http://aws.amazon.com/simpledb/
It has other limitations though:
It's a key-value store, not an RDB. Obviously that means slow joins (and no built-in join support).
It has a 10 GB limit on table size. There are libraries that will handle partitioning big data for you, although this locks you into that library's way of doing things, which can have its own problems.
It stores all values as strings, even numbers, which makes sorting a column containing 1, 9, and 10 come out as 1, 10, 9 unless you use a library that hacks around this by zero-padding. This also affects negative numbers.
The 10 GB limit is bigger than many might assume, so you could proceed with this for a simple site that you plan on rewriting if it ever makes it big.
It's unfortunate this kind of automatic indexing didn't make it into DynamoDB, which appears to have replaced it; they don't even mention SimpleDB in their product list anymore, you have to find it through old links.
Google App Engine does that (see the index.yaml file).