Locating unused indexes - mysql

What method(s) does one use to locate unused indexes on an extant MySQL installation? Percona has tools, but these boxes are Amazon RDS instances, so we don't have the low-level access those tools need.
I did locate http://hackmysql.com/mysqlidxchk and I think it may be my only option at this point. I can manually comb through and look for indexes with duplicate leading keys, but that also seems counterproductive.
Are there other solutions that I am not seeing?

Yes, there is pt-index-usage, but you don't necessarily need to run it against your RDS instance.
You could collect query logs, and then run pt-index-usage against a snapshot of your database running anywhere, even on your laptop. The tool just runs EXPLAIN for all queries in the log, and then reports any indexes that exist in the database but were not used by any EXPLAIN report.
RDS supports only table-based query logs, so be careful of the overhead this causes.
You also need to export the table-based query logs before using them as input to pt-index-usage. Here's a script that can do the export: https://github.com/billkarwin/bk-tools/blob/master/export-slow-log-table
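The report described above boils down to a set difference, which can be sketched in Python. This is not pt-index-usage itself, only an illustration of the idea; the function name, the sample table and index names, and the input shape (pairs of table and index names) are hypothetical.

```python
def unused_indexes(all_indexes, explain_keys):
    """Return indexes that exist but never showed up as the chosen key.

    all_indexes:  iterable of (table, index_name) pairs present in the schema
    explain_keys: iterable of (table, index_name) pairs taken from the
                  `table` and `key` columns of EXPLAIN output; the index
                  name is None when EXPLAIN chose no index
    """
    used = {(table, key) for table, key in explain_keys if key is not None}
    return sorted(set(all_indexes) - used)


# Hypothetical schema and EXPLAIN results, for illustration only.
existing = [
    ("orders", "PRIMARY"),
    ("orders", "idx_customer"),
    ("orders", "idx_created"),
]
seen = [("orders", "PRIMARY"), ("orders", None), ("orders", "idx_customer")]

print(unused_indexes(existing, seen))
```

The result is only as good as the log: an index used by a monthly report will look unused if the log covers a single day.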
MySQL 5.6 also has a new performance_schema table, table_io_waits_summary_by_index_usage (see http://dev.mysql.com/doc/refman/5.6/en/table-waits-summary-tables.html). Once enabled, it counts I/O operations per index, so an index whose counters stay at zero is not being used. You may not be using MySQL 5.6, though, and I don't know if you can enable performance_schema options on RDS anyway.
My colleague at Percona just posted a blog entry confirming that you can enable the performance_schema on Amazon RDS, though not through the Web UI: http://www.mysqlperformanceblog.com/2013/08/21/amazon-rds-with-mysql-5-6-configuration-variables/
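A minimal sketch of how rows from table_io_waits_summary_by_index_usage might be filtered, assuming they have already been fetched as dictionaries keyed by the documented column names (OBJECT_SCHEMA, OBJECT_NAME, INDEX_NAME, COUNT_STAR). The function name and sample rows are hypothetical, and skipping PRIMARY is a choice made here, not something the table does for you.

```python
def never_used(rows):
    """Pick out secondary indexes with no recorded I/O.

    rows: dicts with the columns OBJECT_SCHEMA, OBJECT_NAME, INDEX_NAME
          and COUNT_STAR from table_io_waits_summary_by_index_usage.
    """
    result = []
    for row in rows:
        index = row["INDEX_NAME"]
        # INDEX_NAME is NULL for accesses that used no index at all.
        if index is None or index == "PRIMARY":
            continue
        if int(row["COUNT_STAR"]) == 0:
            result.append((row["OBJECT_SCHEMA"], row["OBJECT_NAME"], index))
    return sorted(result)


# Hypothetical rows, for illustration only.
sample_rows = [
    {"OBJECT_SCHEMA": "shop", "OBJECT_NAME": "orders", "INDEX_NAME": "PRIMARY", "COUNT_STAR": 0},
    {"OBJECT_SCHEMA": "shop", "OBJECT_NAME": "orders", "INDEX_NAME": "idx_created", "COUNT_STAR": 0},
    {"OBJECT_SCHEMA": "shop", "OBJECT_NAME": "orders", "INDEX_NAME": "idx_customer", "COUNT_STAR": 8123},
    {"OBJECT_SCHEMA": "shop", "OBJECT_NAME": "orders", "INDEX_NAME": None, "COUNT_STAR": 40},
]

print(never_used(sample_rows))
```

The counters live in memory and start from zero after a server restart, so let the server run through a representative workload before trusting a zero.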

Running EXPLAIN on your queries is helpful here. Try it out; if it doesn't help, leave a comment saying what is still missing and I'll look into it.

Related

MySQL changing large table to InnoDB

I have a MySQL server running on CentOS which houses a large (>12GB) DB. I have been advised to move to InnoDB for performance reasons as we are experiencing lockups where the application that relies on the DB becomes unresponsive when the server is busy.
I have been reading around and can see that the ALTER command that changes the table to InnoDB is likely to take a long time and hammer the server in the process. As far as I can see, the only change required is to use the following command:
ALTER TABLE t ENGINE=InnoDB
I have run this on a test server and it seems to complete fine, taking about 26 minutes on the largest of the tables that needs to be converted.
Having never run this on a production system I am interested to know the following:
What changes are recommended to be made to the MySQL config to take advantage of additional performance of InnoDB tables? The server currently has 3GB assigned to InnoDB cache - was thinking of increasing this to 15GB once the additional RAM is installed.
Is there anything else I should do to the server with this change?
I would really recommend using either Percona MySQL or MariaDB. Both have tools that will help you get the most out of InnoDB, as well as some tools to help you diagnose and optimize your database further (for example, Percona's Online Schema Change tool could be used to alter your tables without downtime).
As far as optimization of InnoDB, I think most would agree that innodb_buffer_pool_size is one of the most important parameters to tune (and typically people set it around 70-80% of total available memory, but that's not a magic number). It's not the only important config variable, though, and there's really no magic run_really_fast setting. You should also pay attention to innodb_buffer_pool_instances (and there's a good discussion about this topic on https://dba.stackexchange.com/questions/194/how-do-you-tune-mysql-for-a-heavy-innodb-workload)
Also, you should definitely check out the tips offered in the MySQL documentation itself (http://dev.mysql.com/doc/refman/5.6/en/optimizing-innodb.html). It's also a good idea to pay attention to your InnoDB hit ratio (Rolando over at DBA Stack Exchange has a great answer on this topic, e.g., https://dba.stackexchange.com/questions/65341/innodb-buffer-pool-hit-rate) and analyze your slow query logs carefully. Toward that latter end, I would definitely recommend taking a look at Percona again. Their slow query analyzer is top notch and can really give you a leg up when it comes to optimizing SQL performance.
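As a rough illustration of the hit ratio, here is a sketch in Python using one common definition: the share of logical read requests that did not have to go to disk. It assumes the two counters have been read from SHOW GLOBAL STATUS into a dictionary; the function name and sample values are hypothetical, and the linked answer may define the ratio differently.

```python
def buffer_pool_hit_ratio(status):
    """Percentage of InnoDB read requests served from the buffer pool.

    status: mapping of SHOW GLOBAL STATUS names to values. Uses
            Innodb_buffer_pool_read_requests (logical reads) and
            Innodb_buffer_pool_reads (reads that had to hit disk).
    """
    requests = int(status["Innodb_buffer_pool_read_requests"])
    disk_reads = int(status["Innodb_buffer_pool_reads"])
    if requests == 0:
        return None  # nothing has been read yet, so no ratio to report
    return 100.0 * (1 - disk_reads / requests)


# Hypothetical counter values, for illustration only.
sample = {
    "Innodb_buffer_pool_read_requests": "1000000",
    "Innodb_buffer_pool_reads": "2500",
}

print(round(buffer_pool_hit_ratio(sample), 2))
```

Both counters are cumulative since server start, so a freshly restarted server will show a misleadingly low ratio while the buffer pool warms up.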

How to calculate or see the performance of my database in mysql?

I want to check the performance of my database in MySQL. I googled and came across commands such as SHOW FULL PROCESSLIST, but they are not very clear to me. I just want to measure the performance of the database, for example how much heap memory it is using.
Is there any way to assess the performance of the database so that I can optimize and improve it?
Thanks in advance
The basic tool is MySQL Workbench which will work with any recent version of MySQL. It's not as powerful as the enterprise version, but is a great place to start.
The configuration can be exposed with SHOW VARIABLES and the current state of the system is exposed through SHOW STATUS. These status numbers are what end up being graphed in most tools.
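Most SHOW STATUS values are cumulative counters, so graphing tools typically plot the change between two samples. A minimal sketch of that calculation in Python, assuming two snapshots have been read into dictionaries; the function name, the default counter selection and the sample values are hypothetical.

```python
def per_second_rates(before, after, interval_seconds,
                     counters=("Questions", "Com_select", "Slow_queries")):
    """Turn two SHOW GLOBAL STATUS snapshots into per-second rates.

    before, after:    mappings of status variable name to value
    interval_seconds: time elapsed between the two snapshots
    """
    rates = {}
    for name in counters:
        delta = int(after[name]) - int(before[name])
        rates[name] = delta / interval_seconds
    return rates


# Hypothetical snapshots taken 60 seconds apart.
first = {"Questions": "1000", "Com_select": "600", "Slow_queries": "2"}
second = {"Questions": "1600", "Com_select": "900", "Slow_queries": "5"}

print(per_second_rates(first, second, 60))
```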
Don't forget that you can do a lot of monitoring on the application side, turning on database logs for instance. Barring that you can enable the "slow query" log in MySQL to check which queries are having the most impact. These can then be diagnosed with EXPLAIN.
Download the MySQL Enterprise tools. They will allow you to monitor load on the server as well as the performance of individual queries.
You can use Percona Toolkit, a set of open source tools from Percona. It can help you efficiently archive rows, find duplicate indexes, summarize MySQL servers, analyze queries from logs and tcpdump, and collect vital system information when problems occur.
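To make "duplicate indexes" concrete, here is a small Python sketch that flags an index whose columns are a leading prefix of another index on the same table. It is an illustration of the concept, not how Percona Toolkit does it; the function name and the sample index definitions are hypothetical.

```python
def redundant_prefix_indexes(indexes):
    """Find indexes made redundant by a longer index on the same table.

    indexes: mapping of index name to its column names in index order.
    Returns (shorter, longer) pairs where the shorter index's columns
    are a leading prefix of the longer one's.
    """
    pairs = []
    for short_name, short_cols in indexes.items():
        for long_name, long_cols in indexes.items():
            if short_name == long_name:
                continue
            if long_cols[:len(short_cols)] != short_cols:
                continue
            # Identical column lists match in both directions; report once.
            if short_cols == long_cols and short_name > long_name:
                continue
            pairs.append((short_name, long_name))
    return sorted(pairs)


# Hypothetical indexes on a single table.
table_indexes = {
    "idx_a": ["a"],
    "idx_a_b": ["a", "b"],
    "idx_c": ["c"],
}

print(redundant_prefix_indexes(table_indexes))
```

A flagged index is a candidate for review, not for automatic removal: a UNIQUE index still enforces a constraint even when a longer index covers the same leading columns.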
You can try experimenting with the Performance_Schema tables available in MySQL 5.6 onwards, which give detailed query and database statistics.
http://www.markleith.co.uk/2012/07/04/mysql-performance-schema-statement-digests/

Locking DB w/ Large Reads (Ruby-on-Rails/Heroku)

Currently I have a Web API running on Heroku that is constantly writing information we're collecting from other data sources (currently there's about half a GB of data and it's growing very quickly). We're looking to add a reporting system on top of the current database that we can use to extract useful information out of the DB. The problem is that when we're running reports we're locking the DB and any other sites communicating with the DB are timing out. Does anyone have any solutions on how to solve this type of issue? Amazon RDS seems to have some interesting stuff with database replication but I don't know if that will solve my problems.
Any advice would be greatly appreciated.
Thanks
Be sure you are running InnoDB tables and not the old ISAM or MyISAM tables. InnoDB has row-level locks, which are much more scalable.
Make sure that you have indexes defined on all your joining/foreign keys... if you do joins without indexes it will grind. Also make sure you have indexes where appropriate for data that you search or sort on (as long as it is diverse data, not boolean or a small number of values).
Replication is another good idea, as you could target the reports at the secondary server in read-only mode, and it will just catch up once it unlocks. Half a GB of data should not really be locking it up yet, so I'd look at the indexes and InnoDB first.
One solution to this is to have a replica of the database, so that your normal traffic goes to the master database, while long-running queries execute on the slave. I'm not sure how much control you get over the database on Heroku though, they may not support replication.
However, have you considered that the Heroku setup may be the problem here? A 500 MB database shouldn't really have performance issues unless you're performing really complex queries.
If you're happy using MySQL instead of Postgres, Engine Yard supports database replication (although generally it may not be as easy to use as Heroku).

Help! Why did MySQL just screech to a halt?

Our relatively high traffic website just screeched to a halt, and we're totally stumped. We run on Django and MySQL (InnoDB), and we're trying to figure out why it's all of a sudden totally slow.
Here's what we know so far:
On our mysql server, a simple query (from django shell) runs fast.
On our app server, a simple query (from django shell) runs very slow.
Without having any details on the query or on the tables involved in the query, it is quite difficult to answer this question.
Most likely it is because of a lot of data in the table and a missing index on the field you are querying.
This would explain why it is slow on the production box, but fast on the dev box (since there's less data).
To answer the question better, could you provide us with more details? Table structure, query, number of rows in the table, etc. ?
More assumptions: Disk I/O on the app server could be a problem, or maybe the log files in MySQL are not properly configured (especially with InnoDB this could lead to a problem). Maybe there's a load-heavy query running too often? Table locks when multiple users write to/read from the same tables?
As I said, without having more details, it is quite difficult to guess. But I hope, at least I could point you in the right direction.
Run EXPLAIN on the SELECT.
Study this page carefully:
http://dev.mysql.com/doc/refman/5.0/en/using-explain.html
Understanding the concepts on that page is key to properly indexing your tables.
Thanks for the responses everyone.
Turns out it was a DNS issue (which was a regression). MySQL is really stupid in that the default is to use DNS lookups. They got really slow, which killed all the network flow between the app server and the db server. It was as simple as adding "skip-name-resolve" to our my.cnf.
Are the 'mysql server' and 'app server' on the same box and talking to the same DB instance?
Your question suggests not, so I'd look for a problem on the network - start by pinging the database server from each box and compare the results.
Once you've done that you'll need to be a little more specific about the problem - were the ping times the same, are you running the same query, etc...

How do I oversee my MySQL replication server?

I've had a tough time setting up my replication server. Is there any program (OS X, Windows, Linux, or PHP no problem) that lets me monitor and resolve replication issues? (btw, for those following, I've been on this issue here, here, here and here)
My production database is several megs in size and growing. Every time the database replication stops and the databases inevitably begin to slide out of sync I cringe. My last resync from dump took almost 4 hours roundtrip!
As always, even after sync, I run into this kind of show-stopping error:
Error 'Duplicate entry '252440' for key 1' on query.
I would love it if there was some way to closely monitor what's going on and perhaps let the software deal with it. I'm even all ears for service companies which may help me monitor my data better. Or an alternate way to mirror altogether.
Edit: going through my previous questions I found this which helps tremendously. I'm still all ears on the monitoring solution.
To monitor the servers we use the free tools from Maatkit ... simple, yet efficient.
The binary replication is available in 5.1, so I guess you've got some balls. We still use 5.0 and it works OK, but of course we had our share of issues with it.
We use Master-Master replication with MySQL Proxy as a load-balancer in front, and to prevent it from having errors:
we removed all unique indexes
for the few cases where we really needed unique constraints we made sure we used REPLACE instead of INSERT (MySQL Proxy can be used to guard for proper usage ... it can even rewrite your queries)
scheduled scripts doing intensive reports are always accessing the same server (not the load-balancer) ... so that dangerous operations are replicated safely
Yeah, I know it sounds simple and stupid, but it solved 95% of all the problems we had.
We use mysql replication to replicate data to close to 30 servers. We monitor them with nagios. You can probably check the replication status and use an event handler to restart it with 'SET GLOBAL SQL_SLAVE_SKIP_COUNTER=1; Start Slave;'. That will fix the error, but you'll lose the insert that caused the error.
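A minimal sketch of the kind of check a monitoring script might run, assuming one row of SHOW SLAVE STATUS has already been fetched into a dictionary keyed by column name. The function name, the 60 second lag threshold and the sample rows are hypothetical; wiring the result into nagios is left out.

```python
def replication_problems(slave_status, max_lag_seconds=60):
    """List the reasons a replica looks unhealthy; empty list means OK.

    slave_status: one SHOW SLAVE STATUS row as a dict, using the columns
                  Slave_IO_Running, Slave_SQL_Running, Last_Error and
                  Seconds_Behind_Master.
    """
    problems = []
    if slave_status.get("Slave_IO_Running") != "Yes":
        problems.append("IO thread not running")
    if slave_status.get("Slave_SQL_Running") != "Yes":
        error = slave_status.get("Last_Error") or "no error text"
        problems.append("SQL thread not running: " + error)
    lag = slave_status.get("Seconds_Behind_Master")
    if lag is None:
        # The server reports NULL here when it cannot measure the lag.
        problems.append("lag unknown")
    elif int(lag) > max_lag_seconds:
        problems.append("lagging by %d seconds" % int(lag))
    return problems


# Hypothetical status rows, for illustration only.
healthy = {"Slave_IO_Running": "Yes", "Slave_SQL_Running": "Yes",
           "Last_Error": "", "Seconds_Behind_Master": 3}
broken = {"Slave_IO_Running": "Yes", "Slave_SQL_Running": "No",
          "Last_Error": "Duplicate entry '252440' for key 1",
          "Seconds_Behind_Master": None}

print(replication_problems(healthy))
print(replication_problems(broken))
```

Reporting the problem is the safe part. Skipping the failing statement automatically, as described above, trades an alert for silent data drift, so it is worth logging every skip.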
About the error, do you use memory tables on your slaves? I ask this because the only time we ever got a lot of these errors, they were caused by a bug in the latest releases of MySQL. 'Delete From Table Where Field = Value' will delete only one row in memory tables even though there were multiple matching rows.
MySQL bug description