I recently consolidated several databases onto a single, much more powerful server. They each have several dozen tables, and the larger ones hold 2-6 million rows apiece. I noticed that some queries that used to run in around 15 ms were now taking a full 10 seconds to finish.
I ran mysqlcheck -c on the databases, which reported that every table was okay. I then tried to optimize the tables anyway. That did not work. What did work was manually deleting every single index and recreating it.
I'm a novice when it comes to database administration. Why doesn't OPTIMIZE fix broken indexes? Is there a better way to do this than manually dropping and recreating a little over 1,000 indexes?
Thanks for your help and replies.
In MySQL, indexes are balanced trees (B-trees). The engine keeps them updated and balanced as data changes, so you should not need to drop and recreate them; rebuilding them all will probably not change performance.
Did you analyze the queries that are now slower? Try to optimize those queries and the indexes they rely on. Use EXPLAIN to see the query execution plan.
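For example, a minimal sketch (the table and column names here are hypothetical placeholders):

    -- refresh the statistics the optimizer uses to choose indexes
    ANALYZE TABLE orders;

    -- check which index the optimizer picks ("key" column) and its row estimate ("rows" column)
    EXPLAIN SELECT * FROM orders WHERE customer_id = 42;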
I have two identical databases on separate servers. If I run the same query on both machines, it executes quickly on one but ends up in the slow query log on the other. EXPLAIN shows that the two servers are not using the same indexes. Any suggestions or advice would be helpful.
The index statistics that MySQL keeps sometimes become inaccurate (I don't know exactly why or when).
Running ANALYZE TABLE <table> on both servers should correct the statistics.
If the problem appears again, you can use index hints (and/or conditional logic in your application) to force MySQL to use the correct index.
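For example, a sketch with hypothetical table, column, and index names:

    -- run on both servers so their statistics match
    ANALYZE TABLE orders;

    -- if the optimizer still picks the wrong index, pin it explicitly with a hint
    SELECT *
    FROM orders FORCE INDEX (idx_customer_created)
    WHERE customer_id = 42
    ORDER BY created_at;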
I am planning to use Sphinx with MySQL for my current project.
I'll use MyISAM as the storage engine, since this database will be read-only with 10-25 million records.
So I would like to know:
Does using UNION or JOINs in a query cause performance issues in Sphinx?
I am about to design the database, and if UNIONs/JOINs are going to hurt performance, I can go for a design optimized for Sphinx instead.
For example, creating one big table with all the fields and data, and then creating separate indexes in Sphinx depending on the data to be searched.
Please point me in the right direction.
Thanks for your time.
Sphinx can't do joins anyway. It can do the equivalent of unions, by simply searching multiple indexes at once.
Or do you mean the queries used to build the Sphinx index (i.e. in sql_query)? indexer only runs those queries to build the indexes in the first place.
As you say the data is read-only and hence there are no updates, the indexes should never need rebuilding, so it doesn't really matter how slow those build queries are.
In general a Sphinx index will perform very similarly regardless of how many fields it has, so you shouldn't need to split it into different indexes. Just have one multi-purpose index (if that's possible).
You can, however, shard the index into pieces, so it can be distributed across multiple servers if performance becomes an issue.
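To illustrate the sql_query point above: the join cost is paid once at index-build time, not at search time. A rough sketch of a flattening source query you might put in sphinx.conf's sql_query (the schema names are made up):

    -- flatten the relational data into one row per document for indexer
    SELECT p.id, p.title, p.body, c.name AS category_name
    FROM products p
    JOIN categories c ON c.id = p.category_id;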
I found a lot of information on how indexes work in MySQL from the following SO question: How do MySQL indexes work? However, I am facing a MySQL issue I cannot resolve, and I'm unsure whether it is related to indexing or not.
The problem is: I use multiple indexes in most of my tables, and everything seems to be working fine. However, when I restore an old backup over my existing data, the size of the database keeps getting larger (it almost doubles each time).
Example: I was using a mysql db named DB1 last week, I made a backup and continued to use DB1. A few days later, I needed to continue from that backup db, so I restored it to DB1.
Before the restore, DB1's size was 115MB, but afterward it was suddenly 350MB.
Can anyone help shed some light on what might be happening?
This is not surprising. If you have lots of indexes, it's not unusual for them to take up as much space as the data itself.
When you are talking about 115MB vs. 350MB though, I'd guess the increase in query speed you get is probably worth that extra couple hundred megs of disk space. If not, then you might want to take a closer look at your indexes and make sure they are all actually providing some benefit.
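One way to see where the space is going is to compare data size with index size per table from information_schema; this is just a sketch (replace 'DB1' with your schema name):

    -- data vs. index size per table, largest indexes first
    SELECT table_name,
           ROUND(data_length  / 1024 / 1024, 1) AS data_mb,
           ROUND(index_length / 1024 / 1024, 1) AS index_mb
    FROM information_schema.tables
    WHERE table_schema = 'DB1'
    ORDER BY index_length DESC;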
I have a system that a client designed, and the table was originally not supposed to grow larger than 10 gigs (maybe 10 million rows) over a few years. Well, they've imported a lot more information than they anticipated, and within a month the table is up to 208 gigs (900 million rows).
I have very little experience with MySQL and a lot more experience with Microsoft SQL. Is there anything in MySQL that would allow the client to have the database span multiple files so the queries that are run wouldn't have to use the entire table and index? There is a field on the table that could easily be split on, but I wasn't sure how to do this.
The main issue I'm trying to solve is a retrieval query against this table. Inserts aren't a big deal at all, since they're all done by a back-end service. I have a test system where the table is about 2 gigs (6 million rows), and my query takes less than a second there. When the same query runs on the production system, it takes 20 seconds. I have a feeling the query itself is fine; it's just the size of the table that's causing the issue. There is an index on the table created specifically for this query, and EXPLAIN confirms it is being used.
If you have any other suggestions/questions, please feel free to ask.
Use partitioning, and in particular the CREATE TABLE options that set DATA DIRECTORY and INDEX DIRECTORY.
With these options you can put partitions on separate drives if needed. Usually, though, it's enough to partition on a key that appears in every query, typically a date/time column.
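A minimal sketch of what that might look like (the table and column names are hypothetical, and per-partition DATA DIRECTORY / INDEX DIRECTORY support depends on the storage engine and MySQL version, so check the manual for yours):

    CREATE TABLE events (
        id         BIGINT NOT NULL,
        created_at DATE   NOT NULL,
        payload    TEXT,
        KEY idx_created (created_at)   -- index backing the retrieval query
    )
    PARTITION BY RANGE (YEAR(created_at)) (
        PARTITION p2009 VALUES LESS THAN (2010)
            DATA DIRECTORY = '/disk1/mysql' INDEX DIRECTORY = '/disk1/mysql',
        PARTITION p2010 VALUES LESS THAN (2011)
            DATA DIRECTORY = '/disk2/mysql' INDEX DIRECTORY = '/disk2/mysql',
        PARTITION pmax VALUES LESS THAN MAXVALUE
    );

Queries that filter on the partitioning column then only touch the relevant partitions instead of scanning the whole 208-gig table and its index.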
In addition to partitioning, which has already been mentioned, you might also want to run the tuning-primer script to make sure your MySQL configuration is optimal.
I work on a big web application that uses a MySQL 5.0 database with InnoDB tables. Twice over the last couple of months, we have experienced the following scenario:
The database server runs fine for weeks, with low load and few slow queries.
A frequently-executed query that previously ran quickly will suddenly start running very slowly.
Database load spikes and the site hangs.
The solution in both cases was to find the slow query in the slow query log and create a new index on the table to speed it up. After applying the index, database performance returned to normal.
What's most frustrating is that, in both cases, we had no warning about the impending doom; all of our monitoring systems (e.g., graphs of system load, CPU usage, query execution rates, slow queries) told us that the database server was in good health.
Question #1: How can we predict these kinds of tipping points or avoid them altogether?
One thing we are not doing with any regularity is running OPTIMIZE TABLE or ANALYZE TABLE. We've had a hard time finding a good rule of thumb about how often (if ever) to manually do these things. (Since these commands LOCK tables, we don't want to run them indiscriminately.) Do these scenarios sound like the result of unoptimized tables?
Question #2: Should we be manually running OPTIMIZE or ANALYZE? If so, how often?
More details about the app: database usage pattern is approximately 95% reads, 5% writes; database executes around 300 queries/second; the table used in the slow queries was the same in both cases, and has hundreds of thousands of records.
The MySQL Performance Blog is a fantastic resource. In particular, this post covers the basics of properly tuning InnoDB-specific parameters.
I've also found the PDF version of the MySQL Reference Manual to be essential. Chapter 7 covers general optimization, and section 7.5 covers server-specific optimizations you can toy with.
From the sound of your server, the query cache may be of IMMENSE value to you.
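A quick way to check whether the query cache is enabled and actually earning its keep (this applies to MySQL 5.x; the query cache was removed in 8.0):

    SHOW VARIABLES LIKE 'query_cache%';   -- is it on, and how much memory does it have?
    SHOW STATUS LIKE 'Qcache%';           -- hit/insert/lowmem-prune counters show whether it is helping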
The reference manual also gives you some great detail concerning slow queries, caches, query optimization, and even disk seek analysis with indexes.
It may be worth your time to look into multi-master replication, allowing you to lock one server entirely and run OPTIMIZE/ANALYZE, without taking a performance hit (as 95% of your queries are reads, the other server could manage the writes just fine).
Section 12.5.2.5 covers OPTIMIZE TABLE in detail, and 12.5.2.1 covers ANALYZE TABLE in detail.
Update for your edits/emphasis:
Question #2 is easy to answer. From the reference manual:
OPTIMIZE:
OPTIMIZE TABLE should be used if you have deleted a large part of a table or if you have made many changes to a table with variable-length rows. [...] You can use OPTIMIZE TABLE to reclaim the unused space and to defragment the data table.
And ANALYZE:
ANALYZE TABLE analyzes and stores the key distribution for a table. [...] MySQL uses the stored key distribution to decide the order in which tables should be joined when you perform a join on something other than a constant. In addition, key distributions can be used when deciding which indexes to use for a specific table within a query.
OPTIMIZE is good to run when you have the free time. MySQL optimizes well around deleted rows, but if you go and delete 20GB of data from a table, it may be a good idea to run this. It is definitely not required for good performance in most cases.
ANALYZE is much more critical. As noted, having up-to-date key distribution statistics available to MySQL (which ANALYZE provides) is very important for pretty much any query. It is something that should be run on a regular basis.
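For example (hypothetical table name; Data_free in SHOW TABLE STATUS gives a rough idea of fragmentation):

    SHOW TABLE STATUS LIKE 'orders';   -- a large Data_free suggests OPTIMIZE may reclaim space
    ANALYZE TABLE orders;              -- quick; safe to run routinely
    OPTIMIZE TABLE orders;             -- rebuilds the table; schedule a quiet window, since it locks the table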
Question #1 is a bit trickier. I would watch the server very carefully when this happens, particularly disk I/O. My bet would be that your server is thrashing either swap or the (InnoDB) caches. In either case it may be query-, tuning-, or load-related. Unoptimized tables could cause this. As mentioned, running ANALYZE can help performance immensely, and it will likely help here too.
I haven't found any good way of predicting MySQL "tipping points" -- and I've run into a few.
Having said that, I've found tipping points are related to table size. But not merely raw table size; rather, how big the "area of interest" is to a query. For example, in a table of over 3 million rows and about 40 columns (roughly three-quarters of them integers), most queries that select a small portion of the rows via indexes are fast. However, when one value in a query on one indexed column means two-thirds of the rows are now "interesting", the query becomes about five times slower than normal. Lesson: try to arrange your data so that such a scan isn't necessary.
However, such behaviour now gives you a size to look out for. This size will be heavily dependent on your server setup, the MySQL server variables, and the table's schema and data.
Similarly, I've seen reporting queries run in reasonable time (~45 seconds) if the period is two weeks, but take half-an-hour if the period is extended to four weeks.
Use the slow query log; it will help you narrow down the queries you want to optimize.
For time-critical queries, it is sometimes better to keep a stable plan by using index hints.
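A sketch of both, with hypothetical names; note that on MySQL 5.0 the slow log is configured via the log-slow-queries and long_query_time options in my.cnf rather than at runtime:

    -- MySQL 5.1+: enable the slow query log at runtime
    SET GLOBAL slow_query_log = 1;
    SET GLOBAL long_query_time = 1;    -- log anything slower than one second

    -- pin the plan for a time-critical query with an index hint
    SELECT * FROM orders USE INDEX (idx_created)
    WHERE created_at >= '2010-01-01';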
It sounds like you have a frustrating situation and maybe not the best code review process and development environment.
Whenever you add a new query to your code you need to check that it has the appropriate indexes ready and add those with the code release.
If you don't do that, your second option is to constantly monitor the slow query log and then go beat the developers; I mean, go add the index.
There's also an option to enable logging of queries that don't use an index, which would be useful to you.
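That option is log_queries_not_using_indexes; a sketch (on 5.0 it is a startup option in my.cnf, while newer versions let you set it at runtime):

    -- route queries that do full scans into the slow query log
    SET GLOBAL log_queries_not_using_indexes = 1;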
If there are queries that "work and then stop working" (but are "using an index"), then it's likely the query wasn't very good in the first place (low cardinality in the index, an inefficient join, ...), and the first rule of evaluating the query carefully when it's added would apply.
For question #2 - on InnoDB, ANALYZE TABLE is basically free to run, so if you have bad join performance it doesn't hurt to run it. Unless the distribution of keys in the table is changing a lot, though, it's unlikely to help. It almost always comes down to bad queries. OPTIMIZE TABLE rebuilds the InnoDB table; in my experience it's relatively rare that it helps enough to be worth the hassle of having the table unavailable for the duration (or doing the master-master failover dance while it's running).