One of the projects I'm working on is suffering from a recent slowdown in the DB (since last week).
Code hasn't changed, and data may have changed a little but not significantly, so at this stage I'm just exploring DB configuration (we are on a managed hosting platform, and have had some similar issues in the past).
Unfortunately I'm out of my depth a bit... could anyone please take a look at the output from SHOW STATUS below and see if any of it sets alarm bells off? The only thing I've spotted so far is that Key_reads vs Key_read_requests don't seem quite right.
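For reference, this is roughly how I'm comparing those two counters on the slave (the threshold in the comment is just a rule of thumb I've read, not something I'm sure about):

    SHOW GLOBAL STATUS LIKE 'Key_read%';          -- Key_reads vs Key_read_requests
    SHOW GLOBAL VARIABLES LIKE 'key_buffer_size'; -- size of the MyISAM key cache
    -- If Key_reads is more than a few percent of Key_read_requests,
    -- the key buffer may be too small to hold the hot indexes.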
Our setup is 2 servers replicated, with all reads done from the slave. Queries which run in 0.01 secs on the master are taking up to 7 secs on the slave... and this has only started recently.
All tables are MyISAM and inserts/updates are negligible (updates happen out of hours). The front end is an ASP.NET website (.NET 4) running on IIS 8 with a Devart component for data access.
Thanks!
SHOW STATUS output is here: http://pastebin.com/w6xDeD48
Other factors can impact MySQL performance:
Virus scanning software -> I had an issue with McAfee dragging down performance because it was scanning temporary table files.
Other services running on server?
Have you tried an EXPLAIN SELECT on the query? This would give you an indication of how the indexes are being used. As #Liath indicated, the indexes may be out of date on the slave but fine on the master.
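Something along these lines, run on both the master and the slave, should show whether they choose the same index (the table and column here are just placeholders for one of your real slow queries):

    -- Placeholder query: substitute one of the SELECTs that is slow on the slave
    EXPLAIN SELECT *
    FROM   orders
    WHERE  customer_id = 42;
    -- Compare the key, rows and Extra columns between master and slave;
    -- a missing key or a much larger rows estimate on the slave suggests
    -- stale index statistics (ANALYZE TABLE orders; can refresh them).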
Just an update in case it ever helps anyone else in the future - it looks like the culprit might be the query cache, as for now we are seeing better performance with it turned off (still not quite as good as we had before the issue).
So we will try to tune it a little and get back to great performance!
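For anyone who wants to try the same thing, this is roughly what we ran (the values are just what we used, not a recommendation):

    -- See how the cache is configured and how it is behaving
    SHOW VARIABLES LIKE 'query_cache%';
    SHOW GLOBAL STATUS LIKE 'Qcache%';
    -- Turn it off at runtime (also set query_cache_type = 0 and
    -- query_cache_size = 0 in my.cnf so it survives a restart)
    SET GLOBAL query_cache_type = OFF;
    SET GLOBAL query_cache_size = 0;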
This morning, one of our Aurora clusters suddenly began experiencing high latency and slower-running queries, and was being reported as exceeding capacity - with up to 20 sessions on the db.r5.large instance, which has only 2 CPUs.
There were no code changes, no deploys, no background processes, or any other cause we can identify. The higher latency is intermittent, occurring every 10 minutes and lasting for about as long. The Aurora monitoring isn't helping much, the only change of note being higher latency on all queries (selects, updates and deletes).
Under Performance Metrics, in the cases where usage spikes, we're seeing that the 20 sessions are attributed almost solely to the io/file/myisam/kfile wait. Researching online has yielded very little, so I'm somewhat stumped as to what this means and how to go about getting to the cause of the issue. Looking at the SQL queries run during the spikes, their slow run time appears to be a symptom of the intermittent issue rather than the cause of it.
So my question is: can anyone explain what the 'myisam/kfile' Wait is, and how I can use this knowledge to help diagnose the cause of the problem here?
My feeling is that it's one of those rare occurrences where an AWS instance inexplicably goes rogue at a level below what we can directly control, and is only solved by spinning up a new instance (even where all else is equal from a configuration and code perspective). All the same, I'd love to better understand the issue here, especially when none of our DB tables are MyISAM - they are all InnoDB.
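For context, this is roughly how I've been checking on the MyISAM side; the event name is the one the monitoring reports, and I'm not certain my interpretation of the numbers is right:

    -- Confirm whether any user tables are actually MyISAM
    SELECT TABLE_SCHEMA, TABLE_NAME, ENGINE
    FROM   information_schema.TABLES
    WHERE  ENGINE = 'MyISAM'
      AND  TABLE_SCHEMA NOT IN ('mysql', 'information_schema', 'performance_schema');

    -- See how much time the kfile wait is actually consuming
    -- (SUM_TIMER_WAIT is in picoseconds, hence the division)
    SELECT EVENT_NAME, COUNT_STAR, SUM_TIMER_WAIT/1e12 AS total_wait_seconds
    FROM   performance_schema.events_waits_summary_global_by_event_name
    WHERE  EVENT_NAME = 'wait/io/file/myisam/kfile';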
Is there a table called kfile? How big is it? What operations are being performed?
While the problem is occurring, do SHOW FULL PROCESSLIST; to see what is running. That may give a good clue.
If the slow log is turned on, look at it shortly after the problem has subsided; the naughty query will probably be near the end of the list. Publish the output of pt-query-digest path_to_slowlog. The first one or two queries it reports are very likely to be the villains.
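If the slow log isn't on yet, something like this enables it at runtime (the 1-second threshold is only a suggestion):

    SET GLOBAL slow_query_log = ON;
    SET GLOBAL long_query_time = 1;              -- log anything slower than 1 second
    SHOW VARIABLES LIKE 'slow_query_log_file';   -- where the log is being written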
Check SHOW ENGINE INNODB STATUS;. Near the front will be the "latest deadlock". That may be a clue.
In most situations most of the graphs don't provide any useful information. When something does go wrong, it is not obvious which graph to look at. I would look for graphs that look "different" in sync with the problem. The one you show us perhaps indicates that there is a 20-second timeout, and everyone is stuck until they hit that timeout.
Do you run any ALTERs? When do backups occur?
Are you using MyISAM? Don't. That ENGINE only has table-level locking, hence it can lead to a bunch of queries piling up. More on converting to InnoDB: http://mysql.rjweb.org/doc.php/myisam2innodb
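The conversion itself is a one-liner per table; the table name below is just a placeholder, and it is best done in a quiet period since it rewrites the table:

    ALTER TABLE your_table ENGINE=InnoDB;   -- repeat for each MyISAM table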
I have just moved a WP installation from one hosting provider to another. Everything went fine except for a problem I have with the new installation. Please note that I have moved from a regular VPS to a kinda powerful and fast dedicated machine.
The thing is that now the website is slower than on the previous server. It takes 6-7 seconds to load a page and, according to Chrome's Dev Tools network panel, there is a period of 3-4 seconds to get the first response byte (TTFB), which is insane.
I have tried the following with no success:
Review database for anomalies
Disable all plugins (and delete them)
Disable template (and delete it)
With these last two actions, I lowered the loading time to 5-6 seconds, which is a lot for a small site (a few hundred posts and 50-60 pages) with no comments enabled. I still have the 3-4 second TTFB period.
After that, I installed the Query Monitor plugin and found out that, on every page load, WP performs hundreds of database queries (ranging from 400 to 800) and, in some cases, even 1500. OMG!
Honestly, I am quite lost here. I mean, on one hand I have this strange database behavior I cannot really understand. And on the other hand, I cannot help wondering how it was faster on the previous & slower server.
By the way, I have moved from MySQL to MariaDB, which should be even faster. Indexes are preserved when dumping & importing the file. I am lost. :(
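In case it matters, this is how I have been checking that the indexes really survived the import (wp_posts is just one of the standard WordPress tables, as an example):

    SHOW INDEX FROM wp_posts;               -- indexes present after the import
    SHOW TABLE STATUS LIKE 'wp_posts';      -- engine, row count, index length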
Any help is greatly appreciated. Apologies for my English (not my language) and please let me know if there is some important information missing. I will be glad to provide any information that helps me/us troubleshoot this.
Thanks in advance!
I think you should optimize your MySQL config (my.cnf in Linux or my.ini in Windows). To find problems in MySQL you can try running the MySQLTuner script: https://github.com/major/MySQLTuner-perl.
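Before changing anything, it is worth checking what the server is actually running with; the variables below are just the usual suspects to compare between old and new servers, not a tuned configuration:

    SHOW VARIABLES WHERE Variable_name IN
      ('innodb_buffer_pool_size', 'key_buffer_size', 'query_cache_size',
       'tmp_table_size', 'max_heap_table_size', 'table_open_cache');
    -- These are normally set in my.cnf / my.ini so they persist across restarts.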
Applications in our company were working on quite old Apache + MySQL bundle (with PHP 5.x) but they were fast in retrieving data from the database. Simple retrieval of around 100 rows with some basic calculations would take around a second or two.
We had that bundle upgraded to the newest XAMPP version and all the features it comes with. The applications were not changed (they did not have to be; they just retrieve data and do calculations on it), but the performance of the very same operations has dropped significantly: they take around 20 seconds, sometimes up to 50 seconds.
I would appreciate it if anybody could point me to what can be done to improve the performance on that new bundle.
Thank you in advance.
In case anyone ever experiences this problem, your my.ini needs some improvements to the cache and buffer sizes. There's a working version of my.ini available if you search for it on Google. It worked like a miracle in terms of MySQL performance.
This is the most puzzling MySQL problem that I've encountered in my career as an administrator. Can anyone with MySQL mastery help me a bit on this?:
Right now, I run an application that queries my MySQL/InnoDB tables many times a second. These queries are simple and optimized -- either single row inserts or selects with an index.
Usually, the queries are super fast, running in under 10 ms. However, once every hour or so, all the queries slow down. For example, at 5:04:39 today, a bunch of simple queries all took 1-3 seconds or more to run, as shown in my slow query log.
Why is this the case, and what do you think the solution is?
I have some ideas of my own: maybe the hard drive is busy during that time? I do run a cloud server (Rackspace). But I have innodb_flush_log_at_trx_commit set to 0 and tons of buffer pool memory (10x the table size on disk), so the inserts and selects should be served from memory, right?
Has anyone else experienced something like this before? I've searched all over this forum and others, and it really seems unlike any other MySQL problem I've seen before.
There are many reasons for sudden stalls. For example - even if you are using innodb_flush_log_at_trx_commit=0, InnoDB will need to pause briefly as it extends the size of its data files.
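You can see how the tablespace is set up to grow with something like the following; the increment value is only illustrative, not a recommendation:

    SHOW VARIABLES LIKE 'innodb_data_file_path';         -- whether the system tablespace autoextends
    SHOW VARIABLES LIKE 'innodb_autoextend_increment';   -- how many MB each extension adds
    -- A larger increment means fewer (but individually longer) extension pauses:
    SET GLOBAL innodb_autoextend_increment = 64;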
My experience with the smaller instance types on Rackspace is that IO is completely awful. I've seen random writes (which should take 10ms) take 500ms.
There is nothing built into MySQL that will help you identify the problem easily. What you might want to do is take a look at Percona Server's slow query log enhancements. There's a specific feature called "profiling_server" which can break down where the time is spent:
http://www.percona.com/docs/wiki/percona-server:features:slow_extended#changes_to_the_log_format
I've had a tough time setting up my replication server. Is there any program (OS X, Windows, Linux, or PHP no problem) that lets me monitor and resolve replication issues? (btw, for those following, I've been on this issue here, here, here and here)
My production database is several megs in size and growing. Every time the database replication stops and the databases inevitably begin to slide out of sync, I cringe. My last resync from a dump took almost 4 hours round trip!
As always, even after sync, I run into this kind of show-stopping error:
Error 'Duplicate entry '252440' for key 1' on query.
I would love it if there were some way to closely monitor what's going on and perhaps let the software deal with it. I'm even all ears for service companies which might help me monitor my data better. Or an alternative way to mirror altogether.
Edit: going through my previous questions I found this, which helps tremendously. I'm still all ears for a monitoring solution.
To monitor the servers we use the free tools from Maatkit ... simple, yet efficient.
The binary replication is available in 5.1, so I guess you've got some balls. We still use 5.0 and it works OK, but of course we had our share of issues with it.
We use a Master-Master replication with a MySql Proxy as a load-balancer in front, and to prevent it from having errors:
we removed all unique indexes
for the few cases where we really needed unique constraints we made sure we used REPLACE instead of INSERT - see the sketch after this list (MySql Proxy can be used to guard for proper usage ... it can even rewrite your queries)
scheduled scripts doing intensive reports are always accessing the same server (not the load-balancer) ... so that dangerous operations are replicated safely
Yeah, I know it sounds simple and stupid, but it solved 95% of all the problems we had.
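The REPLACE trick looks roughly like this; the table and columns are made-up placeholders for whatever carries your unique key:

    -- An INSERT can collide on the unique key when both masters receive writes:
    --   INSERT INTO sessions (session_id, user_id) VALUES ('abc123', 42);
    -- REPLACE deletes any existing row with the same unique key value and then
    -- inserts the new one, so replication never hits a duplicate entry:
    REPLACE INTO sessions (session_id, user_id) VALUES ('abc123', 42);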
We use MySQL replication to replicate data to close to 30 servers. We monitor them with Nagios. You can probably check the replication status and use an event handler to restart it with 'SET GLOBAL SQL_SLAVE_SKIP_COUNTER=1; START SLAVE;'. That will fix the error, but you'll lose the insert that caused it.
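Our check looks at roughly these fields (the field names are standard MySQL replication status fields; the thresholds you alert on are up to you):

    SHOW SLAVE STATUS\G
    -- Slave_IO_Running / Slave_SQL_Running  -> both should be 'Yes'
    -- Seconds_Behind_Master                 -> replication lag; NULL means replication is stopped
    -- Last_Error                            -> e.g. the duplicate-entry error you quoted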
About the error, do you use MEMORY tables on your slaves? I ask because the only time we ever got a lot of these errors, they were caused by a bug in the latest releases of MySQL: 'DELETE FROM table WHERE field = value' would delete only one row in MEMORY tables even though there were multiple matching rows.
MySQL bug description