MySQL/Magento Performance issues on Amazon EC2 - mysql

We are running 2 web servers that host a Magento eCommerce site and 1 MySQL database server on the Amazon EC2.
We are experiencing major performance issues, deadlocks, 'lock wait timeout exceeded' errors etc on the MySQL server and really struggling to get these resolved.
We have recently upgraded the db server to an m1.xlarge instance (from m1.large) and still we are continuing to experience these problems.
We've been attributing these issues to bad disk IO we often see on the EC2 servers, but recently I've seen issues with deadlocks etc even when the disk IO is fine.
The "sar" command is showing that we have pretty poor disk IO performance at peak times or when we perform database intensive operations like creating invoices via the Magento API. We often see the iowait go up to over 20%.
Below is a link to a screenshot show the results of an "mtop" during a recent problem we had where a query was causing a slow down of the entire database:
http://i.imgur.com/AARlc.png
This screenshot shows one or other query that is holding up the rest of the queries from executing. It also shows quite a low load average, often we see the load average going up to 3.0 when an intensive command is being executed.
Here are the my.cnf settings:
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
user=mysql
symbolic-links=0
innodb_file_per_table=1
key_buffer=512M
max_allowed_packet=64M
table_cache=512
innodb_thread_concurrency=5
innodb_buffer_pool_size=4976M
innodb_additional_mem_pool_size=8M
innodb_log_file_size=128M
innodb_log_buffer_size=8M
thread_cache_size=150
sort_buffer_size=4M
read_buffer_size=4M
read_rnd_buffer_size=2M
myisam_sort_buffer_size=64M
tmp_table_size=256M
query_cache_type=1
query_cache_size=128M
max_connections=400
wait_timeout=28800
innodb_lock_wait_timeout=120
max_heap_table_size=256M
long_query_time=3
log-slow-queries=...mysql-slow.log
[mysqld_safe]
log-error=...mysqld.log
pid-file=...mysqld.pid
We have used the pt-query-digest function extensively to analyze our MySQL slow query log.
Basically we are seeing that the sales_flat_quote table is extremely slow with updates and inserts, but so are a number of other tables.
sales_flat_quote is not particularly large though, there are only around 100k rows in the table.

Several root causes are possible:
Some of your slow queries may be locking tables, thus queueing other queries
Your queries may not be optimized
Your queries may need more indexes on some tables
Check your slowest queries using this official tool:
mysqldumpslow

We have observed similar hogs on our mysql EC2 server, however, we quickly migrated our database to an RDS instance. Since then, there have been very few problems. One might argue that RDS are costly and EC2 are not, however, you would also save on the time spent on managing databases/daily backups etc.

I recommend to migrate your database to an RDS instance.

Related

MySQL max_connections

I have setup with Linux, Debian Jessie with Mysql 5.7.13 installed.
I have set following settings in
my.cnf: default_storage_engine= innodb, innodb_buffer_pool_size= 44G
When I start MySQL I manually set max_connections with SET GLOBAL max_connections = 1000;
Then I trigger my loadtest that sends a lot of traffic to the DB server which mostly consists of slow/bad queries.
The result I expected was that I would reach close to 1000 connections but somehow MySQL limits it to 462 connections and I can not find the setting that is responsible for this limit. We are not even close to maxing out the CPU or Memory.
If you have any idea or could point me in a direction where you think the error might be it would be really helpful.
What loadtest did you use? Are you sure that it can utilize about thousands of connections?
You may maxing out your server resources in the disk IO area, especially if you're talking about lot of slow/bad queries. Did you check for disk utilization on your server?
Even if your InnoDB pool size is large your DB still need to read your DB to the cache first, and if your entire DB is large it will not help you.
I can recommend you to perform such a test once more time and track your disk performance during loadtest using iostat or iotop utility.
Look here for more examples of the server performance troubleshooting.
I found the issue, it was du to limitation of Apache server, there is a "hidden" setting inside /etc/apache2/mods-enabled/mpm_prefork.conf which will overwrite setting inside /etc/apache2/apache2.conf
Thank you!

mysql 5.6 Linux vs windows performance

The below command takes 2-3 seconds in a Linux MySQL 5.6 server running Php 5.4
exec("mysql --host=$db_host --user=$db_user --password=$db_password $db_name < $sql_file");
On windows with similar configuration it takes 10-15 seconds. The windows machine has a lot more ram (16gb) and similar hard drive. I installed MySQL 5.6 and made no configuration changes. This is on windows server 2012.
What are configurations I can change to fix this?
The database file creates about 40 innodb tables with very minimal inserts.
EDIT: Here is the file I am running:
https://www.dropbox.com/s/uguzgbbnyghok0o/database_14.4.sql?dl=0
UPDATE: On windows 8 and 7 it was 3 seconds. But on windows server 2012 it is 15+ seconds. I disabled System center 2012 and that made no difference.
UPDATE 2:
I also tried killing almost every service except for mysql and IIS and it still performed slowly. Is there something in windows server 2012 that causes this to be slow?
Update 3
I tried disable write cache buffer flush and performance is now great.
I didn't have to do this on other machines I tested with. Does this indicate a bottleneck With how disk is setup?
https://social.technet.microsoft.com/Forums/windows/en-US/282ea0fc-fba7-4474-83d5-f9bbce0e52ea/major-disk-speed-improvement-disable-write-cache-buffer-flushing?forum=w7itproperf
That is why we call it LAMP stack and no doubt why it is so popular mysql on windows vs Linux. But that has more to do more with stability and safety. Performance wise the difference should be minimal. While a Microsoft Professional can best tune the Windows Server explicitly for MySQL by enabling and disabling the services, but we would rather be interested to see the configuration of your my.ini. So what could be the contributing factors w.r.t Windows on MySQL that we should consider
The services and policies in Windows is sometimes a big impediment to performance because of all sorts of restrictions and protections.
We should also take into account the Apache(httpd.conf) and PHP(php.ini) configuration as MySQL is so tightly coupled with them.
Antivirus : Better disable this when benchmarking about performance
Must consider these parameters in my.ini as here you have 40 Innodb tables
innodb_buffer_pool_size, innodb_flush_log_at_trx_commit, query_cache_size, innodb_flush_method, innodb_log_file_size, innodb_file_per_table
For example: If file size of ib_logfile0 = 524288000, Then
524288000/1048576 = 500, Hence innodb_log_file_size should be 500M
innodb_flush_log_at_trx_commit = 2
innodb_flush_method = O_DIRECT
https://dev.mysql.com/doc/refman/5.1/en/innodb-tuning.html
When importing data into InnoDB, make sure that MySQL does not have autocommit mode enabled because that requires a log flush to disk for every insert
SET autocommit=0;
Most importantly innodb_flush_log_at_trx_commit as in this case it is about importing database. Setting this to '2' form '1' (default)hm can be a big performance booster specially during data import as log buffer will be flushed to OS file cache on every transaction commit
For reference :
https://dev.mysql.com/doc/refman/5.5/en/optimizing-innodb-bulk-data-loading.html
https://dba.stackexchange.com/a/72766/60318
http://kvz.io/blog/2009/03/31/improve-mysql-insert-performance/
Lastly, based on this
mysql --host=$db_host --user=$db_user --password=$db_password $db_name < $sql_file
If the mysqldump (.sql) file is not residing in the same host where you are importing, performance will be slow. Consider to copy the (.sql) file exactly in the server where you need to import the database, then try importing without --host option.
Windows is slower at creating files, period. 40 InnoDB tables involves 40 or 80 file creations. Since they are small InnoDB tables, you may as well set innodb_file_per_table=OFF before doing the CREATEs, thereby needing only 40 file creations.
Good practice in MySQL is to create tables once, and not be creating/dropping tables frequently. If your application is designed to do lots of CREATEs, we should focus on that. (Note that, even on Linux, table create time is non-trivial.)
If these are temporary tables... 5.7 will have significant changes that will improve the performance (on either OS) in this area. 5.7 is on the cusp of being GA.
(RAM size is irrelevant in this situation.)

AWS RDS MySQL Read Replica Lag Issues

I run a service that needs to be able to support about 4000+ IOPS and keep replica lag <=1 second to function properly.
I am using AWS RDS MySQL instances and have 2 read replica's. My service was experiencing giant replica lag spikes on the read replica's so I was in contact with AWS support for a week trying to understand why I was experiencing the lag--I had 6000 IOPS provisioned and my instances were very powerful. They gave me all kinds of reasons.
After changing instance types, upgrading to MySQL 5.6 from 5.5 to take advantage of multi-threading, and them replacing underlying hardware I was still seeing significant replica lag randomly.
Eventually I decided to start tinkering with the parameter groups changing my configs for just the read replica's on anything I could find that was involved in the replication process and am now finally experiencing <= 1 second of replica lag.
Here are the settings I changed and their values that appear to be successful (I copied the default mysql 5.6 param group and changed these values applying the updated paramater group to just the read replicas):
innodb_flush_log_at_trx_commit=0
sync_binlog=0
sync_master_info=0
sync_relay_log=0
sync_relay_log_info=0
Please read about each of these to understand the impact of the modifications: http://dev.mysql.com/doc/refman/5.6/en/innodb-parameters.html
Other things to make sure you take care of:
Convert any MyISAM tables to InnoDB
Upgrade from MySQL < 5.6 to MySQL >= 5.6
Ensure that your provisioned IOPS are > the combined read/write IOPS you require
Ensure that your read replica instances are >= master instance
If anyone else has any additional parameters that could be modified on the read replica's or master DB to get the best replication performance I'd love to hear more.
UPDATE 7-8-2014
To take advantage of Mysql 5.6 multi-thread replication I've set:
slave_parallel_workers=5 (Set it to the number of read replica DBs you have running)
I found this in this here:
https://blogs.oracle.com/MySQL/entry/benchmarking_mysql_replication_with_multi
Mysql replication executes all the transactions on a single database in order , and master - can execute those transactions in parallel.
You probably have most of the updates executed on a single DA, and that is what not allowing you to get advantage of multithreaded replication.
Check the iostat on your replica server. Most of the time those problem occurs because of high IO on the machine.
In order to decrease the IO on a machine - there are several additional changes that you can do:
Increase innodb_buffer_pool_size - this is the first thing you should change from default. If this instance runs only mysql - you can allocate about 80% of your available the memory here.
Verify also the following parameters:
log_slave_updates = false
binlog_format = STATEMENT
(if you have MIXED or ROW binlog_format configured - verify that you understand what does that means from here http://dev.mysql.com/doc/refman/5.6/en/binary-log-setting.html
If you have a lot of data that is being changed for several times - increasing
innodb_max_dirty_pages_pct to 90 or 95% can be worth checking.

What when mysqldump becomes utterly slow

Currently my database is almost 20 GB big and still growing.
I'm taking a daily backup with mysqldump and it's getting really slow.
So slow that in the meanwhile new connections stack up and eventually cause this error:
SQLSTATE[HY000] [1040] Too many connections
(I could improve the amount of connections that's accepted but that won't do anything because the connections are still just frozen, waiting for the backup to complete, which will lead to timeout)
I've been reading up on some options to improve the speed and this is what I've found:
option --quick (Will probably help)
option --single-transaction (Will prevent tables from being locked, but may cause database to become incorrect)
Master-Slave replication (Probably the best thing I could do, one problem, I have only one server available)
The master-slave replication really sounds like it's the best option since I can stop the slave from updating, take the backup, and let it resume syncing. The problem is I only have one fysical machine to work with.
I know that I can set up multiple mysql instances on this one server. The question is: Is it wise to do so?
The slave is really only used to generate that backup file (which will be copied to a different disk on the network) so that the master can stay live.
if you use just innodb - try xtrabackup.
if you use both myisam and innodb - flush + lvm snapshot + file-level copy might work for you.
indeed replication slave for backups is good idea as well. just remember to periodically check data consistency between the master and the slave.

Increasing the number of simultaneous request to mysql

Recently we changed app server of our rails website from mongrel to passenger [with REE and Rails 2.3.8]. The production setup has 6 machines pointing to a single mysql server and a memcache server. Before each machine had 5 mongrel instance. Now we have 45 passenger instance as the RAM in each machine is 16GB with 2, 4 core cpu. Once we deployed this passenger set up in production. the Website became so slow. and all the request starting to queue up. And eventually we had to roll back.
Now we suspect that the cause should be the increased load to the Mysql server. As before there where only 30 mysql connection and now we have 275 connection. The mysql server has the similar set up as our website machine. bUt all the configs were left to the defaul limit. The buffer_pool_size is only 8 mb though we have 16GB ram. and number of Concurrent threads is 8.
Will this increased simultaneous connection to mysql would have caused mysql to respond slowly than when we had only 30 connections? If so, how can we make mysql perform better with 275 simultaneous connection in place.
Any advice greatly appreciated.
UPDATE:
More information on the mysql server:
RAM : 16GB CPU: two processors each having 4 cores
Tables are innoDB. with only default innodb config values.
Thanks
An idle MySQL connection uses up a stack and a network buffer on the server. That is worth about 200 KB of memory and zero CPU.
In a database using InnoDB only, you should edit /etc/sysctl.conf to include vm.swappiness = 0 to delay swapping out processes as long as possible. You should then increase innodb_buffer_pool_size to about 80% of the systems memory assuming a dedicated database server machine. Make sure the box does not swap, that is, VSIZE should not exceed system RAM.
innodb_thread_concurrency can be set to 0 (unlimited) or 32 to 64, if you are a bit paranoid, assuming MySQL 5.5. The limit is lower in 5.1, and around 4-8 in MySQL 5.0. It is not recommended to use such outdated versions of MySQL in a machine with 8 or 16 cores, there are huge improvements wrt to concurrency in MySQL 5.5 with InnoDB 1.1.
The variable thread_concurrency has no meaning inside a current Linux. It is used to call pthread_setconcurrency() in Linux, which does nothing. It used to have a function in older Solaris/SunOS.
Without further information, the cause for your performance problems cannot be determined with any security, but the above general advice may help. More general advice geared at my limited experience with Ruby can be found in http://mysqldump.azundris.com/archives/72-Rubyisms.html That article is the summary of a consulting job I once did for an early version of a very popular Facebook application.
UPDATE:
According to http://pastebin.com/pT3r6A9q , you are running 5.0.45-community-log, which is awfully old and does not perform well under concurrent load. Use a current 5.5 build, it should perform way better than what you have there.
Also, fix the innodb_buffer_pool_size. You are going nowhere with only 8M of pool here.
While you are at it, innodb_file_per_table should be ON.
Do not switch on innodb_flush_log_at_trx_commit = 2 without understanding what that means, but it may help you temporarily, depending on your persistence requirements. It is not a permanent solution to your problems in any way, though.
If you have any substantial kind of writes going on, you need to review the innodb_log_file_size and innodb_log_buffer_size as well.
If that installation is earning money, you dearly need professional help. I am no longer doing this as a profession, but I can recommend people. Contact me outside of Stack Overflow if you want.
UPDATE:
According to your processlist, you have very many queries in state Sending data. MySQL is in this state when a query is being executed, that is, the main interior Join Loop/Query Execution loop is busy. SHOW ENGINE INNODB STATUS\G will show you something like
...
--------------
ROW OPERATIONS
--------------
3 queries inside InnoDB, 0 queries in queue
...
If that number is larger than say 4-8 (inside InnoDB), 5.0.x is going to have trouble. 5.5.x will perform a lot better here.
Regarding the my.cnf: See my previous comments on your InnoDB. See also my comments on thread_concurrency (without innodb_ prefix):
# On Linux, this does exactly nothing.
thread_concurrency = 8
You are missing all innodb configuration at all. Assuming that you ARE using innodb tables, you are not performing well, no matter what you do.
As far as I know, it's unlikely that merely maintaining/opening the connections would be the problem. Are you seeing this issue even when the site is idle?
I'd try http://www.quest.com/spotlight-on-mysql/ or similar to see if it's really your database that's the bottleneck here.
In the past, I've seen basic networking craziness lead to behaviour similar to what you describe - someone had set up the new machines with an incorrect submask.
Have you looked at any of the machine statistics on the database server? Memory/CPU/disk IO stats? Is the database server struggling?