"Storage-full " problem with aws RDS MYSQL read replica - mysql

I have one master RDS instance with 50 GB of storage and created a read replica of the master DB with the same configuration.
I use this read replica only for SELECT operations, nothing else. But suddenly the read replica ran into a storage-full problem. The master DB is working properly; how is this possible if both instances are the same size but only the replica's storage is full?
I get the error below when executing a complex query (inner joins, GROUP BY):
ERROR 3 (HY000): Error writing file '/rdsdbdata/tmp/MY7U2XRf' (Errcode: 28 - No space left on device)
In the AWS Console, the slave DB status is "Storage-full" and the event log message is: The free storage capacity for DB Instance: example-slave is low at 1% of the provisioned storage [Provisioned Storage: 49.07 GB, Free Storage: 527.80 MB]. You may want to increase the provisioned storage to address this issue.
I checked the used size on both the master and replica DB instances with the query below; it is almost the same.
MySQL:
SELECT table_schema, ROUND(SUM(data_length+index_length)/1024/1024/1024,2) "size in GB"
FROM information_schema.tables
GROUP BY 1
ORDER BY 2 DESC;
The master DB uses 23.86 GB out of 50 GB and the slave DB uses 24 GB out of 50 GB.

What probably happened is that you enabled storage auto scaling on the master but not on the slave/read replica.
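Another thing worth checking on the replica is what is consuming space besides the table data: on-disk temporary tables from heavy SELECTs (the /rdsdbdata/tmp file in the error) and retained binary logs both count against the instance's storage. A few checks from the mysql client; the mysql.rds_* calls are RDS-specific stored procedures, and SHOW BINARY LOGS only works if binary logging is enabled on the replica:

-- How often queries spill temporary tables to disk (these land in /rdsdbdata/tmp):
SHOW GLOBAL STATUS LIKE 'Created_tmp_disk_tables';
-- Size of the binary logs kept on the instance:
SHOW BINARY LOGS;
-- RDS-only: show and, if needed, shrink the binlog retention window:
CALL mysql.rds_show_configuration;
CALL mysql.rds_set_configuration('binlog retention hours', 24);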

Related

MySQL server query takes a long time to execute

Why does MySQL Server take longer than MySQL inside WAMP?
Machine 1 has MySQL 5.6.17 installed (inside WAMP).
Machine 2 has MySQL 5.7.24 installed (standalone server).
Both machines have the same configuration and the same OS.
I imported the same DB dump file into Machine 1 and Machine 2.
Now I execute a query (which joins 6 tables) and it returns 400 rows.
Time taken:
Machine 1 (5.6.17, inside WAMP): under 30 seconds
Machine 2 (5.7.24): more than 230 seconds
Should I use MySQL (WAMP) instead of MySQL Server?
I think the MySQL Server installation needs innodb_buffer_pool_size increased in my.ini, which is located in C:\ProgramData (a hidden folder by default).
The default innodb_buffer_pool_size there is 8M.
innodb_buffer_pool_size: this is a very important setting to look at immediately after installation when using InnoDB. The buffer pool is where InnoDB caches data and indexes; making it as large as possible ensures that memory, not disk, is used for most read and write operations. Typical values are 5-6 GB on a machine with 8 GB of RAM.
Fix: Increase innodb_buffer_pool_size
innodb_buffer_pool_size=356M
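A rough way to check the current value and, on 5.7, resize the buffer pool without a restart; the 1 GB figure below is only an example, so size it to the machine's RAM:

-- Current buffer pool size in MB:
SELECT @@innodb_buffer_pool_size / 1024 / 1024 AS buffer_pool_mb;
-- MySQL 5.7 can resize the buffer pool online; the value is rounded up to a
-- multiple of innodb_buffer_pool_chunk_size * innodb_buffer_pool_instances.
SET GLOBAL innodb_buffer_pool_size = 1073741824;  -- 1 GB, adjust for your RAM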

Abnormal bandwidth between EC2 instances for MySQL replication

I am using MySQL replication between 2 AWS instances: m4.xlarge.
Version 5.6.35
Sometimes I have trouble with high replication lag; Seconds_Behind_Master increases too much, up to xx,000 seconds. I found that the io_thread on the slave cannot keep up with the binlog growth on the master DB.
During this time, the bandwidth and bytes transmitted between master and slave are very low (counted in bytes).
But when I changed the instance type of the slave from m4.xlarge to t2.xlarge and back, the bandwidth between master and slave increased immediately (up to 400 KB/s). The master DB was left untouched, and soon the replication lag disappeared. This is weird.
(I used iftop to check the bandwidth.)
Could you please advise what is wrong here? What happens when we change the instance type, and how can we detect the root cause?
Thanks so much.
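One way to confirm whether the IO thread (fetching binlogs) or the SQL thread (applying them) is the one falling behind is to compare the log positions reported on the slave; this is standard MySQL, nothing AWS-specific:

-- Run on the slave. If Read_Master_Log_Pos lags far behind the master's
-- current binlog position, the IO thread is the bottleneck; if
-- Exec_Master_Log_Pos lags behind Read_Master_Log_Pos, the SQL thread is.
SHOW SLAVE STATUS\G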
I have checked & monitored that the problem come from our Slave DB running out of Burst Balance. (https://aws.amazon.com/blogs/aws/new-burst-balance-metric-for-ec2s-general-purpose-ssd-gp2-volumes/).
As our slave DB has small storage just 20GB so it just has default 100 IOPS.
IOPS = Volume size (in GB) * 3
Minimum for gp2 is 100 IOPS
Maximum for gp2 is 3000 IOPS
I increased the storage to 50 GB to get 150 IOPS and now the io_thread is running better.
Stopping/starting the instance also resets Burst Balance to 100%, but that is only a quick fix. This metric should be checked alongside bandwidth in case of high latency between EC2 instances.
It seems some others have run into the same problem [1] [2].
Did you try turning off sync_binlog? (sync_binlog=0)
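If you do try that, a minimal sketch for checking and relaxing the sync settings on the slave; sync_binlog=0 trades binlog durability on crash for less write IO, so weigh that trade-off first:

-- Current values on the slave:
SHOW VARIABLES LIKE 'sync_binlog';
SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit';
-- Stop syncing the binlog to disk on every commit (requires SUPER on a
-- self-managed instance; on RDS this would go through a parameter group):
SET GLOBAL sync_binlog = 0;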

AWS RDS Concurrent Connections Issue

So I have an RDS MariaDB server running with the following specs.
Instance Class: db.m4.2xlarge
Storage Type: Provisioned IOPS (SSD)
IOPS: 4000
Storage: 500 GB
My issue is that when the SQL server experiences heavy load (connections in excess of 200), it starts to refuse new connections.
However, according to the monitoring stats, it should be able to handle connections well above that. At its peak load these are the stats:
CPU Utilization: 18%
DB Connections: 430
Write Operations: 175/sec
Read Operations: 1/sec (This Utilizes MemCache)
Memory Usage: 1.2GB
The DB has the following hardware specs
8 vCPUs
32 GB Mem
1000 Mbps EBS Optimized
Also, from what I can tell, RDS has the max_connections setting in MySQL set to 2,664.
So I can't understand why it is rejecting new connections at such a comparatively low number. Is there another setting that controls this, either in RDS or in MariaDB?
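A first diagnostic step (plain MySQL/MariaDB, nothing RDS-specific) is to compare the configured limits with the live counters; a per-account limit such as max_user_connections, or a spike in aborted connections, can cause refusals well below max_connections:

-- Server-wide and per-account connection limits:
SHOW VARIABLES LIKE 'max_connections';
SHOW VARIABLES LIKE 'max_user_connections';
-- Live and historical connection counters:
SHOW GLOBAL STATUS LIKE 'Threads_connected';
SHOW GLOBAL STATUS LIKE 'Max_used_connections';
SHOW GLOBAL STATUS LIKE 'Aborted_connects';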

Couchbase 4.0 Community Edition benchmark

We are benchmarking Couchbase and observing some very strange behaviour.
Setup phase:
Couchbase cluster machines:
2 x EC2 r3.xlarge with general-purpose 80 GB SSD (not EBS-optimised), IOPS 240/3000.
Couchbase settings:
Cluster:
Data Ram Quota: 22407 MB
Index Ram Quota: 2024 MB
Index Settings (default)
Bucket:
Per Node Ram Quota: 22407 MB
Total Bucket Size: 44814 MB (22407 x 2)
Replicas enabled (1)
Disk I/O Optimisation (Low)
Each node runs all three services
Couchbase client:
1 x EC2 m4.xlarge with general-purpose 20 GB SSD (EBS-optimised), IOPS 60/3000.
The client runs the YCSB benchmark tool:
ycsb load couchbase -s -P workloads/workloada -p recordcount=100000000 -p core_workload_insertion_retry_limit=3 -p couchbase.url=http://HOST:8091/pools -p couchbase.bucket=test -threads 20 | tee workloadaLoad.dat
PS: All the machines are residing within the same VPC and subnet.
Results:
While everything works as expected:
The average ops/sec is ~21,000.
The 'disk write queue' graph floats between 200K and 600K (periodically drained).
The 'temp OOM per sec' graph is constant at 0.
When things start to get weird:
After about ~27M documents inserted, we see the 'disk write queue' constantly rising (not getting drained).
At about ~8M disk queue size, OOM failures start to show themselves and the client receives 'Temporary failure' from Couchbase.
After 3 retries of each YCSB thread, the client stops after inserting only ~27% of the overall documents.
Even after the YCSB client has stopped running, the 'disk write queue' moves asymptotically towards 0 and is drained only after ~15 min.
P.S
When we benchmark locally on a MacBook with 16 GB of RAM + an SSD disk (local client + one-node server), we do not observe such behaviour and the 'disk write queue' is constantly drained in a predictable manner.
Thanks.

AWS RDS MySQL Read Replica Lag Issues

I run a service that needs to support about 4000+ IOPS and keep replica lag <= 1 second to function properly.
I am using AWS RDS MySQL instances and have 2 read replicas. My service was experiencing giant replica lag spikes on the read replicas, so I was in contact with AWS support for a week trying to understand why; I had 6000 IOPS provisioned and my instances were very powerful. They gave me all kinds of reasons.
After changing instance types, upgrading from MySQL 5.5 to 5.6 to take advantage of multi-threading, and having them replace underlying hardware, I was still seeing significant replica lag at random.
Eventually I decided to start tinkering with the parameter groups, changing configs for just the read replicas on anything I could find that was involved in the replication process, and I am now finally seeing <= 1 second of replica lag.
Here are the settings I changed and the values that appear to be successful (I copied the default MySQL 5.6 parameter group, changed these values, and applied the updated parameter group to just the read replicas):
innodb_flush_log_at_trx_commit=0
sync_binlog=0
sync_master_info=0
sync_relay_log=0
sync_relay_log_info=0
Please read about each of these to understand the impact of the modifications: http://dev.mysql.com/doc/refman/5.6/en/innodb-parameters.html
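One way to confirm the parameter group actually took effect on each replica is a quick check from the mysql client:

SHOW VARIABLES WHERE Variable_name IN
  ('innodb_flush_log_at_trx_commit', 'sync_binlog', 'sync_master_info',
   'sync_relay_log', 'sync_relay_log_info');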
Other things to make sure you take care of:
Convert any MyISAM tables to InnoDB (see the sketch after this list)
Upgrade from MySQL < 5.6 to MySQL >= 5.6
Ensure that your provisioned IOPS are > the combined read/write IOPS you require
Ensure that your read replica instances are >= master instance
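For the MyISAM-to-InnoDB item, a minimal sketch; mydb.mytable below is only a placeholder, and the ALTER rebuilds (and can lock) the table, so run it during a quiet window:

-- Find tables still on MyISAM:
SELECT table_schema, table_name
FROM information_schema.tables
WHERE engine = 'MyISAM'
  AND table_schema NOT IN ('mysql', 'information_schema', 'performance_schema');
-- Convert one table at a time (placeholder names):
ALTER TABLE mydb.mytable ENGINE=InnoDB;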
If anyone else has any additional parameters that could be modified on the read replicas or the master DB to get the best replication performance, I'd love to hear them.
UPDATE 7-8-2014
To take advantage of MySQL 5.6 multi-threaded replication I've set:
slave_parallel_workers=5 (set it to the number of read replica DBs you have running)
I found this here:
https://blogs.oracle.com/MySQL/entry/benchmarking_mysql_replication_with_multi
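To check whether the setting is active on a replica (on RDS it has to be applied through the parameter group, since the RDS master user normally lacks the SUPER privilege needed for SET GLOBAL), something like:

-- Confirm multi-threaded replication is enabled on the replica:
SHOW VARIABLES LIKE 'slave_parallel_workers';
-- In MySQL 5.6 transactions are only applied in parallel across different
-- schemas, so this helps only if writes are spread over multiple databases.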
MySQL replication executes all the transactions for a single database in order, while the master can execute those transactions in parallel. You probably have most of the updates executed on a single DB, and that is what is not letting you take advantage of multithreaded replication.
Check iostat on your replica server. Most of the time these problems occur because of high IO on the machine.
In order to decrease the IO on the machine, there are several additional changes you can make:
Increase innodb_buffer_pool_size: this is the first thing you should change from the default. If this instance runs only MySQL, you can allocate about 80% of the available memory here.
Also verify the following parameters:
log_slave_updates = false
binlog_format = STATEMENT
(If you have MIXED or ROW binlog_format configured, verify that you understand what that means: http://dev.mysql.com/doc/refman/5.6/en/binary-log-setting.html)
If you have a lot of data that is changed several times, increasing innodb_max_dirty_pages_pct to 90 or 95% can be worth checking.
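A rough way to see how close you are to that threshold, and to raise it (on RDS the variable would go into the parameter group rather than SET GLOBAL):

-- Fraction of the buffer pool that is currently dirty:
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_dirty';
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_total';
-- Let more dirty pages accumulate before aggressive background flushing:
SET GLOBAL innodb_max_dirty_pages_pct = 90;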