We are running Gearmand 1.1.8 on a CentOS AWS VM, using MySQL as storage for the queue. Every few hours Gearmand suddenly spins out of control, goes to 100% CPU, and sucks up most of the memory on the small instance.
We are currently in testing and not production. All the messages we send to it are relatively small and well formed, the biggest of which is approx. 15 megabytes.
We initially hit a wall with the bigger requests, but increased MySQL's max packet size and InnoDB's max log size because it was barking about both:
[mysqld]
max_allowed_packet=20M
innodb_log_file_size=300M
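A quick sanity check on the numbers above, as a minimal sketch: MySQL reads the K/M/G suffixes in my.cnf as powers of 1024, so max_allowed_packet=20M is 20,971,520 bytes, which leaves some room above a roughly 15 MB payload. The helper names parse_mysql_size and fits_in_packet are hypothetical, and the 64 KB allowance for the rest of the statement is an assumption rather than a measured figure (escaping of binary data could add more).

```python
# Hypothetical helpers; not part of gearmand or any MySQL client library.
_SUFFIX = {"K": 1024, "M": 1024**2, "G": 1024**3}


def parse_mysql_size(value):
    """Convert a my.cnf size such as '20M' to bytes (suffixes are powers of 1024)."""
    value = value.strip().upper()
    if value and value[-1] in _SUFFIX:
        return int(value[:-1]) * _SUFFIX[value[-1]]
    return int(value)


def fits_in_packet(payload_bytes, max_allowed_packet, overhead_bytes=64 * 1024):
    """True if the payload plus an assumed statement overhead fits in one packet."""
    return payload_bytes + overhead_bytes <= parse_mysql_size(max_allowed_packet)


# The biggest message in the question is about 15 MB.
print(fits_in_packet(15 * 1024**2, "20M"))
```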
We are using the recommended schema (http://gearman.info/gearmand/queues/mysql.html) in InnoDB (the schema doesn't specify a storage engine).
We have 20 workers connected to the gearmand server, and it ran FINE stacking up hundreds of thousands of messages until the last few days.
Any help would be greatly appreciated.
While this is going on there is no network traffic in or out. We have a status dashboard (see: http://jdon.at/kmCU) which completely locks up waiting for a STATUS response from the Gearman server.
Any help would be greatly appreciated! Thanks
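For the dashboard side of this, a minimal sketch of a status poll that cannot hang forever: it uses gearmand's plain-text admin command status (one line per function with tab-separated function name, total jobs, running jobs and available workers, ended by a line containing a single dot) with a socket timeout. The host, the 5 second timeout and the helper names are assumptions; 4730 is gearmand's default port.

```python
import socket


def parse_status(text):
    """Parse the reply to gearmand's text 'status' admin command."""
    rows = {}
    for line in text.splitlines():
        if line == ".":          # a lone dot terminates the reply
            break
        if not line:
            continue
        name, total, running, workers = line.split("\t")
        rows[name] = {
            "total": int(total),
            "running": int(running),
            "workers": int(workers),
        }
    return rows


def fetch_status(host="127.0.0.1", port=4730, timeout=5.0):
    """Ask gearmand for its status; raises socket.timeout if it stops answering."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"status\n")
        data = b""
        # Read until the terminating ".\n" line arrives or the peer closes.
        while not (data == b".\n" or data.endswith(b"\n.\n")):
            chunk = sock.recv(4096)
            if not chunk:
                break
            data += chunk
    return parse_status(data.decode("ascii", errors="replace"))
```

A dashboard could catch socket.timeout around fetch_status() and show the server as unresponsive instead of locking up.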
The problem was a bug in the version we were using, 1.1.8, where the gearmand queue server would crash upon receiving a WORK_EXCEPTION. Upgrading to 1.1.11 corrected the issue.
see: https://bugs.launchpad.net/gearmand/+bug/1164997
We've been using MySQL on Cloud SQL for quite some time now.
Obviously, we started with MySQL 5, but after a long wait and the final release of MySQL 8 we decided to upgrade our database server.
As the title suggests, we now see strange behavior in our memory utilization.
As you can see here, it constantly fills up until the server's maximum resources are reached, then the server restarts and memory starts filling up again.
I mean there could be an issue with one of our services but before the upgrade our memory consumption looked like this:
So you can see, memory consumption was more or less constant.
Furthermore, we increased resources when we upgraded to MySQL 8 and switched from db-n1-standard-1 to db-n1-standard-2, to have more resources available as the data grows.
Does anyone know this behavior? Did something change between MySQL 5 and 8? I didn't find any information about it, just some notes that it's normal for MySQL to take as much memory as it can get. But I'm still wondering why it didn't on MySQL 5.
Some more details on the configuration:
We're using a read replica for HA
Binary logs activated
Slow Query log enabled with FILE output.
Everything else is default CloudSql Configuration.
Any help is much appreciated.
Best regards,
Chris
Indeed, it seems that MySQL 8 is consuming more memory than MySQL 5. As shown in some tests performed by the author of the article "MySQL 8 and MySQL 5.7 Memory Consumption on Small Devices", the memory used by version 8 with the same VM settings is considerably higher than with version 5, for both resident and virtual memory. Even though these are tests on small VMs, it's a good indication that this occurs in bigger configurations as well.
So, yes, as you mentioned, it's normal for MySQL to take as much memory as it can get, but MySQL 8 does indeed consume more memory than MySQL 5.
I have a droplet on DigitalOcean created using Laravel Forge, and for the past few days the MySQL server has kept crashing. The only way to make it work again is to reboot the server (MySQL makes the server unresponsive).
When I run htop to see the list of processes, it shows a number of /usr/sbin/mysqld --daemonize --pid-file=/run/mysqld/mysql.pid entries (currently it is showing 33 of them).
The error log is bigger than 1GB (yes, I know!) and shows this message hundreds of times:
[Warning] InnoDB: Difficult to find free blocks in the buffer pool (21
search iterations)! 21 failed attempts to flush a page! Consider
increasing the buffer pool size. It is also possible that in your Unix
version fsync is very slow, or completely frozen inside the OS kernel.
Then upgrading to a newer version of your operating system may help.
Look at the number of fsyncs in diagnostic info below. Pending flushes
(fsync) log: 0; buffer pool: 0. 167678974 OS file reads, 2271392 OS
file writes, 758043 OS fsyncs. Starting InnoDB Monitor to print
further diagnostics to the standard output.
This droplet has been running for 6 months, but this problem only started last week. The only thing that changed recently is that we now send weekly notifications to customers (only the ones that subscribed to them) to let them know about certain events happening in the current week. This is a fairly intensive process, because we have a few thousand customers, but we take advantage of Laravel Queues to process everything.
Is this a MySQL-settings related issue?
Try increasing innodb_buffer_pool_size in my.cnf.
The recommendation for a dedicated DB server is 80% of RAM. If you're already at that level, then you should consider moving to a bigger instance type.
In my.cnf, set these values:
innodb_buffer_pool_size = 12G
innodb_buffer_pool_instances = 12
innodb_page_cleaners = 12
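As a rough sketch of how the 80% rule turns into a concrete setting: on MySQL 5.7 and later, innodb_buffer_pool_size is expected to be a multiple of innodb_buffer_pool_chunk_size (128M by default) times innodb_buffer_pool_instances, so it helps to round to that unit. The function name is hypothetical and the 16 GiB of RAM is an illustrative figure, not the size of the droplet in the question.

```python
GIB = 1024**3
MIB = 1024**2


def suggest_buffer_pool_bytes(total_ram_bytes, fraction=0.8, instances=1,
                              chunk_bytes=128 * MIB):
    """Round fraction * RAM down to a multiple of chunk_bytes * instances."""
    unit = chunk_bytes * instances
    target = int(total_ram_bytes * fraction)
    # Never go below one full unit, even on very small machines.
    return max(unit, (target // unit) * unit)


# Illustrative machine size (an assumption, not taken from the question).
size = suggest_buffer_pool_bytes(16 * GIB, instances=12)
print(f"innodb_buffer_pool_size = {size // GIB}G")
```

The 80% figure only applies when the box is dedicated to MySQL; on a server that also runs the application, a smaller fraction is the safer starting point.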
We have two MySQL RDS instances (Master and read replica). As usual we write to the master and read from the slave.
The master server works fine, but we have observed that the slave server becomes unresponsive from time to time.
Observations:
Monitoring Graphs
CPU utilization drops down to 0
Increase in number of connections
Write IOPS, read IOPS, queue depth, write throughput, write latency and read latency drop to 0.
This can be resolved with a restart, but we are interested in finding the root cause. When this happens, we can still log in to the mysql prompt, but we can't execute any queries. The AWS console shows the instance as healthy, and no errors are shown.
According to the graphs, there is no abnormal activity or increase in resource utilization just before this happens. Everything looks normal.
(Small climbs in the attached graphs are normal. Those are in line with the business pattern. Historically instance survived against larger mountains)
Please let me know if you happen to come across such a situation.
Thanks.
Note:
Instance Information
db.m4.xlarge
IOPS 2000
Size 50G
Basically, the instance is underutilized when the issue happens.
Note:
If we wait without restarting the instance, it gets restarted automatically with the following error.
MySQL restart initiated to address MySQL induced log backup issues. Note that as part of this resulution, a DB Snapshot will be performed after MySQL completes restarting.
Every couple of days we have been getting a small number of MySQL timeout errors that correspond with a large spike in CPU and DB connections on our MySQL RDS instance. These are queries that are typically very fast (<5ms) that suddenly time out.
At this point, database operations are very slow for a minute or so (likely because new connections are being allocated). The number of new connections often doubles and seems to correspond to the entire connection pool being recycled.
The timeouts do not seem to correspond with heavy database load. The CPU is often under 7% when this happens spiking up to around 12%.
Once these connections are created, the old connections seem to stay around for several hours.
We have some theories:
An occasional network hiccup between EC2 and RDS
A connection pool recycle (is there such a thing?)
Resource contention on the server that backs up all queries (no deadlocks present)
Any help on debugging this would be very much appreciated.
System Details:
Windows 2012 EC2 instances
.NET 4.5
MySql Connector 6.8.3
Entity Framework 6.0.2
MySql.Data.Entities 6.8.3
MySql 5.6.12 (Hosted in Amazon's RDS)
I wanted to put this as a comment not an answer but "...must have 50 reputation to comment..."
Are you maxing out on connections? SHOW VARIABLES LIKE 'max_connections'; SHOW PROCESSLIST; (as root user)
How's your disk I/O? Run iostat -x 5 from the command line and pay special attention to queue sizes and service/wait times. If it's an issue, you can purchase AWS Provisioned IOPS for better reliability and performance.
You can profile it. I like Jet Profiler: simple and low load.
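To make the first check concrete, here is a minimal sketch that turns the relevant numbers into a utilization figure. It assumes you have already read max_connections from SHOW VARIABLES and Threads_connected and Max_used_connections from SHOW GLOBAL STATUS into plain dicts; the function name and the sample values are made up for illustration.

```python
def connection_headroom(variables, status):
    """Summarize how close the server is (and has been) to max_connections."""
    limit = int(variables["max_connections"])
    current = int(status["Threads_connected"])
    peak = int(status["Max_used_connections"])
    return {
        "current_pct": round(100.0 * current / limit, 1),
        "peak_pct": round(100.0 * peak / limit, 1),
        "hit_limit": peak >= limit,
    }


# Illustrative values only; MySQL returns these as strings.
report = connection_headroom(
    {"max_connections": "200"},
    {"Threads_connected": "90", "Max_used_connections": "200"},
)
print(report)
```

If hit_limit is true, the server has run out of connection slots at least once since it was last started, which would fit the timeouts described in the question.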
Recently we changed the app server of our Rails website from Mongrel to Passenger [with REE and Rails 2.3.8]. The production setup has 6 machines pointing to a single MySQL server and a memcache server. Before, each machine had 5 Mongrel instances. Now we have 45 Passenger instances on each machine, as each machine has 16GB of RAM and two 4-core CPUs. Once we deployed this Passenger setup in production, the website became very slow and all the requests started to queue up. Eventually we had to roll back.
Now we suspect that the cause is the increased load on the MySQL server. Before there were only 30 MySQL connections, and now we have 275 connections. The MySQL server has a similar setup to our website machines, but all the configs were left at their default limits. The buffer_pool_size is only 8 MB even though we have 16GB of RAM, and the number of concurrent threads is 8.
Could this increase in simultaneous connections have caused MySQL to respond more slowly than when we had only 30 connections? If so, how can we make MySQL perform better with 275 simultaneous connections in place?
Any advice greatly appreciated.
UPDATE:
More information on the mysql server:
RAM : 16GB CPU: two processors each having 4 cores
Tables are InnoDB, with only default InnoDB config values.
Thanks
An idle MySQL connection uses up a stack and a network buffer on the server. That is worth about 200 KB of memory and zero CPU.
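Applied to the numbers in the question, that estimate works out as follows. This is only a back-of-the-envelope sketch using the 200 KB per idle connection figure above; the function name is hypothetical.

```python
def idle_connection_memory_mb(connections, per_conn_kb=200):
    """Approximate server memory held by idle connections, in MB."""
    return connections * per_conn_kb / 1024


before_mb = idle_connection_memory_mb(30)    # 6 machines x 5 Mongrels
after_mb = idle_connection_memory_mb(275)    # connection count reported after the switch
share_of_ram = after_mb / (16 * 1024)        # the server has 16 GB of RAM

print(f"{before_mb:.1f} MB -> {after_mb:.1f} MB ({share_of_ram:.2%} of RAM)")
```

Even at 275 connections this is roughly 54 MB, well under 1% of the machine's memory, so the connections by themselves are unlikely to be what slows the server down.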
In a database using InnoDB only, you should edit /etc/sysctl.conf to include vm.swappiness = 0 to delay swapping out processes as long as possible. You should then increase innodb_buffer_pool_size to about 80% of the system's memory, assuming a dedicated database server machine. Make sure the box does not swap, that is, VSIZE should not exceed system RAM.
innodb_thread_concurrency can be set to 0 (unlimited) or, if you are a bit paranoid, to 32 to 64, assuming MySQL 5.5. The limit is lower in 5.1, and around 4-8 in MySQL 5.0. It is not recommended to use such outdated versions of MySQL on a machine with 8 or 16 cores; there are huge improvements with respect to concurrency in MySQL 5.5 with InnoDB 1.1.
The variable thread_concurrency has no meaning inside a current Linux. It is used to call pthread_setconcurrency() in Linux, which does nothing. It used to have a function in older Solaris/SunOS.
Without further information, the cause of your performance problems cannot be determined with any certainty, but the above general advice may help. More general advice geared at my limited experience with Ruby can be found in http://mysqldump.azundris.com/archives/72-Rubyisms.html That article is the summary of a consulting job I once did for an early version of a very popular Facebook application.
UPDATE:
According to http://pastebin.com/pT3r6A9q , you are running 5.0.45-community-log, which is awfully old and does not perform well under concurrent load. Use a current 5.5 build, it should perform way better than what you have there.
Also, fix the innodb_buffer_pool_size. You are going nowhere with only 8M of pool here.
While you are at it, innodb_file_per_table should be ON.
Do not switch on innodb_flush_log_at_trx_commit = 2 without understanding what that means, but it may help you temporarily, depending on your persistence requirements. It is not a permanent solution to your problems in any way, though.
If you have any substantial kind of writes going on, you need to review the innodb_log_file_size and innodb_log_buffer_size as well.
If that installation is earning money, you dearly need professional help. I am no longer doing this as a profession, but I can recommend people. Contact me outside of Stack Overflow if you want.
UPDATE:
According to your processlist, you have very many queries in state Sending data. MySQL is in this state when a query is being executed, that is, the main interior Join Loop/Query Execution loop is busy. SHOW ENGINE INNODB STATUS\G will show you something like
...
--------------
ROW OPERATIONS
--------------
3 queries inside InnoDB, 0 queries in queue
...
If that number is larger than say 4-8 (inside InnoDB), 5.0.x is going to have trouble. 5.5.x will perform a lot better here.
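If you want to watch that number over time rather than eyeball it, a small parser is enough. This is a sketch that assumes the ROW OPERATIONS line has the exact wording shown above (it can differ between versions); the function name is hypothetical, and you would feed it the text returned by SHOW ENGINE INNODB STATUS.

```python
import re

# Matches the line shown in the ROW OPERATIONS section above.
ROW_OPS = re.compile(r"(\d+) queries inside InnoDB, (\d+) queries in queue")


def innodb_concurrency(status_text):
    """Return (queries inside InnoDB, queries in queue), or None if not found."""
    match = ROW_OPS.search(status_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


sample = """
--------------
ROW OPERATIONS
--------------
3 queries inside InnoDB, 0 queries in queue
"""
inside, queued = innodb_concurrency(sample)
print(inside, queued)
```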
Regarding the my.cnf: See my previous comments on your InnoDB. See also my comments on thread_concurrency (without innodb_ prefix):
# On Linux, this does exactly nothing.
thread_concurrency = 8
You are missing any InnoDB configuration at all. Assuming that you ARE using InnoDB tables, you are not going to perform well, no matter what you do.
As far as I know, it's unlikely that merely maintaining/opening the connections would be the problem. Are you seeing this issue even when the site is idle?
I'd try http://www.quest.com/spotlight-on-mysql/ or similar to see if it's really your database that's the bottleneck here.
In the past, I've seen basic networking craziness lead to behaviour similar to what you describe: someone had set up the new machines with an incorrect subnet mask.
Have you looked at any of the machine statistics on the database server? Memory/CPU/disk IO stats? Is the database server struggling?