MySQL crashes often - mysql

I have a droplet on DigitalOcean created using Laravel Forge, and for the past few days the MySQL server has been crashing; the only way to get it working again is to reboot the server (MySQL makes the server unresponsive).
When I run htop to see the list of processes, it shows several instances of /usr/sbin/mysqld --daemonize --pid-file=/run/mysqld/mysql.pid (currently 33 of them).
The error log is bigger than 1GB (yes, I know!) and shows this message hundreds of times:
[Warning] InnoDB: Difficult to find free blocks in the buffer pool (21
search iterations)! 21 failed attempts to flush a page! Consider
increasing the buffer pool size. It is also possible that in your Unix
version fsync is very slow, or completely frozen inside the OS kernel.
Then upgrading to a newer version of your operating system may help.
Look at the number of fsyncs in diagnostic info below. Pending flushes
(fsync) log: 0; buffer pool: 0. 167678974 OS file reads, 2271392 OS
file writes, 758043 OS fsyncs. Starting InnoDB Monitor to print
further diagnostics to the standard output.
This droplet has been running for 6 months, but this problem only started last week. The only thing that changed recently is that we now send weekly notifications to customers (only the ones that subscribed to them) to let them know about certain events happening in the current week. This is a fairly intensive process, because we have a few thousand customers, but we use Laravel Queues to process everything.
Is this a MySQL-settings related issue?

Try increasing innodb_buffer_pool_size in my.cnf.
The usual recommendation for a dedicated DB server is up to about 80% of physical RAM. If you're already at that level, consider moving to a bigger instance type.
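As a rough illustration of that guideline, here is a minimal Python sketch. The function name suggest_buffer_pool_bytes and the 16 GiB figure are made up for the example; on a server that also runs the web application and queue workers, a lower percentage is probably more appropriate.

```python
MIB = 1024 * 1024
GIB = 1024 ** 3


def suggest_buffer_pool_bytes(total_ram_bytes: int, percent: int = 80) -> int:
    """Return `percent` of RAM, rounded down to a whole MiB.

    Hypothetical helper: the 80% default assumes a dedicated DB server.
    """
    if not 0 < percent < 100:
        raise ValueError("percent must be between 1 and 99")
    raw = total_ram_bytes * percent // 100
    return raw // MIB * MIB


# Example value only: a machine with 16 GiB of RAM.
size = suggest_buffer_pool_bytes(16 * GIB)
print(f"innodb_buffer_pool_size = {size // MIB}M")
```

The printed line can be pasted into the [mysqld] section of my.cnf once the percentage has been adjusted for whatever else runs on the machine.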

In my.cnf, set these values:
innodb_buffer_pool_size = 12G
innodb_buffer_pool_instances = 12
innodb_page_cleaners = 12

Related

mysql server log file is increasing in size with warnings

Version: MySQL server version is 8.0, installed on Windows Server 2019.
Problem statement: The error log file in C:\programdata\mysql80\data\ keeps growing and is now about 100GB.
Error log statements are :
[Warning] [MY-011959] [InnoDB] Difficult to find free blocks in the buffer pool (6091 search iterations)! 6091 failed attempts to flush a page! Consider increasing the buffer pool size. It is also possible that in your Unix version fsync is very slow, or completely frozen inside the OS kernel. Then upgrading to a newer version of your operating system may help. Look at the number of fsyncs in diagnostic info below. Pending flushes (fsync) log: 0; buffer pool: 0. 1738 OS file reads, 217 OS file writes, 45 OS fsyncs. Starting InnoDB Monitor to print further diagnostics to the standard output.
Scenario:
Updates larger than 4MB were failing, so we changed "max_allowed_packet" from the default 4M to 256M in the "mysql.ini" file.
After this change we restarted the mysqld service, and everything worked.
But 2 days later we started seeing the issue described above; as a result the disk is filling up and DB connections are stuck waiting in a queue.
We stopped the mysql80 service, deleted the error.log file, and reverted max_allowed_packet to its default size.
Even after that, the error file is created again with the same warning.
What could be the cause, and how can we fix it?

Mysql Show InnoDB status not reporting buffer pool stats correctly on Linux

We have multiple slaves running MySQL 5.7, some on Linux (CentOS 7) and some on Windows. We're trying to diagnose an issue where our Linux boxes randomly start falling behind with no long-running queries, locks, or dramatic increases in writes and reads.
Our error logs on the Linux boxes are filled with "InnoDB: page_cleaner: 1000ms intended loop took x ms. The settings might not be optimal." messages, which suggests that we're flushing a lot of dirty pages from the buffer pool.
When viewing the innodb status:
SHOW ENGINE INNODB STATUS
Our Windows boxes show non-zero values:
However, the Linux machines' buffer pool status shows 0.00 for youngs/s, non-youngs/s, reads/s etc., which is not true.
Any ideas how to get the Linux boxes to report correctly?
Lowering innodb_lru_scan_depth (which is probably 1024 now) to a much smaller value, say 100 (the minimum the server accepts), may help performance and may make those messages go away. (Please report back on what results you get.)

What is the best mysql configuration for mysql instance with a lot of databases and lot of tables inside?

I have a MySQL instance with more than 3000 databases. Each database contains more than 200 tables, and all these databases currently hold more than 100 GB of data. I am using Windows Server 2012 R2 with 4 GB of RAM. The server's RAM utilization was always very high, so I tried to restart the system, but the restart is not working: it shows "restarting" for a long time and never completes. When I checked the logs, I understood that there is a memory issue. I want to restart my MySQL instance and continue. What is the best MySQL configuration for this architecture? What do I need to do to make this work without failures in the future?
[Warning] InnoDB: Difficult to find free blocks in the buffer pool (1486 search iterations)! 1486 failed attempts to flush a page! Consider increasing the buffer pool size. It is also possible that in your Unix version fsync is very slow, or completely frozen inside the OS kernel. Then upgrading to a newer version of your operating system may help. Look at the number of fsyncs in diagnostic info below. Pending flushes (fsync) log: 0; buffer pool: 0. 26099 OS file reads, 1 OS file writes, 1 OS fsyncs. Starting InnoDB Monitor to print further diagnostics to the standard output.

How to solve mysql warning: "InnoDB: page_cleaner: 1000ms intended loop took XXX ms. The settings might not be optimal "?

I ran a MySQL import, mysql dummyctrad < dumpfile.sql, on the server and it's taking too long to complete. The dump file is about 5G. The server runs CentOS 6 with 16G of memory and 8 cores, and MySQL 5.7 x64.
Are the status "Waiting for table flush" and the following message normal? InnoDB: page_cleaner: 1000ms intended loop took 4013ms. The settings might not be optimal
MySQL log contents:
2016-12-13T10:51:39.909382Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 4013ms. The settings might not be optimal. (flushed=1438 and evicted=0, during the time.)
2016-12-13T10:53:01.170388Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 4055ms. The settings might not be optimal. (flushed=1412 and evicted=0, during the time.)
2016-12-13T11:07:11.728812Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 4008ms. The settings might not be optimal. (flushed=1414 and evicted=0, during the time.)
2016-12-13T11:39:54.257618Z 3274915 [Note] Aborted connection 3274915 to db: 'dummyctrad' user: 'root' host: 'localhost' (Got an error writing communication packets)
Processlist:
mysql> show processlist \G;
*************************** 1. row ***************************
Id: 3273081
User: root
Host: localhost
db: dummyctrad
Command: Field List
Time: 7580
State: Waiting for table flush
Info:
*************************** 2. row ***************************
Id: 3274915
User: root
Host: localhost
db: dummyctrad
Command: Query
Time: 2
State: update
Info: INSERT INTO `radacct` VALUES (351318325,'kxid ge:7186','abcxyz5976c','user100
*************************** 3. row ***************************
Id: 3291591
User: root
Host: localhost
db: NULL
Command: Query
Time: 0
State: starting
Info: show processlist
*************************** 4. row ***************************
Id: 3291657
User: remoteuser
Host: portal.example.com:32800
db: ctradius
Command: Sleep
Time: 2
State:
Info: NULL
4 rows in set (0.00 sec)
Update-1
mysqlforum ,innodb_lru_scan_depth
Changing innodb_lru_scan_depth to 256 improved the execution time of the insert queries, and the warning no longer appears in the log. The default was innodb_lru_scan_depth=1024.
SET GLOBAL innodb_lru_scan_depth=256;
InnoDB: page_cleaner: 1000ms intended loop took 4013ms. The settings might not be optimal. (flushed=1438 and evicted=0, during the time.)
The problem is typical of a MySQL instance where you have a high rate of changes to the database. By running your 5GB import, you're creating dirty pages rapidly. As dirty pages are created, the page cleaner thread is responsible for copying dirty pages from memory to disk.
In your case, I assume you don't do 5GB imports all the time. So this is an exceptionally high rate of data load, and it's temporary. You can probably disregard the warnings, because InnoDB will gradually catch up.
Here's a detailed explanation of the internals leading to this warning.
Once per second, the page cleaner scans the buffer pool for dirty pages to flush from the buffer pool to disk. The warning you saw shows that it has lots of dirty pages to flush, and it takes over 4 seconds to flush a batch of them to disk, when it should complete that work in under 1 second. In other words, it's biting off more than it can chew.
You adjusted this by reducing innodb_lru_scan_depth from 1024 to 256. This reduces how far into the buffer pool the page cleaner thread searches for dirty pages during its once-per-second cycle. You're asking it to take smaller bites.
Note that if you have many buffer pool instances, it'll cause flushing to do more work. It bites off innodb_lru_scan_depth amount of work for each buffer pool instance. So you might have inadvertently caused this bottleneck by increasing the number of buffer pools without decreasing the scan depth.
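To see why the number of buffer pool instances matters, here is a back-of-the-envelope sketch. The helper name is hypothetical and the 8 instances are only an example value; the point is that the scan depth applies per instance, so the upper bound on pages examined per cycle is the product of the two settings.

```python
def max_pages_scanned_per_cycle(lru_scan_depth: int, buffer_pool_instances: int) -> int:
    """Upper bound on LRU pages the page cleaner examines in one cycle.

    innodb_lru_scan_depth is applied to each buffer pool instance,
    so the total grows with the number of instances.
    """
    return lru_scan_depth * buffer_pool_instances


# Example: 8 buffer pool instances, before and after lowering the scan depth.
for depth in (1024, 256):
    total = max_pages_scanned_per_cycle(depth, 8)
    print(f"innodb_lru_scan_depth={depth}: up to {total} pages per cycle")
```

With the same number of instances, going from 1024 to 256 cuts that upper bound to a quarter.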
The documentation for innodb_lru_scan_depth says "A setting smaller than the default is generally suitable for most workloads." It sounds like they gave this option a value that's too high by default.
You can place a limit on the IOPS used by background flushing, with the innodb_io_capacity and innodb_io_capacity_max options. The first option is a soft limit on the I/O throughput InnoDB will request. But this limit is flexible; if flushing is falling behind the rate of new dirty page creation, InnoDB will dynamically increase flushing rate beyond this limit. The second option defines a stricter limit on how far InnoDB might increase the flushing rate.
If the rate of flushing can keep up with the average rate of creating new dirty pages, then you'll be okay. But if you consistently create dirty pages faster than they can be flushed, eventually your buffer pool will fill up with dirty pages, until the dirty pages exceed innodb_max_dirty_pages_pct of the buffer pool. At this point, the flushing rate will automatically increase, and may again cause the page_cleaner to send warnings.
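A rough way to reason about this is to estimate how long a sustained imbalance can last before the dirty-page limit is reached. The sketch below is a simplified model, not how InnoDB computes anything: it assumes constant rates, an initially clean buffer pool, the default 16KB page size, and no adaptive flushing. The function name and all the numbers are made up for illustration.

```python
PAGE_SIZE = 16 * 1024  # InnoDB's default page size (innodb_page_size)


def seconds_until_dirty_limit(pool_bytes, max_dirty_pct, dirtied_per_s, flushed_per_s):
    """Estimate seconds until dirty pages reach max_dirty_pct of the pool.

    Returns None when flushing keeps up with (or outpaces) page dirtying.
    Simplified model: constant rates, pool starts with no dirty pages.
    """
    net_per_s = dirtied_per_s - flushed_per_s
    if net_per_s <= 0:
        return None
    limit_pages = pool_bytes // PAGE_SIZE * max_dirty_pct / 100
    return limit_pages / net_per_s


# Example values only: 1 GiB pool, 75% limit,
# 2000 pages dirtied per second against 1500 flushed per second.
print(seconds_until_dirty_limit(1024 ** 3, 75, 2000, 1500))
```

A short burst like a one-off import may never reach the limit, while a steady write load that outruns flushing will get there sooner or later.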
Another solution would be to put MySQL on a server with faster disks. You need an I/O system that can handle the throughput demanded by your page flushing.
If you see this warning all the time under average traffic, you might be trying to do too many write queries on this MySQL server. It might be time to scale out, and split the writes over multiple MySQL instances, each with their own disk system.
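To judge whether the warning shows up "all the time" or only during bulk loads, it can help to pull the numbers out of the error log. This is a minimal sketch, assuming the message format shown in the question; the function names are hypothetical and the sample lines are copied from the log above.

```python
import re

# Matches the page_cleaner note as it appears in the MySQL 5.7 log above.
PAGE_CLEANER_RE = re.compile(
    r"page_cleaner: (?P<intended>\d+)ms intended loop took (?P<took>\d+)ms\."
    r".*\(flushed=(?P<flushed>\d+) and evicted=(?P<evicted>\d+)"
)


def parse_page_cleaner(line):
    """Return the numbers from one page_cleaner line, or None if it is another message."""
    m = PAGE_CLEANER_RE.search(line)
    if m is None:
        return None
    return {name: int(value) for name, value in m.groupdict().items()}


def summarize(lines):
    """Count page_cleaner warnings and report the slowest loop."""
    hits = [p for p in map(parse_page_cleaner, lines) if p is not None]
    if not hits:
        return None
    return {"warnings": len(hits), "worst_ms": max(h["took"] for h in hits)}


sample = [
    "2016-12-13T10:51:39.909382Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 4013ms. The settings might not be optimal. (flushed=1438 and evicted=0, during the time.)",
    "2016-12-13T10:53:01.170388Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 4055ms. The settings might not be optimal. (flushed=1412 and evicted=0, during the time.)",
    "2016-12-13T11:39:54.257618Z 3274915 [Note] Aborted connection 3274915 to db: 'dummyctrad' user: 'root' host: 'localhost' (Got an error writing communication packets)",
]
print(summarize(sample))
```

Running summarize over the lines of the real error log, grouped by hour or by day, shows whether the warnings cluster around imports or are spread across normal traffic.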
Read more about the page cleaner:
Introducing page_cleaner thread in InnoDB (archived copy)
MySQL-5.7 improves DML oriented workloads
The bottleneck is writing data to disk, whatever kind of drive you have: SSD, a regular HDD, NVMe, etc.
Note that this solution applies mostly to InnoDB.
I had the same problem and applied a few solutions.
1st: checking what's wrong
atop -d will show you disk usage. If the disk is 'busy', try to stop all queries to the database (but don't stop the MySQL server service!).
To monitor how many queries are running, use mytop, innotop or an equivalent.
If you have 0 queries but disk usage is STILL close to 100% for a few seconds or minutes, it means the MySQL server is trying to flush dirty pages / do some cleaning, as mentioned before (see Bill Karwin's great post).
THEN you can try to apply the following solutions:
2nd: hardware optimisation
If your array is not in RAID 1+0, consider moving to it to roughly double the write speed. Try to increase your disk controller's write capacity. Try to use SSDs or faster HDDs. Whether you can apply this solution depends on your hardware and budget.
3rd: software tuning
If the hardware controller is working fine but you want to speed up writes further, you can set the following in the MySQL config file:
3.1.
innodb_flush_log_at_trx_commit = 2 -> if you're using InnoDB tables. In my experience it works best with one file per table:
innodb_file_per_table = 1
3.2.
continuing with InnoDB:
innodb_flush_method = O_DIRECT
innodb_doublewrite = 0
innodb_support_xa = 0
innodb_checksums = 0
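The lines above generally reduce the amount of data that has to be written to disk, so performance is better. The trade-off is crash safety: for example, innodb_doublewrite = 0 removes the protection against partially written pages.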
3.3
general_log = 0
slow_query_log = 0
The lines above disable the general and slow query logs, which are yet more data that would otherwise be written to disk.
3.4
Check again what's happening, e.g. using:
tail -f /var/log/mysql/error.log
4th: general notes
This was tested under MySQL 5.6 AND 5.7.22
OS: Debian 9
RAID: 1 + 0 SSD drives
Database: InnoDB tables
innodb_buffer_pool_size = 120G
innodb_buffer_pool_instances = 8
innodb_read_io_threads = 64
innodb_write_io_threads = 64
Total amount of RAM in server: 200GB
After doing that you may observe higher CPU usage; that's normal, because data is written faster, so the CPU has more work to do.
If you make these changes in my.cnf, don't forget to restart the MySQL server.
5th: supplement
Being intrigued, I also tried this tweak:
SET GLOBAL innodb_lru_scan_depth=256;
mentioned above.
Working with big tables, I saw no change in performance.
After the changes above I didn't get rid of the warnings; however, the whole system is working significantly faster.
Everything above is just experimentation, but I measured the results and it helped me a little, so hopefully it will be useful for others.
This can simply be indicative of poor filesystem performance in general, a symptom of an unrelated problem. In my case I spent an hour researching this, analyzing my system logs, and had nearly reached the point of tweaking the MySQL config, when I decided to check with my cloud-based hosting provider. It turns out there were "abusive I/O spikes from a neighbor," which my host quickly resolved after I brought it to their attention.
My recommendation is to know your baseline / expected filesystem performance, stop MySQL, and measure your filesystem performance to determine if there are more fundamental problems unrelated to MySQL.
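One way to get such a baseline is to time small synchronous writes, since that is roughly the pattern the InnoDB warnings complain about. The following is a minimal sketch, not a proper benchmark: the function name, the 50 writes, and the 16KB block size are arbitrary choices for illustration, and dedicated tools will give more reliable numbers.

```python
import os
import tempfile
import time


def measure_fsync_latency(directory, writes=50, block_size=16 * 1024):
    """Return the average seconds per write+fsync of block_size bytes.

    A temporary file is created in `directory` and removed afterwards.
    """
    payload = b"\0" * block_size
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        for _ in range(writes):
            os.write(fd, payload)
            os.fsync(fd)  # force the data to stable storage each time
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.remove(path)
    return elapsed / writes


if __name__ == "__main__":
    # The temp directory is only a placeholder; point this at a directory
    # on the same volume as the MySQL data directory.
    avg = measure_fsync_latency(tempfile.gettempdir())
    print(f"average write+fsync latency: {avg * 1000:.3f} ms")
```

Running it a few times with MySQL stopped, and again at different times of day, gives a number to compare against when the warnings appear.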

Gearmand Server Completely Locks Up

Running Gearmand 1.1.8 on a CentOS AWS VM using MySQL as storage for the queue, every few hours Gearmand suddenly spins out of control, 100% CPU, and sucks up most of the memory on the small instance.
We are currently in testing and not production; all the messages we send to it are relatively small and well formed, the biggest of which is approx. 15 megabytes.
We initially hit a wall with the bigger requests, but increased MySQL's max packet size and InnoDB's log file size because it was barking about both:
[mysqld]
max_allowed_packet=20M
innodb_log_file_size=300M
We are using the recommended schema (http://gearman.info/gearmand/queues/mysql.html) in InnoDB (the schema doesn't specify a storage engine).
We have 20 workers connected to the gearmand server, and it ran FINE stacking up hundreds of thousands of messages until the last few days.
While this is going on there is no network traffic in or out. We have a status dashboard (see: http://jdon.at/kmCU) which completely locks up waiting for a STATUS response from the Gearman server.
Any help would be greatly appreciated! Thanks
The problem was a bug in the version we were using, 1.1.8, whereby the gearmand queue server would crash upon receiving a WORK_EXCEPTION. Upgrading to 1.1.11 corrected the issue.
see: https://bugs.launchpad.net/gearmand/+bug/1164997