Why is MySQL consuming so much memory?

I have a MySQL 5.6.36 database, ~35 GB in size, running on CentOS 7.3 with 48 GB of RAM.
[UPDATE 17-08-06] I will update relevant information here.
I am seeing that my server runs out of memory and crashes even with ~48 GB of RAM. I could not keep it running on 24 GB, for example. A DB this size should be able to run on much less. Clearly, I am missing something fundamental.
[UPDATE: 17-08-05] By crashes, I mean mysqld stops and restarts with no useful information in the log, other than restarting from a crash. Also, with all this memory, I got this error during recovery:
[ERROR] InnoDB: space header page consists of zero bytes in tablespace ./ca_uim/t_qos_snapshot.ibd (table ca_uim/t_qos_snapshot)
The relevant portion of my config file looks like this [EDITED 17-08-05 to add missing lines]:
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
lower_case_table_names = 1
symbolic-links=0
sql_mode=NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES
max_allowed_packet = 32M
max_connections = 300
table_definition_cache=2000
innodb_buffer_pool_size = 18G
innodb_buffer_pool_instances = 9
innodb_log_file_size = 1G
innodb_file_per_table=1
[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
It was an oversight to use file per table, and I need to change that (I have 6000 tables, and most of those are partitioned).
After running for a short while (one hour), mytop shows this:
MySQL on 10.238.40.209 (5.6.36) load 0.95 1.08 1.01 1/1003 8525 up 0+01:31:01 [17:44:39]
Queries: 1.5M qps: 283 Slow: 22.0 Se/In/Up/De(%): 50/07/09/01
Sorts: 27 qps now: 706 Slow qps: 0.0 Threads: 118 ( 3/ 2) 43/28/01/00
Key Efficiency: 100.0% Bps in/out: 76.7k/176.8k Now in/out: 144.3k/292.1k
And free shows this:
# free -h
total used free shared buff/cache available
Mem: 47G 40G 1.5G 8.1M 5.1G 6.1G
Swap: 3.9G 508K 3.9G
Top shows this:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2010 mysql 20 0 45.624g 0.039t 9008 S 95.0 84.4 62:31.93 mysqld
How can this be? Is this related to file-per-table? The entire DB could fit in memory. What am I doing wrong?

In your my.cnf (MySQL configuration) file:
Add a setting in the [mysqld] block:
[mysqld]
performance_schema = 0
For MySQL 5.7.8 onwards, you will have to add extra settings as below:
[mysqld]
performance_schema = 0
show_compatibility_56 = 1
NOTE: This can cut your memory usage by 50%-60%. "show_compatibility_56" is optional; it helps in some cases, so it is better to check the effect once it is added to the config file.
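If you want to verify the effect, a quick hedged check using standard statements (the last row of the engine status output, performance_schema.memory, totals what the Performance Schema has allocated):
SHOW VARIABLES LIKE 'performance_schema';
SHOW ENGINE PERFORMANCE_SCHEMA STATUS;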

Well, I resolved the issue. I appreciate all the insight from those who responded. The solution is very strange, and I cannot explain why this solves the problem, but it does. What I did was add the following line to my.cnf:
log_bin
You may, in addition, need to add the following:
expire_logs_days = <some number>
We have seen at least one instance where the logs accumulated and filled up a disk. The default is 0 (no auto removal). https://dev.mysql.com/doc/refman/5.7/en/server-system-variables.html#sysvar_expire_logs_days
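For reference, a minimal my.cnf sketch of how this ends up looking; the retention value is only an example, and server_id is required alongside log_bin on MySQL 5.7+ (it is harmless to set on 5.6):
[mysqld]
log_bin
server_id = 1
expire_logs_days = 7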

Results are stored in and served from memory, and given that you're running 283 queries per second, there's probably a lot of data being dished out at any given moment.
I would think that you are doing a good job squeezing a lot out of that server. Consider that the tables are one thing, then there is the schema involved for 6000 tables, plus the fact that you're pulling 283 queries per second against a 35 GB database, and that those results are held in memory while they are being served. The rest of us might as well learn from you.
Regarding the stopping and restarting of MySQL
[ERROR] InnoDB: space header page consists of zero bytes in tablespace ./ca_uim/t_qos_snapshot.ibd (table ca_uim/t_qos_snapshot)
You might consider trying
innodb_flush_method = normal, which is recommended here and here, but I can't promise it will work.

I would check table_open_cache. You have a lot of tables, and it is clearly reflected in the average opened files per second: about 48, when a normal value is between 1 and 5.
That is confirmed by the values of Table_open_cache_misses and Table_open_cache_overflows; ideally those values should be zero. They indicate failed attempts to use the cache and, in consequence, wasted memory.
You should try increasing it to at least 3000 and observe the results, for example as in the check below.
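A hedged way to confirm and adjust, using standard 5.6 variables (table_open_cache is dynamic, but also persist the change in my.cnf):
SHOW GLOBAL VARIABLES LIKE 'table_open_cache';
SHOW GLOBAL STATUS LIKE 'Table_open_cache_%';
SHOW GLOBAL STATUS LIKE 'Opened_tables';
SET GLOBAL table_open_cache = 3000;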
Since you are on CentOS:
I would double-check that ulimit is unlimited or at least around 20000 for your 6000 tables.
Consider setting swappiness to 1. I think it is better to have some swapping (while you observe) than crashes.
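A rough sketch for CentOS 7 (the sysctl.d file name is just an example):
# check the limit the running mysqld actually has, and the current swappiness
cat /proc/$(pidof mysqld)/limits | grep 'open files'
cat /proc/sys/vm/swappiness
# lower swappiness now and persist it across reboots
sysctl -w vm.swappiness=1
echo 'vm.swappiness = 1' > /etc/sysctl.d/99-mysql.conf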

Hoping you are a believer in making ONLY one change at a time so you can track progress back to a specific configuration change. On 2017-08-07 at about 17:00, SHOW GLOBAL VARIABLES indicates innodb_buffer_pool_size is 128M. Change it in my.cnf to 24G and shutdown/restart when permitted, please.
A) max_allowed_packet at 1G is likely what you meant in your configuration, considering that on 8/7/2017 your remote agents are sending 1G packets for processing on this equipment. How are remote agents managed in terms of scheduling their sending of data, to prevent exhausting all 48G on this host for this single use of memory? Status indicates bytes_received on 8/6/2017 was 885,485,832 from max_used_connections of 86 in the first 1,520 seconds of uptime.
B) innodb_io_capacity at 200 is likely a significant throttle to your possible IOPS, we run here at 700. sqlio.exe utility was used to guide us in this direction.
C) innodb_io_capacity_max should likely be adjusted as well.
D) thread_cache_size of 11, consider going to 128.
E) thread_concurrency of 10, consider going to 30.
F) I understand the length of process-list.txt, in terms of the number of Sleep IDs, is likely caused by the use of persistent connections. The connection is just waiting for some additional activity from the client for an extended period of time. 8/8/2017
G) STATUS Com_begin count is usually very close to Com_commit count, not in your case. 8/8/2017 Com_begin was 2 and Com_commit was 709,910 for 11 hours of uptime.
H) It would probably be helpful to see just 3 minutes of a General Log, if possible (a capture sketch follows below).
Keep me posted on your progress.
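For item H, a hedged sketch for grabbing a short general-log sample; the file path is only an example, and at ~283 qps expect the file to grow fast, so switch it off promptly:
SET GLOBAL general_log_file = '/var/lib/mysql/general-sample.log';
SET GLOBAL general_log = 'ON';
-- wait about 3 minutes, then:
SET GLOBAL general_log = 'OFF';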

Please enable the MySQL error log in your usual configuration.
When MySQL crashes, protect the error log before restarting, and please add the last available error log to your question. It should have a clue as to WHY MySQL is failing.
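A minimal sketch of the my.cnf line, assuming you keep the same path already used under [mysqld_safe]:
[mysqld]
log_error = /var/log/mysqld.log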
The 'small' configuration will run like a dog when supporting the volume of activity reported by SHOW GLOBAL STATUS.
Please get back to your usual production configuration.
I am looking at your provided details and will have some tuning suggestions in the next 24 hours. It appears most of the process-list activities are related to replication. Would that be true?

Use of www.mysqlcalculator.com would be a quick way to get a brain check on about a dozen memory consumption factors in less than 2 minutes.
118 active threads may be reasonable but would seem to be causing extreme context switching trying to answer 118 questions simultaneously.
Would love to see your SHOW GLOBAL STATUS and SHOW GLOBAL VARIABLES, if you could get them posted.
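If it helps, a couple of one-liners to capture them for posting (credentials and file names are just examples):
mysql -u root -p -e 'SHOW GLOBAL VARIABLES;' > global_variables.txt
mysql -u root -p -e 'SHOW GLOBAL STATUS;' > global_status.txt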

Related

MySQL crashes often

I have a droplet on DigitalOcean created using Laravel Forge, and for the past few days the MySQL server just crashes; the only way to make it work again is to reboot the server (MySQL makes the server unresponsive).
When I run htop to see the list of processes, it shows several instances of /usr/sbin/mysqld --daemonize --pid-file=/run/mysqld/mysql.pid (currently 33 of them).
The error log is bigger than 1GB (yes, I know!) and shows this message hundreds of times:
[Warning] InnoDB: Difficult to find free blocks in the buffer pool (21 search iterations)! 21 failed attempts to flush a page! Consider increasing the buffer pool size. It is also possible that in your Unix version fsync is very slow, or completely frozen inside the OS kernel. Then upgrading to a newer version of your operating system may help. Look at the number of fsyncs in diagnostic info below. Pending flushes (fsync) log: 0; buffer pool: 0. 167678974 OS file reads, 2271392 OS file writes, 758043 OS fsyncs. Starting InnoDB Monitor to print further diagnostics to the standard output.
This droplet has been running for 6 months, but this problem only started last week. The only thing that changed recently is that we now send weekly notifications to customers (only the ones that subscribed) to let them know about certain events happening in the current week. This is kind of an intensive process, because we have a few thousand customers, but we take advantage of Laravel Queues in order to process everything.
Is this a MySQL-settings related issue?
Try increasing innodb_buffer_pool_size in my.cnf
The recommendation for a dedicated DB server is 80% of RAM; if you're already at that level, then you should consider moving to a bigger instance type.
In my.cnf, set these values:
innodb_buffer_pool_size = 12G
innodb_buffer_pool_instances = 12
innodb_page_cleaners = 12
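Before and after resizing, it may be worth watching buffer-pool pressure with these standard counters; a climbing Innodb_buffer_pool_wait_free matches the "difficult to find free blocks" warning:
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_wait_free';
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages%';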

How to solve the MySQL warning: "InnoDB: page_cleaner: 1000ms intended loop took XXX ms. The settings might not be optimal"?

I ran a MySQL import, mysql dummyctrad < dumpfile.sql, on the server and it's taking too long to complete. The dump file is about 5 GB. The server is CentOS 6 with 16 GB of memory and 8-core processors, running MySQL 5.7 x64.
Are these normal messages/statuses: "Waiting for table flush" and the message "InnoDB: page_cleaner: 1000ms intended loop took 4013ms. The settings might not be optimal"?
mysql log contents
2016-12-13T10:51:39.909382Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 4013ms. The settings might not be optimal. (flushed=1438 and evicted=0, during the time.)
2016-12-13T10:53:01.170388Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 4055ms. The settings might not be optimal. (flushed=1412 and evicted=0, during the time.)
2016-12-13T11:07:11.728812Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 4008ms. The settings might not be optimal. (flushed=1414 and evicted=0, during the time.)
2016-12-13T11:39:54.257618Z 3274915 [Note] Aborted connection 3274915 to db: 'dummyctrad' user: 'root' host: 'localhost' (Got an error writing communication packets)
Processlist:
mysql> show processlist \G;
*************************** 1. row ***************************
Id: 3273081
User: root
Host: localhost
db: dummyctrad
Command: Field List
Time: 7580
State: Waiting for table flush
Info:
*************************** 2. row ***************************
Id: 3274915
User: root
Host: localhost
db: dummyctrad
Command: Query
Time: 2
State: update
Info: INSERT INTO `radacct` VALUES (351318325,'kxid ge:7186','abcxyz5976c','user100
*************************** 3. row ***************************
Id: 3291591
User: root
Host: localhost
db: NULL
Command: Query
Time: 0
State: starting
Info: show processlist
*************************** 4. row ***************************
Id: 3291657
User: remoteuser
Host: portal.example.com:32800
db: ctradius
Command: Sleep
Time: 2
State:
Info: NULL
4 rows in set (0.00 sec)
Update-1
References: MySQL forum, innodb_lru_scan_depth
Changing the innodb_lru_scan_depth value to 256 improved the insert queries' execution time, and there is no longer a warning message in the log; the default was innodb_lru_scan_depth=1024.
SET GLOBAL innodb_lru_scan_depth=256;
InnoDB: page_cleaner: 1000ms intended loop took 4013ms. The settings might not be optimal. (flushed=1438 and evicted=0, during the time.)
The problem is typical of a MySQL instance where you have a high rate of changes to the database. By running your 5GB import, you're creating dirty pages rapidly. As dirty pages are created, the page cleaner thread is responsible for copying dirty pages from memory to disk.
In your case, I assume you don't do 5GB imports all the time. So this is an exceptionally high rate of data load, and it's temporary. You can probably disregard the warnings, because InnoDB will gradually catch up.
Here's a detailed explanation of the internals leading to this warning.
Once per second, the page cleaner scans the buffer pool for dirty pages to flush from the buffer pool to disk. The warning you saw shows that it has lots of dirty pages to flush, and it takes over 4 seconds to flush a batch of them to disk, when it should complete that work in under 1 second. In other words, it's biting off more than it can chew.
You adjusted this by reducing innodb_lru_scan_depth from 1024 to 256. This reduces how far into the buffer pool the page cleaner thread searches for dirty pages during its once-per-second cycle. You're asking it to take smaller bites.
Note that if you have many buffer pool instances, it'll cause flushing to do more work. It bites off innodb_lru_scan_depth amount of work for each buffer pool instance. So you might have inadvertently caused this bottleneck by increasing the number of buffer pools without decreasing the scan depth.
The documentation for innodb_lru_scan_depth says "A setting smaller than the default is generally suitable for most workloads." It sounds like they gave this option a value that's too high by default.
You can place a limit on the IOPS used by background flushing, with the innodb_io_capacity and innodb_io_capacity_max options. The first option is a soft limit on the I/O throughput InnoDB will request. But this limit is flexible; if flushing is falling behind the rate of new dirty page creation, InnoDB will dynamically increase flushing rate beyond this limit. The second option defines a stricter limit on how far InnoDB might increase the flushing rate.
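As a hedged example, both options are dynamic, so you can inspect and raise them without a restart; the values below are only illustrative and should be sized to what your storage can actually sustain:
SHOW GLOBAL VARIABLES LIKE 'innodb_io_capacity%';
SET GLOBAL innodb_io_capacity = 400;
SET GLOBAL innodb_io_capacity_max = 2000;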
If the rate of flushing can keep up with the average rate of creating new dirty pages, then you'll be okay. But if you consistently create dirty pages faster than they can be flushed, eventually your buffer pool will fill up with dirty pages, until the dirty pages exceed innodb_max_dirty_page_pct of the buffer pool. At this point, the flushing rate will automatically increase, and may again cause the page_cleaner to send warnings.
Another solution would be to put MySQL on a server with faster disks. You need an I/O system that can handle the throughput demanded by your page flushing.
If you see this warning all the time under average traffic, you might be trying to do too many write queries on this MySQL server. It might be time to scale out, and split the writes over multiple MySQL instances, each with their own disk system.
Read more about the page cleaner:
Introducing page_cleaner thread in InnoDB (archived copy)
MySQL-5.7 improves DML oriented workloads
The bottleneck is writing data to disk, whatever storage you have: SSD, a normal spinning disk, NVMe, etc.
Note that this solution applies mostly to InnoDB.
I had the same problem, and I applied a few solutions.
1st: checking what's wrong
atop -d will show you disk usage. If the disk is 'busy', then try to stop all queries to the database (but don't stop the MySQL server service!).
To monitor how many queries you do have, use mytop, innotop or equivalent.
If you have 0 queries but disk usage is STILL close to 100% for a few seconds or minutes, it means the MySQL server is trying to flush dirty pages or do some cleaning, as mentioned before (see Bill Karwin's great post).
THEN you can try to apply the following solutions:
2nd: hardware optimisation
If your array is not RAID 1+0, consider that kind of setup to roughly double write speed. Try to improve your disk controller's write capabilities. Try to use an SSD or a faster disk. Applying this solution depends on your hardware and budget and may vary.
3rd: software tuning
If the hardware controller is working fine but you want to increase write speed, you can set the following in the MySQL config file:
3.1.
innodb_flush_log_at_trx_commit = 2 -> if you're using InnoDB tables. In my experience it works best with one table per file:
innodb_file_per_table = 1
3.2.
continuing with InnoDB:
innodb_flush_method = O_DIRECT
innodb_doublewrite = 0
innodb_support_xa = 0
innodb_checksums = 0
The lines above generally reduce the amount of data that has to be written to disk, so performance improves. Note that disabling the doublewrite buffer and checksums trades some crash safety and corruption detection for that speed.
3.3
general_log = 0
slow_query_log = 0
The lines above disable those logs; of course, they are yet more data that would otherwise be written to disk.
3.4
Check again what's happening, using e.g.
tail -f /var/log/mysql/error.log
4th: general notes
This was tested under MySQL 5.6 AND 5.7.22
OS: Debian 9
RAID: 1 + 0 SSD drives
Database: InnoDB tables
innodb_buffer_pool_size = 120G
innodb_buffer_pool_instances = 8
innodb_read_io_threads = 64
innodb_write_io_threads = 64
Total amount of RAM in server: 200GB
After doing that you may observe higher CPU usage; that's normal, because writing data is faster, so the CPU will work harder.
If you make these changes in my.cnf, of course, don't forget to restart the MySQL server.
5th: supplement
Being intrigued, I also tried the quirk mentioned above:
SET GLOBAL innodb_lru_scan_depth=256;
Working with big tables, I've seen no change in performance.
After the corrections above I didn't get rid of the warnings, but the whole system works significantly faster.
Everything above is just experimentation, but I have measured the results; it helped me a little, so hopefully it may be useful for others.
This can simply be indicative of poor filesystem performance in general, a symptom of an unrelated problem. In my case I spent an hour researching this, analyzing my system logs, and had nearly reached the point of tweaking the MySQL config when I decided to check with my cloud-based host. It turned out there were "abusive I/O spikes from a neighbor," which my host quickly resolved after I brought it to their attention.
My recommendation is to know your baseline / expected filesystem performance, stop MySQL, and measure your filesystem performance to determine if there are more fundamental problems unrelated to MySQL.
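As a rough sketch, with MySQL stopped, a crude sequential-write test like the one below gives a baseline (the service name, path, and size are just examples, and a proper tool such as fio gives far better numbers than dd):
systemctl stop mysqld
dd if=/dev/zero of=/var/lib/mysql-io-test bs=1M count=1024 oflag=direct
rm /var/lib/mysql-io-test
systemctl start mysqld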

Weird spikes in MySQL query times

I'm running Node.js with MySQL (InnoDB) for a game server (player info, save data, stuff). The server is HTTP(S) based, so nothing realtime.
I'm having these weird spikes, as you can see from the graphs below (the first graph is requests/sec and the last graph is queries/sec).
On the response time graph you can see max response times with purple and avg response times with blue. Even with those 10-20k peaks avg stays at 50-100ms as do 95% of the requests.
I've been digging around and found that the slow queries are nothing special: usually an update query with save data (a blob of ~2 KB) or a player profile update which modifies something like the username. No joins or anything like that. We're talking about tables with fewer than 100k rows.
Server is running in Azure on Ubuntu 14.04 with MySQL 5.7 using 4 cores and 7GB of RAM.
MySQL settings:
innodb_buffer_pool_size=4G
innodb_log_file_size=1G
innodb_buffer_pool_instances=4
innodb_log_buffer_size=4M
query_cache_type=0
tmp_table_size=64M
max_heap_table_size=64M
sort_buffer_size=32M
wait_timeout=300
interactive_timeout=300
innodb_file_per_table=ON
Edit: It turned out that the problem was never MySQL performance but Node.js performance before the SQL queries. More info here: Node.js multer and body-parser sometimes extremely slow
Check your swappiness (it's supposed to be 0 on MySQL machines that maximize RAM usage):
> sysctl -A | grep swap
vm.swappiness = 0
With only 7G of RAM and 4G of it just for the buffer pool, your machine will swap if swappiness is not zero.
Could you post your swap graph and used memory? A 4G buffer pool is "over the edge" for 7G of RAM. For 8G of RAM, I would give 3G, since you have about +1G on everything else MySQL-wise plus ~2G for the OS.
Also, you have 1G per transaction log file, and I assume you have two log files. Do you have enough writes to need files that large? You can use this guide (a sketch of the check follows below): https://www.percona.com/blog/2008/11/21/how-to-calculate-a-good-innodb-log-file-size/
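A sketch of that kind of check, using the standard Innodb_os_log_written counter: sample it twice under typical load and size the combined log files to cover roughly an hour of redo writes.
SHOW GLOBAL STATUS LIKE 'Innodb_os_log_written';
-- wait about 60 minutes under normal traffic, then:
SHOW GLOBAL STATUS LIKE 'Innodb_os_log_written';
-- innodb_log_file_size * innodb_log_files_in_group should roughly cover the difference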

Out of memory error from MySQL

We have a web application (RackTables) that's giving us grief on our production box. Whenever users try to run a search, it gives the following error:
Pdo exception: PDOException
SQLSTATE[HY000]: General error: 5 Out of memory (Needed 2057328 bytes) (HY000)
I cannot recreate the issue on our backup server. The servers match except for the fact that in production we have 16 GB of RAM and on the backup we have 8 GB. It's a moot point, though, because both are running 32-bit OSes and so are only using 4 GB of RAM. We have also set up a swap partition...
Here's what I get back from the "free -m" command in production:
prod:/etc# free -m
total used free shared buffers
Mem: 3294 1958 1335 0 118
-/+ buffers: 1839 1454
Swap: 3817 109 3707
prod:/etc#
I've checked to make sure that my.cnf on both boxes match. The database from production was replicated onto the backup server... so the data matches as well.
I guess our options are to:
A) convert the OS to 64-bit so we can use more RAM.
B) start tweaking some of the innodb settings in my.cnf.
But before I try either A or B, I wanted to know if there's anything else I should compare between the two servers... seeing how the backup is working just fine. There must be a difference somewhere that we are not accounting for.
Any suggestions would be appreciated.
I created a script to simulate load on the backup server and was then able to recreate the out-of-memory error message.
In the end, I added the "join_buffer_size" setting to my.cnf and set it to 3 MB. That resolved the issue.
P.S. I downloaded and ran tuning-primer.sh as well as mysqltuner.pl to narrow down the issues.
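For reference, the change amounts to something like this in my.cnf (value as described above):
[mysqld]
join_buffer_size = 3M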

MySQL my.cnf -- open-files-limit causing CPU Overload

I have a server with an Intel Xeon quad-core E3-1230v2 and 8 GB of DDR3 RAM. Around the clock I see that this server is running out of CPU; it looks badly overloaded. After observing the "Daily Process Log" I realized that the process below is eating 25% of the CPU resources, and there were three such processes (technically errors). Here is the process (error):
/usr/sbin/mysqld --basedir=/ --datadir=/var/lib/mysql --user=mysql --log-error=/var/lib/mysql/server.yacart.com.err --open-files-limit=16384 --pid-file=/var/lib/mysql/server.yacart.com.pid
As visible in the above error, it appears something is wrong with open-files-limit=16384. I tried increasing open_files_limit in my.cnf to 16384, but in vain. Below is what my my.cnf now looks like:
[mysqld]
innodb_file_per_table=1
local-infile=0
open_files_limit=9978
Can anyone advise a good configuration for my my.cnf that would help me get rid of the CPU overload?
There is a GoogleBot-like robot script I am running on slave servers to mine data from the internet; it's crawling the entire internet. When I shut down this script, everything returns to normal. I wonder if there is a fix I could apply to this script?
This robot program has about 40 databases, each 50-800 MB in size, with a total DB size of about 14 GB so far, and I expect this to grow to 500 GB in the future. At any one point (a whole day long) only ONE DB is used; the next day I use the next DB, and so on. I was thinking of increasing RAM once the biggest DB reaches 2 GB. Currently RAM does not seem to be an issue at all.
Thanks in advance for any help you guys can offer.
regards,
Sam
If you have WHM, look for this under Server Configuration >> Tweak Settings >> SQL
** Let cPanel determine the best value for your MySQL open_files_limit configuration?
cPanel will adjust the open_files_limit value during each MySQL restart depending on your total number of tables.
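To sanity-check what limit the server actually ended up with versus how many files it is using, these standard queries may help:
SHOW GLOBAL VARIABLES LIKE 'open_files_limit';
SHOW GLOBAL STATUS LIKE 'Open_files';
SHOW GLOBAL STATUS LIKE 'Opened_files';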