Lack of swap memory on MariaDB 10.4 Galera cluster

I have a 3-node Galera cluster with MariaDB 10.4.13. Each node has 32GB RAM and 2GB swap. Since my MySQL tuning about 1 month ago, each node's memory has been almost full, but I think that is OK. For the last few days, though, swap usage has been at its maximum and does not go down. My my.cnf looks like this:
####Slow logging
slow_query_log_file=/var/lib/mysql/mysql-slow.log
long_query_time=2
slow_query_log=ON
log_queries_not_using_indexes=ON
############ INNODB OPTIONS
innodb_buffer_pool_size=24000M
innodb_flush_log_at_trx_commit=2
innodb_file_per_table=1
innodb_data_file_path=ibdata1:100M:autoextend
innodb_read_io_threads=4
innodb_write_io_threads=4
innodb_doublewrite=1
innodb_log_file_size=6144M
innodb_log_buffer_size=96M
innodb_buffer_pool_instances=24
innodb_log_files_in_group=2
innodb_thread_concurrency=0
#### innodb_file_format = barracuda
innodb_flush_method = O_DIRECT
#### innodb_locks_unsafe_for_binlog = 1
innodb_autoinc_lock_mode=2
######## avoid statistics update when doing e.g show tables
innodb_stats_on_metadata=0
default_storage_engine=innodb
innodb_strict_mode = 0
#### OTHER THINGS, BUFFERS ETC
#### key_buffer_size = 24M
tmp_table_size = 1024M
max_heap_table_size = 1024M
max_allowed_packet = 512M
#### sort_buffer_size = 256K
#### read_buffer_size = 256K
#### read_rnd_buffer_size = 512K
#### myisam_sort_buffer_size = 8M
skip_name_resolve
memlock=0
sysdate_is_now=1
max_connections=500
thread_cache_size=512
query_cache_type = 1
query_cache_size = 512M
query_cache_limit=512K
join_buffer_size = 1M
table_open_cache = 116925
open_files_limit = 233850
table_definition_cache = 58863
table_open_cache_instances = 8
lower_case_table_names=0
With this configuration, I wanted MariaDB to use as much memory as possible, as long as it is not critical.
I would like to review this configuration, perhaps disable the query_cache part, and also adjust the InnoDB values. Please give me some recommendations, and also let me know whether the swap size is adequate, or whether I should prevent MySQL from using swap at all.

Sorry, I don't see much that is exciting here:
Analysis of GLOBAL STATUS and VARIABLES:
Observations:
Version: 10.4.13-MariaDB-log
32 GB of RAM
Uptime = 1d 15:19:41
You are not running on Windows.
Running 64-bit version
You appear to be running entirely (or mostly) InnoDB.
The More Important Issues:
Lower these to the suggested values:
table_open_cache = 10000
tmp_table_size = 200M
max_heap_table_size = 200M
query_cache_size = 0 -- the high value you have can cause mysterious slowdowns
max_connections = 200
thread_cache_size = 20
The I/O settings are suited to an HDD; do you have an SSD?
There are a lot of SHOW commands -- more than one per second. Perhaps some monitoring tool is excessively aggressive?
Why so many GRANTs?
Is this in a Galera cluster?
Details and other observations:
( Key_blocks_used * 1024 / key_buffer_size ) = 48 * 1024 / 128M = 0.04% -- Percent of key_buffer used. High-water-mark.
-- Lower key_buffer_size (now 134217728) to avoid unnecessary memory usage.
( table_open_cache ) = 116,660 -- Number of table descriptors to cache
-- Several hundred is usually good.
( Open_tables / table_open_cache ) = 4,439 / 116660 = 3.8% -- Cache usage (open tables + tmp tables)
-- Optionally lower table_open_cache (now 116660)
( innodb_buffer_pool_instances ) = 24 -- For large RAM, consider using 1-16 buffer pool instances, not allowing less than 1GB each. Also, not more than, say, twice the number of CPU cores.
-- Recommend no more than 16. (Beginning to go away in 10.5)
( innodb_lru_scan_depth * innodb_buffer_pool_instances ) = 1,024 * 24 = 24,576 -- A metric of CPU usage.
-- Lower either number.
( innodb_lru_scan_depth * innodb_page_cleaners ) = 1,024 * 4 = 4,096 -- Amount of work for page cleaners every second.
-- "InnoDB: page_cleaner: 1000ms intended loop took ..." may be fixable by lowering lru_scan_depth: Consider 1000 / innodb_page_cleaners (now 4). Also check for swapping.
( innodb_page_cleaners / innodb_buffer_pool_instances ) = 4 / 24 = 0.167 -- innodb_page_cleaners
-- Recommend setting innodb_page_cleaners (now 4) to innodb_buffer_pool_instances (now 24)
(Beginning to go away in 10.5)
( innodb_lru_scan_depth ) = 1,024
-- "InnoDB: page_cleaner: 1000ms intended loop took ..." may be fixed by lowering lru_scan_depth
( innodb_io_capacity ) = 200 -- When flushing, use this many IOPs.
-- Reads could be sluggish or spiky.
( Innodb_buffer_pool_pages_free / Innodb_buffer_pool_pages_total ) = 1,065,507 / 1538880 = 69.2% -- Pct of buffer_pool currently not in use
-- innodb_buffer_pool_size (now 25769803776) is bigger than necessary?
( innodb_io_capacity_max / innodb_io_capacity ) = 2,000 / 200 = 10 -- Capacity: max/plain
-- Recommend 2. Max should be about equal to the IOPs your I/O subsystem can handle. (If the drive type is unknown 2000/200 may be a reasonable pair.)
( Innodb_buffer_pool_bytes_data / innodb_buffer_pool_size ) = 7,641,841,664 / 24576M = 29.7% -- Percent of buffer pool taken up by data
-- A small percent may indicate that the buffer_pool is unnecessarily big.
( innodb_log_buffer_size ) = 96M -- Suggest 2MB-64MB, and at least as big as biggest blob set in transactions.
-- Adjust innodb_log_buffer_size (now 100663296).
( Uptime / 60 * innodb_log_file_size / Innodb_os_log_written ) = 141,581 / 60 * 6144M / 2470192128 = 6,154 -- Minutes between InnoDB log rotations. Beginning with 5.6.8, this can be changed dynamically; be sure to also change my.cnf.
-- (The recommendation of 60 minutes between rotations is somewhat arbitrary.) Adjust innodb_log_file_size (now 6442450944). (Cannot change in AWS.)
( default_tmp_storage_engine ) = default_tmp_storage_engine =
( innodb_flush_neighbors ) = 1 -- A minor optimization when writing blocks to disk.
-- Use 0 for SSD drives; 1 for HDD.
( innodb_io_capacity ) = 200 -- I/O ops per second capable on disk . 100 for slow drives; 200 for spinning drives; 1000-2000 for SSDs; multiply by RAID factor.
( sync_binlog ) = 0 -- Use 1 for added security, at some cost of I/O. =1 may lead to lots of "query end"; =0 may lead to "binlog at impossible position" and lost transactions in a crash, but is faster.
( innodb_print_all_deadlocks ) = innodb_print_all_deadlocks = OFF -- Whether to log all Deadlocks.
-- If you are plagued with Deadlocks, turn this on. Caution: If you have lots of deadlocks, this may write a lot to disk.
( min( tmp_table_size, max_heap_table_size ) ) = (min( 1024M, 1024M )) / 32768M = 3.1% -- Percent of RAM to allocate when needing MEMORY table (per table), or temp table inside a SELECT (per temp table per some SELECTs). Too high may lead to swapping.
-- Decrease tmp_table_size (now 1073741824) and max_heap_table_size (now 1073741824) to, say, 1% of ram.
( character_set_server ) = character_set_server = latin1
-- Charset problems may be helped by setting character_set_server (now latin1) to utf8mb4. That is the future default.
( local_infile ) = local_infile = ON
-- local_infile (now ON) = ON is a potential security issue
( query_cache_size ) = 512M -- Size of QC
-- Too small = not of much use. Too large = too much overhead. Recommend either 0 or no more than 50M.
( Qcache_hits / (Qcache_hits + Com_select) ) = 8,821 / (8821 + 5602645) = 0.16% -- Hit ratio -- SELECTs that used QC
-- Consider turning off the query cache.
( (query_cache_size - Qcache_free_memory) / Qcache_queries_in_cache / query_alloc_block_size ) = (512M - 48787272) / 224183 / 16384 = 0.133 -- query_alloc_block_size vs formula
-- Adjust query_alloc_block_size (now 16384)
( tmp_table_size ) = 1024M -- Limit on size of MEMORY temp tables used to support a SELECT
-- Decrease tmp_table_size (now 1073741824) to avoid running out of RAM. Perhaps no more than 64M.
( Com_admin_commands / Queries ) = 888,691 / 6680823 = 13.3% -- Percent of queries that are "admin" commands.
-- What's going on?
( Slow_queries / Questions ) = 438,188 / 6557866 = 6.7% -- Frequency (% of all queries)
-- Find slow queries; check indexes.
( log_queries_not_using_indexes ) = log_queries_not_using_indexes = ON -- Whether to include such in slowlog.
-- This clutters the slowlog; turn it off so you can see the real slow queries. And decrease long_query_time (now 2) to catch most interesting queries.
( Uptime_since_flush_status ) = 451 = 7m 31s -- How long (in seconds) since FLUSH STATUS (or server startup).
-- GLOBAL STATUS has not been gathered long enough to get reliable suggestions for many of the issues. Fix what you can, then come back in several hours.
( Max_used_connections / max_connections ) = 25 / 500 = 5.0% -- Peak % of connections
-- Since several memory factors can expand based on max_connections (now 500), it is good not to have that setting too high.
( thread_cache_size / Max_used_connections ) = 500 / 25 = 2000.0%
-- There is no advantage in having the thread cache bigger than your likely number of connections. Wasting space is the disadvantage.
Abnormally small:
Innodb_dblwr_pages_written / Innodb_dblwr_writes = 2.28
aria_checkpoint_log_activity = 1.05e+6
aria_pagecache_buffer_size = 128MB
innodb_buffer_pool_chunk_size = 128MB
innodb_max_undo_log_size = 10MB
innodb_online_alter_log_max_size = 128MB
innodb_sort_buffer_size = 1.05e+6
innodb_spin_wait_delay = 4
lock_wait_timeout = 86,400
performance_schema_max_mutex_classes = 0
query_cache_limit = 524,288
Abnormally large:
Acl_column_grants = 216
Acl_database_grants = 385
Acl_table_grants = 1,877
Innodb_buffer_pool_pages_free = 1.07e+6
Innodb_num_open_files = 9,073
Memory_used_initial = 8.16e+8
Open_table_definitions = 4,278
Open_tables = 4,439
Performance_schema_file_instances_lost = 1,732
Performance_schema_mutex_classes_lost = 190
Performance_schema_table_handles_lost = 570
Qcache_free_blocks = 9,122
Qcache_total_blocks = 457,808
Tc_log_page_size = 4,096
Uptime - Uptime_since_flush_status = 141,130
aria_sort_buffer_size = 256.0MB
auto_increment_offset = 3
gtid_domain_id = 12,000
innodb_open_files = 116,660
max_heap_table_size = 1024MB
max_relay_log_size = 1024MB
min(max_heap_table_size, tmp_table_size) = 1024MB
performance_schema_events_stages_history_size = 20
performance_schema_events_statements_history_size = 20
performance_schema_events_waits_history_size = 20
performance_schema_max_cond_classes = 90
table_definition_cache = 58,863
table_open_cache / max_connections = 233
tmp_memory_table_size = 1024MB
wsrep_cluster_size = 3
wsrep_gtid_domain_id = 12,000
wsrep_local_bf_aborts = 107
wsrep_slave_threads = 32
wsrep_thread_count = 33
Abnormal strings:
aria_recover_options = BACKUP,QUICK
disconnect_on_expired_password = OFF
gtid_ignore_duplicates = ON
gtid_strict_mode = ON
histogram_type = DOUBLE_PREC_HB
innodb_fast_shutdown = 1
myisam_stats_method = NULLS_UNEQUAL
old_alter_table = DEFAULT
opt_s__optimize_join_buffer_size = on
optimizer_trace = enabled=off
use_stat_tables = PREFERABLY_FOR_QUERIES
wsrep_cluster_status = Primary
wsrep_connected = ON
wsrep_debug = NONE
wsrep_gtid_mode = ON
wsrep_load_data_splitting = OFF
wsrep_provider = /usr/lib64/galera-4/libgalera_smm.so
wsrep_provider_name = Galera
wsrep_provider_options = base_dir = /var/lib/mysql/; base_host = FIRST_NODE_IP; base_port = 4567; cert.log_conflicts = no; cert.optimistic_pa = yes; debug = no; evs.auto_evict = 0; evs.causal_keepalive_period = PT1S; evs.debug_log_mask = 0x1; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.info_log_mask = 0; evs.install_timeout = PT7.5S; evs.join_retrans_period = PT1S; evs.keepalive_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.use_aggregate = true; evs.user_send_window = 2; evs.version = 1; evs.view_forget_timeout = P1D; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = galera.cache; gcache.page_size = 128M; gcache.recover = yes; gcache.size = 1024M; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.listen_addr = tcp://0.0.0.0:4567; gmcast.mcast_addr = ; gmcast.mcast_ttl = 1; gmcast.peer_timeout = PT3S; gmcast.segment = 0; gmcast.time_wait = PT5S; gmcast.version = 0; ist.recv_addr = FIRST_NODE_IP; pc.announce_timeout = PT3S; pc.checksum = false; pc.ignore_quorum = false; pc.ignore_sb = false; pc.linger = PT20S; pc.npvo = false; pc.recovery = true; pc.version = 0; pc.wait_prim = true; pc.wait_prim_timeout = PT30S; pc.weight = 1; protonet.backend = asio; protonet.version = 0; repl.causal_read_timeout = PT30S; repl.commit_order = 3; repl.key_format = FLAT8; repl.max_ws_size = 2147483647; repl.proto_max = 10; socket.checksum = 2; socket.recv_buf_size = auto; socket.send_buf_size = auto;
wsrep_provider_vendor = Codership Oy
wsrep_provider_version = 26.4.4(r4599)
wsrep_replicate_myisam = ON
wsrep_sst_auth = ********
wsrep_sst_method = mariabackup
wsrep_start_position = 353e0616-cb37-11ea-b614-be241cab877e:39442474

None of these is necessarily too big, but there may be things going on that conspire to make them too big, especially when combined:
innodb_buffer_pool_size=24000M -- quick fix: lower this
(otherwise it should be a good size)
tmp_table_size = 1024M -- lower to 1% of RAM
max_heap_table_size = 1024M -- ditto
max_allowed_packet = 512M -- possibly too big
max_connections=500 -- lower to Max_used_connections or 100
query_cache_type = 1 -- 0 -- QC is not allowed on Galera
query_cache_size = 512M -- 0 -- ditto
table_open_cache = 116925 -- see how 2000 works
table_definition_cache = 58863 -- ditto
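Most of those are dynamic in MariaDB 10.4, so here is a minimal sketch of trying them without a restart (mirror the same values in my.cnf so they survive a restart):
SET GLOBAL table_open_cache = 2000;
SET GLOBAL table_definition_cache = 2000;
SET GLOBAL tmp_table_size = 200 * 1024 * 1024;
SET GLOBAL max_heap_table_size = 200 * 1024 * 1024;
SET GLOBAL max_connections = 200;
SET GLOBAL thread_cache_size = 20;
SET GLOBAL query_cache_type = OFF;  -- QC is not allowed on Galera
SET GLOBAL query_cache_size = 0;
-- innodb_buffer_pool_size can also be resized dynamically in 10.4, but do that off-peak.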
For further analysis, provide GLOBAL STATUS and VARIABLES as discussed here: http://mysql.rjweb.org/doc.php/mysql_analysis#tuning

Related

Wordpress Database Optimisation for large Sites

I have a large Wordpress site with 170,000 users and a lot of daily page views.
I just tuned all the MySQL indexes based on several comments, but my slow log still shows that SELECT DISTINCT wp_usermeta.meta_key FROM wp_usermeta; takes around 3 seconds.
Server hardware: dedicated server with a 64-core AMD Epyc, 128GB DDR4, 2x480GB NVMe SSD.
The DB server is the newest MariaDB version, and the config is (only InnoDB tables):
innodb_buffer_pool_size = 64G
innodb_log_file_size = 16G
innodb_buffer_pool_instances = 16
innodb_io_capacity = 5000
max_binlog_size = 200M
max_connections = 250
wait_timeout = 28700
interactive_timeout = 28700
join_buffer_size = 128M
expire_logs_days = 3
skip-host-cache
skip-name-resolve
slow_query_log = 1
slow_query_log_file = /var/log/mysql/slow.log
long_query_time = 1
sql-mode = "STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION"
tmp_table_size = 256M
max_heap_table_size = 256M
table_definition_cache = 500
sort_buffer_size = 24M
key_buffer_size = 32M
performance_schema = on
Maybe someone has some suggestions.
Of the 49 'values' that are associated with each user, how many are used in a WHERE or ORDER BY? I suspect only a few.
Here's a way to work around WP's abuse of the "Entity-Attribute-Value" design pattern.
Let's say a, b, c are useful for filtering and/or ordering, and the other 46 values are simply saved for displaying later. Have 4 rows, not 49, in usermeta for each user: 3 rows for a, b, c, and one row holding a JSON string of the rest.
Then have the application aware of the JSON and code accordingly.
This change would necessitate rebuilding wp_usermeta. 46 rows per user would be gathered together and rearranged into a single meta row with a moderately large JSON string (in meta_value). That might not shrink the table much, but it would make it faster to use.
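For illustration, a rough sketch of that one-time migration, assuming MariaDB 10.5+ for JSON_OBJECTAGG; 'a', 'b', 'c' and the key name 'packed_meta' are placeholders:
INSERT INTO wp_usermeta (user_id, meta_key, meta_value)
SELECT user_id,
       'packed_meta',                        -- placeholder key for the JSON blob
       JSON_OBJECTAGG(meta_key, meta_value)  -- pack the display-only values
FROM wp_usermeta
WHERE meta_key NOT IN ('a', 'b', 'c')        -- keep the filterable keys as rows
GROUP BY user_id;
DELETE FROM wp_usermeta
WHERE meta_key NOT IN ('a', 'b', 'c', 'packed_meta');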
Suggestions to consider for your my.ini [mysqld] section to enable DEMAND query cache utilization (a usage example follows the list).
query_cache_min_res_unit=512 # from 4096 to enable higher density of results
query_cache_size=50M # from 1M to increase capacity
query_cache_limit=6M # from 1M target result for identified query is above 2M
query_cache_type=2 # from OFF to support DEMAND (SELECT SQL_CACHE ...)
net_buffer_length=96K # from 16K to reduce packet in/out count
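With query_cache_type=2 (DEMAND), only queries that explicitly ask are cached; a hypothetical example:
SELECT SQL_CACHE meta_key, meta_value
FROM wp_usermeta
WHERE user_id = 1;  -- cached on first run, served from the QC afterwards
Everything else bypasses the cache automatically.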
We should probably SKYPE TALK later today before making any changes.
In a few hours (3-4), I will check in.
Analysis of GLOBAL STATUS and VARIABLES:
Observations:
Version: 10.6.5-MariaDB-1:10.6.5+maria~bullseye-log
128 GB of RAM
Uptime = 1d 02:48:55
384 QPS
The More Important Issues:
I do not see any items that seem critical to help with the problem you are having.
Details and other observations:
( innodb_lru_scan_depth ) = 1,536
-- "InnoDB: page_cleaner: 1000ms intended loop took ..." may be fixed by lowering lru_scan_depth
( innodb_io_capacity_max ) = 10,000 -- When urgently flushing, use this many IOPs.
-- Reads could be sluggish or spiky.
( Innodb_buffer_pool_pages_free / Innodb_buffer_pool_pages_total ) = 2,787,201 / 4145152 = 67.2% -- Pct of buffer_pool currently not in use
-- innodb_buffer_pool_size (now 68719476736) is bigger than necessary?
( Innodb_buffer_pool_bytes_data / innodb_buffer_pool_size ) = 22,248,669,184 / 65536M = 32.4% -- Percent of buffer pool taken up by data
-- A small percent may indicate that the buffer_pool is unnecessarily big.
( Innodb_log_writes ) = 5,298,275 / 96535 = 55 /sec
( Uptime / 60 * innodb_log_file_size / Innodb_os_log_written ) = 96,535 / 60 * 16384M / 6560327680 = 4,213 -- Minutes between InnoDB log rotations. Beginning with 5.6.8, innodb_log_file_size can be changed dynamically; I don't know about MariaDB. Be sure to also change my.cnf.
-- (The recommendation of 60 minutes between rotations is somewhat arbitrary.) Adjust innodb_log_file_size (now 17179869184). (Cannot change in AWS.)
( Innodb_row_lock_waits ) = 83,931 / 96535 = 0.87 /sec -- How often there is a delay in getting a row lock.
-- May be caused by complex queries that could be optimized.
( Innodb_row_lock_waits/Innodb_rows_inserted ) = 83,931/1560067 = 5.4% -- Frequency of having to wait for a row.
( innodb_flush_neighbors ) = 1 -- A minor optimization when writing blocks to disk.
-- Use 0 for SSD drives; 1 for HDD.
( innodb_print_all_deadlocks ) = innodb_print_all_deadlocks = OFF -- Whether to log all Deadlocks.
-- If you are plagued with Deadlocks, turn this on. Caution: If you have lots of deadlocks, this may write a lot to disk.
( join_buffer_size * Max_used_connections ) = (128M * 127) / 131072M = 12.4% -- (A metric for pondering the size of join_buffer_size.)
-- join_buffer_size (now 134217728) should probably be shrunk to avoid running out of RAM.
( (Com_show_create_table + Com_show_fields) / Questions ) = (66 + 1370563) / 37103211 = 3.7% -- Naughty framework -- spending a lot of effort rediscovering the schema.
-- Complain to the 3rd party vendor.
( local_infile ) = local_infile = ON
-- local_infile (now ON) = ON is a potential security issue
( Created_tmp_tables ) = 2,088,713 / 96535 = 22 /sec -- Frequency of creating "temp" tables as part of complex SELECTs.
( Created_tmp_disk_tables ) = 1,751,146 / 96535 = 18 /sec -- Frequency of creating disk "temp" tables as part of complex SELECTs
-- increase tmp_table_size (now 268435456) and max_heap_table_size (now 268435456).
Check the rules for temp tables on when MEMORY is used instead of MyISAM. Perhaps minor schema or query changes can avoid MyISAM.
Better indexes and reformulation of queries are more likely to help.
( Created_tmp_disk_tables / Questions ) = 1,751,146 / 37103211 = 4.7% -- Pct of queries that needed on-disk tmp table.
-- Better indexes / No blobs / etc.
( Created_tmp_disk_tables / Created_tmp_tables ) = 1,751,146 / 2088713 = 83.8% -- Percent of temp tables that spilled to disk
-- Maybe increase tmp_table_size (now 268435456) and max_heap_table_size (now 268435456); improve indexes; avoid blobs, etc.
( Handler_read_rnd_next ) = 104,164,660,719 / 96535 = 1079035 /sec -- High if lots of table scans
-- possibly inadequate keys
( (Com_insert + Com_update + Com_delete + Com_replace) / Com_commit ) = (1561842 + 4652536 + 13886 + 42) / 352 = 17,694 -- Statements per Commit (assuming all InnoDB)
-- High: long transactions strain various things.
( Com_insert + Com_delete + Com_delete_multi + Com_replace + Com_update + Com_update_multi ) = (1561842 + 13886 + 0 + 42 + 4652536 + 794) / 96535 = 65 /sec -- writes/sec
-- 50 writes/sec + log flushes will probably max out I/O write capacity of HDD drives
( ( Com_stmt_prepare - Com_stmt_close ) / ( Com_stmt_prepare + Com_stmt_close ) ) = ( 2208251 - 1415 ) / ( 2208251 + 1415 ) = 99.9% -- Are you closing your prepared statements?
-- Add Closes.
( Com_stmt_prepare - Com_stmt_close ) = 2,208,251 - 1415 = 2.21e+6 -- How many prepared statements have not been closed.
-- CLOSE prepared statements
( Com_stmt_close / Com_stmt_prepare ) = 1,415 / 2208251 = 0.06% -- Prepared statements should be Closed.
-- Check whether all Prepared statements are "Closed".
( binlog_format ) = binlog_format = MIXED -- STATEMENT/ROW/MIXED.
-- ROW is preferred by 5.7 (10.3)
( Syncs ) = 5,727,396 / 96535 = 59 /sec -- Sync to disk for binlog.
( Com_change_db ) = 1,168,504 / 96535 = 12 /sec -- Probably comes from USE statements.
-- Consider connecting with DB, using db.tbl syntax, eliminating spurious USE statements, etc.
( Connections ) = 3,377,949 / 96535 = 35 /sec -- Connections
-- Increase wait_timeout (now 28700); use pooling?
( thread_cache_size / Max_used_connections ) = 250 / 127 = 196.9%
-- There is no advantage in having the thread cache bigger than your likely number of connections. Wasting space is the disadvantage.
( thread_pool_size ) = 64 -- Number of 'thread groups'. Limits how many threads can be executing at once. Probably should not be much bigger than the number of CPUs.
-- Don't set much higher than the number of CPU cores.
You have the Query Cache half-off. You should set both query_cache_type = OFF and query_cache_size = 0 . There is (according to a rumor) a 'bug' in the QC code that leaves some code on unless you turn off both of those settings.
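A minimal way to apply that at runtime (put both settings in my.cnf as well so they persist):
SET GLOBAL query_cache_type = OFF;
SET GLOBAL query_cache_size = 0;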
VSClasses.inc.256 Error with eval('((1048576 - 1031304) / 0) / 4096') expr=[[((query_cache_size - Qcache_free_memory) / Qcache_queries_in_cache) / query_cache_min_res_unit]]
VSClasses.inc.256 Error with eval('(1048576 - 1031304) / 0 / 16384') expr=[[(query_cache_size - Qcache_free_memory) / Qcache_queries_in_cache / query_alloc_block_size]]
VSClasses.inc.256 Error with eval('0/0') expr=[[Innodb_dblwr_pages_written/Innodb_pages_written]]
VSClasses.inc.256 Error with eval('0 / (0 + 0 + 0)') expr=[[Qcache_hits / (Qcache_hits + Qcache_inserts + Qcache_not_cached)]]
VSClasses.inc.256 Error with eval('0/0') expr=[[Qcache_lowmem_prunes/Qcache_inserts]]
Abnormally small:
Innodb_adaptive_hash_non_hash_searches = 0
Innodb_buffer_pool_pages_flushed / max(Questions, Queries) = 0
Innodb_buffer_pool_pages_misc = 0
Innodb_buffer_pool_pages_misc * 16384 / innodb_buffer_pool_size = 0
Innodb_data_writes - Innodb_log_writes - Innodb_dblwr_writes = 6 /HR
Innodb_data_written = 0
Innodb_dblwr_pages_written = 0
Innodb_master_thread_active_loops = 13
Innodb_mem_adaptive_hash = 0
Innodb_pages_written = 0
Memory_used = 0.04%
Memory_used_initial = 15.7MB
Abnormally large:
Aria_pagecache_reads = 18 /sec
Aria_pagecache_write_requests = 1180 /sec
Com_show_fields = 14 /sec
Com_stmt_prepare = 23 /sec
Handler_discover = 3 /HR
Handler_read_next = 1805396 /sec
Handler_read_next / Handler_read_key = 121
Innodb_buffer_pool_pages_dirty = 77,929
Innodb_buffer_pool_pages_free = 2.79e+6
Innodb_buffer_pool_pages_total = 4.15e+6
Innodb_checkpoint_age = 2.3e+9
Innodb_log_writes / Innodb_log_write_requests = 6636.2%
Innodb_os_log_fsyncs = 55 /sec
Innodb_rows_read = 2894484 /sec
Open_streams = 4
Opened_views = 0.058 /sec
Performance_schema_file_instances_lost = 6
Rows_read = 2887256 /sec
Select_full_range_join = 0.4 /sec
Select_full_range_join / Com_select = 0.18%
Slaves_connected = 0.037 /HR
Threads_cached = 113
performance_schema_max_statement_classes = 222
Abnormal strings:
Slave_heartbeat_period = 0
Slave_received_heartbeats = 0
aria_recover_options = BACKUP,QUICK
binlog_row_metadata = NO_LOG
character_set_system = utf8mb3
disconnect_on_expired_password = OFF
innodb_fast_shutdown = 1
log_slow_admin_statements = ON
myisam_stats_method = NULLS_UNEQUAL
old_alter_table = DEFAULT
old_mode = UTF8_IS_UTF8MB3
optimizer_trace = enabled=off
slave_parallel_mode = optimistic
sql_slave_skip_counter = 0
Your server provisioning is fine. If anything, it's overprovisioned for your site.
If I understand your comments correctly, you have many occurrences of this offending SELECT DISTINCT wp_usermeta.meta_key FROM wp_usermeta query. And it seems that query generates a 172K-row result set. Yet you say each user has the entirely reasonable number of 49 rows in wp_usermeta.
So, as @RickJames pointed out, it looks like each user somehow gets their own personal unique wp_usermeta.meta_key value. WordPress core doesn't do that and would never do that. The point of those wp_whatevermeta tables is to have a limited number of keys. Also, rerunning that particular query very often is grossly inefficient. What conceivable purpose, other than some global list of users, does that query serve? So a plugin is surely implicated. If you can get the Query Monitor (https://wordpress.org/plugins/query-monitor/) plugin to work, it will tell you what software generated the offending query.
There's nothing magic about queries without WHERE clauses. SELECTs like COUNT(*) and DISTINCT generally need to scan an entire table or index to generate their results, so when a table is large the query takes a long time. If you have an index on wp_usermeta.meta_key, and it indexes the whole column rather than a 191-character prefix, the offending query can do a relatively inexpensive loose index scan. But it still has to do the index scan.
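For illustration only, a sketch of such a full-column index (the index name is made up; stock WordPress declares meta_key as VARCHAR(255), so the full utf8mb4 index assumes the InnoDB DYNAMIC row format):
ALTER TABLE wp_usermeta ADD INDEX meta_key_full (meta_key);
-- EXPLAIN should then show "Using index for group-by" (a loose index scan):
EXPLAIN SELECT DISTINCT meta_key FROM wp_usermeta;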
And, try the plugin Rick James and I put together. https://wordpress.org/plugins/index-wp-mysql-for-speed/ It makes useful indexes, and also has a couple of diagnostic tools built in.

How to make replication faster for mysql table with 8 million records

I back up my live server with the mysqldump command via a cron job on my Ubuntu server, using a Bash shell script; the same script also uploads the backup to my backup server. This used to work fine, but now I am facing a slowness issue (it takes 1 hour to back up and upload to the backup server) because one of the database tables has grown to 5GB and contains 10 million records. I saw on a thread that SQL insertion can be made faster via bulk/grouped execution - How can mysql insert millions records faster?
But in my case I am unsure how to create a shell script to perform the same.
The requirement: I want to export all my SQL database tables in groups of at most 10k rows, so that execution is faster when the dump is imported on the server.
I have written this code on my server bash script:
#!/bin/bash
cd /tmp
file=$(date +%F-%T).sql
# Dump the live database to a timestamped file
mysqldump \
  --host "${MYSQL_HOST}" \
  --port "${MYSQL_PORT}" \
  -u "${MYSQL_USER}" \
  --password="${MYSQL_PASS}" \
  "${MYSQL_DB}" > "${file}"
if [ "${?}" -eq 0 ]; then
  # Recreate the backup database on the backup server, then load the dump
  mysql -umyuser -pmypassword -h 198.168.1.3 -e "show databases"
  mysql -umyuser -pmypassword -h 198.168.1.3 -D backup_db -e "drop database backup_db"
  mysql -umyuser -pmypassword -h 198.168.1.3 -e "create database backup_db"
  mysql -umyuser -pmypassword -h 198.168.1.3 backup_db < "${file}"
  # Compress the dump, ship it to S3, and clean up
  gzip "${file}"
  aws s3 cp "${file}.gz" "s3://${S3_BUCKET}/live_db/"
  rm "${file}.gz"
else
  echo "Error backing up mysql"
  exit 255
fi
The backup server and live server share the same AWS hardware configuration: 16GB RAM, 4 CPU, 100GB SSD.
These are the screenshots and data:
Screenshot and Queries for Debugging on the Live Server:
Information Schema Tables:
https://i.imgur.com/RnjQwbP.png
SHOW GLOBAL STATUS:
https://pastebin.com/raw/MuJYwnsm
SHOW GLOBAL VARIABLES:
https://pastebin.com/raw/wdvn97XP
Screenshot and Queries for Debugging on the Backup Server:
https://i.imgur.com/rB7qcYU.png
https://pastebin.com/raw/K7vHXqWi
https://pastebin.com/raw/PR2gWpqe
The server workload is almost negligible. There is no load at any time; I have monitored it via the AWS Monitoring Panel. That's the only reason for taking a server with more resources than required, so that it never gets exhausted. I have 16GB RAM and 4 CPUs, which are more than sufficient. The AWS Monitoring Panel rarely showed usage as high as 6%; most of the time it is around 1%.
Analysis of GLOBAL STATUS and VARIABLES:
Observations:
Version: 10.3.32-MariaDB-0ubuntu0.20.04.1-log
16 GB of RAM
Uptime = 3d 05:41:50
42.1 QPS
The More Important Issues:
Some setting suggestions for better memory utilization:
key_buffer_size = 20M
innodb_buffer_pool_size = 8G
table_open_cache = 300
innodb_open_files = 1000
query_cache_type = OFF
query_cache_size = 0
Some setting suggestions for other reasons:
eq_range_index_dive_limit = 20
log_queries_not_using_indexes = OFF
Recommend using the slowlog (with long_query_time = 1) to locate the naughty queries. Then we can discuss how to improve them.
http://mysql.rjweb.org/doc.php/mysql_analysis#slow_queries_and_slowlog
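For example, the slowlog can be turned on without a restart (mirror these in my.cnf to persist them):
SET GLOBAL slow_query_log = ON;
SET GLOBAL long_query_time = 1;
SET GLOBAL log_queries_not_using_indexes = OFF;  -- keep the log readable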
Details and other observations:
( (key_buffer_size - 1.2 * Key_blocks_used * 1024) ) = ((512M - 1.2 * 8 * 1024)) / 16384M = 3.1% -- Percent of RAM wasted in key_buffer.
-- Decrease key_buffer_size (now 536870912).
( Key_blocks_used * 1024 / key_buffer_size ) = 8 * 1024 / 512M = 0.00% -- Percent of key_buffer used. High-water-mark.
-- Lower key_buffer_size (now 536870912) to avoid unnecessary memory usage.
( (key_buffer_size / 0.20 + innodb_buffer_pool_size / 0.70) ) = ((512M / 0.20 + 128M / 0.70)) / 16384M = 16.7% -- Most of available ram should be made available for caching.
-- http://mysql.rjweb.org/doc.php/memory
( table_open_cache ) = 16,293 -- Number of table descriptors to cache
-- Several hundred is usually good.
( innodb_buffer_pool_size ) = 128M -- InnoDB Data + Index cache
-- 128M (an old default) is woefully small.
( innodb_buffer_pool_size ) = 128 / 16384M = 0.78% -- % of RAM used for InnoDB buffer_pool
-- Set to about 70% of available RAM. (Too low is less efficient; too high risks swapping.)
( innodb_lru_scan_depth ) = 1,024
-- "InnoDB: page_cleaner: 1000ms intended loop took ..." may be fixed by lowering lru_scan_depth
( innodb_io_capacity ) = 200 -- When flushing, use this many IOPs.
-- Reads could be sluggish or spiky.
( innodb_io_capacity_max / innodb_io_capacity ) = 2,000 / 200 = 10 -- Capacity: max/plain
-- Recommend 2. Max should be about equal to the IOPs your I/O subsystem can handle. (If the drive type is unknown 2000/200 may be a reasonable pair.)
( Innodb_buffer_pool_reads / Innodb_buffer_pool_read_requests ) = 3,565,561,783 / 162903322931 = 2.2% -- Read requests that had to hit disk
-- Increase innodb_buffer_pool_size (now 134217728) if you have enough RAM.
( Innodb_pages_read / Innodb_buffer_pool_read_requests ) = 3,602,479,499 / 162903322931 = 2.2% -- Read requests that had to hit disk
-- Increase innodb_buffer_pool_size (now 134217728) if you have enough RAM.
( Innodb_buffer_pool_reads ) = 3,565,561,783 / 279710 = 12747 /sec -- Cache misses in the buffer_pool.
-- Increase innodb_buffer_pool_size (now 134217728)? (~100 is limit for HDD, ~1000 is limit for SSDs.)
( (Innodb_buffer_pool_reads + Innodb_buffer_pool_pages_flushed) ) = ((3565561783 + 1105583) ) / 279710 = 12751 /sec -- InnoDB I/O
-- Increase innodb_buffer_pool_size (now 134217728)?
( Innodb_buffer_pool_read_ahead_evicted ) = 5,386,209 / 279710 = 19 /sec
( Innodb_os_log_written / (Uptime / 3600) / innodb_log_files_in_group / innodb_log_file_size ) = 1,564,913,152 / (279710 / 3600) / 2 / 48M = 0.2 -- Ratio
-- (see minutes)
( innodb_flush_method ) = innodb_flush_method = fsync -- How InnoDB should ask the OS to write blocks. Suggest O_DIRECT or O_ALL_DIRECT (Percona) to avoid double buffering. (At least for Unix.) See chrischandler for caveat about O_ALL_DIRECT
( default_tmp_storage_engine ) = default_tmp_storage_engine =
( innodb_flush_neighbors ) = 1 -- A minor optimization when writing blocks to disk.
-- Use 0 for SSD drives; 1 for HDD.
( ( Innodb_pages_read + Innodb_pages_written ) / Uptime / innodb_io_capacity ) = ( 3602479499 + 1115984 ) / 279710 / 200 = 6441.7% -- If > 100%, need more io_capacity.
-- Increase innodb_io_capacity (now 200) if the drives can handle it.
( innodb_io_capacity ) = 200 -- I/O ops per second capable on disk . 100 for slow drives; 200 for spinning drives; 1000-2000 for SSDs; multiply by RAID factor.
( innodb_adaptive_hash_index ) = innodb_adaptive_hash_index = ON -- Whether to use the adaptive hash (AHI).
-- ON for mostly readonly; OFF for DDL-heavy
( innodb_adaptive_hash_index ) = innodb_adaptive_hash_index = ON -- Usually should be ON.
-- There are cases where OFF is better. See also innodb_adaptive_hash_index_parts (now 8) (after 5.7.9) and innodb_adaptive_hash_index_partitions (MariaDB and Percona). ON has been implicated in rare crashes (bug 73890). 10.5.0 decided to default OFF.
( innodb_print_all_deadlocks ) = innodb_print_all_deadlocks = OFF -- Whether to log all Deadlocks.
-- If you are plagued with Deadlocks, turn this on. Caution: If you have lots of deadlocks, this may write a lot to disk.
( innodb_ft_result_cache_limit ) = 2,000,000,000 / 16384M = 11.6% -- Byte limit on FULLTEXT resultset. (Possibly not preallocated, but grows?)
-- Lower the setting.
( local_infile ) = local_infile = ON
-- local_infile (now ON) = ON is a potential security issue
( Qcache_lowmem_prunes ) = 6,329,393 / 279710 = 23 /sec -- Running out of room in QC
-- increase query_cache_size (now 16777216)
( Qcache_lowmem_prunes/Qcache_inserts ) = 6,329,393/7792821 = 81.2% -- Removal Ratio (frequency of needing to prune due to not enough memory)
( Qcache_hits / Qcache_inserts ) = 1,619,341 / 7792821 = 0.208 -- Hit to insert ratio -- high is good
-- Consider turning off the query cache.
( Qcache_hits / (Qcache_hits + Com_select) ) = 1,619,341 / (1619341 + 9691638) = 14.3% -- Hit ratio -- SELECTs that used QC
-- Consider turning off the query cache.
( Qcache_hits / (Qcache_hits + Qcache_inserts + Qcache_not_cached) ) = 1,619,341 / (1619341 + 7792821 + 278272) = 16.7% -- Query cache hit rate
-- Probably best to turn off the QC.
( (query_cache_size - Qcache_free_memory) / Qcache_queries_in_cache / query_alloc_block_size ) = (16M - 1272984) / 3058 / 16384 = 0.309 -- query_alloc_block_size vs formula
-- Adjust query_alloc_block_size (now 16384)
( Created_tmp_disk_tables ) = 1,667,989 / 279710 = 6 /sec -- Frequency of creating disk "temp" tables as part of complex SELECTs
-- increase tmp_table_size (now 16777216) and max_heap_table_size (now 16777216).
Check the rules for temp tables on when MEMORY is used instead of MyISAM. Perhaps minor schema or query changes can avoid MyISAM.
Better indexes and reformulation of queries are more likely to help.
( Created_tmp_disk_tables / Questions ) = 1,667,989 / 11788712 = 14.1% -- Pct of queries that needed on-disk tmp table.
-- Better indexes / No blobs / etc.
( Created_tmp_disk_tables / Created_tmp_tables ) = 1,667,989 / 4165525 = 40.0% -- Percent of temp tables that spilled to disk
-- Maybe increase tmp_table_size (now 16777216) and max_heap_table_size (now 16777216); improve indexes; avoid blobs, etc.
( ( Com_stmt_prepare - Com_stmt_close ) / ( Com_stmt_prepare + Com_stmt_close ) ) = ( 473 - 0 ) / ( 473 + 0 ) = 100.0% -- Are you closing your prepared statements?
-- Add Closes.
( Com_stmt_close / Com_stmt_prepare ) = 0 / 473 = 0 -- Prepared statements should be Closed.
-- Check whether all Prepared statements are "Closed".
( binlog_format ) = binlog_format = MIXED -- STATEMENT/ROW/MIXED.
-- ROW is preferred by 5.7 (10.3)
( long_query_time ) = 5 -- Cutoff (Seconds) for defining a "slow" query.
-- Suggest 2
( Subquery_cache_hit / ( Subquery_cache_hit + Subquery_cache_miss ) ) = 0 / ( 0 + 1800 ) = 0 -- Subquery cache hit rate
( log_queries_not_using_indexes ) = log_queries_not_using_indexes = ON -- Whether to include such in slowlog.
-- This clutters the slowlog; turn it off so you can see the real slow queries. And decrease long_query_time (now 5) to catch most interesting queries.
( back_log ) = 80 -- (Autosized as of 5.6.6; based on max_connections)
-- Raising to min(150, max_connections (now 151)) may help when doing lots of connections.
Abnormally small:
Delete_scan = 0.039 /HR
Handler_read_rnd_next / Handler_read_rnd = 2.06
Handler_write = 0.059 /sec
Innodb_buffer_pool_read_requests / (Innodb_buffer_pool_read_requests + Innodb_buffer_pool_reads ) = 97.9%
Table_locks_immediate = 2.6 /HR
eq_range_index_dive_limit = 0
Abnormally large:
( Innodb_pages_read + Innodb_pages_written ) / Uptime = 12,883
Com_release_savepoint = 5.5 /HR
Com_savepoint = 5.5 /HR
Handler_icp_attempts = 110666 /sec
Handler_icp_match = 110663 /sec
Handler_read_key = 62677 /sec
Handler_savepoint = 5.5 /HR
Handler_tmp_update = 1026 /sec
Handler_tmp_write = 40335 /sec
Innodb_buffer_pool_read_ahead = 131 /sec
Innodb_buffer_pool_reads * innodb_page_size / innodb_buffer_pool_size = 43524924.1%
Innodb_data_read = 211015030 /sec
Innodb_data_reads = 12879 /sec
Innodb_pages_read = 12879 /sec
Innodb_pages_read + Innodb_pages_written = 12883 /sec
Select_full_range_join = 1.1 /sec
Select_full_range_join / Com_select = 3.0%
Tc_log_page_size = 4,096
innodb_open_files = 16,293
log_slow_rate_limit = 1,000
query_cache_limit = 3.36e+7
table_open_cache / max_connections = 107
Abnormal strings:
Innodb_have_snappy = ON
Slave_heartbeat_period = 0
Slave_received_heartbeats = 0
aria_recover_options = BACKUP,QUICK
innodb_fast_shutdown = 1
log_output = FILE,TABLE
log_slow_admin_statements = ON
myisam_stats_method = NULLS_UNEQUAL
old_alter_table = DEFAULT
sql_slave_skip_counter = 0
time_zone = +05:30
Rate Per Second = RPS
Suggestions to consider for your 'Backup' AWS instance Parameters Group
innodb_buffer_pool_size=10G # from 128M to reduce innodb_data_reads RPS of 16
innodb_change_buffer_max_size=50 # from 25 percent to speed up INSERT completion
innodb_buffer_pool_instances=3 # from 1 to reduce mutex contention
innodb_write_io_threads=16 # from 4 for your intense data INSERT operations
innodb_buffer_pool_dump_pct=90 # from 25 percent to reduce WARM UP delays
innodb_fast_shutdown=0 # from 1 to help avoid RECOVERY on instance START
You should find these changes reduce the data refresh time required on your BACKUP instance. Your LIVE instance has different operating characteristics but should have all these suggestions applied, as well as others. Please view my profile for contact info and get in touch for additional assistance.
Your mention of REPLICATION should likely be set aside: you cannot have a SERVER_ID of 1 on both the MASTER and the SLAVE if replication is to survive. Also, your LIVE server cannot be a MASTER because LOG_BIN is OFF.
LIVE observations,
com_begin count was 30; com_commit was 0 after 3+ days of uptime.
Usually we find com_commit to be about the same as com_begin. Did someone forget to COMMIT the data?
com_savepoint reported 430 operations, and com_rollback_to_savepoint also reported 430. We do not normally see a rollback for every savepoint in 3 days.
com_stmt_prepare reported 473 operations.
com_stmt_execute reported 1211 operations.
com_stmt_close reported 0 operations. Forgetting to CLOSE prepared statements
when done leaves resources in use that could have been released.
handler_rollback counted 961 in 3 days of uptime, which seems unusual.
slow_queries counted 87,338 queries in 3 days that exceeded 5 seconds to completion.
log_slow_verbosity=query_plan,explain would help your team identify the cause of the slowness. Your slow query log is already ON.
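That setting is dynamic in MariaDB, so it can be enabled immediately:
SET GLOBAL log_slow_verbosity = 'query_plan,explain';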
Kerry, The VERY BEST to you and your team.

MySQL InnoDB Disk Writes increase suddenly after 2.5 hours

MySQL version = 5.7.31
We started noticing high CPU utilization on our DB server after 2.5 hours of heavy workload (roughly 800 selects per second). The DB was performing quite well, then all of a sudden InnoDB disk writes increased significantly, followed by InnoDB disk reads. The select count drops to zero at this point, making the application useless.
After about 15 mins the DB starts working normally.
configuration as follows
innodb_flush_method=O_DIRECT
innodb_log_file_size=1G
innodb_numa_interleave
innodb_buffer_pool_size=75G
key_buffer_size = 12G
max_allowed_packet = 16M
thread_stack = 192K
thread_cache_size = 8
tmp_table_size = 1024M
max_heap_table_size = 1024M
max_connections = 600
max_connect_errors = 10000
query_cache_limit = 1M
query_cache_size = 50M
htop: https://ibb.co/gwGSkc1 - (Before the issue)
iostat: https://ibb.co/YyJWkb9 - (Before the issue)
df -h : https://ibb.co/x25vg52
RAM 94G
CORE COUNT 32
SSD: /var/lib/mysql is mounted on an SSD volume (the solution is hosted on OpenStack)
GLOBAL STATUS : https://pastebin.com/yC4FUYiE
GLOBAL Variables : https://pastebin.com/PfsYTRbm
PROCESS LIST : https://pastebin.com/TyA5KBDb
Rate Per Second = RPS
Suggestions to consider for your my.cnf [mysqld] section
innodb_lru_scan_depth=100 # from 1024 to conserve 90% of the CPU cycles used for this function
innodb_io_capacity=1500 # from 200 to use more of your available SSD IOPS
read_rnd_buffer_size=128K # from 256K to reduce handler_read_rnd_next RPS of 474,918
key_buffer_size=16M # from 12G; less than 1% is used, and your max is 94G available
For additional suggestions view profile, Network profile for contact info and free downloadable Utility Scripts to assist with performance tuning.
There are many more opportunities to improve your configuration.
Not much exciting in the settings:
Analysis of GLOBAL STATUS and VARIABLES:
Observations:
Version: 5.7.31
94 GB of RAM
Uptime = 17:36:15; some GLOBAL STATUS values may not be meaningful yet.
You are not running on Windows.
Running 64-bit version
You appear to be running entirely (or mostly) InnoDB.
The More Important Issues:
MyISAM is not used, so key_buffer_size = 12G is a waste of RAM. Change to 50M.
If you have SSD drives, increase innodb_io_capacity from 200 to 1000 (a sketch follows this list).
Several metrics point to inefficient queries. They may need better indexes or rewriting. See http://mysql.rjweb.org/doc.php/mysql_analysis#slow_queries_and_slowlog
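Both of those settings are dynamic in 5.7, so a quick sketch (persist the same values in my.cnf):
SET GLOBAL key_buffer_size = 50 * 1024 * 1024;  -- MyISAM is unused; 12G was wasted
SET GLOBAL innodb_io_capacity = 1000;           -- SSD-class flushing rate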
Details and other observations:
( key_buffer_size ) = 12,288M / 96256M = 12.8% -- % of RAM used for key_buffer (for MyISAM indexes)
-- 20% is ok if you are not using InnoDB.
( (key_buffer_size - 1.2 * Key_blocks_used * 1024) ) = ((12288M - 1.2 * 9 * 1024)) / 96256M = 12.8% -- Percent of RAM wasted in key_buffer.
-- Decrease key_buffer_size (now 12884901888).
( Key_blocks_used * 1024 / key_buffer_size ) = 9 * 1024 / 12288M = 0.00% -- Percent of key_buffer used. High-water-mark.
-- Lower key_buffer_size (now 12884901888) to avoid unnecessary memory usage.
( (key_buffer_size / 0.20 + innodb_buffer_pool_size / 0.70) ) = ((12288M / 0.20 + 76800M / 0.70)) / 96256M = 177.8% -- Most of available ram should be made available for caching.
-- http://mysql.rjweb.org/doc.php/memory
( innodb_lru_scan_depth * innodb_page_cleaners ) = 1,024 * 4 = 4,096 -- Amount of work for page cleaners every second.
-- "InnoDB: page_cleaner: 1000ms intended loop took ..." may be fixable by lowering lru_scan_depth: Consider 1000 / innodb_page_cleaners (now 4). Also check for swapping.
( innodb_page_cleaners / innodb_buffer_pool_instances ) = 4 / 8 = 0.5 -- innodb_page_cleaners
-- Recommend setting innodb_page_cleaners (now 4) to innodb_buffer_pool_instances (now 8)
(Beginning to go away in 10.5)
( innodb_lru_scan_depth ) = 1,024
-- "InnoDB: page_cleaner: 1000ms intended loop took ..." may be fixed by lowering lru_scan_depth
( Innodb_buffer_pool_pages_free / Innodb_buffer_pool_pages_total ) = 1,579,794 / 4914600 = 32.1% -- Pct of buffer_pool currently not in use
-- innodb_buffer_pool_size (now 80530636800) is bigger than necessary?
( innodb_io_capacity_max / innodb_io_capacity ) = 2,000 / 200 = 10 -- Capacity: max/plain
-- Recommend 2. Max should be about equal to the IOPs your I/O subsystem can handle. (If the drive type is unknown 2000/200 may be a reasonable pair.)
( Innodb_os_log_written / (Uptime / 3600) / innodb_log_files_in_group / innodb_log_file_size ) = 138,870,272 / (63375 / 3600) / 2 / 1024M = 0.00367 -- Ratio
-- (see minutes)
( Uptime / 60 * innodb_log_file_size / Innodb_os_log_written ) = 63,375 / 60 * 1024M / 138870272 = 8,166 -- Minutes between InnoDB log rotations. Beginning with 5.6.8, this can be changed dynamically; be sure to also change my.cnf.
-- (The recommendation of 60 minutes between rotations is somewhat arbitrary.) Adjust innodb_log_file_size (now 1073741824). (Cannot change in AWS.)
( innodb_flush_method ) = innodb_flush_method = O_DSYNC -- How InnoDB should ask the OS to write blocks. Suggest O_DIRECT or O_ALL_DIRECT (Percona) to avoid double buffering. (At least for Unix.) See chrischandler for caveat about O_ALL_DIRECT
( innodb_flush_neighbors ) = 1 -- A minor optimization when writing blocks to disk.
-- Use 0 for SSD drives; 1 for HDD.
( innodb_io_capacity ) = 200 -- I/O ops per second capable on disk . 100 for slow drives; 200 for spinning drives; 1000-2000 for SSDs; multiply by RAID factor.
( innodb_print_all_deadlocks ) = innodb_print_all_deadlocks = OFF -- Whether to log all Deadlocks.
-- If you are plagued with Deadlocks, turn this on. Caution: If you have lots of deadlocks, this may write a lot to disk.
( min( tmp_table_size, max_heap_table_size ) ) = (min( 1024M, 1024M )) / 96256M = 1.1% -- Percent of RAM to allocate when needing MEMORY table (per table), or temp table inside a SELECT (per temp table per some SELECTs). Too high may lead to swapping.
-- Decrease tmp_table_size (now 1073741824) and max_heap_table_size (now 1073741824) to, say, 1% of ram.
( character_set_server ) = character_set_server = latin1
-- Charset problems may be helped by setting character_set_server (now latin1) to utf8mb4. That is the future default.
( local_infile ) = local_infile = ON
-- local_infile (now ON) = ON is a potential security issue
( Created_tmp_disk_tables / Created_tmp_tables ) = 59,659 / 68013 = 87.7% -- Percent of temp tables that spilled to disk
-- Check slowlog
( tmp_table_size ) = 1024M -- Limit on size of MEMORY temp tables used to support a SELECT
-- Decrease tmp_table_size (now 1073741824) to avoid running out of RAM. Perhaps no more than 64M.
( (Com_insert + Com_update + Com_delete + Com_replace) / Com_commit ) = (53844 + 35751 + 1 + 0) / 35789 = 2.5 -- Statements per Commit (assuming all InnoDB)
-- Low: Might help to group queries together in transactions; High: long transactions strain various things.
( Select_range_check ) = 70,106 / 63375 = 1.1 /sec -- no good index
-- Find slow queries; check indexes.
( Select_scan ) = 2,393,389 / 63375 = 38 /sec -- full table scans
-- Add indexes / optimize queries (unless they are tiny tables)
( Select_scan / Com_select ) = 2,393,389 / 10449190 = 22.9% -- % of selects doing full table scan. (May be fooled by Stored Routines.)
-- Add indexes / optimize queries
( Sort_merge_passes ) = 18,868 / 63375 = 0.3 /sec -- Hefty sorts
-- Increase sort_buffer_size (now 262144) and/or optimize complex queries.
( slow_query_log ) = slow_query_log = OFF -- Whether to log slow queries. (5.1.12)
( long_query_time ) = 10 -- Cutoff (Seconds) for defining a "slow" query.
-- Suggest 2
( log_slow_slave_statements ) = log_slow_slave_statements = OFF -- (5.6.11, 5.7.1) By default, replicated statements won't show up in the slowlog; this causes them to show.
-- It can be helpful in the slowlog to see writes that could be interfering with Replica reads.
( Aborted_connects / Connections ) = 1,057 / 2070 = 51.1% -- Perhaps a hacker is trying to break in? (Attempts to connect)
( max_connect_errors ) = 10,000 -- A small protection against hackers.
-- Perhaps no more than 200.
You have the Query Cache half-off. You should set both query_cache_type = OFF and query_cache_size = 0 . There is (according to a rumor) a 'bug' in the QC code that leaves some code on unless you turn off both of those settings.
Abnormally small:
Innodb_os_log_fsyncs = 0
innodb_buffer_pool_chunk_size = 128MB
innodb_online_alter_log_max_size = 128MB
innodb_sort_buffer_size = 1.05e+6
Abnormally large:
(Com_select + Qcache_hits) / (Com_insert + Com_update + Com_delete + Com_replace) = 116
Com_create_procedure = 0.11 /HR
Com_drop_procedure = 0.11 /HR
Com_show_charsets = 0.68 /HR
Com_show_plugins = 0.11 /HR
Created_tmp_files = 0.6 /sec
Innodb_buffer_pool_bytes_data = 838452 /sec
Innodb_buffer_pool_pages_data = 3.24e+6
Innodb_buffer_pool_pages_free = 1.58e+6
Innodb_buffer_pool_pages_total = 4.91e+6
Key_blocks_unused = 1.02e+7
Ssl_default_timeout = 7,200
Ssl_session_cache_misses = 10
Ssl_verify_depth = 1.84e+19
Ssl_verify_mode = 5
max_heap_table_size = 1024MB
min(max_heap_table_size, tmp_table_size) = 1024MB
Abnormal strings:
ft_boolean_syntax = + -><()~*:&
innodb_fast_shutdown = 1
innodb_numa_interleave = ON
optimizer_trace = enabled=off,one_line=off
optimizer_trace_features = greedy_search=on, range_optimizer=on, dynamic_range=on, repeated_subselect=on
slave_rows_search_algorithms = TABLE_SCAN,INDEX_SCAN

Mysql process list too long with killed queries

For several years I've been making automated daily database backups using a procedure iterating over existing databases.
mysqldump --user=${mysql_username} --password=${mysql_password} $db --single-transaction --events -R >> $normal_output_filename
Recently I moved from a dedicated server (CentOS 6, Apache 2.2, PHP 5.6, MySQL 5.0, as far as I recall) to a VPS (CentOS 7, Apache 2.4, PHP 7.2/5.6, MariaDB 5.5).
Recently, from time to time, SOME database accesses are slow and eventually hit "time execution exceeded".
I have a cron job that makes a daily backup of all databases after 03:00.
From ps aux | grep mysql I get
root 15840 0.0 0.0 126772 3456 ? SN 03:09 0:00 mysqldump --user=uuu --password=x xxxxxx information_schema --single-transaction --events -R
which is on hold for several hours.
Once, I only realized the problem after six days during which mysqldump had been on hold and no new DB backups were performed.
show status like '%conn%';
does not output anything; it stays on hold.
mysqladmin -uuser1 -p*** processlist
(user1 is a superuser) lists almost 8000 lines of Killed processes like
| 671958 | user1 | localhost | database1 | Killed | 3 | | | 0.000 |
| 671959 | user1 | localhost | database1 | Killed | 3 | | | 0.000 |
| 671961 | user1 | localhost | database1 | Killed | 2 | | | 0.000 |
| 671962 | user1 | localhost | database1 | Killed | 2 | | | 0.000 |
| 671963 | user1 | localhost | database2 | Killed | 2 | | | 0.000 |
| 671964 | user2 | localhost | database3 | Killed | 1 | | | 0.000 |
| 671965 | user1 | localhost | | Killed | 1 | | | 0.000 |
| 671966 | user1 | localhost | | Query | 0 | | show processlist | 0.000 |
+--------+-----+--------------+-----------+---------+---+---+------------------+----------+
I haven't restarted the MySQL server yet. I can see some websites whose pages make several DB accesses loading fast, while the Horde and Roundcube webmail clients reach the timeout and return error 500.
I don't understand why the list of processes suddenly starts growing with Killed processes (it may take days before it happens), and I don't know where they come from.
UPDATE 1:
VPS at Contabo, 200GB SSD disk. 61.93 GiB used / 134.78 GiB free / 196.71 GiB total
Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz, 4 cores
CentOS Linux 7.7.1908
Linux 3.10.0-1062.9.1.el7.x86_64 on x86_64
At this time: CPU load averages 0.88 (1 min) 1.03 (5 mins) 0.95 (15 mins)
8GB RAM - At this time: 1.81 GiB used / 2.21 GiB cached / 7.63 GiB total
At this time: Uptime 2 days, 17 hours
MORE DATA
UPDATE 2
Added thread_handling=pool-of-threads to my.cnf
The following does not directly answer the Question you are asking, but it points out some very low settings and the usage of MyISAM. I don't know whether switching to InnoDB and/or increasing some of the settings would help.
Do be aware that dumping MyISAM tables essentially blocks users from doing database work. (On the other hand, perhaps your data set is rather small and the activity is rather low.)
Observations:
Version: 5.5.64-MariaDB
8 GB of RAM
Uptime = 21:05:35; some GLOBAL STATUS values may not be meaningful yet.
You are not running on Windows.
Running 64-bit version
You appear to be running entirely (or mostly) MyISAM.
The More Important Issues:
You should move from MyISAM to InnoDB; see Conversion from MyISAM to InnoDB. (A conversion sketch follows this list.)
See if you can raise the following (more discussion below):
open_files_limit = 2000
table_open_cache = 300
key_buffer_size = 200M
innodb_buffer_pool_size = 600M -- after moving tables to InnoDB
OPTIMIZE TABLE is an infrequent task; you are doing it much too often.
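A sketch of the conversion step: list the remaining MyISAM tables and generate the ALTERs (review and test the generated statements before running them):
SELECT CONCAT('ALTER TABLE `', table_schema, '`.`', table_name, '` ENGINE=InnoDB;')
FROM information_schema.tables
WHERE engine = 'MyISAM'
  AND table_schema NOT IN ('mysql', 'information_schema', 'performance_schema');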
Details and other observations:
( (key_buffer_size / 0.20 + innodb_buffer_pool_size / 0.70) ) = ((16M / 0.20 + 128M / 0.70)) / 8192M = 3.2% -- Most of available ram should be made available for caching.
-- http://mysql.rjweb.org/doc.php/memory
( open_files_limit ) = 760 -- ulimit -n
-- To allow more files, change ulimit or /etc/security/limits.conf or in sysctl.conf (kern.maxfiles & kern.maxfilesperproc) or something else (OS dependent)
( table_open_cache ) = 64 -- Number of table descriptors to cache
-- Several hundred is usually good.
( innodb_buffer_pool_size ) = 128M -- InnoDB Data + Index cache
-- 128M (an old default) is woefully small.
( Innodb_buffer_pool_pages_free / Innodb_buffer_pool_pages_total ) = 5,578 / 8191 = 68.1% -- Pct of buffer_pool currently not in use
-- innodb_buffer_pool_size (now 134217728) is bigger than necessary?
( innodb_print_all_deadlocks ) = innodb_print_all_deadlocks = OFF -- Whether to log all Deadlocks.
-- If you are plagued with Deadlocks, turn this on. Caution: If you have lots of deadlocks, this may write a lot to disk.
( join_buffer_size ) = 131,072 / 8192M = 0.00% -- 0-N per thread. May speed up JOINs (better to fix queries/indexes) (all engines) Used for index scan, range index scan, full table scan, each full JOIN, etc.
-- If large, decrease join_buffer_size (now 131072) to avoid memory pressure. Suggest less than 1% of RAM. If small, increase to 0.01% of RAM to improve some queries.
( innodb_buffer_pool_populate ) = OFF = 0 -- NUMA control
( query_prealloc_size ) = 8,192 / 8192M = 0.00% -- For parsing. Pct of RAM
( query_alloc_block_size ) = 8,192 / 8192M = 0.00% -- For parsing. Pct of RAM
( character_set_server ) = character_set_server = latin1
-- Charset problems may be helped by setting character_set_server (now latin1) to utf8mb4. That is the future default.
( local_infile ) = local_infile = ON
-- local_infile (now ON) = ON is a potential security issue
( Key_writes / Key_write_requests ) = 5,804 / 9232 = 62.9% -- key_buffer effectiveness for writes
-- If you have enough RAM, it would be worthwhile to increase key_buffer_size (now 16777216).
( Created_tmp_disk_tables / Created_tmp_tables ) = 13,250 / 18108 = 73.2% -- Percent of temp tables that spilled to disk
-- Maybe increase tmp_table_size (now 16777216) and max_heap_table_size (now 16777216); improve indexes; avoid blobs, etc.
( (Com_insert + Com_update + Com_delete + Com_replace) / Com_commit ) = (68440 + 1927 + 425 + 0) / 0 = INF -- Statements per Commit (assuming all InnoDB)
-- Low: Might help to group queries together in transactions; High: long transactions strain various things.
( Select_scan ) = 165,862 / 75935 = 2.2 /sec -- full table scans
-- Add indexes / optimize queries (unless they are tiny tables)
( Com_optimize ) = 464 / 75935 = 22 /HR -- How often OPTIMIZE TABLE is performed.
-- OPTIMIZE TABLE is rarely useful, certainly not at high frequency.
( binlog_format ) = binlog_format = STATEMENT -- STATEMENT/ROW/MIXED.
-- ROW is preferred by 5.7 (10.3)
( expire_logs_days ) = 0 -- How soon to automatically purge binlog (after this many days)
-- Too large (or zero) = consumes disk space; too small = need to respond quickly to network/machine crash.
(Not relevant if log_bin (now OFF) = OFF)
( innodb_autoinc_lock_mode ) = 1 -- Galera: desires 2 -- 2 = "interleaved"; 1 = "consecutive" is typical; 0 = "traditional".
-- Galera desires 2; 2 requires BINLOG_FORMAT=ROW or MIXED
( log_slow_queries ) = log_slow_queries = OFF -- Whether to log slow queries. (Before 5.1.29, 5.6.1)
( slow_query_log ) = slow_query_log = OFF -- Whether to log slow queries. (5.1.12)
( long_query_time ) = 10 -- Cutoff (Seconds) for defining a "slow" query.
-- Suggest 2
( back_log ) = 50 -- (Autosized as of 5.6.6; based on max_connections)
-- Raising to min(150, max_connections (now 151)) may help when doing lots of connections.
( Com_change_db / Connections ) = 1,278,567 / 363881 = 3.51 -- Database switches per connection
-- (minor) Consider using "db.table" syntax
( Com_change_db ) = 1,278,567 / 75935 = 17 /sec -- Probably comes from USE statements.
-- Consider connecting with DB, using db.tbl syntax, eliminating spurious USE statements, etc.
( Threads_running / thread_cache_size ) = 1 / 0 = INF -- Threads: current / cached (Not relevant when using thread pooling)
-- Optimize queries
You have the Query Cache half-off. You should set both query_cache_type = OFF and query_cache_size = 0. Reportedly, a 'bug' in the QC code leaves some of it active unless both settings are turned off.
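A sketch of turning it fully off; on most MariaDB versions the SET GLOBALs take effect immediately, and the my.cnf lines make it stick across restarts:
SET GLOBAL query_cache_type = OFF;
SET GLOBAL query_cache_size = 0;
# in my.cnf:
query_cache_type = 0
query_cache_size = 0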
Abnormally small:
( Innodb_pages_read + Innodb_pages_written ) / Uptime = 0.0672
(innodb_buffer_pool_size + innodb_log_buffer_size + key_buffer_size + query_cache_size + Max_used_connections * (thread_stack + net_buffer_length)) / _ram = 1.9%
Innodb_adaptive_hash_non_hash_searches = 1.1 /sec
Innodb_buffer_pool_pages_flushed / max(Questions, Queries) = 0.00056
Innodb_buffer_pool_pages_made_young = 0
Innodb_buffer_pool_pages_old = 943
Innodb_buffer_pool_read_ahead = 0
Innodb_checkpoint_max_age = 7.78e+6
Innodb_ibuf_merged_inserts = 0
Innodb_ibuf_merges = 0
Innodb_lsn_current = 2.52e+8
Innodb_lsn_flushed = 240.6MB
Innodb_lsn_last_checkpoint = 2.52e+8
Innodb_master_thread_10_second_loops = 945
Innodb_master_thread_1_second_loops = 10,439
Innodb_master_thread_sleeps = 0.14 /sec
Innodb_mem_adaptive_hash = 2.25e+6
Innodb_mem_dictionary = 2.1e+6
Innodb_mem_total = 131.4MB
Innodb_pages_read + Innodb_pages_written = 0.067 /sec
Innodb_x_lock_spin_waits = 0.047 /HR
Open_tables = 64
net_buffer_length = 8,192
Abnormally large:
Com_check = 22 /HR
Com_show_charsets = 28 /HR
Com_show_events = 1.2 /HR
Feature_gis = 0.66 /HR
Abnormal strings:
binlog_checksum = NONE
innodb_fast_shutdown = 1
opt_s__engine_condition_pushdown = off
opt_s__extended_keys = off

Some things have changed...
Observations:
Version: 5.5.64-MariaDB
8 GB of RAM
Uptime = 5d 15:01:27
You are not running on Windows.
Running 64-bit version
It appears that you are running both MyISAM and InnoDB.
The More Important Issues:
For 8GB and a mixture of MyISAM and InnoDB:
key_buffer_size = 800M
innodb_buffer_pool_size = 3000M
ulimit -n is 1024. Yet open_files_limit is only 760. I don't know a reliable way to get those raised and keep them raised; one possible approach is sketched after this list.
innodb_log_file_size = 5M -- This is too low. However, it is messy to change; see the procedure sketched after this list.
24 OPTIMIZEs/hour is very high, even for MyISAM. 1/month might be more realistic.
There are indications of slow queries; see my blog for how to chase that.
thread_cache_size -- set to 30; this may significantly speed up connections.
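A sketch of the two "messy" fixes above; the service name and paths are assumptions, so adjust for your distro:
# Raise the file-descriptor limit so open_files_limit can follow.
# On systemd distros, a unit override tends to survive restarts and upgrades:
#   systemctl edit mysql   ->  [Service]
#                              LimitNOFILE=10000
# On pre-systemd setups, /etc/security/limits.conf:
#   mysql  soft  nofile  10000
#   mysql  hard  nofile  10000
# Resize the redo log (5.5-era procedure):
#   1. Cleanly stop mysqld (innodb_fast_shutdown must not be 2).
#   2. Move ib_logfile0 / ib_logfile1 out of the datadir.
#   3. Raise innodb_log_file_size (e.g. 256M) in my.cnf.
#   4. Start mysqld; the log files are recreated at the new size.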
Details and other observations:
Conversion from MyISAM to InnoDB
( (key_buffer_size / 0.20 + innodb_buffer_pool_size / 0.70) ) = ((200M / 0.20 + 128M / 0.70)) / 8192M = 14.4% -- Most of available ram should be made available for caching.
-- http://mysql.rjweb.org/doc.php/memory
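A minimal sketch of the conversion itself (hypothetical db/table names): the SELECT lists the remaining MyISAM tables, and the ALTER converts one at a time.
SELECT table_schema, table_name
  FROM information_schema.tables
 WHERE engine = 'MyISAM'
   AND table_schema NOT IN ('mysql', 'information_schema', 'performance_schema');
ALTER TABLE mydb.mytbl ENGINE = InnoDB;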
( open_files_limit ) = 760 -- ulimit -n
-- To allow more files, change ulimit or /etc/security/limits.conf or in sysctl.conf (kern.maxfiles & kern.maxfilesperproc) or something else (OS dependent)
( innodb_buffer_pool_size ) = 128M -- InnoDB Data + Index cache
-- 128M (an old default) is woefully small.
( innodb_log_buffer_size / innodb_log_file_size ) = 8M / 5M = 160.0% -- Buffer is in RAM; file is on disk.
-- The buffer_size should be smaller and/or the file_size should be larger.
( innodb_flush_method ) = innodb_flush_method = -- How InnoDB should ask the OS to write blocks. Suggest O_DIRECT or O_ALL_DIRECT (Percona) to avoid double buffering. (At least for Unix.) See chrischandler for caveat about O_ALL_DIRECT
( innodb_io_capacity ) = 200 -- I/O ops per second capable on disk . 100 for slow drives; 200 for spinning drives; 1000-2000 for SSDs; multiply by RAID factor.
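If the data is on SSD, these two usually change together in my.cnf (illustrative values):
innodb_flush_method = O_DIRECT
innodb_io_capacity = 1000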
( innodb_stats_on_metadata ) = innodb_stats_on_metadata = ON -- Re-analyze table when touching stats.
-- ON is likely to slow down certain SHOWs and information_schema accesses.
( innodb_recovery_update_relay_log ) = innodb_recovery_update_relay_log = OFF -- Helps avoid replication errors after a crash.
( innodb_import_table_from_xtrabackup ) = 0 -- Useful for transportable tablespaces
( sync_binlog ) = 0 -- Use 1 for added security, at some cost of I/O. =1 may lead to lots of "query end"; =0 may lead to "binlog at impossible position" and lost transactions in a crash, but is faster.
( innodb_print_all_deadlocks ) = innodb_print_all_deadlocks = OFF -- Whether to log all Deadlocks.
-- If you are plagued with Deadlocks, turn this on. Caution: If you have lots of deadlocks, this may write a lot to disk.
( character_set_server ) = character_set_server = latin1
-- Charset problems may be helped by setting character_set_server (now latin1) to utf8mb4. That is the future default.
( local_infile ) = local_infile = ON
-- local_infile (now ON) = ON is a potential security issue
( Created_tmp_disk_tables / Created_tmp_tables ) = 98,045 / 181066 = 54.1% -- Percent of temp tables that spilled to disk
-- Maybe increase tmp_table_size (now 16777216) and max_heap_table_size (now 16777216); improve indexes; avoid blobs, etc.
( (Com_insert + Com_update + Com_delete + Com_replace) / Com_commit ) = (388211 + 19570 + 3890 + 330) / 0 = INF -- Statements per Commit (assuming all InnoDB)
-- Low: Might help to group queries together in transactions; High: long transactions strain various things.
( Select_scan ) = 945,274 / 486087 = 1.9 /sec -- full table scans
-- Add indexes / optimize queries (unless they are tiny tables)
( Com_optimize ) = 3,202 / 486087 = 24 /HR -- How often OPTIMIZE TABLE is performed.
-- OPTIMIZE TABLE is rarely useful, certainly not at high frequency.
( binlog_format ) = binlog_format = STATEMENT -- STATEMENT/ROW/MIXED.
-- ROW is preferred by 5.7 (10.3)
( expire_logs_days ) = 0 -- How soon to automatically purge binlog (after this many days)
-- Too large (or zero) = consumes disk space; too small = need to respond quickly to network/machine crash.
(Not relevant if log_bin (now OFF) = OFF)
( log_slow_queries ) = log_slow_queries = OFF -- Whether to log slow queries. (Before 5.1.29, 5.6.1)
( slow_query_log ) = slow_query_log = OFF -- Whether to log slow queries. (5.1.12)
( long_query_time ) = 10 -- Cutoff (Seconds) for defining a "slow" query.
-- Suggest 2
( back_log ) = 50 -- (Autosized as of 5.6.6; based on max_connections)
-- Raising to min(150, max_connections (now 151)) may help when doing lots of connections.
( Com_change_db / Connections ) = 8,920,272 / 2392646 = 3.73 -- Database switches per connection
-- (minor) Consider using "db.table" syntax
( Com_change_db ) = 8,920,272 / 486087 = 18 /sec -- Probably comes from USE statements.
-- Consider connecting with DB, using db.tbl syntax, eliminating spurious USE statements, etc.
You have the Query Cache half-off. You should set both query_cache_type = OFF and query_cache_size = 0. Reportedly, a 'bug' in the QC code leaves some of it active unless both settings are turned off.
Abnormally small:
Innodb_adaptive_hash_non_hash_searches = 46 /sec
Innodb_buffer_pool_bytes_data = 272 /sec
Innodb_checkpoint_max_age = 7.78e+6
Innodb_master_thread_10_second_loops = 17,345
Innodb_master_thread_1_second_loops = 184,979
Innodb_master_thread_sleeps = 0.38 /sec
Innodb_mem_adaptive_hash = 4.17e+6
Innodb_mem_dictionary = 4.15e+6
Innodb_mem_total = 131.4MB
net_buffer_length = 8,192
Abnormally large:
Com_check = 24 /HR
Com_create_db = 0.15 /HR
Com_drop_db = 0.044 /HR
Com_rename_table = 0.49 /HR
Com_show_charsets = 16 /HR
Com_show_events = 17 /HR
Com_show_storage_engines = 1.9 /HR
Feature_gis = 1.1 /HR
Feature_locale = 20 /HR
Threadpool_idle_threads = 7
Threadpool_threads = 8
Abnormal strings:
binlog_checksum = NONE
innodb_fast_shutdown = 1
opt_s__engine_condition_pushdown = off
opt_s__extended_keys = off
thread_handling = pool-of-threads

MariaDB / ColumnStore engine memory getting choked

We have installed MariaDB along with the ColumnStore engine, and for the last few weeks we have been facing a memory choking issue: memory fills up, all our DML/DDL operations get stuck, and restarting the services fixes it.
Below are the stats (output of free, in GB):
              total   used   free   shared   buff/cache   available
Mem:             15      2      7        0            5          12
Swap:             4      0      4
[mysqld]
port = 3306
socket = /opt/evolv/mariadb/columnstore/mysql/lib/mysql/mysql.sock
datadir = /opt/evolv/mariadb/columnstore/mysql/db
skip-external-locking
key_buffer_size = 512M
max_allowed_packet = 1M
table_cache = 512
sort_buffer_size = 64M
read_buffer_size = 64M
read_rnd_buffer_size = 512M
myisam_sort_buffer_size = 64M
thread_cache_size = 8
query_cache_size = 0
# Try number of CPUs * 2 for thread_concurrency
#thread_concurrency = 8
thread_stack = 512K
lower_case_table_names=1
group_concat_max_len=512
infinidb_use_import_for_batchinsert=1
# You can set .._buffer_pool_size up to 50 - 80 %
# of RAM but beware of setting memory usage too high
innodb_buffer_pool_size = 8192M
#innodb_additional_mem_pool_size = 20M
# Set .._log_file_size to 25 % of buffer pool size
#innodb_log_file_size = 100M
#innodb_log_buffer_size = 8M
#innodb_flush_log_at_trx_commit = 1
#innodb_lock_wait_timeout = 50
Here's an analysis of the VARIABLES and (suspicious) GLOBAL STATUS; nothing exciting:
Observations:
Version: 10.1.26-MariaDB
15 GB of RAM
Uptime = 03:04:25; Please rerun SHOW GLOBAL STATUS after several hours.
Are you sure this was a SHOW GLOBAL STATUS?
You are not running on Windows.
Running 64-bit version
You appear to be running entirely (or mostly) InnoDB.
The More Important Issues:
Uptime = 03:04:25; Please rerun SHOW GLOBAL STATUS after several hours.
Are you sure this was a SHOW GLOBAL STATUS?
key_buffer_size is excessively large (3G). If you don't need MyISAM for anything, set it to 50M.
Check infinidb_um_mem_limit to see if it makes sense for your application.
Suggest lowering innodb_buffer_pool_size to 2G until the "choking" is figured out.
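As a sketch, the two suggested reductions as my.cnf lines (on 10.1 the InnoDB buffer pool cannot be resized online, so a restart is required; online resizing arrived in 10.2):
key_buffer_size = 50M
innodb_buffer_pool_size = 2G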
Details and other observations:
( (key_buffer_size - 1.2 * Key_blocks_used * 1024) / _ram ) = (3072M - 1.2 * 0 * 1024) / 15360M = 20.0% -- Percent of RAM wasted in key_buffer.
-- Decrease key_buffer_size.
( Key_blocks_used * 1024 / key_buffer_size ) = 0 * 1024 / 3072M = 0 -- Percent of key_buffer used. High-water-mark.
-- Lower key_buffer_size to avoid unnecessary memory usage.
( innodb_buffer_pool_size / _ram ) = 6144M / 15360M = 40.0% -- % of RAM used for InnoDB buffer_pool
( Innodb_buffer_pool_pages_free * 16384 / innodb_buffer_pool_size ) = 392,768 * 16384 / 6144M = 99.9% -- buffer pool free
( innodb_print_all_deadlocks ) = innodb_print_all_deadlocks = OFF -- Whether to log all Deadlocks.
-- If you are plagued with Deadlocks, turn this on. Caution: If you have lots of deadlocks, this may write a lot to disk.
( local_infile ) = local_infile = ON
-- local_infile = ON is a potential security issue
( expire_logs_days ) = 0 -- How soon to automatically purge binlog (after this many days)
-- Too large (or zero) = consumes disk space; too small = need to respond quickly to network/machine crash.
(Not relevant if log_bin = OFF)
( long_query_time ) = 5 -- Cutoff (Seconds) for defining a "slow" query.
-- Suggest 2
Abnormally large:
read_buffer_size = 32MB
Acl_database_grants = 780
Acl_proxy_users = 4
Acl_users = 281
Columnstore.xml
95% of all memory??
<MemoryCheckPercent>95</MemoryCheckPercent> <!-- Max real memory to limit growth of buffers to -->
<DataFileLog>OFF</DataFileLog>
I guess this is not relevant, since it is commented out??
<!-- enable if you want to limit how much memory may be used for hdfs read/write memory buffers.
<hdfsRdwrBufferMaxSize>8G</hdfsRdwrBufferMaxSize>
-->
Keep in mind that MySQL itself, apart from ColumnStore, is consuming a lot of memory:
<TotalUmMemory>25%</TotalUmMemory>
<TotalPmUmMemory>10%</TotalPmUmMemory>
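If the box keeps choking, one experiment is to leave more headroom for mysqld and the OS by shrinking these knobs in Columnstore.xml (illustrative values, not a tested recommendation; ColumnStore must be restarted to pick them up):
<MemoryCheckPercent>70</MemoryCheckPercent>
<TotalUmMemory>15%</TotalUmMemory>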