MySQL I/O wait time suddenly increasing - mysql

The I/O wait time is increasing for one of my servers. I haven't seen any long running queries in the process list. Can someone help me identify the cause?
The server is running CentOS Linux 7.7 with a 4-core CPU and MySQL 5.6.42.
iotop output:
Actual DISK READ: 57.85 M/s | Actual DISK WRITE: 74.90 K/s
TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
31529 be/4 mysql 58.08 M/s 0.00 B/s 0.00 % 84.45 % mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib~p.geolearning.com.pid --socket=/var/lib/mysql/mysql.sock --port=3306
12331 be/4 mysql 7.68 K/s 61.46 K/s 0.00 % 0.66 % mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib~p.geolearning.com.pid --socket=/var/lib/mysql/mysql.sock --port=3306
6819 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.02 % [kworker/2:16]
1 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % systemd --switched-root --system --deserialize 22
2 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [kthreadd]
4 be/0 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [kworker/0:0H]
6 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [ksoftirqd/0]
7 rt/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [migration/0]
8 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [rcu_bh]
9 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [rcu_sched]
10 be/0 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [lru-add-drain]
Processlist:
+---------+----------------+------------------+---------+---------+------+--------------+------------------------------------------------------------------------------------------------------+-----------+---------------+
| Id | User | Host | db | Command | Time | State | Info | Rows_sent | Rows_examined |
+---------+----------------+------------------+---------+---------+------+--------------+------------------------------------------------------------------------------------------------------+-----------+---------------+
| 2128874 | lpaauser | 10.7.40.7:64197 | prodsup | Query | 0 | Sending data | REPLACE INTO course_auto_assignments_compute8 (domain_id,lms_user_id,filter_set_id)
S | 0 | 0 |
| 2129350 | mveeranki | localhost | | Query | 0 | init | show processlist | 0 | 0 |
+---------+----------------+------------------+---------+---------+------+--------------+------------------------------------------------------------------------------------------------------+-----

Related

Why am I getting this error in MariaDB Galera Cluster?

I am using Galera Cluster and recently ran into the following error while restarting MariaDB.
The cluster has three nodes.
When only the first node is running, I can access the database, but when I start the second node, the following message is displayed and the cluster does not form.
Why?
MariaDB Version: 10.4.20
mariadb | =====================================
mariadb | 2022-08-11 12:00:11 0x7f37339cf700 INNODB MONITOR OUTPUT
mariadb | =====================================
mariadb | Per second averages calculated from the last 60 seconds
mariadb | -----------------
mariadb | BACKGROUND THREAD
mariadb | -----------------
mariadb | srv_master_thread loops: 3 srv_active, 0 srv_shutdown, 74 srv_idle
mariadb | srv_master_thread log flush and writes: 76
mariadb | ----------
mariadb | SEMAPHORES
mariadb | ----------
mariadb | OS WAIT ARRAY INFO: reservation count 20
mariadb | --Thread 139884474328832 has waited at dict0dict.cc line 880 for 232.00 seconds the semaphore:
mariadb | Mutex at 0x556c68691100, Mutex DICT_SYS created dict0dict.cc:824, lock var 2
mariadb |
mariadb | --Thread 139874768164608 has waited at srv0srv.cc line 2011 for 242.00 seconds the semaphore:
mariadb | X-lock on RW-latch at 0x556c68691130 created in file dict0dict.cc line 833
mariadb | a writer (thread id 139875057460992) has reserved it in mode exclusive
mariadb | number of readers 0, waiters flag 1, lock_word: 0
mariadb | Last time write locked in file dict0stats.cc line 2486
mariadb | OS WAIT ARRAY INFO: signal count 13
mariadb | RW-shared spins 19, rounds 487, OS waits 13
mariadb | RW-excl spins 17, rounds 103, OS waits 2
mariadb | RW-sx spins 0, rounds 0, OS waits 0
mariadb | Spin rounds per wait: 25.63 RW-shared, 6.06 RW-excl, 0.00 RW-sx
mariadb | ------------
mariadb | TRANSACTIONS
mariadb | ------------
mariadb | Trx id counter 2908999983
mariadb | Purge done for trx's n:o < 2908999981 undo n:o < 0 state: running but idle
mariadb | History list length 4
mariadb | LIST OF TRANSACTIONS FOR EACH SESSION:
mariadb | ---TRANSACTION 421359460385048, COMMITTED IN MEMORY flushing log
mariadb | 0 lock struct(s), heap size 1128, 0 row lock(s), undo log entries 2
mariadb | ---TRANSACTION 421359460380824, not started
mariadb | 0 lock struct(s), heap size 1128, 0 row lock(s)
mariadb | --------
mariadb | FILE I/O
mariadb | --------
mariadb | I/O thread 0 state: waiting for completed aio requests (insert buffer thread)
mariadb | I/O thread 1 state: waiting for completed aio requests (log thread)
mariadb | I/O thread 2 state: waiting for completed aio requests (read thread)
mariadb | I/O thread 3 state: waiting for completed aio requests (read thread)
mariadb | I/O thread 4 state: waiting for completed aio requests (read thread)
mariadb | I/O thread 5 state: waiting for completed aio requests (read thread)
mariadb | I/O thread 6 state: waiting for completed aio requests (write thread)
mariadb | I/O thread 7 state: waiting for completed aio requests (write thread)
mariadb | I/O thread 8 state: waiting for completed aio requests (write thread)
mariadb | I/O thread 9 state: waiting for completed aio requests (write thread)
mariadb | Pending normal aio reads: [0, 0, 0, 0] , aio writes: [0, 0, 0, 0] ,
mariadb | ibuf aio reads:, log i/o's:, sync i/o's:
mariadb | Pending flushes (fsync) log: 0; buffer pool: 0
mariadb | 129125 OS file reads, 323 OS file writes, 57 OS fsyncs
mariadb | 0.00 reads/s, 0 avg bytes/read, 0.00 writes/s, 0.00 fsyncs/s
mariadb | -------------------------------------
mariadb | INSERT BUFFER AND ADAPTIVE HASH INDEX
mariadb | -------------------------------------
mariadb | Ibuf: size 1, free list len 4026, seg size 4028, 0 merges
mariadb | merged operations:
mariadb | insert 0, delete mark 0, delete 0
mariadb | discarded operations:
mariadb | insert 0, delete mark 0, delete 0
mariadb | Hash table size 2212699, node heap has 0 buffer(s)
mariadb | Hash table size 2212699, node heap has 0 buffer(s)
mariadb | Hash table size 2212699, node heap has 0 buffer(s)
mariadb | Hash table size 2212699, node heap has 0 buffer(s)
mariadb | Hash table size 2212699, node heap has 2 buffer(s)
mariadb | Hash table size 2212699, node heap has 2 buffer(s)
mariadb | Hash table size 2212699, node heap has 2 buffer(s)
mariadb | Hash table size 2212699, node heap has 3 buffer(s)
mariadb | 0.00 hash searches/s, 0.00 non-hash searches/s
mariadb | ---
mariadb | LOG
mariadb | ---
mariadb | Log sequence number 1202450201026
mariadb | Log flushed up to 1202450200636
mariadb | Pages flushed up to 1202450196728
mariadb | Last checkpoint at 1202450196719
mariadb | 1 pending log flushes, 0 pending chkp writes
mariadb | 35 log i/o's done, 0.00 log i/o's/second
mariadb | ----------------------
mariadb | BUFFER POOL AND MEMORY
mariadb | ----------------------
mariadb | Total large memory allocated 8606711808
mariadb | Dictionary memory allocated 407480
mariadb | Buffer pool size 513728
mariadb | Free buffers 384737
mariadb | Database pages 128982
mariadb | Old database pages 47768
mariadb | Modified db pages 136
mariadb | Percent of dirty pages(LRU & free pages): 0.026
mariadb | Max dirty pages percent: 75.000
mariadb | Pending reads 0
mariadb | Pending writes: LRU 0, flush list 20, single page 0
mariadb | Pages made young 0, not young 0
mariadb | 0.00 youngs/s, 0.00 non-youngs/s
mariadb | Pages read 128851, created 131, written 282
mariadb | 0.00 reads/s, 0.00 creates/s, 0.00 writes/s
mariadb | No buffer pool page gets since the last printout
mariadb | Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
mariadb | LRU len: 128982, unzip_LRU len: 0
mariadb | I/O sum[0]:cur[0], unzip sum[0]:cur[0]
mariadb | ----------------------
mariadb | INDIVIDUAL BUFFER POOL INFO
mariadb | ----------------------
mariadb | ---BUFFER POOL 0
mariadb | Buffer pool size 64216
mariadb | Free buffers 48110
mariadb | Database pages 16104
mariadb | Old database pages 5964
mariadb | Modified db pages 20
mariadb | Percent of dirty pages(LRU & free pages): 0.031
mariadb | Max dirty pages percent: 75.000
mariadb | Pending reads 0
mariadb | Pending writes: LRU 0, flush list 20, single page 0
mariadb | Pages made young 0, not young 0
mariadb | 0.00 youngs/s, 0.00 non-youngs/s
mariadb | Pages read 16100, created 4, written 31
mariadb | 0.00 reads/s, 0.00 creates/s, 0.00 writes/s
mariadb | No buffer pool page gets since the last printout
mariadb | Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
mariadb | LRU len: 16104, unzip_LRU len: 0
mariadb | I/O sum[0]:cur[0], unzip sum[0]:cur[0]
mariadb | ---BUFFER POOL 1
mariadb | Buffer pool size 64216
mariadb | Free buffers 48038
mariadb | Database pages 16177
mariadb | Old database pages 5991
mariadb | Modified db pages 5
mariadb | Percent of dirty pages(LRU & free pages): 0.008
mariadb | Max dirty pages percent: 75.000
mariadb | Pending reads 0
mariadb | Pending writes: LRU 0, flush list 0, single page 0
mariadb | Pages made young 0, not young 0
mariadb | 0.00 youngs/s, 0.00 non-youngs/s
mariadb | Pages read 16177, created 0, written 8
mariadb | 0.00 reads/s, 0.00 creates/s, 0.00 writes/s
mariadb | No buffer pool page gets since the last printout
mariadb | Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
mariadb | LRU len: 16177, unzip_LRU len: 0
mariadb | I/O sum[0]:cur[0], unzip sum[0]:cur[0]
mariadb | ---BUFFER POOL 2
mariadb | Buffer pool size 64216
mariadb | Free buffers 48125
mariadb | Database pages 16089
mariadb | Old database pages 5959
mariadb | Modified db pages 0
mariadb | Percent of dirty pages(LRU & free pages): 0.000
mariadb | Max dirty pages percent: 75.000
mariadb | Pending reads 0
mariadb | Pending writes: LRU 0, flush list 0, single page 0
mariadb | Pages made young 0, not young 0
mariadb | 0.00 youngs/s, 0.00 non-youngs/s
mariadb | Pages read 16089, created 0, written 0
mariadb | 0.00 reads/s, 0.00 creates/s, 0.00 writes/s
mariadb | No buffer pool page gets since the last printout
mariadb | Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
mariadb | LRU len: 16089, unzip_LRU len: 0
mariadb | I/O sum[0]:cur[0], unzip sum[0]:cur[0]
mariadb | ---BUFFER POOL 3
mariadb | Buffer pool size 64216
mariadb | Free buffers 48078
mariadb | Database pages 16137
mariadb | Old database pages 5976
mariadb | Modified db pages 64
mariadb | Percent of dirty pages(LRU & free pages): 0.100
mariadb | Max dirty pages percent: 75.000
mariadb | Pending reads 0
mariadb | Pending writes: LRU 0, flush list 0, single page 0
mariadb | Pages made young 0, not young 0
mariadb | 0.00 youngs/s, 0.00 non-youngs/s
mariadb | Pages read 16137, created 0, written 65
mariadb | 0.00 reads/s, 0.00 creates/s, 0.00 writes/s
mariadb | No buffer pool page gets since the last printout
mariadb | Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
mariadb | LRU len: 16137, unzip_LRU len: 0
mariadb | I/O sum[0]:cur[0], unzip sum[0]:cur[0]
mariadb | ---BUFFER POOL 4
mariadb | Buffer pool size 64216
mariadb | Free buffers 48092
mariadb | Database pages 16124
mariadb | Old database pages 5972
mariadb | Modified db pages 46
mariadb | Percent of dirty pages(LRU & free pages): 0.072
mariadb | Max dirty pages percent: 75.000
mariadb | Pending reads 0
mariadb | Pending writes: LRU 0, flush list 0, single page 0
mariadb | Pages made young 0, not young 0
mariadb | 0.00 youngs/s, 0.00 non-youngs/s
mariadb | Pages read 16124, created 0, written 46
mariadb | 0.00 reads/s, 0.00 creates/s, 0.00 writes/s
mariadb | No buffer pool page gets since the last printout
mariadb | Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
mariadb | LRU len: 16124, unzip_LRU len: 0
mariadb | I/O sum[0]:cur[0], unzip sum[0]:cur[0]
mariadb | ---BUFFER POOL 5
mariadb | Buffer pool size 64216
mariadb | Free buffers 48143
mariadb | Database pages 16071
mariadb | Old database pages 5952
mariadb | Modified db pages 1
mariadb | Percent of dirty pages(LRU & free pages): 0.002
mariadb | Max dirty pages percent: 75.000
mariadb | Pending reads 0
mariadb | Pending writes: LRU 0, flush list 0, single page 0
mariadb | Pages made young 0, not young 0
mariadb | 0.00 youngs/s, 0.00 non-youngs/s
mariadb | Pages read 16071, created 0, written 1
mariadb | 0.00 reads/s, 0.00 creates/s, 0.00 writes/s
mariadb | No buffer pool page gets since the last printout
mariadb | Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
mariadb | LRU len: 16071, unzip_LRU len: 0
mariadb | I/O sum[0]:cur[0], unzip sum[0]:cur[0]
mariadb | ---BUFFER POOL 6
mariadb | Buffer pool size 64216
mariadb | Free buffers 48087
mariadb | Database pages 16129
mariadb | Old database pages 5973
mariadb | Modified db pages 0
mariadb | Percent of dirty pages(LRU & free pages): 0.000
mariadb | Max dirty pages percent: 75.000
mariadb | Pending reads 0
mariadb | Pending writes: LRU 0, flush list 0, single page 0
mariadb | Pages made young 0, not young 0
mariadb | 0.00 youngs/s, 0.00 non-youngs/s
mariadb | Pages read 16066, created 63, written 63
mariadb | 0.00 reads/s, 0.00 creates/s, 0.00 writes/s
mariadb | No buffer pool page gets since the last printout
mariadb | Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
mariadb | LRU len: 16129, unzip_LRU len: 0
mariadb | I/O sum[0]:cur[0], unzip sum[0]:cur[0]
mariadb | ---BUFFER POOL 7
mariadb | Buffer pool size 64216
mariadb | Free buffers 48064
mariadb | Database pages 16151
mariadb | Old database pages 5981
mariadb | Modified db pages 0
mariadb | Percent of dirty pages(LRU & free pages): 0.000
mariadb | Max dirty pages percent: 75.000
mariadb | Pending reads 0
mariadb | Pending writes: LRU 0, flush list 0, single page 0
mariadb | Pages made young 0, not young 0
mariadb | 0.00 youngs/s, 0.00 non-youngs/s
mariadb | Pages read 16087, created 64, written 68
mariadb | 0.00 reads/s, 0.00 creates/s, 0.00 writes/s
mariadb | No buffer pool page gets since the last printout
mariadb | Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
mariadb | LRU len: 16151, unzip_LRU len: 0
mariadb | I/O sum[0]:cur[0], unzip sum[0]:cur[0]
mariadb | --------------
mariadb | ROW OPERATIONS
mariadb | --------------
mariadb | 0 queries inside InnoDB, 0 queries in queue
mariadb | 0 read views open inside InnoDB
mariadb | Process ID=1, Main thread ID=139874768164608, state: enforcing dict cache limit
mariadb | Number of rows inserted 0, updated 0, deleted 0, read 200
mariadb | 0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s
mariadb | Number of system rows inserted 5, updated 0, deleted 4, read 9
mariadb | 0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s
mariadb | ----------------------------
mariadb | END OF INNODB MONITOR OUTPUT
mariadb | ============================
Unfortunately, it looks like you did not include an actual error message. The output you posted is the result of SHOW ENGINE INNODB STATUS; which exposes some of InnoDB's internals. It shows a lightly used server with very little traffic: two transactions are listed, one committed and flushing its log and one not started, and neither holds any row locks. Most of the InnoDB buffer pool is still free. The only unusual part is the SEMAPHORES section, where two threads have been waiting for more than 230 seconds on the data dictionary mutex and latch; apart from that, nothing here points towards any issues.
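To put numbers on the remark about free buffer pool memory: the monitor output reports sizes in pages, so converting them to bytes needs the page size. The sketch below assumes the default 16 KiB innodb_page_size (the posted output does not state it) and uses the totals from the BUFFER POOL AND MEMORY section.

```python
# Figures copied from the BUFFER POOL AND MEMORY section of the posted
# output (unit: pages).
BUFFER_POOL_PAGES = 513728
FREE_PAGES = 384737

# Assumption: default innodb_page_size of 16 KiB; the output does not state it.
PAGE_SIZE_BYTES = 16 * 1024


def pages_to_gib(pages: int, page_size: int = PAGE_SIZE_BYTES) -> float:
    """Convert an InnoDB page count to GiB."""
    return pages * page_size / 1024**3


pool_gib = pages_to_gib(BUFFER_POOL_PAGES)
free_gib = pages_to_gib(FREE_PAGES)
free_pct = 100 * FREE_PAGES / BUFFER_POOL_PAGES

print(f"pool {pool_gib:.2f} GiB, free {free_gib:.2f} GiB ({free_pct:.1f}% free)")
```

Under that assumption the pool is a little under 8 GiB, which is consistent with the "Total large memory allocated 8606711808" line, and about three quarters of it is unused.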
In general, the problem with the Galera cluster could be that the second node did not join the cluster properly. Check whether every node is in the "Primary" state, as shown in the output of SHOW STATUS LIKE 'wsrep_cluster_status'; If a node is not in the Primary state, you won't be able to run queries against it (unless you have set wsrep_dirty_reads, in which case you can run SELECTs, but that's another story). This is intended to prevent stale reads: the dataset on a non-Primary node is outdated because the node is not part of the cluster.
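As a rough sketch of that check, the helper below takes the wsrep status variables of one node as a plain dictionary (however you fetch them, for example from SHOW GLOBAL STATUS LIKE 'wsrep_%';) and lists reasons why the node may not be usable. The function name, the expected cluster size of 3, and the sample values are illustrative assumptions, not output from the poster's cluster.

```python
from typing import Dict, List


def galera_node_problems(status: Dict[str, str], expected_size: int = 3) -> List[str]:
    """Return human-readable problems found in one node's wsrep status."""
    problems = []

    # A node outside the primary component refuses normal queries.
    if status.get("wsrep_cluster_status") != "Primary":
        problems.append("node is not part of the Primary component")

    # wsrep_ready is ON only when the node can accept queries.
    if status.get("wsrep_ready") != "ON":
        problems.append("wsrep_ready is not ON")

    # Fewer members than expected means some node has not joined.
    size = int(status.get("wsrep_cluster_size", "0"))
    if size < expected_size:
        problems.append(f"cluster size is {size}, expected {expected_size}")

    return problems


# Hypothetical values for a node that failed to join.
sample = {
    "wsrep_cluster_status": "non-Primary",
    "wsrep_ready": "OFF",
    "wsrep_cluster_size": "1",
}
for problem in galera_node_problems(sample):
    print(problem)
```

Running the same check against every node shows quickly which one is outside the primary component.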

mysql crashing on queries to some tables

We are running mariadb 10.3.25:
$ mysql --version
mysql Ver 15.1 Distrib 10.3.25-MariaDB, for debian-linux-gnu (x86_64) using readline 5.2
It seems that there is some sort of corruption in some of our databases’ tables.
Exhibit A:
MariaDB [etherpad]> select * from store;
ERROR 2013 (HY000): Lost connection to MySQL server during query
In the meantime, this happens in the log:
Jan 16 19:51:52 hostname mysqld[31236]: 2021-01-16 19:51:52 0x7f0c884b8700 InnoDB: Assertion failure in file /build/mariadb-10.3-RRxkin/mariadb-10.3-10.3.25/storage/innobase/row/row0sel.cc line 2972
Jan 16 19:51:52 hostname mysqld[31236]: InnoDB: Failing assertion: prebuilt->trx->isolation_level == TRX_ISO_READ_UNCOMMITTED
Jan 16 19:51:52 hostname mysqld[31236]: InnoDB: We intentionally generate a memory trap.
Jan 16 19:51:52 hostname mysqld[31236]: InnoDB: [...]
Jan 16 19:51:52 hostname mysqld[31236]: 210116 19:51:52 [ERROR] mysqld got signal 6 ;
Jan 16 19:51:52 hostname mysqld[31236]: This could be because you hit a bug. It is also possible that this binary
Jan 16 19:51:52 hostname mysqld[31236]: or one of the libraries it was linked against is corrupt, improperly built,
Jan 16 19:51:52 hostname mysqld[31236]: or misconfigured. This error can also be caused by malfunctioning hardware.
Jan 16 19:51:52 hostname mysqld[31236]: [...]
Jan 16 19:51:52 hostname mysqld[31236]: We will try our best to scrape up some info that will hopefully help
Jan 16 19:51:52 hostname mysqld[31236]: diagnose the problem, but since we have already crashed,
Jan 16 19:51:52 hostname mysqld[31236]: something is definitely wrong and this may fail.
Jan 16 19:51:52 hostname mysqld[31236]: Server version: 10.3.25-MariaDB-0+deb10u1-log
Jan 16 19:51:52 hostname mysqld[31236]: key_buffer_size=16777216
Jan 16 19:51:52 hostname mysqld[31236]: read_buffer_size=131072
Jan 16 19:51:52 hostname mysqld[31236]: max_used_connections=16
Jan 16 19:51:52 hostname mysqld[31236]: max_threads=153
Jan 16 19:51:52 hostname mysqld[31236]: thread_count=22
Jan 16 19:51:52 hostname mysqld[31236]: It is possible that mysqld could use up to
Jan 16 19:51:52 hostname mysqld[31236]: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 352736 K bytes of memory
Jan 16 19:51:52 hostname mysqld[31236]: Hope that's ok; if not, decrease some variables in the equation.
Jan 16 19:51:52 hostname mysqld[31236]: Thread pointer: 0x7f0c500093b8
Jan 16 19:51:52 hostname mysqld[31236]: Attempting backtrace. You can use the following information to find out
Jan 16 19:51:52 hostname mysqld[31236]: where mysqld died. If you see no messages after this, something went
Jan 16 19:51:52 hostname mysqld[31236]: terribly wrong...
Jan 16 19:51:52 hostname mysqld[31236]: stack_bottom = 0x7f0c884b7dd8 thread_stack 0x30000
Jan 16 19:51:52 hostname mysqld[31236]: /usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x563337b2b05e]
Jan 16 19:51:52 hostname mysqld[31236]: /usr/sbin/mysqld(handle_fatal_signal+0x54d)[0x56333765e09d]
Jan 16 19:51:53 hostname mysqld[31236]: /lib/x86_64-linux-gnu/libpthread.so.0(+0x12730)[0x7f0c91ef1730]
Jan 16 19:51:53 hostname mysqld[31236]: /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x10b)[0x7f0c914ae7bb]
Jan 16 19:51:53 hostname mysqld[31236]: /lib/x86_64-linux-gnu/libc.so.6(abort+0x121)[0x7f0c91499535]
Jan 16 19:51:53 hostname mysqld[31236]: /usr/sbin/mysqld(+0x4e3433)[0x5633373a2433]
Jan 16 19:51:53 hostname mysqld[31236]: /usr/sbin/mysqld(+0x4d5d6c)[0x563337394d6c]
Jan 16 19:51:53 hostname mysqld[31236]: /usr/sbin/mysqld(+0x9d8814)[0x563337897814]
Jan 16 19:51:53 hostname mysqld[31236]: /usr/sbin/mysqld(+0x9dcdcf)[0x56333789bdcf]
Jan 16 19:51:53 hostname mysqld[31236]: /usr/sbin/mysqld(+0x918681)[0x5633377d7681]
Jan 16 19:51:53 hostname mysqld[31236]: /usr/sbin/mysqld(_ZN7handler11ha_rnd_nextEPh+0x127)[0x563337662db7]
Jan 16 19:51:53 hostname mysqld[31236]: /usr/sbin/mysqld(_Z13rr_sequentialP11READ_RECORD+0x1c)[0x56333776a43c]
Jan 16 19:51:53 hostname mysqld[31236]: /usr/sbin/mysqld(_Z10sub_selectP4JOINP13st_join_tableb+0x1e3)[0x5633374bdf03]
Jan 16 19:51:53 hostname mysqld[31236]: /usr/sbin/mysqld(_ZN4JOIN10exec_innerEv+0xaaa)[0x5633374e01ba]
Jan 16 19:51:53 hostname mysqld[31236]: /usr/sbin/mysqld(_ZN4JOIN4execEv+0x33)[0x5633374e03d3]
Jan 16 19:51:53 hostname mysqld[31236]: /usr/sbin/mysqld(_Z12mysql_selectP3THDP10TABLE_LISTjR4ListI4ItemEPS4_jP8st_orderS9_S7_S9_yP13select_resultP18st_select_lex_unitP13st_select_lex+0xef)[0x5633374deaaf]
Jan 16 19:51:53 hostname mysqld[31236]: /usr/sbin/mysqld(_Z13handle_selectP3THDP3LEXP13select_resultm+0x14d)[0x5633374df38d]
Jan 16 19:51:53 hostname mysqld[31236]: /usr/sbin/mysqld(+0x5c1d8c)[0x563337480d8c]
Jan 16 19:51:53 hostname mysqld[31236]: /usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x5857)[0x56333748d087]
Jan 16 19:51:53 hostname mysqld[31236]: /usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_statebb+0x1c9)[0x56333748f879]
Jan 16 19:51:54 hostname mysqld[31236]: /usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcjbb+0x111d)[0x56333749172d]
Jan 16 19:51:54 hostname mysqld[31236]: /usr/sbin/mysqld(_Z10do_commandP3THD+0x122)[0x563337492e82]
Jan 16 19:51:54 hostname mysqld[31236]: /usr/sbin/mysqld(_Z24do_handle_one_connectionP7CONNECT+0x23a)[0x5633375641ba]
Jan 16 19:51:54 hostname mysqld[31236]: /usr/sbin/mysqld(handle_one_connection+0x3d)[0x56333756433d]
Jan 16 19:51:55 hostname mysqld[31236]: /lib/x86_64-linux-gnu/libpthread.so.0(+0x7fa3)[0x7f0c91ee6fa3]
Jan 16 19:51:55 hostname mysqld[31236]: /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f)[0x7f0c915704cf]
Jan 16 19:51:55 hostname mysqld[31236]: Trying to get some variables.
Jan 16 19:51:55 hostname mysqld[31236]: Some pointers may be invalid and cause the dump to abort.
Jan 16 19:51:55 hostname mysqld[31236]: Query (0x7f0c50012e20): select * from store
Jan 16 19:51:55 hostname mysqld[31236]: Connection ID (thread ID): 733
Jan 16 19:51:55 hostname mysqld[31236]: Status: NOT_KILLED
Jan 16 19:51:55 hostname mysqld[31236]: Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on
Jan 16 19:51:55 hostname mysqld[31236]: The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains
Jan 16 19:51:55 hostname mysqld[31236]: information that should help you find out what is causing the crash.
Jan 16 19:51:55 hostname mysqld[31236]: Writing a core file...
Jan 16 19:51:55 hostname mysqld[31236]: Working directory at /var/lib/mysql
Jan 16 19:51:55 hostname mysqld[31236]: Resource Limits:
Jan 16 19:51:55 hostname mysqld[31236]: Limit Soft Limit Hard Limit Units
Jan 16 19:51:55 hostname mysqld[31236]: Max cpu time unlimited unlimited seconds
Jan 16 19:51:55 hostname mysqld[31236]: Max file size unlimited unlimited bytes
Jan 16 19:51:55 hostname mysqld[31236]: Max data size unlimited unlimited bytes
Jan 16 19:51:55 hostname mysqld[31236]: Max stack size 8388608 unlimited bytes
Jan 16 19:51:55 hostname mysqld[31236]: Max core file size 0 unlimited bytes
Jan 16 19:51:55 hostname mysqld[31236]: Max resident set unlimited unlimited bytes
Jan 16 19:51:55 hostname mysqld[31236]: Max processes 15390 15390 processes
Jan 16 19:51:55 hostname mysqld[31236]: Max open files 65536 65536 files
Jan 16 19:51:55 hostname mysqld[31236]: Max locked memory 65536 65536 bytes
Jan 16 19:51:55 hostname mysqld[31236]: Max address space unlimited unlimited bytes
Jan 16 19:51:55 hostname mysqld[31236]: Max file locks unlimited unlimited locks
Jan 16 19:51:55 hostname mysqld[31236]: Max pending signals 15390 15390 signals
Jan 16 19:51:55 hostname mysqld[31236]: Max msgqueue size 819200 819200 bytes
Jan 16 19:51:55 hostname mysqld[31236]: Max nice priority 0 0
Jan 16 19:51:55 hostname mysqld[31236]: Max realtime priority 0 0
Jan 16 19:51:55 hostname mysqld[31236]: Max realtime timeout unlimited unlimited us
Jan 16 19:51:55 hostname mysqld[31236]: Core pattern: core
Jan 16 19:52:02 hostname mysqld[6672]: [... innodb crash recovery ...]
A very similar thing happens with some other tables as well.
What I tried:
I wanted to dump all data, purge the entire mariadb installation and restore. Unsurprisingly, mysqldump runs into the same corruption (?) and the database crashes during the dump.
I tried following a guide that advises creating a MyISAM table and filling it with data from the InnoDB table, but this fails for the same reason.
What can be done about this? Naturally, we need the data in these tables. It appears that once the query hits a certain record/block (I am oblivious to inner workings of mysql) it crashes the server. So how do we salvage the data?
UPDATE 2021-01-18: As requested, here are the variables and status queries:
MariaDB [(none)]> show global variables like '%thread%';
+-----------------------------------------+---------------------------+
| Variable_name | Value |
+-----------------------------------------+---------------------------+
| aria_repair_threads | 1 |
| binlog_optimize_thread_scheduling | ON |
| debug_no_thread_alarm | OFF |
| innodb_encryption_threads | 0 |
| innodb_purge_threads | 4 |
| innodb_read_io_threads | 4 |
| innodb_thread_concurrency | 0 |
| innodb_thread_sleep_delay | 10000 |
| innodb_write_io_threads | 4 |
| max_delayed_threads | 20 |
| max_insert_delayed_threads | 20 |
| myisam_repair_threads | 1 |
| performance_schema_max_thread_classes | 50 |
| performance_schema_max_thread_instances | -1 |
| slave_domain_parallel_threads | 0 |
| slave_parallel_threads | 0 |
| thread_cache_size | 8 |
| thread_concurrency | 10 |
| thread_handling | one-thread-per-connection |
| thread_pool_idle_timeout | 60 |
| thread_pool_max_threads | 65536 |
| thread_pool_oversubscribe | 3 |
| thread_pool_prio_kickup_timer | 1000 |
| thread_pool_priority | auto |
| thread_pool_size | 1 |
| thread_pool_stall_limit | 500 |
| thread_stack | 196608 |
| wsrep_slave_threads | 1 |
+-----------------------------------------+---------------------------+
28 rows in set (0.001 sec)
MariaDB [(none)]> show global status like '%thread%';
+------------------------------------------+-------+
| Variable_name | Value |
+------------------------------------------+-------+
| Delayed_insert_threads | 0 |
| Performance_schema_thread_classes_lost | 0 |
| Performance_schema_thread_instances_lost | 0 |
| Slow_launch_threads | 0 |
| Threadpool_idle_threads | 0 |
| Threadpool_threads | 0 |
| Threads_cached | 7 |
| Threads_connected | 12 |
| Threads_created | 98 |
| Threads_running | 6 |
| wsrep_applier_thread_count | 0 |
| wsrep_rollbacker_thread_count | 0 |
| wsrep_thread_count | 0 |
+------------------------------------------+-------+
13 rows in set (0.001 sec)
MariaDB [(none)]> show global variables like '%timeout%';
+---------------------------------------+----------+
| Variable_name | Value |
+---------------------------------------+----------+
| connect_timeout | 10 |
| deadlock_timeout_long | 50000000 |
| deadlock_timeout_short | 10000 |
| delayed_insert_timeout | 300 |
| idle_readonly_transaction_timeout | 0 |
| idle_transaction_timeout | 0 |
| idle_write_transaction_timeout | 0 |
| innodb_flush_log_at_timeout | 1 |
| innodb_lock_wait_timeout | 50 |
| innodb_rollback_on_timeout | OFF |
| interactive_timeout | 28800 |
| lock_wait_timeout | 86400 |
| net_read_timeout | 600 |
| net_write_timeout | 600 |
| rpl_semi_sync_master_timeout | 10000 |
| rpl_semi_sync_slave_kill_conn_timeout | 5 |
| slave_net_timeout | 60 |
| thread_pool_idle_timeout | 60 |
| wait_timeout | 28800 |
+---------------------------------------+----------+
19 rows in set (0.001 sec)
MariaDB [(none)]> show global status like '%timeout%';
+-------------------------------------+-------+
| Variable_name | Value |
+-------------------------------------+-------+
| Binlog_group_commit_trigger_timeout | 0 |
| Master_gtid_wait_timeouts | 0 |
| Ssl_default_timeout | 0 |
| Ssl_session_cache_timeouts | 0 |
+-------------------------------------+-------+
4 rows in set (0.001 sec)
MariaDB [(none)]> show global status like '%aborted%';
+------------------+-------+
| Variable_name | Value |
+------------------+-------+
| Aborted_clients | 3 |
| Aborted_connects | 0 |
+------------------+-------+
2 rows in set (0.001 sec)
The server has 5 GB of RAM.
About the store table:
MariaDB [etherpad]> show create table store;
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| store | CREATE TABLE `store` (
`key` varchar(100) COLLATE utf8_bin NOT NULL DEFAULT '',
`value` longtext COLLATE utf8_bin NOT NULL,
PRIMARY KEY (`key`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.000 sec)
MariaDB [etherpad]> select count(*) from store;
+----------+
| count(*) |
+----------+
| 779443 |
+----------+
1 row in set (1 min 19.378 sec)
Here is the iostat info:
$ iostat -xm 5 3
Linux 4.14.0-0.bpo.3-amd64 (hostname) 01/18/2021 _x86_64_ (1 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
8.63 2.39 16.53 22.68 0.23 49.54
Device r/s w/s rMB/s wMB/s rrqm/s wrqm/s %rrqm %wrqm r_await w_await aqu-sz rareq-sz wareq-sz svctm %util
xvdap2 7.67 37.91 0.07 0.67 0.13 37.91 1.67 50.00 16.35 2.54 0.05 9.40 18.01 4.35 19.82
xvdap1 0.51 1.25 0.00 0.01 0.02 0.07 3.58 5.64 7.52 27.01 0.03 4.15 4.24 1.63 0.29
avg-cpu: %user %nice %system %iowait %steal %idle
18.51 2.21 15.49 55.33 0.40 8.05
Device r/s w/s rMB/s wMB/s rrqm/s wrqm/s %rrqm %wrqm r_await w_await aqu-sz rareq-sz wareq-sz svctm %util
xvdap2 4.00 157.80 0.02 1.53 0.00 71.00 0.00 31.03 5.80 55.33 7.93 4.00 9.92 4.37 70.72
xvdap1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
8.96 2.44 15.68 15.27 0.41 57.23
Device r/s w/s rMB/s wMB/s rrqm/s wrqm/s %rrqm %wrqm r_await w_await aqu-sz rareq-sz wareq-sz svctm %util
xvdap2 0.00 22.20 0.00 0.40 0.00 35.40 0.00 61.46 0.00 22.81 0.30 0.00 18.27 4.11 9.12
xvdap1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
UPDATE 2021-01-24: I tried to pinpoint the problem by quasi-bisecting the table with limit clauses and found that, out of the ~800,000 records, every query that selects records after record 663,187 crashes the DB. The few records just before record 663,187 contain seemingly mangled data, see below.
MariaDB [etherpad]> select * from store limit 663184, 1\G;
*************************** 1. row ***************************
key:
value:
f[Y
f[팩
Does this not hint at data corruption? What could I do about the problem? Get rid of these records?
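The bisection described in the update can be sketched in Python as follows. This is only an illustration: `first_failing_offset` and `fake_probe` are hypothetical names, and the probe is a stand-in for actually running something like `SELECT * FROM store LIMIT <offset>, 1` against the server. It assumes that once one offset fails, every later offset fails too, which matches the behaviour reported above.

```python
from typing import Callable


def first_failing_offset(total_rows: int, read_ok: Callable[[int], bool]) -> int:
    """Return the smallest offset in [0, total_rows) for which read_ok is False.

    Returns total_rows if every offset can be read.
    Assumes failures are monotonic: after the first bad offset, all later ones fail.
    """
    lo, hi = 0, total_rows
    while lo < hi:
        mid = (lo + hi) // 2
        if read_ok(mid):
            lo = mid + 1   # everything up to mid is readable
        else:
            hi = mid       # the first failure is at mid or earlier
    return lo


# Hypothetical probe: pretends that reads start failing at offset 663187.
def fake_probe(offset: int) -> bool:
    return offset < 663187


print(first_failing_offset(779443, fake_probe))
```

With roughly 800,000 rows this needs about 20 probes, which matters when each failed probe crashes the server.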
From the information available at this time, consider in your my.cnf [mysqld] section
innodb_buffer_pool_size=2G # to use 40% of available RAM
REMOVE thread_cache_size to allow default sizing (or set it to 256)
REMOVE thread_stack to allow default calc of slightly larger thread_stack per ref man
Once we know the number of cores/CPUs, we may be able to provide additional suggestions, particularly if more than one CPU is available.
Your query: select count(*) from store;
may complete in less than a minute if you try
SELECT COUNT(`key`) FROM store;
so that only the index is read, rather than every row.
Have a GREAT 2021.

Very slow writes on MySQL 8 - waiting for handler commit

I have a MySQL 8 Docker installation on an edge device, with the following two tables to write to:
video_paths | CREATE TABLE `video_paths` (
`entry` int(11) NOT NULL AUTO_INCREMENT,
`timestamp` bigint(20) NOT NULL,
`duration` int(11) NOT NULL,
`path` varchar(255) NOT NULL,
`motion` int(11) NOT NULL DEFAULT '0',
`cam_id` varchar(255) NOT NULL DEFAULT '',
`hd` tinyint(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`entry`),
KEY `cam_id` (`cam_id`),
KEY `timestamp` (`timestamp`)
) ENGINE=InnoDB AUTO_INCREMENT=7342309 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
and
CREATE TABLE `tracker` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`table_name` varchar(255) NOT NULL,
`primary_key_name` varchar(255) NOT NULL,
`pointer` int(11) NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
UNIQUE KEY `table_name` (`table_name`)
) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
The following queries are run every few secs for up to 32 cameras and are taking a lot of time as indicated by the slow query log.
UPDATE tracker SET pointer = 7342046 WHERE table_name = 'video_paths'
INSERT INTO video_paths (timestamp,duration,path,cam_id,hd) VALUES (1597548365000,5000,'/s/ss/x-0/v/2020-08-16/3/1.ts','x-1',1)
Most of the time is spent in the waiting for handler commit state.
The total size of my data (tables + indexes) is ~1GB and I have the following settings enabled to optimise for writes:
skip-log-bin - Disabled the binary log because I don't have a replica and therefore have no use for it.
innodb_flush_log_at_trx_commit=2 - I am optimising for performance rather than consistency here.
range_optimizer_max_mem_size=0 - As mentioned in this question, I have allowed unlimited memory for the range optimiser.
innodb_buffer_pool_size=512MB - This should be enough for my data?
innodb_log_file_size=96MB * 2 files
I am seeing queries that sometimes take up to 90-100 secs.
SET timestamp=1597549337;
INSERT INTO video_paths (timestamp,duration,path,cam_id,hd) VALUES (1597548365000,5000,'/s/ss/x-0/v/2020-08-16/3/1.ts','x-1',1);
# Time: 2020-08-16T03:42:24.533408Z
# Query_time: 96.712976 Lock_time: 0.000033 Rows_sent: 0 Rows_examined: 0
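As a rough aid for reading entries like the one above, the following Python sketch pulls `Query_time` and `Lock_time` out of slow-query-log text and reports how much of the time was not spent waiting on locks. The function name `slow_entries` and the 10-second threshold are assumptions made for illustration; the sample text is the entry quoted above.

```python
import re

# Matches the statistics line of a slow query log entry.
QUERY_TIME = re.compile(
    r"^# Query_time: (?P<query>[\d.]+)\s+Lock_time: (?P<lock>[\d.]+)"
)


def slow_entries(log_text: str, threshold_s: float = 10.0):
    """Yield (query_time, lock_time) for entries at or above threshold_s seconds."""
    for line in log_text.splitlines():
        m = QUERY_TIME.match(line.strip())
        if m and float(m["query"]) >= threshold_s:
            yield float(m["query"]), float(m["lock"])


sample = """# Time: 2020-08-16T03:42:24.533408Z
# Query_time: 96.712976 Lock_time: 0.000033 Rows_sent: 0 Rows_examined: 0
"""

for query_s, lock_s in slow_entries(sample):
    print(f"query={query_s:.3f}s lock={lock_s:.6f}s other={query_s - lock_s:.3f}s")
```

For the entry above, almost none of the 96 seconds is lock wait, which is consistent with the statement being stalled elsewhere (for example on the commit) rather than blocked by another transaction.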
---UPDATE---
Here's the complete my.cnf file
my.cnf
[mysqld]
pid-file = /var/run/mysqld/mysqld.pid
socket = /var/run/mysqld/mysqld.sock
datadir = /var/lib/mysql
secure-file-priv= NULL
# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0
skip-log-bin
innodb_buffer_pool_size=536870912
innodb_log_file_size=100663296
# Custom config should go here
!includedir /etc/mysql/conf.d/
conf.d/docker.cnf
[mysqld]
skip-host-cache
skip-name-resolve
The docker container is using the host mode so complete 15GB memory is available to the container.
--- UPDATE 2 ---
After increasing innodb_buffer_pool_size to 2GB as suggested by @fyrye, the statements have now started getting stuck on STATE = UPDATE instead of waiting for handler commit.
---- UPDATE 3 ---
Looks like the CPU is causing the bottleneck
---- UPDATE 4 ----
Additional info
Ram Size
total used free shared buff/cache available
Mem: 15909 1711 9385 2491 4813 11600
Swap: 0 0 0
No SSD/NVMe devices attached
SHOW GLOBAL STATUS - https://pastebin.com/vtWi0PUq
SHOW GLOBAL VARIABLES - https://pastebin.com/MUZeG959
SHOW FULL PROCESSLIST - https://pastebin.com/eebEcYk7
htop - htop here is for the edge system which has 4 other containers running which include the main app, ffmpeg, mqtt, etc.
ulimit -a:
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 62576
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 62576
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
iostat -xm 5 4
Linux 4.15.0-106-generic (xxxx) 08/18/2020 _x86_64_ (4 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
26.97 0.00 22.36 22.53 0.00 28.14
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
loop0 0.00 0.00 0.00 0.00 0.00 0.00 3.20 0.00 2.40 2.40 0.00 0.00 0.00
sda 13.78 9.89 32.24 11.44 0.37 4.10 209.51 47.52 1079.07 44.07 3994.87 22.39 97.81
avg-cpu: %user %nice %system %iowait %steal %idle
19.71 0.00 27.85 40.87 0.00 11.57
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
loop0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sda 0.00 0.00 1.40 4.60 0.03 2.71 934.93 142.66 24221.33 666.29 31390.26 166.67 100.00
avg-cpu: %user %nice %system %iowait %steal %idle
20.16 0.00 26.77 28.30 0.00 24.77
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
loop0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sda 0.00 0.00 8.80 5.60 0.03 3.45 496.11 141.28 12507.78 194.00 31858.00 69.44 100.00
mpstat -P ALL 5 3
Linux 4.15.0-106-generic (sn-1f0ce8) 08/18/2020 _x86_64_ (4 CPU)
02:15:47 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
02:15:52 PM all 21.48 0.00 20.40 29.01 0.00 7.94 0.00 0.00 0.00 21.17
02:15:52 PM 0 24.95 0.00 20.86 5.32 0.00 0.61 0.00 0.00 0.00 48.26
02:15:52 PM 1 17.59 0.00 18.81 57.67 0.00 5.93 0.00 0.00 0.00 0.00
02:15:52 PM 2 21.28 0.00 17.36 0.21 0.00 24.79 0.00 0.00 0.00 36.36
02:15:52 PM 3 22.34 0.00 24.59 52.46 0.00 0.61 0.00 0.00 0.00 0.00
02:15:52 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
02:15:57 PM all 20.56 0.00 20.00 28.26 0.00 7.08 0.00 0.00 0.00 24.10
02:15:57 PM 0 24.44 0.00 18.89 12.32 0.00 0.21 0.00 0.00 0.00 44.15
02:15:57 PM 1 17.73 0.00 15.46 33.20 0.00 4.95 0.00 0.00 0.00 28.66
02:15:57 PM 2 18.93 0.00 22.22 12.35 0.00 22.84 0.00 0.00 0.00 23.66
02:15:57 PM 3 21.06 0.00 23.31 55.21 0.00 0.41 0.00 0.00 0.00 0.00
02:15:57 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
02:16:02 PM all 21.81 0.00 18.32 26.42 0.00 7.03 0.00 0.00 0.00 26.42
02:16:02 PM 0 26.43 0.00 19.67 0.20 0.00 0.41 0.00 0.00 0.00 53.28
02:16:02 PM 1 20.57 0.00 17.11 45.21 0.00 5.30 0.00 0.00 0.00 11.81
02:16:02 PM 2 19.67 0.00 16.74 0.21 0.00 21.97 0.00 0.00 0.00 41.42
02:16:02 PM 3 20.45 0.00 19.84 58.91 0.00 0.81 0.00 0.00 0.00 0.00
Average: CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
Average: all 21.28 0.00 19.57 27.90 0.00 7.35 0.00 0.00 0.00 23.90
Average: 0 25.27 0.00 19.81 5.94 0.00 0.41 0.00 0.00 0.00 48.57
Average: 1 18.63 0.00 17.13 45.39 0.00 5.39 0.00 0.00 0.00 13.45
Average: 2 19.96 0.00 18.78 4.28 0.00 23.20 0.00 0.00 0.00 33.77
Average: 3 21.28 0.00 22.57 55.54 0.00 0.61 0.00 0.00 0.00 0.00
Suggestions to consider for your my.cnf [mysqld] section
log_error=/var/lib/mysql/sn-1f0ce8-error.log # from stderr to have a visible error log
innodb_lru_scan_depth=100 # from 1024 to conserve 90% CPU cycles used for function
innodb_io_capacity=1900 # from 200 to allow more IOPSecond to your storage device
innodb_flush_neighbors=2 # from 0 to expedite writing to current extent
innodb_max_dirty_pages_pct_lwm=1 # from 10 percent to expedite writes
innodb_max_dirty_pages_pct=1 # from 90 percent to reduce innodb_buffer_pool_pages_dirty count
innodb_change_buffer_max_size=50 # from 25 percent to expedite your high volume activity
You will find these suggestions will reduce CPU busy and expedite query completion.
For additional suggestions, view my profile and Network profile for contact information and free downloadable utility scripts to assist with performance improvements.

mysql .net connector connection pool wasted

I have an ASP.NET MVC 3 application using a MySQL database and the MySQL .NET connector. The application uses the membership, role and profile providers, and SubSonic 3 as the data layer. The application wastes a very large number of connections and eventually crashes with a timeout exception. I cloned the server/application setup and managed to reproduce the issue with a 10-connection limit. The info I have for now is below. To begin with, what does "cleaning up" in the InnoDB transaction status mean? I found this info: http://dev.mysql.com/doc/refman/5.0/en/general-thread-states.html but I don't see how a transaction can stay in that status. Of course, I desperately need any info that can help.
I actually debugged the SubSonic code and I don't see it doing anything wrong. If I get more desperate I guess I will do it again. Now I am trying to see what's happening in the connector. The connections listed below are wasted, i.e. not reusable.
Connections:
mysql> SHOW FULL PROCESSLIST;
+----+------+-----------------+------------+---------+------+-------+-----------------------+
| Id | User | Host | db | Command | Time | State | Info |
+----+------+-----------------+------------+---------+------+-------+-----------------------+
| 2 | root | localhost:49167 | NULL | Query | 0 | init | SHOW FULL PROCESSLIST |
| 15 | root | localhost:49360 | somedbname | Sleep | 260 | | NULL |
| 16 | root | localhost:49361 | NULL | Sleep | 260 | | NULL |
| 19 | root | localhost:49437 | somedbname | Sleep | 3969 | | NULL |
| 20 | root | localhost:49439 | somedbname | Sleep | 3702 | | NULL |
| 21 | root | localhost:49440 | somedbname | Sleep | 3396 | | NULL |
| 22 | root | localhost:49457 | somedbname | Sleep | 3102 | | NULL |
| 23 | root | localhost:49460 | somedbname | Sleep | 2802 | | NULL |
| 24 | root | localhost:49478 | somedbname | Sleep | 1929 | | NULL |
| 26 | root | localhost:49497 | somedbname | Sleep | 1629 | | NULL |
| 27 | root | localhost:49498 | somedbname | Sleep | 1329 | | NULL |
+----+------+-----------------+------------+---------+------+-------+-----------------------+
11 rows in set (0.00 sec)
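To pick the long-idle connections out of a listing like the one above, a small Python sketch such as the following could be used. The function name `idle_connections` and the 1000-second cutoff are assumptions for illustration; the sample rows are copied from the process list above.

```python
def idle_connections(processlist: str, min_idle_s: int = 1000):
    """Return (id, idle_seconds) for Sleep rows idle for at least min_idle_s."""
    header = None
    idle = []
    for line in processlist.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip +----+ borders and the "rows in set" footer
        cells = [c.strip() for c in line.strip("|").split("|")]
        if header is None:
            header = cells  # first |...| row holds the column names
            continue
        row = dict(zip(header, cells))
        if row["Command"] == "Sleep" and int(row["Time"]) >= min_idle_s:
            idle.append((int(row["Id"]), int(row["Time"])))
    return idle


sample = """
+----+------+-----------------+------------+---------+------+-------+-----------------------+
| Id | User | Host | db | Command | Time | State | Info |
+----+------+-----------------+------------+---------+------+-------+-----------------------+
| 2 | root | localhost:49167 | NULL | Query | 0 | init | SHOW FULL PROCESSLIST |
| 15 | root | localhost:49360 | somedbname | Sleep | 260 | | NULL |
| 19 | root | localhost:49437 | somedbname | Sleep | 3969 | | NULL |
+----+------+-----------------+------------+---------+------+-------+-----------------------+
"""

print(idle_connections(sample))
```

A connection that has been sleeping for thousands of seconds while the pool reports exhaustion is a strong hint that the application still holds it checked out.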
InnoDB status:
=====================================
2013-02-13 07:54:01 790 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 46 seconds
-----------------
BACKGROUND THREAD
-----------------
srv_master_thread loops: 311 srv_active, 0 srv_shutdown, 14316 srv_idle
srv_master_thread log flush and writes: 14623
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 300
OS WAIT ARRAY INFO: signal count 296
Mutex spin waits 207, rounds 6140, OS waits 111
RW-shared spins 200, rounds 6000, OS waits 172
RW-excl spins 1, rounds 480, OS waits 15
Spin rounds per wait: 29.66 mutex, 30.00 RW-shared, 480.00 RW-excl
------------
TRANSACTIONS
------------
Trx id counter 7160
Purge done for trx's n:o < 7157 undo n:o < 0 state: running but idle
History list length 650
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 7159, not started
MySQL thread id 27, OS thread handle 0xb6c, query id 11259 localhost ::1 root cleaning up
---TRANSACTION 7124, not started
MySQL thread id 26, OS thread handle 0xc88, query id 11080 localhost ::1 root cleaning up
---TRANSACTION 0, not started
MySQL thread id 2, OS thread handle 0x790, query id 11270 localhost ::1 root init
SHOW ENGINE INNODB STATUS
---TRANSACTION 7005, not started
MySQL thread id 24, OS thread handle 0xde0, query id 10510 localhost ::1 root cleaning up
---TRANSACTION 6865, not started
MySQL thread id 23, OS thread handle 0x1d0, query id 9615 localhost ::1 root cleaning up
---TRANSACTION 6697, not started
MySQL thread id 22, OS thread handle 0x874, query id 8824 localhost ::1 root cleaning up
---TRANSACTION 6647, not started
MySQL thread id 21, OS thread handle 0xfa8, query id 8546 localhost ::1 root cleaning up
---TRANSACTION 6531, not started
MySQL thread id 20, OS thread handle 0x910, query id 8019 localhost ::1 root cleaning up
---TRANSACTION 6243, not started
MySQL thread id 19, OS thread handle 0x740, query id 6886 localhost ::1 root cleaning up
---TRANSACTION 0, not started
MySQL thread id 15, OS thread handle 0x75c, query id 11268 localhost 127.0.0.1 root cleaning up
--------
FILE I/O
--------
I/O thread 0 state: wait Windows aio (insert buffer thread)
I/O thread 1 state: wait Windows aio (log thread)
I/O thread 2 state: wait Windows aio (read thread)
I/O thread 3 state: wait Windows aio (read thread)
I/O thread 4 state: wait Windows aio (read thread)
I/O thread 5 state: wait Windows aio (read thread)
I/O thread 6 state: wait Windows aio (write thread)
I/O thread 7 state: wait Windows aio (write thread)
I/O thread 8 state: wait Windows aio (write thread)
I/O thread 9 state: wait Windows aio (write thread)
Pending normal aio reads: 0 [0, 0, 0, 0] , aio writes: 0 [0, 0, 0, 0] ,
ibuf aio reads: 0, log i/o's: 0, sync i/o's: 0
Pending flushes (fsync) log: 0; buffer pool: 0
1017 OS file reads, 3059 OS file writes, 2067 OS fsyncs
0.00 reads/s, 0 avg bytes/read, 0.00 writes/s, 0.00 fsyncs/s
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 1, free list len 0, seg size 2, 0 merges
merged operations:
insert 0, delete mark 0, delete 0
discarded operations:
insert 0, delete mark 0, delete 0
Hash table size 17393, node heap has 1 buffer(s)
0.00 hash searches/s, 0.00 non-hash searches/s
---
LOG
---
Log sequence number 2556460
Log flushed up to 2556460
Pages flushed up to 2556460
Last checkpoint at 2556460
0 pending log writes, 0 pending chkp writes
852 log i/o's done, 0.00 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total memory allocated 8585216; in additional pool allocated 0
Dictionary memory allocated 142202
Buffer pool size 512
Free buffers 255
Database pages 256
Old database pages 0
Modified db pages 0
Pending reads 0
Pending writes: LRU 0, flush list 0 single page 0
Pages made young 0, not young 0
0.00 youngs/s, 0.00 non-youngs/s
Pages read 988, created 63, written 1772
0.00 reads/s, 0.00 creates/s, 0.00 writes/s
No buffer pool page gets since the last printout
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 256, unzip_LRU len: 0
I/O sum[0]:cur[0], unzip sum[0]:cur[0]
--------------
ROW OPERATIONS
--------------
0 queries inside InnoDB, 0 queries in queue
0 read views open inside InnoDB
Main thread id 1240, state: sleeping
Number of rows inserted 49, updated 955, deleted 0, read 5238
0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s
----------------------------
END OF INNODB MONITOR OUTPUT
============================
The web app gets stuck with the following exception and isn't usable until the app pool is recycled.
System.Configuration.Provider.ProviderException: An exception occurred.
Please check the Event Log. ---> MySql.Data.MySqlClient.MySqlException:
error connecting: Timeout expired.
The timeout period elapsed prior to obtaining a connection from the pool.
This may have occurred because all pooled connections were in use and max
pool size was reached.
OK, I will provide some debugging steps that can be reused in similar situations, to redeem myself a little.
I upgraded the connector to Version=6.6.5.0. I downloaded the source and attached a debugger to the connector; the pool itself was working fine, but I still had the same issue: connections from the pool were not being reused.
I added a watch on the private field MySqlPool.inUsePool and saw that all the stuck connections were in use there. Adding another watch (in fact 10 of them) on inUsePool[0-10].reader.Command.CommandText
helped me identify the part of my code that wasn't closing the reader/connection. All the stuck connections were indeed occupied by my reader. All had the same SQL command text, which is called only once in the application.
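The failure mode found here (a reader/connection that is never closed, so the pool never gets it back) can be illustrated with a minimal Python sketch. `TinyPool` is a hypothetical toy class, not the connector's real pool; it only shows why an unreleased connection exhausts a fixed-size pool and how a scoped block guarantees release. In C# the equivalent of the `with` block is wrapping the connection and the reader in `using` statements.

```python
from contextlib import contextmanager


class TinyPool:
    """Hypothetical fixed-size pool, standing in for a real connector pool."""

    def __init__(self, size: int):
        self.idle = [f"conn-{i}" for i in range(size)]
        self.in_use = []

    def acquire(self) -> str:
        if not self.idle:
            raise TimeoutError("all pooled connections are in use")
        conn = self.idle.pop()
        self.in_use.append(conn)
        return conn

    def release(self, conn: str) -> None:
        self.in_use.remove(conn)
        self.idle.append(conn)

    @contextmanager
    def connection(self):
        conn = self.acquire()
        try:
            yield conn
        finally:
            self.release(conn)  # runs even if the body raises


# Leaky pattern: acquire without a matching release.
leaky = TinyPool(size=2)
leaky.acquire()
leaky.acquire()
leaked = len(leaky.in_use)  # both connections are stuck in use

# Scoped pattern: every connection goes back to the pool.
safe = TinyPool(size=2)
for _ in range(10):
    with safe.connection() as conn:
        pass  # run the query and read all results here

print(leaked, len(safe.in_use))
```

After the leaky sequence a third `acquire()` raises, which mirrors the "max pool size was reached" timeout above, while the scoped version can run any number of iterations with a pool of two.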

How to interpret iostat?

I track a lot of parameters on my server and the only thing I can't really put in perspective is the iostat output. It is a MySQL server. Is this a good result, or should I worry?
root:/var/lib/mysql# iostat -xc
Linux 2.6.28-11-server () 07/25/2009 _x86_64_ (8 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
3.66 0.19 0.45 1.04 0.00 94.69
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda 2.55 871.36 1.46 27.67 392.40 7200.45 260.64 1.02 34.85 2.48 7.22
sda1 0.18 0.61 0.03 0.01 3.60 4.98 215.91 0.01 185.95 19.25 0.08
sda2 0.01 0.00 0.00 0.00 1.03 0.02 919.32 0.00 21.36 6.94 0.00
sda3 2.36 870.75 1.43 27.66 387.76 7195.46 260.68 1.01 34.65 2.48 7.21
sdb 2.37 871.36 1.63 27.67 392.69 7200.45 259.12 0.65 22.07 2.51 7.35
sdb1 0.17 0.61 0.04 0.01 3.59 4.98 187.33 0.01 110.67 12.54 0.06
sdb2 0.00 0.00 0.00 0.00 1.03 0.02 256.48 0.00 2.36 1.50 0.00
sdb3 2.19 870.75 1.60 27.66 388.06 7195.46 259.23 0.64 21.93 2.51 7.34
md0 0.00 0.00 0.38 0.62 3.06 4.96 8.00 0.00 0.00 0.00 0.00
md1 0.00 0.00 0.00 0.00 0.00 0.02 8.36 0.00 0.00 0.00 0.00
md2 0.00 0.00 2.01 898.28 62.49 7186.28 8.05 0.00 0.00 0.00 0.00
Also, what are the options for decreasing read/write activity?
delay_key_write
memory-based tables
fewer indexes
The write load is quite high on the tables.
If anyone is worried about disk I/O bottlenecks, check with the following command:
iostat
If this tool is not installed, run
apt-get install sysstat
on Debian-based servers, or
yum install sysstat
on Red Hat/CentOS-based servers.
Then run
iostat -x -d sda
where "sda" denotes your HDD.
Output:
root@forum.innovationframes.com:~# iostat -x -d sda
Linux 2.6.32-24-server (forum.innovationframes.com) 10/01/2011 _x86_64_ (1 CPU)
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda 0.01 0.04 0.06 0.03 1.34 0.51 21.77 0.00 5.23 0.30 0.00
Note:
If %util (the last column) shows more than 75-80%, you should keep an eye on your HDD.
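The rule of thumb above can be checked automatically with a short Python sketch that reads `iostat -x` text and reports devices whose %util exceeds a threshold. The function name `busy_devices` and the default of 75 are assumptions for illustration. The column is located by name from the header row, since different sysstat versions print different column sets.

```python
def busy_devices(iostat_text: str, util_threshold: float = 75.0):
    """Return {device: util} for rows whose %util exceeds util_threshold."""
    util_col = None
    busy = {}
    for line in iostat_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "avg-cpu:":
            util_col = None  # CPU block: ignore until the next device header
            continue
        if fields[0].rstrip(":") in ("Device", "Dev") and "%util" in fields:
            util_col = fields.index("%util")
            continue
        if util_col is None or len(fields) <= util_col:
            continue
        try:
            util = float(fields[util_col])
        except ValueError:
            continue  # not a data row
        if util > util_threshold:
            busy[fields[0]] = util
    return busy


sample = """Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda 0.01 0.04 0.06 0.03 1.34 0.51 21.77 0.00 5.23 0.30 0.00
"""

print(busy_devices(sample))
```

For the output above no device is reported, since sda is essentially idle. Note that a single `iostat` run without an interval prints averages since boot, so feeding it the later reports of something like `iostat -x -d sda 5 3` gives a more current picture.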