MySQL: queries are slow when I/O wait on the Linux server is high

Queries are slow when I/O wait is high.
Output from the iotop command:
-- TID -- PRIO -- USER -- DISK READ -- DISK WRITE -- SWAPIN -- IO> -- COMMAND
-- 2311 -- be/4 -- mysql -- 0.00 B/s -- 0.00 B/s -- 0.00% -- 96.25% -- mysql~l.sock
-- 2311 -- be/4 -- mysql -- 0.00 B/s -- 0.00 B/s -- 0.00% -- 96.25% -- mysql~l.sock
-- 2311 -- be/4 -- mysql -- 0.00 B/s -- 0.00 B/s -- 0.00% -- 96.24% -- mysql~l.sock
High I/O wait began between 6:13:28 PM and 6:13:29 PM (from the sar command):
--------------------- CPU -- %usr -- %nice -- %sys -- %iowait -- %steal
-- 6:13:28 PM --- all ----- 2.53 --- 0.00 ---- 2.02 ----- 39.39 ------ 0.00
-- 6:13:29 PM --- all ----- 1.99 --- 0.00 ---- 1.00 ----- 49.25 ------ 0.00
A slow query was logged during that window:
Time: 130329 18:13:29
User@Host: wdwdwd[wdwdwd] @ localhost []
Query_time: 2.007902 Lock_time: 0.000025 Rows_sent: 0 Rows_examined: 1
SET timestamp=1364555609;
UPDATE log_product SET credit=credit+1 WHERE id_product='349721228' and id_user='2021841' LIMIT 1;
My questions: how can I fix this, and what is the real cause?

Do you have an index on the log_product table on (id_user, id_product) as a single composite index, rather than two separate indexes, one on each column? Also, if the columns are numeric, you don't need quotes around the values.
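As a rough illustration of the composite-index suggestion, the sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL and compares the query plan before and after adding one index over both columns. The column types and the index name idx_user_product are assumptions, since the original schema is not shown, and the LIMIT 1 is dropped because stock SQLite does not accept it on UPDATE. On MySQL the corresponding statement would be ALTER TABLE log_product ADD INDEX idx_user_product (id_user, id_product);

```python
import sqlite3

def plan_for_update(conn):
    """Return the query-plan detail strings for the UPDATE from the slow log."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN "
        "UPDATE log_product SET credit = credit + 1 "
        "WHERE id_product = 349721228 AND id_user = 2021841"
    ).fetchall()
    return [row[-1] for row in rows]  # last column holds the plan text

conn = sqlite3.connect(":memory:")
# Assumed schema: the real column types are not given in the question.
conn.execute(
    "CREATE TABLE log_product ("
    "id_user INTEGER, id_product INTEGER, credit INTEGER)"
)
before = plan_for_update(conn)

# One index covering both columns, not one index per column.
# The index name is hypothetical.
conn.execute(
    "CREATE INDEX idx_user_product ON log_product (id_user, id_product)"
)
after = plan_for_update(conn)

print("without index:", before)
print("with composite index:", after)
```

The plan text differs between SQLite and MySQL, so on the real server the equivalent check would be an EXPLAIN of the same WHERE clause.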

Last observation carried forward / ignore nulls in lag

How can I imitate in Presto the LOCF behavior that lag(x) ignore nulls gives on, e.g., Redshift?
Take this sample data:
select * from (
values (7369, null), (7499, 300), (7521, 500),
(7566, null), (7654, 1400), (7698, null),
(7782, null), (7788, null), (7839, null),
(7844, 0), (7876, null), (7900, null),
(7902, null), (7934, null)
) ex(empno, comm)
-- empno comm
-- 7369
-- 7499 300
-- 7521 500
-- 7566
-- 7654 1400
-- 7698
-- 7782
-- 7788
-- 7839
-- 7844 0
-- 7876
-- 7900
-- 7902
-- 7934
Desired output is:
-- empno  comm  prev_comm
-- 7369
-- 7499   300
-- 7521   500   300
-- 7566         500
-- 7654   1400  500
-- 7698         1400
-- 7782         1400
-- 7788         1400
-- 7839         1400
-- 7844   0     1400
-- 7876         0
-- 7900         0
-- 7902         0
-- 7934         0
This can be nearly achieved by the following (adapted to Presto from here):
select empno, comm, max(comm) over (partition by grp) prev_comm
from (
select empno, comm, sum(cast(comm is not null as double)) over (order by empno) grp
from example_table
)
order by empno
-- empno  comm  prev_comm
-- 7369
-- 7499   300   300
-- 7521   500   500
-- 7566         500
-- 7654   1400  1400
-- 7698         1400
-- 7782         1400
-- 7788         1400
-- 7839         1400
-- 7844   0     0
-- 7876         0
-- 7900         0
-- 7902         0
-- 7934         0
(the difference being that the current rows for non-NULL comm are incorrect)
Actually, in my case, the difference doesn't matter, since I want to coalesce(comm, prev_comm). However, this answer still does not suffice, because in the full data set, it created a memory failure:
Query exceeded local memory limit of 20GB
The following outstanding pull request to presto would implement ignore nulls directly; is there no way to accomplish the equivalent result in the interim?
https://github.com/prestodb/presto/pull/6157
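One possible way to get the exact desired output from the query above is to first fill each group with its non-NULL value and then take a plain lag() of that filled column. The sketch below checks the idea with Python's sqlite3 module as a stand-in for Presto (it needs SQLite 3.25 or later for window functions); the column alias filled is made up for illustration. It adds a second window pass, so it is not expected to help with the memory limit, and it assumes plain lag() without ignore nulls is available on the target engine.

```python
import sqlite3

# Sample data from the question.
ROWS = [
    (7369, None), (7499, 300), (7521, 500), (7566, None), (7654, 1400),
    (7698, None), (7782, None), (7788, None), (7839, None), (7844, 0),
    (7876, None), (7900, None), (7902, None), (7934, None),
]

# grp counts non-NULL comm values up to and including the current row,
# filled carries that group's single non-NULL value to every row in it,
# and lag(filled) shifts it down one row so the current row is excluded.
QUERY = """
select empno, comm, lag(filled) over (order by empno) as prev_comm
from (
    select empno, comm, max(comm) over (partition by grp) as filled
    from (
        select empno, comm,
               sum(comm is not null) over (order by empno) as grp
        from example_table
    )
)
order by empno
"""

def prev_comm_rows():
    """Load the sample rows into an in-memory table and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table example_table (empno integer, comm integer)")
    conn.executemany("insert into example_table values (?, ?)", ROWS)
    return conn.execute(QUERY).fetchall()

for row in prev_comm_rows():
    print(row)
```

SQLite evaluates comm is not null to 0 or 1 directly; in Presto the cast from the original query would stay in place.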

How to solve MySQL innodb "Waiting for table metadata lock" on TRUNCATE TABLE?

I am running a test suite with hundreds of application unit tests on a GitLab CI server. About 10 tests in, it always gets stuck on Waiting for table metadata lock on a TRUNCATE TABLE statement, which is a tearDown step.
I am aware of the SHOW ENGINE INNODB STATUS command. Here are some diagnostic logs:
mysql> \s
--------------
mysql Ver 14.14 Distrib 5.6.30, for Linux (x86_64) using EditLine wrapper
Connection id: 190
Current database:
Current user: root@localhost
SSL: Not in use
Current pager: stdout
Using outfile: ''
Using delimiter: ;
Server version: 5.6.30 MySQL Community Server (GPL)
Protocol version: 10
Connection: Localhost via UNIX socket
Server characterset: utf8mb4
Db characterset: utf8mb4
Client characterset: utf8mb4
Conn. characterset: utf8mb4
UNIX socket: /var/run/mysqld/mysqld.sock
Uptime: 51 min 28 sec
Threads: 4 Questions: 3859 Slow queries: 0 Opens: 715 Flush tables: 1 Open tables: 131 Queries per second avg: 1.249
--------------
mysql> show processlist;
+-----+------+----------------+------------+---------+------+---------------------------------+-----------------------------+
| Id | User | Host | db | Command | Time | State | Info |
+-----+------+----------------+------------+---------+------+---------------------------------+-----------------------------+
| 1 | root | 10.0.2.1:52773 | test_3926 | Query | 2961 | Waiting for table metadata lock | TRUNCATE TABLE `capability` |
| 188 | root | 10.0.2.1:53658 | test_3926 | Sleep | 2962 | | NULL |
| 189 | root | 10.0.2.1:53660 | test_3926 | Sleep | 2962 | | NULL |
| 190 | root | localhost | NULL | Query | 0 | init | show processlist |
+-----+------+----------------+------------+---------+------+---------------------------------+-----------------------------+
4 rows in set (0.00 sec)
2016-05-18 16:10:37 7f03be9ba700 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 7 seconds
-----------------
BACKGROUND THREAD
-----------------
srv_master_thread loops: 126 srv_active, 0 srv_shutdown, 3047 srv_idle
srv_master_thread log flush and writes: 3173
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 2408
OS WAIT ARRAY INFO: signal count 2525
Mutex spin waits 988, rounds 24557, OS waits 747
RW-shared spins 1339, rounds 45580, OS waits 1518
RW-excl spins 3, rounds 5283, OS waits 113
Spin rounds per wait: 24.86 mutex, 34.04 RW-shared, 1761.00 RW-excl
------------
TRANSACTIONS
------------
Trx id counter 7574
Purge done for trx's n:o < 7493 undo n:o < 0 state: running but idle
History list length 778
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 0, not started
MySQL thread id 190, OS thread handle 0x7f03be9ba700, query id 3941 localhost root init
SHOW ENGINE INNODB STATUS
---TRANSACTION 7489, not started
MySQL thread id 188, OS thread handle 0x7f03bea3c700, query id 3824 10.0.2.1 root cleaning up
---TRANSACTION 7548, not started
MySQL thread id 1, OS thread handle 0x7f03bea7d700, query id 3855 10.0.2.1 root Waiting for table metadata lock
TRUNCATE TABLE `capability`
---TRANSACTION 7490, ACTIVE 3047 sec
MySQL thread id 189, OS thread handle 0x7f03be9fb700, query id 3840 10.0.2.1 root cleaning up
Trx read view will not see trx with id >= 7491, sees < 7491
--------
FILE I/O
--------
I/O thread 0 state: waiting for completed aio requests (insert buffer thread)
I/O thread 1 state: waiting for completed aio requests (log thread)
I/O thread 2 state: waiting for completed aio requests (read thread)
I/O thread 3 state: waiting for completed aio requests (read thread)
I/O thread 4 state: waiting for completed aio requests (read thread)
I/O thread 5 state: waiting for completed aio requests (read thread)
I/O thread 6 state: waiting for completed aio requests (write thread)
I/O thread 7 state: waiting for completed aio requests (write thread)
I/O thread 8 state: waiting for completed aio requests (write thread)
I/O thread 9 state: waiting for completed aio requests (write thread)
Pending normal aio reads: 0 [0, 0, 0, 0] , aio writes: 0 [0, 0, 0, 0] ,
ibuf aio reads: 0, log i/o's: 0, sync i/o's: 0
Pending flushes (fsync) log: 0; buffer pool: 0
173 OS file reads, 6858 OS file writes, 6022 OS fsyncs
0.00 reads/s, 0 avg bytes/read, 0.00 writes/s, 0.00 fsyncs/s
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 1, free list len 0, seg size 2, 0 merges
merged operations:
insert 0, delete mark 0, delete 0
discarded operations:
insert 0, delete mark 0, delete 0
Hash table size 276671, node heap has 2 buffer(s)
0.00 hash searches/s, 0.00 non-hash searches/s
---
LOG
---
Log sequence number 10549488
Log flushed up to 10549488
Pages flushed up to 10549488
Last checkpoint at 10549488
0 pending log writes, 0 pending chkp writes
2555 log i/o's done, 0.00 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total memory allocated 137363456; in additional pool allocated 0
Dictionary memory allocated 545426
Buffer pool size 8191
Free buffers 7354
Database pages 835
Old database pages 288
Modified db pages 0
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages made young 4257, not young 0
0.00 youngs/s, 0.00 non-youngs/s
Pages read 160, created 4341, written 863
0.00 reads/s, 0.00 creates/s, 0.00 writes/s
No buffer pool page gets since the last printout
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 835, unzip_LRU len: 0
I/O sum[0]:cur[0], unzip sum[0]:cur[0]
--------------
ROW OPERATIONS
--------------
0 queries inside InnoDB, 0 queries in queue
1 read views open inside InnoDB
Main thread process no. 1, id 139654053570304, state: sleeping
Number of rows inserted 1187, updated 37, deleted 0, read 650
0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s
----------------------------
END OF INNODB MONITOR OUTPUT
My question is: why would TRUNCATE TABLE get stuck on a table metadata lock, and how can this be resolved?
The problem here seems straightforward enough.
---TRANSACTION 7490, ACTIVE 3047 sec
MySQL thread id 189, OS thread handle 0x7f03be9fb700, query id 3840 10.0.2.1 root cleaning up
Trx read view will not see trx with id >= 7491, sees < 7491
---
Thread 189 (a client connection) is idle, and has been for a while, but it has left a transaction running. This is probably a bug in the code that's using the database, since it doesn't make sense to leave a transaction open for almost an hour.
mysql> KILL 189;
That should release the metadata lock... but you need to find out why this is happening. Bad Things™ will happen if an application doesn't behave better than this.
Also... your application should not be connecting as root. Not related to the problem, but not good, if that's what this is.
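To spot this kind of offender without reading the whole monitor output by eye, a small script can scan the TRANSACTIONS section for transactions marked ACTIVE and report the owning thread id. This is only a sketch: the function name long_running and the 60-second threshold are arbitrary choices, and the regular expressions are written against the output format shown in this question, which may differ in other MySQL versions.

```python
import re

# Header line of a transaction that is still open, e.g.
# "---TRANSACTION 7490, ACTIVE 3047 sec"
ACTIVE_RE = re.compile(r"^---TRANSACTION (\w+), ACTIVE (?:\(\w+\) )?(\d+) sec")
THREAD_RE = re.compile(r"^MySQL thread id (\d+),")

def long_running(status_text, min_seconds=60):
    """Return (thread_id, trx_id, seconds) for transactions open >= min_seconds."""
    found = []
    pending = None  # (trx_id, seconds) waiting for its thread-id line
    for line in status_text.splitlines():
        if line.startswith("---TRANSACTION"):
            m = ACTIVE_RE.match(line)
            pending = (m.group(1), int(m.group(2))) if m else None
            continue
        m = THREAD_RE.match(line)
        if m and pending:
            trx_id, seconds = pending
            if seconds >= min_seconds:
                found.append((int(m.group(1)), trx_id, seconds))
            pending = None
    return found

# Excerpt of the monitor output posted in the question.
SAMPLE = """\
---TRANSACTION 7489, not started
MySQL thread id 188, OS thread handle 0x7f03bea3c700, query id 3824 10.0.2.1 root cleaning up
---TRANSACTION 7548, not started
MySQL thread id 1, OS thread handle 0x7f03bea7d700, query id 3855 10.0.2.1 root Waiting for table metadata lock
TRUNCATE TABLE `capability`
---TRANSACTION 7490, ACTIVE 3047 sec
MySQL thread id 189, OS thread handle 0x7f03be9fb700, query id 3840 10.0.2.1 root cleaning up
Trx read view will not see trx with id >= 7491, sees < 7491
"""

for thread_id, trx_id, seconds in long_running(SAMPLE):
    print(f"KILL {thread_id};  -- trx {trx_id} open for {seconds} sec")
```

On a live server the same information should also be available from the information_schema.innodb_trx table (columns such as trx_mysql_thread_id and trx_started), which avoids text parsing altogether.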
The question is already well answered by @Michael - sqlbot, but I want to add some details that may help you work out why the transaction in thread id 189 has stayed open for so long.
The Waiting for table metadata lock state may occur when we drop or create an index or otherwise modify the table’s structure.
It can also occur when we perform maintenance operations on tables, drop tables, or try to acquire a table-level WRITE lock (using LOCK TABLE table_name WRITE;).
Different types of active transactions can lead to metadata lock waits. Some cases where the wait can occur are listed below.
A query against the table that has been running for a long time.
A transaction, opened implicitly or explicitly, that was never committed or rolled back.
A failed query inside a transaction on the table.
A separate truncate lock bug can also occur if you have a large buffer pool: you may notice a complete server lock-up or a stall of multiple seconds. You can find more information on how this bug occurs in this MySQL troubleshooting article.

MySQL threads stuck in 'query end', how to prevent furious flushing

MySQL became unresponsive as many simple UPDATE and INSERT threads were stuck in 'query end' state.
---TRANSACTION F528F961, ACTIVE (PREPARED) 858 sec
mysql tables in use 1, locked 1
2 lock struct(s), heap size 376, 1 row lock(s), undo log entries 1
MySQL thread id 82683520, OS thread handle 0x7f73a6925700, query id 14714499253 192.168.1.22 wms query end
UPDATE `users`
SET `id` = '6016', `es_id` = '4817', `department_id` = '4',
`schedule_id` = '1', `username` = 'john.doe',
`user_role` = 'Guest,Admin,Picker',
`status` = '1', `team` = '2', `email` = NULL,
`wms_user` = '1', `logged_in_time` = '2016-02-01 07:06:45',
`last_activity` = '2016-02-01 13:07:49',
`session_id` = 'qbei0rrfiu05l9olcckh6sg976'
WHERE (id = 6016)
CPU load went up, Disk IO went up, hit ratio went down.
[Figure: CPU load / disk I/O]
Even "use db" and "show master status" threads showed up in the slow log.
From what I figure this is 'furious flushing'.
A user ran a large SELECT statement through the application. The SELECT inner joins 12 InnoDB tables that have sum(data_length + index_length) = 11.2G and sorts the results. The thing is that this is not an unusual query. It runs very often but with a much smaller working set:
# Query_time: 1.737293 Lock_time: 0.000027 Rows_sent: 7051 Rows_examined: 1109050
This time the user wanted data from the past 2 months, which led to:
# Query_time: 370.063806 Lock_time: 0.000039 Rows_sent: 919 Rows_examined: 27994638
From Engine InnoDB Status:
Main thread process no. 24701, id 140134828910336, state: flushing buffer pool pages
0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 894332.73 reads/s
Server runs Debian 6.0.4, MySQL 5.5.31 Community Edition, 32 core CPU at 2.60GHz / 64GB RAM / SSD
My.cnf:
innodb_buffer_pool_size = 40G
innodb_log_file_size = 512M
innodb_log_buffer_size = 16M
innodb_flush_log_at_trx_commit = 1
innodb_thread_concurrency = 0 ## modified to 32 after crash
innodb_read_io_threads = 4
innodb_write_io_threads = 4
innodb_old_blocks_pct = 37
innodb_flush_method=NULL ## would change to O_DIRECT but needs restart
innodb_old_blocks_time = 0 ## modified to 1000 after crash
This was probably an isolated case, but I want to know how I can prevent it in the future. Please offer your input. Thanks.
Build summary tables for the users. This will:
Keep the SELECTs from slowing down the UPDATEs.
Make the SELECTs run 10x faster.
Decrease overall system usage.
More discussion.

Mysql replication with UPDATE JOIN on an ignored table

I'm doing MySQL replication and, as a dumbed-down example, I have two tables, tableA and tableB.
On the slave, tableA is replicated and tableB is ignored.
replicate-do-table='dbname.tableA'
On the master, this query is being run (I can't make any changes to the master):
UPDATE tableA as a LEFT JOIN tableB as b ON b.type = a.type
SET b.col1 = CONCAT(IFNULL(a.col1,''),'|',IFNULL(a.col2,''))
Obviously I could just create tableB on the slave and let it update a bogus table, however this table in particular is an in-memory table that is used for searching and is updated almost constantly resulting in a lot of wasted resources.
Is there a way for me to filter these updates out of the replication while still keeping tableA? I have no access to the master; however, I can ask them to make changes if it's a change that wouldn't affect how their system operates.
Options AFAIK are mainly based around getting the replication to be ROW based rather than STATEMENT based.
Set the default to ROW (which is a brute force method, and has its drawbacks).
You can set the SESSION binlog_format to ROW, but that requires the SUPER privilege, which the user probably doesn't have and, for good reasons, will not be granted.
If the logging happens in MIXED format you can look around here to force a ROW based entry in the binlog; forcing a useless FOUND_ROWS() or UUID() call in the update could very well trigger it.
An example for the MIXED solution:
The queries:
INSERT INTO sometable VALUES ('a','aa');
UPDATE sometable SET aa='bb';
UPDATE sometable SET aa='cc' WHERE UUID(); -- slight overhead, but always true
The log (use mysqlbinlog to inspect it), clearly STATEMENT based for the first 2, but ROW based for the 3rd:
# at 175
#130918 21:18:25 server id 1 end_log_pos 277 Query thread_id=142 exec_time=0 error_code=0
use `test`/*!*/;
SET TIMESTAMP=1379531905/*!*/;
INSERT INTO sometable VALUES ('a','aa')
/*!*/;
# at 277
#130918 21:18:25 server id 1 end_log_pos 304 Xid = 488
COMMIT/*!*/;
# at 304
#130918 21:18:52 server id 1 end_log_pos 372 Query thread_id=142 exec_time=0 error_code=0
SET TIMESTAMP=1379531932/*!*/;
BEGIN
/*!*/;
# at 372
#130918 21:18:52 server id 1 end_log_pos 463 Query thread_id=142 exec_time=0 error_code=0
SET TIMESTAMP=1379531932/*!*/;
UPDATE sometable SET aa='bb'
/*!*/;
# at 463
#130918 21:18:52 server id 1 end_log_pos 490 Xid = 497
COMMIT/*!*/;
# at 490
#130918 21:21:06 server id 1 end_log_pos 558 Query thread_id=144 exec_time=0 error_code=0
SET TIMESTAMP=1379532066/*!*/;
BEGIN
/*!*/;
# at 558
# at 610
#130918 21:21:06 server id 1 end_log_pos 610 Table_map: `test`.`sometable` mapped to number 180
#130918 21:21:06 server id 1 end_log_pos 664 Update_rows: table id 180 flags: STMT_END_F
BINLOG '
Iv05UhMBAAAANAAAAGICAAAAALQAAAAAAAEABHRlc3QACXNvbWV0YWJsZQAC/A8DAwYAAQ==
Iv05UhgBAAAANgAAAJgCAAAAALQAAAAAAAEAAv///QJiYv0CY2P8AQAAYQJiYvwBAABhAmNj
'/*!*/;
# at 664
#130918 21:21:06 server id 1 end_log_pos 691 Xid = 578
COMMIT/*!*/;
DELIMITER ;
# End of log file
In my situation it made more sense to ignore the "table doesn't exist" errors instead, because my database system has little to no chance of ever changing and the updates in question never target the tables I am replicating.
It is a legacy system that we're slowly moving away from.
slave-skip-errors=1146
The only other reliable way to solve this would be to switch to row-level bin logging on the master, however I couldn't get them to make that change for me.

Rails 2.2.2 Performance Problem/Bug

I recently upgraded one of my applications to Rails 2.2.2. Having done that, I've encountered a strange performance bug that has caused renders that used to complete in a fraction of a second to take up to 10 seconds.
I've profiled the issue, and here are the results I've come up with. It looks like the issue is in the real_connect method of the Mysql class. My understanding is that the Ruby real_connect method is a wrapper around the C mysql_real_connect() function. This would lead me to believe that the issue must be with the database, since I've encountered the same problem when running the code on Windows and Linux (the database server is a separate system). I don't, however, believe this is the case, because when I roll back to a previous version (pre Rails 2.2.2) from my subversion repository, the performance issue goes away. This would seem to indicate that there is some kind of bug in ActiveRecord.
How do I go about identifying and fixing this bug? Does anyone have any insight? Is there something I'm missing?
Update: I just created a small profiler script to test the Mysql.real_connect method, and it appears that the problem isn't in Rails, but in the MySQL gem or the database server itself.
Upon running the following code:
require 'rubygems'
require 'ruby-prof'
require 'mysql'

# ip, user, pass and db hold the connection settings (defined earlier in the script)
result = RubyProf.profile do
  5.times do
    begin
      # connect to the MySQL server
      dbh = Mysql.real_connect(ip, user, pass, db)
      # get server version string and display it
      puts "Server version: " + dbh.get_server_info
    rescue Mysql::Error => e
      puts "Error code: #{e.errno}"
      puts "Error message: #{e.error}"
      puts "Error SQLSTATE: #{e.sqlstate}" if e.respond_to?("sqlstate")
    ensure
      # disconnect from server
      dbh.close if dbh
    end
  end
end
# print a flat profile of the five connection attempts
printer = RubyProf::FlatPrinter.new(result)
printer.print(STDOUT, 0)
I came up with this performance result:
Server version: 5.0.32-Debian_7etch3-log
Server version: 5.0.32-Debian_7etch3-log
Server version: 5.0.32-Debian_7etch3-log
Server version: 5.0.32-Debian_7etch3-log
Server version: 5.0.32-Debian_7etch3-log
Thread ID: 18998180
Total: 50.402000
%self total self wait child calls name
99.99 50.40 50.40 0.00 0.00 5 <Class::Mysql>#real_connect (ruby_runtime:0}
0.00 0.00 0.00 0.00 0.00 10 IO#write (ruby_runtime:0}
0.00 0.00 0.00 0.00 0.00 5 Mysql#get_server_info (ruby_runtime:0}
0.00 0.00 0.00 0.00 0.00 5 Kernel#puts (ruby_runtime: 0}
0.00 0.00 0.00 0.00 0.00 5 String#+ (ruby_runtime:0}
0.00 0.00 0.00 0.00 0.00 5 Mysql#initialize (ruby_runtime:0}
0.00 50.40 0.00 0.00 50.40 1 Integer#times (ruby_runtime:0}
0.00 50.40 0.00 0.00 50.40 1 Global#[No method] (tmp/mysql_test/test.rb:12}
0.00 0.00 0.00 0.00 0.00 5 Mysql#close (ruby_runtime: 0}
It seems as though the problem isn't in ActiveRecord, it's either in the MySQL gem or in the database. Where do I go from here?
I was able to track down the problem. I started by connecting to the host from my development machine with the command mysql --host=ip --user=user --password=password db. This was very slow, so I ssh'ed into the server and connected from there using the same command. This was also slow.
I changed the command to mysql --host=localhost --user=user --password=password db and I was able to connect instantaneously. I added an entry for my development system to the /etc/hosts file, and was able to connect instantaneously from that as well. Apparently the MySQL server was attempting a reverse DNS lookup to resolve the host name associated with the IP address, as described in the MySQL Manual, and the lookup was timing out.
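One way to confirm that reverse lookups are the slow step is to time one directly on the database host. The following is a minimal sketch in Python using only the standard library; the helper name time_reverse_lookup is made up, and 127.0.0.1 is just a placeholder for the client address that the server actually sees.

```python
import socket
import time

def time_reverse_lookup(ip):
    """Return (hostname or None, elapsed seconds) for a reverse DNS lookup."""
    start = time.monotonic()
    try:
        hostname = socket.gethostbyaddr(ip)[0]
    except OSError:  # covers socket.herror and socket.gaierror
        hostname = None
    return hostname, time.monotonic() - start

if __name__ == "__main__":
    # 127.0.0.1 is a placeholder; substitute the client address the server sees.
    name, seconds = time_reverse_lookup("127.0.0.1")
    print(f"{name!r} resolved in {seconds:.3f}s")
```

If the lookup for the client's address takes several seconds or fails only after a long delay, that would be consistent with the connection times seen in the profile above.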
I added the --skip-name-resolve option to the start section of the /etc/init.d/mysql script, so that this check is skipped, and restarted the server. When I run the profile script I created earlier, I get the following result:
Server version: 5.0.32-Debian_7etch3-log
Server version: 5.0.32-Debian_7etch3-log
Server version: 5.0.32-Debian_7etch3-log
Server version: 5.0.32-Debian_7etch3-log
Server version: 5.0.32-Debian_7etch3-log
Thread ID: 52978590
Total: 0.016000
%self total self wait child calls name
87.50 0.01 0.01 0.00 0.00 5 <Class::Mysql>#real_connect (ruby_runtime:0}
6.25 0.00 0.00 0.00 0.00 10 IO#write (ruby_runtime:0}
6.25 0.00 0.00 0.00 0.00 5 Mysql#close (ruby_runtime:0}
0.00 0.00 0.00 0.00 0.00 5 Kernel#puts (ruby_runtime:0}
0.00 0.00 0.00 0.00 0.00 5 Mysql#initialize (ruby_runtime:0}
0.00 0.00 0.00 0.00 0.00 5 String#+ (ruby_runtime:0}
0.00 0.02 0.00 0.00 0.02 1 Global#[No method] (tmp/mysql_test/test.rb:12}
0.00 0.02 0.00 0.00 0.02 1 Integer#times (ruby_runtime:0}
0.00 0.00 0.00 0.00 0.00 5 Mysql#get_server_info (ruby_runtime:0}