Environment
Spring-boot 2.6.6
Spring-data-jpa 2.6.6
Kotlin 1.6.10
mysql:mysql-connector-java 8.0.23
MySQL 5.7 (InnoDB)
Problem
A transaction commit happens during execution of my transactional method and I don't know why.
Code
@Transactional(isolation = Isolation.SERIALIZABLE)
fun login(username: String) {
    userRepository.findByUsername(username)
        ?.let {
            it.lastLoggedIn = LocalDateTime.now()
            userRepository.save(it)
        }
        ?: run {
            throw Exception("Not found User")
        }
}
Result (from APM monitoring tool)
-------------------------------------------------------------------------------------
p# # TIME T-GAP CONTENTS
-------------------------------------------------------------------------------------
[******] 18:50:13.493 0 Start transaction
- [000003] 18:50:13.639 40 spring-tx-doBegin(PROPAGATION_REQUIRED,ISOLATION_SERIALIZABLE)
- [000004] 18:50:13.639 0 getConnection jdbc:mysql://....(com.zaxxer.hikari.HikariDataSource#getConnection) 0 ms
- [000005] 18:50:13.641 2 [PREPARED] select ...{ellipsis} from user user0_ where user0_.username=? 0 ms
- [000006] 18:50:13.641 0 Fetch Count 1 1 ms
- [000007] 18:50:13.642 1 spring-tx-doCommit
- [000008] 18:50:13.643 1 [PREPARED] update user set last_logged_in=? where id=? 0 ms
[******] 18:50:13.646 3 End transaction
My thoughts
I set the transaction isolation level to SERIALIZABLE for a reason. As far as I know, under SERIALIZABLE InnoDB runs plain SELECTs as SELECT ... LOCK IN SHARE MODE, so the query takes a shared lock, while the UPDATE needs an exclusive lock on the same row. An exclusive lock cannot be acquired on a row that is locked in shared mode, so my first guess was that the transaction is committed to release the shared lock before taking the exclusive one. But both statements run in the same transaction, on the same DB connection, and therefore in the same DB session, and it doesn't make sense to me that a shared lock would have to be released to get an exclusive lock within the same session.
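To make the expectation concrete, here is a minimal plain-JDBC sketch (Kotlin) of the sequence I assumed a single transaction would run; the connection URL, credentials, and exact column names are placeholders, not my real setup:

import java.sql.Connection
import java.sql.DriverManager

fun main() {
    // Placeholder URL/credentials; the user table columns follow the APM trace above.
    DriverManager.getConnection("jdbc:mysql://localhost:3306/test", "app", "secret").use { conn ->
        conn.autoCommit = false
        conn.transactionIsolation = Connection.TRANSACTION_SERIALIZABLE

        // Under SERIALIZABLE, InnoDB runs this plain SELECT as "... LOCK IN SHARE MODE",
        // taking a shared (S) lock on the matching row.
        val id = conn.prepareStatement("select id from user where username = ?").use { ps ->
            ps.setString(1, "alice")
            ps.executeQuery().use { rs -> if (rs.next()) rs.getLong("id") else error("Not found User") }
        }

        // The UPDATE needs an exclusive (X) lock on the same row. Since this session already
        // holds the S lock, InnoDB should upgrade it without any intermediate commit.
        conn.prepareStatement("update user set last_logged_in = now() where id = ?").use { ps ->
            ps.setLong(1, id)
            ps.executeUpdate()
        }

        conn.commit() // the only commit I expected to see in the trace
    }
}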
What am I missing?
Related
As of MySQL 8.0.27, multithreaded replication is enabled by default on replica servers (source).
Before that, if replication failed, we could get the exact error from Last_Error in the output of show replica status\G. Now the error only identifies the failing transaction as 'ANONYMOUS':
Coordinator stopped because there were error(s) in the worker(s). The
most recent failure being: Worker 1 failed executing transaction
'ANONYMOUS' at master log mysql-bin.031116, end_log_pos 81744270. See
error log and/or
performance_schema.replication_applier_status_by_worker table for more
details about this failure or others, if any.
The table performance_schema.replication_applier_status_by_worker does not contain the exact error either:
mysql> select * from performance_schema.replication_applier_status_by_worker\G
*************************** 1. row ***************************
CHANNEL_NAME:
WORKER_ID: 1
THREAD_ID: 128
SERVICE_STATE: ON
LAST_ERROR_NUMBER: 0
LAST_ERROR_MESSAGE:
LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00.000000
LAST_APPLIED_TRANSACTION: ANONYMOUS
LAST_APPLIED_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP: 2021-11-16 11:35:04.414021
LAST_APPLIED_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP: 2021-11-16 11:35:04.414021
LAST_APPLIED_TRANSACTION_START_APPLY_TIMESTAMP: 2021-11-16 11:35:04.416898
LAST_APPLIED_TRANSACTION_END_APPLY_TIMESTAMP: 2021-11-16 11:35:04.420018
APPLYING_TRANSACTION:
APPLYING_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP: 0000-00-00 00:00:00.000000
APPLYING_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP: 0000-00-00 00:00:00.000000
APPLYING_TRANSACTION_START_APPLY_TIMESTAMP: 0000-00-00 00:00:00.000000
LAST_APPLIED_TRANSACTION_RETRIES_COUNT: 0
LAST_APPLIED_TRANSACTION_LAST_TRANSIENT_ERROR_NUMBER: 0
LAST_APPLIED_TRANSACTION_LAST_TRANSIENT_ERROR_MESSAGE:
LAST_APPLIED_TRANSACTION_LAST_TRANSIENT_ERROR_TIMESTAMP: 0000-00-00 00:00:00.000000
APPLYING_TRANSACTION_RETRIES_COUNT: 0
APPLYING_TRANSACTION_LAST_TRANSIENT_ERROR_NUMBER: 0
APPLYING_TRANSACTION_LAST_TRANSIENT_ERROR_MESSAGE:
APPLYING_TRANSACTION_LAST_TRANSIENT_ERROR_TIMESTAMP: 0000-00-00 00:00:00.000000
-- <I've removed 3 other similar blocks>
I can indeed find the error in the MySQL error log (e.g. "Could not execute Write_rows event on table db.table; Duplicate entry '16737' for key 'table.PRIMARY'"), but not from a query anymore.
Is there another query that would give me this last error message? Or a specific setting to log it & display it under show replica status\G;?
Actually, you are correct. Based on the documentation, once you are using MTR (multi-threaded replication), the SQL thread is the coordinator for worker threads. Here’s what the documentation has to say about it:
If the replica is multithreaded, the SQL thread is the coordinator for worker threads. In this case, the Last_SQL_Error field shows exactly what the Last_Error_Message column in the Performance Schema replication_applier_status_by_coordinator table shows. The field value is modified to suggest that there may be more failures in the other worker threads which can be seen in the replication_applier_status_by_worker table that shows each worker thread's status. If that table is not available, the replica error log can be used. The log or the replication_applier_status_by_worker table should also be used to learn more about the failure shown by SHOW SLAVE STATUS or the coordinator table.
In that case, you can check performance_schema.replication_applier_status_by_coordinator first, then get the details from performance_schema.replication_applier_status_by_worker. Otherwise, the error log can be used.
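For reference, a minimal JDBC sketch (Kotlin) that pulls the last applier error from both tables; the connection URL and credentials are placeholders, and the columns are the same ones you would read from the mysql client:

import java.sql.DriverManager

fun main() {
    // Placeholder URL/credentials; point this at the replica.
    DriverManager.getConnection("jdbc:mysql://replica-host:3306/", "monitor", "secret").use { conn ->
        conn.createStatement().use { st ->
            for (table in listOf(
                "performance_schema.replication_applier_status_by_coordinator",
                "performance_schema.replication_applier_status_by_worker"
            )) {
                st.executeQuery("SELECT CHANNEL_NAME, LAST_ERROR_NUMBER, LAST_ERROR_MESSAGE FROM $table").use { rs ->
                    while (rs.next()) {
                        println("$table [${rs.getString("CHANNEL_NAME")}]: " +
                                "error ${rs.getInt("LAST_ERROR_NUMBER")}: ${rs.getString("LAST_ERROR_MESSAGE")}")
                    }
                }
            }
        }
    }
}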
I have a MaxScale MariaDB cluster with one master and two slaves. I am using the flask-sqlalchemy ORM for querying and writing.
I have written read queries in this style:
db.session.query(User).join()....
Now all my read queries are going to the MaxScale master node.
Below are the MaxScale logs:
2021-09-14 17:38:26 info : (1239) (Read-Write-Service) > Autocommit: [disabled], trx is [open], cmd: (0x03) COM_QUERY, plen: 287, type: QUERY_TYPE_READ, stmt: SELECT some_col FROM user
2021-09-14 17:38:26 info : (1239) [readwritesplit] (Read-Write-Service) Route query to master: Primary <
I have tried other approaches too:
import mysql.connector

conn = mysql.connector.connect(...)
conn.autocommit = True
cursor = conn.cursor()
cursor.execute(query)
This works fine and routes the query to one of the slaves.
But most of my code is written in the ORM style. Is there any way to achieve this while using flask-sqlalchemy?
If autocommit is disabled, you always have an open transaction: use START TRANSACTION READ ONLY to start an explicit read-only transaction. This allows MaxScale to route the transaction to a slave.
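The routing decision is based on the SQL MaxScale sees, not on the client library, so here is a driver-agnostic sketch in Kotlin/JDBC of the statement sequence that readwritesplit will send to a slave (host, port, and credentials are placeholders). With flask-sqlalchemy, one option to try is issuing the same START TRANSACTION READ ONLY through db.session.execute() at the start of a read-only request; treat that as a suggestion to verify, not a tested recipe.

import java.sql.DriverManager

fun main() {
    // Placeholder MaxScale readwritesplit listener and credentials.
    DriverManager.getConnection("jdbc:mysql://maxscale-host:4006/test_db", "app", "secret").use { conn ->
        conn.autoCommit = false
        conn.createStatement().use { st ->
            // Marks the transaction read-only, so readwritesplit is free to route it to a slave.
            st.execute("START TRANSACTION READ ONLY")
            st.executeQuery("SELECT some_col FROM user").use { rs ->
                while (rs.next()) println(rs.getString("some_col"))
            }
        }
        conn.commit() // ends the read-only transaction
    }
}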
My app (https://github.com/atulsm/config-service) is a simple config service that stores/retrieves config key/values from a DB and consists of the following:
Dropwizard
Hibernate
Hibernate EHCache
Mysql
I have the MySQL general log enabled, and when I look into it while running a load test of a simple get(), this is the only thing I see, and it fills the entire log:
2019-08-28T11:36:12.003158Z 295 Query SET autocommit=0
2019-08-28T11:36:12.003318Z 295 Query commit
2019-08-28T11:36:12.003425Z 295 Query SET autocommit=1
2019-08-28T11:36:12.003834Z 295 Query SET autocommit=0
2019-08-28T11:36:12.004005Z 295 Query commit
2019-08-28T11:36:12.004105Z 295 Query SET autocommit=1
2019-08-28T11:36:12.004481Z 295 Query SET autocommit=0
2019-08-28T11:36:12.004646Z 295 Query commit
2019-08-28T11:36:12.004762Z 295 Query SET autocommit=1
Question:
Since no query was executed and the data was served from the second-level cache, my expectation was that there would be no interaction with MySQL.
Any idea why this is happening? Is there a way to disable it?
My QPS is currently ~800. I am hoping that if I can disable these commit calls, it will improve dramatically.
EDIT
The problem seems to be with @UnitOfWork. I removed the entire DAO call and replaced it with a dummy object, and I still see DB operations!
Bucket dummy = new Bucket();

@GET
@Produces(MediaType.APPLICATION_JSON)
@Timed
@UnitOfWork
public Bucket getBucket(@PathParam("bucketName") String bucketName) {
    //return BucketService.INSTANCE.getBucket(bucketName);
    return dummy;
}
This was addressed by disabling the transaction for GET operations using @UnitOfWork(transactional = false).
@GET
@Produces(MediaType.APPLICATION_JSON)
@Timed
@UnitOfWork(transactional = false)
public Bucket getBucket(@PathParam("bucketName") String bucketName) {
    return BucketService.INSTANCE.getBucket(bucketName);
}
I am running two MySQL databases -- one is on an Amazon AWS cloud server, and another is running on a server in my network.
These two databases are replicating normally in a multi-master arrangement seemingly without issue, but then every once in a while -- a few times a day -- I get an error in my application saying "Plugin instructed the server to rollback the current transaction."
The error persists for a few minutes (at least 15 minutes or so), and then replication goes back to normal again. In the MySQL error log I don't see any error, but in the general query log I do see the rollback happening:
2018-09-10T22:50:25.185065Z 4342 Query UPDATE `visit_team` SET `created` = '2018-09-10 12:34:56.306918', `last_updated` = '2018-09-10 22:50:25.183904', `last_changed` = '2018-09-10 22:50:25.183904', `visit_id` = 'J8R2QY', `station_type_id` = 'puffin', `current_state_id` = 680 WHERE `visit_team`.`uuid` = 'S80OSQ'
2018-09-10T22:50:25.185408Z 4342 Query commit
2018-09-10T22:50:25.222304Z 4340 Quit
2018-09-10T22:50:25.226917Z 4341 Query set autocommit=1
2018-09-10T22:50:25.240787Z 4341 Query SELECT `program_nodeconfig`.`id`, `program_nodeconfig`.`program_id`, `program_nodeconfig`.`node_id`, `program_nodeconfig`.`application_id`, `program_nodeconfig`.`bundle_version_id`, `program_nodeconfig`.`arguments`, `program_nodeconfig`.`station_type_id` FROM `program_nodeconfig` INNER JOIN `supervisor_node` ON (`program_nodeconfig`.`node_id` = `supervisor_node`.`id`) WHERE (`program_nodeconfig`.`program_id` = 'rwrs' AND `supervisor_node`.`cluster_id` = 2 AND `program_nodeconfig`.`station_type_id` = 'osprey')
... Six more select statements happen here, but removed for brevity...
2018-09-10T22:50:25.253520Z 4342 Query rollback
2018-09-10T22:50:25.253624Z 4342 Query set autocommit=1
In the log file above, the UPDATE query attempted in the first line gets rolled back even after the commit statement, and at 2018-09-10T22:50:25.254394 I received an application error saying that the query was rolled back.
I've seen the error when connecting to both databases -- both the cloud and internal.
Does anyone know what would cause the replication to fail randomly, but on a regular basis, and then go back to working again?
I am wondering why a MySQL "SELECT FOR UPDATE" lock blocks all the threads in my process, and how to get around this when I need to take that lock in a multi-threaded application.
To keep it simple, here is a small test in Ruby:
t1 = Thread.new do
  db = Sequel.connect("mysql://abcd:abcd@localhost:3306/test_db")
  db.transaction do
    db["select * from tables where id = 1 for update"].first
    10.times { |t| p 'babababa' }
  end
end

t2 = Thread.new do
  db = Sequel.connect("mysql://abcd:abcd@localhost:3306/test_db")
  db.transaction do
    db["select * from tables where id = 1 for update"].first
    10.times { |t| p 'lalalala' }
  end
end

t1.join
t2.join
p 'done'
Actually, the result will be:
Both threads hang for at least 50 seconds after one of them gets the "FOR UPDATE" lock (the lock wait timeout is 50 seconds in my MySQL settings), then one thread raises a "Lock wait timeout exceeded" error and quits, and the other thread successfully prints its 'baba' or 'lala' output.
This is related to the fact that the mysql driver locks the entire Ruby VM for every query. When the second thread hits the FOR UPDATE lock and freezes, nothing in Ruby runs until it times out, which causes the behavior you are seeing.
Either install the mysqlplus gem (which doesn't block the entire interpreter, and which Sequel's mysql adapter will use automatically if it is installed), or install the mysql2 gem and use the mysql2 adapter instead of the mysql adapter.