I'm wondering whether a MySQL "SELECT ... FOR UPDATE" lock blocks all the threads in my process, and how to work around that if I need to take this lock in a multi-threaded application.
To keep it simple, here is a small test case in Ruby:
t1 = Thread.new do
  # each thread opens its own connection and takes the row lock inside a transaction
  db = Sequel.connect("mysql://abcd:abcd@localhost:3306/test_db")
  db.transaction do
    db["select * from tables where id = 1 for update"].first
    10.times { |t| p 'babababa' }
  end
end

t2 = Thread.new do
  db = Sequel.connect("mysql://abcd:abcd@localhost:3306/test_db")
  db.transaction do
    db["select * from tables where id = 1 for update"].first
    10.times { |t| p 'lalalala' }
  end
end

t1.join
t2.join
p 'done'
The actual result: both threads hang for at least 50 seconds after one of them acquires the FOR UPDATE lock (the lock wait timeout is 50 seconds in my MySQL settings). Then one thread raises a "Lock wait timeout exceeded" error and quits, while the other successfully prints its 'baba' or 'lala' lines.
This is because the mysql driver blocks the entire Ruby VM for every query. When the second thread hits the FOR UPDATE lock and stalls, nothing in Ruby runs until the query times out, which causes the behavior you are seeing.
Either install the mysqlplus gem (which doesn't block the entire interpreter, and which Sequel's mysql adapter will use automatically if it is installed), or install the mysql2 gem and use the mysql2 adapter instead of the mysql adapter.
Environment
Spring Boot 2.6.6
Spring Data JPA 2.6.6
Kotlin 1.6.10
mysql:mysql-connector-java 8.0.23
MySQL 5.7 (InnoDB)
Problem
A transaction commit happens during execution of my transactional method and I don't know why.
Code
@Transactional(isolation = Isolation.SERIALIZABLE)
fun login(username: String) {
    userRepository.findByUsername(username)
        ?.let {
            it.lastLoggedIn = LocalDateTime.now()
            userRepository.save(it)
        } ?: run {
            throw Exception("Not found User")
        }
}
Result (from APM monitoring tool)
-------------------------------------------------------------------------------------
p# # TIME T-GAP CONTENTS
-------------------------------------------------------------------------------------
[******] 18:50:13.493 0 Start transaction
- [000003] 18:50:13.639 40 spring-tx-doBegin(PROPAGATION_REQUIRED,ISOLATION_SERIALIZABLE)
- [000004] 18:50:13.639 0 getConnection jdbc:mysql://....(com.zaxxer.hikari.HikariDataSource#getConnection) 0 ms
- [000005] 18:50:13.641 2 [PREPARED] select ...{ellipsis} from user user0_ where user0_.username=? 0 ms
- [000006] 18:50:13.641 0 Fetch Count 1 1 ms
- [000007] 18:50:13.642 1 spring-tx-doCommit
- [000008] 18:50:13.643 1 [PREPARED] update user set last_logged_in=? where id=? 0 ms
[******] 18:50:13.646 3 End transaction
My thoughts
I set the transaction isolation level to SERIALIZABLE for a reason. As far as I know, it makes plain SELECT queries run with LOCK IN SHARE MODE, so they acquire a shared lock, while the UPDATE needs an exclusive lock. An exclusive lock cannot be acquired on a row that is locked in shared mode by another session, so I thought the transaction might be committed first to release the shared lock before acquiring the exclusive lock... but these queries run in the same transaction, on the same DB connection, and therefore in the same DB session. It doesn't make sense to me that the shared lock would have to be released to get the exclusive lock within the same session.
What am I missing?
I have a MaxScale MariaDB cluster with one master and two slaves. I am using the Flask-SQLAlchemy ORM for querying and writing.
I have written my read queries in this style:
db.session.query(User).join()....
All of my read queries are being routed to the MaxScale master node.
Below are the MaxScale logs:
2021-09-14 17:38:26 info : (1239) (Read-Write-Service) > Autocommit: [disabled], trx is [open], cmd: (0x03) COM_QUERY, plen: 287, type: QUERY_TYPE_READ, stmt: SELECT some_col FROM user
2021-09-14 17:38:26 info : (1239) [readwritesplit] (Read-Write-Service) Route query to master: Primary <
I have tried other approaches too:
conn = mysql_connector.connect(...)
conn.autocommit(True)
cursor = conn.cursor()
cursor.execute(query)
This works fine and routes the query to one of the slaves.
But most of my code is written in the ORM style. Is there any way to achieve this while using Flask-SQLAlchemy?
If autocommit is disabled, you always have an open transaction: use START TRANSACTION READ ONLY to start an explicit read-only transaction. This allows MaxScale to route the transaction to a slave.
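A minimal sketch of how that can look from Flask-SQLAlchemy, assuming db is the Flask-SQLAlchemy instance and User is the mapped model from the question, and that the underlying driver has autocommit disabled (so the session always runs inside a transaction); this is not a built-in Flask-SQLAlchemy feature, just raw SQL issued through the session:

from sqlalchemy import text

# Finish whatever implicit transaction the session has open, then start an
# explicit READ ONLY transaction on the same connection. MaxScale's
# readwritesplit router can then send the SELECTs that follow to a slave.
db.session.commit()
db.session.execute(text("START TRANSACTION READ ONLY"))

users = db.session.query(User).all()   # read-only ORM work goes here

# End the read-only transaction; the next write starts a normal one.
db.session.commit()

Wrapping the START TRANSACTION READ ONLY / COMMIT pair in a small helper or context manager keeps the read-only block explicit and avoids accidentally mixing writes into it.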
I have a reasonably large dataset of about 6,000,000 rows x 60 columns that I am trying to insert into a database. I am chunking it and inserting 10,000 rows at a time into a MySQL database, using a class I've written and pymysql. The problem is that I occasionally time out the server when writing, so I've modified my executemany call to reconnect on errors. This works fine when I lose the connection once, but if I lose it a second time I get a pymysql InternalError stating that the lock wait timeout was exceeded. I was wondering how I could modify the following code to catch that and destroy the transaction completely before attempting again.
I've tried calling rollback() on the connection, but this causes another InternalError if the connection has been destroyed, because there is no cursor anymore.
Any help would be greatly appreciated. (I also don't understand why I am getting the timeouts to begin with, but the data is relatively large.)
import time

import pymysql


class Database:
    def __init__(self, **creds):
        self.conn = None
        self.user = creds['user']
        self.password = creds['password']
        self.host = creds['host']
        self.port = creds['port']
        self.database = creds['database']

    def connect(self, type=None):
        self.conn = pymysql.connect(
            host=self.host,
            user=self.user,
            password=self.password,
            port=self.port,
            database=self.database
        )

    def executemany(self, sql, data):
        while True:
            try:
                with self.conn.cursor() as cursor:
                    cursor.executemany(sql, data)
                    self.conn.commit()
                break
            except pymysql.err.OperationalError:
                print('Connection error. Reconnecting to database.')
                time.sleep(2)
                self.connect()
                continue
        return cursor
and I am calling it like this:
for index, chunk in enumerate(dataframe_chunker(df), start=1):
    print(f"Writing chunk\t{index}\t{timer():.2f}")
    db.executemany(insert_query, chunk.values.tolist())
Take a look at what MySQL is doing. The lock wait timeouts occur because the inserts cannot proceed until something else finishes, and that something could be your own code.
SELECT * FROM `information_schema`.`innodb_locks`;
Will show the current locks.
select * from information_schema.innodb_trx where trx_id = [lock_trx_id];
Will show the involved transactions
SELECT * FROM INFORMATION_SCHEMA.PROCESSLIST where id = [trx_mysql_thread_id];
Will show the involved connection and may show the query whose lock results in the lock wait timeout. Maybe there is an uncommitted transaction.
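If it helps, here is a minimal sketch of running those same diagnostics from a second pymysql connection while an insert is blocked; the connection parameters are placeholders, and innodb_locks/innodb_trx are the MySQL 5.x information_schema names used above (they were replaced in MySQL 8.0):

import pymysql

# Separate diagnostic connection, opened while the insert is stuck.
diag = pymysql.connect(host='localhost', user='diag_user',
                       password='diag_password',
                       database='information_schema')
with diag.cursor() as cur:
    # Current locks held or waited on.
    cur.execute("SELECT * FROM information_schema.innodb_locks")
    for row in cur.fetchall():
        print(row)

    # Transactions involved, including any long-running uncommitted one.
    cur.execute("SELECT * FROM information_schema.innodb_trx")
    for row in cur.fetchall():
        print(row)
diag.close()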
It is likely your own code, because of the interaction with your executemany function, which catches exceptions and reconnects to the database. What about the prior connection? Does the lock wait timeout kill the prior connection? That while True loop is going to be trouble.
For the code calling executemany on the db connection, be more defensive on the try/except with something like:
def executemany(self, sql, data):
    while True:
        try:
            with self.conn.cursor() as cursor:
                cursor.executemany(sql, data)
                self.conn.commit()
            break
        except pymysql.err.OperationalError:
            print('Connection error. Reconnecting to database.')
            # Close the dead connection (pymysql exposes conn.open) so it is
            # not left dangling with an open transaction, then reconnect.
            if self.conn.open:
                self.conn.close()
            time.sleep(2)
            self.connect()
But the real solution here is to not induce lock wait timeouts in the first place, given that there are no other database clients.
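If a lock wait timeout does slip through, a hedged sketch of the rollback-and-retry the question asked about could look like this. It assumes the pymysql-based Database class above; 1205 is MySQL's "Lock wait timeout exceeded" error code, and depending on the pymysql version it may surface as OperationalError rather than InternalError:

def executemany(self, sql, data):
    while True:
        try:
            with self.conn.cursor() as cursor:
                cursor.executemany(sql, data)
                self.conn.commit()
            return
        except pymysql.err.OperationalError:
            # Connection-level failure: the old transaction died with the
            # connection, so reconnect and retry the chunk.
            print('Connection error. Reconnecting to database.')
            time.sleep(2)
            self.connect()
        except pymysql.err.InternalError as exc:
            # 1205 = lock wait timeout. The connection is still alive, so
            # roll back the open transaction on it before retrying.
            if exc.args[0] != 1205:
                raise
            self.conn.rollback()
            time.sleep(2)

This only treats the symptom, though; the main point stands that the abandoned connection from the earlier reconnect is the most likely holder of the locks that cause the timeout in the first place.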
I am running two MySQL databases -- one is on an Amazon AWS cloud server, and another is running on a server in my network.
These two databases are replicating normally in a multi-master arrangement seemingly without issue, but then every once in a while -- a few times a day -- I get an error in my application saying "Plugin instructed the server to rollback the current transaction."
The error persists for a while (at least 15 minutes or so), and then things go back to replicating normally again. I don't see anything in the MySQL error log, but in the general query log I do see the rollback happening:
2018-09-10T22:50:25.185065Z 4342 Query UPDATE `visit_team` SET `created` = '2018-09-10 12:34:56.306918', `last_updated` = '2018-09-10 22:50:25.183904', `last_changed` = '2018-09-10 22:50:25.183904', `visit_id` = 'J8R2QY', `station_type_id` = 'puffin', `current_state_id` = 680 WHERE `visit_team`.`uuid` = 'S80OSQ'
2018-09-10T22:50:25.185408Z 4342 Query commit
2018-09-10T22:50:25.222304Z 4340 Quit
2018-09-10T22:50:25.226917Z 4341 Query set autocommit=1
2018-09-10T22:50:25.240787Z 4341 Query SELECT `program_nodeconfig`.`id`, `program_nodeconfig`.`program_id`, `program_nodeconfig`.`node_id`, `program_nodeconfig`.`application_id`, `program_nodeconfig`.`bundle_version_id`, `program_nodeconfig`.`arguments`, `program_nodeconfig`.`station_type_id` FROM `program_nodeconfig` INNER JOIN `supervisor_node` ON (`program_nodeconfig`.`node_id` = `supervisor_node`.`id`) WHERE (`program_nodeconfig`.`program_id` = 'rwrs' AND `supervisor_node`.`cluster_id` = 2 AND `program_nodeconfig`.`station_type_id` = 'osprey')
... Six more select statements happen here, but removed for brevity...
2018-09-10T22:50:25.253520Z 4342 Query rollback
2018-09-10T22:50:25.253624Z 4342 Query set autocommit=1
In the log file above, the UPDATE query attempted on the first line gets rolled back even though a commit was issued, and at 2018-09-10T22:50:25.254394 I received an application error saying that the query was rolled back.
I've seen the error when connecting to both databases -- both the cloud and internal.
Does anyone know what would cause the replication to fail randomly, but on a regular basis, and then go back to working again?
I have a production web site with the following environment:
Rails 2.3.5
MySQL Server 5.1.33
Enterprise Ruby 1.8.6 (2008-08-11 patchlevel 287) [x86_64-linux]
mysql gem 2.7
Old version of BackgrounDRb plugin running on 4 different servers for background tasks, with 5 different workers each (Ruby threads, not separate processes!).
One of the BackgrounDRb workers processes the job queue using a variation of "optimistic locking":
update_sql = "update jobs
              set updated_at = CURRENT_TIMESTAMP,
                  in_process = 1
              where id = #{job.id} and in_process = 0"
affected_rows = Job.connection.update(update_sql)
captured_job = affected_rows > 0 ? Job.find(job.id) : nil
The code above tries to update the record with the given ID, with an extra condition on the in_process field. If the same record has already been updated by a different server/process, the UPDATE statement simply returns 0 (zero) affected rows and the job is not processed by two different servers at the same time.
The problem is: sometimes "Job.connection.update(update_sql)" returns 0 (zero) even when the record was actually updated! I was only able to find that out after heavy logging was added to the code. It only happens in production, at night, when we have a heavy load...
My guess is that the mysql gem uses some global (class-level) variable for affected_rows that is shared across all 5 threads of the BackgrounDRb process, but I'm not sure. I looked through the code of the mysql gem and ActiveRecord, but I couldn't work out how it really behaves.
How could this happen?
Update 2010-07-07: We decided not to use threads for job processing -- that will solve all our problems: every job processor will be a separate process :)