Percona XtraDB Cluster multi-node writing and unexpected deadlocks outside of transaction? - mysql

I am having trouble finding an answer to this using google or Stack Overflow, so perhaps people familiar with Percona XtraDB can help answer this. I fully understand how unexpected deadlocks can occur as outlined in this article, and the solution is to make sure you wrap your transactions with retry logic so you can restart them if they fail. We already do that.
https://www.percona.com/blog/2012/08/17/percona-xtradb-cluster-multi-node-writing-and-unexpected-deadlocks/
My questions is about normal updates that occur outside of a transaction in auto commit mode. Normally if you are writing only to a single SQL DB and perform an update, you get a last in wins scenario so whoever executes the statement last, is golden. Any other data is lost so if two updates occur at the same time, one of them will take hold and the others data is essentially lost.
Now what happens in a multi master environment with the same thing? The difference in cluster mode with multi master is that the deadlock can occur at the point where the commit happens as opposed to when the lock is first taken on the table. So in auto commit mode, the data will get written to the DB but then it could fail when it tries to commit that to the other nodes in the cluster if something else modified the exact same record at the same time. Clearly the simply solution is to re-execute the update again and it would seem to me that the database itself should be able to handle this, since it is a single statement in auto commit mode?
So is that what happens in this scenario, or do I need to start wrapping all my update code in retry handling as well and retry it myself when this fails?

Autocommit is still a transaction; a single statement transaction. Your single statement is just wrapped up in BEGIN/COMMIT for you. I believe your logic is inverted. In PXC, the rule is "commit first wins". If you start a manual transaction on node1 (ie: autocommit=0; BEGIN;) and UPDATE id=1 and don't commit then on node2 you autocommit an update to the same row, that will succeed on node2 and succeed on node1. When you commit the manual UPDATE, you will get a deadlock error. This is correct behavior.
It doesn't matter if autocommit or not; whichever commits first wins and the other transaction must re-try. This is the reason why we don't recommend writing to multiple nodes in PXC.
Yes, if you want to write to multiple nodes, you need to adjust your code to "try-catch-retry" handle this error case.

Related

Is there a significant cost of Beginning & Committing a MySQL Transaction even if no changes made to tables?

This application is running on AWS EC2, using RDS/MySQL (5.7).
We have an audit script that is visiting every user in a MySQL user's table. \
For each user, we
Unconditionally start a transaction,
Possibly make some changes the user's record and to other tables.
Unconditionally commit the transaction.
Now, it's likely (and common) in step 2 that no changes were made to any table. The question arose during the code inspection as to the performance impact of Starting/Committing a transaction for many records when no changes were made.
I've read elsewhere that MySQL is optimized for commits, not rollback. But I've yet to find a discussion on the cost of starting/committing transactions when no work was done.
There's no significant cost in short running transactions.
The minimal cost of transactions starts to happen when changes are actually made.
Rollbacks are only expensive if there is data to be changed during the rollback, otherwise its a fairly empty operation.
Committing (and probably rollback) where there are no changes shouldn't incur a penalty.
Keep in mind that as soon as you read an InnoDB table, whether you explicitly "started a transaction" or not, InnoDB does similar work.
The alternative to starting a transaction is relying on autocommit. But autocommit doesn't mean no transaction happens. Autocommit means the transaction starts implicitly when your query touches an InnoDB table, and the transaction commits automatically as soon as the query is done. You can't run more than one statement, and you can't rollback, but otherwise it's the same as an explicit transaction.
You don't really save anything by trying to avoid starting a transaction.
There is most definitely an overhead, and it can be quite significant. I recently worked with a client who had exactly this issue (lots of empty commits due to a bug in their ORM) and their database was burning through about 30% of CPU on empty commits.
Capture a day of slow log with long_query_time=0 (important!), put it through mysqldumpslow or pt-query-digest, and see if you have a query that is just COMMIT; using up a large amount of CPU.

MySQL InnoDB: Difference Between `FOR UPDATE` and `LOCK IN SHARE MODE`

What is the exact difference between the two locking read clauses:
SELECT ... FOR UPDATE
and
SELECT ... LOCK IN SHARE MODE
And why would you need to use one over the other?
I have been trying to understand the difference between the two. I'll document what I have found in hopes it'll be useful to the next person.
Both LOCK IN SHARE MODE and FOR UPDATE ensure no other transaction can update the rows that are selected. The difference between the two is in how they treat locks while reading data.
LOCK IN SHARE MODE does not prevent another transaction from reading the same row that was locked.
FOR UPDATE prevents other locking reads of the same row (non-locking reads can still read that row; LOCK IN SHARE MODE and FOR UPDATE are locking reads).
This matters in cases like updating counters, where you read value in 1 statement and update the value in another. Here using LOCK IN SHARE MODE will allow 2 transactions to read the same initial value. So if the counter was incremented by 1 by both transactions, the ending count might increase only by 1 - since both transactions initially read the same value.
Using FOR UPDATE would have locked the 2nd transaction from reading the value till the first one is done. This will ensure the counter is incremented by 2.
For Update --- You're informing Mysql that the selected rows can be updated in the next steps(before the end of this transaction) ,,so that mysql does'nt grant any read locks on the same set of rows to any other transaction at that moment. The other transaction(whether for read/write )should wait until the first transaction is finished.
For Share- Indicates to Mysql that you're selecting the rows from the table only for reading purpose and not to modify before the end of transaction. Any number of transactions can access read lock on the rows.
Note: There are chances of getting a deadlock if this statement( For update, For share) is not properly used.
Either way the integrity of your data will be guaranteed, it's just a question of how the database guarantees it. Does it do so by raising runtime errors when transactions conflict with each other (i.e. FOR SHARE), or does it do so by serializing any transactions that would conflict with each other (i.e. FOR UPDATE)?
FOR SHARE (a.k.a. LOCK IN SHARE MODE): Transactions face a higher probability of failure due to deadlock, because they delay blocking until the moment an update statement is received (at which point they either block until all readlocks are released, or fail due to deadlock if another write is in progress). However, only one client blocks and eventually succeeds: the other clients will fail with deadlock if they try to update, so only one of them will succeed and the rest will have to retry their transactions.
FOR UPDATE: Transactions won't fail due to deadlock, because they won't be allowed to run concurrently. This may be desirable for example because it makes it easier to reason about multi-threading if all updates are serialized across all clients. However, it limits the concurrency you can achieve because all other transactions block until the first transaction is finished.
Pro-Tip: As an exercise I recommend taking some time to play with a local test database and a couple mysql clients on the command line to prove this behavior for yourself. That is how I eventually understood the difference myself, because it can be very abstract until you see it in action.

Restore Truncated Data In Mysql [duplicate]

I made a wrong update query in my table.
I forgot to make an id field in the WHERE clause.
So that updated all my rows.
How to recover that?
I didn't have a backup....
There are two lessons to be learned here:
Backup data
Perform UPDATE/DELETE statements within a transaction, so you can use ROLLBACK if things don't go as planned
Being aware of the transaction (autocommit, explicit and implicit) handling for your database can save you from having to restore data from a backup.
Transactions control data manipulation statement(s) to ensure they are atomic. Being "atomic" means the transaction either occurs, or it does not. The only way to signal the completion of the transaction to database is by using either a COMMIT or ROLLBACK statement (per ANSI-92, which sadly did not include syntax for creating/beginning a transaction so it is vendor specific). COMMIT applies the changes (if any) made within the transaction. ROLLBACK disregards whatever actions took place within the transaction - highly desirable when an UPDATE/DELETE statement does something unintended.
Typically individual DML (Insert, Update, Delete) statements are performed in an autocommit transaction - they are committed as soon as the statement successfully completes. Which means there's no opportunity to roll back the database to the state prior to the statement having been run in cases like yours. When something goes wrong, the only restoration option available is to reconstruct the data from a backup (providing one exists). In MySQL, autocommit is on by default for InnoDB - MyISAM doesn't support transactions. It can be disabled by using:
SET autocommit = 0
An explicit transaction is when statement(s) are wrapped within an explicitly defined transaction code block - for MySQL, that's START TRANSACTION. It also requires an explicitly made COMMIT or ROLLBACK statement at the end of the transaction. Nested transactions is beyond the scope of this topic.
Implicit transactions are slightly different from explicit ones. Implicit transactions do not require explicity defining a transaction. However, like explicit transactions they require a COMMIT or ROLLBACK statement to be supplied.
Conclusion
Explicit transactions are the most ideal solution - they require a statement, COMMIT or ROLLBACK, to finalize the transaction, and what is happening is clearly stated for others to read should there be a need. Implicit transactions are OK if working with the database interactively, but COMMIT statements should only be specified once results have been tested & thoroughly determined to be valid.
That means you should use:
SET autocommit = 0;
START TRANSACTION;
UPDATE ...;
...and only use COMMIT; when the results are correct.
That said, UPDATE and DELETE statements typically only return the number of rows affected, not specific details. Convert such statements into SELECT statements & review the results to ensure correctness prior to attempting the UPDATE/DELETE statement.
Addendum
DDL (Data Definition Language) statements are automatically committed - they do not require a COMMIT statement. IE: Table, index, stored procedure, database, and view creation or alteration statements.
Sorry man, but the chances of restoring an overwritten MySQL database are usually close to zero. Different from deleting a file, overwriting a record actually and physically overwrites the existing data in most cases.
To be prepared if anything comes up here, you should stop your MySQL server, and make a copy of the physical directory containing the database so nothing can get overwritten further: A simple copy+paste of the data folder to a different location should do.
But don't get your hopes up - I think there's nothing that can be done really.
You may want to set up a frequent database backup for the future. There are many solutions around; one of the simplest, most reliable and easiest to automate (using at or cron in Linux, or the task scheduler in Windows) is MySQL's own mysqldump.
Sorry to say that, but there is no way to restore the old field values without a backup.
Don't shoot the messenger...
Do you have binlogs enabled? You can recover by accessing the binlogs.

Should I commit after a single select

I am working with MySQL 5.0 from python using the MySQLdb module.
Consider a simple function to load and return the contents of an entire database table:
def load_items(connection):
cursor = connection.cursor()
cursor.execute("SELECT * FROM MyTable")
return cursor.fetchall()
This query is intended to be a simple data load and not have any transactional behaviour beyond that single SELECT statement.
After this query is run, it may be some time before the same connection is used again to perform other tasks, though other connections can still be operating on the database in the mean time.
Should I be calling connection.commit() soon after the cursor.execute(...) call to ensure that the operation hasn't left an unfinished transaction on the connection?
There are thwo things you need to take into account:
the isolation level in effect
what kind of state you want to "see" in your transaction
The default isolation level in MySQL is REPEATABLE READ which means that if you run a SELECT twice inside a transaction you will see exactly the same data even if other transactions have committed changes.
Most of the time people expect to see committed changes when running the second select statement - which is the behaviour of the READ COMMITTED isolation level.
If you did not change the default level in MySQL and you do expect to see changes in the database if you run a SELECT twice in the same transaction - then you can't do it in the "same" transaction and you need to commit your first SELECT statement.
If you actually want to see a consistent state of the data in your transaction then you should not commit apparently.
then after several minutes, the first process carries out an operation which is transactional and attempts to commit. Would this commit fail?
That totally depends on your definition of "is transactional". Anything you do in a relational database "is transactional" (That's not entirely true for MySQL actually, but for the sake of argumentation you can assume this if you are only using InnoDB as your storage engine).
If that "first process" only selects data (i.e. a "read only transaction"), then of course the commit will work. If it tried to modify data that another transaction has already committed and you are running with REPEATABLE READ you probably get an error (after waiting until any locks have been released). I'm not 100% about MySQL's behaviour in that case.
You should really try this manually with two different sessions using your favorite SQL client to understand the behaviour. Do change your isolation level as well to see the effects of the different levels too.

Recovery after wrong MySQL update query?

I made a wrong update query in my table.
I forgot to make an id field in the WHERE clause.
So that updated all my rows.
How to recover that?
I didn't have a backup....
There are two lessons to be learned here:
Backup data
Perform UPDATE/DELETE statements within a transaction, so you can use ROLLBACK if things don't go as planned
Being aware of the transaction (autocommit, explicit and implicit) handling for your database can save you from having to restore data from a backup.
Transactions control data manipulation statement(s) to ensure they are atomic. Being "atomic" means the transaction either occurs, or it does not. The only way to signal the completion of the transaction to database is by using either a COMMIT or ROLLBACK statement (per ANSI-92, which sadly did not include syntax for creating/beginning a transaction so it is vendor specific). COMMIT applies the changes (if any) made within the transaction. ROLLBACK disregards whatever actions took place within the transaction - highly desirable when an UPDATE/DELETE statement does something unintended.
Typically individual DML (Insert, Update, Delete) statements are performed in an autocommit transaction - they are committed as soon as the statement successfully completes. Which means there's no opportunity to roll back the database to the state prior to the statement having been run in cases like yours. When something goes wrong, the only restoration option available is to reconstruct the data from a backup (providing one exists). In MySQL, autocommit is on by default for InnoDB - MyISAM doesn't support transactions. It can be disabled by using:
SET autocommit = 0
An explicit transaction is when statement(s) are wrapped within an explicitly defined transaction code block - for MySQL, that's START TRANSACTION. It also requires an explicitly made COMMIT or ROLLBACK statement at the end of the transaction. Nested transactions is beyond the scope of this topic.
Implicit transactions are slightly different from explicit ones. Implicit transactions do not require explicity defining a transaction. However, like explicit transactions they require a COMMIT or ROLLBACK statement to be supplied.
Conclusion
Explicit transactions are the most ideal solution - they require a statement, COMMIT or ROLLBACK, to finalize the transaction, and what is happening is clearly stated for others to read should there be a need. Implicit transactions are OK if working with the database interactively, but COMMIT statements should only be specified once results have been tested & thoroughly determined to be valid.
That means you should use:
SET autocommit = 0;
START TRANSACTION;
UPDATE ...;
...and only use COMMIT; when the results are correct.
That said, UPDATE and DELETE statements typically only return the number of rows affected, not specific details. Convert such statements into SELECT statements & review the results to ensure correctness prior to attempting the UPDATE/DELETE statement.
Addendum
DDL (Data Definition Language) statements are automatically committed - they do not require a COMMIT statement. IE: Table, index, stored procedure, database, and view creation or alteration statements.
Sorry man, but the chances of restoring an overwritten MySQL database are usually close to zero. Different from deleting a file, overwriting a record actually and physically overwrites the existing data in most cases.
To be prepared if anything comes up here, you should stop your MySQL server, and make a copy of the physical directory containing the database so nothing can get overwritten further: A simple copy+paste of the data folder to a different location should do.
But don't get your hopes up - I think there's nothing that can be done really.
You may want to set up a frequent database backup for the future. There are many solutions around; one of the simplest, most reliable and easiest to automate (using at or cron in Linux, or the task scheduler in Windows) is MySQL's own mysqldump.
Sorry to say that, but there is no way to restore the old field values without a backup.
Don't shoot the messenger...
Do you have binlogs enabled? You can recover by accessing the binlogs.