That's pretty much all there is to my question: Can schema changes be done within transactions in MySQL?
My understanding is no but I'm not having an easy time finding documentation that provides a definitive answer.
The "Statements That Cause an Implicit Commit" and "Statements That Cannot Be Rolled Back" sections of the MySQL documentation indicate quite clearly that schema changes should not be part of a transaction involving other commands: they will either cause a commit, or the schema changes themselves cannot be rolled back.
So, I'm working in MySQL at the moment, but any SQL answers will probably do, cuz I'm trying to understand the general concepts.
So thread safety is obviously important in concurrent environments. I program primarily in Java and I'm always extremely careful to write code that guards its mutable state to avoid thread conflicts.
In SQL, though, I'm very confused about how to achieve that same level of safety. So I'm gonna start with what I do know, go on to what I'm confused about, and take it from there.
First, what I do know is transactions. Disable auto commit, use savepoints, rollbacks, etc. Transactions, as I understand them, are atomic at the point of committing them.
But I've also seen references to explicit locking statements and concurrency models (optimistic, pessimistic), and I don't really get where all that fits in. I also don't want to just use transactions for everything and assume it'll be safe. I don't write code unless I understand it in its entirety; I don't want to leave anything to chance.
Moreover, what about triggers, procedures, etc. How do I use them with transactions? How do I ensure atomicity there?
I feel like I'm overcomplicating this a bit, but I'm looking for a comprehensive, clear-cut explanation of how to ensure that multiple threads and users can modify the database safely. Not quite an ELI5, since I understand SQL better than that, but something that really thoroughly explains the process.
Thanks. I haven't found a good match for this question on this site in my search, but if it is a duplicate I apologize and simply ask that a link to the appropriate answer be provided before this question is locked.
When working with database transactions, what are the possible conditions (if any) that would cause the final COMMIT statement in a transaction to fail, presuming that all statements within the transaction already executed without issue?
For example... let's say you have some two-phase or three-phase commit protocol where you do a bunch of statements, then wait for some master process to tell you when it is ok to finally commit the transaction:
-- <initial handshaking stuff>
START TRANSACTION;
-- <Execute a bunch of SQL statements>
-- <Inform master of readiness to commit>
-- <Time passes... background transactions happening while we wait>
-- <Receive approval to commit from master (finally!)>
COMMIT;
If your code gets to that final COMMIT statement and sends it to your DBMS, can you ever get an error (uniqueness issue, database full, etc) at that statement? What errors? Why? How do they appear? Does it vary depending on what DBMS you run?
COMMIT may fail. You might have had sufficient resources to log all the changes you wished to make, but lack the resources to actually implement the changes.
And that's not considering other reasons it might fail:
The change itself might not fit the constraints of the database.
Power loss stops things from completing.
The level of requested selection concurrency might disallow an update (cursors updating a modified table, for example).
The commit might time out or be on a connection which times out due to starvation issues.
The network connection between the client and the database may be lost.
And all the other "simple" reasons that aren't on the top of my head.
It is possible for some database engines to defer UNIQUE index constraint checking until COMMIT. Obviously if the constraint does not hold true at the time of commit then it will fail.
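A deferred check that only fails at COMMIT can be sketched as follows. This is a minimal illustration, not MySQL behavior: it assumes SQLite (through Python's standard `sqlite3` module) as a stand-in because it runs without a server and supports `DEFERRABLE INITIALLY DEFERRED` foreign keys. The `parent` and `child` tables are hypothetical.

```python
import sqlite3

# isolation_level=None puts the module in autocommit mode, so the
# BEGIN / COMMIT statements below are sent exactly as written.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE parent (id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE child ("
    " id INTEGER PRIMARY KEY,"
    " parent_id INTEGER REFERENCES parent(id)"
    " DEFERRABLE INITIALLY DEFERRED)"
)

conn.execute("BEGIN")
# There is no parent row 42, but the check is deferred, so the INSERT
# itself executes without error.
conn.execute("INSERT INTO child (id, parent_id) VALUES (1, 42)")

commit_error = None
try:
    conn.execute("COMMIT")  # the foreign key is checked here
except sqlite3.IntegrityError as exc:
    commit_error = exc
    # SQLite leaves the transaction open after the failed COMMIT,
    # so it has to be rolled back explicitly.
    conn.execute("ROLLBACK")

print("COMMIT failed:", commit_error)
```

Every statement inside the transaction succeeded, yet the COMMIT is the call that raises, which is why the result of COMMIT needs to be checked like that of any other statement.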
Sure.
In a multi-user environment, the COMMIT may fail because of changes by other users (e.g. your COMMIT would violate a referential constraint when applied to the now current database...).
Thomas
If you're using two-phase commit, then no. Everything that could go wrong is done in the prepare phase.
There could still be a network outage, power loss, cosmic rays, etc. during the commit, but even so, the transactions will have been written to permanent storage, and if a commit has been triggered, recovery processes should carry them through.
Hopefully.
Certainly, there could be a number of issues. The act of committing, in and of itself, must make some final, permanent entry to indicate that the transaction committed. If making that entry fails, then the transaction can't commit.
As Ignacio states, there can be deferred constraint checking (this could be any form of constraint, not just unique constraint, depending on the DBMS engine).
SQL Server Specific: flushing FILESTREAM data can be deferred until commit time. That could fail.
One very simple and often overlooked item: hardware failure. The commit can fail if the underlying server dies. This might be disk, cpu, memory, or even network related.
The transaction could fail if it never receives approval from the master (for any number of reasons).
No matter how wonderfully a system may be designed, there is going to be some possibility that a commit will get into a situation where it's impossible to know whether it succeeded or not. In some cases, it may not matter (e.g. if a hard drive holding the database turns into a pile of slag, it may be impossible to tell whether the commit succeeded or not before that occurred, but it wouldn't really matter); in other cases, however, this could be a problem. Especially with distributed database systems, if a connection failure occurs at just the right time during a commit, it will be impossible for both sides to be certain of whether the other side is expecting a commit or a rollback.
With MySQL or MariaDB, when used with Galera clustering, COMMIT is the point at which the other nodes in the cluster are checked. So yes, important errors can be discovered by COMMIT, and you must check for these errors.
Would it add overhead to put a DB transaction around every single service method in our application?
We currently only use DB transactions where it's an explicit/obvious necessity. I have recently suggested transactions around all service methods, but some other developers asked the prudent question: will this add overhead?
My feeling is not - auto commit is the same as a transaction from the DB perspective. But is this accurate?
DB: MySQL
You are right, with autocommit every statement is wrapped in a transaction. If your service methods execute multiple SQL statements, it would be good to wrap them in a single transaction. Take a look at this answer for more details, and here is a nice blog post on the subject.
And to answer your question: yes, transactions do add performance overhead, but in your specific case you will not notice the difference, since you already have autocommit enabled. The exception is long-running statements in service methods, which will hold locks longer on the tables participating in the transaction. If you wrap your multiple statements inside a transaction, you will get one transaction (instead of a transaction for every individual statement), as pointed out here ("A session that has autocommit enabled can perform a multiple-statement transaction by starting it with an explicit START TRANSACTION or BEGIN statement and ending it with a COMMIT or ROLLBACK statement"), and you will achieve atomicity at the service method level...
In the end, I would go with your solution if it makes sense from the perspective of achieving atomicity at the service method level (which I think is what you want), but there are both positive and negative effects on performance, depending on your queries, requests/s, etc...
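The idea of one transaction per service method can be sketched like this. It is a minimal sketch under stated assumptions: SQLite (via Python's standard `sqlite3` module) stands in for MySQL so the example runs without a server, and the `account` table and `transfer` function are hypothetical names made up for illustration.

```python
import sqlite3

# Autocommit mode: statements commit individually unless we send BEGIN.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account (id, balance) VALUES (1, 100), (2, 0)")


def transfer(conn, src, dst, amount):
    """Hypothetical service method: both updates commit or neither does."""
    conn.execute("BEGIN")
    try:
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE id = ?",
            (amount, dst),
        )
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE id = ?",
            (amount, src),
        )
        conn.execute("COMMIT")
    except Exception:
        # Undo the credit if the debit (or the commit) failed.
        conn.execute("ROLLBACK")
        raise


transfer(conn, 1, 2, 30)  # succeeds
try:
    transfer(conn, 1, 2, 500)  # debit violates the CHECK constraint
except sqlite3.IntegrityError:
    pass

balances = dict(conn.execute("SELECT id, balance FROM account"))
print(balances)
```

Without the explicit transaction, the credit in the failed call would have been committed on its own under autocommit, leaving the two accounts inconsistent.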
Yes, they can add overhead. The extra "bookkeeping" required to isolate transactions from each other can become significant, especially if the transactions are held open for a long time.
The short answer is that it depends on your table type. If you're using MyISAM (the default storage engine before MySQL 5.5), there are no real transactions, so there should be no effect on performance.
But you should use them anyway. Without transactions, there is no demarcation of work. If you upgrade to InnoDB or a real database like PostgreSQL, you'll want to add these transactions to your service methods anyway, so you may as well make it a habit now while it isn't costing you anything.
Besides, you should already be using a transactional store. How do you clean up if a service method fails currently? If you write some information to the database and then your service method throws an exception, how do you clean out that incomplete or erroneous information? If you were using transactions, you wouldn't have to—the database would throw away rolled back data for you. Or what do you do if I'm halfway through a method and another request comes in and finds my half-written data? Is it going to blow up when it goes looking for the other half that isn't there yet? A transactional data store would handle this for you: your transactions would be isolated from each other, so nobody else could see a partially written transaction.
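The isolation described above can be observed with two connections. This is only a sketch, assuming SQLite (via Python's standard `sqlite3` module) as a stand-in for a transactional MySQL engine. Its locking model differs from InnoDB's, but the visible effect is the same: a second connection does not see uncommitted rows. The `orders` table is hypothetical.

```python
import os
import sqlite3
import tempfile

# A file-backed database so that two separate connections can share it.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
writer = sqlite3.connect(path, isolation_level=None)
reader = sqlite3.connect(path, isolation_level=None)
writer.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")

writer.execute("BEGIN")
writer.execute("INSERT INTO orders (item) VALUES ('half-written')")

# The writer has not committed yet, so the reader sees no rows.
# fetchall() exhausts the cursor so the reader releases its read lock.
seen_before = reader.execute("SELECT COUNT(*) FROM orders").fetchall()[0][0]

writer.execute("COMMIT")

# After the commit the row becomes visible to other connections.
seen_after = reader.execute("SELECT COUNT(*) FROM orders").fetchall()[0][0]

print(seen_before, seen_after)

writer.close()
reader.close()
```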
Like everything with databases, the only definitive answer will come from testing with realistic data and realistic loads. I recommend that you do this always, no matter what you suspect, because when it comes to databases very different code paths get activated when the data are large versus when they are not. But I strongly suspect the cost of using transactions even with InnoDB is not great. After all, these systems are heavily used constantly, every day, by organizations large and small that depend on transactions performing well. MVCC adds very little overhead. The benefits are vast, the costs are low—use them!
Can anyone give (or point me to) a high-level overview of how MySQL implements transactions, rollbacks, and retries? I'm staring at some code but before diving in for the weekend I figured it'd be useful if someone could give me a birds-eye view so that I'd know where to start.
EDIT: Maybe I was a little less than clear. I'm not looking for how to use MySQL's client interfaces, I'm looking for how it actually does transactions. I'm looking for something like "check int my_isam_start_transaction(..." in my_isam.c.
MySQL only supports transactions if the table type is InnoDB. Otherwise, you have to do all the rollbacks and retries in code. Doing it in code can be really difficult, since you may lose the connection to the server and then be unable to roll back in a timely manner.
In a nutshell, you "wrap" your set of queries in START TRANSACTION and COMMIT queries.
http://dev.mysql.com/doc/refman/5.1/en/commit.html
InnoDB will automatically roll back in case of a failure or disconnect in your code.
I just went to the MySQL manual (somewhere on mysql.com), and did a search for "transactions":
http://dev.mysql.com/doc/refman/5.0/en/ansi-diff-transactions.html
It's for version 5.0, but it's a pretty repeatable process. For a general overview, Wikipedia is a good starting point on that whole strange "ACID" concept. However, transactions (and whether they are implemented correctly, not to mention the various quirks and best practices) depend heavily on the specific DB itself.
Following the magic formula above also yields:
http://dev.mysql.com/doc/refman/5.5/en/innodb.html (more detailed information on the InnoDB back-end, which is likely what you'll be using, although there are alternatives that support transactions, such as IBMDB2I)
Happy reading.
I'm refactoring some code, converting a number of related updates into a single transaction.
This is using JDBC, MySQL, InnoDB.
I believe there is an unwanted COMMIT still happening somewhere in the (rather large and undocumented) library or application code.
What's the easiest way to find out where this is happening?
There must be some sort of implicit commit, because I'm not finding any COMMIT statements.
Check out this page in the MySQL docs for statements that cause an implicit commit.
Also, since you're using JDBC, make sure autocommit is false, as in
connection.setAutoCommit(false);
I am not an expert with MySQL, but there should be a way to log all executed statements to a file and/or the console, which will probably help. If you can step through the code in a debugger, set breakpoints right before the commits you know about and then look at the logged statements. That way you'll probably see whether or not there is an unwanted commit.