MySQL transactions implicit commit - mysql

I'm starting out with MySQL trnsactions and I have a doubt:
In the documentation it says:
Beginning a transaction causes any pending transaction to be
committed. See Section 13.3.3, “Statements That Cause an Implicit
Commit”, for more information.
I have more or less 5 users on the same web application ( It is a local application for testing ) and all of them share the same MySQL user to interact with the database.
My question is: If I use transactions in the code and two of them start a transaction ( because of inserting, updating or something ) Could it be that the transactions interfere with each other?
I see in the statements that cause an implicit commit Includes starting a transaction. Being a local application It's fast and hard to tell if there is something wrong going on there, every query turns out as expected but I still have the doubt.

The implicit commit occurs within a session.
So for instance you start a transaction, do some updates and then forget to close the transaction and start a new one. Then the first transaction will implicitely committed.
However, other connections to the database will not be affected by that; they have their own transactions.
You say that 5 users use the same db user. That is okay. But in order to have them perform separate operations they should not use the same connection/session.

With MySQl by default each connection has autocommit turned on. That is, each connection will commit each query immediately. For an InnoDb table each transaction is therefore atomic - it completes entirely and without interference.
For updates that require several operations you can use a transaction by using a START TRANSACTION query. Any outstanding transactions will be committed, but this won't be a problem because mostly they will have been committed anyway.
All the updates performed until a COMMIT query is received are guaranteed to be completed entirely and without interference or, in the case of a ROLLBACK, none are applied.
Other transations from other connections see a consistent view of the database while this is going on.
This property is ACID compliance (Atomicity, Consistency, Isolation, Durability) You should be fine with an InnoDB table.
Other table types may implement different levels of ACID compliance. If you have a need to use one you should check it carefully.
This is a much simplified veiw of transaction handling. There is more detail on the MySQL web site here and you can read about ACID compliance here

Related

Is there a significant cost of Beginning & Committing a MySQL Transaction even if no changes made to tables?

This application is running on AWS EC2, using RDS/MySQL (5.7).
We have an audit script that is visiting every user in a MySQL user's table. \
For each user, we
Unconditionally start a transaction,
Possibly make some changes the user's record and to other tables.
Unconditionally commit the transaction.
Now, it's likely (and common) in step 2 that no changes were made to any table. The question arose during the code inspection as to the performance impact of Starting/Committing a transaction for many records when no changes were made.
I've read elsewhere that MySQL is optimized for commits, not rollback. But I've yet to find a discussion on the cost of starting/committing transactions when no work was done.
There's no significant cost in short running transactions.
The minimal cost of transactions starts to happen when changes are actually made.
Rollbacks are only expensive if there is data to be changed during the rollback, otherwise its a fairly empty operation.
Committing (and probably rollback) where there are no changes shouldn't incur a penalty.
Keep in mind that as soon as you read an InnoDB table, whether you explicitly "started a transaction" or not, InnoDB does similar work.
The alternative to starting a transaction is relying on autocommit. But autocommit doesn't mean no transaction happens. Autocommit means the transaction starts implicitly when your query touches an InnoDB table, and the transaction commits automatically as soon as the query is done. You can't run more than one statement, and you can't rollback, but otherwise it's the same as an explicit transaction.
You don't really save anything by trying to avoid starting a transaction.
There is most definitely an overhead, and it can be quite significant. I recently worked with a client who had exactly this issue (lots of empty commits due to a bug in their ORM) and their database was burning through about 30% of CPU on empty commits.
Capture a day of slow log with long_query_time=0 (important!), put it through mysqldumpslow or pt-query-digest, and see if you have a query that is just COMMIT; using up a large amount of CPU.

How do I determine if I have uncommitted writes in a MySQL transaction?

I'm working with a very complex codebase and wish to do some introspection of how it works with MySQL. (Note that I'm using InnoDB).
Specifically, in a method I'm writing, I wish to determine if there are any outstanding (non-committed) writes in its open db connection.
Is there any MySQL command - or other means - to detect if there are uncommitted writes? (or phrased another way, determine if COMMIT will make no modifications).
This is a pretty interesting question. I don't think there is a definite way to determine if issuing commit will or will not make a difference in the session you are running.
You can see transactions with show innodb status or show engine innodb status but I don't think you can issue commit on those transactions.
INNODB_TRX table in information_schema will show currently executing transations: https://dev.mysql.com/doc/refman/5.5/en/innodb-trx-table.html and again there's not much you can do to force-commit them. You can roll them back by killing the associated process.
If you are running a transaction using START TRANSACTION in a stored procedure, you can handle commit and rollback manually. You can even set autocommit to 0 to control when to rollback and when to commit.

What are the consequences of accessing the same database from different programs at the same time?

I have a mysql database that is accessed using JDBC. If I access the database from two different programs at the same time then what effect will be there on the database?
Please tell in view of when both programs are reading the database, one is reading and the other is writing data and when both are writing data.
I think that when both programs write data then that would definitely lead to loss of data. But what happens in the other scenarios?
MySQL works on an ACID basis: http://en.wikipedia.org/wiki/ACID
Which means, both clients will be reading the database as if they were the only clients.
For this to happen each client must start a transaction, which is a single logical unit of work. Within this transaction either all the operations done to the database must be committed or rolled back.
Different RDBMSs have different defaults for their transaction support. For MySQL, the isolation level is REPEATABLE READ, which means that SELECT statements within the same transaction are consistent with respect to each other.
How you can verify this:
Have program1 going start a transaction and through every row and increasing a value, while the other program starts a transaction and goes through the database calculating the sum of the same value for all rows. When they are done, they close their transactions and print out the results. You will notice that both of them read the database as if they were isolated from each other.
There are whole books written about JDBC. Here are some links that can get you started:
JDBC Tutorial: http://docs.oracle.com/javase/tutorial/jdbc/
MySQL: http://dev.mysql.com/doc/refman/5.0/en/innodb-consistent-read.html
Hopefully, MySQL like PostgreSQL, MariaDB or other major databases accept to be used by many programs, each being allowed to have many connections. And the database will not break even if multiple programs try to update the same row at the same time. But ... the how to do that is the problem of the client programs via transactions.
Welcome to the world of ACID transactions ! Within a transaction, the database guarantees that the program keeps a level of consistency. There is no problems for Atomicity, Consistency and Durability, but Isolation is a little more tedious. JDBC defines 4 level of isolation, plus no transaction at all (following extracted from The Java Tutorials : Using Transactions) :
The interface Connection includes five values that represent the transaction isolation levels you can use in JDBC:
Isolation Level Transactions Dirty Reads Non-Repeatable Reads/Phantom Reads
TRANSACTION_NONE Not supported Not applicable Not applicable Not applicable
TRANSACTION_READ_COMMITTED Supported Prevented Allowed Allowed
TRANSACTION_READ_UNCOMMITTED Supported Allowed Allowed Allowed
TRANSACTION_REPEATABLE_READ Supported Prevented Prevented Allowed
TRANSACTION_SERIALIZABLE Supported Prevented Prevented Prevented
Accessing an updated value that has not been committed is considered a dirty read because it is possible for that value to be rolled back to its previous value.
A non-repeatable read occurs when transaction A retrieves a row, transaction B subsequently updates the row, and transaction A later retrieves the same row again. Transaction A retrieves the same row twice but sees different data.
A phantom read occurs when transaction A retrieves a set of rows satisfying a given condition, transaction B subsequently inserts or updates a row such that the row now meets the condition in transaction A, and transaction A later repeats the conditional retrieval. Transaction A now sees an additional row. This row is referred to as a phantom.

Should I commit after a single select

I am working with MySQL 5.0 from python using the MySQLdb module.
Consider a simple function to load and return the contents of an entire database table:
def load_items(connection):
cursor = connection.cursor()
cursor.execute("SELECT * FROM MyTable")
return cursor.fetchall()
This query is intended to be a simple data load and not have any transactional behaviour beyond that single SELECT statement.
After this query is run, it may be some time before the same connection is used again to perform other tasks, though other connections can still be operating on the database in the mean time.
Should I be calling connection.commit() soon after the cursor.execute(...) call to ensure that the operation hasn't left an unfinished transaction on the connection?
There are thwo things you need to take into account:
the isolation level in effect
what kind of state you want to "see" in your transaction
The default isolation level in MySQL is REPEATABLE READ which means that if you run a SELECT twice inside a transaction you will see exactly the same data even if other transactions have committed changes.
Most of the time people expect to see committed changes when running the second select statement - which is the behaviour of the READ COMMITTED isolation level.
If you did not change the default level in MySQL and you do expect to see changes in the database if you run a SELECT twice in the same transaction - then you can't do it in the "same" transaction and you need to commit your first SELECT statement.
If you actually want to see a consistent state of the data in your transaction then you should not commit apparently.
then after several minutes, the first process carries out an operation which is transactional and attempts to commit. Would this commit fail?
That totally depends on your definition of "is transactional". Anything you do in a relational database "is transactional" (That's not entirely true for MySQL actually, but for the sake of argumentation you can assume this if you are only using InnoDB as your storage engine).
If that "first process" only selects data (i.e. a "read only transaction"), then of course the commit will work. If it tried to modify data that another transaction has already committed and you are running with REPEATABLE READ you probably get an error (after waiting until any locks have been released). I'm not 100% about MySQL's behaviour in that case.
You should really try this manually with two different sessions using your favorite SQL client to understand the behaviour. Do change your isolation level as well to see the effects of the different levels too.

Locking mySQL tables/rows

can someone explain the need to lock tables and/or rows in mysql?
I am assuming that it to prevent multiple writes to the same field, is this the best practise?
First lets look a good document This is not a mysql related documentation, it's about postgreSQl, but it's one of the simplier and clear doc I've read on transaction. You'll understand MySQl transaction better after reading this link http://www.postgresql.org/docs/8.4/static/mvcc.html
When you're running a transaction 4 rules are applied (ACID):
Atomicity : all or nothing (rollback)
Coherence : coherent before, coherent after
Isolation: not impacted by others?
Durability : commit, if it's done, it's really done
In theses rules there's only one which is problematic, it's Isolation. using a transaction does not ensure a perfect isolation level. The previous link will explain you better what are the phantom-reads and suchs isolation problems between concurrent transactions. But to make it simple you should really use Row levels locks to prevent other transaction, running in the same time as you (and maybe comitting before you), to alter the same records. But with locks comes deadlocks...
Then when you'll try using nice transactions with locks you'll need to handle deadlocks and you'll need to handle the fact that transaction can fail and should be re-launched (simple for or while loops).
Edit:------------
Recent versions of InnoDb provides greater levels of isolation than previous ones. I've done some tests and I must admit that even the phantoms reads that should happen are now difficult to reproduce.
MySQL is on level 3 by default of the 4 levels of isolation explained in the PosgtreSQL document (where postgreSQL is in level 2 by default). This is REPEATABLE READS. That means you won't have Dirty reads and you won't have Non-repeatable reads. So someone modifying a row on which you made your select in your transaction will get an implicit LOCK (like if you had perform a select for update).
Warning: If you work with an older version of MySQL like 5.0 you're maybe in level 2, you'll need to perform the row lock using the 'FOR UPDATE' words!
We can always find some nice race conditions, working with aggregate queries it could be safer to be in the 4th level of isolation (by using LOCK IN SHARE MODE at the end of your query) if you do not want people adding rows while you're performing some tasks. I've been able to reproduce one serializable level problem but I won't explain here the complex example, really tricky race conditions.
There is a very nice example of race conditions that even serializable level cannot fix here : http://www.postgresql.org/docs/8.4/static/transaction-iso.html#MVCC-SERIALIZABILITY
When working with transactions the more important things are:
data used in your transaction must always be read INSIDE the transaction (re-read it if you had data from before the BEGIN)
understand why the high isolation level set implicit locks and may block some other queries ( and make them timeout)
try to avoid dead locks (try to lock tables in the same order) but handle them (retry a transaction aborted by MySQL)
try to freeze important source tables with serialization isolation level (LOCK IN SHARE MODE) when your application code assume that no insert or update should modify the dataset he's using (if not you will not have problems but your result will have ignored the concurrent changes)
It is not a best practice. Modern versions of MySQL support transactions with well defined semantics. Use transactions, and forget about locking stuff by hand.
The only new thing you'll have to deal with is that transaction commits may fail because of race conditions, but you'd be doing error checking with locks anyway, and it is easier to retry the logic that led to a transaction failure than to recover from errors in a non-transactional setup.
If you do get race conditions and failed commits, then you may want to fine-tune the isolation configuration for your transactions.
For example if you need to generate invoice numbers which are sequential and have no numbers missing - this is a requirement at least in the country I live in.
If you have a few web servers, then a few users might be buying stuff literally at the same time.
If you do select max(invoice_id)+1 from invoice to get the new invoice number, two web servers might do that at the same time (before the new invoice has been added), and get the same invoice number for the invoices they're trying to create.
If you use a mechanism such as "auto_increment", this is just meant to generate unique values, and makes no guarantees about not missing out numbers (if one transaction tries to insert a row, then does a rollback, the number is "lost"),
So the solution is to (a) lock the table (b) select max(invoice_id)+1 from invoice (c) do the insert (d) commit + unlock the table.
On another note, in MySQL you're best using InnoDB and using row-level locking. Doing a lock table command can implicitly commit the transaciton you're working on.
Take a look here for general introduction to what transactions are and how to use them.
Databases are designed to work in concurrent environments, so locking the tables and/or records helps to keep the transactions consistent.
So a record affected by one transaction should not be altered until this transaction commits or rolls back.