MySQL provides several kinds of locks. Among them, SELECT ... FROM is a consistent read: it reads a snapshot of the data and sets no locks (unless the transaction isolation level is SERIALIZABLE) (https://dev.mysql.com/doc/refman/5.7/en/innodb-locks-set.html).
The snapshot (MVCC) is implemented in MySQL by adding a header to each tuple (containing the transaction version and a pointer to older versions) plus logical visibility rules.
But we always emphasize the design of the visibility rules while ignoring the fact that reading and writing the tuple header are two mutually exclusive actions, which, it seems, can only be made safe by locking.
So how should the claim that a consistent read takes no locks be understood? Is it only lock-free in a loose sense? How is atomic reading and writing of the tuple header implemented? Is there a performance cost? Is there any material available on this?
----- supplementary note -----
When a row (tuple) is updated, the new version of the row is kept along with the old copy/copies. Each copy carries a sequence number (transaction id).
The transaction id and the pointers to the copies are stored in the row header, so creating a copy means modifying the row header (updating the transaction id and the pointer to the copy). Accessing the row means reading the row header first, to decide which version (location) we should see.
Modifying the row header and reading the row header ought to be mutually exclusive (otherwise dirty data could be read under concurrent reads and writes). What I want to know is how MySQL implements this read/write logic on the row header. Is it a read-write lock, a spin lock, or some other clever method?
I think the answer goes something like this.
When a row is updated, the new version of the row is kept, along with the old copy/copies. Each copy has a sequence number (transaction id) with it. After both transactions COMMIT or ROLLBACK, the set of rows is cleaned up -- the surviving one is kept; the other(s) are tossed.
That sequence number has the effect of labeling rows as belonging to the snapshot of the dataset that was taken at BEGIN time.
Rows with sequence numbers that are equal to or older than the transaction in question are considered as fair game for reading by the transaction. Note that no lock is needed for such a "consistent read".
Each copy of the row has its own tuple header, with a different sequence number. The copies are chained together in a "history list". Updates/deletes to the row will add new items to the list, but leave the old copies unchanged. (Again, this points out that no lock is needed for an old read.)
The "read dirty" isolation level (READ UNCOMMITTED) allows the transaction to go through the history list and 'see' the latest copy.
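A minimal two-session sketch of that consistent-read behaviour under InnoDB's default REPEATABLE READ level (the table and values are made up for illustration):

-- session 1
START TRANSACTION;
SELECT balance FROM accounts WHERE id = 1;   -- say it returns 100; no row lock is taken

-- session 2
START TRANSACTION;
UPDATE accounts SET balance = 50 WHERE id = 1;   -- takes an exclusive row lock, adds a new version
COMMIT;

-- session 1, afterwards
SELECT balance FROM accounts WHERE id = 1;   -- still returns 100: the old copy from the history list
COMMIT;

The reader never blocks on the writer's lock and never sees a half-finished state; it simply follows the version chain back to the copy that matches its snapshot.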
Performance overhead? Sure. Everything has a performance overhead. But... The "performance" that matters is the grand total of all actions. There is a lot of complexity, but the end result is 'better' performance.
The history list is a lot of overhead, but it helps by decreasing the locking, etc.
InnoDB uses "optimistic" locking -- That is it starts a transaction with the assumption (hope) that it will COMMIT. The cost is that ROLLBACK is less efficient. This seems like a reasonable tradeoff.
InnoDB has a lot of overhead, yet it can beat the much-simpler MyISAM Engine in many benchmarks. Faster and ACID -- May as well get rid of MyISAM. And that is the direction Oracle is taking MySQL.
Related
I was practicing some "system design" coding questions and I was interested in how to solve a concurrency problem in MySQL. The problem was "design an inventory checkout system".
Let's say you are trying to check out a specific item from an inventory, a library book for instance.
If two people are on the website, looking to book it, is it possible that they both check it out? Let's assume the query is updating the status of the row to mark a boolean checked_out to True.
Would transactions solve this issue? It would cause the second query that runs to fail (assuming they are the same query).
Alternatively, we insert rows into a checkouts table. Since both queries read that the item is not checked out currently, they could both insert into the table. I don't think a transaction would solve this, unless the transaction includes reading the table to see if a checkout currently exists for this item that hasn't yet ended.
One of the suggested methods
How would I simulate two writes at the exact same time to test this?
No, transactions alone do not address concurrency issues. Let's quickly revisit MySQL's definition of transactions:
Transactions are atomic units of work that can be committed or rolled back. When a transaction makes multiple changes to the database, either all the changes succeed when the transaction is committed, or all the changes are undone when the transaction is rolled back.
To sum it up: transactions are a way to ensure data integrity.
RDBMSs use various types of locking, isolation levels, and storage-engine-level mechanisms to address concurrency. People often mistake transactions for a means to control concurrency because transactions affect how long certain locks are held.
Focusing on InnoDB: when you issue an UPDATE statement, MySQL places an exclusive lock on the record being updated. Only the transaction holding the exclusive lock can modify that record; the others have to wait until the transaction is committed.
How does this help you prevent multiple users from checking out the same book? Let's say you have an id field uniquely identifying the books and a checked_out field indicating the status of the book.
You can use the following atomic update to check out a book:
update books set checked_out=1 where id=xxx and checked_out=0
The checked_out=0 criteria makes sure that the update only succeeds if the book is not checked out yet. So, if the above statement affects a row, then the current user checks out the book. If it does not affect any rows, then someone else has already checked out the book. The exclusive lock makes sure that only one transaction can update the record at any given time, thus serializing the access to that record.
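In application code you simply check the affected-row count returned by the statement; from plain SQL you could inspect it like this (a sketch, using an arbitrary id of 42):

update books set checked_out=1 where id=42 and checked_out=0;
select row_count();   -- 1: this session got the book; 0: somebody else beat us to it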
If you want to use a separate checkouts table for reserving books, then you can use a unique index on book ids to prevent the same book being checked out more than once.
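A sketch of that variant (table and column names are illustrative): the unique key turns a second concurrent checkout into a duplicate-key error instead of a silent double booking.

create table checkouts (
  book_id int not null,
  user_id int not null,
  checked_out_at datetime not null,
  primary key (book_id)   -- at most one active checkout per book
);

insert into checkouts (book_id, user_id, checked_out_at) values (42, 7, now());
-- a second insert for book_id = 42 fails with a duplicate-key error; that user does not get the book
-- returning the book is then simply a delete of the row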
Transactions don't cause updates to fail. They cause sequences of queries to be serialized. Only one accessor can run the sequence of queries; others wait.
Everything in SQL is a transaction, single-statement update operations included. The kind of transaction denoted by START TRANSACTION; ... COMMIT; bundles a series of queries together.
I don't think a transaction would solve this, unless the transaction
includes reading the table to see if a checkout currently exists for
this item.
That's generally correct. Checkout schemes must always read availability from the database. The purpose of the transaction is to avoid race conditions when multiple users attempt to check out the same item.
SQL doesn't have thread-safe atomic test-and-set instructions like multithreaded processor cores have. So you need to use transactions for this kind of thing.
The simplest form of checkout uses a transaction, something like this.
START TRANSACTION;
SELECT is_item_available, id FROM item WHERE catalog_number = whatever FOR UPDATE;
/* if the item is not available, tell the user and commit the transaction without the update */
UPDATE item SET is_item_available = 0 WHERE id = itemIdPreviouslySelected;
/* tell the user the checkout succeeded. */
COMMIT;
It's clearly possible for two or more users to attempt to check out the same item more-or-less simultaneously. But only one of them actually gets the item.
A more complex checkout scheme, not detailed here, uses a two-step system. First step: a transaction to reserve the item for a user, rejecting the reservation if someone else has it checked out or reserved. Second step: reservation holder has a fixed amount of time to accept the reservation and check out the item, or the reservation expires and some other user may reserve the item.
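One possible shape for that two-step scheme (the reservation columns and the 10-minute window are assumptions for the sketch, not part of the answer above):

/* step 1: reserve, only if the item is available and not actively reserved */
UPDATE item
SET reserved_by = 123, reserved_until = NOW() + INTERVAL 10 MINUTE
WHERE id = 42
  AND is_item_available = 1
  AND (reserved_until IS NULL OR reserved_until < NOW());
/* 0 rows affected means someone else holds the item or a live reservation */

/* step 2: within the window, the same user converts the reservation into a checkout */
UPDATE item
SET is_item_available = 0, reserved_by = NULL, reserved_until = NULL
WHERE id = 42 AND reserved_by = 123 AND reserved_until >= NOW();
/* if the window has expired, this affects 0 rows and the item falls back to being reservable */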
I have read the MySQL manual about intention locks:
http://dev.mysql.com/doc/refman/5.5/en/innodb-locking.html#innodb-intention-locks
It says that "To make locking at multiple granularity levels practical", but how? It does not tell us about it.
Can anyone give a detailed explanation and a sample?
Think of the InnoDB data space as a collection of databases, each database being a collection of tables, and each table being a collection of rows. This forms a hierarchy, where lower and lower levels offer more and more granularity.
Now, when you want to update some part(s) of this tree in a transaction, how do you do it? Well, InnoDB employs multiple granularity locking (MGL). The mechanism in MGL is that you specify "intentions" to lock at a particular granularity level as shared or exclusive, then MGL combines all these intentions together and marches up the hierarchy until it finds the minimum spanning set that has to be locked given those intentions.
Without intention locks, you have high level shared and exclusive locks which really don't give you much flexibility: they're all or nothing, which is what we see in MyISAM. But MGL brings the concept of intended shared and intended exclusive, which it uses as I said above to provide "just enough" locking.
If you'd like to know about the specific C level implementation, refer to Introduction to Transaction Locks in InnoDB.
From this link
the table-level intention locks are still not released so other transactions cannot lock the whole table in S or X mode.
I think intention locks exist to make table-level locking more efficient: MySQL does not have to traverse the entire tree to see whether a conflicting row lock exists.
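A concrete illustration (the table name is made up): the row lock taken in session 1 is announced by an intention lock at the table level, so a later request for a full table lock can be refused by looking at the table-level locks alone.

-- session 1
START TRANSACTION;
SELECT * FROM books WHERE id = 42 FOR UPDATE;   -- IX intention lock on the table, X lock on the row

-- session 2
LOCK TABLES books WRITE;   -- blocks: the table-wide write lock conflicts with session 1's IX lock,
                           -- and the server can see that without scanning every row lock in the table

-- on newer servers (MySQL 8.0) you can observe both locks while session 1 is open:
SELECT object_name, lock_type, lock_mode FROM performance_schema.data_locks;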
Variable-size fields seem like they could cause performance issues.
For the sake of being concrete, let's assume we're using a relational database. Suppose a relation has a variable length text field. What happens if an update to a tuple in the relation increases the variable length field's size? An in-line record edit (i.e. editing the file containing the record in-line) would require shuffling around the other tuples residing on the same physical page -- potentially kicking some out.
I understand that different DBMSs handle this differently, but I'm curious what some of the common practices are for this. It seems to me that the best way to do this would be to simply mark the existing tuple as deleted and create a whole new tuple.
"It depends". Each implementation is different and practically warrants its own small book. (I should really be close-voting this question not answering it, but I figure I'll try to help and I can't make this short enough for a comment).
For PostgreSQL, read the developer documentation about DB storage and VARLENA, storage classes and TOAST, as well as the manual section on MVCC and concurrency control. For more info, start reading the code, many of the key headers and source files have good detailed comments that explain the low level operation.
The condensed version, which you may have to read the above-mentioned resources to understand:
PostgreSQL never overwrites a tuple during an update. It always writes it to a new location. If the location is on the same physical page and there are no indexes changed it avoids index updates, but it'll always do a heap write of a new tuple. It sets the xmax value of the old tuple and the xmin of the new one so that a transaction can only ever see one or the other. See the concurrency and mvcc docs for the gory details.
Variable-length values may be stored inline or out of line (TOAST). If a value is stored inline in the heap tuple, which is the default for small values, then when you update the record (whether you update that field or some other) the data gets copied to a new tuple, just like fixed-length data does. If it's stored out of line in a TOAST side table and is unmodified, only the pointer to it is copied, not the value itself. If it's stored out of line and modified, a new record is written to the TOAST table for the new value, and a pointer to it is stored in the newly written heap tuple.
Later on, VACUUM comes along and marks obsolete tuples, freeing space and allowing them to be overwritten.
Because PostgreSQL must retain the old data to be visible to old transactions it can never do an in-place modification.
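You can watch this from SQL using PostgreSQL's hidden system columns (a small sketch with a throwaway table):

CREATE TABLE t (id int PRIMARY KEY, payload text);
INSERT INTO t VALUES (1, 'short');

SELECT ctid, xmin, xmax, payload FROM t WHERE id = 1;
-- e.g. ctid = (0,1): page 0, item 1

UPDATE t SET payload = repeat('x', 500) WHERE id = 1;

SELECT ctid, xmin, xmax, payload FROM t WHERE id = 1;
-- ctid has changed (e.g. to (0,2)): the update wrote a brand-new tuple; the old one stays on the
-- page, invisible to new snapshots, until VACUUM eventually reclaims it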
In theory it'd be possible to put the old data somewhere else and then overwrite it - that's what Oracle does, with its undo and redo logs - but that's not what PostgreSQL does. Doing that introduces different complexities and trade-offs, solving problems and creating others.
(The only exception to the no-overwrite rule is pg_largeobject, which uses a sort of slice based copy-on-write to allow transactional updates to big file-like chunks of data without copying the whole file. Oh, and you could argue that SEQUENCEs get overwritten too. Also some full-table-lock operations.)
Other RDBMSes work in different ways. Some even support multiple modes. MySQL, for example, offers both MyISAM tables (in-place writes, AFAIK) and InnoDB (MVCC copy-on-write). Oracle has the undo and redo logs - it copies the old data to out-of-line storage and then does an in-place update. Other DBMSes are no doubt different again.
For PostgreSQL, there is some information about that in http://www.postgresql.org/docs/9.3/static/datatype-character.html:
Tip: There is no performance difference among these three types [varchar(n)/character varying(n), char(n)/character(n), text], apart from increased storage space when using the blank-padded type, and a few extra CPU cycles to check the length when storing into a length-constrained column. While character(n) has performance advantages in some other database systems, there is no such advantage in PostgreSQL; in fact character(n) is usually the slowest of the three because of its additional storage costs. In most situations text or character varying should be used instead.
I would assume that "an in-line record edit" will never occur, due to data integrity requirements and transaction processing (MVCC).
There is some (fairly old) information about transaction processing:
http://www.postgresql.org/files/developer/transactions.pdf
We must store multiple versions of every row. A tuple can be removed only after it’s been committed as deleted for long enough that no active transaction can see it anymore.
I start this worker 10 times to give it a sense of concurrency:
class AnalyzerWorker
  #queue = :analyzer

  def self.perform
    loop do
      # My attempt to lock pictures from other worker instances that may
      # try to analyze the same picture (race condition)
      pic = Pic.where(locked: false).first
      pic.update_attributes locked: true
      pic.analyze
    end
  end
end
This code is actually still vulnerable to a race condition; one of the reasons, I think, is that there is a gap in time between fetching the unlocked picture and actually locking it.
Maybe there are more reasons. Is there any robust approach to prevent this?
Active Record provides optimistic locking and pessimistic locking.
In order to use optimistic locking, the table needs to have a column
called lock_version of type integer. Each time the record is updated,
Active Record increments the lock_version column. If an update request
is made with a lower value in the lock_version field than is currently
in the lock_version column in the database, the update request will
fail with an ActiveRecord::StaleObjectError.
Pessimistic locking uses a locking mechanism provided by the
underlying database. Using lock when building a relation obtains an
exclusive lock on the selected rows. Relations using lock are usually
wrapped inside a transaction for preventing deadlock conditions.
Code samples are provided in the referenced links...
Either should work but each need different implementations. From what you are doing, I'd consider pessimistic locking since the possibility of a conflict is relatively high.
Your current implementation is kind of a mixture of both; however, as you indicated, it doesn't really solve the problem. You might be able to make yours work, but using the Active Record implementation makes more sense.
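For the worker above, pessimistic locking boils down to SQL along these lines (a sketch; the pics table name follows Rails conventions, and SKIP LOCKED needs MySQL 8.0+ or PostgreSQL 9.5+ - without it, concurrent workers simply queue up behind the same row):

START TRANSACTION;
-- claim one unlocked picture; SKIP LOCKED lets the ten workers each grab a different row
SELECT id FROM pics WHERE locked = 0 LIMIT 1 FOR UPDATE SKIP LOCKED;
UPDATE pics SET locked = 1 WHERE id = <the id just selected>;
COMMIT;
-- run the (slow) analysis outside the transaction, then clear or record the lock

In Rails this is roughly the SQL that Pic.where(locked: false).lock.first issues (minus SKIP LOCKED), wrapped in a transaction block.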
I use a table with one row to keep the last used ID (I have my reasons not to use auto_increment). My app has to run on a server farm, so I wonder how I can update the last inserted ID (i.e. increment it) and select the new ID in a single step, to avoid thread-safety problems (a race condition between the servers in the farm).
You're going to use a server farm for the database? That doesn't sound "right".
You may want to consider using GUIDs for IDs. They may be big, but they don't have duplicates.
With a single "next id" value you will run into locking contention for that record. What I've done in the past is use a table of ranges of id's (RangeId, RangeFrom, RangeTo). The range table has a primary key of "RangeId" that is a simple number (eg. 1 to 100). The "get next id" routine picks a random number from 1 to 100, gets the first range record with an id lower than the random number. This spreads the locks out across N records. You can use 10's, 100's or 1000's of range records. When a range is fully consumed just delete the range record.
If you're really using multiple databases then you can manually ensure each database's set of range records do not overlap.
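A sketch of that range-table idea (the schema, column names, and the 1-100 spread are illustrative):

create table id_ranges (
  range_id  int primary key,   -- 1 .. 100
  next_id   bigint not null,   -- next unused id in this range
  range_to  bigint not null    -- last id belonging to this range
);

set @r = floor(1 + rand() * 100);
start transaction;
select range_id, next_id from id_ranges
 where range_id <= @r and next_id <= range_to
 order by range_id desc limit 1 for update;
update id_ranges set next_id = next_id + 1 where range_id = <the range_id just selected>;
commit;
-- once next_id passes range_to, delete that range row, as described above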
You need to make sure that your ID column is only ever accessed in a lock - then only one person can read the highest and set the new highest ID.
You can do this in C# using a lock statement around the code that accesses the table, or in your database you can put together a transaction around your read/write. I don't know the exact syntax for this on MySQL.
Use a transactional database and control transactions manually. That way you can submit multiple queries without risking having something mixed up. Also, you may store the relevant query sets in stored procedures, so you can simply invoke these transactional queries.
If you have problems with performance, increment the ID by 100 and use a thread per "client" server. The thread should do the increment and hand each interested party a new ID. This way, the thread needs only access the DB once for 100 IDs.
If the thread crashes, you'll lose a couple of IDs, but if that doesn't happen all the time, you shouldn't need to worry about it.
AFAIK, the only way to get nicely incrementing numbers out of a DB is with transactional locks at the DB level, which is hideous performance-wise. You can get lock-free behaviour using GUIDs, but frankly you're going to run into transaction requirements in every CRUD operation you can think of anyway.
Assuming that your database is configured to run with a transaction isolation of READ_COMMITTED or better, then use one SQL statement that updates the row, setting it to the old value selected from the row plus an increment. With lower levels of transaction isolation you might need to use INSERT combined with SELECT FOR UPDATE.
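In MySQL, the usual single-statement form of this (assuming a one-row table called sequence with a counter column id) is the LAST_INSERT_ID(expr) trick, which makes the increment and the read atomic per connection:

UPDATE sequence SET id = LAST_INSERT_ID(id + 1);
SELECT LAST_INSERT_ID();   -- returns the value this connection just wrote, even if other
                           -- connections have bumped the counter in the meantime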
As pointed out [by Aaron Digulla] it is better to allocate blocks of IDs, to reduce the number of queries and table locks.
The application must perform the ID acquisition in a separate transaction from any business logic, otherwise any transaction that needs an ID will end up waiting for every transaction that asks for an ID first to commit/rollback.
This article: http://www.ddj.com/architect/184415770 explains the HIGH-LOW strategy that allows your application to obtain IDs from multiple allocators. Multiple allocators improve concurrency, reliability and scalability.
There is also a long discussion here: http://www.theserverside.com/patterns/thread.tss?thread_id=4228 "HIGH/LOW Singleton+Session Bean Universal Object ID Generator"
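The gist of HIGH/LOW, sketched against a one-row allocator table (the names and the block size of 1000 are illustrative): the database only hands out the "high" part, and each server generates the "low" part locally, so it touches the database once per block instead of once per ID.

UPDATE hi_sequence SET hi = LAST_INSERT_ID(hi + 1);
SELECT LAST_INSERT_ID();   -- suppose this returns 42

-- the calling server may now hand out ids 42000 .. 42999 without touching the database again;
-- the next block request (from this or any other server) gets 43000 .. 43999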