I have a MyISAM table (I can't change it to InnoDB, so please don't suggest that) which is pretty big (~20 GB).
I have a worker which regularly dumps this table (I launch it with the --skip-lock-tables option).
During the dump (which takes ~5 min), concurrent SELECTs run correctly, as I would expect. When I do a REPLACE during the dump, this REPLACE is "waiting for metadata lock", which seems normal too.
But every SELECT started after the start of the REPLACE will also be "waiting for metadata lock". I can't understand why. Could you help me with this, and tell me how I can have all the SELECTs run correctly (even after this REPLACE)?
Thanks !
What is happening is:
Your worker is making a big SELECT. The SELECT is locking the table with a read lock. By the way, the skip-lock-tables only means that you are not locking all the tables at once, but the SELECT query is still locking each table individually. More info on this answer.
Your REPLACE is trying to INSERT but has to wait for the first SELECT (the dump) to finish in order to acquire a write lock. It is put in the write lock queue.
Every SELECT after the REPLACE is put in the read lock queue.
This is a behavior described in the doc on table-level locking:
Table updates are given higher priority than table retrievals. Therefore, when a lock is released, the lock is made available to the requests in the write lock queue and then to the requests in the read lock queue. This ensures that updates to a table are not “starved” even when there is heavy SELECT activity for the table.
If you want the SELECTs not to wait for the REPLACE, you could try the LOW_PRIORITY modifier on your REPLACE (I have never actually tested this).
If you use the LOW_PRIORITY modifier, execution of the INSERT is delayed until no other clients are reading from the table. This includes other clients that began reading while existing clients are reading, and while the INSERT LOW_PRIORITY statement is waiting. It is possible, therefore, for a client that issues an INSERT LOW_PRIORITY statement to wait for a very long time (or even forever) in a read-heavy environment. (This is in contrast to INSERT DELAYED, which lets the client continue at once.)
However, be careful: the REPLACE might never run if there are always a lot of SELECTs.
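As a rough sketch of what that suggestion would look like (the table `big_table` and its columns are hypothetical names, and the answer itself notes this was never tested): the modifier goes directly after the REPLACE keyword, and it only has an effect on storage engines that use table-level locking, such as MyISAM.

```sql
-- Hypothetical table and columns, for illustration only.
-- LOW_PRIORITY delays the write until no client is reading the table,
-- so SELECTs issued in the meantime should not be queued behind it.
REPLACE LOW_PRIORITY INTO big_table (id, payload)
VALUES (42, 'new value');
```

With this form the REPLACE itself is what waits, possibly for the whole duration of the dump.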
Related
On a website, when a user posts a comment I do several queries, Inserts and Updates. (On MariaDB 10.1.29)
I use START TRANSACTION so if any query fails at any given point I can easily do a rollback and delete all changes.
Now I have noticed that an INSERT inside a transaction blocks other INSERTs into the table, and not just while the query is running (that would be obvious) but until the transaction is closed.
DELETE is only blocked if the rows share a common index key (comments for the same page), and luckily UPDATE is not blocked.
Can I do any Transaction that does not lock the table from new inserts (while the transaction is ongoing, not the actual query), or any other method that lets me conveniently "undo" any query done after some point?
PS:
I start the transaction with PHP's mysqli_begin_transaction() function without any of the flags, and end it with mysqli_commit().
I don't think that a simple INSERT would block other inserts for longer than the insert time. AUTO_INC locks are not held for the full transaction time.
But if two transactions try to UPDATE the same row like in the following statement (two replies to the same comment)
UPDATE comment SET replies=replies+1 WHERE com_id = ?
the second one will have to wait until the first one is committed. You need that lock to keep the count (replies) consistent.
I think all you can do is to keep the transaction time as short as possible. For example you can prepare all statements before you start the transaction. But that is a matter of milliseconds. If you transfer files and it can take 40 seconds, then you shouldn't do that while the database transaction is open. Transfer the files before you start the transaction and save them with a name that indicates that the operation is not complete. You can also save them in a different folder but on the same partition. Then when you run the transaction, you just need to rename the files, which should not take much time. From time to time you can clean-up and remove unrenamed files.
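A minimal sketch of a short transaction along these lines, assuming a hypothetical `comment` table with columns `com_id`, `page_id`, `parent_id`, `body` and `replies` (only `com_id` and `replies` appear in the post). The contended UPDATE is placed last because a row lock, once taken, is held until COMMIT or ROLLBACK.

```sql
-- Hypothetical schema: comment(com_id, page_id, parent_id, body, replies).
-- All slow work (file transfers, validation) is assumed to be done already.
START TRANSACTION;

-- Insert the new reply first.
INSERT INTO comment (page_id, parent_id, body, replies)
VALUES (7, 123, 'A reply', 0);

-- Contended row: the lock taken here is held until COMMIT,
-- so this statement runs as late as possible.
UPDATE comment SET replies = replies + 1 WHERE com_id = 123;

COMMIT;
```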
All write operations work in similar ways: they lock the rows that they touch (or might touch) from the time the statement is executed until the transaction is closed via either COMMIT or ROLLBACK. SELECT ... FOR UPDATE and SELECT ... LOCK IN SHARE MODE also take such locks.
When a write operation occurs, deadlock checking is done.
In some situations, there is "gap" locking. Did com_id happen to be the last id in the table?
Did you leave out any SELECTs that needed FOR UPDATE?
In my code I need to do the following:
Check a MySQL table (InnoDB) if a particular row (matching some criteria) exists. If it does, return it. If it doesn't, create it and then return it.
The problem I seem to have is race conditions. Every now and then two processes run so closely together, that they both check the table at the same time, don't see the row, and both insert it - thus duplicate data.
I'm reading the MySQL documentation trying to come up with some way to prevent this. What I've come up with so far:
Unique indexes seem to be one option, but they're not universal (it only works when the criteria is something unique for all rows).
Transactions even at SERIALIZABLE level don't protect against INSERT, period.
Neither do SELECT ... LOCK IN SHARE MODE or SELECT ... FOR UPDATE.
A LOCK TABLE ... WRITE would do it, but it's a very drastic measure - other processes won't be able to read from the table, and I need to lock ALL tables that I intend to use until I unlock them.
Basically, I'd like to do either of the following:
Prevent all INSERTs into the table from processes other than mine, while allowing SELECT/UPDATE (this is probably impossible because it makes so little sense most of the time).
Organize some sort of manual locking. The two processes would coordinate among themselves which one gets to do the select/insert dance, while the other waits. This needs some sort of operation that waits until the lock is released. I could probably implement a spin-lock (one process repeatedly checks if the other has released the lock), but I'm afraid that it would be too resource intensive.
I think I found an answer myself. Transactions + SELECT ... FOR UPDATE in an InnoDB table can provide a synchronization lock (aka mutex). Have all processes lock on a specific row in a specific table before they start their work. Then only one will be able to run at a time and the rest will wait until the first one finishes its transaction.
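A sketch of that mutex pattern, under stated assumptions: the lock table `app_mutex`, the lock name, and the `item` table with its columns are all hypothetical and not from the original post.

```sql
-- One-time setup. Table and lock names are hypothetical.
CREATE TABLE app_mutex (
    name VARCHAR(64) NOT NULL PRIMARY KEY
) ENGINE=InnoDB;
INSERT INTO app_mutex (name) VALUES ('get_or_create_item');

-- In every process that does the check-then-insert:
START TRANSACTION;

-- Blocks here while another transaction holds the lock on this row.
SELECT name FROM app_mutex WHERE name = 'get_or_create_item' FOR UPDATE;

-- Only one session at a time gets past the locking read above.
SELECT id FROM item WHERE owner_id = 5 AND kind = 'draft';
-- If the SELECT above returned no row, the application then runs:
INSERT INTO item (owner_id, kind) VALUES (5, 'draft');

COMMIT;  -- releases the row lock; the next waiting process proceeds
```

A waiting process does not spin: it sleeps inside the server until the lock is released or until the InnoDB lock wait timeout (`innodb_lock_wait_timeout`) expires, so the application should be ready to handle a timeout error.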
I would like to be able to lock an entire table to prevent any INSERTs or UPDATEs in it between the "beginTransaction" and the ending "commit" or "rollback".
I know that beginning a transaction results in an implicit UNLOCK TABLES and that LOCK TABLES results in an implicit COMMIT... so is there any way to do what I want?
Why? Perhaps you have missed the point of transactions.
If you use repeatable-read transaction isolation, inserts, updates etc, can happen during your transaction, BUT YOU WILL NOT SEE THEM. So as far as your process is concerned, the table is locked for inserts/updates. Except they are still happening, they're still durable to disc, and other processes can continue.
After you do your first "select", a snapshot is created, and you are effectively reading that snapshot, not the latest version. If this is what you want, repeatable-read works well for you.
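To illustrate the snapshot behaviour described above, here is a small sketch assuming an InnoDB table with the hypothetical name `orders`.

```sql
-- Hypothetical table name `orders`.
SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;  -- InnoDB default
START TRANSACTION;

SELECT COUNT(*) FROM orders;   -- first read establishes the snapshot

-- ... other sessions INSERT/UPDATE rows in `orders` and COMMIT here ...

SELECT COUNT(*) FROM orders;   -- returns the same count as the first SELECT

COMMIT;
```

This applies to plain (non-locking) SELECTs only. Locking reads such as SELECT ... FOR UPDATE, and any writes made inside the transaction, work on the latest committed rows rather than on the snapshot.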
select count(*) from table
within a transaction, locks the table on MS SQL Server 2000.
If you're using PHP, you can set a session variable while a transaction is in progress to tell the script not to do anything with the database, e.g. $_SESSION['on_going_transaction'] = true.
When the transaction is completed, just destroy the SESSION variable so that another transaction can occur. This is much easier.
I was going through some code and noticed that UPDATE LOW_PRIORITY and INSERT DELAYED INTO are used for updating the database. What is the use of these statements? Should I use them in every INSERT and UPDATE statement for the various tables in the same database?
With the LOW_PRIORITY keyword, execution of the UPDATE is delayed until no other clients are reading from the table. Normally, reading clients are put on hold until the update query is done. If you want to give the reading clients priority over the update query, you should use LOW_PRIORITY.
The DELAYED option for the INSERT statement is a MySQL extension to standard SQL that is very useful if you have clients that cannot or need not wait for the INSERT to complete. This is a common situation when you use MySQL for logging and you also periodically run SELECT and UPDATE statements that take a long time to complete.
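For reference, a sketch of where the two modifiers go in a statement. The tables `page_stats` and `access_log` and their columns are hypothetical. Both modifiers only affect storage engines with table-level locking such as MyISAM, and INSERT DELAYED is deprecated in newer MySQL versions.

```sql
-- Hypothetical tables and columns, for illustration only.

-- Waits until no client is reading page_stats before updating it.
UPDATE LOW_PRIORITY page_stats
SET views = views + 1
WHERE page_id = 7;

-- Queues the row and returns to the client without waiting for the write.
INSERT DELAYED INTO access_log (page_id, visited_at)
VALUES (7, NOW());
```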
LOW_PRIORITY, HIGH_PRIORITY and DELAYED are only useful in a few circumstances. If you don't have a BIG load they can't help you. If you have, don't do anything you don't fully understand.
All of these options only work with MyISAM, not InnoDB, not views.
DELAYED doesn't work with partitioned tables, and it's clearly designed for data warehousing. The client sends the insert and then forgets it, without waiting for the result. So you won't know if the insert succeeded, if there were duplicate values, etc. It should never be used while other threads could SELECT from that table, because an INSERT DELAYED is never concurrent.
LOW_PRIORITY waits until NO client is accessing the table. But if you have a high traffic, you may wait until the connection times out... that's not what you want, I suppose :)
Also, note that DELAYED will be removed in Oracle MySQL 5.7 (but not in MariaDB).
If you need to use these, then you have a big load on your server, and you know that some UPDATE or INSERT statements are not high priority and can wait until the load allows them to run.
Example: SQL that generates some statistics or top-item lists. These statements are slow and do not need to be executed immediately.
If your UPDATEs in a read-intensive MySQL environment are taking as long as 1800 seconds, then it is advisable to use UPDATE LOW_PRIORITY.
In MySQL:
Every minute I empty the table and fill it with new data. I want users not to be able to read data during the fill process; before or after is OK.
How do I achieve this?
Is a transaction the way?
Assuming you use a transactional engine (usually InnoDB), clear and refill the table in the same transaction.
Be sure that your readers use the READ COMMITTED transaction isolation level or higher (the default, REPEATABLE READ, is higher).
That way readers will continue to be able to read the old contents of the table during the update.
There are a few things to be careful of:
If the table is so big that it exhausts the rollback area - this is possible if you update the whole of (say) a 1M row table. Of course this is tunable but there are limits
If the transaction fails part way through and gets rolled back - rolling back big transactions is VERY inefficient in InnoDB (it is optimised for commits, not rollbacks)
Be careful of deadlocks and lock wait timeouts, which are more likely if you use big transactions.
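A minimal sketch of the clear-and-refill transaction, assuming an InnoDB table with the hypothetical name `prices` and made-up columns and values. DELETE is used rather than TRUNCATE TABLE because TRUNCATE causes an implicit commit and so cannot be part of the transaction.

```sql
-- Hypothetical InnoDB table `prices`; the rows are placeholder data.
START TRANSACTION;

-- Readers using plain SELECTs keep seeing the old rows until COMMIT.
DELETE FROM prices;

INSERT INTO prices (symbol, price) VALUES
    ('AAA', 10.50),
    ('BBB', 20.25);

COMMIT;  -- the new contents become visible to readers here
```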
You can LOCK your table for the duration of your operation:
http://dev.mysql.com/doc/refman/5.1/en/lock-tables.html
A table lock protects only against inappropriate reads or writes by other sessions. The session holding the lock, even a read lock, can perform table-level operations such as DROP TABLE. Truncate operations are not transaction-safe, so an error occurs if the session attempts one during an active transaction or while holding a table lock.
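A sketch of the locking approach, using a hypothetical table `prices` with made-up columns and values. In line with the quoted passage, the table is emptied with DELETE rather than a truncate operation while the lock is held.

```sql
-- Hypothetical table `prices`; the rows are placeholder data.
LOCK TABLES prices WRITE;   -- other sessions' reads and writes now wait

DELETE FROM prices;
INSERT INTO prices (symbol, price) VALUES
    ('AAA', 10.50),
    ('BBB', 20.25);

UNLOCK TABLES;              -- waiting readers proceed and see the new data
```

Unlike the transactional approach, readers are blocked for the whole refill instead of continuing to see the old contents.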
I don't know enough about the internal row-versioning mechanisms of MySQL (or indeed, whether there is one), but other databases (Oracle, PostgreSQL, and more recently, SQL Server) have invested a lot of effort into allowing writers to not block readers, insofar as readers have access to the version of the rows that existed immediately before the update/write process started. Once the update is committed, that version of the row becomes the one made available to all readers, thereby avoiding the bottleneck that the above behaviour in MySQL will introduce.
This policy ensures that table locking is deadlock free. There are, however, other things you need to be aware of about this policy: If you are using a LOW_PRIORITY WRITE lock for a table, it means only that MySQL waits for this particular lock until there are no other sessions that want a READ lock. When the session has gotten the WRITE lock and is waiting to get the lock for the next table in the lock table list, all other sessions wait for the WRITE lock to be released. If this becomes a serious problem with your application, you should consider converting some of your tables to transaction-safe tables.
You can load your data into a shadow table as slowly as you like, then instantly swap the shadow and actual with RENAME TABLE:
truncate table shadow; # make sure it is clean to start with
insert into shadow .....; # lots of inserts etc against shadow table
rename table active to temp, shadow to active, temp to shadow;
truncate table shadow; # throw away the old active data
The rename statement is atomic. The intermediate name "temp" is used to swap the names of shadow and active.
This should work with all storage engines.
Rename table - MySQL Manual