I have a website with a method (in the backing bean) that performs several read requests (to check e.g. the user's rights) and then inserts some data.
The procedure looks like this:
1.) SELECT * FROM userLocks (a table which logs the users who are working exclusively on a topic, to avoid redundancy). This checks that there is currently no lock (no user working) for the topic.
2.) SELECT * FROM ... (some other selects for further checks)
3.) INSERT INTO userLocks (curUser, timeout) VALUES (...) (if everything was OK, create a new lock on the topic for the current user)
The problem I am facing is: between the first SELECT and the INSERT, many other users can request the same page, which manipulates the data in the userLocks table. It does not seem to be thread-safe.
Is there a solution for this problem? All I need is for the tables used to be locked against other users during this procedure. (The other requests should simply wait until the lock is released, which doesn't take more than half a second...)
(I use InnoDB)
I would suggest you research SQL transactions a little, along with row and table locks for InnoDB. This is an article that helped me: https://blogs.oracle.com/mysqlinnodb/entry/introduction_to_transaction_locks_in
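As a rough illustration, here is a minimal sketch of the flow from the question wrapped in a transaction. The topics table and the column names are assumptions for the example; the point is just to take a row lock that serializes the check-then-insert per topic.

START TRANSACTION;

-- Lock the topic's row (assumed parent table); any other session running the
-- same SELECT ... FOR UPDATE on this topic blocks here until we COMMIT.
SELECT id FROM topics WHERE id = 42 FOR UPDATE;

-- The checks are now safe: nobody else can get past the lock above.
SELECT * FROM userLocks WHERE topicId = 42;
-- ... further checking selects ...

INSERT INTO userLocks (topicId, curUser, timeout)
VALUES (42, 'someUser', NOW() + INTERVAL 30 SECOND);

COMMIT;  -- releases the row lock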
In my code I need to do the following:
Check a MySQL table (InnoDB) to see whether a particular row (matching some criteria) exists. If it does, return it. If it doesn't, create it and then return it.
The problem I seem to have is race conditions. Every now and then two processes run so closely together that they both check the table at the same time, don't see the row, and both insert it, and thus I get duplicate data.
I'm reading the MySQL documentation trying to come up with some way to prevent this. What I've come up with so far:
Unique indexes seem to be one option, but they're not universal (they only work when the criteria are something unique across all rows).
Transactions even at SERIALIZABLE level don't protect against INSERT, period.
Neither do SELECT ... LOCK IN SHARE MODE or SELECT ... FOR UPDATE.
A LOCK TABLES ... WRITE would do it, but it's a very drastic measure: other processes won't be able to read from the table, and I need to lock ALL tables that I intend to use until I unlock them.
Basically, I'd like to do either of the following:
Prevent all INSERTs to the table from processes other than mine, while allowing SELECT/UPDATE (this is probably impossible because it makes so little sense most of the time).
Organize some sort of manual locking. The two processes would coordinate among themselves which one gets to do the select/insert dance, while the other waits. This needs some sort of operation that waits until the lock is released. I could probably implement a spin-lock (one process repeatedly checks if the other has released the lock), but I'm afraid that it would be too resource intensive.
I think I found an answer myself. Transactions + SELECT ... FOR UPDATE in an InnoDB table can provide a synchronization lock (a.k.a. mutex). Have all processes lock a specific row in a specific table before they start their work. Then only one will be able to run at a time, and the rest will wait until the first one finishes its transaction.
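A minimal sketch of that mutex-row pattern; the app_mutex table and the lock name are placeholders, not part of the original answer:

-- One-time setup: a dedicated table holding one row per named lock.
CREATE TABLE app_mutex (name VARCHAR(64) PRIMARY KEY);
INSERT INTO app_mutex (name) VALUES ('row_creation');

-- In every process that does the check-then-insert:
START TRANSACTION;
SELECT name FROM app_mutex WHERE name = 'row_creation' FOR UPDATE;  -- blocks until free
-- critical section: SELECT the row, INSERT it if it is missing ...
COMMIT;  -- releases the lock for the next waiting process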
I need to perform some scripted actions, which may take a while (maybe a minute). At the beginning of these actions I take some measurements from the MySQL DB, and it's important that they do not change until the actions are done. The DB has dozens of tables, since it belongs to a rather old-fashioned but huge CMS, and the CMS users have a dozen ways to modify it.
I do not even want to change anything in the DB myself while my script runs; it just needs to be frozen. It's not a dump or an update. But the tables should be kept open for reading by everyone, so that visitors to the connected homepage don't get errors.
If the database-altering actions that other CMS users may perform in the meantime were simply triggered after the DB is unlocked again, that would be perfect, but if they fail instead, I would not mind.
So I thought that at the beginning of the script I would lock the tables down with
LOCK TABLES first_table WRITE,
            second_table WRITE,
            ...;
and afterwards do
UNLOCK TABLES;
I think that should do exactly what I want. But can I achieve this for all tables of the DB without naming them explicitly, to make it more future-proof?
This does not work, for sure:
LOCK TABLES (SELECT TABLE_NAME FROM information_schema.tables
             WHERE table_schema = 'whatever') WRITE;
Another question, if someone can answer it on the fly: would I have to perform the lock/unlock as a different MySQL user than the one used by the CMS? If I understood this correctly, then yes.
Below is the statement to lock all tables (actually it takes a single global read lock):
FLUSH TABLES WITH READ LOCK;
Then release it with:
UNLOCK TABLES;
This is what mysqldump does, for example, unless you are backing up only transactional tables and use the --single-transaction option.
Read http://dev.mysql.com/doc/refman/5.7/en/flush.html for more details about FLUSH TABLES.
Re your comment:
Yes, this takes a global READ LOCK on all tables. Even your own session cannot write. My apologies for overlooking this requirement of yours.
There is no equivalent global statement to give you a write lock. You'll have to lock tables by name explicitly.
There's no syntax for wildcard table names, nor is there syntax for putting a subquery in the LOCK TABLES statement.
You'll have to get a list of table names and build a dynamic SQL query.
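For example, one way to build that dynamic statement entirely in SQL is to let GROUP_CONCAT generate the LOCK TABLES text from information_schema. This is only a sketch; the schema name 'whatever' is a placeholder, and note that LOCK TABLES is not allowed in a prepared statement, so the generated text has to be executed by the client (or pasted back in by hand):

-- Make sure the generated statement is not truncated for schemas with many tables.
SET SESSION group_concat_max_len = 1000000;

SELECT CONCAT('LOCK TABLES ',
              GROUP_CONCAT(CONCAT('`', table_name, '` WRITE') SEPARATOR ', '),
              ';') AS lock_statement
FROM information_schema.tables
WHERE table_schema = 'whatever'
  AND table_type = 'BASE TABLE';

-- Run the generated LOCK TABLES statement, do the work, then:
UNLOCK TABLES;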
This type of question has been posted a few times, but the solutions offered are not ideal in the following situation. In the first query, I'm selecting table names that I know exist when this first query is executed. Then, while looping through them, I want to query the number of records in the selected tables, but only if they still exist. The problem is that during the loop some of the tables are dropped by another script. For example:
SELECT tablename FROM table
-- returns, say, 100 tables

while (%tables) {
    SELECT COUNT(*) FROM $table
    -- by the time it gets to the umpteenth table, it's been dropped,
    -- so the SELECT COUNT(*) fails
}
And, I guess because it's run by cron, it fails fatally and I get sent an email from cron stating it failed:
DBD::mysql::st execute failed: Table 'xxx' doesn't exist at
/usr/local/lib/perl/5.10.1/Mysql.pm line 175.
The script is using the deprecated Mysql.pm Perl module.
Obviously you need to secure the table to make sure it won't get dropped before you execute your query. Keep in mind that if you start with some kind of table lock to avoid the possible drop, a DROP TABLE query issued from somewhere else will fail with a lock error, or at least will wait until your SELECT finishes. Dropping a table isn't a frequently used operation, so in most cases the schema design persists while the server is running; what you observe is really rare behaviour. In general, preventing a table from being dropped during another query just isn't supported. However, in the comments on the document below you may find a trick that uses semaphore tables to achieve it.
http://dev.mysql.com/doc/refman/5.1/en/lock-tables.html
"A table lock protects only against inappropriate reads or writes by other sessions. The session holding the lock, even a read lock, can perform table-level operations such as DROP TABLE. Truncate operations are not transaction-safe, so an error occurs if the session attempts one during an active transaction or while holding a table lock."
"If you need to do things with tables not normally supported by read or write locks (like dropping or truncating a table), and you're able to cooperate, you can try this: Use a semaphore table, and create two sessions per process. In the first session, get a read or write lock on the semaphore table, as appropriate. In the second session, do all the stuff you need to do with all the other tables."
You should be able to protect your Perl code from failing by putting it into an eval block. Something like this:
eval {
    # try doing something with DBD::mysql
};
if ($@) {
    # oops, the mysql code failed;
    # probably need to try it again
}
Or even put this in a "while" loop.
If you used a better server, like Postgres, the right solution would be to enclose everything in a transaction. But in MySQL, dropping a table is not protected by transactions.
TL;DR - MySQL doesn't let you lock a table and use a transaction at the same time. Is there any way around this?
I have a MySQL table I am using to cache some data from a (slow) external system. The data is used to display web pages (written in PHP.) Every once in a while, when the cached data is deemed too old, one of the web connections should trigger an update of the cached data.
There are three issues I have to deal with:
Other clients will try to read the cache data while I am updating it
Multiple clients may decide the cache data is too old and try to update it at the same time
The PHP instance doing the work may be terminated unexpectedly at any time, and the data should not be corrupted
I can solve the first and last issues by using a transaction, so clients will be able to read the old data until the transaction is committed, when they will immediately see the new data. Any problems will simply cause the transaction to be rolled back.
I can solve the second problem by locking the tables, so that only one process gets a chance to perform the update. By the time any other processes get the lock they will realise they have been beaten to the punch and don't need to update anything.
This means I need to both lock the table and start a transaction. According to the MySQL manual, this is not possible. Starting a transaction releases the locks, and locking a table commits any active transaction.
Is there a way around this, or is there another way entirely to achieve my goal?
This means I need to both lock the table and start a transaction
This is how you can do it:
SET autocommit=0;
LOCK TABLES t1 WRITE, t2 READ, ...;
-- ... do something with tables t1 and t2 here ...
COMMIT;
UNLOCK TABLES;
For more information, see the MySQL documentation on the interaction of LOCK TABLES and transactions.
If it were me, I'd use the advisory locking function within MySQL to implement a mutex for updating the cache, and a transaction for read isolation. e.g.
begin_transaction();   // although reading a single row doesn't really require this
$cached = runquery("SELECT * FROM cache WHERE key=$id");
end_transaction();

if (is_expired($cached)) {
    $cached = refresh_data($cached, $id);
}
...

function refresh_data($cached, $id)
{
    $lockname = some_deterministic_transform($id);
    if (1 == runquery("SELECT GET_LOCK('$lockname', 0)")) {
        $cached = fetch_source_data($id);
        begin_transaction();
        write_data($cached, $id);
        end_transaction();
        runquery("SELECT RELEASE_LOCK('$lockname')");
    }
    return $cached;
}
(BTW: bad things may happen if you try this with persistent connections)
I'd suggest solving the issue by removing the contention altogether.
Add a timestamp column to your cached data.
When you need to update the cached data:
Just add new cached data to your table using the current timestamp
Remove cached data older than, let's say, 24 hours.
When you need to serve the cached data:
Sort by timestamp (DESC) and return the newest cached data
At any given time your clients will retrieve records that are never deleted by any other process. Moreover, you don't care if a client gets cached data belonging to different writes (i.e. with different timestamps).
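A minimal sketch of that append-only scheme; the table and column names are placeholders for the example, not part of the original answer:

CREATE TABLE cache_snapshots (
    id         INT AUTO_INCREMENT PRIMARY KEY,
    cache_key  VARCHAR(64) NOT NULL,
    payload    TEXT        NOT NULL,
    created_at TIMESTAMP   NOT NULL DEFAULT CURRENT_TIMESTAMP,
    KEY (cache_key, created_at)
);

-- Updater: just append a new snapshot, then prune the old ones.
INSERT INTO cache_snapshots (cache_key, payload) VALUES ('prices', '...');
DELETE FROM cache_snapshots WHERE created_at < NOW() - INTERVAL 24 HOUR;

-- Reader: always take the newest snapshot.
SELECT payload
FROM cache_snapshots
WHERE cache_key = 'prices'
ORDER BY created_at DESC
LIMIT 1;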
The second problem may be solved without involving the database at all. Have a lock file for the cache update procedure so that other clients know that someone is already working on it. This may not catch each and every corner case, but is it that big of a deal if two clients update the cache at the same time? After all, they are doing the update in transactions, so the cache will still be consistent.
You may even implement the lock yourself by storing the last cache update time in a table. When a client wants to update the cache, make it lock that table, check the last update time, and then update the field.
In other words, implement your own locking mechanism to prevent multiple clients from updating the cache; transactions will take care of the rest.
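A sketch of that hand-rolled lock, with a made-up one-row cache_meta table recording when the cache was last refreshed:

LOCK TABLES cache_meta WRITE;

SELECT last_update FROM cache_meta;
-- If last_update is recent enough, UNLOCK TABLES and skip the refresh.
-- Otherwise claim the refresh by recording the new time before unlocking:
UPDATE cache_meta SET last_update = NOW();

UNLOCK TABLES;

-- Then perform the actual cache refresh inside an ordinary transaction,
-- as described above.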
How do you stop race conditions in MySQL? The problem at hand is caused by a simple algorithm:
select a row from table
if it doesn't exist, insert it
and then you either get a duplicate row or, if you prevent it via unique/primary keys, an error.
Now normally I'd think transactions help here, but because the row doesn't exist, the transaction doesn't actually help (or am I missing something?).
LOCK TABLES sounds like overkill, especially if the table is updated many times per second.
The only other solution I can think of is GET_LOCK() for every different id, but isn't there a better way? Aren't there scalability issues here as well? And doing this for every table seems a bit unnatural, since this sounds like a very common problem in high-concurrency databases to me.
What you want is LOCK TABLES,
or, if that seems excessive, how about INSERT IGNORE with a check that the row was actually inserted (see the sketch after the quote below).
If you use the IGNORE keyword, errors that occur while executing the INSERT statement are treated as warnings instead.
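A small sketch of that check; it assumes a table with a UNIQUE key on the column that must not be duplicated (the names here are made up):

-- assumes things.name has a UNIQUE index
INSERT IGNORE INTO things (name, created_at) VALUES ('foo', NOW());

-- ROW_COUNT() is 1 if the row was inserted, 0 if it was silently skipped
-- because a duplicate already existed.
SELECT ROW_COUNT();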
It seems to me you should have a unique index on your id column, so a repeated insert would trigger an error instead of being blindly accepted again.
That can be done by defining the id as a primary key or by using a unique index on its own.
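For example, either of these enforces it (the table name things is a placeholder):

-- Make the id the primary key ...
ALTER TABLE things ADD PRIMARY KEY (id);

-- ... or add a standalone unique index on it.
ALTER TABLE things ADD UNIQUE INDEX uniq_things_id (id);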
I think the first question you need to ask is: why do you have many threads doing the exact same work? Why would they have to insert the exact same row?
Once that is answered, I think that simply ignoring the errors will be the most performant solution, but measure both approaches (GET_LOCK vs. ignoring errors) and see for yourself.
There is no other way that I know of. Why do you want to avoid errors? You still have to code for the case when another type of error occurs.
As staticsan says, transactions do help, but, since they are usually implicit, if two inserts are run by different threads, they will both be inside implicit transactions and see consistent views of the database.
Locking the entire table is indeed overkill. To get the effect that you want, you need something that the literature calls "predicate locks". No one has ever seen those except printed on the paper that academic studies are published on. The next best thing is locks on the "access paths" to the data (in some DBMSs: "page locks").
Some non-SQL systems allow you to do both (1) and (2) in one single statement, which more or less means the potential race conditions arising from your OS suspending your execution thread right between (1) and (2) are entirely eliminated.
Nevertheless, in the absence of predicate locks, such systems will still need to resort to some kind of locking scheme, and the finer the "granularity" (/"scope") of the locks it takes, the better for concurrency.
(And to conclude: some DBMSs - especially the ones you don't have to pay for - do indeed offer no finer lock granularity than "the entire table".)
On a technical level, a transaction will help here because other threads won't see the new row until you commit the transaction.
But in practice that doesn't solve the problem; it only moves it. Your application now needs to check whether the commit fails and decide what to do. I would normally have it roll back what you did and restart the transaction, because now the row will be visible. This is how transaction-based programming is supposed to work.
I ran into the same problem and searched the net for a while :)
Finally I came up with a solution similar to the method of creating filesystem objects in shared (temporary) directories to open temporary files securely:
$exists = $success = false;
do {
    $exists = check();                // select the row from the table
    if (!$exists) {
        $success = create_record();   // returns true, or an error code on failure
        if ($success === true) {
            $exists = true;
        } elseif ($success != ERROR_DUP_ROW) {
            log_error("failed to create row, and not because of a DUP_ROW!");
            break;
        } else {
            // another process has probably created the record already,
            // so loop around and check again whether it exists
        }
    }
} while (!$exists);
Don't be afraid of the busy loop; normally it will execute only once or twice.
You prevent duplicate rows very simply by putting unique indexes on your tables. That has nothing to do with LOCKS or TRANSACTIONS.
Do you care if an insert fails because it's a duplicate? Do you need to be notified if it fails? Or is all that matters that the row was inserted, and it doesn't matter by whom or how many duplicates inserts failed?
If you don't care, then all you need is INSERT IGNORE. There is no need to think about transactions or table locks at all.
InnoDB has row level locking automatically, but that applies only to updates and deletes. You are right that it does not apply to inserts. You can't lock what doesn't yet exist!
You can explicitly LOCK the entire table. But if your purpose is to prevent duplicates, then you are doing it wrong. Again, use a unique index.
If there is a set of changes to be made and you want an all-or-nothing result (or even a set of all-or-nothing results within a larger all-or-nothing result), then use transactions and savepoints. Then use ROLLBACK or ROLLBACK TO SAVEPOINT savepoint_name to undo changes, including deletes, updates and inserts.
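A small sketch of savepoints inside a transaction; the table names are made up for the example:

START TRANSACTION;

INSERT INTO orders (id, status) VALUES (1, 'new');

SAVEPOINT after_order;
UPDATE inventory SET qty = qty - 1 WHERE item_id = 7;

-- Undo only the inventory change; the order insert is kept.
ROLLBACK TO SAVEPOINT after_order;

COMMIT;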
LOCK TABLES is not a replacement for transactions, but it is your only option with MyISAM tables, which do not support transactions. You can also use it with InnoDB tables if row-level locking isn't enough. See the MySQL manual for more information on using transactions with LOCK TABLES statements.
I have a similar issue. I have a table that under most circumstances should have a unique ticket_id value, but there are some cases where I will have duplicates; not the best design, but it is what it is.
User A checks to see if the ticket is reserved, it isn't
User B checks to see if the ticket is reserved, it isn't
User B inserts a 'reserved' record into the table for that ticket
User A inserts a 'reserved' record into the table for that ticket
User B checks for a duplicate: yes, there is one. Is my record newer? Yes, so leave it.
User A checks for a duplicate: yes, there is one. Is my record newer? No, so delete it.
User B has reserved the ticket, and User A reports back that the ticket has been taken by someone else.
The key point is that you need a tie-breaker; in my case it's the auto-increment id on the row.
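A hedged sketch of such a tie-break; the table, the column names and the lowest-id-wins convention are all assumptions made for illustration, not taken from the original answer:

INSERT INTO reservations (ticket_id, user_id) VALUES (123, 'userA');
SET @my_id = LAST_INSERT_ID();

-- Which reservation for this ticket wins the tie-break?
SELECT MIN(id) INTO @winner FROM reservations WHERE ticket_id = 123;

-- If ours lost, back it out and report the ticket as taken by someone else.
DELETE FROM reservations WHERE id = @my_id AND @my_id <> @winner;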
In case INSERT IGNORE, as suggested in the accepted answer, doesn't fit your needs, then, according to the requirements in your question:
1] select a row from table
2] if it doesn't exist, insert it
Another possible approach is to add a condition to the INSERT SQL statement, e.g.:
INSERT INTO table_listnames (name, address, tele)
SELECT * FROM (SELECT 'Rupert', 'Somewhere', '022') AS tmp
WHERE NOT EXISTS (
SELECT name FROM table_listnames WHERE name = 'Rupert'
) LIMIT 1;
Reference:
https://stackoverflow.com/a/3164741/179744