Difference between table and row locks - mysql

I'm studying about MySQL and how it works, and something confuses me and I don't find any clear explanation on the web about this.
What exactly is the difference between row and table locks? One locks the row and the other locks the table. Correct?
So, in which sort of situations would you use a table lock and row lock? Is it something the programmer or database manager can program in or it is the enigne that does it for you?
If there is any other information you think is good to know, feel free to add that to your answer.
I'm sorry for this possible noobish question, but I'm still learning.

While this is SQL server, it applies well to mySQL as well: What are row, page and table locks? And when they are acquired?.
MySQL docs shows this:
Generally, table locks are superior to row-level locks in the following cases:
Most statements for the table are reads.
Statements for the table are a mix of reads and writes, where writes are updates or deletes for a single row that can be fetched with one key read:
SELECT combined with concurrent INSERT statements, and very few UPDATE or DELETE statements.
Many scans or GROUP BY operations on the entire table without any writers.
Now when to use: The infamous "It depends" applies here:
Ask yourself what is the use case for this transaction?
Typically row level locking will be used when high granular control is needed. In my opinion this should be used as the default. Say a orders or orders detail table where the order could be updated or deleted. Locking the whole table on a high transaction volume table makes no sense. I want users of individual orders to be able to update each order and not lock someone else out when I know the scope of their change is a limited to a specific order.
Now if I needed to restore the orders and details table from backup for some reason; or make many updates to many records based on an external source; I may lock the whole table to ensure all the updates complete successfully and I can verify the load before I let anyone back in. I don't want any changes while I'm making the needed updates. But we have to consider if locking the whole table will negatively impact user experience; or if we have no other options available. Locking at the table level will prevent other users from changing any value. IS this really what we want?

Related

Can I INSERT into table while UPDATING multiple different rows with MariaDB or MySQL?

I am creating a custom analytics system and currently in the database designing process. I'm planning to use MariaDB with the InnoDB engine to be able to handle big loads.
The data I'm expecting could be around 500k clicks/day. I will need to insert these rows into the database, which means that I'll have around 5.8 inserts/sec on average. However, at the same time, I want to record if someone visited a page associated with that click. (basically to record funnels)
So what I'm planning to do is to create additional columns and search for the ID of the specific row then update that column with the exact time of the visit.
My first question: is this generally a recommended approach to design the database like that? If not, how else is it worth to design the database?
My only concern is that while updating rows the Table will be locked, and can't do inserts, therefore slowing down the user experience.
My second question: is this something I should worry about, that the table gets locked while updating, and thus slowing down inserts? Does it hurt performance?
InnoDB doesn't lock the table for insert if you're performing the update. Your users won't experience any weird hanging.
It's an MVCC compliant engine, designed to handle concurrent access to underlying tables.
You can control the engine's behavior by choosing an appropriate isolation level, however the default (REPEATABLE READ) is excellent and does the job more than well.
If a table is being modified by multiple users (not users that connect to your site but connections established towards MySQL via a scripting language or some other service) and there's many inserts/updates/deletes - MySQL can throw an error saying a deadlock occurred.
A deadlock is a warning, not an error, that more than 1 thread tried to access an occupied resource (such as two threads tried to update the same row at the same time, but only 1 will be allowed to do so). It's an indication you should repeat the query.
I'm suggesting that you take care of all possible scenarios in the language of your choice when it comes to handling MySQL that's under heavier I/O.
~6 inserts a second isn't a lot, make sure you're allowing MySQL to access sufficient system resources. For InnoDB, check the value of innodb_buffer_pool_size or google a bit to see what it is and how to use it to make your database run fast.
Good luck!
At a mere 5.6/second, there won't be much problem.
I do, however, suggest vertical partitioning for "Likes", "Upvotes", "Clicks", and similar things. These tend to have a lot of UPDATEs of random single rows, and may interfere with other activity.
That is, have a separate table with (perhaps) just 2 columns:
The id of the item being Liked/Clicked/etc.
A counter.
It is simple enough (and fast enough) to JOIN via that id when you want to display info including the counter.
As already pointed out, the row is locked, not the table.

How to perform check-and-insert in mysql?

In my code I need to do the following:
Check a MySQL table (InnoDB) if a particular row (matching some criteria) exists. If it does, return it. If it doesn't, create it and then return it.
The problem I seem to have is race conditions. Every now and then two processes run so closely together, that they both check the table at the same time, don't see the row, and both insert it - thus duplicate data.
I'm reading MySQL documentation trying to come up with some way to prevent this. What I've come up so far:
Unique indexes seem to be one option, but they're not universal (it only works when the criteria is something unique for all rows).
Transactions even at SERIALIZABLE level don't protect against INSERT, period.
Neither do SELECT ... LOCK IN SHARE MODE or SELECT ... FOR UPDATE.
A LOCK TABLE ... WRITE would do it, but it's a very drastic measure - other processes won't be able to read from the table, and I need to lock ALL tables that I intend to use until I unlock them.
Basically, I'd like to do either of the following:
Prevent all INSERT to the table from processes other than mine, while allowing SELECT/UPDATE (this is probably impossible because it make so little sense most of the time).
Organize some sort of manual locking. The two processes would coordinate among themselves which one gets to do the select/insert dance, while the other waits. This needs some sort of operation that waits until the lock is released. I could probably implement a spin-lock (one process repeatedly checks if the other has released the lock), but I'm afraid that it would be too resource intensive.
I think I found an answer myself. Transactions + SELECT ... FOR UPDATE in an InnoDB table can provide a synchronization lock (aka mutex). Have all processes lock on a specific row in a specific table before they start their work. Then only one will be able to run at a time and the rest will wait until the first one finishes its transaction.

Do transactions prevent other updates for a while, or just hide them?

When doing a transaction in a mysql db, they are talking about the ongoing transaction not being able to see any updates made by external sources until it commits. So does this mean that changes CAN be made but the transaction just will not be able to see them, or is it actually impossible to update the db while the transaction is going on.
Because I need it to be impossible for other queries to change anything about certain tables while the transaction is going. Right now I write lock all those tables, start a transaction for the atomicity, commit, and than unlock. Is this the way to do this?
From my testing it seems that setting the isolation level to SERIALIZABLE accomplishes the same as manual table locking and unlocking? Is this correct?
It's going to depend on the transaction isolation level you have set on your database. You can read more about the levels here. For example, for READ UNCOMMITTED, you can actually read rows that are uncommitted by another transaction. This is usually not what you want to happen.
Locking an entire table is a really extreme choice though, and should probably not be done unless there's no other choice. My recommendation would be to consider the rows you need to lock, and then you can lock those specific rows using a select for update statement.
For example, suppose you have a resources table and a schedules table that contains bookings for those resources. When booking a resource, you have to check the schedules table for a given resource to make sure it's available for the desired time. However, you have to do this is a concurrent way, that is, you want to ensure that between the time you check the schedules table for availability for the resource, and the time you actually insert the row into the schedules table, you want to ensure that some other transaction doesn't book the resource for the same time (or an overlapping time).
You can accomplish this by using a select for update command:
select * from resources where resource_name=’a’ for update;
Assuming you're doing this in a stored procedure, if some other code fires the stored procedure for the same resource, it will block on that statement. This will ensure that resources don't get double booked.
We could also accomplish this by locking the entire resources table. However, there's no need to do that since we're only interested in booking a single resource. So it's good enough to just lock the resource row we care about.
Note that for MySQL, you need to index the columns you use in the for update or it will lock the entire table.
The point to all this is to always consider maximum concurrency. In other words, don't lock more than you need to. Otherwise, you make the application much less scalable and you inhibit concurrency.

how can i lock tables in MySQL or phpmyadmin?

I need to use a table for a queuing system. The table will be constantly be updated.
For example, multiple users on my website, will add their files for process, and I heard that when updates occur simultaneously from multiple users, the table becomes non responsive or something like that.
so do I need locking tables in this situation ? how do i apply a lock to a mysql table ?
By constantly be updated do you mean it will be appended to 10,000 times per second? Even for middle-range servers, that still presents 10,000 opportunities per second for the table to be shared by other users.
It's only necessary to lock the table when several dependent operations need to occur as a unit. In most cases, it is sufficient to include the series of operations in a database transaction. If the "constant updates" are merely insert sometable values ( ...), then it will be easy to guarantee transaction consistency.
so do I need locking tables in this situation ?
All tables can be locked, there isn't a special type of table. That said, deal with the issue when you run into deadlocks. Isolation levels are a closely related topic as well.
how do i apply a lock to a mysql table ?
There's the LOCK TABLES syntax - this article covers how & why you'd want to lock tables.
When you do several updates at once, you can get into a deadlock situation. Engines such as InnoDB will detect this and fail one of your transactions (You can retry, but only the whole transaction).
You can avoid this by table locking, but it reduces concurrency.
Engines such as MyISAM (which does not support MVCC or transactions anyway) use table locking implicitly anyway. Such table locks exist only for the duration of a query; they're automatically released soonish after the table isn't needed any more (not necessarily as soon as possible, but quite soon)
I recommend you do not use LOCK TABLE unless you feel you really need to; if you get a deadlock, retry the transaction.

mysql insert race condition

How do you stop race conditions in MySQL? the problem at hand is caused by a simple algorithm:
select a row from table
if it doesn't exist, insert it
and then either you get a duplicate row, or if you prevent it via unique/primary keys, an error.
Now normally I'd think transactions help here, but because the row doesn't exist, the transaction don't actually help (or am I missing something?).
LOCK TABLE sounds like an overkill, especially if the table is updated multiple times per second.
The only other solution I can think of is GET_LOCK() for every different id, but isn't there a better way? Are there no scalability issues here as well? And also, doing it for every table sounds a bit unnatural, as it sounds like a very common problem in high-concurrency databases to me.
what you want is LOCK TABLES
or if that seems excessive how about INSERT IGNORE with a check that the row was actually inserted.
If you use the IGNORE keyword, errors
that occur while executing the INSERT
statement are treated as warnings
instead.
It seems to me you should have a unique index on your id column, so a repeated insert would trigger an error instead of being blindingly accepted again.
That can be done by defining the id as a primary key or using a unique index by itself.
I think the first question you need to ask is why do you have many threads doing the exact SAME work? Why would they have to insert the exact same row?
After that being answered, I think that just ignoring the errors will be the most performant solution, but measure both approaches (GET_LOCK v/s ignore errors) and see for yourself.
There is no other way that I know of. Why do you want to avoid errors? You still have to code for the case when another type of error occurs.
As staticsan says transactions do help but, as they usually are implied, if two inserts are ran by different threads, they will both be inside an implied transactions and see consistent views of the database.
Locking the entire table is indeed overkill. To get the effect that you want, you need something that the litterature calls "predicate locks". No one has ever seen those except printed on the paper that academic studies are published on. The next best thing are locks on the "access paths" to the data (in some DBMS's : "page locks").
Some non-SQL systems allow you to do both (1) and (2) in one single statement, more or less meaning the potential race conditions arising from your OS suspending your execution thread right between (1) and (2), are entirely eliminated.
Nevertheless, in the absence of predicate locks such systems will still need to resort to some kind of locking scheme, and the finer the "granularity" (/"scope") of the locks it takes, the better for concurrency.
(And to conclude : some DBMS's - especially the ones you don't have to pay for - do indeed offer no finer lock granularity than "the entire table".)
On a technical level, a transaction will help here because other threads won't see the new row until you commit the transaction.
But in practice that doesn't solve the problem - it only moves it. Your application now needs to check whether the commit fails and decide what to do. I would normally have it rollback what you did, and restart the transaction because now the row will be visible. This is how transaction-based programmer is supposed to work.
I ran into the same problem and searched the Net for a moment :)
Finally I came up with solution similar to method to creating filesystem objects in shared (temporary) directories to securely open temporary files:
$exists = $success = false;
do{
$exists = check();// select a row in the table
if (!$exists)
$success = create_record();
if ($success){
$exists = true;
}else if ($success != ERROR_DUP_ROW){
log_error("failed to create row not 'coz DUP_ROW!");
break;
}else{
//probably other process has already created the record,
//so try check again if exists
}
}while(!$exists)
Don't be afraid of busy-loop - normally it will execute once or twice.
You prevent duplicate rows very simply by putting unique indexes on your tables. That has nothing to do with LOCKS or TRANSACTIONS.
Do you care if an insert fails because it's a duplicate? Do you need to be notified if it fails? Or is all that matters that the row was inserted, and it doesn't matter by whom or how many duplicates inserts failed?
If you don't care, then all you need is INSERT IGNORE. There is no need to think about transactions or table locks at all.
InnoDB has row level locking automatically, but that applies only to updates and deletes. You are right that it does not apply to inserts. You can't lock what doesn't yet exist!
You can explicitly LOCK the entire table. But if your purpose is to prevent duplicates, then you are doing it wrong. Again, use a unique index.
If there is a set of changes to be made and you want an all-or-nothing result (or even a set of all-or-nothing results within a larger all-or-nothing result), then use transactions and savepoints. Then use ROLLBACK or ROLLBACK TO SAVEPOINT *savepoint_name* to undo changes, including deletes, updates and inserts.
LOCK tables is not a replacement for transactions, but it is your only option with MyISAM tables, which do not support transactions. You can also use it with InnoDB tables if row-level level locking isn't enough. See this page for more information on using transactions with lock table statements.
I have a similar issue. I have a table that under most circumstances should have a unique ticket_id value, but there are some cases where I will have duplicates; not the best design, but it is what it is.
User A checks to see if the ticket is reserved, it isn't
User B checks to see if the ticket is reserved, it isn't
User B inserts a 'reserved' record into the table for that ticket
User A inserts a 'reserved' record into the table for that ticket
User B check for duplicate? Yes, is my record newer? Yes, leave it
User A check for duplicate? Yes, is my record newer? No, delete it
User B has reserved the ticket, User A reports back that the ticket has been taken by someone else.
The key in my instance is that you need a tie-breaker, in my case it's the auto-increment id on the row.
In case insert ignore doesnt fit for you as suggested in the accepted answer , so according to the requirements in your question :
1] select a row from table
2] if it doesn't exist, insert it
Another possible approach is to add a condition to the insert sql statement,
e.g :
INSERT INTO table_listnames (name, address, tele)
SELECT * FROM (SELECT 'Rupert', 'Somewhere', '022') AS tmp
WHERE NOT EXISTS (
SELECT name FROM table_listnames WHERE name = 'Rupert'
) LIMIT 1;
Reference:
https://stackoverflow.com/a/3164741/179744