If an INSERT and a SELECT are run simultaneously on a MySQL table, which one will go first?
Example: Suppose "users" table row count is 0.
Then these two queries are run at the same time (assume it's the same millisecond or microsecond):
INSERT into users (id) values (1)
and
SELECT COUNT(*) from users
Will the last query return 0 or 1?
It depends on whether your users table is MyISAM or InnoDB.
If it's MyISAM, one statement or the other takes a lock on the table, and there's little you can do to control that, short of locking tables yourself.
If it's InnoDB, it's transaction-based. The multi-versioning architecture allows concurrent access to the table, and the SELECT sees the rows as of the snapshot its transaction is reading from. If an INSERT is in progress at the same moment, the SELECT will see 0 rows. In fact, a SELECT executed some seconds later could still see 0 rows if the INSERT's transaction has not committed yet.
There's no way for the two transactions to start truly simultaneously. Transactions are guaranteed to have some order.
It depends on which statement is executed first. If the INSERT executes first, the SELECT will return 1; if the SELECT executes first, it will return 0. Even if you are executing them on a computer with multiple physical cores, the locking mechanism means they will never execute at exactly the same instant.
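A minimal two-session sketch of the InnoDB case described above, assuming the default REPEATABLE READ isolation level and the users table from the question. The session labels are only comments; each half would be run in its own connection.

```sql
-- Session A
START TRANSACTION;
INSERT INTO users (id) VALUES (1);   -- the row exists, but is not committed yet

-- Session B (separate connection, while A's transaction is still open)
SELECT COUNT(*) FROM users;          -- does not see A's uncommitted row

-- Session A
COMMIT;

-- Session B
SELECT COUNT(*) FROM users;          -- a new autocommit SELECT now sees A's row
```

Session B's first SELECT does not block on the INSERT; it simply reads the last committed version of the table.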
Related
Sorry, I don't have real code, only an explanation of the problem.
I would like to understand the best way to solve this problem.
I have 3 queries:
The first one is a long Transaction which performs an SQL INSERT statement in a table.
The second query COUNTs the number of rows of the previous table after the INSERT took place
The third query UPDATEs one field of the previously inserted record with the count number retrieved by the second query.
So far so good. My 3 queries are executed correctly.
Now suppose that these 3 queries are executed inside an API call. If multiple API calls are executed very quickly and simultaneously, the second (COUNT) query retrieves a wrong value, and consequently the third (UPDATE) query also writes a wrong value.
I also get deadlocks on the INSERT query, because while one call is performing the INSERT, the SELECT COUNT from a second API call tries to read at the same time.
My question is what would be the best approach to solve this kind of problem.
I don't need code. I just would like to understand the best way to go.
Would I need to lock all the tables, for example?
It is unclear what you are doing, but this might be faster:
CREATE TEMPORARY TABLE t ...; -- all columns except count
INSERT INTO t ...; -- the incoming data
SELECT COUNT(*) INTO @ct FROM t; -- @ct is a user variable (# would start a comment in MySQL)
INSERT INTO real_table
(...) -- including the count-column last
SELECT ..., @ct FROM t; -- Note how count is tacked on last
I'm writing MySQL queries in a multi-threaded environment, so this query can be executed on any number of threads. My DB is MySQL 8 using the InnoDB engine.
Let's say I have a DB table with 10 numbers (1,2,3,4,5,6,7,8,9,10).
I have a SELECT ... FOR UPDATE query with a limit of 2 rows from a table in the database. FOR UPDATE will lock the rows to ensure isolation. If I have 5 threads that start at the same time, will thread 1 grab entries 1 and 2, thread 2 see that thread 1 got entries 1 and 2 and so grab 3 and 4, and so on?
Would it behave this way?
No, the locks should have no influence on the query plan. The queries will try to select whichever rows fit the WHERE and ORDER BY criteria. If they're locked by another thread, it will block.
Also, the locking will depend on whether the WHERE or ORDER BY clauses use indexed columns or not. If you examine non-indexed columns, it will have to scan the entire table to find or order the rows, which will effectively lock the entire table. If you restrict these clauses to indexed columns, it should just be able to set locks on those indexes.
See use of LIMIT, FOR UPDATE in SELECT statement in the MySQL Forum for some more information.
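Since the question mentions MySQL 8: from MySQL 8.0.1 onward, a locking read can be told to skip rows that another transaction has already locked, which gives roughly the behavior the question hopes for. A sketch, assuming a hypothetical table named numbers with an indexed id column:

```sql
START TRANSACTION;

-- Take the first two rows that nobody else has locked,
-- instead of blocking on rows locked by another thread.
SELECT id
FROM numbers
ORDER BY id
LIMIT 2
FOR UPDATE SKIP LOCKED;

-- ... process the returned rows here ...

COMMIT;  -- releases the row locks
```

The MySQL manual notes that queries using SKIP LOCKED return an inconsistent view of the data, so this suits queue-like tables rather than general transactional work.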
If I have a query like:
UPDATE table_x SET a = 1 WHERE id = ? AND (
SELECT SUM(a) < 100 FROM table_x
)
And:
hundreds of these queries could be made at exactly the same time
I need to be certain that the sum of a never gets to more than 100
Do I need to lock the table or will table_x be locked automatically as it's a subquery?
Assuming this is an InnoDB table, you will have row-level locking. So even if there are 100 of these happening at once, only one transaction at a time will be able to acquire the lock on those rows, and it finishes processing before the next transaction proceeds. There is no difference between how the update and the subquery are processed: to the InnoDB engine this is all one transaction, not two separate transactions.
If you want to see what is going on behind the scenes when you run your query, run SHOW ENGINE INNODB STATUS from the command-line client while the query is running.
Here is a great walkthrough on what all that output means.
If you want to read more about Innodb and row level locking, follow link here.
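One way to make the check and the write happen under the same locks is to read the sum with a locking read first. This is only a sketch, not the approach from the answer above: it assumes InnoDB, the table_x columns a and id from the question, and a made-up id of 42. Because the locking read scans the whole table, concurrent transactions running this sequence queue up behind one another.

```sql
START TRANSACTION;

-- Locking read: a second transaction running the same statement waits here
-- until this one commits or rolls back.
SELECT COALESCE(SUM(a), 0) INTO @total FROM table_x FOR UPDATE;

-- Apply the change only if the limit has not been reached (42 is a placeholder id).
UPDATE table_x SET a = 1 WHERE id = 42 AND @total < 100;

COMMIT;  -- releases the locks so the next waiting transaction can proceed
```

The trade-off is throughput: every caller is serialized on the table, which is acceptable only if the transaction stays short.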
It is unclear to me (from reading the MySQL docs) whether the following query, run on InnoDB tables on MySQL 5.1, would take a write lock on each row as the DB updates it internally (5000 in total) or lock all the rows in the batch. As the database is under really heavy load, this is very important.
UPDATE `records`
INNER JOIN (
SELECT id, name FROM related LIMIT 0, 5000
) AS `j` ON `j`.`id` = `records`.`id`
SET `records`.`name` = `j`.`name`
I'd expect it to be per row, but as I do not know a way to verify that, I decided to ask someone with deeper knowledge. If this is not the case and the DB would lock all the rows in the set, I'd be thankful for an explanation of why.
The UPDATE runs in a transaction. It's an atomic operation, which means that if one of the rows fails (because of a unique constraint, for example) it won't update any of the 5000 rows. This is one of the ACID properties of a transactional database.
Because of this, the UPDATE holds locks on all of the rows for the entire transaction. Otherwise another transaction could further update the value of a row based on its current value (let's say update records set value = value * '2'). That statement would produce a different result depending on whether the first transaction commits or rolls back, so it has to wait for the first transaction to complete all 5000 updates.
If you want to release the locks, just do the update in (smaller) batches.
P.S. autocommit controls whether each statement is issued in its own transaction, but it does not affect the execution of a single query.
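Batching as suggested above might look like the following sketch. It assumes autocommit is enabled (so each statement commits on its own), an arbitrarily chosen batch size of 1000, and an added ORDER BY so that successive LIMIT windows do not overlap; the original query has no ORDER BY.

```sql
-- Batch 1: first 1000 rows of the derived table
UPDATE `records`
INNER JOIN (
    SELECT id, name FROM related ORDER BY id LIMIT 0, 1000
) AS `j` ON `j`.`id` = `records`.`id`
SET `records`.`name` = `j`.`name`;

-- Batch 2: next 1000 rows; the locks taken by batch 1 were released at its commit
UPDATE `records`
INNER JOIN (
    SELECT id, name FROM related ORDER BY id LIMIT 1000, 1000
) AS `j` ON `j`.`id` = `records`.`id`
SET `records`.`name` = `j`.`name`;

-- ... continue with LIMIT 2000, 1000 and so on, up to offset 4000
```

The cost is atomicity: if a later batch fails, the earlier batches stay committed.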
I need a little help with SELECT FOR UPDATE (resp. LOCK IN SHARE MODE).
I have a table with around 400 000 records and I need to run two different processing functions on each row.
The table structure is approximately this:
data (
`id`,
`mtime`, -- When was data1 set last
`data1`,
`data2` DEFAULT NULL,
`priority1`,
`priority2`,
PRIMARY KEY `id`,
INDEX (`mtime`),
FOREIGN KEY ON `data2`
)
Functions are a little different:
first function - has to run in loop on all records (is pretty fast), should select records based on priority1; sets data1 and mtime
second function - has to run only once on each record (is pretty slow), should select records based on priority2; sets data1 and mtime
They shouldn't modify the same row at the same time, but the select may return one row in both of them (priority1 and priority2 have different values) and it's okay for transaction to wait if that's the case (and I'd expect that this would be the only case when it'll block).
I'm selecting data based on following queries:
-- For the first function - not processed first, then the oldest,
-- the same age goes based on priority
SELECT id FROM data ORDER BY mtime IS NULL DESC, mtime, priority1 LIMIT 250 FOR UPDATE;
-- For the second function - only unprocessed rows, ordered by priority
SELECT id FROM data WHERE data2 IS NULL ORDER BY priority2 LIMIT 50 FOR UPDATE;
But what I am experiencing is that only one query returns at a time.
So my questions are:
Is it possible to acquire two separate locks in two separate transactions on separate bunch of rows (in the same table)?
Do I have that many collisions between first and second query (I have troubles debugging that, any hint on how to debug SELECT ... FROM (SELECT ...) WHERE ... IN (SELECT) would be appreciated )?
Can ORDER BY ... LIMIT ... cause any issues?
Can indexes and keys cause any issues?
Key things to check for before getting much further:
Ensure the table engine is InnoDB, otherwise "for update" isn't going to lock the row, as there will be no transactions.
Make sure you're using the "for update" feature correctly. If you select something for update, it's locked to that transaction. While other transactions may be able to read the row, it can't be selected for update, updated or deleted by any other transaction until the lock is released by the original locking transaction.
To keep things clean, try explicitly starting a transaction using "START TRANSACTION", run your select "for update", do whatever you're going to do to the records that are returned, and finish up by explicitly executing a "COMMIT" to close out the transaction.
Order and limit will have no impact on the issue you're experiencing as far as I can tell, whatever was going to be returned by the Select will be the rows that get locked.
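The explicit-transaction pattern described above could be sketched like this, using the data table from the question. The id 123 and the value 'result' are placeholders, and the column types are assumed.

```sql
START TRANSACTION;

-- Lock the candidate rows for the second function.
SELECT id
FROM data
WHERE data2 IS NULL
ORDER BY priority2
LIMIT 50
FOR UPDATE;

-- Work on the returned ids; 123 and 'result' are placeholder values.
UPDATE data SET data1 = 'result', mtime = NOW() WHERE id = 123;

COMMIT;  -- the row locks are held until this point
```

Keeping the slow processing inside the transaction means the locks are held for its whole duration, so smaller LIMIT values shorten the time other transactions may have to wait.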
To answer your questions:
Is it possible to acquire two separate locks in two separate transactions on separate bunch of rows (in the same table)?
Yes, but not on the same rows. A given row can be locked for update by only one transaction at a time.
Do I have that many collisions between first and second query (I have troubles debugging that, any hint on how to debug SELECT ... FROM (SELECT ...) WHERE ... IN (SELECT) would be appreciated )?
There could be a short period where the row lock is being calculated, which will delay the second query. However, unless you're running many hundreds of these SELECT ... FOR UPDATE statements at once, it shouldn't cause any significant or noticeable delays.
Can ORDER BY ... LIMIT ... cause any issues?
Not in my experience. They should work just as they always would on a normal select statement.
Can indexes and keys cause any issues?
Indexes should exist as always to ensure sufficient performance, but they shouldn't cause any issues with obtaining a lock.
All points in the accepted answer seem fine except the two below:
"whatever was going to be returned by the Select will be the rows that get locked." &
"Can indexes and keys cause any issues?
but they shouldn't cause any issues with obtaining a lock."
Instead, all the rows that the DB reads internally while deciding which rows to select and return will be locked. For example, the query below will lock all rows of the table but might select and return only a few rows:
select * from table where non_primary_non_indexed_column = ? for update
Since there is no index, the DB will have to read the entire table to search for your desired row and hence will lock the entire table.
If you want to lock only one row, you need to specify either its primary key or an indexed column in the WHERE clause. Indexing therefore becomes very important for locking only the appropriate rows.
This is a good reference - https://dev.mysql.com/doc/refman/5.7/en/innodb-locking-reads.html
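To illustrate the indexing point with the data table from the question, here is a sketch. The index name is made up, and the last query assumes MySQL 8.0, where the performance_schema.data_locks table exists (it is not available in 5.7).

```sql
-- Let the second function's query find its rows through an index
-- instead of scanning (and locking) the whole table.
CREATE INDEX idx_data2_priority2 ON data (data2, priority2);

-- Check the plan: ideally the key column shows an index rather than a full scan.
EXPLAIN SELECT id FROM data WHERE data2 IS NULL ORDER BY priority2 LIMIT 50;

-- MySQL 8.0 only: while a transaction holds its locks, list what is actually locked.
SELECT object_name, index_name, lock_type, lock_mode, lock_status, lock_data
FROM performance_schema.data_locks;
```

Comparing the data_locks output before and after adding the index is one way to confirm whether a locking read touches only the intended rows.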