MySQL Locking scenarios

We have a large table with about 100 million records and 100+ fields, and SELECT and UPDATE queries run against it frequently.
We now need to set 50+ of those fields to NULL, and we plan to do this update by primary key.
We are aware that locking comes into play when two updates try to modify the same record.
Our question is what happens when an UPDATE and a SELECT try to access the same record.
For example, in our case:
Case 1: One thread is selecting some 10,000 records, and while that SELECT is running another thread updates one of those 10,000 records to NULL. Will both queries execute without waiting for each other? How does the locking mechanism behave in this scenario?
Case 2: One thread is updating 10,000 records to NULL, and while that UPDATE is running another thread selects one of those 10,000 records. Will both queries execute without waiting for each other? How does the locking mechanism behave in this scenario?
We are using MySQL 5.7 with the InnoDB engine; assume all MySQL parameters are at their defaults.
Apologies for the basic question.

Given your premise that you use InnoDB as the storage engine and default configuration options:
Readers do not block writers, and vice-versa, full stop.
A SELECT query (unless it's a locking read) can read rows even if they are locked by a concurrent transaction. The result of the query will be based on the latest version of the rows that were committed at the time the SELECT query's transaction snapshot started.
Likewise, an UPDATE can lock a row even if it is being read by a concurrent transaction.
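A toy model may make the snapshot behaviour described above easier to picture. The sketch below is not how InnoDB is implemented; the class `VersionedRow` and the integer commit numbers are hypothetical names invented for illustration. It only shows the idea that a non-locking read picks the newest version committed at or before its snapshot, so it never has to wait for a writer.

```python
class VersionedRow:
    """Toy multi-version row: keeps every committed value with its commit number."""

    def __init__(self, value):
        # (commit_number, value) pairs, oldest first
        self.versions = [(0, value)]

    def read(self, snapshot):
        # Return the newest value committed at or before the reader's snapshot.
        for commit_no, value in reversed(self.versions):
            if commit_no <= snapshot:
                return value
        raise LookupError("row did not exist at this snapshot")

    def commit_write(self, commit_no, value):
        # A writer appends a new version; older versions stay readable.
        self.versions.append((commit_no, value))


row = VersionedRow("old")
reader_snapshot = 0            # the reader's snapshot predates the update
row.commit_write(1, None)      # the writer sets the field to NULL and commits

seen_by_old_reader = row.read(reader_snapshot)  # still the pre-update value
seen_by_new_reader = row.read(1)                # a later snapshot sees NULL (None)
```

The reader that started first keeps seeing the old value, and neither side waits for the other, which matches the "readers do not block writers" behaviour for non-locking reads.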

Related

Concurrent inserts and race condition in MySQL

I have a case where I need to limit the number of rows per user in a table. Right now I do this with a SELECT COUNT(*) FROM table check before the insert, and if the count is equal to or greater than the allowed limit, I throw an error. The COUNT and INSERT queries run in a single transaction.
But with 5,000 online users and 50K requests per minute, I end up with extra records (more than the limit) in the table. It looks like a race condition between parallel inserts. How can I avoid this? Can anyone suggest some best practices?
Use a separate table that maintains the user and the count of rows inserted, with the user ID as a foreign key to the main table. If you have a session-based application, you can load the count into the session or memory, keep reading and updating it there and in the database after every insert, and only then insert into the main table.
The issue is called a phantom read. It can typically be resolved by using the SERIALIZABLE transaction isolation level:
https://en.wikipedia.org/wiki/Isolation_(database_systems)
But it can decrease performance, so if you have a lot of inserts, then try the other options from the comments too.
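One way to make the counter-table suggestion above race-free is to reserve a slot with a single conditional UPDATE and check how many rows it changed, rather than running COUNT and INSERT as two separate steps. The sketch below is a minimal illustration, not a drop-in solution: it uses Python's built-in sqlite3 module only so that it runs standalone, and the table names `user_counts` and `items`, the column names, and the `LIMIT` value are all assumptions. With InnoDB the same UPDATE takes an exclusive lock on the user's counter row, so concurrent requests for one user are applied one at a time.

```python
import sqlite3

LIMIT = 3  # hypothetical per-user row limit

# In-memory SQLite database used as a stand-in so the sketch is runnable.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_counts (user_id INTEGER PRIMARY KEY, n INTEGER NOT NULL)"
)
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL, payload TEXT)"
)
conn.execute("INSERT INTO user_counts (user_id, n) VALUES (1, 0)")
conn.commit()


def try_insert(conn, user_id, payload):
    """Reserve a slot with one conditional UPDATE, then insert.

    Returns False (and inserts nothing) when the user is already at the limit.
    """
    with conn:  # one transaction: commit on success, roll back on an exception
        cur = conn.execute(
            "UPDATE user_counts SET n = n + 1 WHERE user_id = ? AND n < ?",
            (user_id, LIMIT),
        )
        if cur.rowcount == 0:
            return False  # no row changed, so the limit has been reached
        conn.execute(
            "INSERT INTO items (user_id, payload) VALUES (?, ?)",
            (user_id, payload),
        )
        return True


results = [try_insert(conn, 1, f"row {i}") for i in range(5)]
stored = conn.execute("SELECT COUNT(*) FROM items WHERE user_id = 1").fetchone()[0]
```

The check and the increment happen in one statement, so there is no window between "count" and "insert" in which another request can slip through.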

Will a MySQL SELECT statement interrupt INSERT statement?

I have a MySQL table that keeps gaining new records every 5 seconds.
The questions are:
Can I run a query on this set of data that may take more than 5 seconds?
If a SELECT statement takes more than 5 seconds, will it affect the scheduled INSERT statement?
What happens when an INSERT statement is invoked while a SELECT is still running? Will the SELECT get the newly inserted records?
I'll go over your questions and some of the comments you added later.
Can I run a query on this set of data that may take more than 5 seconds?
Can you? Yes. Should you? It depends. In a MySQL configuration I set up, any query taking longer than 3 seconds was considered slow and logged accordingly. In addition, you need to keep in mind the frequency of the queries you intend to run.
For example, if you try to run a 10 second query every 3 seconds, you can probably see how things won't end well. If you run a 10 second query every few hours or so, then it becomes more tolerable for the system.
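The arithmetic behind that example can be sketched with Little's law: the average number of queries in flight equals the query duration divided by the interval between starts. The function name `avg_in_flight` is made up for this illustration, and the model assumes evenly spaced queries whose duration does not change under load, which is optimistic.

```python
def avg_in_flight(duration_s, interval_s):
    """Average number of overlapping queries: arrival rate (1/interval) times duration."""
    return duration_s / interval_s


busy = avg_in_flight(10, 3)           # a 10 s query started every 3 s
quiet = avg_in_flight(10, 3 * 3600)   # the same query started every 3 hours

print(f"every 3 s: about {busy:.2f} queries running at once")
print(f"every 3 h: about {quiet:.5f} queries running at once")
```

In the first case more than three copies of the query are running at any moment, and each one competes with the others for the same resources; in the second case the server is almost always idle with respect to this query.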
That being said, slow queries can often benefit from optimizations, such as not scanning the entire table (e.g. searching by primary key), and using the EXPLAIN keyword to have the database's query planner tell you how it intends to execute the query internally (e.g. is it using PKs, FKs, or indexes, or is it scanning all table rows?).
If a SELECT statement takes more than 5 seconds, will it affect the scheduled INSERT statement?
"Affect" in what way? If you mean "prevent insert from actually inserting until the select has completed", that depends on the storage engine. For example, MyISAM and InnoDB are different, and that includes locking policies. For example, MyISAM tends to lock entire tables while InnoDB tends to lock specific rows. InnoDB is also ACID-compliant, which means it can provide certain integrity guarantees. You should read the docs on this for more details.
What happens when an INSERT statement is invoked while a SELECT is still running? Will the SELECT get the newly inserted records?
Part of "what happens" is determined by how the specific storage engine behaves. Regardless of what happens, the database is designed to answer application queries in a way that's consistent.
As an example, if the select statement were to lock an entire table, then the insert statement would have to wait until the select has completed and the lock has been released, meaning that the app would see the results prior to the insert's update.
I understand that locking database can prevent messing up the SELECT statement.
It can also put a potentially unacceptable performance bottleneck, especially if, as you say, the system is inserting lots of rows every 5 seconds, and depending on the frequency with which you're running your queries, and how efficiently they've been built, etc.
what is the good practice to do when I need the data for calculations while those data will be updated within short period?
My recommendation is to simply accept the fact that the calculations are based on a snapshot of the data at the specific point in time the calculation was requested and to let the database do its job of ensuring the consistency and integrity of said data. When the app requests data, it should trust that the database has done its best to provide the most up-to-date piece of consistent information (i.e. not providing a row where some columns have been updated, but others yet haven't).
With new rows coming in at the frequency you mentioned, reasonable users will understand that the results they're seeing are based on data available at the time of request.
All of your questions relate to table locking.
They all depend on how the database is configured.
Read: http://www.mysqltutorial.org/mysql-table-locking/
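Performing a SELECT statement while an INSERT statement is running
If you want to run a SELECT while a long INSERT is still in progress, use a separate connection for the SELECT, and open and close it on every check. For example, if I am inserting many records and want to know whether the last one has been inserted, I open a connection, run the SELECT, and close the connection inside a for or while loop.
# Connection 1: send the request that stores the data
# (the INSERT statement runs here and takes a long time)
# Connection 2: poll with a SELECT statement in a while loop
# Assumes the mysql.connector driver; config, my_table and last_id are placeholders
import time
import mysql.connector
while True:
    cnx = mysql.connector.connect(**config)
    cur = cnx.cursor()
    cur.execute("SELECT id FROM my_table WHERE id = %s", (last_id,))
    row = cur.fetchone()
    cnx.close()
    if row is not None:
        break  # stop polling once the row is visible
    time.sleep(1)  # wait before the next check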

Concurrent MySQL queries causing large query queues

I have a large MySQL database that receives large volumes of queries; each query takes around 5-10 seconds to run.
Queries involve checking records, updating records and adding records.
I'm experiencing significant bottlenecks in query execution, which I believe are due to incoming queries having to queue while current queries are using records that the incoming queries need to access.
Is there a way, besides completely reformatting my database structure and SQL queries, to enable simultaneous use of database records by queries?
An INSERT, UPDATE, or DELETE operation locks the relevant tables (MyISAM) or rows (InnoDB) until the operation completes. Make sure queries of this type are committed quickly, and also check your transaction isolation level and the locking it involves.
For MySQL internal locking see: https://dev.mysql.com/doc/refman/5.5/en/internal-locking.html
Also remember that MySQL has different storage engines with different features, e.g.:
The MyISAM storage engine supports concurrent inserts to reduce
contention between readers and writers for a given table: If a MyISAM
table has no holes in the data file (deleted rows in the middle), an
INSERT statement can be executed to add rows to the end of the table
at the same time that SELECT statements are reading rows from the
table.
https://dev.mysql.com/doc/refman/5.7/en/concurrent-inserts.html
Finally, take a look at https://dev.mysql.com/doc/refman/5.7/en/optimization.html

How to improve InnoDB's SELECT performance while INSERTing

We recently switched our tables to use InnoDB (from MyISAM) specifically so we could take advantage of the ability to make updates to our database while still allowing SELECT queries to occur (i.e. by not locking the entire table for each INSERT)
We have a cycle that runs weekly and INSERTS approximately 100 million rows using "INSERT INTO ... ON DUPLICATE KEY UPDATE ..."
We are fairly pleased with the current update performance of around 2000 insert/updates per second.
However, while this process is running, we have observed that regular queries take very long.
For example, this took about 5 minutes to execute:
SELECT itemid FROM items WHERE itemid = 950768
(When the INSERTs are not happening, the above query takes several milliseconds.)
Is there any way to force SELECT queries to take a higher priority? Otherwise, are there any parameters that I could change in the MySQL configuration that would improve the performance?
We would ideally perform these updates when traffic is low, but anything more than a couple seconds per SELECT query would seem to defeat the purpose of being able to simultaneously update and read from the database. I am looking for any suggestions.
We are using Amazon's RDS as our MySQL server.
Thanks!
I imagine you have already solved this nearly a year later :) but I thought I would chime in. According to MySQL's documentation on internal locking (as opposed to explicit, user-initiated locking):
Table updates are given higher priority than table retrievals. Therefore, when a lock is released, the lock is made available to the requests in the write lock queue and then to the requests in the read lock queue. This ensures that updates to a table are not “starved” even if there is heavy SELECT activity for the table. However, if you have many updates for a table, SELECT statements wait until there are no more updates.
So it sounds like your SELECT is getting queued up until your inserts/updates finish (or at least there's a pause.) Information on altering that priority can be found on MySQL's Table Locking Issues page.

Select statement blocks the read/write operation on the InnoDB table

I have a SELECT query that runs on a transactional table with more than 4 million records. Whenever I execute this query, all write and update operations on that table are suspended, and we start getting exceptions on the Java side saying that the lock wait timeout was exceeded and to try restarting the transaction. The lock wait timeout is currently set to 200 seconds. I cannot understand why a SELECT statement would create such locks on the table and block all INSERT/UPDATE statements. The table storage engine is InnoDB and the primary key is an auto-increment key. The MySQL version is 5.1.40.
Also, I am not starting any transaction before executing the SELECT statement.
Any ideas?
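So yes, when your SELECT in one transaction read-locks the records of that table, write operations that touch the same records have to wait until the reading transaction completes (if it follows two-phase locking). Note that in InnoDB this applies to locking reads, such as SELECT ... LOCK IN SHARE MODE, or to SELECTs run under the SERIALIZABLE isolation level with autocommit disabled; a plain SELECT at the default REPEATABLE READ level is a non-locking consistent read.
This document may help in understanding the InnoDB locking model.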
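The waiting described above follows from how shared (S) and exclusive (X) row locks combine: two shared locks can coexist, but an exclusive lock conflicts with any other lock on the same row. The sketch below encodes just that rule; the `COMPATIBLE` table and `must_wait` function are hypothetical names for illustration, not part of any MySQL API.

```python
# Compatibility of a requested lock with a lock another transaction already holds.
# Keys are (held, requested); S = shared (read) lock, X = exclusive (write) lock.
COMPATIBLE = {
    ("S", "S"): True,
    ("S", "X"): False,
    ("X", "S"): False,
    ("X", "X"): False,
}


def must_wait(held, requested):
    """Return True if the requesting transaction has to wait for the holder."""
    return not COMPATIBLE[(held, requested)]


reader_blocks_writer = must_wait("S", "X")  # a read-locking SELECT vs. an UPDATE
reader_blocks_reader = must_wait("S", "S")  # two read-locking SELECTs
```

A SELECT that holds shared locks therefore makes writers wait until its transaction ends, while a non-locking read takes no row locks at all and never enters this table.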