I've just read the MySQL docs, where I found this sentence: "A consistent read means that InnoDB uses multi-versioning to present to a query a snapshot of the database at a point in time."
I have read a lot of MySQL doc pages, but I still can't work out what exactly "to a query" means here. It definitely relates to a SELECT statement, but what if my transaction starts with an UPDATE, INSERT, or DELETE statement?
Thanks!
I found the answer to my own question, and I think others may find it useful. After days of searching the Oracle docs, I finally found this:
InnoDB creates a consistent read view or a consistent snapshot either when the statement
mysql> START TRANSACTION WITH CONSISTENT SNAPSHOT;
is executed or when the first select query is executed in the transaction.
https://blogs.oracle.com/mysqlinnodb/entry/repeatable_read_isolation_level_in
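A rough two-session sketch of the behaviour described in that quote. It assumes InnoDB's default REPEATABLE READ isolation level; the table `t(id, val)` is hypothetical and only there for illustration.

```sql
-- Hypothetical InnoDB table: t(id INT PRIMARY KEY, val INT)

-- Session A
START TRANSACTION;
SELECT val FROM t WHERE id = 1;   -- first SELECT: the snapshot is established here

-- Session B (autocommit on)
UPDATE t SET val = val + 1 WHERE id = 1;

-- Session A
SELECT val FROM t WHERE id = 1;   -- still returns the value seen by the first SELECT
COMMIT;
SELECT val FROM t WHERE id = 1;   -- new transaction, new snapshot: B's change is visible
```

Replacing the first line of session A with START TRANSACTION WITH CONSISTENT SNAPSHOT would establish the snapshot immediately instead of at the first SELECT.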
When the query can change the data, the database also uses locks to synchronise queries.
So between queries that change data, locks are used to make sure that only one query at a time can change specific items. Between a query that reads data and a query that changes data, multi-versioning is used to present the data before the change to the query that reads it.
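The difference between the two cases can be sketched with two sessions. This assumes an InnoDB table at the default isolation level; the table `t(id, val)` is hypothetical.

```sql
-- Hypothetical InnoDB table: t(id INT PRIMARY KEY, val INT)

-- Session A
START TRANSACTION;
UPDATE t SET val = 10 WHERE id = 1;   -- takes an exclusive lock on the row

-- Session B
SELECT val FROM t WHERE id = 1;       -- not blocked: returns the value from before A's change
UPDATE t SET val = 20 WHERE id = 1;   -- blocked: waits for A's row lock

-- Session A
COMMIT;                               -- releases the lock; B's UPDATE now proceeds
```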
I have a quick question whose answer I can't seem to find online; I'm not sure whether I'm using the right wording.
Do MySQL databases automatically synchronize queries coming in at around the same time? For example, if I send a query to insert something into a database at the same time another connection sends a query to select something from it, does MySQL automatically lock the database while the insert is happening, and then unlock it when it's done, allowing the select query to access it?
Thanks
Do MySql databases automatically synchronize queries coming in at around the same time?
Yes.
Think of it this way: there's no such thing as simultaneous queries. MySQL always carries out one of them first, then the second one. (This isn't exactly true; the server is far more complex than that. But it robustly provides the illusion of sequential queries to us users.)
If, from one connection you issue a single INSERT query or a single UPDATE query, and from another connection you issue a SELECT, your SELECT will get consistent results. Those results will reflect the state of data either before or after the change, depending on which query went first.
You can even do stuff like this (read-modify-write operations) and maintain consistency.
UPDATE table
SET update_count = update_count + 1,
update_time = NOW()
WHERE id = something
If you must do several INSERT or UPDATE operations as if they were one, you'll need to use the InnoDB engine, and you'll need to use transactions. While the transaction is in progress, plain SELECT operations from other connections are not blocked; they keep seeing the data as it was before the transaction's changes until it commits. Teaching you to use transactions is beyond the scope of a Stack Overflow answer.
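For orientation only, a minimal sketch of what "several operations as if they were one" looks like. The `accounts` table and its columns are made up for this example.

```sql
-- Hypothetical table: accounts(id INT PRIMARY KEY, balance DECIMAL(12,2)) ENGINE=InnoDB
START TRANSACTION;

UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;

-- Both changes become visible to other connections together.
COMMIT;
```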
The key to understanding how a modern database engine like InnoDB works is Multi-Version Concurrency Control or MVCC. This is how simultaneous operations can run in parallel and then get reconciled into a consistent "view" of the database when fully committed.
If you've ever used Git you know how you can have several updates to the same base happening in parallel but so long as they can all cleanly merge together there's no conflict. The database works like that as well, where you can begin a transaction, apply a bunch of operations, and commit it. Should those apply without conflict the commit is successful. If there's trouble the transaction is rolled back as if it never happened.
This ability to juggle multiple operations simultaneously is what makes a transaction-capable database engine really powerful. It's an important component necessary to meet the ACID standard.
MyISAM, the default engine before MySQL 5.5, doesn't have any of these features and locks the whole table on any INSERT operation to avoid conflict. It works much like you thought it did.
When creating a database in MySQL you have your choice of engine, but using InnoDB should be your default. There's really no reason at all to use MyISAM as any of the interesting features of that engine (e.g. full-text indexes) have been ported over to InnoDB.
I have a MySQL table that gains new records every 5 seconds.
My questions are:
Can I run a query on this set of data that may take more than 5 seconds?
If the SELECT statement takes more than 5 seconds, will it affect the scheduled INSERT statement?
What happens when the INSERT statement is invoked while the SELECT is still running? Will the SELECT get the newly inserted records?
I'll go over your questions and some of the comments you added later.
Can I run a query on this set of data that may take more than 5 seconds?
Can you? Yes. Should you? It depends. In a MySQL configuration I set up, any query taking longer than 3 seconds was considered slow and logged accordingly. In addition, you need to keep in mind the frequency of the queries you intend to run.
For example, if you try to run a 10 second query every 3 seconds, you can probably see how things won't end well. If you run a 10 second query every few hours or so, then it becomes more tolerable for the system.
That being said, slow queries can often benefit from optimizations, such as not scanning the entire table (e.g. searching by primary key), and from using the EXPLAIN keyword to get the database's query planner to tell you how it intends to execute the query internally (e.g. is it using PKs, FKs, or other indexes, or is it scanning all table rows?).
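As a small illustration of the EXPLAIN suggestion, here is a sketch against a made-up `readings` table; the table and column names are assumptions, not taken from the question.

```sql
-- Hypothetical table:
--   readings(id INT PRIMARY KEY, sensor_id INT, recorded_at DATETIME, temperature DOUBLE)
EXPLAIN
SELECT AVG(temperature)
FROM readings
WHERE recorded_at >= NOW() - INTERVAL 1 HOUR;

-- In the output, type = ALL means every row is scanned;
-- the key column shows which index (if any) the planner chose.
```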
If the SELECT statement takes more than 5 seconds, will it affect the scheduled INSERT statement?
"Affect" in what way? If you mean "prevent insert from actually inserting until the select has completed", that depends on the storage engine. For example, MyISAM and InnoDB are different, and that includes locking policies. For example, MyISAM tends to lock entire tables while InnoDB tends to lock specific rows. InnoDB is also ACID-compliant, which means it can provide certain integrity guarantees. You should read the docs on this for more details.
What happens when the INSERT statement is invoked while the SELECT is still running? Will the SELECT get the newly inserted records?
Part of "what happens" is determined by how the specific storage engine behaves. Regardless of what happens, the database is designed to answer application queries in a way that's consistent.
As an example, if the select statement were to lock an entire table, then the insert statement would have to wait until the select has completed and the lock has been released, meaning that the app would see the results prior to the insert's update.
I understand that locking database can prevent messing up the SELECT statement.
It can also create a potentially unacceptable performance bottleneck, especially if, as you say, the system is inserting lots of rows every 5 seconds; how bad it gets depends on how often you run your queries, how efficiently they've been built, and so on.
what is the good practice to do when I need the data for calculations while those data will be updated within short period?
My recommendation is to simply accept the fact that the calculations are based on a snapshot of the data at the specific point in time the calculation was requested and to let the database do its job of ensuring the consistency and integrity of said data. When the app requests data, it should trust that the database has done its best to provide the most up-to-date piece of consistent information (i.e. not providing a row where some columns have been updated, but others yet haven't).
With new rows coming in at the frequency you mentioned, reasonable users will understand that the results they're seeing are based on data available at the time of request.
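If a calculation needs several queries that must all agree with each other, one way to get that "point in time" view is to run them in a single InnoDB transaction. This is a sketch under the assumption that the table uses InnoDB and the session is at the default REPEATABLE READ level; the `readings` table is hypothetical.

```sql
-- Hypothetical InnoDB table: readings(id, sensor_id, recorded_at, temperature)
START TRANSACTION WITH CONSISTENT SNAPSHOT;

SELECT COUNT(*)         FROM readings;
SELECT AVG(temperature) FROM readings;   -- computed over the same rows as the COUNT above

COMMIT;  -- ends the snapshot; later queries see newer rows
```

Rows inserted by the 5-second job while this runs are neither blocked nor included in these results.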
All of your questions relate to table locking.
The answers depend on how the database is configured.
Read: http://www.mysqltutorial.org/mysql-table-locking/
Performing a SELECT statement while an INSERT statement is running
If you want to run a SELECT statement while an INSERT is still executing, open a new connection for each check and close it afterwards. For example, if I insert lots of records and want to find out, by running a SELECT, whether the last record has been inserted, I have to open and close the connection inside a for or while loop. (A fresh connection starts a fresh transaction, so with autocommit off under InnoDB's default REPEATABLE READ level it does not keep reading an older snapshot.)
# Connection A: send the request that stores the data
insert statement working  # takes a long time
# Connection B: run the select statement in a while loop
while True:
    cnx.open()
    select statement
    cnx.close()
    # break out of the while loop once you get the result
So, we are trying to run a report that goes to the screen and does not change any stored data.
However, it is complex, so needs to go through a couple of (TEMPORARY*) tables.
It pulls data from live tables, which are replicated.
The nasty bit comes when we take the "eligible" records from
temp_PreCalc
and populate them from the live data to create the next (TEMPORARY*) table output
resulting in effectively:
INSERT INTO temp_PostCalc (...)
SELECT ...
FROM temp_PreCalc
JOIN live_Tab1 ON ...
JOIN live_Tab2 ON ...
JOIN live_Tab3 ON ...
The report is not a "definitive" answer; the expectation is that it is merely a "snapshot" report and will be out of date as soon as it appears on screen.
There is no order or reproducibility issue.
So Ideally, I would turn my TRANSACTION ISOLATION LEVEL down to READ COMMITTED...
However, I can't, because live_Tab1, 2 and 3 are replicated with binlog_format = STATEMENT...
The statement is lovely and quick: it takes hardly any time to run, so the resource load is now lower than it used to be (the old version did separate selects and inserts). But it waits (as I understand it) because the SELECT waits for a repeatable/syncable lock on the live_Tab tables, so that any result could be replicated safely.
In fact it now takes more time because of that wait.
I'd like to SEE that performance benefit in response time!
Except the data is written to (TEMPORARY*) tables and then thrown away.
There are no live_ table destinations - only sources...
* These tables are not actually TEMPORARY tables but dynamically created and discarded InnoDB tables, as the report calculation requires a self-join and delete... but they are temporary in purpose.
I now seem to be going around in circles finding an answer.
I don't have SUPER privilege and don't want it...
So I can't SET sql_log_bin = 0 for this connection session (why is this a requirement?)
So...
If I have a scratch Database or table wildcard, which excludes all my temp_ "Temporary" tables from replication...
(I am waiting for this change to go through at my host centre)
Will MySQL allow me to
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
INSERT INTO temp_PostCalc (...)
SELECT ...
FROM temp_PreCalc
JOIN live_Tab1 ON ...
JOIN live_Tab2 ON ...
JOIN live_Tab3 ON ...
;
Or will I still get my
"Cannot Execute statement: impossible to write to binary log since
BINLOG_FORMAT = STATEMENT and at least one table uses a storage engine
limited to row-based logging..."
Even though it's not technically true?
I am expecting it to, as I presume that the replication will kick in simply because it sees the "INSERT" statement, and will do a simple check on any of the tables involved being replication eligible, even though none of the destinations are actually replication eligible....
or will it pleasantly surprise me?
I really can't face using an unpleasant solution like
SELECT ... INTO OUTFILE
LOAD DATA INFILE
In fact I don't think I could even use that. How would I get unique filenames? How would I clean them up?
The reports are run on-demand directly by end users, and I only have MySQL interface access to the server.
or streaming it through the PHP client, just to separate the INSERT from the SELECT so that MySQL doesn't get upset about which tables are replication eligible....
So, it looks like the only way appears to be:
We create a second Schema "ScratchTemp"...
Set the dreaded replication --replicate-ignore-db=ScratchTemp
My "local" query code opens a new mysql connection, and performs a USE ScratchTemp;
Because I have selected the default database of the "ignore"d one - none of my queries will be replicated.
So I need to take huge care not to perform ANY real queries here
Reference my scratch_ tables and actual data tables by prefixing them all on my queries with the schema qualified name...
e.g.
INSERT INTO LiveSchema.temp_PostCalc (...) SELECT ... FROM LiveSchema.temp_PreCalc JOIN LiveSchema.live_Tab1 etc etc as above.
And then close this connection just as soon as I can, as it is frankly dangerous to have a non-replicated connection open....
Sigh...?
Is there a way for a query that changes data (UPDATE, DELETE, INSERT) to be added to a "history" table transparently whenever records change?
For example, if MySQL detects a change in a record or set of records, is there a way for MySQL to add that query statement to a separate table so that we can track the changes? That would make "rollback" possible, since replaying every query (other than SELECT) would reconstruct the database from its first row. Right?
I use PHP to interact with MySQL.
You need to enable the MySQL binary log (binlog). This automatically logs all data-changing statements (or, with row-based logging, the row changes themselves) to a binary log, which can be replayed as needed.
The alternative is to build an audit function using triggers.
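A minimal sketch of the trigger approach. The `accounts` table, the `accounts_history` table, and all column names are hypothetical. Note that a trigger records the old and new row values, not the text of the query that caused the change.

```sql
-- Hypothetical audited table: accounts(id INT PRIMARY KEY, balance DECIMAL(12,2))
CREATE TABLE accounts_history (
    history_id  BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    account_id  INT NOT NULL,
    old_balance DECIMAL(12,2),
    new_balance DECIMAL(12,2),
    changed_at  DATETIME NOT NULL
);

-- Fires once per updated row and copies the before/after values.
CREATE TRIGGER accounts_after_update
AFTER UPDATE ON accounts
FOR EACH ROW
    INSERT INTO accounts_history (account_id, old_balance, new_balance, changed_at)
    VALUES (OLD.id, OLD.balance, NEW.balance, NOW());
```

Similar AFTER INSERT and AFTER DELETE triggers would be needed to cover the other two statement types.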
Read about transaction logging in MySQL. This is built in to MySQL.
MySQL has logging functionality that can be used to log all queries. I usually leave this turned off since these logs can grow very rapidly, but it is useful to turn on when debugging.
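For a debugging session, the general query log can be switched on and off at runtime. A sketch, assuming an account with sufficient privileges to set global variables:

```sql
SET GLOBAL log_output  = 'TABLE';  -- write the log to mysql.general_log instead of a file
SET GLOBAL general_log = 'ON';

-- ... run the statements you want to inspect ...

SELECT event_time, user_host, argument
FROM mysql.general_log
ORDER BY event_time DESC
LIMIT 20;

SET GLOBAL general_log = 'OFF';    -- turn it off again; the log grows quickly
```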
If you are looking to track changes to records so that you can "roll back" a sequence of queries if some error condition presents itself, then you may want to look into MySQL's native support of transactions.
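A short sketch of that kind of rollback, using a hypothetical `orders` table. It only undoes changes that have not been committed yet, and it requires a transactional engine such as InnoDB.

```sql
-- Hypothetical table: orders(id INT PRIMARY KEY, status VARCHAR(20)) ENGINE=InnoDB
START TRANSACTION;

UPDATE orders SET status = 'cancelled' WHERE id = 42;
DELETE FROM orders WHERE status = 'draft';

-- Something went wrong in the application: undo both statements.
ROLLBACK;
```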
I have a very slow query that I need to run on a MySQL database from time to time.
I've discovered that attempts to update the table that is being queried are blocked until the query has finished.
I guess this makes sense, as otherwise the results of the query might be inconsistent, but it's not ideal for me, as the query is of much lower importance than the update.
So my question really has two parts:
Out of curiosity, what exactly does MySQL do in this situation? Does it lock the table for the duration of the query? Or try to lock it before the update?
Is there a way to make the slow query not blocking? I guess the options might be:
Kill the query when an update is needed.
Run the query on a copy of the table as it was just before the update took place
Just let the query go wrong.
Anyone have any thoughts on this?
It sounds like you are using a MyISAM table, which uses table level locking. In this case, the SELECT will set a shared lock on the table. The UPDATE then will try to request an exclusive lock and block and wait until the SELECT is done. Once it is done, the UPDATE will run like normal.
MyISAM Locking
If you switched to InnoDB, then your SELECT will set no locks by default. There is no need to change transaction isolation levels as others have recommended (repeatable read is default for InnoDB and no locks will be set for your SELECT). The UPDATE will be able to run at the same time. The multi-versioning that InnoDB uses is very similar to how Oracle handles the situation. The only time that SELECTs will set locks is if you are running in the serializable transaction isolation level, you have a FOR UPDATE/LOCK IN SHARE MODE option to the query, or it is part of some sort of write statement (such as INSERT...SELECT) and you are using statement based binary logging.
InnoDB Locking
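The cases described above can be sketched as follows. The `items` table is hypothetical, and the comments assume the default REPEATABLE READ level.

```sql
-- Hypothetical table: items(id INT PRIMARY KEY, qty INT)
ALTER TABLE items ENGINE = InnoDB;   -- convert from MyISAM (rebuilds the table)

-- Plain SELECT: a consistent, non-locking read; concurrent UPDATEs are not blocked.
SELECT qty FROM items WHERE id = 7;

-- Locking reads: these do take row locks, held until the transaction ends.
START TRANSACTION;
SELECT qty FROM items WHERE id = 7 LOCK IN SHARE MODE;  -- shared lock
SELECT qty FROM items WHERE id = 7 FOR UPDATE;          -- exclusive lock
COMMIT;
```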
For the purposes of the select statement, you should probably issue a:
SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
command on the connection, which causes the subsequent select statements to operate without locking.
Don't use 'SELECT ... FOR UPDATE', as that definitely locks the table rows that are affected by the select statement.
The full list of MySQL transaction isolation levels is in the docs.
First of all, you need to know which engine you're using (MyISAM or InnoDB).
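One way to check the engine is shown below; 'mydb' and 'mytable' are placeholders for your own schema and table names.

```sql
SELECT TABLE_NAME, ENGINE
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'mydb'
  AND TABLE_NAME   = 'mytable';

-- Alternatively, from within the schema:
SHOW TABLE STATUS LIKE 'mytable';   -- see the Engine column
```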
This is clearly a transaction problem.
Take a look at section 13.4.6, SET TRANSACTION Syntax, in the MySQL manual.
UPDATE LOW_PRIORITY .... may be helpful - the mysql docs aren't clear whether this would let the user requesting the update continue and the update happen when it can (which is what I think happens) or whether the user has to wait (which would be worse than at present ...), and I can't remember.
What table types are you using? If you are on MyISAM, switching to InnoDB (if you can; before MySQL 5.6 it had no full-text indexing) opens up more options for this sort of thing, as it supports transactional features and row-level locking.
I don't know MySQL, but it sounds like a transaction problem.
You should be able to set the transaction type to dirty read for your SELECT query.
That won't necessarily give you correct results, but it shouldn't be blocked.
Better would be to make the first query go faster. Do some analysis and check whether you can speed it up with correct indexing and so on.