Locking in MySQL

I have recently started a fairly large web project which is going to use MySQL as a database. I am not completely familiar with MySQL, but I know enough to make simple queries and generally do all that I need to.
I was told that I needed to lock my tables before writing to them. Is this necessary every time? Surely MySQL has some sort of built-in feature to handle concurrent reading and writing of the database?
In short, when should I use locking, and how should I go about doing so?

Here is an excellent explanation of when and how to implement locking: http://www.brainbell.com/tutors/php/php_mysql/When_and_how_to_lock_tables.html
As per El yobo's suggestion:
If you are doing one-off SELECT queries, there is not going to be a problem.
From the article:
Locking is required only when developing scripts that first read a value from a database and later write that value to the database.
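For illustration, here is a minimal sketch of that read-then-write pattern (the counters table and column names are made up):
-- MyISAM-style approach from the article: hold a table lock around
-- the read and the write so no other client changes the row in between.
LOCK TABLES counters WRITE;
SELECT hits FROM counters WHERE page = 'home';
-- ... compute the new value in application code ...
UPDATE counters SET hits = hits + 1 WHERE page = 'home';
UNLOCK TABLES;

-- InnoDB alternative: a row-level lock inside a transaction.
START TRANSACTION;
SELECT hits FROM counters WHERE page = 'home' FOR UPDATE;
UPDATE counters SET hits = hits + 1 WHERE page = 'home';
COMMIT;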

In short, don't use MyISAM; use InnoDB instead. When you want to insert, update, or delete rows, do:
start transaction;
insert into users (username) values ('f00');
...
commit; -- or rollback
When you want to fetch rows, just select them:
select user_id, username from users;
hope this helps :)

Related

MySQL query synchronization/locking question

I have a quick question that I can't seem to find the answer to online; I'm not sure if I'm using the right wording.
Do MySQL databases automatically synchronize queries coming in at around the same time? For example, if I send a query to insert something into a database at the same time another connection sends a query to select something from the database, does MySQL automatically lock the database while the insert is happening, and then unlock when it's done, allowing the select query to access it?
Thanks
Do MySQL databases automatically synchronize queries coming in at around the same time?
Yes.
Think of it this way: there's no such thing as simultaneous queries. MySQL always carries out one of them first, then the second one. (This isn't exactly true; the server is far more complex than that. But it robustly provides the illusion of sequential queries to us users.)
If, from one connection you issue a single INSERT query or a single UPDATE query, and from another connection you issue a SELECT, your SELECT will get consistent results. Those results will reflect the state of data either before or after the change, depending on which query went first.
You can even do stuff like this (read-modify-write operations) and maintain consistency.
UPDATE table
SET update_count = update_count + 1,
update_time = NOW()
WHERE id = something
If you must do several INSERT or UPDATE operations as if they were one, you'll need to use the InnoDB engine, and you'll need to use transactions. Other connections will not see the transaction's changes until it commits. Teaching you to use transactions is beyond the scope of a Stack Overflow answer.
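Just to show the shape of it, though, here is a minimal sketch (the accounts table and amounts are invented for illustration):
-- Hypothetical funds transfer: both UPDATEs take effect together, or
-- neither does if we roll back.
START TRANSACTION;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT; -- or ROLLBACK; to undo both statements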
The key to understanding how a modern database engine like InnoDB works is Multi-Version Concurrency Control or MVCC. This is how simultaneous operations can run in parallel and then get reconciled into a consistent "view" of the database when fully committed.
If you've ever used Git, you know how you can have several updates to the same base happening in parallel, and as long as they can all cleanly merge together there's no conflict. The database works like that as well: you can begin a transaction, apply a bunch of operations, and commit it. If those apply without conflict, the commit is successful. If there's trouble, the transaction is rolled back as if it never happened.
This ability to juggle multiple operations simultaneously is what makes a transaction-capable database engine really powerful. It's an important component necessary to meet the ACID standard.
MyISAM, MySQL's older default engine, doesn't have any of these features and locks the whole table on any write operation to avoid conflict. It works like you thought it did.
When creating a database in MySQL you have your choice of engine, but using InnoDB should be your default. There's really no reason at all to use MyISAM as any of the interesting features of that engine (e.g. full-text indexes) have been ported over to InnoDB.
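For example, you can name the engine explicitly when creating a table (the table below is just an illustration; on MySQL 5.5 and later InnoDB is already the default):
-- Hypothetical table created explicitly on InnoDB.
CREATE TABLE users (
    user_id  INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    username VARCHAR(64)  NOT NULL
) ENGINE=InnoDB;

-- Check which engine an existing table uses.
SHOW TABLE STATUS LIKE 'users';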

MySQL Transaction

It could be a dumb question; I tried to search for it and found nothing.
I've been using MySQL for years (not all that long) but I have never tried MySQL transactions.
Now my question is: what would happen if I issue an insert or delete statement from multiple clients using transactions? Would it lock the table and prevent the other clients from performing their queries?
What would happen if another client issues a transaction while the first client still has an unfinished transaction?
I appreciate any help.
P.S. Most likely I will insert from a file or CSV; it could be a big chunk of data or just a small one.
MySQL automatically performs locking for single SQL statements to keep clients from interfering with each other, but this is not always sufficient to guarantee that a database operation achieves its intended result, because some operations are performed over the course of several statements. In this case, different clients might interfere with each other.
Source: http://www.informit.com/articles/article.aspx?p=2036581&seqNum=12
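To sketch the kind of multi-statement case the quote means, assume a hypothetical inventory table: without a transaction, two clients could both read the same quantity and oversell; a locking read inside a transaction prevents that.
-- Hypothetical check-then-decrement; FOR UPDATE makes a second client's
-- read of the same row wait until this transaction commits.
START TRANSACTION;
SELECT quantity FROM inventory WHERE item_id = 42 FOR UPDATE;
UPDATE inventory SET quantity = quantity - 1 WHERE item_id = 42;
COMMIT;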

PDO transactions with InnoDB tables

I have InnoDB tables that we access via a PDO API from PHP. Now, I've read that for INSERT and UPDATE statements, it would probably be a good idea to use InnoDB transactions. Since auto commit is set to 1, it would commit the query as soon as it is made. So if I group a bunch of INSERTs together and do:
$GLOBALS['dbh']->query('BEGIN');
[multiple INSERT queries here]
$GLOBALS['dbh']->query('COMMIT');
It's supposed to be more efficient.
Questions:
Is this correct?
I also read that certain APIs make use of their own transactions and was wondering if anyone knew if PDO does this. In other words, should I worry about doing this at all or let PDO handle transactions?
In the case that PDO does handle transactions, am I screwing everything up with the above queries?
Thanks.
Is this correct?
Yes.
Small nitpick: I would use START TRANSACTION instead of BEGIN; it is the same, but more self-evident.
I also read that certain APIs make use of their own transactions and was wondering if anyone knew if PDO does this. In other words, should I worry about doing this at all or let PDO handle transactions?
PDO does not magically know when your transactions start and end, so you will still have to start and end your transactions yourself if autocommit = 1 and you want to include more than one statement in a transaction.
You should not worry; what you are doing above is fine.
In the case that PDO does handle transactions, am I screwing everything up with the above queries?
No.
So if I group a bunch of INSERTs together and do: {see code above}
It's supposed to be more efficient.
Not very much. If you can cram all your inserts into a single statement, that would be more efficient.
And if you can replace the inserts with a LOAD DATA INFILE, that would be more efficient still (a sketch follows the example below).
Example:
INSERT INTO table1 (field1, field2) VALUES (1,1),(2,5),(5,6);
-- Much more efficient than 3 separate inserts
-- (and you don't need to start and end the transaction :-)
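And a LOAD DATA INFILE sketch, assuming the data sits in a CSV file on the server (path and table are made up):
-- Hypothetical bulk load from a server-side CSV file.
LOAD DATA INFILE '/tmp/table1.csv'
INTO TABLE table1
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
(field1, field2);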
Theoretically it's correct, but the "official" way to do it is to use PDO's built-in methods for that: http://www.php.net/manual/en/pdo.begintransaction.php

What are the advantages of UPDATE LOW_PRIORITY and INSERT DELAYED INTO?

I was going through some code and noticed that UPDATE LOW_PRIORITY and INSERT DELAYED INTO are used for updating the database. What is the use of these statements? Should I use them in every insert and update statement for the various tables in the same database?
With the LOW_PRIORITY keyword, execution of the UPDATE is delayed until no other clients are reading from the table. Normally, reading clients are put on hold until the update query is done. If you want to give the reading clients priority over the update query, you should use LOW_PRIORITY.
The DELAYED option for the INSERT statement is a MySQL extension to standard SQL that is very useful if you have clients that cannot or need not wait for the INSERT to complete. This is a common situation when you use MySQL for logging and you also periodically run SELECT and UPDATE statements that take a long time to complete.
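For reference, each is just an extra keyword after the verb; the table and column names below are only examples:
-- Let reading clients go first (MyISAM only).
UPDATE LOW_PRIORITY page_stats SET hits = hits + 1 WHERE page_id = 7;

-- Queue the insert and return immediately without waiting (MyISAM only).
INSERT DELAYED INTO access_log (page_id, visited_at) VALUES (7, NOW());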
LOW_PRIORITY, HIGH_PRIORITY and DELAYED are only useful in a few circumstances. If you don't have a big load, they can't help you. If you do, don't use anything you don't fully understand.
All of these options only work with MyISAM, not InnoDB, not views.
DELAYED doesn't work with partitioned tables, and it's clearly designed for data warehousing. The client sends the insert and then forgets about it, without waiting for the result. So you won't know if the insert succeeded, if there were duplicate values, etc. It should never be used while other threads could SELECT from that table, because a delayed insert is never concurrent.
LOW_PRIORITY waits until NO client is accessing the table. But if you have high traffic, you may wait until the connection times out... that's not what you want, I suppose :)
Also, note that DELAYED will be removed in Oracle MySQL 5.7 (but not in MariaDB).
If you need to use these, then you have a big load on your server, and you know that some UPDATE or INSERT statements are not high priority and can afford to wait.
Example: SQL that generates some statistics or items top. They are slow, and do not need to be executed immediately.
If your UPDATEs on MySQL in a read-intensive environment are taking as long as 1800 seconds, then it is advisable to use UPDATE LOW_PRIORITY.

SQL Server / MySQL / Access - speeding up inserting many rows in an inefficient manner

SETUP
I have to insert a couple million rows in either SQL Server 2000/2005, MySQL, or Access. Unfortunately I don't have an easy way to use bulk insert or BCP or any of the other ways that a normal human would go about this. The inserts will happen on one particular database but that code needs to be db agnostic -- so I can't do bulk copy, or SELECT INTO, or BCP. I can however run specific queries before and after the inserts, depending on which database I'm importing to.
eg.
If IsSqlServer() Then
DisableTransactionLogging();
ElseIf IsMySQL() Then
DisableMySQLIndices();
End If
... do inserts ...
If IsSqlServer() Then
EnableTransactionLogging();
ElseIf IsMySQL() Then
EnableMySQLIndices();
End If
QUESTION
Are there any interesting things I can do to SQL Server that might speed up these inserts?
For example, is there a command I could issue to tell SQL Server, "Hey, don't bother recording these transactions in the transaction log".
Or maybe I could say, "Hey, I have a million rows coming in, so don't update your index until I'm totally finished".
ALTER INDEX [IX_TableIndex] ON Table DISABLE
... inserts
ALTER INDEX [IX_TableIndex] ON Table REBUILD
(Note: Above index disable only works on 2005, not 2000. Bonus points if you know a way to do this on 2000).
What about MySQL, and Access?
The single biggest thing that will kill performance here is the fact that (it sounds like) you're executing a million different INSERTs against the DB. Each INSERT is treated as a single operation. If you can do this as a single operation, then you will almost certainly have a huge performance improvement.
Both MySQL and SQL Server support 'selects' of constant expressions without a table name, so this should work as one statement:
INSERT INTO MyTable(ID, name)
SELECT 1, 'Fred'
UNION ALL SELECT 2, 'Wilma'
UNION ALL SELECT 3, 'Barney'
UNION ALL SELECT 4, 'Betty'
It's not clear to me if Access supports that, not having Access available. HOWEVER, Access does support constants in a SELECT, as far as I can tell, and you can coerce the above into ANSI SQL-92 (which should be supported by all 3 engines; it's about as close to 'DB agnostic' as you'll get) by just adding
FROM OneRowTable
to the end of every individual SELECT, where 'OneRowTable' is a table with just one row of dummy data.
This should let you insert a million rows of data in much much less than a million INSERT statements -- and things like index reshuffling will be done once, rather than a million times. You may have much less need for other optimisations after that.
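For example, the ANSI-friendly form might look like this, where OneRowTable is the hypothetical one-row dummy table described above:
-- Multi-row insert expressed as UNION ALL selects from a one-row dummy
-- table, staying close to ANSI SQL-92 for all three engines.
INSERT INTO MyTable (ID, name)
SELECT 1, 'Fred' FROM OneRowTable
UNION ALL SELECT 2, 'Wilma' FROM OneRowTable
UNION ALL SELECT 3, 'Barney' FROM OneRowTable
UNION ALL SELECT 4, 'Betty' FROM OneRowTable;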
Is this a regular process or a one-time event?
I have, in the past, just scripted out the current indexes, dropped them, inserted the rows, then re-added the indexes.
SQL Server Management Studio can script out the indexes from the right-click menus...
For SQL Server:
You can set the recovery model to "Simple", so your transaction log will be kept small. Do not forget to set it back afterwards.
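Something along these lines (the database name is a placeholder):
-- Switch to simple recovery for the bulk load, then switch back.
ALTER DATABASE MyDatabase SET RECOVERY SIMPLE;
-- ... do the inserts ...
ALTER DATABASE MyDatabase SET RECOVERY FULL;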
Disabling the indexes is actually a good idea. This will work on SQL 2005, not on SQL Server 2000.
alter index [INDEX_NAME] on [TABLE_NAME] disable
And to enable
alter index [INDEX_NAME] on [TABLE_NAME] rebuild
And then just insert the rows one by one. You have to be patient, but at least it is somewhat faster.
If it is a one-time thing (or it happens often enough to justify automating this), also consider dropping/disabling all indexes, and then adding/re-enabling them again when the insert is done.
The trouble with setting the recovery model to simple is that it affects any other users entering data at the same time and thus will make their changes unrecoverable.
Same thing with disabling the indexes: this disables them for everyone and may make the database run slower than a slug.
Suggest you run the import in batches.
If this is not something that needs to be read terribly quickly, you can do an "Insert Delayed" into the table on MySQL. This allows your code to continue running without having to wait for the insert to actually happen. This does have some limitations, but if your primary concern is to get the program to finish quickly, this may help. Be warned that there is a nice long list of situations where this may not act as expected. Check the docs.
I do not know if this functionality works for Access or MS SQL, though.
Have you considered using the Factory pattern? I'm guessing you're writing the code for this, so using the factory pattern you could code up a factory that returned a concrete "IDataInserter"-type class that would do the work for you.
This would still allow you to be data agnostic and get the fastest method for each type of database.
SQL Server 2000/2005, MySQL, and Access can all load directly from a tab/CR-delimited text file; they just have different commands to do it. If you've got the case statement to determine which DB you're importing into, just figure out each one's preferred way of importing a text file.
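For instance, the per-engine load commands might look roughly like this (file paths and table names are made up; Access would typically go through its import wizard or TransferText instead):
-- SQL Server: bulk load from a tab-delimited file.
BULK INSERT dbo.MyTable
FROM 'C:\data\rows.txt'
WITH (FIELDTERMINATOR = '\t', ROWTERMINATOR = '\n');

-- MySQL: the equivalent LOAD DATA statement.
LOAD DATA INFILE '/tmp/rows.txt'
INTO TABLE MyTable
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n';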
Can you use DTS (2000) or SSIS (2005) to build a package to do this? DTS and SSIS can both pull from the same source and pipe out to the different potential destinations. Go for SSIS if you can. There's a lot of good, fast technology in there along with functionality to embed the IsSQLServer, IsMySQL, etc. logic.
It's worth considering breaking your inserts into smaller batches; a single transaction with lots of queries will be slow.
You might consider using SQL's bulk-logged recovery model during your bulk insert.
http://msdn.microsoft.com/en-us/library/ms190422(SQL.90).aspx
http://msdn.microsoft.com/en-us/library/ms190203(SQL.90).aspx
You might also disable the indexes on the target table during your inserts.