MySQL pause index rebuild on bulk INSERT without TRANSACTION

I have a lot of data to INSERT LOW_PRIORITY into a table. As the index is rebuilt every time a row is inserted, this takes a long time. I know I could use transactions, but this is a case where I don't want the whole set to fail if just one row fails.
Is there any way to get MySQL to stop rebuilding indices on a specific table until I tell it that it can resume?
Ideally, I would like to insert 1,000 rows or so, let the index do its thing, and then insert the next 1,000 rows.
I cannot use INSERT DELAYED as my table type is InnoDB. Otherwise, INSERT DELAYED would be perfect for me.
Not that it matters, but I am using PHP/PDO to access MySQL. Any advice you could give would be appreciated. Thanks!

ALTER TABLE tableName DISABLE KEYS
-- perform inserts
ALTER TABLE tableName ENABLE KEYS
This disables updating of all non-unique indexes. The disadvantage is that those indexes won't be used for SELECT queries either. Note that DISABLE KEYS only has an effect on MyISAM tables, so it won't help with the InnoDB table in the question.
You can however use multi-inserts (INSERT INTO table(...) VALUES(...),(...),(...)), which will also update indexes in batches.
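A minimal PHP/PDO sketch of that batching approach (the $pdo connection, the $rows array, and the items(a, b) table are illustrative assumptions, not from the question):
// Build one multi-row INSERT per chunk of 1,000 rows.
// $pdo, $rows and the items(a, b) table are illustrative assumptions.
foreach (array_chunk($rows, 1000) as $chunk) {
    $placeholders = implode(',', array_fill(0, count($chunk), '(?,?)'));
    $stmt = $pdo->prepare("INSERT INTO items (a, b) VALUES $placeholders");
    $params = [];
    foreach ($chunk as $row) {
        $params[] = $row['a'];
        $params[] = $row['b'];
    }
    $stmt->execute($params);
}
This amortizes the per-statement overhead, and if one chunk fails, only that chunk needs to be retried (for example row by row) rather than the whole data set.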

AFAIK, for those that use InnoDB tables, if you don't want indexes to be rebuilt after each INSERT, you must use transactions.
For example, for inserting a batch of 1000 rows, use the following SQL:
SET autocommit=0;
-- insert the rows one after the other, or using multi-row inserts
COMMIT;
By disabling autocommit, a transaction will be started at the first INSERT. Then, the rows are inserted one after the other and at the end, the transaction is committed and the indexes are rebuilt.
If an error occurs during execution of one of the INSERT, the transaction is not rolled back but an error is reported to the client which has the choice of rolling back or continuing. Therefore, if you don't want the entire batch to be rolled back if one INSERT fails, you can log the INSERTs that failed and continue inserting the rows, and finally commit the transaction at the end.
However, take into account that wrapping the INSERTs in a transaction means you will not be able to see the inserted rows until the transaction is committed. It is possible to set the transaction isolation level for the SELECT to READ UNCOMMITTED, but as I've tested it, the rows are not visible when the SELECT happens very close to the INSERT. See my post.
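A sketch of that log-and-continue pattern, assuming a PDO connection $pdo in exception mode and an illustrative items(a, b) table:
// SET autocommit=0 opens a transaction at the first INSERT; failed
// INSERTs are logged and skipped, and everything commits at the end.
// $pdo, $rows and items(a, b) are illustrative assumptions.
$pdo->exec("SET autocommit=0");
$stmt = $pdo->prepare("INSERT INTO items (a, b) VALUES (?, ?)");
$failed = [];
foreach ($rows as $i => $row) {
    try {
        $stmt->execute([$row['a'], $row['b']]);
    } catch (PDOException $e) {
        $failed[] = $i; // log the failed row and keep inserting
    }
}
$pdo->exec("COMMIT"); // the successful rows become visible here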

Related

MariaDB. Use Transaction Rollback without locking tables

On a website, when a user posts a comment I do several queries, Inserts and Updates. (On MariaDB 10.1.29)
I use START TRANSACTION so if any query fails at any given point I can easily do a rollback and delete all changes.
Now I noticed that this locks the tables against INSERTs from other connections, and I'm not talking about while the query is running, that's obvious, but until the transaction is closed.
DELETE is only blocked if the rows share a common index key (comments for the same page), but luckily UPDATE is not blocked.
Can I run a transaction that does not lock the table against new inserts (while the transaction is ongoing, not just during the actual query), or is there any other method that lets me conveniently "undo" any query done after some point?
PS:
I start the transaction with PHP's mysqli_begin_transaction() without any of the flags, and then mysqli_commit().
I don't think that a simple INSERT would block other inserts for longer than the insert time. AUTO_INC locks are not held for the full transaction time.
But if two transactions try to UPDATE the same row like in the following statement (two replies to the same comment)
UPDATE comment SET replies=replies+1 WHERE com_id = ?
the second one will have to wait until the first one is committed. You need that lock to keep the count (replies) consistent.
I think all you can do is keep the transaction time as short as possible. For example, you can prepare all statements before you start the transaction, but that is a matter of milliseconds. If you transfer files and that can take 40 seconds, then you shouldn't do it while the database transaction is open. Transfer the files before you start the transaction and save them with a name that indicates that the operation is not complete. You can also save them in a different folder, but on the same partition. Then, when you run the transaction, you just need to rename the files, which should not take much time. From time to time you can clean up and remove unrenamed files.
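As a sketch of that pattern with mysqli (the paths, table and column names are illustrative assumptions; mysqli must be configured to throw exceptions, e.g. via mysqli_report()):
// Do the slow work (file transfer) before the transaction; the
// transaction itself only holds quick statements plus a rename.
$tmp = 'uploads/incomplete_' . uniqid() . '.jpg';
move_uploaded_file($_FILES['img']['tmp_name'], $tmp); // slow part, no transaction open yet

mysqli_begin_transaction($db);
try {
    $stmt = mysqli_prepare($db, "INSERT INTO comment (com_text) VALUES (?)");
    mysqli_stmt_bind_param($stmt, 's', $text);
    mysqli_stmt_execute($stmt);
    rename($tmp, 'uploads/' . mysqli_insert_id($db) . '.jpg'); // fast
    mysqli_commit($db);
} catch (Throwable $e) {
    mysqli_rollback($db); // the unrenamed temp file is cleaned up later
}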
All write operations work in similar ways -- they lock the rows that they touch (or might touch) from the time the statement is executed until the transaction is closed via either COMMIT or ROLLBACK. SELECT ... FOR UPDATE and SELECT ... LOCK IN SHARE MODE also acquire such locks.
When a write operation occurs, deadlock checking is done.
In some situations, there is "gap" locking. Did com_id happen to be the last id in the table?
Did you leave out any SELECTs that needed FOR UPDATE?

MySQL Transaction in Cron job

I have a PHP DAEMON on my Ubuntu server doing huge data inserts into InnoDB. The very same tables are also being used by people using the platform.
The DAEMON, when not running in TRANSACTION mode, takes about 60-70 secs for 100,000 inserts. When running in TRANSACTION mode (BEGIN ... COMMIT) it takes 15-20 seconds.
However, will TRANSACTION mode lock the tables and prevent users of the platform from doing inserts while the DAEMON TRANSACTION is being performed?
Locking the tables the users are manipulating for over 20 seconds is, of course, not desirable :)
Well, I'm doing inserts in batches of 500 inside a FOR loop, INSERT INTO (col1, col2) VALUES (a,b) etc. This is fine and runs smoothly; however, I'm able to speed up the process significantly if I issue a BEGIN before the loop and a COMMIT after the loop, but then the time between BEGIN and COMMIT is over 60 seconds. Meanwhile, while the system is doing a few hundred thousand inserts, people using the platform can do inserts into the very same table. Will the system-generated inserts block the user inserts, or will the users have to wait XX seconds before their insert is processed?
Based on your description, you use InnoDB with the default autocommit mode enabled and you insert records one by one in a loop. Autocommit mode means that each insert is encapsulated into its own transaction, which is fine, but very slow, since each record is persisted separately to disk.
If you wrap the loop that inserts the records within BEGIN ... COMMIT statements, all inserts are run within a single transaction and are persisted to disk only once, when the COMMIT is issued; this is why you experience the speed gain.
Regardless of which way you insert the records, InnoDB will use locks. However, InnoDB only locks the record being inserted:
INSERT sets an exclusive lock on the inserted row. This lock is an index-record lock, not a next-key lock (that is, there is no gap lock) and does not prevent other sessions from inserting into the gap before the inserted row.
Prior to inserting the row, a type of gap lock called an insert intention gap lock is set. This lock signals the intent to insert in such a way that multiple transactions inserting into the same index gap need not wait for each other if they are not inserting at the same position within the gap. Suppose that there are index records with values of 4 and 7. Separate transactions that attempt to insert values of 5 and 6 each lock the gap between 4 and 7 with insert intention locks prior to obtaining the exclusive lock on the inserted row, but do not block each other because the rows are nonconflicting.
This means, that having a transaction open for a longer period of time that only inserts records will not interfere with other users inserting records into the same table.
Please note that issuing single INSERT statements in a loop is the least efficient way of inserting a larger amount of data into MySQL.
Either use bulk inserts (build a single multi-row INSERT statement in the loop and execute it after the loop, paying attention to the max_allowed_packet setting):
INSERT statements that use VALUES syntax can insert multiple rows. To do this, include multiple lists of column values, each enclosed within parentheses and separated by commas. Example:
INSERT INTO tbl_name (a,b,c) VALUES(1,2,3),(4,5,6),(7,8,9);
Or use the LOAD DATA INFILE statement.
These two solutions can significantly speed up the data insertion and will not cause table locks either.
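If the data can be written to a file first, a hedged sketch of LOAD DATA via PDO (the file path, the items(a, b) table and the CSV layout are illustrative assumptions; local_infile must be enabled on both the server and the client):
// Load a CSV file in one statement; much faster than row-by-row inserts.
// Path, table and column names are illustrative assumptions.
$pdo = new PDO($dsn, $user, $pass, [PDO::MYSQL_ATTR_LOCAL_INFILE => true]);
$pdo->exec("LOAD DATA LOCAL INFILE '/tmp/rows.csv'
            INTO TABLE items
            FIELDS TERMINATED BY ','
            LINES TERMINATED BY '\\n'
            (a, b)");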
Plan A: LOAD DATA. Drawback: This requires writing the data to a file. If it is already in a file, then this is the best approach.
Plan B: "Batched INSERTs" -- Build INSERT INTO t (a,b) VALUES (1,2), (3,4), ... and execute them. Do it in batches of 100-1000. This will be even faster than BEGIN..COMMIT around lots of 1-row INSERTs. Have autocommit=ON. Locking/blocking will be minimal since each 'transaction' will be only 100-1000 row's worth.
Let's see SHOW CREATE TABLE. INDEXes, especially UNIQUE indexes have an impact on the performance. We can advise further.
If this is a "Data Warehouse" application, then we should talk about "Summary Tables". These would lighten the load crated the 'readers' significantly and cut back on the need for indexes on the Fact table and prevent locking/blocking because they would be reading a different table.
Also, UUIDs are terrible for performance.
How big is the table? How much RAM do you have? What is the value of innodb_buffer_pool_size?

Avoiding SQL deadlock in insert combined with select

I'm trying to insert pages into a table with a sort column that I auto-increment by 2000 in this fashion:
INSERT INTO pages (sort,img_url,thumb_url,name,img_height,plank_id)
SELECT IFNULL(max(sort),0)+2000,'/image/path.jpg','/image/path.jpg','name',1600,'3'
FROM pages WHERE plank_id = '3'
The trouble is I trigger these inserts on the upload of images, so 5-10 of these queries are run almost simultaneously. This triggers a deadlock on some files, for some reason.
Any idea what is going on?
Edit: I'm running MySQL 5.5.24 and InnoDB. The sort column has an index.
What I did for myself was to set sort to 0 on insert, retrieve the id of the inserted row, and then set sort to id*2000. But you can also try to use transactions:
BEGIN;
INSERT INTO pages (sort,img_url,thumb_url,name,img_height,plank_id)
SELECT IFNULL(max(sort),0)+2000,'/image/path.jpg','/image/path.jpg','name',1600,'3'
FROM pages WHERE plank_id = '3';
COMMIT;
Note that not all of the MySQL client libraries support multi-queries, so you may have to execute the statements separately, but over the same connection.
Another approach is to lock the whole table for the time the INSERT is executed, but this will cause queries to queue up, because they will have to wait until the insert is performed.
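The insert-then-update workaround mentioned above might look like this with PDO (names taken from the question, assuming the pages table has an auto-increment id column as the workaround implies; error handling omitted for brevity):
// Insert with sort=0, then derive sort from the auto-increment id.
// This sidesteps the SELECT of max(sort) that the concurrent inserts contend on.
$pdo->beginTransaction();
$pdo->prepare("INSERT INTO pages (sort, img_url, thumb_url, name, img_height, plank_id)
               VALUES (0, ?, ?, ?, ?, ?)")
    ->execute(['/image/path.jpg', '/image/path.jpg', 'name', 1600, '3']);
$id = $pdo->lastInsertId();
$pdo->prepare("UPDATE pages SET sort = ? WHERE id = ?")->execute([$id * 2000, $id]);
$pdo->commit();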

MySQL SQL_NO_CACHE not working

I have a table (InnoDB) that gets inserted, updated and read frequently (usually in bursts a few milliseconds apart). I noticed that sometimes the SELECT statement that follows an INSERT/UPDATE would get stale data. I assumed this was due to caching, but putting SQL_NO_CACHE in front of it doesn't really do anything.
How do you make sure that the SELECT always wait until the previous INSERT/UPDATE finishes and not get the data from cache? Note that these statements are executed from separate requests (not within the same code execution).
Maybe I am misunderstanding what SQL_NO_CACHE actually does...
UPDATE:
@Uday, the INSERT, SELECT and UPDATE statements look like this:
INSERT myTable (id, startTime) VALUES(1234, 123456)
UPDATE myTable SET startTime = 123456 WHERE id = 1234
SELECT SQL_NO_CACHE * FROM myTable ORDER BY startTime
I tried using transactions with no luck.
More UPDATE:
I think this is actually a problem with INSERT, not UPDATE. The SELECT statement always tries to get the latest row sorted by time. But since INSERT does not do any table-level locking, it's possible that SELECT will get old data. Is there a way to force table-level locking when doing INSERT?
The query cache isn't the issue. Writes invalidate the cache.
MySQL gives priority to writes and with the default isolation level (REPEATABLE READ), your SELECT would have to wait for the UPDATE to finish.
INSERT can be treated differently if you have CONCURRENT INSERTS enabled for MyISAM; also, InnoDB uses record locking, so it doesn't have to wait for inserts at the end of the table.
Could this be a race condition then? Are you sure your SELECT occurs after the UPDATE? Are you reading from a replicated server where perhaps the update hasn't propagated yet?
If the issue is with the concurrent INSERT, you'll want to disable CONCURRENT INSERT on MyISAM, or explicitly lock the table with LOCK TABLES during the INSERT. The solution is the same for InnoDB: explicitly lock the table on the INSERT with LOCK TABLES.
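If you do go the explicit-locking route, a minimal sketch with PDO (table and column names taken from the question; note this serializes every other session touching the table):
// Explicit table lock around the INSERT; use sparingly, as it blocks
// all other readers and writers of myTable until UNLOCK TABLES.
$pdo->exec("LOCK TABLES myTable WRITE");
try {
    $pdo->prepare("INSERT INTO myTable (id, startTime) VALUES (?, ?)")
        ->execute([1234, 123456]);
} finally {
    $pdo->exec("UNLOCK TABLES");
}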
A) If you don't need the caching at all (for any SELECT), disable the query cache completely.
B) If you want this for only one session, you can run "SET SESSION query_cache_type=0;", which will set it for that particular session.
Use SQL_NO_CACHE additionally in either case.

Why does MySQL autoincrement increase on failed inserts?

A co-worker just made me aware of a very strange MySQL behavior.
Assuming you have a table with an auto_increment field and another field that is set to unique (e.g. a username field). When trying to insert a row with a username that's already in the table, the insert fails, as expected. Yet the auto_increment value is increased, as can be seen when you insert a valid new entry after several failed attempts.
For example, when our last entry looks like this...
ID: 10
Username: myname
...and we try five new entries with the same username value, our next successful insert will create a new row like so:
ID: 16
Username: mynewname
While this is not a big problem in itself, it seems like a very silly attack vector to kill a table by flooding it with failed insert requests, since the MySQL Reference Manual states:
"The behavior of the auto-increment mechanism is not defined if [...] the value becomes bigger than the maximum integer that can be stored in the specified integer type."
Is this expected behavior?
InnoDB is a transactional engine.
This means that in the following scenario:
Session A inserts record 1
Session B inserts record 2
Session A rolls back
there is either a possibility of a gap, or session B would have to block until session A committed or rolled back.
InnoDB's designers (like most other transactional engine designers) chose to allow gaps.
From the documentation:
When accessing the auto-increment counter, InnoDB uses a special table-level AUTO-INC lock that it keeps to the end of the current SQL statement, not to the end of the transaction. The special lock release strategy was introduced to improve concurrency for inserts into a table containing an AUTO_INCREMENT column.
…
InnoDB uses the in-memory auto-increment counter as long as the server runs. When the server is stopped and restarted, InnoDB reinitializes the counter for each table for the first INSERT to the table, as described earlier.
If you are afraid of the id column wrapping around, make it BIGINT (8-byte long).
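To make the gap concrete, here is a demonstration sketch with PDO (assuming a users table with an auto_increment id and a UNIQUE username column, as in the question, and a connection in PDO::ERRMODE_EXCEPTION):
// Each failed duplicate-key INSERT still consumes an auto-increment id.
$insert = $pdo->prepare("INSERT INTO users (username) VALUES (?)");
$insert->execute(['myname']);         // suppose this row gets id 10
for ($i = 0; $i < 5; $i++) {
    try {
        $insert->execute(['myname']); // duplicate key: fails, burns an id
    } catch (PDOException $e) { /* expected */ }
}
$insert->execute(['mynewname']);      // gets id 16, not 11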
Without knowing the exact internals, I would say yes, the auto-increment SHOULD allow for skipped values due to failed inserts. Let's say you are doing a banking transaction, or another one where the entire transaction and multiple records go as an all-or-nothing unit. If you try your insert, get an ID, then stamp all subsequent details with that transaction ID and insert the detail records, you need to ensure your qualified uniqueness. If you have multiple people slamming the database, they too will need to ensure they get their own transaction ID so as not to conflict with yours when their transaction gets committed. If something fails in the first transaction, no harm done, and no dangling elements downstream.
Old post, but this may help people: you may have to set innodb_autoinc_lock_mode to 0 or 2.
System variables that take a numeric value can be specified as --var_name=value on the command line or as var_name=value in option files.
Command-Line parameter format:
--innodb-autoinc-lock-mode=0
OR
Open your my.cnf (my.ini on Windows) and add the following line:
innodb_autoinc_lock_mode=0
I know that this is an old article, but since I also couldn't find the right answer, here is a way to do it: wrap your query in an if statement. It's usually INSERT queries, or INSERT ... ON DUPLICATE KEY queries, that mess up the organized auto-increment order, so for regular inserts check for an existing row first:
// Check-then-insert so the INSERT can't fail on a duplicate key.
// Assumes a PDO connection $pdo; table/column names are illustrative.
$check = $pdo->prepare("SELECT 1 FROM users WHERE email = ?");
$check->execute([$email]);
if ($check->fetchColumn() === false) {
    $pdo->prepare("INSERT INTO users (email) VALUES (?)")->execute([$email]);
}
Instead of INSERT ... ON DUPLICATE KEY UPDATE, an UPDATE ... SET ... WHERE query (inside or outside an if statement, it doesn't matter) also works, and a REPLACE INTO query also seems to work.