Is forking possible w/ InnoDB & unique records? - mysql

I am considering moving my MyISAM tables to InnoDB. I have a lot of tables with columns that must hold unique values, and I use Perl. If I switch to InnoDB (and thus take advantage of row-level locking rather than table-level locking) and fork several worker processes, will I run into problems with duplicate entries (since many rows will be inserted into the same table simultaneously)?

As long as you have UNIQUE indexes in place, no rows violating these constraints will be allowed.
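You might, however, run into concurrency issues when doing inserts within transactions. If two concurrent transactions insert rows with the same unique value, the second INSERT waits for the first transaction to finish; if the first one commits, the second fails with a duplicate-key error (in some interleavings InnoDB reports a deadlock instead). Your code has to be prepared to handle that error.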
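To make the point above concrete, here is a minimal sketch of several workers racing to insert the same unique values. It is not MySQL code: it uses Python threads and the standard-library sqlite3 module purely as a runnable stand-in, and the table name `items`, the column `code`, and the helper names are made up for illustration. With InnoDB and forked Perl processes, the same pattern would apply: let the UNIQUE index reject the duplicate and catch the driver's duplicate-key error.

```python
import os
import sqlite3
import tempfile
import threading


def insert_keys(db_path, keys, results):
    """Insert each key in its own transaction and count wins and rejections."""
    conn = sqlite3.connect(db_path, timeout=30)  # wait for other writers instead of failing
    inserted = 0
    duplicates = 0
    for key in keys:
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute("INSERT INTO items (code) VALUES (?)", (key,))
            inserted += 1
        except sqlite3.IntegrityError:
            # Another worker got there first; the UNIQUE index rejected our row.
            duplicates += 1
    conn.close()
    results.append((inserted, duplicates))


def run_demo(workers=4, keys_per_worker=25):
    """Let several workers insert the SAME set of keys and report the outcome."""
    fd, db_path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    try:
        setup = sqlite3.connect(db_path)
        setup.execute("CREATE TABLE items (code TEXT NOT NULL UNIQUE)")
        setup.commit()
        setup.close()

        keys = [f"code-{i}" for i in range(keys_per_worker)]
        results = []
        threads = [
            threading.Thread(target=insert_keys, args=(db_path, keys, results))
            for _ in range(workers)
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()

        check = sqlite3.connect(db_path)
        row_count = check.execute("SELECT COUNT(*) FROM items").fetchone()[0]
        check.close()
        total_inserted = sum(r[0] for r in results)
        total_duplicates = sum(r[1] for r in results)
        return row_count, total_inserted, total_duplicates
    finally:
        os.remove(db_path)
```

Every key ends up in the table exactly once no matter which worker wins; the losing workers simply see an error they can ignore or log.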

Uniqueness can be enforced by creating unique indexes; the database engine then takes care of it. Proper use of transactions also helps you avoid concurrency issues.

Related

Does InnoDB lock the whole table for a delete which uses part of a composed key?

I have a MySQL table (call it 'my_table') with a composed primary key with 4 columns (call them 'a', 'b', 'c' and 'd').
At least one time I encountered a deadlock on parallel asynchronous EJB calls calling 'DELETE FROM my_table where a=? and b=?' with different values, so I started to look into how InnoDB table locking works.
I've found no clear documentation on how table locking works with composed keys. Is the whole table locked by the delete, despite the fact that there's no overlap among the actual rows being deleted?
Do I need to do a select to recover the values for c and d and delete batches using the whole primary key?
This is in the context of a complex application which works with 4 different databases. Only MySQL seems to have this issue.
InnoDB never locks the entire table for DML statements. (Unless the DML is hitting all rows.)
There are other locks for DDL statements, such as when ALTER TABLE is modifying/adding columns/indexes/etc. (Some of these have been greatly sped up in MySQL 8.0.)
There is nothing special about a composite key wrt locking.
There is a thing called a "gap lock". For various reasons, the "gap" between two values in the index will be locked. This prevents potential conflicts such as inserting the same new value that does not yet exist, and there is a uniqueness constraint.
Since the PRIMARY KEY is a unique key, you may have hit something like that.
If practical, do SHOW ENGINE INNODB STATUS; to see whether the lock is "gap" or not.
Another thing that can happen is that a transaction starts out holding a weaker (shared) lock and later needs to upgrade it to "eXclusive". If two transactions do this on the same rows, the result can be a deadlock.
Do I need to do a select to recover the values for c and d and delete batches using the whole primary key?
I think you need to explain more precisely what you are doing. Provide the query. Provide SHOW CREATE TABLE.
InnoDB's lock handling is possibly unique to MySQL. It has some quirks. Sometimes it is a bit greedy about what it locks; to compensate, it is possibly faster than the competition.
In any case, check for deadlocks (and timeouts) and deal with them. The hope is that these problems are rare enough that handling them is not much of a performance burden.
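One way to "deal with them" is to re-run the whole transaction when the server reports a deadlock or a lock wait timeout. The sketch below is driver-agnostic: `DatabaseError` is a hypothetical stand-in for whatever exception your MySQL driver raises, and `run_with_retry` is a made-up helper name. The only MySQL-specific facts it relies on are the error numbers 1213 (deadlock) and 1205 (lock wait timeout). It assumes the callable starts a fresh transaction on every call and rolls back before re-raising.

```python
import random
import time

# MySQL error numbers: 1213 = ER_LOCK_DEADLOCK, 1205 = ER_LOCK_WAIT_TIMEOUT.
RETRYABLE_ERRNOS = {1213, 1205}


class DatabaseError(Exception):
    """Hypothetical stand-in for a driver exception that carries a MySQL errno."""

    def __init__(self, errno, message=""):
        super().__init__(message)
        self.errno = errno


def run_with_retry(transaction, max_attempts=5, base_delay=0.05, sleep=time.sleep):
    """Call transaction() and retry it on deadlock or lock wait timeout.

    Any other error, or a retryable error on the final attempt, is re-raised.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except DatabaseError as exc:
            if exc.errno not in RETRYABLE_ERRNOS or attempt == max_attempts:
                raise
            # Exponential backoff with jitter so competing workers
            # do not all retry at the same instant.
            delay = base_delay * (2 ** (attempt - 1)) * random.uniform(0.5, 1.5)
            sleep(delay)
```

The `sleep` parameter exists only so the backoff can be swapped out in tests; in normal use you would leave it at its default.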
DELETE FROM my_table where a=? and b=? means that potentially a large number of rows are being deleted. That means that the undo log and MVCC need to do a lot of work. Hence, I recommend trying not to delete (or update) more than 1K rows at a time.
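A rough sketch of deleting in chunks of about 1K rows, each chunk in its own short transaction. Again, sqlite3 is used only so the example runs anywhere; the function name `delete_in_batches` is made up. On MySQL you would not need the subquery, since a single-table `DELETE ... WHERE a=? AND b=? LIMIT 1000` does the same job.

```python
import sqlite3


def delete_in_batches(conn, a, b, batch_size=1000):
    """Delete all rows matching (a, b) in chunks, committing after each chunk.

    Returns (total_rows_deleted, number_of_batches).
    """
    total = 0
    batches = 0
    while True:
        with conn:  # one short transaction per batch
            cur = conn.execute(
                "DELETE FROM my_table WHERE rowid IN ("
                " SELECT rowid FROM my_table WHERE a = ? AND b = ? LIMIT ?)",
                (a, b, batch_size),
            )
            deleted = cur.rowcount
        if deleted == 0:
            break
        total += deleted
        batches += 1
    return total, batches


def build_demo_table():
    """Create an in-memory table with the 4-column primary key from the question."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE my_table (a INT, b INT, c INT, d INT, PRIMARY KEY (a, b, c, d))"
    )
    with conn:
        conn.executemany(
            "INSERT INTO my_table VALUES (1, 2, ?, 0)", [(c,) for c in range(2500)]
        )
        conn.executemany(
            "INSERT INTO my_table VALUES (1, 3, ?, 0)", [(c,) for c in range(100)]
        )
    return conn
```

Keeping each transaction small limits how much undo log a single statement generates and how long its locks are held, at the cost of the delete no longer being atomic as a whole.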

Is MyISAM Table locking in MySQL automatic?

Does MySQL automatically perform read/write table locks on MyISAM tables, or do I have to explicitly lock the tables?
Beyond making reasonable efforts to make certain statements atomic, MyISAM doesn't have the concept of transactions and its related row level locking.
Therefore, you should use LOCK TABLES to avoid race conditions or data inconsistencies when you use multiple statements (e.g. a SELECT statement followed by multiple related UPDATE statements).
See here about the pros and cons of MyISAM
MyISAM's About Internal Locking
MyISAM uses table-level locking. When a row is inserted or updated, all other changes to that table are held up until that request has been completed.
Correct me if I'm wrong.

When is MyISAM better than InnoDB?

I sometimes get asked in interviews: what benefits does InnoDB have over MyISAM, and when is MyISAM better than InnoDB? The first part of the question is clear: InnoDB is transactional, uses row-level locking instead of table-level locking, supports foreign keys, and so on. These points come to mind immediately.
But when is MyISAM really better than InnoDB?
MyISAM is better than InnoDB when you don't need those advanced features and storage speed is more important than other concerns. MyISAM also allows full-text searches to be performed inside the database engine itself (InnoDB only gained FULLTEXT indexes in MySQL 5.6), instead of needing to query results and then search them as an array or whatever in your application.
InnoDB is a reasonable choice if you need to store data with a high degree of fidelity with complicated interactions and relationships. MyISAM is a reasonable choice if you need to save or load a large number of records in a small amount of time.
I wouldn't recommend using MyISAM for data that matters. It's great for logging or comments fields or anything where you don't particularly care if a record vanishes into the twisting nether. InnoDB is good for when you care about your data, don't need fast searches and have to use MySQL.
It's also worth mentioning that InnoDB supports row-level locking, while MyISAM only supports table-level locking - which is to say that for many common situations, InnoDB can be dramatically faster due to more queries executing in parallel.
The bottom line: Use InnoDB unless you absolutely have to use MyISAM. Alternatively, develop against PostgreSQL and get the best of both.
MyISAM doesn't support transactions (and the other things mentioned) so it can work faster. MyISAM is a way to achieve higher performance in those situations when you do not need these features.
MyISAM supports full text, as mentioned, but also supports the MERGE table type. This is handy when you have a large table and would like to "swap" out/archive parts of it periodically. Think about a logging or report data that you want to keep the last quarter and/or year. MyISAM handles large amounts of data like this better, when you are mainly inserting and rarely updating or deleting.
InnoDB performance drops pretty quickly and dramatically once you can't fit the indexes in memory. If your primary key is not going to be a number (e.g. auto increment), then you may want to rethink using InnoDB. The primary key is stored in every secondary index on an InnoDB table, so if you have a large primary key and a few other indexes, your InnoDB table will get very large very quickly.
There are a few features that MySQL only has implemented for MyISAM (such as native fulltext indexing).
That said, InnoDB is still typically better for most production apps.
Also: full-text search in MySQL is only supported in MyISAM tables. (That was true before MySQL 5.6; from 5.6 onward InnoDB supports FULLTEXT indexes as well.)
MyISAM has a very simple structure, when compared with InnoDB. There is no row versioning, there's one file per table and rows are stored sequentially. However, while it supports concurrent inserts (SELECTs and 1 INSERT can run together), it also has table-level locks (if there are 2 INSERTs on the same table, 1 has to wait). Also, UPDATEs and DELETEs are slow because of the structure of the data files.
MyISAM doesn't support transactions or foreign keys.
Generally, MyISAM should be better if you work on general trends (so you don't care about the correctness of individual rows) and the data is updated at night or never. It also lets you move individual tables from one server to another via the filesystem.
InnoDB supports concurrency and transactions very well. It has decent support for fulltext and almost-decent support for foreign keys.

Is MySQL InnoDB appropriate for this scenario?

My MySQL database contains multiple MyISAM tables, with each table containing millions of rows. There is a heavy insert load on the database, so I cannot issue SELECTs on that live database. Instead, I create a replica of the database for queries and conduct analysis on that.
For the analysis, I need to issue multiple parallel queries. The queries are independent (i.e., the results of the queries are not combined together), but they operate on the same tables most of the time. As far as I know, the entire MyISAM table is locked for each query, which means parallel independent queries would be slow. Ideally, I would prefer an engine that supports "NO LOCKING". I am assuming MySQL doesn't have such an engine, so should I use InnoDB? I might be missing a lot of things here. Please suggest the right path to take.
Thanks
MyISAM read locks are compatible, so the SELECT queries won't lock each other.
If your analysis queries on the replica database don't write, only read, then it's OK to use MyISAM.
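You could stick to MyISAM and use INSERT DELAYED (note that it is deprecated as of MySQL 5.6, and later versions accept but ignore the DELAYED keyword):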
When a client uses INSERT DELAYED, it gets an okay from the server at once, and the row is queued to be inserted when the table is not in use by any other thread.
Another major benefit of using INSERT DELAYED is that inserts from many clients are bundled together and written in one block. This is much faster than performing many separate inserts.

Are Concurrent SQL inserts into the same table transactionally safe?

I have a simple table in MySQL whose raison d'être is to store logs.
The table has an auto-incremented sequence, and none of the other columns have any referential integrity to other tables. There are no unique keys or indexes on any columns. The auto-increment column is the primary key.
Will concurrent INSERTs ever interfere with each other? I define interference to mean losing data.
I am using autocommit=true for this insert.
You'll never lose data just because you do simultaneous inserts. If you use transactions you might "lose" some IDs, but no actual data. (Imagine that you start a transaction, insert a few rows and then do a rollback. InnoDB will have allocated the auto_increment IDs, but there are no rows with those IDs because you did the rollback.)
Since you don't need indexes, you should have a look at the ARCHIVE table engine. It's amazingly fast, and your tables get much smaller, which in turn makes the table scans when you read the table later MUCH faster.
From the MySQL manual for the MyISAM storage engine:
"MyISAM supports concurrent inserts..."
Yes. For InnoDB, more information here.