MEMORY (HEAP) vs. InnoDB in a Read and Write Environment

I want to program a real-time application using MySQL.
It needs a small table (less than 10000 rows) that will be under heavy read (scan) and write (update and some insert/delete) load. I am really speaking of 10000 updates or selects per second. These statements will be executed on only a few (less than 10) open mysql connections.
The table is small and does not contain any data that needs to be stored on disk. So I ask which is faster: InnoDB or MEMORY (HEAP)?
My thoughts are:
Both engines will probably serve SELECTs directly from memory, as even InnoDB will cache the whole table. What about the UPDATEs? (innodb_flush_log_at_trx_commit?)
My main concern is the locking behavior: InnoDB row lock vs. MEMORY table lock. Will this present the bottleneck in the MEMORY implementation?
Thanks for your thoughts!

If you really need that many concurrent updates, InnoDB will almost certainly perform better, because HEAP (MEMORY) tables have only table-level locks, not row-level locks like InnoDB.
If you're starting from scratch I would investigate using MySQL 5.5 or Percona's XtraDB as they both contain many scalability improvements over the stock MySQL 5.1.

It's not just a question of row locks. InnoDB also has MVCC (http://en.wikipedia.org/wiki/Multiversion_concurrency_control), so readers won't even block writers.
But I think your question is missing the all-important detail: what sort of data are you storing? If you need to be able to recover after a crash, MEMORY is not an option.
If you don't need to recover post crash, then why are you using a database? Why not use something like memcached or redis?

Related

Which engine to be used for more than 100 insert query per second

I have read about the differences between MyISAM and InnoDB and their pros and cons.
But I am still unsure which engine I should use for a table (basically for tracking purposes) that receives 100+ INSERT queries per second.
I referred to "What's the difference between MyISAM and InnoDB?"
Based on my understanding, MyISAM locks the whole table for each insert, so InnoDB should be used for its row-level locking.
But on the other hand, the performance of MyISAM is 100 times better. So what is the optimal and correct choice, and why?
Simple code that does one-row INSERTs without any tuning maxes out at about 100 rows per second in any engine, especially InnoDB.
But, it is possible to get 1000 rows per second or even more.
The quick fix for InnoDB is to set innodb_flush_log_at_trx_commit = 2; that will uncork the main thing stopping InnoDB at 100 inserts/second using a commodity spinning disk. Setting innodb_buffer_pool_size to about 70% of available RAM is also important.
If a user is inserting multiple rows into the same table at the same time, then LOAD DATA or a batch Insert (INSERT ... VALUES (...), (...), ...) of 100 rows or more will insert ten times as fast. This applies to any Engine.
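As a rough illustration of the batch INSERT form mentioned above, here is a minimal Python sketch that builds one multi-row statement with placeholders. The table name `tracking` and its columns are hypothetical, and the `%s` placeholder style is the one commonly used by MySQL drivers for Python; the sketch only builds the statement and does not talk to a server.

```python
from typing import List, Sequence, Tuple


def build_batch_insert(table: str, columns: Sequence[str], row_count: int) -> str:
    """Return a single multi-row INSERT with %s placeholders for row_count rows."""
    if row_count < 1:
        raise ValueError("row_count must be at least 1")
    column_list = ", ".join(columns)
    # One "(%s, %s, ...)" group per row, repeated row_count times.
    one_row = "(" + ", ".join(["%s"] * len(columns)) + ")"
    all_rows = ", ".join([one_row] * row_count)
    return f"INSERT INTO {table} ({column_list}) VALUES {all_rows}"


def flatten(rows: Sequence[Tuple]) -> List:
    """Flatten row tuples into the flat parameter list the placeholders expect."""
    return [value for row in rows for value in row]


# Hypothetical tracking rows: (user_id, event)
rows = [(1, "login"), (2, "click"), (3, "logout")]
sql = build_batch_insert("tracking", ["user_id", "event"], len(rows))
params = flatten(rows)
print(sql)
# INSERT INTO tracking (user_id, event) VALUES (%s, %s), (%s, %s), (%s, %s)
```

With a real driver, `sql` and `params` would be passed together to a single execute call, so the server sees one statement instead of three.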
MyISAM is not 100 times as fast; it is not even 10 times as fast as InnoDB. Today (5.6 or newer), you would be hard pressed to find a well tuned application that is more than a little faster in MyISAM. You are, or will be, I/O-limited.
As for corruption -- No engine suffers from corruption except during a crash. A power failure may mangle MyISAM indexes, usually recoverably. Moreover, a batch insert could be half done. InnoDB will be clean -- the entire batch is done or none of it is done; no corruption.
ARCHIVE saves disk space, but costs CPU.
MEMORY is often faster because it has no I/O. But you have too much data for that Engine, correct?
MariaDB with TokuDB can probably run faster than anything I describe here; but you have not indicated the need for it.
100 rows inserted per second = 8M/day = 3 Billion/year. Will you be purging the data eventually? Will you be querying the data? Purging: Let's talk about PARTITION. Querying: Let's talk about Summary Tables.
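The growth figures above can be checked with a few lines of arithmetic. This sketch assumes a constant 100 rows per second around the clock and a 365-day year.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def rows_per_day(rows_per_second: int) -> int:
    """Rows accumulated in one day at a constant insert rate."""
    return rows_per_second * SECONDS_PER_DAY


def rows_per_year(rows_per_second: int) -> int:
    """Rows accumulated in a 365-day year at a constant insert rate."""
    return rows_per_day(rows_per_second) * 365


rate = 100
print(rows_per_day(rate))   # 8640000     (about 8M per day)
print(rows_per_year(rate))  # 3153600000  (about 3 billion per year)
```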
Indexing: Minimize the number of indexes. If you have a 'random' index, such as a UUID, and you have a billion rows, you will be stuck with 100 rows/second, regardless of which Engine and regardless of any tuning. Do I need to explain further?
If this is a queuing system, I say "Don't queue it, just do it."
Bottom line: Use InnoDB. Tune it. Use batch inserts. Avoid random indexes. etc.
You are correct that MyISAM is a faster choice if your operational use case is lots of insertions. But that answer can change drastically based on the kind of use you make of the data. If this is an archival application you might consider the ARCHIVE storage engine. It is best for write-once, read-rarely applications.
You should investigate INSERT DELAYED, as it allows your client programs to fire-and-forget these inserts rather than waiting for completion. This burns RAM in your mysqld process, though. If that style of operation meets your needs, this is a compelling reason to go with MyISAM, because INSERT DELAYED does not work with InnoDB tables. (Note that INSERT DELAYED was deprecated in MySQL 5.6.)
Beware indexes in the target table of your inserts. Maintaining indexes is a big part of the server's insert workload.
Don't forget to look into MariaDB. It's a compatible fork of MySQL with some more advanced storage engines and features.
I have experience with a similar application. In our case, the application scaled up beyond the original insert rate, and the server could not keep up. (It's always good when an application workload grows!) We ended up doing two things, one after the other:
Using a message queuing system, and running just a couple of processes to actually do the inserts. The original clients wrote their logging records to the message queue rather than directly to the database. (Amazon AWS's SQS is an example of such a queuing system.)
Reworking the insert process to use LOAD DATA INFILE to load great gobs of log rows at once.
(You probably have figured out that this kind of workload isn't feasible on a cheap shared hosting service or an AWS micro instance.)
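The queue-then-bulk-load approach described above boils down to buffering rows and handing them to the database in batches. Here is a minimal in-process sketch of that idea. The class `BatchingWriter` and its `flush` callback are hypothetical names; in a real system the callback would run a batch INSERT or LOAD DATA INFILE, and the buffer would typically be fed from a message queue rather than a loop.

```python
from typing import Callable, List, Tuple


class BatchingWriter:
    """Collect rows and pass them to a flush callback in fixed-size batches."""

    def __init__(self, flush: Callable[[List[Tuple]], None], batch_size: int = 100):
        self._flush = flush
        self._batch_size = batch_size
        self._pending: List[Tuple] = []

    def add(self, row: Tuple) -> None:
        """Buffer one row; flush automatically once the batch is full."""
        self._pending.append(row)
        if len(self._pending) >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        """Hand any buffered rows to the callback and clear the buffer."""
        if self._pending:
            batch, self._pending = self._pending, []
            self._flush(batch)


# Stand-in for the database: just record each batch that would be written.
written: List[List[Tuple]] = []
writer = BatchingWriter(written.append, batch_size=3)
for i in range(7):
    writer.add((i, "event"))
writer.flush()  # write out the final partial batch

print([len(batch) for batch in written])  # [3, 3, 1]
```

A time-based flush (for example, at least once per second) would usually be added as well, so that rows do not sit in the buffer indefinitely under light load.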

InnoDB Bottleneck: Relaxing ACID to Improve Performance

After noticing that our database has become a major bottleneck on our live production systems, I decided to construct a simple benchmark to get to the bottom of the issue.
The benchmark: I time how long it takes to increment the same row in an InnoDB table 3000 times, where the row is indexed by its primary key, and the column being updated is not part of any index. I perform these 3000 updates using 20 concurrent clients running on a remote machine, each with its own separate connection to the DB.
I'm interested in learning why the different storage engines I benchmarked, InnoDB, MyISAM, and MEMORY, have the profiles that they do. I'm also hoping to understand why InnoDB fares so poorly in comparison.
InnoDB (20 concurrent clients):
Each update takes 0.175s.
All updates are done after 6.68s.
MyISAM (20 concurrent clients):
Each update takes 0.003s.
All updates are done after 0.85s.
Memory (20 concurrent clients):
Each update takes 0.0019s.
All updates are done after 0.80s.
Thinking that the concurrency could be causing this behavior, I also benchmarked a single client doing 100 updates sequentially.
InnoDB:
Each update takes 0.0026s.
MyISAM:
Each update takes 0.0006s.
MEMORY:
Each update takes 0.0005s.
The actual machine is an Amazon RDS instance (http://aws.amazon.com/rds/) with mostly default configurations.
I'm guessing that the answer will be along the following lines: InnoDB fsyncs after each update (since each update is an ACID-compliant transaction), whereas MyISAM does not, since it doesn't even support transactions. MyISAM is probably performing all updates in memory and regularly flushing to disk, which is how its speed approaches that of the MEMORY storage engine. If this is so, is there a way to use InnoDB for its transaction support, but perhaps relax some constraints (via configuration) so that writes are done faster at the cost of some durability?
Also, any suggestions on how to improve InnoDB's performance as the number of clients increases? It is clearly scaling worse than the other storage engines.
Update
I found https://blogs.oracle.com/MySQL/entry/comparing_innodb_to_myisam_performance, which is precisely what I was looking for. Setting innodb-flush-log-at-trx-commit=2 allows us to relax ACID constraints (flushing to disk happens once per second) for the case where a power failure or server crash occurs. This gives us a similar behavior to MyISAM, but we still get to benefit from the transaction features available in InnoDB.
Running the same benchmarks, we see a 10x improvement in write performance.
InnoDB (20 concurrent clients):
Each update takes 0.017s.
All updates are done after 0.98s.
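To compare the 20-client runs on one scale, the totals reported in this question can be turned into throughput. This is only arithmetic on the numbers quoted here (3000 updates per run); the dictionary labels are my own.

```python
TOTAL_UPDATES = 3000

# Wall-clock seconds for all 3000 updates, as reported in the benchmark runs.
elapsed_seconds = {
    "InnoDB (default)": 6.68,
    "InnoDB (innodb-flush-log-at-trx-commit=2)": 0.98,
    "MyISAM": 0.85,
    "MEMORY": 0.80,
}

# Updates per second for each configuration.
throughput = {
    engine: TOTAL_UPDATES / seconds for engine, seconds in elapsed_seconds.items()
}

for engine, updates_per_second in throughput.items():
    print(f"{engine}: {updates_per_second:.0f} updates/s")

# Wall-clock speedup of relaxed InnoDB over default InnoDB.
speedup = elapsed_seconds["InnoDB (default)"] / elapsed_seconds[
    "InnoDB (innodb-flush-log-at-trx-commit=2)"
]
print(f"speedup: {speedup:.1f}x")
```

The per-update latency improves by about 10x (0.175s to 0.017s), while total wall-clock time improves by a little under 7x (6.68s to 0.98s).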
Any other suggestions?
We ran similar tests in our application and noticed that if no transaction is explicitly opened, each SQL statement is executed in its own transaction (autocommit), which takes much longer. If your business logic allows it, you can put several SQL statements inside one transaction block, reducing the overall ACID overhead. In our case, this approach gave a large performance improvement.
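A minimal sketch of that grouping idea: instead of letting every statement commit on its own, wrap several statements in one START TRANSACTION ... COMMIT block. The helper `in_transactions` and the `counters` table are hypothetical, and the sketch only produces the statement sequence rather than sending it to a server.

```python
from typing import Iterable, Iterator, List


def in_transactions(statements: Iterable[str], per_transaction: int) -> Iterator[List[str]]:
    """Yield blocks of statements, each wrapped in START TRANSACTION / COMMIT."""
    if per_transaction < 1:
        raise ValueError("per_transaction must be at least 1")
    block: List[str] = []
    for statement in statements:
        block.append(statement)
        if len(block) == per_transaction:
            yield ["START TRANSACTION"] + block + ["COMMIT"]
            block = []
    if block:  # final, partially filled block
        yield ["START TRANSACTION"] + block + ["COMMIT"]


# Hypothetical workload: five single-row updates.
updates = [f"UPDATE counters SET hits = hits + 1 WHERE id = {i}" for i in range(5)]
blocks = list(in_transactions(updates, per_transaction=2))

commit_count = sum(block.count("COMMIT") for block in blocks)
print(commit_count)  # 3 commits instead of 5 with autocommit
```

Fewer commits means fewer log flushes, which is where the saving comes from. The trade-off is that a failure rolls back the whole block, so the grouping has to match what the business logic tolerates.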

mysql innodb vs myisam inserts

I have a table with 17 million rows. I need to grab 1 column of that table and insert it all into another table. Here's what I did:
INSERT IGNORE INTO table1(name) SELECT name FROM main WHERE ID < 500001
InnoDB executes in around 3 minutes and 45 seconds
However, MyISAM executes in just below 4 seconds. Why the difference?
I see everyone praising InnoDB but honestly I don't see how it's better for me. It's so much slower. I understand that it's great for integrity and whatnot, but many of my tables will not be updated (just read). Should I even bother with InnoDB?
The difference is most likely due to the configuration of InnoDB, which takes a bit more tweaking than MyISAM. The idea of InnoDB is to keep most of your data in memory, and to flush to or read from disk only when you have a few spare CPU cycles.
Whether you should even bother with InnoDB is a really good question. If you're going to keep using MySQL, it's highly recommended that you get some experience with InnoDB. But if you're doing a quick-and-dirty job for a database that won't see a lot of traffic and you're not worried about scale, then the ease of MyISAM may just be a win for you. InnoDB can be overkill in many instances where someone just wants a simple database.
"but many of my tables will not be updated"
You can still get a performance lift from InnoDB if you are doing 99% reading. If you configure your buffer pool size to hold your entire database in memory, InnoDB will NEVER have to go to disk to get your data, even if it misses the mysql query cache.
In MyISAM, there is a good chance you have to read the row from disk, and you're leaving the operating system to do the caching and optimization for you.
innodb-buffer-pool-size
My first guess is to check innodb_buffer_pool_size, which ships out of the box set to 8M. It's recommended to set this to around 80% of your total memory. Once you hit that limit, InnoDB performance will drop significantly, because it needs to flush something out of the buffer to make room for the new data, which can be expensive.
autocommit=0
Also, make sure autocommit is turned off while you load your table, or flushing will happen on every insert. You can turn it back on after you're done. It's a per-connection setting, so it's very safe.
Loading tables typically happens once
Think about whether you really want to tune your database to accommodate "inserting 17 million rows". How often do you do this? MyISAM might be quicker in this instance, but when you have 100 concurrent connections all reading and modifying this table at the same time, you'll find that a well-tuned InnoDB will win and MyISAM will choke on table locks.
How MyISAM sees this operation
MyISAM will be very good at this without any tuning, because under the covers, you're simply appending each row to a file (and updating an index). Your OS and disk caching will handle all those performance problems.
How InnoDB sees this operation
InnoDB will know the table needs a write, so it throws the row into the insert buffer.
You give it no time before the next insert, so InnoDB has no time to deal with the buffer. It runs out of room and is forced to 'hold up' the insert while it writes to the buffer pool and updates indexes.
Next, your buffer pool fills up, and InnoDB is forced to 'hold up' the insert and flush some page out of the buffer pool to disk.
And you keep throwing inserts at it like crazy.
Note that when you do tune InnoDB to give you a mysql> prompt very quickly after you do this, InnoDB will still be scrambling under the covers to catch up in its spare time, but it will be willing to execute a new transaction for you.
MUST READ:
http://www.mysqlperformanceblog.com/2007/11/01/innodb-performance-optimization-basics/
http://dev.mysql.com/doc/refman/5.0/en/innodb-tuning.html (see bulk data loading tips)
You're right to some extent. InnoDB is slower than MyISAM, but in which cases?
Nothing is made to meet everyone's requirements. InnoDB is a transactional storage engine, while MyISAM is not. The cost of making a storage engine ACID-compliant and transaction-aware is paid in response time.
Furthermore, InnoDB runs faster if it is properly tuned using my.ini or another configuration file.
In the end, these are the reasons I can see why people praise InnoDB:
It is ACID compliant and transaction supported engine
It takes row-level locks while working on a table, whereas MyISAM takes table-level locks
InnoDB is highly tunable for multi-core/multi-process machines to improve concurrency
One last comment from my side: nothing can meet "everyone's" needs, so it depends solely on the scenario in which you're comparing the two engines.
Check out the MyISAM vs. InnoDB comparison on Wikipedia.
http://en.wikipedia.org/wiki/Comparison_of_MySQL_database_engines

A lot of writes, but few reads - what MySQL storage engine to use?

I was wondering if anyone has a suggestion for what kind of storage engine to use. The program needs to perform a lot of writes to the database but very few reads.
[edit] No foreign keys necessary. The data is simple, but it needs to perform the writes very fast.
From jpipes:
MyISAM and Table-Level Locks
Unlike InnoDB, which employs row-level locking, MyISAM uses a much coarser-grained locking system to ensure that data is written to the data file in a protected manner. Table-level locking is the only level of lock for MyISAM, and this has a couple consequences:
Any connection issuing an UPDATE or DELETE against a MyISAM table will request an exclusive write lock on the MyISAM table. If no other locks (read or write) are currently placed on the table, the exclusive write lock is granted and all other connections issuing requests of any kind (DDL, SELECT, UPDATE, INSERT, DELETE) must wait until the thread with the exclusive write lock updates the record(s) it needs to and then releases the write lock.
Since there is only table-level locks, there is no ability (like there is with InnoDB) to only lock one or a small set of records, allowing other threads to SELECT from other parts of the table data.
The point is, for writing, InnoDB is better as it will lock less of the resource and enable more parallel actions/requests to occur.
"It needs to perform the writes very fast" is a vague requirement. Whatever you do, writes may be delayed by contention in the database. If your application needs to not block when it's writing audit records to the database, you should make the audit writing asynchronous and keep your own queue of audit data on disc or in memory (so you don't block the main worker thread/process)
InnoDB may allow concurrent inserts, but that doesn't mean they won't be blocked by contention for resources or internal locks for things like index pages.
MyISAM allows one inserter and several readers ("Concurrent inserts") under the following circumstances:
The table has no "holes in it"
There are no threads trying to do an UPDATE or DELETE
If you have an append-only table, which you recreate each day (or create a new partition every day if you use 5.1 partitioning), you may get away with this.
MyISAM concurrent inserts are mostly very good, IF you can use them.
When writing audit records, do several at a time if possible - this applies whichever storage engine you use. It is a good idea for the audit process to "batch up" records and do an insert of several at once.
You've not really given us enough information to make a considered suggestion - are you wanting to use foreign keys? Row-level locking? Page-level locking? Transactions?
As a general rule, if you want to use transactions, InnoDB/BerkeleyDB. If you don't, MyISAM.
In my experience, MyISAM is great for fast writes as long as, after insertion, it's read-only. It'll keep happily appending faster than any other option I'm familiar with (including supporting indexes).
But as soon as you start deleting records or updating index keys, and it needs to refill emptied holes (in tables or indexes) the discussion gets a lot more complicated.
For classic log-type or journal-type tables, though, it's very happy.

MySQL transaction support with mixed tables

It seems I will need transactions in MySQL, and I have no idea how I should manage transactions with mixed InnoDB/MyISAM tables. It all seems like a huge mess.
You might ask why I would ever want to mix the tables... the answer is PERFORMANCE. As many developers have noticed, InnoDB tables generally have bad performance, but in return give a higher isolation level, etc.
Does anyone have any advice regarding this issue?
I think you are overrating the performance difference between MyISAM and InnoDB. MyISAM is faster in data warehousing situations (such as full table scan reporting, etc..), but InnoDB can actually be faster in many cases with normal OLTP queries.
InnoDB is harder to tune since it has more knobs, but a properly tuned InnoDB system can often have higher throughput than MyISAM due to better locking and better I/O patterns.
Given that you can't have transactions in MyISAM tables, I am not sure what the actual problem is. Any data you need transactions for must be in an InnoDB table and you manage the transactions using whatever access library you are using or with manual SQL commands.
There are definite performance benefits of using exactly one engine.
A server tuned for one engine won't be tuned for the other - both require that you allocate a substantial amount of RAM to its exclusive use - therefore, you can't give them both an optimal amount.
Say you have 8G of RAM on your (obviously 64-bit, but still relatively small) database server. You might want to assign about 3/4 of it to your InnoDB page cache. Alternatively, if you're using MyISAM, you may want about half of it to be your key_buffer. You can't do both.
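The "you can't do both" point is simple arithmetic on the figures above. A small sketch, using only the 8G, 3/4 and one-half figures from the example:

```python
GIB = 1024 ** 3

total_ram = 8 * GIB

# Allocation each engine would like if it had the server to itself.
innodb_buffer_pool = total_ram * 3 // 4  # about 3/4 of RAM
myisam_key_buffer = total_ram // 2       # about half of RAM

combined = innodb_buffer_pool + myisam_key_buffer

print(innodb_buffer_pool // GIB, myisam_key_buffer // GIB, combined // GIB)  # 6 4 10
print(combined > total_ram)  # True: both allocations cannot fit in 8G
```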
Pick an engine and use it exclusively. There are ways of getting around performance problems - most of them aren't easy though (i.e. they require redesigning your data structure or your application).
The short answer is that there is no transaction support in MyISAM. If you start a transaction, add or modify data in some InnoDB tables, add or modify data in a MyISAM table, and then have to roll back, your MyISAM change cannot be undone. To support mixed engines like that, your application has to know that changes to whatever data is stored in MyISAM happen "outside" of the transaction.
If you need transactions for some processes, then isolate the data that must be transactionable and put all that data in InnoDB.
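The rollback behaviour described in this answer can be illustrated with a toy model. This is not MySQL code: `ToyTable` and `ToySession` are hypothetical classes that only mimic the observable effect, namely that a rollback undoes changes to transactional tables and leaves non-transactional ones as they are.

```python
from typing import List, Tuple


class ToyTable:
    """A table that either participates in transactions (like InnoDB) or not (like MyISAM)."""

    def __init__(self, transactional: bool):
        self.transactional = transactional
        self.rows: List[str] = []


class ToySession:
    """Applies inserts immediately and remembers how to undo the transactional ones."""

    def __init__(self) -> None:
        self._undo: List[Tuple[ToyTable, str]] = []

    def insert(self, table: ToyTable, row: str) -> None:
        table.rows.append(row)
        if table.transactional:
            # Only transactional tables keep undo information.
            self._undo.append((table, row))

    def rollback(self) -> None:
        # Undo in reverse order; non-transactional changes were never recorded.
        while self._undo:
            table, row = self._undo.pop()
            table.rows.remove(row)


orders = ToyTable(transactional=True)      # stands in for an InnoDB table
audit_log = ToyTable(transactional=False)  # stands in for a MyISAM table

session = ToySession()
session.insert(orders, "order 1")
session.insert(audit_log, "order 1 created")
session.rollback()

print(orders.rows)     # []
print(audit_log.rows)  # ['order 1 created']
```

After the rollback, the audit row refers to an order that no longer exists, which is exactly the inconsistency the application has to handle when it mixes engines.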