Does a MEMORY (HEAP) engine table perform the same as an InnoDB table in a database with a large buffer pool? I usually keep two tables: one InnoDB table with varchars and several rows, and a compact MEMORY table (5 rows, mostly just the PK and indexed ints) for heavy reads. I recently learned about the InnoDB buffer pool, so is my table-clone system overkill and useless, or is it still faster than InnoDB?
In-memory tables should be more performant, at least in theory: in InnoDB, even with a large buffer pool, you're going to have a block-based structure in the cache, so some blocks will only be partially full, and that's an overhead. Another reason is that in-memory tables don't have row versions or row locks, so, again, they use less memory. But beware: because in-memory tables don't have row-level locking, if you run large updates you may actually find that InnoDB is more scalable.
So, to sum up: MEMORY table - potentially less memory to store the same amount of data, InnoDB - potentially more scalable.
Everything needs to be measured for your particular case of course.
Perhaps if you need to store data in memory anyway, choose an in-memory database? (shameless plug).
Reads from the InnoDB buffer pool will be roughly as fast as reads from MEMORY tables.
In some cases, MEMORY tables could even outperform buffered InnoDB tables: the former also support hash indexes, whereas the latter only support B-tree indexes. Depending on the profile of your queries, you might get faster reads with hash indexes.
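To make the "profile of your queries" point concrete, here is a minimal Python sketch of the trade-off. It is not MySQL code: a dict stands in for a hash index and a sorted list for the ordered keys of a B-tree, and the key values are made up for illustration. It only shows that a hash structure answers equality lookups directly, while range lookups need ordered keys.

```python
import bisect

# Hypothetical data: integer keys 0, 2, 4, ..., 18 stand in for an indexed column.
keys = list(range(0, 20, 2))

# Hash-style index: a dict maps each key to its row position. Equality only.
hash_index = {key: pos for pos, key in enumerate(keys)}

# Ordered index: a sorted list stands in for a B-tree's ordered keys.
ordered_index = sorted(keys)


def hash_lookup(key):
    """Equality lookup: one hash probe, no ordering information."""
    return hash_index.get(key)


def range_lookup(low, high):
    """Range lookup (low <= key <= high) using binary search on the ordered keys."""
    start = bisect.bisect_left(ordered_index, low)
    end = bisect.bisect_right(ordered_index, high)
    return ordered_index[start:end]


print(hash_lookup(8))       # row position of key 8
print(range_lookup(5, 11))  # keys between 5 and 11
```

If your heavy reads are all of the form "WHERE id = ?", the hash side of this sketch is the relevant one; if they include ranges or ORDER BY on the indexed column, the ordered side is.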
Besides, buffered InnoDB pages can be evicted from the buffer pool if some other query requires the memory, or if the data is seldom used. By explicitly copying your data to a MEMORY table, you have the guarantee that your data will always be in memory.
I should also mention that regardless of the size of the buffer pool, updates to an InnoDB table will need to be flushed to disk at some stage. But I understand this does not apply in your use case.
Now this is theory. Only if this data is to be read very, very frequently should you bother with these considerations.
SELECT P_CODE, P_PRICE
FROM PRODUCT
WHERE P_PRICE >= (SELECT AVG(P_PRICE) FROM PRODUCT);
Will this query (under MySQL) result in two full table scans (from disk), or will the optimizer understand that it's faster (if there is enough RAM to hold the result set) to do only one full table scan? The table has no indexes.
Is it possible to read this information (somehow) from the output of the EXPLAIN command in MySQL?
The question is flawed because it rests on a misunderstanding of what a table scan actually is:
A table scan iterates over all rows in the table (irrespective of how it obtains those rows).
It also differs slightly from an index scan in that it works with the "full row", whereas an index scan has less overall data to process because it works with a subset of columns.
But the question is actually asking about the difference between physical and logical IO.
(from disk) or will the optimizer understand that it's faster to (if there is enough RAM to hold the result set)
Yes, the query will do 2 table scans. That cannot be avoided:
The server has to process the full set of prices twice.
And it has to finish processing AVG(P_PRICE) before it can start processing the WHERE filter.
However, a "logical" table scan does not necessarily require reading the data from disk twice. If all the data is in memory, the server can perform the table scan in memory. So although the second stage of processing must still perform a table scan, it can be more efficient by avoiding secondary disk access.
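The two-pass structure can be sketched in Python. This is only a model of the logic, not of how MySQL executes the query: the `products` rows are made-up stand-ins for the PRODUCT table, and `rows_visited` counts logical row reads regardless of whether the rows came from disk or memory.

```python
# Hypothetical rows standing in for the PRODUCT table: (P_CODE, P_PRICE).
products = [("A1", 10.0), ("B2", 30.0), ("C3", 20.0), ("D4", 40.0)]

rows_visited = 0


def scan(rows):
    """Yield every row, counting each visit as one logical row read."""
    global rows_visited
    for row in rows:
        rows_visited += 1
        yield row


def at_or_above_average(rows):
    # Pass 1: the average is only known after every price has been seen.
    prices = [price for _code, price in scan(rows)]
    avg_price = sum(prices) / len(prices)
    # Pass 2: filter each row against the finished average.
    return [(code, price) for code, price in scan(rows) if price >= avg_price]


result = at_or_above_average(products)
print(result, rows_visited)
```

Here every row is visited twice, which is the "logical" cost; whether those visits also cost physical disk reads depends entirely on what is already cached.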
Take a look at this question to see how to distinguish logical and physical IO on mysql:
For a MySQL query, how do you determine physical and logical I/O?
I'll add that, in theory, a server could choose to keep only the Price column in memory on the first pass, in which case it wouldn't need to perform a "full table scan" on the second pass.
However this is unlikely in practice as there's a benefit to keep all the data in memory for other future queries ... whatever columns they may wish to process.
Re your comment:
my assumption, when looking at the query, is that an optimizer should/would be able to determine that "this query reads the same data twice, after the first read I will put it into memory (if there is space) and use the in-memory data for the next part of the query, instead of asking the disk for it twice"
Well, at least in MySQL's InnoDB engine, something sort of like this happens. InnoDB can't really read pages directly from disk. It loads every requested page into RAM before doing data operations on it. The RAM is a preallocated area called the InnoDB buffer pool. This stores byte-for-byte copies of the pages from the on-disk tablespace, plus some metadata about them.
After reading a page, the buffer pool has no immediate need to evict it from RAM, unless other pages are requested and there's no space left in the buffer pool for them. So subsequent requests for the same pages may find the pages already residing in RAM. The more this happens, the better your performance overall.
You might have more data pages in your product table than can fit in your buffer pool. During a table scan, InnoDB will evict pages as needed to load the remaining set of pages for the table. If you have a table that is many times larger than your buffer pool, you can imagine that this results in quite a bit of "churn" as pages come in and out. If you can afford it, allocating more RAM to the buffer pool is a good way to improve performance.
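The "churn" can be illustrated with a toy page cache in Python. This is a sketch under loud assumptions: it uses plain LRU eviction, whereas InnoDB's real policy is a variation of LRU, and the `BufferPool` class, page counts, and `scan_twice` helper are all hypothetical. It scans a table twice and reports how many page requests were served from the cache versus "disk".

```python
from collections import OrderedDict


class BufferPool:
    """Toy page cache with plain LRU eviction (a simplification of InnoDB's policy)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, page_id):
        if page_id in self.pages:
            self.hits += 1
            self.pages.move_to_end(page_id)  # mark as most recently used
            return
        self.misses += 1  # a miss models a physical read from disk
        if len(self.pages) >= self.capacity:
            self.pages.popitem(last=False)  # evict the least recently used page
        self.pages[page_id] = True


def scan_twice(table_pages, pool_pages):
    """Scan a table of table_pages pages twice; return (hits, misses)."""
    pool = BufferPool(pool_pages)
    for _ in range(2):
        for page_id in range(table_pages):
            pool.read(page_id)
    return pool.hits, pool.misses


print(scan_twice(table_pages=100, pool_pages=100))  # table fits in the pool
print(scan_twice(table_pages=100, pool_pages=50))   # table is twice the pool size
```

In this model, when the table fits, the second scan is served entirely from memory; when the table is larger than the pool, sequential scanning evicts each page before it is needed again, so the second scan gets no benefit at all.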
All these facts about the buffer pool don't change the fact that your query will perform two table-scans. It is true that it will be faster to read the pages from the buffer pool than reading pages from disk. You can experiment:
Shut down your MySQL Server and start it back up again. The buffer pool should be empty at this point (unless you are using the feature to save the buffer pool on shutdown).
Run your query. It might take many seconds, because each page requested has to be read from disk before it can be used.
Run the same query again. It's faster! I've seen cases where this difference makes the performance about 4x faster in tests. I understand that RAM is typically thousands of times faster than disk, but I/O is not the only work the server does. It also depends on what other requests are occupying the disk bandwidth, and other factors.
The difference between disk speed and RAM speed is (more or less) a constant factor. No matter how large your dataset, the speed difference gives the same proportional advantage.
Indexes are much more important, because they turn a linear search, O(n), into a B-tree search, O(log n). As your dataset gets larger, the advantage of this becomes more dramatic. This is why there is so much emphasis on analyzing the complexity of algorithms in computer science.
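To put rough numbers on that, here is a small Python sketch comparing worst-case step counts. It assumes a plain binary search over sorted keys as a stand-in for an index lookup; a real B-tree has a much higher fan-out per node, so it would need even fewer levels. The row counts are just illustrative.

```python
def linear_comparisons(n):
    """Worst-case key comparisons to find one value by scanning n unsorted rows."""
    return n


def binary_comparisons(n):
    """Worst-case comparisons for binary search over n sorted keys (n >= 1).

    This equals floor(log2(n)) + 1, which is the bit length of n.
    """
    return n.bit_length()


for n in (1_000, 1_000_000, 17_000_000):
    print(n, linear_comparisons(n), binary_comparisons(n))
```

Going from a thousand rows to seventeen million multiplies the linear cost by 17,000, while the logarithmic cost grows from 10 steps to 25.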
Please explain how you could do this with only one table scan. It is not obvious.
The use of the AVG() function would typically result in two full scans. If you have an index, then one or both scans might use the index.
I have a table with 17 million rows. I need to grab 1 column of that table and insert it all into another table. Here's what I did:
INSERT IGNORE INTO table1(name) SELECT name FROM main WHERE ID < 500001
InnoDB executes in around 3 minutes and 45 seconds
However, MyISAM executes in just below 4 seconds. Why the difference?
I see everyone praising InnoDB but honestly I don't see how it's better for me. It's so much slower. I understand that it's great for integrity and whatnot, but many of my tables will not be updated (just read). Should I even bother with InnoDB?
The difference is most likely due to the configuration of InnoDB, which takes a bit more tweaking than MyISAM. The idea of InnoDB is to keep most of your data in memory, and to flush to or read from disk only when there are a few spare CPU cycles.
Should you even bother with InnoDB is a really good question. If you're going to keep using MySQL, it's highly recommended you get some experience with InnoDB. But if you're doing a quick-and-dirty job for a database that won't see a lot of traffic and you aren't worried about scale, then the ease of MyISAM may just be a win for you. InnoDB can be overkill in many instances where someone just wants a simple database.
but many of my tables will not be updated
You can still get a performance lift from InnoDB if you are doing 99% reading. If you configure your buffer pool size to hold your entire database in memory, InnoDB will NEVER have to go to disk to get your data, even if it misses the mysql query cache.
In MyISAM, there is a good chance you have to read the row from disk, and you're leaving the operating system to do the caching and optimization for you.
innodb-buffer-pool-size
My first guess is to check innodb_buffer_pool_size, which in older MySQL versions ships out of the box set to 8M. On a dedicated database server, it's recommended to have this around 80% of your total memory. Once you hit that limit (the buffer pool is full), InnoDB performance will drop significantly because it needs to flush something out of the buffer to make room for the new data, which can be expensive.
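As a rough sizing aid, the 80% rule of thumb can be turned into arithmetic. The Python sketch below is not a MySQL tool: the function names, the 16 GiB server, and the example data sizes are all assumptions for illustration, and the right fraction depends on what else runs on the machine.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def suggested_buffer_pool_bytes(total_ram_bytes, percent=80):
    """Return the buffer pool size implied by giving InnoDB a percentage of RAM."""
    return total_ram_bytes * percent // 100


def fits_in_pool(data_bytes, pool_bytes):
    """True if the whole dataset could stay resident in the buffer pool."""
    return data_bytes <= pool_bytes


# Hypothetical dedicated server with 16 GiB of RAM.
pool = suggested_buffer_pool_bytes(16 * GIB)
print(pool)                           # bytes to consider for innodb_buffer_pool_size
print(fits_in_pool(5 * MIB, pool))    # a small table fits comfortably
print(fits_in_pool(32 * GIB, pool))   # a 32 GiB dataset does not
```

The resulting number is what you would consider for innodb_buffer_pool_size in your configuration file; measure before and after, as with any tuning change.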
autocommit=0
Also, make sure autocommit is turned off while you load your table, or flushing will happen on every insert. You can turn it back on after you're done. It's a per-connection setting, so it's very safe.
Loading tables typically happens once
Think about whether you really want to tune your database to accommodate "inserting 17 million rows". How often do you do this? MyISAM might be quicker in this instance, but when you have 100 concurrent connections all reading and modifying this table at the same time, you'll find a well-tuned InnoDB will win and MyISAM will choke on table locks.
How MyISAM sees this operation
MyISAM will be very good at this without any tuning, because under the covers, you're simply appending each row to a file (and updating an index). Your OS and disk caching will handle all those performance problems.
How InnoDB sees this operation
InnoDB will know the table needs a write, so it throws the row into the insert buffer.
You give it no time before the next insert, so InnoDB has no time to deal with the buffer. It runs out of room and is forced to 'hold up' the insert while it writes to the buffer pool and updates indexes.
Next, your buffer pool fills up, and InnoDB is forced to 'hold up' the insert and flush some page out of the buffer pool to disk.
And you keep throwing inserts at it like crazy.
Note that when you do tune InnoDB to give you a mysql> prompt very fast after you do this, InnoDB will still be scrambling under the covers to catch up in its spare time, but it will be willing to execute a new transaction for you.
MUST READ:
http://www.mysqlperformanceblog.com/2007/11/01/innodb-performance-optimization-basics/
http://dev.mysql.com/doc/refman/5.0/en/innodb-tuning.html (see bulk data loading tips)
You're right to some extent. InnoDB is slower than MyISAM, but in which cases?
Not everything is made to meet everyone's requirements. InnoDB is a transactional storage engine while MyISAM is not. To be an ACID-compliant, transaction-aware storage engine, InnoDB pays a cost in response time.
Furthermore, InnoDB runs faster if it is properly tuned using my.ini or another configuration file.
In the end, I can see the following reasons why people praise InnoDB:
It is an ACID-compliant engine with transaction support
It takes row-level locks while working on a table, while MyISAM takes table-level locks
InnoDB is highly tunable for multi-core/multi-process machines to improve concurrency
Last but not least, a comment from my side: nothing can meet "everyone's" needs, so it depends solely on the scenario in which you're comparing the two engines.
Check out the MyISAM vs InnoDB comparison on Wikipedia.
http://en.wikipedia.org/wiki/Comparison_of_MySQL_database_engines
I run all my sites on InnoDB tables, which is working really well so far. Now I'd like to know what is going on in real time on my sites, so I store each pageview (page, referrer, IP, hostname, etc.) in an InnoDB table. There are about 100 inserts per second, and this table is only read once in a while when I'm browsing the logs.
I clean out the table every minute with a cron job that removes old items. This leaves about 35,000 rows in that table on average, with a size of about 5MB.
Would it be easier on the server if I were to transfer the InnoDB table to a MEMORY table? As far as I can see this would save a lot of disk IO, right? Restarting MySQL would result in a loss of data, but this does not matter in my case.
Question: In my case, would you recommend a MEMORY table over an InnoDB table?
Yes I would. The conditions you mention (a lot of writes, periodic purging of data, data persistence not required) make it pretty much an ideal candidate for MEMORY.
Please optimize your InnoDB settings:
As long as you have configured InnoDB to use enough memory to hold your entire table (with innodb_buffer_pool_size), and there is not excessive pressure from other InnoDB tables on the same server, the data will remain in memory. If you're concerned about write performance (and again barring other uses of the same system) you can reduce durability to drastically increase write performance by setting innodb_flush_log_at_trx_commit = 0 and disabling binary logging.
Using any sort of triggers with temporary tables will be a mess to maintain, and won't give you any benefits of transactionality on the temporary tables.
You can find more details right here:
http://dev.mysql.com/doc/refman/4.1/en/innodb-parameters.html#sysvar_innodb_flush_log_at_trx_commit
I have a database with about 30 tables, and 5 of them are write-intensive.
I'm considering two options:
Convert 5 write-intensive tables to use InnoDB engine and keep the rest on MyISAM engine
Convert all tables to use InnoDB engine.
I wonder which approach is better?
To be more specific
The reason I want to keep some tables on the MyISAM engine is that some of them have around 1,000,000 rows. I'm not sure how much slower queries like "SELECT COUNT(*)" will be on these tables after they are converted to InnoDB.
I haven't done a test. I'd prefer to get some advice from any of you before starting the switch.
These days, I always default to using InnoDB, especially on the write-intensive tables you mention where MyISAM suffers from full table locking. Here's a to-the-point comparison.
Reasons to use MyISAM:
Tables are really fast for select-heavy loads
Table level locks limit their scalability for write intensive multi-user environments.
Smallest disk space consumption
Fulltext index
Merged and compressed tables.
Reasons to use InnoDB:
ACID transactions
Row level locking
Consistent reads – allows you to reach excellent read write concurrency.
Primary key clustering – gives excellent performance in some cases.
Foreign key support.
Both index and data pages can be cached.
Automatic crash recovery – if the MySQL shutdown was unclean, InnoDB tables will still recover to a consistent state. No check/repair like MyISAM may require.
All updates have to pass through the transactional engine in InnoDB, which often decreases performance compared to non-transactional storage engines.
The above was taken from this site, which no longer seems to be working.
pros and cons for each.
for (1)
pros: less disk space usage, MyISAM is much faster for read-heavy access patterns
cons: memory must be shared between the InnoDB buffers and MyISAM key buffers. InnoDB tables are about 4x bigger than their MyISAM counterparts. Application code must be adapted for deadlock handling.
Just remember InnoDB will also lock if you're changing an indexed column or primary key.
Does anyone know how much memory MyISAM and InnoDB use? How do their memory usages compare when dealing with small tables vs. when dealing with bigger tables (up to 32 GB)?
I know InnoDB is heavier than MyISAM, but just how much more?
Any help would be appreciated.
Thanks,
jb
You can't compare them like that. Or at least, you shouldn't. Each one uses memory in a different way. This is especially true if you're tuning your DBs for performance.
MyISAM has specific buffers for indexes and it uses the OS disk buffer for caching other data. It doesn't make sense to have your buffers larger than the sum of your indexes, but the more memory you give it, the faster it will be.
InnoDB has a buffer pool for all data. You configure this based on your available memory and how much you want to give it. InnoDB buffers as much of your data in memory as possible. If you can fit the entire DB in memory, InnoDB will never read from disk. A lot of InnoDB databases see huge performance hits when the data size becomes larger than the buffer pool.
MySQL is very configurable. It's tunable to meet your needs. Typically, databases should be given as much memory as possible since they are almost always disk bound. More memory means more can be buffered.