Questions:
1 - What is meant by "Overhead"? When I click the "Optimize table" button on a MyISAM table, the Overhead and Effective values are gone. I wonder what it does to my table?
2 - Do I actually need to care about the Overhead and Effective values? How do I fix the Overhead and Effective problem on an InnoDB table?
Fixing InnoDB is not as trivial as a click of a button; MyISAM is.
Under the hood, OPTIMIZE TABLE will do this to a MyISAM table called mytb (a rough SQL equivalent is sketched after the list):
Create empty temp table with same structure as mytb
Copy MyISAM data from mytb into the temp table
Drop table mytb
Rename temp table to mytb
Run ANALYZE TABLE against mytb and store index statistics
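In SQL terms, those five steps look roughly like this (a sketch only; mytb_temp is an illustrative name, and OPTIMIZE TABLE does all of this internally in one statement):
CREATE TABLE mytb_temp LIKE mytb;          -- empty temp table, same structure
INSERT INTO mytb_temp SELECT * FROM mytb;  -- copy the rows (dead space is left behind)
DROP TABLE mytb;
RENAME TABLE mytb_temp TO mytb;
ANALYZE TABLE mytb;                        -- recompute and store index statistics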
OPTIMIZE TABLE does not work that way with InnoDB for two major reasons:
REASON #1 : InnoDB Storage Layout
By default, InnoDB has innodb_file_per_table disabled. Everything InnoDB and its grandmother lands in ibdata1. Running OPTIMIZE TABLE does the following to an InnoDB table called mytb:
Create empty InnoDB temp table with same structure as mytb
Copy InnoDB data from mytb into the temp table
Drop table mytb
Rename temp table to mytb
Run ANALYZE TABLE against mytb and store index statistics
Unfortunately, the temp table used for shrinking mytb is appended to ibdata1. INSTANT GROWTH FOR ibdata1 !!! In light of this, ibdata1 will never shrink. To make matters worse, ANALYZE TABLE is useless (explained in REASON #2).
If you have innodb_file_per_table enabled, the first four (4) steps will work, since the data is not stored in ibdata1 but in an external tablespace file called mytb.ibd. That file can shrink.
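You can check the setting, and on MySQL 5.5+ enable it on the fly, like this (a sketch; on older servers the variable must go in my.cnf and needs a restart, and it only affects tables created or rebuilt afterwards):
SHOW VARIABLES LIKE 'innodb_file_per_table';
SET GLOBAL innodb_file_per_table = ON;  -- new/rebuilt tables get their own .ibd file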
REASON #2 : Index Statistics are Always Recomputed
InnoDB does not effectively store index statistics. In fact, if you run ANALYZE TABLE on mytb, statistics are created and stored. Unfortunately, by design, InnoDB will dive into the BTREE pages of its indexes, guesstimate key cardinalities, and use those numbers to prep the MySQL Query Optimizer. This is an ongoing process. In effect, ANALYZE TABLE is useless because the index statistics it calculates are overwritten as queries are executed against the table. I wrote about this in the DBA StackExchange June 21, 2011.
Percona explained this thoroughly in www.mysqlperformanceblog.com
As for overhead in MyISAM, that number can be figured out.
For a MyISAM table, the overhead represents internal fragmentation. This is quite common in a table that experiences INSERTs, UPDATEs, and DELETEs, especially if you have BLOB data or VARCHAR columns. Running OPTIMIZE TABLE makes such fragmentation disappear by copying to a temp table (naturally not copying empty space).
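If you want to see that number for yourself, the Overhead that phpMyAdmin displays corresponds to Data_free in the table metadata (a sketch; mydb is a placeholder schema name):
SELECT table_name, data_free AS overhead_bytes
FROM information_schema.tables
WHERE table_schema = 'mydb' AND engine = 'MyISAM';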
Going back to InnoDB, how do you effectively eliminate wasted space? You need to rearchitect ibdata1 to hold less info. Within ibdata1 you have four types of data:
Table Data Pages
Index Data Pages
Table MetaData
MVCC Data for Transactions
You can permanently move tables and indexes out of ibdata1 forever. What about data and indexes already housed in ibdata1?
Follow the InnoDB Cleanup Plan that I posted October 29, 2010 : Howto: Clean a mysql InnoDB storage engine?
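Note that once innodb_file_per_table is enabled, rebuilding a table moves its data into its own .ibd file, but ibdata1 itself keeps its size; only the full dump/delete/reload plan shrinks it. A sketch of the per-table rebuild (mydb.mytb is a placeholder):
ALTER TABLE mydb.mytb ENGINE=InnoDB;  -- rewrites mytb into its own mytb.ibd file;
                                      -- ibdata1 does not shrink, it just stops growing for this table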
In fact "OPTIMIZE TABLE" is a useless waste of time on MyISAM, because if you have to do it, your database is toast already.
It takes a very long time on large tables, and blocks write-access to the table while it does so. Moreover, it has very nasty effects on the myisam keycache etc.
So in summary
small tables never need "optimize table"
large tables can never use "optimize table" (or indeed MyISAM)
It is possible to achieve (roughly) the same thing in InnoDB simply using an ALTER TABLE statement which makes no schema changes (normally ALTER TABLE t ENGINE=InnoDB). It's not as quick as MyISAM, because it doesn't do the various small-table-which-fits-in-memory optimisations.
MyISAM also uses a bunch of index optimisations to compress index pages, which generally result in quite small indexes. InnoDB doesn't have these either.
If your database is small, you don't need it. If it's big, you can't really use MyISAM anyway (because an unplanned shutdown leaves the table needing a rebuild, which takes too long on large tables). Just don't use MyISAM if you need durability, reliability, transactions, any level of concurrency, or generally care about robustness in any way.
My client has a store on WooCommerce with a 1.2 GB database. I know that a similar store (counting by products) should be approximately 700 MB.
The biggest table is wp_posts (760 MB) alone! Which, I think, is strange. Usually the biggest table is wp_postmeta or wp_options.
I tried to optimize this database with plugins (WP-Sweep and WP-Optimize), so there are no revisions or drafts left.
I also tried SQL:
OPTIMIZE TABLE
but it is InnoDB, so it does not support it. I get this message:
Table does not support optimize, doing recreate + analyze instead
So is it done? I mean the "recreate + analyze", or should I still do it? And how?
I read that in InnoDB I should dump the table and restore it, but when I do this with DBeaver, I get the same size.
Any idea what I should do?
The error message is a bit misleading, because it dates back to the days when MyISAM was the default storage engine, and OPTIMIZE TABLE does a few things in MyISAM that are different from what it does in InnoDB. For example, MyISAM can't reclaim space from deleted rows until you do OPTIMIZE TABLE (whereas InnoDB does reclaim space dynamically).
InnoDB does support OPTIMIZE TABLE and it does useful things. It does basically the same as an ALTER TABLE when using the COPY algorithm. That is, it creates a new file, and copies the data row by row into the new file. This accomplishes defragmentation and rebuilding the indexes, just as if you had done a dump and restore. So you don't need to dump and restore.
After OPTIMIZE TABLE, the InnoDB table may be close to the same size it was before, if there was little fragmentation.
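To see whether the rebuild actually reclaimed anything, you can compare the table's footprint before and after (a sketch; the wordpress schema name is a placeholder):
SELECT table_name,
       ROUND(data_length / 1024 / 1024)  AS data_mb,
       ROUND(index_length / 1024 / 1024) AS index_mb,
       ROUND(data_free / 1024 / 1024)    AS free_mb  -- space still allocated but unused
FROM information_schema.tables
WHERE table_schema = 'wordpress' AND table_name = 'wp_posts';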
Frankly, a table 1.2GB in size is not so large by the standards of most MySQL projects I've worked on. We start to get concerned if a table is larger than 500GB, and we start alerting developers if the table is larger than 800GB, or larger than the remaining free disk space.
Back when I was working heavily with MyISAM tables, I always had a cronjob which ran
~# mysqlanalyze -o database
I know that MyISAM benefits from this in certain ways, e.g. fragmentation and whatnot.
Now, when running the same command on a database where the majority of tables are InnoDB, I wonder if this "does any good" to the tables and whether it is considered good practice to do so every now and then, or if it's rather counterproductive. I'm reading a lot of:
Table does not support optimize, doing recreate + analyze instead
Which sounds expensive with regard to disk I/O and CPU time?!
I would appreciate some input on this.
https://dev.mysql.com/doc/refman/8.0/en/optimize-table.html says:
For InnoDB tables, OPTIMIZE TABLE is mapped to ALTER TABLE ... FORCE, which rebuilds the table to update index statistics and free unused space in the clustered index.
This does do some good in cases when you had too much fragmentation. Pages will be filled more efficiently, indexes will be rebuilt, and disk space occupied by the table will be reduced if you use innodb_file_per_table (which is the default in recent versions).
It does take time, depending on the size of your table. It will lock the table while it's running. It will require extra disk space while it's running, as it creates a copy of the table.
Running OPTIMIZE TABLE on an InnoDB table is usually not necessary to do frequently; it's worthwhile only after you do a lot of inserts/updates/deletes against the table in a way that could result in fragmentation.
ANALYZE TABLE has much less impact for InnoDB. It doesn't require building a copy of the table. It's a read-only action: it just reads a random sample of pages from the table and uses that to estimate the number of rows and the average row size, and it updates statistics about the indexes to guide the query optimizer. This is safe to run anytime; it will lock the table for a moment, but the lock time doesn't grow with the size of the table.
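Side by side, the two commands compare like this (a sketch; mydb.mytb is a placeholder):
ANALYZE TABLE mydb.mytb;   -- read-only page sampling; updates index statistics; brief lock
OPTIMIZE TABLE mydb.mytb;  -- full table copy and rebuild; long lock, extra disk space needed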
Don't bother. InnoDB almost never needs either ANALYZE or OPTIMIZE; don't waste your time unless you have identified a need.
An exception is a FULLTEXT index on an InnoDB table. Such can benefit from DROP INDEX, then ADD INDEX.
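A sketch of that rebuild (ft_idx and the body column are hypothetical names):
ALTER TABLE mytb DROP INDEX ft_idx;
ALTER TABLE mytb ADD FULLTEXT INDEX ft_idx (body);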
If you are "reloading" the table from new data, then the following avoids downtime:
CREATE TABLE `new` LIKE `real`;
-- load `new` (e.g. INSERT INTO `new` SELECT ... or LOAD DATA INFILE ...)
RENAME TABLE `real` TO `old`, `new` TO `real`;  -- fast, atomic
DROP TABLE `old`;
(Caveat: The above technique probably has issues if there are FOREIGN KEYS.)
I have a MySQL table with 12 columns, one primary key, and two unique keys. I have more or less 86,000 rows/records in this table.
I use this mysql code:
INSERT INTO table (col2, col3, ..., col12) VALUES ($val2, $val3, ..., $val12) ON DUPLICATE KEY UPDATE col2 = VALUES(col2), col3 = VALUES(col3), ..., col12 = VALUES(col12)
When I view the structure of this table from cPanel phpMyAdmin, I can see an 'Optimize table' link just below the index information of the table. If I click the link, the table is optimized.
But my question is: why do I see the 'Optimize table' link so frequently for this table (it reappears within 3-4 days), while the other tables of this database show the link only once a month, or even once every two months or more?
As I am not deleting rows from this table, just inserting (and updating when a duplicate key is found), why is optimization required so frequently?
Short answer: switch to InnoDB.
The MyISAM storage engine uses B-Trees for indexes and stores them in separate index files. Every time you insert a lot of data, these indexes are changed, and that is why you need to optimize your table: to reorganize the indexes and regain some space.
MyISAM's indexing mechanism takes much more space compared to InnoDB.
Read the link below
http://www.mysqlperformanceblog.com/2010/12/09/thinking-about-running-optimize-on-your-innodb-table-stop/
There are a lot of other advantages to InnoDB over MyISAM, but that is another topic.
I will explain how inserting records affects a MyISAM table and explain what optimizing does, so you'll understand why inserting records has such a large effect.
Data
With MyISAM, when you insert records, data is simply appended to the end of the data file.
Running optimize on a MyISAM table defrags the data, physically reordering it to match the order of the primary key index. This speeds up sequential record reads (and table scans).
Indexes
Inserting records also adds leaves to the B-Tree nodes in the index. If a node fills up, it must be split, in effect rebuilding at least that page of the index.
When optimizing a MyISAM table, the indexes are flattened out, allowing room for more expansion (insertion) before having to rebuild an index page. This flatter index also speeds searches.
Statistics
MySQL also stores statistics for each index about key distribution, and the query optimizer uses this information to help develop a good execution plan. Inserting (or deleting) many records causes these statistics to become out of date.
OPTIMIZE TABLE recalculates the statistics for the table after defragging the data and rebuilding the indexes.
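You can inspect the stored statistics, and refresh them without a full defrag, like this (a sketch):
SHOW INDEX FROM mytb;  -- the Cardinality column reflects the stored key distribution
ANALYZE TABLE mytb;    -- recomputes the statistics without rebuilding the data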
vs. Appending
When you are appending data (adding a record with a higher primary key value such as with auto_increment), that data will not need to be later defragged since it will already be in the proper physical order. Also, when appending (inserting sequentially) into an index, the nodes are kept flat, so there's no rebuilding to be done there either.
vs. InnoDB
InnoDB suffers from the same issues when inserting, but since data is kept in primary key order due to its clustered index, you take the hit up front (at insert time) for keeping the data in order, rather than having to defrag it later. Still, optimizing InnoDB does optimize the data by flattening out the B-tree nodes and freeing up unused (deleted) keys, which improves sequential reads (table scans); and secondary indexes are similar to indexes in MyISAM, so they get rebuilt and flattened as well.
Conclusion
I'm not trying to make a case to stick with MyISAM. InnoDB has superior read performance due to its clustered indexes, and better update and append performance due to row-level locking versus MyISAM's table locking (assuming concurrent users). Also, InnoDB has ACID transactions.
Still, my goal was to answer your direct question and provide some technical details rather than conjecture and hearsay.
Neither database storage engine automatically optimizes itself.
I have a MyISAM table with 2.5 million rows and rising. It is MyISAM as I require full-text searching.
Having done some research on Stack Overflow, I'm looking into creating the table again as an InnoDB table and then creating a copy in MyISAM. Then I will create triggers which will replicate any changes in the InnoDB table to the MyISAM table.
The InnoDB table will function better as it works transactionally and doesn't lock the whole table when it is written to or updated.
My question is: will I see much benefit in the MyISAM table, as surely it is going to be written to as often as before, because every write to the InnoDB table will result in a subsequent write to the MyISAM one?
Any suggestions, or other ideas gratefully received.
Brett
Using triggers to copy from MyISAM to InnoDB has the risk of creating inconsistent data, when transactions are rolled back.
A better idea might be to install a full text search engine like Sphinx that can work with InnoDB tables.
Another idea would be to synchronise MyISAM with InnoDB periodically using the event scheduler. You risk that full-text search would return stale data, but on the other hand this should have less of an impact on performance, and at least you know the data is consistent after each sync.
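A sketch of such a periodic sync (the table names and the interval are illustrative; a full REPLACE copy handles inserts and updates but not deletes, so a real version would likely copy only changed rows):
SET GLOBAL event_scheduler = ON;
CREATE EVENT sync_fulltext_copy
  ON SCHEDULE EVERY 5 MINUTE
  DO REPLACE INTO mytb_myisam SELECT * FROM mytb_innodb;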
Also some good news: starting with MySQL 5.6, InnoDB gets full-text search.
Ran into an interesting problem with a MySQL table I was building as a temporary table for reporting purposes.
I found that if I didn't specify a storage engine, the DROP TEMPORARY TABLE command would hang for up to half a second.
If I defined my table as ENGINE = MEMORY this short hang would disappear.
As I have a solution to this problem (using MEMORY tables), my question is why would a temporary table take a long time to drop? Do they not use the MEMORY engine by default? It's not even a very big table, a couple of hundred rows with my current test data.
Temporary tables, by default, will be created wherever the MySQL configuration tells them to go, typically /tmp or somewhere else on disk. You can set this location (and even multiple locations) to a RAM disk location such as /dev/shm.
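You can check where yours land (a sketch; tmpdir cannot be changed at runtime, so pointing it at a RAM disk means editing my.cnf and restarting):
SHOW VARIABLES LIKE 'tmpdir';  -- where on-disk temporary tables are written
-- in my.cnf:  tmpdir = /dev/shm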
Hope this helps!
If the temporary table is created with the InnoDB engine, which may be the case if your default engine is InnoDB, and the InnoDB buffer pool is large, DROP TEMPORARY TABLE may take some time, since it needs to scan all buffer pool pages to discard the ones belonging to the table.
It was mentioned in a comment to this Stack Overflow question.
Note also that DROP (TEMPORARY) TABLE takes a lock that may have a huge impact on your whole server. See, for example, this.
At my work, we recently had a server slow down because we had an InnoDB buffer pool of 80 GB and some SQL queries had been optimized using InnoDB temporary tables.
About 100 such DROP TEMPORARY TABLE requests every 5 minutes were sufficient to have a huge impact. And the problem was hard to debug, since the slow query log told us that UPDATEs of a single row accessed by primary key in some other table were taking two seconds, and there was an enormous number of such updates. But even though most query time was spent on these updates, the problem was really caused by the DROP TEMPORARY TABLE requests.