Analyze + Optimize on InnoDB Tables - mysql

Back when I was working heavily with MyISAM tables, I always had a cronjob that ran
~# mysqlanalyze -o database
I know that MyISAM tables benefit from this in certain ways (e.g., fragmentation and so on).
Now, when running the same command on a database where the majority of tables is InnoDB, I wonder whether this does the tables any good, and whether it is considered good practice to do so every now and then, or whether it's rather counterproductive. I'm reading a lot of:
Table does not support optimize, doing recreate + analyze instead
which sounds expensive in terms of disk I/O and CPU time.
I would appreciate some input on this.

https://dev.mysql.com/doc/refman/8.0/en/optimize-table.html says:
For InnoDB tables, OPTIMIZE TABLE is mapped to ALTER TABLE ... FORCE, which rebuilds the table to update index statistics and free unused space in the clustered index.
This does some good when the table has accumulated too much fragmentation. Pages will be filled more efficiently, indexes will be rebuilt, and the disk space occupied by the table will be reduced if you use innodb_file_per_table (which is the default in recent versions).
It does take time, depending on the size of your table. It will lock the table while it's running. It will require extra disk space while it's running, as it creates a copy of the table.
Running OPTIMIZE TABLE on an InnoDB table usually isn't necessary to do frequently; do it only after you have run a lot of inserts/updates/deletes against the table in a way that could result in fragmentation.
ANALYZE TABLE has far less impact on InnoDB. It doesn't require building a copy of the table; it's a read-only action. It reads a random sample of pages from the table and uses that to estimate the number of rows and the average row size, and it updates the statistics about the indexes that guide the query optimizer. This is safe to run anytime: it locks the table for a moment, but that moment doesn't grow with the size of the table.
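A minimal illustration of the difference (the database/table name here is hypothetical):
ANALYZE TABLE mydb.mytable;   -- cheap: samples index pages, refreshes statistics
OPTIMIZE TABLE mydb.mytable;  -- expensive on InnoDB: rebuilds the whole table and
                              -- reports "Table does not support optimize, doing recreate + analyze instead"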

Don't bother. InnoDB almost never needs either ANALYZE or OPTIMIZE; don't waste your time unless you have identified a need.
An exception is a FULLTEXT index on an InnoDB table. Such an index can benefit from a DROP INDEX followed by an ADD INDEX.
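For example (a sketch; the table and index names are hypothetical):
ALTER TABLE articles DROP INDEX ft_body;
ALTER TABLE articles ADD FULLTEXT INDEX ft_body (body);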
If you are "reloading" the table from new data, then the following avoids downtime:
CREATE TABLE `new` LIKE `real`;    -- backticks needed: REAL is a reserved word
-- load the fresh data into `new` (LOAD DATA, INSERT ... SELECT, etc.)
RENAME TABLE `real` TO old, `new` TO `real`;  -- fast, atomic
DROP TABLE old;
(Caveat: The above technique probably has issues if there are FOREIGN KEYS.)

Related

Post optimization needed after deleting rows in a MySQL Database

I have a log table that is currently 10GB. It has a lot of data for the past 2 years, and I really feel that at this point I don't need so much in there. Am I wrong to assume it is not good to have years of data in a table (a smaller table is better)?
My tables all use the MyISAM engine.
I would like to delete all data from 2014 and 2015, and soon I'll do 2016, but I'm concerned about what exactly will happen after I run the DELETE statement. I understand that because it's MyISAM, a lock will occur and no writing can take place? I would probably delete data by the month, and do it late at night, to minimize this, as it's a production DB.
My prime interest, specifically, is this: should I take some sort of action after this deletion? Do I need to manually tell MySQL to do anything to my table, or will MySQL do all the housekeeping itself, reclaiming everything, reindexing, and ultimately optimizing my table after the 400,000k records I'll be deleting?
Thanks everyone!
Plan A: Use time-series PARTITIONing of the table so that future deletions are 'instantaneous' thanks to DROP PARTITION (sketched below). More discussion here. Partitioning only works if you will be deleting all rows older than X.
Plan B: To avoid lengthy locking, chunk the deletes (also sketched below). See here. This is optionally followed by an OPTIMIZE TABLE to reclaim space.
Plan C: Simply copy over what you want to keep, then abandon the rest. This is especially good if you need to preserve only a small proportion of the table.
CREATE TABLE `new` LIKE `real`;
INSERT INTO `new`
    SELECT * FROM `real`
        WHERE ...;  -- just the newer rows
RENAME TABLE `real` TO old, `new` TO `real`;  -- instantaneous and atomic
DROP TABLE old;  -- after verifying that all went well
Note: The .MYD file contains the data; it will never shrink. Deletes will leave holes in it. Further inserts (and updates) will use the holes in preference to growing the table. Plans A and C (but not B) will avoid the holes and truly free up space.
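To make Plans A and B concrete, a rough sketch (the table, column, and partition names are hypothetical):
-- Plan A: with a partitioned table, drop a whole partition instantly
ALTER TABLE log_table DROP PARTITION p2014;
-- Plan B: delete in chunks to avoid one long lock; repeat until 0 rows are affected
DELETE FROM log_table WHERE log_date < '2016-01-01' LIMIT 10000;
-- optionally reclaim the freed space afterwards
OPTIMIZE TABLE log_table;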
Tim and e4c5 have given some good recommendations and I urge them to add their answers.
You can run OPTIMIZE TABLE after doing the deletes. OPTIMIZE TABLE will help you with a few things (taken from the docs):
If the table has deleted or split rows, repair the table.
If the index pages are not sorted, sort them.
If the table's statistics are not up to date (and the repair could not be accomplished by sorting the index), update them.
According to the docs: http://dev.mysql.com/doc/refman/5.7/en/optimize-table.html
Use OPTIMIZE TABLE in these cases, depending on the type of table:
...
After deleting a large part of a MyISAM or ARCHIVE table, or making many changes to a MyISAM or ARCHIVE table with variable-length rows (tables that have VARCHAR, VARBINARY, BLOB, or TEXT columns). Deleted rows are maintained in a linked list and subsequent INSERT operations reuse old row positions. You can use OPTIMIZE TABLE to reclaim the unused space and to defragment the data file. After extensive changes to a table, this statement may also improve performance of statements that use the table, sometimes significantly.

Why does my MySQL table need optimizing so frequently?

I have a MySQL table with 12 columns, one primary key, and two unique keys. I have roughly 86,000 rows/records in this table.
I use this MySQL statement:
INSERT INTO table (col2, col3, ..., col12) VALUES ($val2, $val3, ..., $val12) ON DUPLICATE KEY UPDATE col2=VALUES(col2), col3=VALUES(col3), ..., col12=VALUES(col12)
When I view the structure of this table from cpanel phpmyadmin, I can see 'Optimize Table' link just below the index information of the table. If I click the link, the table is optimized.
But my question is: why do I see the 'Optimize table' link so frequently for this table (it reappears within 3-4 days), while the other tables of this database do not show it (they show the link once a month, or even once every two months or more)?
As I am not deleting rows from this table, just inserting (and updating when a duplicate key is found), why is optimization required so frequently?
Short answer: switch to InnoDB.
The MyISAM storage engine uses B-trees for indexes and creates separate index files. Every time you insert a lot of data, these indexes are changed, and that is why you need to optimize your table: to reorganize the indexes and regain some space.
MyISAM's indexing mechanism takes much more space compared to InnoDB's.
Read the link below
http://www.mysqlperformanceblog.com/2010/12/09/thinking-about-running-optimize-on-your-innodb-table-stop/
There are a lot of other advantages to InnoDB over MyISAM, but that is another topic.
I will explain how inserting records affects a MyISAM table and what optimizing does, so you'll understand why inserting records has such a large effect.
Data
With MyISAM, when you insert records, data is simply appended to the end of the data file.
Running optimize on a MyISAM table defrags the data, physically reordering it to match the order of the primary key index. This speeds up sequential record reads (and table scans).
Indexes
Inserting records also adds leaves to the B-Tree nodes in the index. If a node fills up, it must be split, in effect rebuilding at least that page of the index.
When optimizing a MyISAM table, the indexes are flattened out, allowing room for more expansion (insertion) before having to rebuild an index page. This flatter index also speeds searches.
Statistics
MySQL also stores statistics for each index about key distribution, and the query optimizer uses this information to help develop a good execution plan. Inserting (or deleting) many records causes these statistics to become out of date.
Optimizing the table recalculates these statistics after the defragmenting and the rebuilding of the indexes.
vs. Appending
When you are appending data (adding a record with a higher primary key value, such as with auto_increment), that data will not need to be defragmented later, since it will already be in the proper physical order. Also, when appending (inserting sequentially) into an index, the nodes are kept flat, so there's no rebuilding to be done there either.
vs. InnoDB
InnoDB suffers from the same issues when inserting, but since data is kept in order by the primary key due to its clustered index, you take the hit up front (at insert time) for keeping the data in order, rather than having to defragment it later. Still, optimizing InnoDB does optimize the data by flattening out the B-tree nodes and freeing up unused (deleted) keys, which improves sequential reads (table scans). Secondary indexes are similar to indexes in MyISAM, so they get rebuilt to flatten them out as well.
Conclusion
I'm not trying to make a case for sticking with MyISAM. InnoDB has superior read performance due to clustered indexes, and better update and append performance due to record-level locking versus MyISAM's table locking (assuming concurrent users). Also, InnoDB is ACID-compliant.
Still, my goal was to answer your direct question and provide some technical details rather than conjecture and hearsay.
Neither database storage engine automatically optimizes itself.

Create a table both in-memory and transaction-safe in MySQL

I know I should use ENGINE=MEMORY to keep the table in memory and ENGINE=INNODB to make the table transaction-safe. However, how can I achieve both objectives? I tried ENGINE=MEMORY, INNODB, but I failed. My purpose is to access tables fast and to allow multiple threads to change the contents of tables.
You haven't stated your goals above. I guess you're looking for good performance, and you also seem to want the table to be transactional. Your only real option is InnoDB. As long as you have configured InnoDB to use enough memory to hold your entire table (with innodb_buffer_pool_size), and there is not excessive pressure from other InnoDB tables on the same server, the data will remain in memory. If you're concerned about write performance (and, again, barring other uses of the same system), you can reduce durability to drastically increase write performance by setting innodb_flush_log_at_trx_commit = 0 and disabling binary logging.
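A sketch of that tuning (the buffer pool size is an assumption; size it to your data):
SET GLOBAL innodb_flush_log_at_trx_commit = 0;  -- trade durability for write speed
-- innodb_buffer_pool_size is normally set in my.cnf, e.g.:
--   [mysqld]
--   innodb_buffer_pool_size = 8G   -- big enough to hold the working set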
Using any sort of triggers with temporary tables will be a mess to maintain, and won't give you any benefits of transactionality on the temporary tables.
You are asking for a way to create the table with two (or more) engines; that is not possible with MySQL.
However, I will guess that you want to use MEMORY because you don't think InnoDB will be fast enough for your needs. I think InnoDB is pretty fast and will probably be enough, but if you really need it, you should try creating two tables:
table1 (MEMORY) <-- run all your SELECTs here
table2 (InnoDB) <-- run your UPDATE, INSERT, DELETE, etc. here, and add a TRIGGER so that when this one is modified, table1 gets the same modification (a sketch follows).
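A sketch of that setup, with hypothetical column names (UPDATE and DELETE would need similar triggers):
CREATE TABLE table2 (id INT PRIMARY KEY, val VARCHAR(100)) ENGINE=InnoDB;
CREATE TABLE table1 LIKE table2;
ALTER TABLE table1 ENGINE=MEMORY;  -- note: MEMORY tables cannot hold BLOB/TEXT columns
CREATE TRIGGER table2_ai AFTER INSERT ON table2
    FOR EACH ROW INSERT INTO table1 (id, val) VALUES (NEW.id, NEW.val);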
As far as I know, there are two ways.
1st way
Create a temporary table (these are stored in memory, with a small difference: they are deleted when the session ends):
CREATE TEMPORARY TABLE sample (id INT) ENGINE=InnoDB;
2nd way
You have to create two tables, one with the MEMORY engine and the other with InnoDB or BDB.
First insert all the data into your InnoDB table, then use a trigger to copy the data into the MEMORY table.
If you want to empty the data in the InnoDB table, you can do that with the same trigger.
You can achieve this using events as well (a sketch follows).
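A rough illustration of the event-based variant (hypothetical names; the event scheduler must be running):
SET GLOBAL event_scheduler = ON;
CREATE EVENT refresh_memory_copy
    ON SCHEDULE EVERY 1 HOUR
    DO REPLACE INTO memory_table SELECT * FROM innodb_table;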

Inserting a New Column in MySQL taking too long

We have a huge database, and adding a new column to a table is taking too long. Is there any way to speed things up?
Unfortunately, there's probably not much you can do. When adding a new column, MySQL makes a copy of the table with the new column and copies the data into it. You may find it faster to do:
CREATE TABLE new_table LIKE old_table;
ALTER TABLE new_table ADD COLUMN (column definition);
INSERT INTO new_table (old_column_list) SELECT * FROM old_table;  -- list the original columns explicitly so the new column stays empty
RENAME TABLE old_table TO tmp, new_table TO old_table;
DROP TABLE tmp;
This hasn't been my experience, but I've heard others have had success with it. You could also try disabling indexes on new_table before the insert and re-enabling them later (see below). Note that in this case you need to be careful not to lose any data that may be inserted into old_table during the transition.
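The disable/re-enable step would look like this (note: this defers only non-unique index builds, and only on MyISAM; InnoDB ignores it with a warning):
ALTER TABLE new_table DISABLE KEYS;
-- ... bulk INSERT INTO new_table ... ;
ALTER TABLE new_table ENABLE KEYS;  -- rebuilds the deferred indexes in one pass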
Alternatively, if your concern is impacting users during the change, check out pt-online-schema-change, which makes clever use of triggers to execute ALTER TABLE statements while keeping the table being modified available. (Note that this won't speed up the process, however.)
There are four main things that you can do to make this faster:
If using innodb_file_per_table the original table may be highly fragmented in the filesystem, so you can try defragmenting it first.
Make the buffer pool as big as sensible, so more of the data, particularly the secondary indexes, fits in it.
Make innodb_io_capacity high enough, perhaps higher than usual, so that insert buffer merging and flushing of modified pages will happen more quickly. Requires MySQL 5.1 with InnoDB plugin or 5.5 and later.
MySQL 5.1 with the InnoDB plugin and MySQL 5.5 and later support fast ALTER TABLE. One of the things this makes a lot faster is adding or rebuilding indexes that are both not unique and not in a foreign key. So you can do this (sketched below):
A. ALTER TABLE: ADD your column, and DROP your non-unique indexes that aren't in FKs.
B. ALTER TABLE: ADD back your non-unique, non-FK indexes.
This should provide these benefits:
a. Less use of the buffer pool during step A, because the buffer pool only needs to hold some of the indexes: the ones that are unique or in FKs. Indexes are randomly updated during this step, so performance becomes much worse if they don't fully fit in the buffer pool; dropping the rest gives your rebuild a better chance of staying fast.
b. The fast ALTER TABLE rebuilds an index by sorting the entries and then building the index. This is faster, and it also produces an index with a higher page fill factor, so it'll be smaller and faster to start with.
The main disadvantage is that it takes two steps, and after the first one you won't have some indexes that may be required for good performance. If that is a problem, you can try the copy-to-a-new-table approach, using just the unique and FK indexes at first for the new table, then adding the non-unique ones later.
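A sketch of steps A and B (the column and index names are hypothetical):
-- Step A: add the column and drop the non-unique, non-FK indexes in one rebuild
ALTER TABLE big_table
    ADD COLUMN new_col INT,
    DROP INDEX idx_created,
    DROP INDEX idx_status;
-- Step B: add those indexes back; they are rebuilt by sorting, with a high fill factor
ALTER TABLE big_table
    ADD INDEX idx_created (created_at),
    ADD INDEX idx_status (status);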
It's only in MySQL 5.6, but the feature request in http://bugs.mysql.com/bug.php?id=59214 increases the speed with which insert buffer changes are flushed to disk and limits how much space they can take in the buffer pool. This can be a performance limit for big jobs. (The insert buffer is used to cache changes to secondary index pages.)
We know that this is still frustratingly slow sometimes, and that a true online ALTER TABLE is very highly desirable.
This is my personal opinion. For an official Oracle view, contact an Oracle public relations person.
James Day, MySQL Senior Principal Support Engineer, Oracle
Usually, when adding a new column takes this long, it means there are many indexes, so I would suggest reconsidering your indexing.
Michael's solution may speed things up a bit, but perhaps you should have a look at the database and try to break the big table into smaller ones. Take a look at this: link. Normalizing your database tables may save you loads of time in the future.

How to fix the Overhead and Effective problem on InnoDB table?

Questions:
1 - What is meant by Overhead? When I click the "Optimize table" button on a MyISAM table, the Overhead and Effective data are gone. I wonder what it does to my table?
2 - Do I actually need to care about the Overhead and Effective values? How do I fix the Overhead and Effective problem on an InnoDB table?
Fixing InnoDB is not as trivial as a click of a button; MyISAM is.
Under the hood, OPTIMIZE TABLE will do this to a MyISAM table called mytb:
Create empty temp table with same structure as mytb
Copy MyISAM data from mytb into the temp table
Drop table mytb
Rename temp table to mytb
Run ANALYZE TABLE against mytb and store index statistics
OPTIMIZE TABLE does not work that way with InnoDB, for two major reasons:
REASON #1 : InnoDB Storage Layout
By default, InnoDB has innodb_file_per_table disabled. Everything InnoDB and its grandmother lands in ibdata1. Running OPTIMIZE TABLE does the following to an InnoDB table called mytb:
Create empty InnoDB temp table with same structure as mytb
Copy InnoDB data from mytb into the temp table
Drop table mytb
Rename temp table to mytb
Run ANALYZE TABLE against mytb and store index statistics
Unfortunately, the temp table used for shrinking mytb is appended to ibdata1. INSTANT GROWTH FOR ibdata1 !!! In light of this, ibdata1 will never shrink. To make matters worse, ANALYZE TABLE is useless (explained in REASON #2).
If you have innodb_file_per_table enabled, the first four (4) steps will work, since the data is stored not in ibdata1 but in an external tablespace file called mytb.ibd, and that file can shrink.
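Checking and enabling that setting (a sketch; enabling it affects only tables created or rebuilt afterwards):
SHOW VARIABLES LIKE 'innodb_file_per_table';
SET GLOBAL innodb_file_per_table = ON;
-- an existing table moves out of ibdata1 only when it is rebuilt:
ALTER TABLE mytb ENGINE=InnoDB;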
REASON #2 : Index Statistics are Always Recomputed
InnoDB does not effectively store index statistics. In fact, if you run ANALYZE TABLE on mytb, statistics are created and stored. Unfortunately, by design, InnoDB will dive into the BTREE pages of its indexes, guesstimate key cardinalities, and use those numbers to prep the MySQL Query Optimizer. This is an ongoing process. In effect, ANALYZE TABLE is useless because the index statistics it calculates are overwritten with each query executed against that table. I wrote about this on DBA StackExchange on June 21, 2011.
Percona explained this thoroughly in www.mysqlperformanceblog.com
As for overhead in MyISAM, that number can be figured out.
For a MyISAM table, the overhead represents internal fragmentation. This is quite common in a table that experiences INSERTs, UPDATEs, and DELETEs, especially if you have BLOB data or VARCHAR columns. Running OPTIMIZE TABLE makes such fragmentation disappear by copying to a temp table (naturally, without copying the empty space).
Going back to InnoDB, how do you effectively eliminate wasted space? You need to rearchitect ibdata1 to hold less info. Within ibdata1 you have four types of data:
Table Data Pages
Index Data Pages
Table MetaData
MVCC Data for Transactions
You can permanently move tables and indexes out of ibdata1 forever. But what about data and indexes already housed in ibdata1?
Follow the InnoDB Cleanup Plan that I posted October 29, 2010 : Howto: Clean a mysql InnoDB storage engine?
In fact "OPTIMIZE TABLE" is a useless waste of time on MyISAM, because if you have to do it, your database is toast already.
It takes a very long time on large tables, and blocks write-access to the table while it does so. Moreover, it has very nasty effects on the myisam keycache etc.
So in summary
small tables never need "optimize table"
large tables can never use "optimize table" (or indeed MyISAM)
It is possible to achieve (roughly) the same thing in InnoDB simply using an ALTER TABLE statement which makes no schema changes (normally ALTER TABLE t ENGINE=InnoDB). It's not as quick as MyISAM, because it doesn't do the various small-table-which-fits-in-memory optimisations.
MyISAM also uses a bunch of index optimisations to compress index pages, which generally produce quite small indexes. InnoDB doesn't have these.
If your database is small, you don't need it. If it's big, you can't really use MyISAM anyway (because an unplanned shutdown means tables need rebuilding, which takes too long on large tables). Just don't use MyISAM if you need durability, reliability, transactions, any level of concurrency, or generally care about robustness in any way.