MySQL table data physical fragmentation - mysql

Could you tell me, will MySQL data in tables be physically fragmented if I insert new data and delete old data (insert in table top, and delete bottom rows)?
Will the size of table be growing in any way, before I do OPTIMIZE?

Are you DELETEing and INSERTing at about the same rate? Is the table InnoDB?
If Yes to both, then "don't worry". The DELETEs will free up blocks and the INSERTs will re-use those blocks (or new blocks if the INSERTs are getting ahead).
InnoDB tables are composed of 16KB blocks that are chained together in a BTree structure. What you described will be freeing blocks from one side of the tree and creating new blocks on the other side. However, each block can reside anywhere, so the concept of "side of the tree" is more 'virtual' than 'physical'.
After you have done a lot of churning, try the OPTIMIZE once. If it does not cut the table size in half, that is a further clue that OPTIMIZE is not worth it. A 30% shrinkage after a lot of churn is somewhat typical. But it stabilizes at about that, and won't go over about 30%.
If, on the other hand, you are DELETEing most of the table all at once, then you may be better off rebuilding the table with the not-to-be-deleted rows. This combines the delete and the optimize steps into one. Then use this to flip the tables 'instantly':
RENAME TABLE real TO old, new TO real; DROP TABLE old;
And do not do any inserts during the rebuild.

When innodb_file_per_table is OFF and all data is going to be stored in ibdata files. When you remove rows, they are just marked as deleted on disk but space will be consumed by InnoDB files which can be re-used later when you insert/update more rows but it will never shrink.
If you drop some tables of delete some data then there is no any other way to reclaim that unused disk space except dump/reload method.
But, if you are using innodb_file_per_table then you can reclaim the space by running OPTIMIZE TABLE on that table. OPTIMIZE TABLE will create a new identical empty table. Then it will copy row by row data from old table to the new one. In this process a new .ibd tablespace will be created and the space will be reclaimed However, the shared tablespace-ibdata1 can still grow.
For more info- mysql reference and percona expert

Related

How to alter a table in large mysql DataBase

I have very large mysql DB(more than 100 G) and I want to migrate some changes in table. It need to posing that I have no space to make backup for this size, as results, it rolls back all changes. In clear way, when I want to alter table in mysql, it backup for it self in local disk and because of no space on my disk, it 's rolling back. Is there any one to help for this issue?
It depends on the type of ALTER change. Since MySQL 5.6, some alterations can be done "inplace" which does not use extra storage. Read https://dev.mysql.com/doc/refman/8.0/en/innodb-online-ddl-operations.html for details.
But many types of alterations (basically any change that affects the storage size of a row), still require a "table rebuild" which means it makes a copy of the table in the process of doing the alter. This requires extra storage, usually about the same as the size of the original table. If your server does not have enough storage space to do this, then you cannot alter the table, FULL STOP. You should have addressed this long before the table grew to this size. It is your responsibility to monitor the size of the database and make sure you have enough storage space.
You said in the comments above that it is not possible to add more storage space.
Sometimes InnoDB tablespaces can become "fragmented." That means they could store the same rows using slightly less storage space. You can accomplish defragmentation with OPTIMIZE TABLE <name>; or ALTER TABLE <name> FORCE;
A possible strategy is to optimize each of your tables, starting with the smallest table and doing one by one in order of size. You hope that you have enough space for the copy table needed to do this, and any space freed up by optimizing the small tables will help you to optimize the medium-size tables, and so on. This is a long shot, because it's also possible you don't have enough "wasted" space to recover to make a difference. You may still not be able to alter your largest tables once you're all done with this process. Unfortunately, MySQL has no way of reporting fragmentation of the tables, so you can only try it and see how much space, if any, you recover.
After that, the only other option is to delete data from the table until you can do the alteration. The alter still requires extra storage, but only as much as needed to copy the remaining rows of data. That is, if you delete 50% of the rows, then the tablespace file that stores the copy only needs roughly 50% of the storage space.
You can also try dropping some of the indexes of the table as you do your alteration. Indexes take space in the same tablespace, and it's not uncommon if you have multiple indexes for them to take more space than the data itself. If you include some DROP INDEX operations in your ALTER TABLE statement, the resulting copy will take less space. On the other hand, those indexes might be necessary for proper query optimization, and without the indexes, your application won't be usable.

Huge wp_post table

My client has a store on Woocommerce with 1.2 Gb database. I know that similar store (count by product ) should have approx 700Mb.
The biggest table is wp_posts (760Mb) alone ! Which, I think is strange. Usually biggest table is wp_postmete or wp_options.
I tried optimize this database by plugins: WP-Sweep and wp-optimize so there is no revisions and draft left.
I also tried SQL:
OPTIMIZE TABLE
but it is innoDB so it do not support it. I get this message:
Table does not support optimize,doing recreate + analyze instead
So it is done? I mean:”recreate + analyze” or I should do it? And how?
I read that in innoDB i should dump table and restore but when I do this by DBeaver - i get same size.
Any Idea what should I do?
The error message is a bit misleading, because it dates back to the days when MyISAM was the default storage engine, and OPTIMIZE TABLE does a few things in MyISAM that are different from what it does in InnoDB. For example, MyISAM can't reclaim space from deleted rows until you do OPTIMIZE TABLE (whereas InnoDB does reclaim space dynamically).
InnoDB does support OPTIMIZE TABLE and it does useful things. It does basically the same as an ALTER TABLE when using the COPY algorithm. That is, it creates a new file, and copies the data row by row into the new file. This accomplishes defragmentation and rebuilding the indexes, just as if you had done a dump and restore. So you don't need to dump and restore.
After OPTIMIZE TABLE, the InnoDB table may be close to the same size it was before, if there was little fragmentation.
Frankly, a table 1.2GB in size is not so large by the standards of most MySQL projects I've worked on. We start to get concerned if a table is larger than 500GB, and we start alerting developers if the table is larger than 800GB, or larger than the remaining free disk space.

MySQL, delete inno table doesnt reduce table size

I have an innodb table with big table size (around 7GB),
I already delete about 80% of its rows,
The problem is the table size remains the same,
The innodb_file_per_table is ON,
How do I shrink the table size so it reflect the actual condition and save more disk space ?
It depends...
If you built the table with innodb_file_per_table=ON, then you can do OPTIMIZE TABLE now. (Be sure it is still ON.)
If =OFF, then the table is in the tablespace file ibdata1 and the only way to return the space to the OS is by dumping all tables, removing ibdata1, then reloading the dump. This is tedious and risky. Meanwhile, the free space created by the DELETEs will be used by future INSERTs, etc.
More discussion of big deletes.

How to fix the Overhead and Effective problem on InnoDB table?

Questions :
1 - What is mean by Overhead? When I click "Optimize table" button on MyISAM table, Overhead and Effective data are gone. I wonder what it does to my table?
2 - Do I need to care of Overhead and Effective value actually? How to fix the Overhead and Effective problem on InnoDB table?
Fixing InnoDB is not as trivial as a Click of a Button. MyISAM is.
Under the hood, OPTIMIZE TABLE will do this to a MyISAM table called mytb:
Create empty temp table with same structure as mytb
Copy MyISAM data from mytb into the temp table
Drop table mytb
Rename temp table to mytb
Run ANALYZE TABLE against mytb and store index statistics
OPTMIZE TABLE does not work that way with InnoDB for two major reasons:
REASON #1 : InnoDB Storage Layout
By default, InnoDB has innodb_file_per_table disabled. Everything InnoDB and its grandmother lands in ibdata1. Running OPTIMIZE TABLE does the following to an InnoDB table called mytb:
Create empty InnoDB temp table with same structure as mytb
Copy InnoDB data from mytb into the temp table
Drop table mytb
Rename temp table to mytb
Run ANALYZE TABLE against mytb and store index statistics
Unfortunately, the temp table used for shrinking mytb is appended to ibdata1. INSTANT GROWTH FOR ibdata1 !!! In light of this, ibdata1 will never shrink. To matters worse, ANALYZE TABLE is useless (Explained in REASON #2)
If you have innodb_file_per_table enabled, the first four(4) steps will work since the data is not stored in ibdata1, but in an external tablespace file called mytb.ibd. That can shrink.
REASON #2 : Index Statistics are Always Recomputed
InnoDB does not effectively store index statistics. In fact, if you run ANALYZE TABLE on mytb, statistics are created and stored. Unfortunately, by design, InnoDB will dive into the BTREE pages of its indexes, guessimate key cardinalities, and uses those numbers to prep the MySQL Query Optimizer. This is an on-going process. In effect, ANALYZE TABLE is useless because the index statistics calculated are overwritten with each query executed against that table. I wrote about this in the DBA StackExchange June 21, 2011.
Percona explained this thoroughly in www.mysqlperformanceblog.com
As for overhead in MyISAM, that number can be figured out.
For a MyISAM table, the overhead represents internal fragmentation. This is quite common in a table that experiences, INSERT, UPDATEs, and DELETEs, especially if you have BLOB data or VARCHAR columns. Running OPTIMIZE TABLE make such fragmentation disappear by copying to a temp table (naturally not copying empty space).
Going back to InnoDB, how do you effectively eliminate waste space ? You need to rearchitect ibdata1 to hold less info. Withing ibdata1 you have four types of data:
Table Data Pages
Index Data Pages
Table MetaData
MVCC Data for Transactions
You can permamnently move Table and Indexes Out of ibdata1 forever. What about data and indexes already housed in ibdata1 ?
Follow the InnoDB Cleanup Plan that I posted October 29, 2010 : Howto: Clean a mysql InnoDB storage engine?
In fact "OPTIMIZE TABLE" is a useless waste of time on MyISAM, because if you have to do it, your database is toast already.
It takes a very long time on large tables, and blocks write-access to the table while it does so. Moreover, it has very nasty effects on the myisam keycache etc.
So in summary
small tables never need "optimize table"
large tables can never use "optimize table" (or indeed MyISAM)
It is possible to achieve (roughly) the same thing in InnoDB simply using an ALTER TABLE statement which makes no schema changes (normally ALTER TABLE t ENGINE=InnoDB). It's not as quick as MyISAM, because it doesn't do the various small-table-which-fits-in-memory optimisations.
MyISAM also uses a bunch of index optimisations to compress index pages, which generally cause quite small indexes. InnoDB also doesn't have these.
If your database is small, you don't need it. If it's big, you can't really use MyISAM anyway (because an unplanned shutdown makes the table to need rebuilding, which takes too long on large tables). Just don't use MyISAM if you need durability, reliability, transactions, any level of concurrency, or generally care about robustness in any way.

why does InnoDB keep on growing without for every update?

I have a table which consists of heavy blobs, and I wanted to conduct some tests on it.
I know deleted space is not reclaimed by innodb, so I decided to reuse existing records by updating its own values instead of createing new records.
But I noticed, whether I delete and insert a new entry, or I do UPDATE on existing ROW, InnoDB keeps on growing.
Assuming I have 100 Rows, each Storing 500KB of information, My InnoDB size is 10MB, now when I call UPDATE on all rows (no insert/ no delete), the innodb grows by ~8MB for every run I do. All I am doing is I am storing exactly 500KB of data in each row, with little modification, and size of blob is fixed.
What can I do to prevent this?
I know about optimize table, but I cant do it because on regular usage, the table is going to be 60-100GB big, and running optimize will just stall entire server.
But I noticed, whether I delete and insert a new entry, or I do UPDATE on existing ROW, InnoDB keeps on growing.
InnoDB is a multiversion system.
This means that each time you update or delete a row, the previous version of the row gets copied to a special place called a rollback segment. The original record is rewritten with the new data (in case of an UPDATE) or marked as deleted (in case of a DELETE).
If you do a transaction rollback or execute a concurrent SELECT statement (which has to see the previous version since the new one is not yet committed), the data is retrieved from the rollback segment.
The problem with this is that InnoDB tablespace never shrinks. If the tablespace is full, new space is allocated. The space occupied by the rollback information from the commited transactions may be reused too, but if the tablespace is full it will be extended and this is permanent.
There is no supported way to shrink the ibdata* files except backing up the whole database, dropping and recreating the InnoDB files and restoring from the backup.
Even if you use innodb_file_per_table option, the rollback tablespace is still stored in the shared ibdata* files.
Note, though, that space occupied by the tables and space occupied by the tablespace are the different things. Tables can be shrunk by running OPTIMIZE and this will make them more compact (within the tablespace), but the tablespace itself cannot be shrunk.
You can reduce table size by running Optimize query
OPTIMIZE TABLE table_name;
more information can be found here