I'm using an InnoDB database with a single-file configuration (in /var), so no innodb_file_per_table.
In MySQL Workbench, when I query for the space used by the databases with this query:
SELECT table_schema "Database", sum( data_length + index_length ) / 1024 / 1024 "Data Base Size in MB"
FROM information_schema.TABLES GROUP BY table_schema;
It says that I have 47 GB of data.
However the size of ibdata1 is 99 GB...
I know that ibdata1 contains a bunch of things other than table data, like table indexes, MVCC (Multiversion Concurrency Control) data and table metadata.
So my question is: Is it normal that supposedly 52 GB of ibdata1 is metadata and a bunch of other things? Usually, how much data besides table data should the ibdata1 file contain?
No, it is not normal that you would have that much metadata. It is normal though that the ibdata file can grow to a ridiculous size if you aren't using innodb_file_per_table.
Your ibdata file will grow when your database grows, but it will never actually shrink.
So, for example, if you had 130 GB of data at one point and deleted a bunch of it, the ibdata file would still be 130 GB after the data was purged. It will just have a bunch of "free space" that it will then use for subsequent inserts.
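To make the "grows but never shrinks" behaviour concrete, here is a toy model in Python. It is only a sketch: the class name SharedTablespace and its methods are made up for illustration, sizes are plain GB numbers, and it ignores everything else that lives in ibdata1 (undo logs, metadata and so on).

```python
class SharedTablespace:
    """Toy model of a shared tablespace file such as ibdata1 (sizes in GB).

    Hypothetical illustration only; not how InnoDB is implemented.
    """

    def __init__(self):
        self.file_size = 0  # size on disk; only ever grows
        self.used = 0       # space currently holding live data

    @property
    def free(self):
        # Space inside the file that later inserts can reuse.
        return self.file_size - self.used

    def insert(self, gb):
        self.used += gb
        # The file is extended only when the existing free space runs out.
        if self.used > self.file_size:
            self.file_size = self.used

    def delete(self, gb):
        # Deleting frees space inside the file but never shrinks the file.
        self.used = max(0, self.used - gb)


ts = SharedTablespace()
ts.insert(130)
ts.delete(80)
print(ts.file_size, ts.used, ts.free)  # 130 50 80
ts.insert(60)
print(ts.file_size, ts.used, ts.free)  # 130 110 20
```

The second insert fits into the free space left by the delete, so the file stays at 130 GB instead of growing.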
As for shrinking the file, there's not much you can really do aside from wiping out your database and restoring it. This answer has some good instructions on how to do that.
Howto: Clean a mysql InnoDB storage engine?
You also might want to consider using innodb_file_per_table, as deleting data from a table and later optimizing that table will actually shrink the size of the individual table files.
There are a few reasons for having a bunch of "extra" space in ibdata1, but the most likely cases are:
You have deleted large amounts of data in the past. When you delete rows or drop tables, although free space will be made available in the file, the file itself will never shrink.
You may have an excessive amount of undo log space (or have at some point in the past). Undo logs are kept during DELETE and UPDATE operations, and for very long-running operations touching many rows can grow quite large. Again, if the file is expanded to hold this data it will never shrink.
As previously mentioned, using innodb_file_per_table can help with this if you expect to regularly drop tables and want to get the disk space back. My blog post The basics of InnoDB space file layout may help you understand what is included in the ibdata1 file.
Related
Using: MySQL 5.6 on Debian 9, total DB size is around 450 GB.
Updated to 5.7, ran mysql_upgrade, and noticed that around 150 GB had been taken up. Two tables are really large, and they stayed in 'copying to tmp table' for a couple of hours.
Noticed that innodb_file_per_table was on and that large .ibd files had been created that weren't there previously.
Restored from a snapshot, disabled file_per_table, and ran mysql_upgrade again. 100 GB gone, which is almost 1/4 of my total DB.
In the first case, it pulled the data from ibdata and put it into a separate file, but ibdata never shrinks, so the space taken almost doubled.
What happens in the second case? Does the temp table get created within the ibdata file, which never shrinks, so that even when the table is not used anymore the space is still gone?
Another thing I noticed is that space consumption doesn't start until the query has been in 'copying to tmp table' status for an hour or so.
1) Is there any way to avoid/minimize space increase?
Would running the upgrade with file_per_table on, then disabling it and running ALTER TABLE ... ENGINE=InnoDB free up the space?
2) Is there any way to predict how much space will be occupied, at least per table?
3) How does max_tmp_table_size play into this?
It sounds like you painted yourself into a corner by not running innodb_file_per_table from the start, so now you have a huge, unshrinkable ibdata1 file.
1) There isn't.
1.1) It might reduce the overall space usage: you would rebuild the tables outside the ibdata1 file, then rebuild them again inside ibdata1, reusing some of the unused space inside ibdata1.
2) Yes:
SELECT TABLE_SCHEMA, TABLE_NAME, DATA_LENGTH, INDEX_LENGTH FROM information_schema.TABLES;
3) It doesn't. The tables you are seeing are probably tables being rebuilt for some reason (not sure why; I have to admit I haven't seen that happen from mysql_upgrade before). max_tmp_table_size is only for implicit (when a query plan says "Using temporary") and explicit (CREATE TEMPORARY TABLE ...) temporary tables, not for table rebuilds.
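As a rough way to use the output of the query in 2), here is a small Python sketch that turns its rows into a per-table estimate. It assumes that a rebuild writes a full new copy of each table and that the copy needs about DATA_LENGTH + INDEX_LENGTH bytes; the real copy can come out smaller or larger. The function name rebuild_estimate and the sample rows are invented for illustration.

```python
GIB = 1024 ** 3


def rebuild_estimate(tables):
    """Estimate the bytes a rebuilt copy of each table may need.

    tables: iterable of (schema, table, data_length, index_length) in bytes,
    i.e. the columns returned by the information_schema.TABLES query.
    Returns a dict {"schema.table": bytes}, largest table first.
    """
    sizes = {f"{s}.{t}": d + i for s, t, d, i in tables}
    return dict(sorted(sizes.items(), key=lambda kv: kv[1], reverse=True))


# Hypothetical sample rows; replace with the real query output.
rows = [
    ("shop", "customers", 2 * GIB, 1 * GIB),
    ("shop", "orders", 120 * GIB, 40 * GIB),
]

est = rebuild_estimate(rows)
for name, size in est.items():
    print(f"{name}: ~{size / GIB:.0f} GiB for a rebuilt copy")
print(f"total: ~{sum(est.values()) / GIB:.0f} GiB")
```

If the tables move out of ibdata1 into their own .ibd files, this total comes on top of the existing ibdata1 size, because ibdata1 itself does not shrink.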
The only(?) way to switch to file_per_table without the disk bloat is
Dump the data.
Get a fresh install (or otherwise get rid of ibdata1).
Reload (with file_per_table on).
A few things that I'm considering about NDB Cluster when storing data in disk storage:
In my configuration I defined DataMemory=20G. So what would be the best total size of undo log files? I saw in a blog that it should be 6xDataMemory. Is this a must?
When creating undo log files, is it better to create many small files or a few big files? For example, 10 files of 1G or 100 files of 100M? And if I create 200M files, what is the best buffer size?
The same goes for creating data files. Is it better to create 10 files of 1G or 100 files of 100M?
I'm using separate data files for separate tablespaces and always one tablespace for one table. I never use the same tablespace for two tables. Is this a good way to define and allocate tablespaces, or would there be no performance issues with using the same tablespace for two tables?
(I deal with heavy traffic, around 4000 - 5000 TPS, and the NDB database size is almost 80GB. I have 2 data nodes and 2 mysql servers. Each data node has 128GB memory.)
Wilson Hauck
UNDO log files are related to UNDO of disk data changes. So there is no
real relation between UNDO log file size and DataMemory. There is a
relation between DataMemory and REDO log size though since the REDO log
is used by both In-memory data and disk-data parts.
Whether to use small files or bigger files is mostly dependent on
the workings of the file system. Personally I would start with using
fairly large files and not that many.
From a performance point of view there is no difference if you use one
tablespace or if you use one per table. I always think about this as
using one tablespace for all tables. But I don't know of any problems
with using many tablespaces. There might be hardcoded limits on the number
of tablespaces you can have though.
Obviously one tablespace per table means that you can get rid of
files quick when dropping a table if that is of interest.
The size of the UNDO log should be quite ok with a few tens of GBytes
for most use cases. But during massive inserts one might need a bigger
size of the UNDO log since checkpoints can take a long time during
massive inserts.
I wanted to find out how much free space (free extents) my database has after deleting a rather large table (around 10 GB).
I have run the command:
SELECT table_schema "Data Base Name",
round( sum( data_free ) / 1024 / 1024 / 1024 ) "Free Space in GB"
FROM information_schema.TABLES
GROUP BY table_schema;
which gave me a list of databases, and their "free spaces".
The problem is that the database which had the 10 GB table removed now has 1500+ GB of free space according to this report, which is significantly bigger than my actual hard drive capacity (around 200 GB).
How is this possible? How could I get a more realistic report? Am I missing something?
UPDATE
As an experiment, I have added and removed a 1 GB table in this database; now the report shows around 110 GB more free space. Might there be a problem with my configuration, or is this a common issue?
(This is answering some of the questions buried in Comments.)
Misnomer: "Free" space only includes whole blocks, not spare room inside blocks, and many other details.
Case 1: All tables are in ibdata1 -- SHOW TABLE STATUS (or the equivalent query into information_schema) will show the same Data_free value for every table, namely how much is free in ibdata1. This space can be reused by any table. It is hard to give the space back to the OS.
Case 2: All tables are file_per_table -- Now each Data_free refers to the space for the table. And the SUM() is meaningful. (ibdata1 still exists, but it does not contain any real tables; there is a lot of other stuff that InnoDB needs.)
Case 3: Mixture -- If you turn file_per_table on/off at various times, some tables will be in ibdata1, some will have their own tablespaces.
Case 4: CREATE TABLESPACE in 5.7 -- For example, you can have a tablespace for each database.
Case 5: PARTITIONed tables -- Each partition acts like a table.
Case 6: 8.0 -- Even more changes are coming.
Database == Directory: In MySQL's directory tree each database can be seen as a filesystem directory. Within that directory can be seen some set of files for each table. The .frm file contains the table definition. If an .ibd file exists, the table was created with file_per_table. This may be the most reliable way to discover whether the table is file_per_table. (8.0 will have significant changes here.)
How much space can I reuse? There is no good answer. Usually inserting a row will find space in the block where it belongs, and Data_free will not shrink. But, if there were block split(s), Data_free can drop by some multiple of 16KB (the block size) or 4MB (the "extent size" - or maybe it is 8MB?). Also, random inserts lead to BTree blocks being, on average, about 69% full.
Changing innodb_file_per_table has no effect until the next CREATE TABLE or ALTER TABLE. And then it only has effect on where to put the newly created/copied data+indexes (ibdata1 or .ibd). It will not destroy data.
Big tables usually have 4MB to 7MB of Data_free. When computing how many rows you can add, don't plan on Data_free dropping below that range.
Avg_row_length should be useful. But sometimes it (and Rows) are poorly approximated. Their product (Data_length) is always correct. So, this might be a good estimate of "rows to go before grabbing more space from the OS":
(Data_free - 7M) / Avg_row_length
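The same estimate as a small Python sketch. The function name rows_before_growth and the sample numbers are made up; the 7 MB reserve comes from the "4MB to 7MB" observation above, and the inputs are assumed to be the Data_free value and the average row length reported by SHOW TABLE STATUS, both in bytes.

```python
MB = 1024 * 1024


def rows_before_growth(data_free, avg_row_length, reserve=7 * MB):
    """Rough count of rows that may fit before the table asks the OS for space.

    data_free and avg_row_length are in bytes; reserve is the part of
    Data_free that big tables tend to keep unused. This is an estimate only.
    """
    if avg_row_length <= 0:
        raise ValueError("avg_row_length must be positive")
    return max(0, (data_free - reserve) // avg_row_length)


# Hypothetical table: 107 MB reported as free, rows averaging 200 bytes.
print(rows_before_growth(data_free=107 * MB, avg_row_length=200))  # 524288
```

When Data_free is already at or below the reserve, the function returns 0, which matches the advice not to plan on Data_free dropping below that range.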
Tablespace Recommendations: Put 'big' tables in file_per_table. Put 'tiny' tables in ibdata1 or database-specific tablespaces (5.7). Sorry, no simple recommendation on the dividing line between 'big' and 'tiny'. And it is clumsy to migrate a table: SET global innodb_file_per_table = ...; logout; login (to pick up the global); ALTER TABLE tbl ENGINE=InnoDB;. And it is necessarily a full copy of the table.
(Caveat: I have left out many details.)
It sounds as though you do not have innodb_file_per_table set, and are therefore using a shared tablespace. If so, every table reports the same global 'allocated but unused' space of the shared tablespace, so SUM(data_free) counts that one figure repeatedly within each table_schema.
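A small Python sketch of that overcounting, assuming every table in the schema lives in the shared tablespace and therefore reports the same data_free figure. The table names, the 150-table count and the 10 GB value are invented to mirror the numbers in the question.

```python
GB = 1024 ** 3


def naive_free(tables):
    """What SUM(data_free) does: adds up the figure from every table."""
    return sum(free for _, free in tables)


def shared_free(tables):
    """Count the shared tablespace's free space once.

    Assumes all tables sit in the shared tablespace, so each one reports
    the same value; taking the maximum counts that value a single time.
    """
    return max((free for _, free in tables), default=0)


# Hypothetical schema: 150 tables, each reporting the same 10 GB as free.
tables = [(f"t{i}", 10 * GB) for i in range(150)]

print(naive_free(tables) // GB)   # 1500
print(shared_free(tables) // GB)  # 10
```

With a mixture of shared and file_per_table tables this shortcut no longer holds, since each .ibd table then reports its own free space.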
I have several databases in MySQL with the InnoDB engine. All together they take up around 30 GB on the filesystem. A couple of days ago, I removed a lot of data from those databases (~10-15 GB), but the used space on the filesystem is the same, and reading data_length and index_length from information_schema.TABLES also gives almost the old size.
I dumped a 3.3 GB database and imported it on my workstation, where it takes only 1.1 GB (yes, it is a 1:1 copy). So, how can I calculate the size an InnoDB database would need if I reimported it into a new system?
Optimize your tables after deleting large amounts of data.
http://dev.mysql.com/doc/refman/5.1/en/optimize-table.html
InnoDB doesn't free disk space; it's a PITA. Basically, you can free it by dropping the database and restoring it from a backup (as you've noticed by chance) - see here for examples.
So, you can't calculate how big a database will be after you restore a backup. But it will never be bigger than the un-backed up one (because the unbacked up version still has space from any deleted data and the restored backup will not have that space).
This can be worked around to some extent using the file per table option; more details in the first link from this post.
The size of MySQL's ibdata file is 4 GB, but I don't think the data I have should take that much disk space. I am using the MySQL InnoDB storage engine. Am I doing something wrong with the configuration? How do I reclaim the disk space, since deleting rows didn't help at all?
Thanks
When you delete stuff, the records are only marked as unused/free and will be reused when you insert more data. You cannot reclaim disk space without doing a full dump/reload of the database, though (unless you have used the innodb_file_per_table option).
See more info here http://dev.mysql.com/doc/refman/5.1/en/adding-and-removing.html