Import/export manually or Dump Database - mysql

I have a rather large database.
I have to change servers, so I exported the database with a dump and then restored it on my new server.
But after importing it, the size of the database is different:
Old server: 772 MB
New server: 414 MB
The difference is enormous. Should I worry? (Do you think anything is missing?)
Wouldn't it be better to do an export and then an import manually?

You didn't mention whether these size numbers are from the filesystem (du -ch, etc.) or from a query. I'm guessing they are from the filesystem. As long as you didn't get import errors, your data is probably OK.
Check for FRAGMENTATION in your SOURCE tables. That is likely why your source is larger than your target. Basically, when a row is updated, it may no longer fit in the same data block, so that block is split into two, leaving some free space in both blocks (16KB used on disk, but 9KB used in the data file).
Check for fragmented tables:
select ENGINE, TABLE_NAME,Round( DATA_LENGTH/1024/1024) as data_length ,
round(INDEX_LENGTH/1024/1024) as index_length,
round(DATA_FREE/ 1024/1024) as data_free
from information_schema.tables
where DATA_FREE > 0;
Check for total disk used and total free space:
select sum((DATA_LENGTH + INDEX_LENGTH)/1024/1024) as TTL_MB_USED ,
sum(DATA_FREE)/1024/1024 as TTL_MB_FREE
from information_schema.tables
where table_schema='<your schema>';
That may help account for the difference in size between source and target.
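As a rough sanity check, the size gap from the question can be compared with the TTL_MB_FREE value returned by the second query. The sketch below is illustrative only: 772 and 414 come from the question, while the 300 MB free-space figure is a made-up placeholder, not a measurement. Free space inside partially filled pages is not counted in DATA_FREE, so part of the gap will usually stay unexplained.

```python
def size_gap_report(source_mb, target_mb, source_free_mb):
    """Compare the source/target size gap with the free space reported on the source."""
    gap_mb = source_mb - target_mb
    # Share of the gap that the reported free space can account for (capped at 100%).
    explained = min(source_free_mb / gap_mb, 1.0) if gap_mb > 0 else 1.0
    return gap_mb, explained


# 772 and 414 are from the question; 300 is a hypothetical TTL_MB_FREE value.
gap_mb, explained = size_gap_report(772, 414, 300)
print(f"gap: {gap_mb} MB, explained by reported free space: {explained:.0%}")
```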
I found this answer to be excellent to describe Fragmentation: https://serverfault.com/a/265885
First of all, you must understand that MySQL tables get fragmented when a row is updated, so it's a normal situation. When a table is created, let's say imported using a dump with data, all rows are stored with no fragmentation in many fixed-size pages. When you update a variable-length row, the page containing this row is divided into two or more pages to store the changes, and these new two (or more) pages contain blank space filling the unused room.

Related

How can I reduce table size in MySQL?

I have a database named "bongoTv" with lots of tables, but I found one table whose size is about 20GB even though it holds little data.
After removing a few rows, the storage used did not decrease. Then I ran the command
OPTIMIZE TABLE notifiation to rebuild the indexes, but it increased the size to 25GB.
As per my understanding of other DBMSs, this should reduce the size, so why did it increase? I think it cached previous information somewhere.
After searching the web, I found that I need to set innodb_file_per_table=ON. But it is already enabled in my configuration, and it did not help.
I need an expert opinion from someone who works with MySQL regularly.
What do I need to do on my end? What is the solution to this issue?
#Louis & #P.Salmon, can you help me with this?
Thanks in advance to anyone who can help me with this.
In general, InnoDB tablespace files never shrink. If you delete data, it makes some space "unused" and over time InnoDB will try to reuse unused space before expanding the tablespace file further.
But there is also tablespace fragmentation. As you delete rows and leave small gaps of unused space, those small gaps may not be usable for new data. So over time, the gaps grow in number, and the tablespace uses more space than it should, if you were to store the same data as compactly as possible.
Free space that comprises full extents, or contiguous 1MB areas, is shown as data_free when you run SHOW TABLE STATUS. But smaller gaps of unused space are not shown. MySQL has no way of reporting the "crumbs" of unused space.
When you use OPTIMIZE TABLE on an InnoDB table, it still cannot shrink the tablespace; it only copies data to a new tablespace. It tries to defragment the data, leaving out the gaps where possible. So if there are a lot of large and small gaps in your old tablespace, the new tablespace should have a smaller total size.
However, while filling pages of the new tablespace, InnoDB deliberately leaves 1/16 of each page unused, to allow for future updates that might need just a little bit more room. So in theory, you might see OPTIMIZE TABLE cause the file to grow larger if the original was very compact and the new file was created with more "elbow room."
But that still does not account for the 20GB to 25GB change you saw. That might be because sizes are cached. That is, the old file was in fact 25GB, but the table status was not reporting it. MySQL 8.0 especially has some caching behavior on some table statistics: https://bugs.mysql.com/bug.php?id=86170
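A quick back-of-the-envelope check of the 1/16 argument. It assumes, as a simplification, that the original 20GB file had no free space at all and that every page in the rebuilt file is filled to exactly 15/16. The function name is only for illustration.

```python
def rebuilt_size_upper_bound(compact_size_gb, reserved_fraction=1 / 16):
    """Size of a perfectly compact file after a rebuild that leaves
    reserved_fraction of every page unused."""
    return compact_size_gb / (1 - reserved_fraction)


rebuilt = rebuilt_size_upper_bound(20)
print(f"{rebuilt:.2f} GB after rebuild, growth of {rebuilt - 20:.2f} GB")
```

Under those assumptions the reserved page space adds a little over 1GB, well short of the 5GB increase in the question.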
So how to reduce the table size in MySQL?
Deleting rows is the most effective way. If you don't need data to be in the database anymore, delete it. If you might need data for archival purposes but don't need to query it every day, then copy it out to some long-term archiving format, or another database instance on a large-capacity server, and then delete the data from your primary database.
Changing data types to be smaller. For example, why use a BIGINT (64-bits) when a SMALLINT (16-bits) is sufficient for the values you store? It may seem like a small change, but it adds up. Values are stored in the row, but also stored again in any indexes that include that column.
Using compression. The best results are in text and strings that store readable text. The amount of compression depends on the nature of the data. Don't count on this too much, because at best one can expect a 2:1 ratio of compression, and often not even that much.
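To put a number on the data-type point above, here is a minimal sketch of the raw bytes saved by narrowing one integer column. The row count and the number of indexes are hypothetical inputs, and the estimate ignores per-row overhead and page fill, so treat the result as an order-of-magnitude figure.

```python
# Storage size in bytes of MySQL integer types.
INT_BYTES = {"TINYINT": 1, "SMALLINT": 2, "MEDIUMINT": 3, "INT": 4, "BIGINT": 8}


def bytes_saved(rows, old_type, new_type, indexes_with_column=0):
    """Raw bytes saved by changing one column's type.

    The value is counted once for the row itself and once more for every
    index that includes the column.
    """
    per_value = INT_BYTES[old_type] - INT_BYTES[new_type]
    copies = 1 + indexes_with_column
    return rows * per_value * copies


# Hypothetical table: 100 million rows, column present in two indexes.
saved = bytes_saved(100_000_000, "BIGINT", "SMALLINT", indexes_with_column=2)
print(f"about {saved / 1024**3:.2f} GiB saved")
```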
Ultimately, databases tend to grow larger, and often even the rate of growth accelerates. If you accumulate a lot of data and never delete or archive it, you must plan a strategy to support the growth. You may just have to get larger and larger storage volumes.

MySQL db reports a vastly different total size of table on prod and local

I've got a production database with a wp_options table reportedly totalling around 951,679,500,288 (900GB+) in total data length. However, when I export the database and examine it locally it only reports a small number of MB (usually 3-7MB).
There are about 2,000-10,000 rows of data in this table. The reason for this fluctuation is there is a great number of transient cache data being stored in this table and the cron is scheduled to remove them routinely. That's why there is a discrepancy in the number of rows in the 2 screenshots. Otherwise, I have checked numerous times and the non-transient data is all exactly the same and represented in both environments.
It's like there's almost a TB of garbage data hiding in this table that I can't access or see and it's only present on production. Staging and local environments with the same database operate just fine without the missing ~TB of data.
[Screenshot: summary of table on production]
[Screenshot: summary of table from same db on local]
[Screenshot: summary of both db sizes in comparison]
What could be causing the export of a SQL file to disregard 900GB of data? I've exported SQL and CSV via Adminer as well as using the 'wp db export' command.
And how could there be 900GB of data on production that I cannot see or account for other than when it calculates the total data length of the table?
It seems like deleted rows have not been purged. You can try OPTIMIZE TABLE.
Some WP plugins create "options" but fail to clean up after themselves. Suggest you glance through that huge table to see what patterns you find in the names. (Yeah, that will be challenging.) Then locate the plugin, and bury it more than 6 feet under.
OPTIMIZE TABLE might clean it up. But you probably don't know what the setting of innodb_file_per_table was when the table was created. So, I can't predict whether it will help a lot, take a long time, not help at all, or even crash.

Why does InnoDB give obviously false free space information

I wanted to find out how much free space (in extents) my database has after deleting a rather large table (around 10GB).
I ran the command:
SELECT table_schema "Data Base Name",
round( sum( data_free ) / 1024 / 1024 / 1024 ) "Free Space in GB"
FROM information_schema.TABLES
GROUP BY table_schema;
which gave me a list of databases, and their "free spaces".
The problem is that the database which had the 10GB table removed now has 1500GB+ of free space according to this report, which is significantly more than my actual hard drive capacity (around 200GB).
How is this possible? How could I get a more realistic report? Am I missing something?
UPDATE
As an experiment, I added and removed a 1GB table in this database, and now the report shows around 110GB more free space. Might there be a problem with my configuration, or is this a common issue?
(This is answering some of the questions buried in Comments.)
Misnomer: "Free" space only includes whole blocks, not spare room inside blocks, and many other details.
Case 1: All tables are in ibdata1 -- SHOW TABLE STATUS (or the equivalent query into information_schema) will show the same Data_free value for every table, namely how much is free in ibdata1. This space can be reused by any table. It is hard to give the space back to the OS.
Case 2: All tables are file_per_table -- Now each Data_free refers to the space for the table. And the SUM() is meaningful. (ibdata1 still exists, but it does not contain any real tables; there is a lot of other stuff that InnoDB needs.)
Case 3: Mixture -- If you turn file_per_table on/off at various times, some tables will be in ibdata1, some will have their own tablespaces.
Case 4: CREATE TABLESPACE in 5.7 -- For example, you can have a tablespace for each database.
Case 5: PARTITIONed tables -- Each partition acts like a table.
Case 6: 8.0 -- Even more changes are coming.
Database == Directory: In MySQL's directory tree each database can be seen as a filesystem directory. Within that directory can be seen some set of files for each table. The .frm file contains the table definition. If an .ibd file exists, the table was created with file_per_table. This may be the most reliable way to discover whether the table is file_per_table. (8.0 will have significant changes here.)
How much space can I reuse? There is no good answer. Usually inserting a row will find space in the block where it belongs, and Data_free will not shrink. But, if there were block split(s), Data_free can drop by some multiple of 16KB (the block size) or 4MB (the "extent size" - or maybe it is 8MB?). Also, random inserts lead to BTree blocks being, on average, about 69% full.
Changing innodb_file_per_table has no effect until the next CREATE TABLE or ALTER TABLE. And then it only has effect on where to put the newly created/copied data+indexes (ibdata1 or .ibd). It will not destroy data.
Big tables usually have 4MB to 7MB of Data_free. When computing how many rows you can add, don't plan on Data_free dropping below that range.
Avg_row_length should be useful. But sometimes it (and Rows) is poorly approximated. Their product (Data_length) is always correct. So, this might be a good estimate of "rows to go before grabbing more space from the OS":
(Data_free - 7M) / Avg_row_length
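The estimate above can be sketched as a small helper. The inputs are the Data_free value and the average row size (reported by SHOW TABLE STATUS as Avg_row_length), both in bytes; the 7MB reserve is the rule of thumb from this answer, and the sample numbers are hypothetical.

```python
MIB = 1024 * 1024


def rows_before_growth(data_free, avg_row_length, reserve=7 * MIB):
    """Rough count of rows that fit before the tablespace must grow.

    Data_free is not expected to drop below `reserve`, so only the
    space above that floor is treated as usable.
    """
    if avg_row_length <= 0:
        raise ValueError("avg_row_length must be positive")
    usable = max(data_free - reserve, 0)
    return usable // avg_row_length


# Hypothetical table: 107 MiB reported free, 1 KiB average row.
estimate = rows_before_growth(data_free=107 * MIB, avg_row_length=1024)
print(estimate)
```

Because the average row size is itself an approximation, the result is only a ballpark figure.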
Tablespace Recommendations: Put 'big' tables in file_per_table. Put 'tiny' tables in ibdata1 or database-specific tablespaces (5.7). Sorry, no simple recommendation on the dividing line between 'big' and 'tiny'. And it is clumsy to migrate a table: SET GLOBAL innodb_file_per_table = ...; log out; log in (to pick up the global); ALTER TABLE tbl ENGINE=InnoDB;. And it is necessarily a full copy of the table.
(Caveat: I have left out many details.)
It sounds as though you do not have innodb_file_per_table set, and are therefore using a shared tablespace. If so, then your query is returning the global 'allocated but unused' shared space repeatedly, once for each table, and summing it per table_schema.
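The over-counting can be illustrated with made-up numbers. The sketch assumes a shared tablespace with 10 GiB of unused space and 150 tables in the schema; both figures are hypothetical and chosen only to show how a sum can exceed the size of the disk.

```python
GIB = 1024 ** 3


def naive_free_gib(rows):
    """What SUM(data_free) reports: every table's value is added up."""
    return sum(data_free for _, data_free in rows) / GIB


def shared_free_gib(rows):
    """Free space when all tables share one tablespace: each row repeats
    the same figure, so it must be counted only once."""
    distinct = {data_free for _, data_free in rows}
    if len(distinct) != 1:
        raise ValueError("tables do not all report the same data_free")
    return distinct.pop() / GIB


# Hypothetical schema: 150 tables, each reporting the shared 10 GiB.
rows = [(f"table_{i}", 10 * GIB) for i in range(150)]
naive = naive_free_gib(rows)
actual = shared_free_gib(rows)
print(f"SUM(data_free): {naive:.0f} GiB, shared tablespace free: {actual:.0f} GiB")
```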

How to reclaim space after turning on page compression in SQL 2008?

I have just turned on page compression on a table (SQL 2008 Ent) using the following command:
ALTER TABLE [dbo].[Table1] REBUILD PARTITION = ALL
WITH
(DATA_COMPRESSION = PAGE
)
The hard drive now contains 50GB less space than before. I'm guessing that I need to run a command to reclaim the space. Anyone know it?
I feel embarrassed even asking this question, but is it something that could be fixed by shrinking the database in question? As it compressed the pages, perhaps it left the space free all throughout the file, and the data files just need to be condensed and shrunk to reclaim the space...
If it created a new, compressed copy of the table and then removed the old one from the file, but didn't shrink the file internally, this might also explain your sudden lack of space on the drive as well.
If this is the case, then a simple "DBCC SHRINKDATABASE('my_database')" should do the trick. NOTE: This may take a long time, and lock the database during that time so as to prevent access, so schedule it wisely.
Have you checked the table size using sp_spaceused?
Disk space used does not equal space used by data. The compression will have affected log file size (all has to be logged) and required some free working space (like the rule of thumb that index rebuild requires free space = 1.2 times largest table space).
Another option is that you need to rebuild the clustered index because it's fragmented. This compacts data and is the only way to reclaim space for text columns.
Also, read Linchi Shea's articles on data compression

MySQL database size

Microsoft SQL Server has a nice feature, which allows a database to be automatically expanded when it becomes full. In MySQL, I understand that a database is, in fact, a directory with a bunch of files corresponding to various objects. Does it mean that a concept of database size is not applicable and a MySQL database can be as big as available disk space allows without any additional concern? If yes, is this behavior the same across different storage engines?
It depends on the engine you're using. A list of the ones that come with MySQL can be found here.
MyISAM tables have a file per table. This file can grow to your file system's limit. As a table gets larger, you'll have to tune it, as there are index and data size optimizations that limit the default size. Also, this MyISAM documentation page says:
There is a limit of 2^32 (~4.295E+09) rows in a MyISAM table. If you build MySQL with the --with-big-tables option, the row limitation is increased to (2^32)^2 (1.844E+19) rows. See Section 2.16.2, “Typical configure Options”. Binary distributions for Unix and Linux are built with this option.
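For reference, the two row limits quoted above work out as follows (plain arithmetic, nothing MySQL-specific; the variable names are only for illustration):

```python
# Row limits quoted from the MyISAM documentation, written out in full.
default_limit = 2 ** 32
big_tables_limit = (2 ** 32) ** 2  # equal to 2 ** 64

print(f"{default_limit:,} rows ({default_limit:.3E})")
print(f"{big_tables_limit:,} rows ({big_tables_limit:.3E})")
```

Note that the 1.844E+19 in the quoted text is the exact value cut off after four digits; rounding it instead gives 1.845E+19.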
InnoDB can operate in 3 different modes: using innodb table files, using a whole disk as a table file or using innodb_file_per_table.
Table files are pre-created per your MySQL instance. You typically create a large amount of space and monitor it. When it starts filling up, you need to configure another file and restart your server. You can also set it to autoextend, so that it will add a chunk of space to the last table file when it starts to fill up. I typically don't use this feature, as you never know when you'll take the performance hit for extending the table. This page talks about configuring it.
I've never used a whole disk as a table file, but it can be done. Instead of pointing to a file, I believe you point your InnoDB table files at the un-formatted, unmounted device.
innodb_file_per_table makes InnoDB tables act like MyISAM tables. Each table gets its own table file. Last time I used this, the table files did not shrink if you deleted rows from them. When a table is dropped or altered, the file resizes.
The Archive engine is a gzipped MyISAM table.
A memory table doesn't use disk at all. In fact, when a server restarts, all the data is lost.
Merge tables are like a poor man's partitioning for MyISAM tables. It causes a bunch of identical tables to be queried as if there were one. Aside from the FRM table definition, no files exist other than the MyISAM ones.
CSV tables are wrappers around CSV files. The usual file system limits apply here. They are not too fast, since they can't have indexes.
I don't think anyone uses BDB any more. At least, I've never used it. It uses a Berkeley database as a back end. I'm not familiar with its restrictions.
Federated tables are used to connect to and query tables on other database servers. Again, there is only an FRM file.
The Blackhole engine doesn't store anything locally. It's used primarily for creating replication logs and not for actual data storage, since there is no data storage :)
MySQL Cluster is completely different: it stores just about everything in memory (recent editions allow disk storage) and is very different from all the other engines.
What you describe is roughly true for MyISAM tables. For InnoDB tables the picture is different, and more similar to what other DBMSs do: one (or a few) big files with a complex internal structure for the whole server. To optimize it, you can use a whole disk (or partition) as a file (at least on Unix-like systems, where everything is a file).