NDB cluster creating undo logs and data files - mysql

A few things I'm unsure about when storing NDB Cluster data on disk:
In my configuration I defined DataMemory=20G. What would be the best total size for the undo log files? I saw in a blog that it should be 6 x DataMemory. Is this a must?
When creating undo log files, is it better to create many small files or a few large ones? For example, 10 files of 1G or 100 files of 100M? And if I create 200M files, what is the best buffer size?
The same goes for data files. Is it better to create 10 files of 1G or 100 files of 100M?
I'm using separate data files for separate tablespaces, and always one tablespace per table, never the same tablespace for two tables. Is this a good way to define and allocate tablespaces, or would sharing one tablespace between two tables cause no performance issues?
(I'm dealing with heavy traffic, around 4000 - 5000 TPS, and the NDB database is almost 80GB. I have 2 data nodes and 2 MySQL servers. Each data node has 128GB of memory.)
Wilson Hauck

UNDO log files are related to UNDO of disk data changes. So there is no
real relation between UNDO log file size and DataMemory. There is a
relation between DataMemory and REDO log size though since the REDO log
is used by both In-memory data and disk-data parts.
Whether to use small files or bigger files is mostly dependent on
the workings of the file system. Personally I would start with using
fairly large files and not that many.
From a performance point of view there is no difference if you use one
tablespace or if you use one per table. I always think about this as
using one tablespace for all tables. But I don't know of any problems
with using many tablespaces. There might be hardcoded limits on the number
of tablespaces you can have though.
Obviously one tablespace per table means that you can get rid of
files quickly when dropping a table if that is of interest.
The size of the UNDO log should be quite ok with a few tens of GBytes
for most use cases. But during massive inserts one might need a bigger
size of the UNDO log since checkpoints can take a long time during
massive inserts.
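As a rough sketch of the "fairly large files and not that many" suggestion, the statements below create one logfile group with two undo files and one tablespace with two data files. The object names (lg_1, ts_1, t1), the file names, and every size are illustrative assumptions, not recommendations from this answer; size them against your own disks and insert load.

```sql
-- One logfile group holding the disk-data UNDO log (sizes are placeholders).
CREATE LOGFILE GROUP lg_1
    ADD UNDOFILE 'undo_1.log'
    INITIAL_SIZE 4G
    UNDO_BUFFER_SIZE 64M
    ENGINE NDBCLUSTER;

-- Grow the UNDO log later by adding another file to the same group.
ALTER LOGFILE GROUP lg_1
    ADD UNDOFILE 'undo_2.log'
    INITIAL_SIZE 4G
    ENGINE NDBCLUSTER;

-- One tablespace that uses the logfile group, with a few large data files.
CREATE TABLESPACE ts_1
    ADD DATAFILE 'data_1.dat'
    USE LOGFILE GROUP lg_1
    INITIAL_SIZE 8G
    ENGINE NDBCLUSTER;

ALTER TABLESPACE ts_1
    ADD DATAFILE 'data_2.dat'
    INITIAL_SIZE 8G
    ENGINE NDBCLUSTER;

-- A disk-data table: non-indexed columns are stored in ts_1.
CREATE TABLE t1 (
    id INT NOT NULL PRIMARY KEY,
    payload VARCHAR(255)
) TABLESPACE ts_1 STORAGE DISK ENGINE NDBCLUSTER;
```

Several tables can name the same tablespace in their CREATE TABLE statements, or each can get its own, which matches the point above that the choice is about file management rather than performance.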

Related

How can I reduce table size in MySQL?

I have a database named "bongoTv" with lots of tables, and I found that one table is about 20GB even though it holds little data.
After removing a few rows, the storage used did not go down. Then I ran the command
OPTIMIZE TABLE notifiation to re-index it. But that increased its size to 25GB.
From my understanding of other DBMSs, this should reduce the size, so why did it increase? I think it cached the previous information somewhere.
After searching the web I found that I need to configure innodb_file_per_table=ON. But it is already enabled in my configuration, and it did not help.
I need an expert opinion from someone who works with MySQL regularly.
What do I need to do on my end, and what is the solution to this issue?
#Louis & #P.Salmon, can you help me with this?
Thanks in advance to whoever helps me with this.
In general, InnoDB tablespace files never shrink. If you delete data, it makes some space "unused" and over time InnoDB will try to reuse unused space before expanding the tablespace file further.
But there is also tablespace fragmentation. As you delete rows and leave small gaps of unused space, those small gaps may not be usable for new data. So over time, the gaps grow in number, and the tablespace uses more space than it should, if you were to store the same data as compactly as possible.
Free space that comprises full extents, or contiguous 1MB areas, is shown as data_free when you run SHOW TABLE STATUS. But smaller gaps of unused space are not shown. MySQL has no way of reporting the "crumbs" of unused space.
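For reference, the same data_free figure can be read from information_schema. This is a minimal sketch that assumes the schema name from the question (bongoTv); it lists the tables with the most reported free space first.

```sql
-- Sizes in MB; data_free only counts fully free extents, not small gaps.
SELECT table_name,
       ROUND(data_length  / 1024 / 1024) AS data_mb,
       ROUND(index_length / 1024 / 1024) AS index_mb,
       ROUND(data_free    / 1024 / 1024) AS free_mb
FROM information_schema.TABLES
WHERE table_schema = 'bongoTv'
ORDER BY data_free DESC;
```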
When you use OPTIMIZE TABLE on an InnoDB table, it still cannot shrink the tablespace, it only copies data to a new tablespace. It tries to defragment the data, leaving out the gaps where possible. So if there are a lot of large and small gaps in your old tablespace, the new tablespace should have a smaller total size.
However, while filling pages of the new tablespace, InnoDB deliberately leaves 1/16 of each page unused, to allow for future updates that might need just a little bit more room. So in theory, you might see OPTIMIZE TABLE cause the file to grow larger if the original was very compact and the new file was created with more "elbow room."
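To put a rough upper bound on that effect: if every page went from completely full to 15/16 full, the file would grow by a factor of 16/15. Taking the 20GB figure from the question as the starting size (an assumption, since the original fill level is unknown):

```sql
-- Worst-case size in GB if all pages go from 100% full to 15/16 full.
SELECT 20 * 16 / 15 AS worst_case_gb;  -- a little over 21
```

So the elbow room alone would add a little over 1GB in this scenario.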
But that still does not account for the 20GB to 25GB change you saw. That might be because sizes are cached. That is, the old file was in fact 25GB, but the table status was not reporting it. MySQL 8.0 especially has some caching behavior on some table statistics: https://bugs.mysql.com/bug.php?id=86170
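If cached statistics are the suspect, one way to get fresh numbers is to turn off the statistics cache for the session and re-read the table status. This is a sketch that assumes MySQL 8.0 or later (information_schema_stats_expiry does not exist in earlier versions) and uses the schema and table names from the question.

```sql
-- Make information_schema and SHOW TABLE STATUS read live statistics.
SET SESSION information_schema_stats_expiry = 0;

-- Recalculate the table's statistics, then look at the sizes again.
ANALYZE TABLE bongoTv.notifiation;
SHOW TABLE STATUS FROM bongoTv LIKE 'notifiation';
```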
So how to reduce the table size in MySQL?
Deleting rows is the most effective way. If you don't need data to be in the database anymore, delete it. If you might need data for archival purposes but don't need to query it every day, then copy it out to some long-term archiving format, or another database instance on a large-capacity server, and then delete the data from your primary database.
Changing data types to be smaller. For example, why use a BIGINT (64-bits) when a SMALLINT (16-bits) is sufficient for the values you store? It may seem like a small change, but it adds up. Values are stored in the row, but also stored again in any indexes that include that column.
Using compression. The best results are in text and strings that store readable text. The amount of compression depends on the nature of the data. Don't count on this too much, because at best one can expect a 2:1 ratio of compression, and often not even that much.
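The three options above could look roughly like this. The column names (created_at, status), the cutoff date, the batch size, and the KEY_BLOCK_SIZE are hypothetical; only the table name comes from the question.

```sql
-- 1. Delete data you no longer need, in batches to keep transactions short.
--    Repeat until no rows are affected.
DELETE FROM notifiation
WHERE created_at < '2023-01-01'
LIMIT 10000;

-- 2. Shrink an oversized column type (only if all stored values fit).
ALTER TABLE notifiation
    MODIFY COLUMN status SMALLINT NOT NULL;

-- 3. Use InnoDB table compression (requires innodb_file_per_table=ON).
ALTER TABLE notifiation
    ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8;
```

After large deletes, the file on disk only gets smaller once the table is rebuilt, for example with OPTIMIZE TABLE.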
Ultimately, databases tend to grow larger, and often even the rate of growth accelerates. If you accumulate a lot of data and never delete or archive it, you need a strategy to support the growth. You may just have to get larger and larger storage volumes.

How much space of ibdata1 is metadata?

I'm using an InnoDB database with a single-file configuration (in /var), so no innodb_file_per_table.
In MySQL Workbench, when I query for the space used by the databases with this query
SELECT table_schema "Database", sum( data_length + index_length ) / 1024 / 1024 "Data Base Size in MB"
FROM information_schema.TABLES GROUP BY table_schema;
It says that I have 47 GB of data.
However the size of ibdata1 is 99 GB...
I know that ibdata1 contains a bunch of things other than table data, like table indexes, MVCC (Multiversion Concurrency Control) data and table metadata.
So my question is: Is it normal that supposedly 52 GB of ibdata1 is metadata and a bunch of other things? Usually, how much data besides table data should the ibdata1 file contain?
No, it is not normal that you would have that much metadata. It is normal though that the ibdata file can grow to a ridiculous size if you aren't using innodb_file_per_table.
Your ibdata file will grow when your database grows, but it will never actually shrink.
So, for example, if you had 130 GB of data at one point and deleted a bunch of it, the ibdata file would still be 130 GB after the data was purged. It will just have a bunch of "free space" that it will then use for subsequent inserts.
As for shrinking the file, there's not much you can really do aside from wiping out your database and restoring it. This answer has some good instructions on how to do that.
Howto: Clean a mysql InnoDB storage engine?
You also might want to consider using innodb_file_per_table, as deleting data from a table and later optimizing that table will actually shrink the size of the individual table files.
There are a few reasons for having a bunch of "extra" space in ibdata1, but the most likely cases are:
You have deleted large amounts of data in the past. When you delete rows or drop tables, although free space will be made available in the file, the file itself will never shrink.
You may have an excessive amount of undo log space (or have had at some point in the past). Undo logs are kept during DELETE and UPDATE operations, and for very long-running operations touching many rows they can grow quite large. Again, if the file is expanded to hold this data it will never shrink.
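To check whether a long-running transaction is currently holding undo data, a query along these lines against information_schema.INNODB_TRX can help. It is a sketch for spotting candidates, not a measurement of undo space itself.

```sql
-- Oldest open transactions first, with the number of rows each has modified.
SELECT trx_id,
       trx_started,
       trx_rows_modified,
       trx_mysql_thread_id
FROM information_schema.INNODB_TRX
ORDER BY trx_started
LIMIT 10;
```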
As previously mentioned, using innodb_file_per_table can help with this if you expect to regularly drop tables and want to get the disk space back. My blog post The basics of InnoDB space file layout may help you understand what is included in the ibdata1 file.

MySql Performance of Innodb with single large data file vs. multiple data files per table

InnoDB allows the option of using a single data file for everything or one data file per table by setting the following in your my.cnf file:
[mysqld]
innodb_file_per_table
Comparing 8 databases of roughly 20 tables each, with a single ibdata file of 60G vs. a fairly evenly distributed 60G across the 160 individual data files in the one-per-table setup, does one setup have generally better performance than the other? Are there any considerations that would favor one approach over the other?
Benchmark it! We don't know your typical usage pattern or types of queries (Full scans? Narrow lookups based on index? Lots of updates or nearly read-only?).
innodb_file_per_table is easier to maintain — e.g. you can recover disk space after cleaning up and optimizing a single table; the default one-large-file will only grow.

Splitting an existing innodb table into separate files

I'm wondering if it's possible to split an existing InnoDB table into multiple files. I understand that if you specify innodb_file_per_table prior to creating the tables, they are split into separate files. But I have an existing database that is 150GB, and I now need to move a table to another machine.
Normally it wouldn't be a problem. But I noticed that when you use InnoDB and try to truncate/drop a table, the space isn't released on the disk (i.e., ibdata1 doesn't reduce in size; see the bug report). This is a problem, as the reason I'm moving the table is space issues on my current server. That is why I'm trying to split this existing ibdata1 file into separate files for tables. This would then give me the flexibility to reclaim the space on disk once I have moved the table.
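One commonly used approach, sketched here with hypothetical schema and table names (mydb.big_table), is to enable innodb_file_per_table and then rebuild the table so that it is written into its own .ibd file.

```sql
-- Tables created or rebuilt from now on get their own .ibd file.
SET GLOBAL innodb_file_per_table = ON;

-- A null ALTER rebuilds the table, moving it out of the shared tablespace.
ALTER TABLE mydb.big_table ENGINE=InnoDB;
```

Note that the rebuild needs enough free disk space for the new copy, and ibdata1 itself does not shrink afterwards; the space the table occupied there is only marked as reusable.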

MySQL database size

Microsoft SQL Server has a nice feature, which allows a database to be automatically expanded when it becomes full. In MySQL, I understand that a database is, in fact, a directory with a bunch of files corresponding to various objects. Does it mean that a concept of database size is not applicable and a MySQL database can be as big as available disk space allows without any additional concern? If yes, is this behavior the same across different storage engines?
It depends on the engine you're using. A list of the ones that come with MySQL can be found here.
MyISAM tables have a file per table. This file can grow to your file system's limit. As a table gets larger, you'll have to tune it, as there are index and data size optimizations that limit the default size. Also, this MyISAM documentation page says:
There is a limit of 2^32 (~4.295E+09)
rows in a MyISAM table. If you build
MySQL with the --with-big-tables
option, the row limitation is
increased to (2^32)^2 (1.844E+19) rows.
See Section 2.16.2, “Typical configure
Options”. Binary distributions for
Unix and Linux are built with this
option.
InnoDB can operate in 3 different modes: using innodb table files, using a whole disk as a table file or using innodb_file_per_table.
Table files are pre-created per your MySQL instance. You typically create a large amount of space and monitor it. When it starts filling up, you need to configure another file and restart your server. You can also set it to autoextend, so that it will add a chunk of space to the last table file when it starts to fill up. I typically don't use this feature, as you never know when you'll take the performance hit for extending the table. This page talks about configuring it.
I've never used a whole disk as a table file, but it can be done. Instead of pointing to a file, I believe you point your InnoDB table files at the un-formatted, unmounted device.
innodb_file_per_table makes InnoDB tables act like MyISAM tables. Each table gets its own table file. Last time I used this, the table files did not shrink if you deleted rows from them. When a table is dropped or altered, the file resizes.
The Archive engine is a gzipped MyISAM table.
A memory table doesn't use disk at all. In fact, when a server restarts, all the data is lost.
Merge tables are like a poor man's partitioning for MyISAM tables. It causes a bunch of identical tables to be queried as if there were one. Aside from the FRM table definition, no files exist other than the MyISAM ones.
CSV tables are wrappers around CSV files. The usual file system limits apply here. They are not too fast, since they can't have indexes.
I don't think anyone uses BDB any more. At least, I've never used it. It uses a Berkeley DB database as a back end. I'm not familiar with its restrictions.
Federated tables are used to connect to and query tables on other database servers. Again, there is only an FRM file.
The Blackhole engine doesn't store anything locally. It's used primarily for creating replication logs and not for actual data storage, since there is no data storage :)
MySQL Cluster is completely different: it stores just about everything in memory (recent editions allow disk storage) and is very different from all the other engines.
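To see which of these engines a given server actually offers, and how much space the tables of each engine currently report, queries like the following can be used. This is a minimal sketch; the sizes come from table statistics and are approximate.

```sql
-- Engines available on this server and whether they support transactions.
SELECT engine, support, transactions
FROM information_schema.ENGINES;

-- Reported data + index size per storage engine, in MB.
SELECT engine,
       COUNT(*) AS table_count,
       ROUND(SUM(data_length + index_length) / 1024 / 1024) AS size_mb
FROM information_schema.TABLES
WHERE engine IS NOT NULL
GROUP BY engine;
```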
What you describe is roughly true for MyISAM tables. For InnoDB tables the picture is different, and more similar to what other DBMSs do: one (or a few) big files with a complex internal structure for the whole server. To optimize it, you can use a whole disk (or partition) as a file (at least in Unix-like systems, where everything is a file).