InnoDB row size changing exponentially while table is growing?

I have a huge InnoDB table with three columns (int, mediumint, int). The innodb_file_per_table setting is on, and the only index is a PRIMARY KEY on the first two columns.
The table schema is:
CREATE TABLE `big_table` (
`user_id` int(10) unsigned NOT NULL,
`another_id` mediumint(8) unsigned NOT NULL,
`timestamp` int(10) unsigned NOT NULL,
PRIMARY KEY (`user_id`,`another_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
MySQL Version is 5.6.16
Currently I am multi-inserting over 150 rows per second. There are no deletions and no updates.
There are no significant rollbacks or other aborted transactions that would cause wasted space.
MySQL shows a calculated size of 75.7 GB for that table.
.ibd size on disk: 136,679,784,448 bytes (127.29 GiB)
Counted rows: 2,901,937,966 (47.10 bytes per row)
Two days later, MySQL still shows a calculated size of 75.7 GB for that table.
.ibd size on disk: 144,263,086,080 bytes (134.36 GiB)
Counted rows: 2,921,284,863 (49.38 bytes per row)
Running SHOW TABLE STATUS for the table shows:
Engine:          InnoDB
Version:         10
Row_format:      Compact
Rows:            2645215723
Avg_row_length:  30
Data_length:     81287708672
Max_data_length: 0
Index_length:    0
Data_free:       6291456
Collation:       utf8_unicode_ci
Here are my Questions:
Why is the disk usage growing disproportionately to the row count?
Why are the Avg_row_length and Data_length so far off?
I hope someone can help me stop the disk usage from growing like this. I did not notice this behaviour while the table was smaller.

I am assuming that your table hasn't grown to its present ~2.9 billion rows organically, and that you either recently loaded this data or have caused the table to be re-organized (using ALTER TABLE or OPTIMIZE TABLE, for instance). So it starts off quite well-packed on disk.
Based on your table schema (which is fortunately very simple and straightforward), each row (record) is laid out as follows:
(Header) 5 bytes
`user_id` 4 bytes
`another_id` 3 bytes
(Transaction ID) 6 bytes
(Rollback Pointer) 7 bytes
`timestamp` 4 bytes
=============================
Total 29 bytes
InnoDB will never actually fill pages to more than approximately 15/16 full (and normally never less than 1/2 full). With all of the extra overhead in various places, the fully-loaded cost of a record is somewhere around 32 bytes minimum and 60 bytes maximum per row in leaf pages of the index.
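If you want to sanity-check those bounds against a table of your own, a rough query like this (just a sketch; note that table_rows is itself only an estimate) gives the expected size range for the clustered index:
SELECT table_rows,
       table_rows * 32 / POW(1024, 3) AS min_expected_gib,
       table_rows * 60 / POW(1024, 3) AS max_expected_gib
FROM information_schema.tables
WHERE table_name = 'big_table';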
When you bulk-load data through an import or through an ALTER TABLE or OPTIMIZE TABLE, the data will normally be loaded (and the indexes created) in order by PRIMARY KEY, which allows InnoDB to very efficiently pack the data on disk. If you then continue writing data to the table in random (or effectively random) order, the efficiently-packed index structures must expand to accept the new data, which in B+Tree terms means splitting pages in half. If you have an ideally-packed 16 KiB page where records consume ~32 bytes on average, and it is split in half to insert a single row, you now have two half-empty pages (~16 KiB wasted) and that new row has "cost" 16 KiB.
Of course that's not really true. Over time the index tree would settle down with pages somewhere between 1/2 full and 15/16 full -- it won't keep splitting pages forever, because the next insert that must happen into the same page will find that plenty of space already exists to do the insert.
This can be a bit disconcerting if you initially bulk load (and thus efficiently pack) your data into a table and then switch to organically growing it, though. Initially it will seem as though the tables are growing at an insane pace, but if you track the growth rate over time it should slow down.
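If you want to watch that happen, one option (just a sketch, with a made-up log table name) is to snapshot what information_schema reports once a day and compare it against the .ibd file size. An OPTIMIZE TABLE would re-pack everything tightly again, but random-order inserts will immediately start re-expanding it.
CREATE TABLE IF NOT EXISTS big_table_size_log (
  logged_at    DATETIME NOT NULL,
  data_bytes   BIGINT UNSIGNED NOT NULL,
  index_bytes  BIGINT UNSIGNED NOT NULL,
  free_bytes   BIGINT UNSIGNED NOT NULL,
  row_estimate BIGINT UNSIGNED NOT NULL
) ENGINE=InnoDB;
INSERT INTO big_table_size_log
SELECT NOW(), data_length, index_length, data_free, table_rows
FROM information_schema.tables
WHERE table_name = 'big_table';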
You can read more about InnoDB index and record layout in my blog posts: The physical structure of records in InnoDB, The physical structure of InnoDB index pages, and B+Tree index structures in InnoDB.

Related

Database design for user and top 25 recommendations

I have a MySQL database and need to store up to 25 recommendations for each user (shown when the user visits the site). Here is my simple table that holds the userid, the recommendation, and the rank of the recommendation:
userid | recommendation | rank
1 | movie_A | 1
1 | movie_X | 2
...
10 | movie_B | 1
10 | movie_A | 2
....
I expect about 10M users, and combined with 25 recommendations each that would result in 250M rows. Are there any better ways to design a user-recommendation table?
Thanks!
Is your requirement only to retrieve the 25 recommendations and send them to a UI layer for consumption?
If that is the case, the system that computes the recommendations can build a JSON document and store it against the user_id. MySQL has support for a JSON datatype (5.7 and later).
This might not be a good approach if you want to perform search queries against the JSON document.
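For illustration, a minimal sketch of that JSON approach could look like this (requires MySQL 5.7 or later; the table and column names here are made up):
CREATE TABLE user_recommendations (
  user_id         INT UNSIGNED NOT NULL PRIMARY KEY,
  recommendations JSON NOT NULL
) ENGINE=InnoDB;
-- The recommendation job rewrites the whole document for a user in one statement.
INSERT INTO user_recommendations (user_id, recommendations)
VALUES (1, JSON_ARRAY('movie_A', 'movie_X', 'movie_B'))
ON DUPLICATE KEY UPDATE recommendations = VALUES(recommendations);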
250 million rows isn't unreasonable in a simple table like this:
CREATE TABLE UserMovieRecommendations (
user_id INT UNSIGNED NOT NULL,
movie_id INT UNSIGNED NOT NULL,
rank TINYINT UNSIGNED NOT NULL,
PRIMARY KEY (user_id, movie_id, rank),
FOREIGN KEY (user_id) REFERENCES Users(user_id),
FOREIGN KEY (movie_id) REFERENCES Movies(movie_id)
);
That's 9 bytes per row, so only about 2 GB of data:
25 * 10,000,000 * 9 bytes = 2,250,000,000 bytes, or about 2.1 GB.
Perhaps double that to account for indexes and so on. Still not hard to imagine a MySQL server configured to hold the entire data set in RAM. And it's probably not necessary to hold all the data in RAM, since not all 10 million users will be viewing their data at once.
You might never reach 10 million users, but if you do, I expect that you will be using a server with plenty of memory to handle this.
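For what it's worth, fetching one user's list from a table like the one above is a single short index range scan. A sketch of the lookup (the user_id value is just an example; rank is backticked because it became a reserved word in MySQL 8.0):
SELECT movie_id, `rank`
FROM UserMovieRecommendations
WHERE user_id = 12345
ORDER BY `rank`
LIMIT 25;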

mysql database design for tracking if a user has seen an item

I want to show a user only the content he has not viewed yet.
I considered storing a comma-separated string of the item IDs a user has viewed, but I realised I won't know the possible length of that string.
The alternative I could find was to store it like a log, in a table like:
user_id | item_id
1 | 1
2 | 2
1 | 2
Which approach will be better for around ten thousand users and thousands of items?
A table of pairs like that would be only 10M rows. That is "medium sized" as tables go.
Have
PRIMARY KEY(user_id, item_id),
INDEX(item_id, user_id)
And, if you are not going past 10K and 1K, consider using SMALLINT UNSIGNED (up to 64K in 2 bytes). Or, to be more conservative, MEDIUMINT UNSIGNED (up to 16M in 3 bytes).
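Putting that together, a minimal sketch might look like the following; the items table and its column names are assumptions for illustration, not part of the original question:
CREATE TABLE user_seen_item (
  user_id MEDIUMINT UNSIGNED NOT NULL,
  item_id MEDIUMINT UNSIGNED NOT NULL,
  PRIMARY KEY (user_id, item_id),
  INDEX (item_id, user_id)
) ENGINE=InnoDB;
-- Content user 123 has not viewed yet (assumes an `items` table keyed by item_id).
SELECT i.item_id
FROM items AS i
WHERE NOT EXISTS (
  SELECT 1
  FROM user_seen_item AS s
  WHERE s.user_id = 123
    AND s.item_id = i.item_id
);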

Average row length higher than possible

This is not a duplicate of Why is InnoDB table size much larger than expected? The answer to that question states that if I don't specify a primary key then 6 bytes is added to the row. I did specify a primary key, and there is more than 6 bytes to explain here.
I have a table that is expecting millions of records, so I paid close attention to the storage size of each column. Each row should take 15 bytes (smallint = 2 bytes, date = 3 bytes, datetime = 8 bytes)
CREATE TABLE archive (
customer_id smallint(5) unsigned NOT NULL,
calendar_date date NOT NULL,
inserted datetime NOT NULL,
value smallint(5) unsigned NOT NULL,
PRIMARY KEY (`customer_id`,`calendar_date`,`inserted`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
The table now has a half million records in it and is taking more storage than expected. I ran this query to get more details from the system:
SELECT *
FROM information_schema.TABLES
WHERE table_name = 'archive';
information_schema.index_length = 0
information_schema.avg_row_length = 37
information_schema.engine = InnoDB
information_schema.table_type = BASE TABLE
HOW!?
I was expecting 15 bytes per row, and it's taking 37. Can anyone give me an idea of where to look next for an explanation? I've done a lot of reading on this, and I've seen some explanations for an extra 6 or 10 bytes being added to a row size, but that doesn't explain the 22 extra bytes.
One explanation is that indexes also take up storage. There are no indexes on this table.
One explanation is that the information_schema.tables query returns an unreliable row count, which would throw off the avg_row_length. I have checked the row count it is using against a count(*) query and it is only off by a little (1/20 of 1%), so that's not the whole story.
Another explanation is fragmentation. Of note, this table has been rebuilt from an SQL dump, so there hasn't been any hammering of updates, inserts, and deletes.
Because avg_row_length is data_length / rows.
data_length is basically the total size of the table on disk. An InnoDB table is more than just a list of rows. So there's that extra overhead.
Because an InnoDB row is more than the data.
Similar to above, each row comes with some overhead. So that's going to add to the size of a row. An InnoDB table also isn't just a list of data crammed together. It needs a little extra empty space to work efficiently.
Because stuff is stored on disks in blocks and those blocks aren't always full.
Disks usually store things in 4K, 8K, or 16K blocks. Sometimes things don't fit perfectly in those blocks, so you can get some empty space.
As we'll see below, MySQL is going to allocate the table in blocks. And it's going to allocate a lot more than it needs to avoid having to grow the table (which can be slow and lead to disk fragmentation which makes things even slower).
To illustrate this, let's start with an empty table.
mysql> create table foo ( id smallint(5) unsigned NOT NULL );
mysql> select data_length, table_rows, avg_row_length from information_schema.tables where table_name = 'foo';
+-------------+------------+----------------+
| data_length | table_rows | avg_row_length |
+-------------+------------+----------------+
| 16384 | 0 | 0 |
+-------------+------------+----------------+
It uses 16K, or four 4K blocks, to store nothing. The empty table doesn't need this space, but MySQL allocated it on the assumption that you're going to put a bunch of data in it. This avoids having to do an expensive reallocation on each insert.
Now let's add a row.
mysql> insert into foo (id) VALUES (1);
mysql> select data_length, table_rows, avg_row_length from information_schema.tables where table_name = 'foo';
+-------------+------------+----------------+
| data_length | table_rows | avg_row_length |
+-------------+------------+----------------+
| 16384 | 1 | 16384 |
+-------------+------------+----------------+
The table didn't get any bigger; there's still all that unused space within those 4 blocks it has. There's one row, which means an avg_row_length of 16K. Clearly absurd. Let's add another row.
mysql> insert into foo (id) VALUES (1);
mysql> select data_length, table_rows, avg_row_length from information_schema.tables where table_name = 'foo';
+-------------+------------+----------------+
| data_length | table_rows | avg_row_length |
+-------------+------------+----------------+
| 16384 | 2 | 8192 |
+-------------+------------+----------------+
Same thing. 16K is allocated for the table, 2 rows using that space. An absurd result of 8K per row.
As I insert more and more rows, the table size stays the same, it's using up more and more of its allocated space, and the avg_row_length comes closer to reality.
mysql> select data_length, table_rows, avg_row_length from information_schema.tables where table_name = 'foo';
+-------------+------------+----------------+
| data_length | table_rows | avg_row_length |
+-------------+------------+----------------+
| 16384 | 2047 | 8 |
+-------------+------------+----------------+
Here also we start to see table_rows become inaccurate. I definitely inserted 2048 rows.
Now when I insert some more...
mysql> select data_length, table_rows, avg_row_length from information_schema.tables where table_name = 'foo';
+-------------+------------+----------------+
| data_length | table_rows | avg_row_length |
+-------------+------------+----------------+
| 98304 | 2560 | 38 |
+-------------+------------+----------------+
(I inserted 512 rows, and table_rows has snapped back to reality for some reason)
MySQL decided the table needs more space, so it got resized and grabbed a bunch more disk space. avg_row_length just jumped again.
It grabbed a lot more space than it needs for those 512 rows, now it's 96K or 24 4K blocks, on the assumption that it will need it later. This minimizes how many potentially slow reallocations it needs to do and minimizes disk fragmentation.
This doesn't mean all that space was filled. It just means MySQL thought it was full enough to need more space to run efficiently. If you want an idea why that's so, look into how a hash table operates. I don't know if InnoDB uses a hash table, but the principle applies: some data structures operate best when there's some empty space.
The disk used by a table is directly related to the number of rows and types of columns in the table, but the exact formula is difficult to figure out and will change from version to version of MySQL. Your best bet is to do some empirical testing and resign yourself that you'll never get an exact number.
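For example, one way to run that empirical test (a sketch; the scratch table name and the row count are arbitrary) is to load a known number of rows into a copy and see what InnoDB reports:
CREATE TABLE archive_test LIKE archive;
INSERT INTO archive_test SELECT * FROM archive LIMIT 100000;
ANALYZE TABLE archive_test;
SELECT data_length,
       table_rows,
       data_length / table_rows AS approx_bytes_per_row
FROM information_schema.tables
WHERE table_name = 'archive_test';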

Optimizing SQL Query from a Big Table Ordered by Timestamp

We have a big table with the following table structure:
CREATE TABLE `location_data` (
`id` int(20) NOT NULL AUTO_INCREMENT,
`dt` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`device_sn` char(30) NOT NULL,
`data` char(20) NOT NULL,
`gps_date` datetime NOT NULL,
`lat` double(30,10) DEFAULT NULL,
`lng` double(30,10) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `dt` (`dt`),
KEY `data` (`data`),
KEY `device_sn` (`device_sn`,`data`,`dt`),
KEY `device_sn_2` (`device_sn`,`dt`)
) ENGINE=MyISAM AUTO_INCREMENT=721453698 DEFAULT CHARSET=latin1
We frequently perform queries such as the following:
SELECT * FROM location_data WHERE device_sn = 'XXX' AND data = 'location' ORDER BY dt DESC LIMIT 1;
OR
SELECT * FROM location_data WHERE device_sn = 'XXX' AND data = 'location' AND dt >= '2014-01-01 00:00:00' AND dt <= '2014-01-01 23:00:00' ORDER BY dt DESC;
We have been optimizing this in a few ways:
By adding indexes and using FORCE INDEX on device_sn.
By separating the table into multiple tables based on the date (e.g. location_data_20140101), pre-checking whether there is data for a certain date, and then querying that particular table alone. These tables are created by cron once a day, and the data in location_data for that particular date is deleted.
The table location_data is HIGH WRITE and LOW READ.
However, at times the query still runs really slowly. I wonder if there are other methods, or ways to restructure the data, that would allow us to read data in sequential date order for a given device_sn.
Any tips are more than welcomed.
EXPLAIN STATEMENT 1ST QUERY:
+----+-------------+--------------+------+----------------------------+-----------+---------+-------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------+------+----------------------------+-----------+---------+-------------+------+-------------+
| 1 | SIMPLE | location_dat | ref | data,device_sn,device_sn_2 | device_sn | 50 | const,const | 1 | Using where |
+----+-------------+--------------+------+----------------------------+-----------+---------+-------------+------+-------------+
EXPLAIN STATEMENT 2nd QUERY:
+----+-------------+--------------+-------+-------------------------------+------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------+-------+-------------------------------+------+---------+------+------+-------------+
| 1 | SIMPLE | test_udp_new | range | dt,data,device_sn,device_sn_2 | dt | 4 | NULL | 1 | Using where |
+----+-------------+--------------+-------+-------------------------------+------+---------+------+------+-------------+
The index device_sn (device_sn, data, dt) is good. MySQL should use it without needing FORCE INDEX. You can verify this by running EXPLAIN SELECT ...
However, your table is MyISAM, which only supports table-level locks. If the table is write-heavy, it may be slow. I would suggest converting it to InnoDB.
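A sketch of both steps (note that the ALTER copies and rebuilds the entire table, which on roughly 700 million rows will take a long time and a lot of disk space, so test it on a copy or during a maintenance window):
-- 1. Confirm the composite index is chosen without FORCE INDEX.
EXPLAIN SELECT * FROM location_data
WHERE device_sn = 'XXX' AND data = 'location'
ORDER BY dt DESC LIMIT 1;
-- 2. Convert the storage engine to get row-level locking.
ALTER TABLE location_data ENGINE=InnoDB;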
OK, I'll provide the information that I know; this might not answer your question directly but could provide some insight.
There exist certain differences between InnoDB and MyISAM. Forget about full-text indexing or spatial indexes; the huge difference is in how they operate.
InnoDB has several great features compared to MyISAM.
First off, InnoDB can keep the data set it works with in RAM. This is why database servers come with a lot of RAM: so that I/O operations can be done quickly. For example, an index scan is faster if the indexes are in RAM rather than on an HDD, because finding data on an HDD is several orders of magnitude slower than doing it in RAM. The same applies to full table scans.
The variable that controls this in InnoDB is called innodb_buffer_pool_size. By default it is quite small (128 MB in MySQL 5.6). I personally set this value high, sometimes even up to 90% of available RAM. Usually, when this value is tuned properly, a lot of people experience incredible speed gains.
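You can check what a server is currently using; on 5.6 the value has to be set in my.cnf and requires a restart to change, while from 5.7 on it can be resized online:
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';  -- value is in bytes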
The other thing is that InnoDB is a transactional engine. That means it will tell you whether a write to disk succeeded or failed, and that answer will be correct. MyISAM can't do that, because it doesn't force the OS (and the OS doesn't force the drive) to commit the data permanently. That's why records are sometimes lost with MyISAM: it thinks the data is written because the OS said it was, while in reality the OS buffered the write so it could flush larger chunks in a single I/O, and the drive's cache can lose that data before it is actually written down. The result is that you don't have control over how the data is written.
With InnoDB you can start a transaction, execute, say, 100 INSERT queries, and then commit. That effectively forces the hard drive to flush all 100 inserts at once, using one I/O operation. If each INSERT is 4 KB long, 100 of them are 400 KB. That means you'll use 400 KB of your disk's bandwidth in one I/O operation, and the remaining I/O capacity is available for other uses. This is how inserts get optimized.
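As a sketch (this only applies once the table is InnoDB; the column values are made up):
START TRANSACTION;
INSERT INTO location_data (device_sn, data, gps_date, lat, lng)
VALUES ('SN0001', 'location', NOW(), 3.1390000000, 101.6869000000);
INSERT INTO location_data (device_sn, data, gps_date, lat, lng)
VALUES ('SN0002', 'location', NOW(), 1.3521000000, 103.8198000000);
-- ... more INSERTs ...
COMMIT;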
Next are indexes with low cardinality; cardinality is the number of unique values in an indexed column. For a primary key every value is unique, so its cardinality equals the row count, which is the highest possible value. Indexes with low cardinality are on columns with only a few distinct values, such as a yes/no flag. If an index's cardinality is too low, MySQL will prefer a full table scan, which is MUCH quicker. Also, forcing an index that MySQL doesn't want to use could (and probably will) slow things down: when using an indexed search, MySQL processes records one by one, whereas during a table scan it can read multiple records at once and avoid processing them individually. If those records were written sequentially on a mechanical disk, further optimizations are possible.
TL;DR:
use InnoDB on a server where you can allocate sufficient RAM
set the value of innodb_buffer_pool_size large enough so you can allocate more resources for faster querying
use an SSD if possible
try to wrap multiple INSERTs into transactions so you can better utilize your hard drive's bandwidth and I/O
avoid indexing columns that have low unique value count compared to row count - they just waste space (though there are exceptions to this)

Optimal MySQL Config (Partition) & Indexes / Hypertable / RAID Config (Huge Database)

tl;dr:
1. DB partitioning with primary key
2. Index size problem
3. DB size grows around 1-3 GB per day
4. RAID setup
5. Do you have experience with Hypertable?
Long Version:
I just built/bought a home server:
Xeon E3-1245, 3.4 GHz, HT
32 GB RAM
6x 1.5 TB WD Caviar Black 7200 RPM
I will use the onboard RAID of the Intel Server Board S1200BTL (no money left for a RAID controller). http://ark.intel.com/products/53557/Intel-Server-Board-S1200BTL
The mainboard has 4x SATA 3 Gb/s ports and 2x SATA 6 Gb/s ports.
I'm not yet sure if I can set up all 6 HDDs in RAID 10;
if that is not possible, I thought of 4x HDDs in RAID 10 (MySQL DB) and 2x HDDs in RAID 0 (OS / MySQL indexes).
(If the RAID 0 breaks, that's no problem for me; I only need to secure the DB.)
About the DB:
It's a web crawler DB, where domains, URLs, links, and similar data get stored.
So I thought I would partition the DB by the primary key of each table, like
(1-1000000), (1000001-2000000), and so on.
When I do search / insert / select queries on the DB, I need to scan the whole table, because some data could be in row 1 and other data in row 1000000000000.
If I partition by primary key (auto_increment) like that, will this use all my CPU cores, so that it scans each partition in parallel? Or should I stick with one huge table without partitioning?
The DB will be very big; on my home system right now it is:
Table extract: 25,034,072 Rows
Data 2,058.7 MiB
Index 2,682.8 MiB
Total 4,741.5 MiB
Table structure:
extract_id    bigint(20) unsigned   NOT NULL   PRI   auto_increment
url_id        bigint(20)            NOT NULL   MUL
extern_link   varchar(2083)         NOT NULL   MUL
anchor_text   varchar(500)          NOT NULL
http_status   smallint(2) unsigned  NOT NULL   default 0
Indexes:
PRIMARY     BTREE  unique      (extract_id)                 cardinality 25,034,072
link        BTREE  unique      (url_id, extern_link(400))   cardinality 25,034,072
externlink  BTREE  non-unique  (extern_link(400))           cardinality 1,788,148
Table urls: 21,889,542 Rows
Data 2,402.3 MiB
Index 3,456.2 MiB
Total 5,858.4 MiB
Table structure:
url_id        bigint(20)            NOT NULL   PRI   auto_increment
domain_id     bigint(20)            NOT NULL   MUL
url           varchar(2083)         NOT NULL
added         date                  NOT NULL
last_crawl    date                  NOT NULL
extracted     tinyint(2) unsigned   NOT NULL   MUL   default 0
extern_links  smallint(5) unsigned  NOT NULL   default 0
crawl_status  tinyint(11) unsigned  NOT NULL   default 0
status        smallint(2) unsigned  NOT NULL   default 0
Indexes:
PRIMARY           BTREE  unique      (url_id)               cardinality 21,889,542
domain_id         BTREE  unique      (domain_id, url(330))  cardinality 0 / 21,889,542
extracted_status  BTREE  non-unique  (extracted, status)    cardinality 2 / 31
I see I could fix the externlink & link indexes; I just added externlink because I needed to query that field and was not able to use the link index. Do you see anything I could tune on the indexes? My new system will have 32 GB of RAM, but if the DB grows at this speed I will use 90% of the RAM within a few weeks / months.
Does a packed index help? (How big is the performance decrease?)
The other important tables are under 500MB.
Only the URL Source table is huge: 48.6 GiB
Structure:
url_id BIGINT
pagesource mediumblob (the data is packed with gzip at high compression)
The only index is on url_id (unique).
The data in this table can be wiped once I have extracted everything I need.
Do you have any experience with Hypertable? http://hypertable.org/ <= Google's Bigtable. If I move to Hypertable, would this help me with performance (extracting data / searching / inserting / selecting) and DB size? I read the page but I'm still somewhat clueless, because you can't directly compare MySQL with Hypertable. I will try it out soon; I must read the documentation first.
What I need is a solution that fits my setup, because I have no money left for any other hardware.
Thanks for your help.
Hypertable is an excellent choice for a crawl database. Hypertable is an open source, high performance, scalable database modeled after Google's Bigtable. Google developed Bigtable specifically for their crawl database. I recommend reading the Bigtable paper since it uses the crawl database as the running example.
Regarding #4 (RAID setup): it's not recommended to use RAID 5 for production MySQL servers. There is a great article about it here -> http://www.dbasquare.com/2012/04/02/should-raid-5-be-used-in-a-mysql-server/