WordPress/MySQL (InnoDB) tables get fragmented near-instantaneously after optimizing tables?

So I originally ran mysqltuner and it kept saying (on a fresh WordPress install, using Percona InnoDB with random dummy data for WordPress posts) that there were fragmented tables. I don't know if this is the proper way to check for fragmented tables or not:
SELECT TABLE_SCHEMA, TABLE_NAME,
       CONCAT(ROUND(data_length / (1024 * 1024), 2), 'MB') AS DATA,
       CONCAT(ROUND(data_free / (1024 * 1024), 2), 'MB') AS FREE
FROM information_schema.TABLES
WHERE TABLE_SCHEMA NOT IN ('information_schema', 'mysql')
  AND data_free > 0;
but that spits out:
+--------------+-------------+--------+--------+
| TABLE_SCHEMA | TABLE_NAME  | DATA   | FREE   |
+--------------+-------------+--------+--------+
| db_wordpress | wp_postmeta | 2.52MB | 4.00MB |
| db_wordpress | wp_posts    | 1.52MB | 4.00MB |
+--------------+-------------+--------+--------+
So I'm unsure if those tables are truly fragmented or not. I ran:
ALTER TABLE wp_postmeta ENGINE='InnoDB';
which supposedly is the correct way to "optimize" InnoDB tables? That then made the above query show:
+--------------+-------------+--------+--------+
| TABLE_SCHEMA | TABLE_NAME  | DATA   | FREE   |
+--------------+-------------+--------+--------+
| db_wordpress | wp_postmeta | 0.02MB | 4.00MB |
| db_wordpress | wp_posts    | 1.52MB | 4.00MB |
+--------------+-------------+--------+--------+
Now, mysqltuner was still saying those two tables were fragmented, so I tried:
OPTIMIZE TABLE wp_posts;
Running that query then put the "data" column back to the original 2.52MB...
So I'm unsure what's going on exactly, let alone why the tables (particularly wp_posts and wp_postmeta) would be fragmented, as my understanding was that deletes were (primarily?) the big cause of fragmentation. If inserts cause it too, would I have fragmentation issues on pretty much every new post made in WordPress, considering that's what seems to have fragmented the tables in the first place?
Either way, I'm unsure whether the query above is the correct one to check for fragmentation. If it is, would ALTER TABLE wp_postmeta ENGINE='InnoDB' or OPTIMIZE TABLE wp_posts be the correct query to optimize the table, and why would the tables still show as fragmented afterwards?
EDIT:
Percona's Config Wizard gave me:
# MyISAM #
key-buffer-size = 32M
myisam-recover = FORCE,BACKUP
# SAFETY #
max-allowed-packet = 16M
max-connect-errors = 1000000
skip-name-resolve
#sql-mode = STRICT_TRANS_TABLES,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_AUTO_VALUE_ON_ZERO,NO_ENGINE_SUBSTITUTION,NO_ZERO_DATE,NO_ZERO_IN_DATE,ONLY_FULL_GROUP_BY # I disabled this due to problems with WP/Plugins
sysdate-is-now = 1
innodb = FORCE
#innodb-strict-mode = 0 # I disabled this due to problems with WP/Plugins
# DATA STORAGE #
datadir = /var/lib/mysql/
# BINARY LOGGING #
log-bin = /var/lib/mysql/mysql-bin
expire-logs-days = 14
sync-binlog = 1
# CACHES AND LIMITS #
tmp-table-size = 96M # I tweaked this due to mysqltuner recommendation
max-heap-table-size = 96M # I tweaked this due to mysqltuner recommendation
query-cache-type = 1 # I tweaked this due to mysqltuner recommendation
query-cache-size = 96M # I tweaked this due to mysqltuner recommendation
max-connections = 100
thread-cache-size = 16
open-files-limit = 65535
table-definition-cache = 1024
table-open-cache = 256
# INNODB #
innodb_stats_on_metadata = 0 # I added this when testing
innodb-flush-method = O_DIRECT
innodb-log-files-in-group = 2
innodb-log-file-size = 64M
innodb-flush-log-at-trx-commit = 2
innodb-file-per-table = 1
innodb-buffer-pool-size = 400

I would guess that you are storing all your tables in the central tablespace ibdata1, because when you do that, the data_free for all tables is reported as the amount of unused space in the whole ibdata1 file.
In other words, comparing data_length to data_free is not a very useful way to measure fragmentation unless you use innodb_file_per_table=1.
ALTER TABLE wp_postmeta ENGINE=InnoDB will indeed do a table restructure and rebuild secondary indexes. It makes sense that the pages will be filled better, and so your 2.52MB was rewritten down to 0.02MB (which is probably one or two 16KB data pages).
At the scale you're working with, I wouldn't worry about fragmentation at all. Even in its fragmented form the data only took two and a half MB, so your buffer pool is still only a fraction filled. The reason to attend to fragmentation is when you are trying to stretch your buffer pool to fit more of your data. In your case, it's going to fit no matter what.
Re your comment and additional info:
innodb-buffer-pool-size = 400
This is not a good value to set. The unit for this variable is bytes, so you have requested an awfully tiny buffer pool. In fact, InnoDB will disregard your config and instead use the minimum value of 5MB. If you check your MySQL error log, you may see this:
[Note] InnoDB: Initializing buffer pool, size = 5.0M
This is way too small for any typical production web site. The size of the buffer pool is a crucial tuning parameter if you want to get good performance. The buffer pool should be large enough to hold your working set, that is the pages of data that are most frequently used by your application. See MySQL Performance Blog's post "10 MySQL settings to tune after installation".
That said, your database might be so small that 5.0MB is adequate -- for now. But keep in mind that the default size of the buffer pool in MySQL 5.6 is 128MB.
You have allocated 32MB to your key buffer, which is used only for MyISAM indexes. I recommend against using MyISAM for any tables in most cases. If you have no MyISAM tables, then memory would be better allocated to your InnoDB buffer pool.
You also allocated 96MB to your query cache. The query cache has some downsides to it -- depending on your site's traffic, it could actually hurt more than help. Unless you're getting a lot of bang for the buck from it, I would disable it (query_cache_type=0 and query_cache_size=0), and use that RAM for the buffer pool. See MySQL Performance Blog's post "MySQL Query Cache" for more information on this.
Re your second comment:
No, the innodb_buffer_pool_size is not in MB, it's in bytes. You can make it MB only if you add the "M" suffix to the number.
The MySQL 5.6 manual on innodb_buffer_pool_size says:
The size in bytes of the buffer pool, the memory area where InnoDB caches table and index data. The default value is 128MB.
I just tested it with MySQL 5.5.36 -- not Percona Server, just stock MySQL community edition. I confirm that when I set innodb_buffer_pool_size=400, what I get is 5.0MB, which is the documented minimum size.
I also tested with MySQL 5.1.70, and I see this in the error log when I start with the buffer pool size set to 400:
140203 9:46:20 [Warning] option 'innodb-buffer-pool-size': signed value 400 adjusted to 1048576
140203 9:46:20 InnoDB: Initializing buffer pool, size = 1.0M
MySQL 5.1's minimum buffer pool size is documented to be 1.0MB.
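The bytes-versus-MB point can be illustrated with a small Python sketch. It is not MySQL's actual option parser: `parse_size` and `effective_pool_size` are hypothetical helpers that only mimic the behaviour described above (a bare number is bytes, a K/M/G suffix multiplies by powers of 1024, and values below the documented minimum are raised to it).

```python
# Hypothetical helpers; they only imitate the behaviour described in the answer.
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}

# Minimum buffer pool sizes quoted in the answer.
MIN_POOL_55 = 5 * 1024**2  # MySQL 5.5: 5.0MB
MIN_POOL_51 = 1 * 1024**2  # MySQL 5.1: 1.0MB


def parse_size(value: str) -> int:
    """Return the size in bytes; a bare number is already bytes."""
    value = value.strip().upper()
    if value and value[-1] in UNITS:
        return int(value[:-1]) * UNITS[value[-1]]
    return int(value)


def effective_pool_size(value: str, minimum: int) -> int:
    """Raise a too-small setting to the server's minimum."""
    return max(parse_size(value), minimum)


if __name__ == "__main__":
    for setting in ("400", "400M"):
        size = effective_pool_size(setting, MIN_POOL_55)
        print(f"innodb_buffer_pool_size={setting} -> {size / 1024**2:.1f}MB")
```

Under these assumptions, `400` ends up as the 5.0MB minimum while `400M` gives the 400MB the asker presumably intended.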

Hmm, very interesting, Bill, as my setting is definitely in MB: mysqltuner shows max memory used is ~510MB, which is 100 connections at 1.1MB plus the 400MB buffer pool size. Also, if I adjust it to something like 900, MySQL fails to start, with the error log stating it can't allocate 900MB to the InnoDB buffer.
There definitely must be a difference between Percona and MySQL then.

Related

InnoDB table diskspace consuming all HDD space

Background
I have a MariaDB server installed, some of the tables use MyISAM and some of them use InnoDB. InnoDB is good for reducing query time because it is multi-core. I changed some of our huge tables into InnoDB.
Then I found my HDD was using more and more space. I checked my CentOS 7 Linux system and found that ibdata1 was consuming the space. I know that to shrink it I need to fully dump my MySQL server into a .sql file and drop all databases, then stop the MySQL server and delete the ibdata1 file. I also set innodb_file_per_table in my.cnf. Finally, I imported the .sql file back into the server.
Everything was going well until I found this issue.
Issue
I was checking my HDD usage in real time and realised that each table now uses a .ibd file with the same name as the table. And it is HUGE! After finishing the import, the HDD usage is even worse than before. I tried OPTIMIZE TABLE on a 750MB file to see if it would shrink, but no luck. I also have a 14.8GB InnoDB table, but I don't have another 14.8GB free for MySQL to optimize it, and I don't think it would reduce the usage anyway.
Attachment
Current my.cnf
[mysqld]
local-infile = 0
max_connections = 32768
long_query_time = 5
query_cache_type = ON
query_cache_size = 200M
tmp_table_size = 2M
max_heap_table_size = 64M
myisam_sort_buffer_size = 64M
table_open_cache = 4096
thread_concurrency = 28
sort_buffer_size = 16M
read_buffer_size = 16M
join_buffer_size = 16M
innodb_file_per_table
innodb_flush_method = O_DIRECT
innodb_log_file_size = 1G
innodb_buffer_pool_size = 4G
innodb_read_io_threads = 7
innodb_write_io_threads = 7
What can I do now?
Short answer: The disk space used by an InnoDB table (and indexes) is roughly 2x-3x what it would take with MyISAM. This is something to live with.
Long answer:
If you did not have a bunch of spare disk space to start with, your conversion to InnoDB will eventually run out of space, regardless of file_per_table, etc.
innodb_file_per_table = OFF: All data and indexes for all subsequently CREATEd or ALTERed tables go into the file ibdata1. That file only grows; it cannot shrink.
innodb_file_per_table = ON: All data and indexes for all subsequently CREATEd or ALTERed tables go into .ibd files -- each with the name of the table. Generally, this is the better approach because it allows for better maintenance in the long run.
Either way, a similar amount of disk space will be taken.
Other issues:
query_cache_size = 200M hurts performance; do not go above about 50M.
Both InnoDB and MyISAM are capable of using multiple CPUs -- but only one CPU per connection. On the other hand, MyISAM does "table locking", so there is less concurrency. (This may have confused you into thinking it was a CPU issue.)
Some ALTERs and all OPTIMIZEs copy the table over. So, during the operation, you need enough disk space for an extra copy of the table. When using ibdata1, this will expand, but not contract, the size of that file. With .ibd, the space is given back to the OS.
ALTER and OPTIMIZE may or may not shrink the size of the table and index(es) (and increase Data_free). OPTIMIZE is almost never useful for InnoDB.
Other tips on converting to InnoDB.
I tend to like putting 'tiny' tables into ibdata1 instead of file_per_table. But it is a hassle -- I have to think ahead.
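The point above about ALTER and OPTIMIZE needing room for an extra copy of the table can be turned into a quick pre-flight check. This is a minimal Python sketch, not a MySQL tool: `can_rebuild` and `free_bytes` are hypothetical helpers, the 10GB and 20GB free-space figures are made up for illustration, and the check conservatively assumes the rebuilt copy is as large as the original.

```python
import shutil

GIB = 1024**3


def free_bytes(path: str) -> int:
    """Free space on the filesystem that holds `path` (e.g. the datadir)."""
    return shutil.disk_usage(path).free


def can_rebuild(table_bytes: int, available_bytes: int) -> bool:
    """Conservative check: assume the new copy is as big as the old one."""
    return available_bytes >= table_bytes


if __name__ == "__main__":
    table = int(14.8 * GIB)  # the 14.8GB table from the question
    print(can_rebuild(table, 10 * GIB))  # hypothetical 10GB free
    print(can_rebuild(table, 20 * GIB))  # hypothetical 20GB free
    # Against a real server, pass the datadir instead of "."
    print(can_rebuild(table, free_bytes(".")))
```

With .ibd files the old copy's space is returned to the OS once the rebuild finishes, so the extra space is only needed temporarily.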

Using more memory in MySQL Server

Summary:
I haven't yet been able to get MySQL to use more than 1 core for a select statement and it doesn't get above 10 or 15 GB of RAM.
The machine:
I have a dedicated Database server running MariaDB using MySQL 5.6. The machine is strong with 48 cores and 192GB of RAM.
The data:
I have about 250 million rows in one large table (also several other tables ranging from 5-100 million rows). I have been doing a lot of reading from the tables, sometimes inserting into a new table to denormalize the data a bit. I am not setting this system up as a transactional system, rather, it will be used more similarly to a data warehouse with few connections.
The problem:
When I look at my server's stats, it looks like CPU is at around 70% for one core with a select query running, and memory is at about 5-8%. There is no IO waiting, so I am convinced that I have a problem with MySQL memory allocation. After searching on how to increase the usage of memory in MySQL I have noticed that the config file may be the way to increase memory usage.
The solution I have tried based on my online searching:
I have changed the tables to MyISAM engine and added many indexes. This has helped performance, but querying these tables is still incredibly slow. The write speed using load data infile is very fast, however, running a mildly complex select query takes hours or even days.
I have also tried adjusting the following configurations:
key-buffer-size = 64G
read_buffer_size = 1M
join_buffer_size = 4294967295
read_rnd_buffer_size = 2M
key_cache_age_threshold = 400
key_cache_block_size = 800
myisam_data_pointer_size = 7
preload_buffer_size = 2M
sort_buffer_size = 2M
myisam_sort_buffer_size = 10G
bulk_insert_buffer_size = 2M
myisam_repair_threads = 8
myisam_max_sort_file_size = 30G
max-allowed-packet = 256M
tmp-table-size = 32M
max-heap-table-size = 32M
query-cache-type = 0
query-cache-size = 0
max-connections = 500
thread-cache-size = 150
open-files-limit = 65535
table-definition-cache = 1024
table-open-cache = 2048
These config changes have slightly improved the amount of memory being used, but I would like to be able to use 80% of memory or so... or as much as possible to get maximum performance. Any ideas on how to increase the memory allocation to MySQL?
Since you already have no IO wait, you are using a good amount of memory. Your buffers also seem quite big, so I doubt that additional memory would give you significant CPU savings. You are limited by the CPU power of a single core.
Two strategies could help:
Use EXPLAIN or query analyzers to find out if you can optimize your queries to save CPU time. Adding missing indexes could help a lot. Sometimes you also might need combined indexes.
Evaluate an alternative storage engine (or even database) that is better suited for analytical queries and can use all of your cores. MariaDB supports InfiniDB but there are also other storage engines and databases available like Infobright, MonetDB.
Use show global variables like "%thread%" and you may get some clues on enabling thread concurrency options.
read_rnd_buffer_size is at 2M; testing it at 16384 with your data may significantly reduce the time required to complete your query.

MYSQL (via WAMP) long update query time

I have spent several weeks crunching on this to no avail, so I'm hopeful you may be able to help. Generally, I have an update query that takes forever to run (I've given up after 12 hours). To knock the obvious out of the way, I have an index on the columns. Also, I am totally self-taught on MYSQL, so I may need additional clarification on data / processes etc. This DB is for my personal use, offline. Said another way... this is not my day job. While I enjoy MYSQL, I am not a super-user.
First, my system specs...
Laptop Samsung QX410
Windows 7, 64 bit
Intel i5 M 480 @ 2.67 GHz
RAM: 8 GB (7.79 available)
WAMP 2.5 with MYSQL v5.6.17
Tables are INNODB
MYSQL set up:
# The MySQL server
[wampmysqld]
port = 3306
socket = /tmp/mysql.sock
key_buffer_size = 512M
max_allowed_packet = 32M
sort_buffer_size = 512K
net_buffer_length = 32K
read_buffer_size = 256K
read_rnd_buffer_size = 512K
myisam_sort_buffer_size = 8M
basedir=c:/wamp/bin/mysql/mysql5.6.17
log-error=c:/wamp/logs/mysql.log
datadir=c:/wamp/bin/mysql/mysql5.6.17/data
# Uncomment the following if you are using InnoDB tables
innodb_data_home_dir = C:\mysql\data/
innodb_data_file_path = ibdata1:10M:autoextend
innodb_log_group_home_dir = C:\mysql\data/
innodb_log_arch_dir = C:\mysql\data/
# You can set .._buffer_pool_size up to 50 - 80 %
# of RAM but beware of setting memory usage too high
innodb_buffer_pool_size = 4000M
innodb_additional_mem_pool_size = 32M
# Set .._log_file_size to 25 % of buffer pool size
innodb_log_file_size = 512M
innodb_log_buffer_size = 256M
innodb_flush_log_at_trx_commit = 0
innodb_lock_wait_timeout = 50
Issue in more detail:
I have two tables Trade_List and Cusip_Table and am trying to populate one column in Trade_List (I need to pre-populate this value, since many queries will be run against it).
Trade_List has 11 columns, two of which are relevant.
CUSIP (varchar 45) - generally this is a 9 digit alpha-numeric number.
TICKER (varchar 45) - generally this is 10 letters or less. I want to populate this.
This table has roughly 10 million rows.
I have removed all indices from this table except one on CUSIP.
Cusip_Table has 5 columns, two of which are relevant.
CUSIP (varchar 45) - generally this is a 9 digit alpha-numeric number.
TICKER (varchar 45) - generally this is 10 letters or less. This is already populated.
This table has roughly 70,000 rows.
I have an index 'CTDuplicateCheck' on (Cusip, Ticker).
When I run...
Select A.cusip, B.ticker
From Trade_list A, Cusip_table B
Where A.cusip = B.cusip;
... MYSQL indicates that the query takes about 13 seconds, but in reality it seems to take about a minute, so I ran profiling on it...
starting 0.000093
checking permissions 0.000006
checking permissions 0.000005
Opening tables 0.000041
init 0.000037
System lock 0.000013
optimizing 0.000015
statistics 0.000041
preparing 0.000030
executing 0.000002
Sending data 10.982211
end 0.000014
query end 0.000010
closing tables 0.000018
freeing items 0.000070
logging slow query 0.000004
cleaning up 0.000019
I don't know what any of this means, but 10 seconds for sending data seems reasonable (the return set is ~9M rows).
Just for kicks, and to make sure the index is working, I ran an 'explain' (shown below). I think this says that my index is working correctly.
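+----+-------------+-------+-------+------------------+------------------+---------+-------------------------+-------+--------------------------+
| id | select_type | table | type  | possible_keys    | key              | key_len | ref                     | rows  | Extra                    |
+----+-------------+-------+-------+------------------+------------------+---------+-------------------------+-------+--------------------------+
| 1  | SIMPLE      | B     | index | CTDuplicateCheck | CTDuplicateCheck | 96      |                         | 53010 | Using where; Using index |
| 1  | SIMPLE      | A     | ref   | TL1Cusip         | TL1Cusip         | 48      | 13f_master_data.B.CUSIP | 154   | Using index              |
+----+-------------+-------+-------+------------------+------------------+---------+-------------------------+-------+--------------------------+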
NOTE: 13f_Master_Data is the name of the database.
At any rate, when I run the same query, but change it to an update, everything falls apart and it will not complete. I would expect things to run a bit slower, but 12 hours +? I just can't imagine that this is normal for an update query that touches 9M rows. The original INSERT took less than an hour, and the select takes less than a minute. Code for the update is below...
Update Trade_list A, Cusip_table B
Set A.ticker = B.ticker
Where A.cusip = B.cusip;
Stuff I have tried:
Removed almost all indexes from Trade_List. I left one in on CUSIP.
Upgraded RAM from 4 GB to 8 GB. This did nothing. Upon further investigation, my CPU and RAM are not limiting factors. CPU generally sits around 30%, RAM never gets above 5GB utilized. This leads me to believe that the issue is I/O. Is it possible MYSQL is doing a full table-scan? Why would it not utilize the index?
Changed all types of memory allocations per http://www.percona.com/blog/2013/09/20/innodb-performance-optimization-basics-updated/ and https://rtcamp.com/tutorials/mysql/mysqltuner/ and http://www.percona.com/blog/2006/09/29/what-to-tune-in-mysql-server-after-installation/. As far as I can tell, this did nothing. Again, I don't think the limiting factor is memory available. Also, I have no doubt that my memory allocations (shown above) are completely screwed up. I had no idea what I was doing and changed things all over the place. That said, I don't think the memory changes made anything any worse.
Upgraded MYSQL and Wamp versions (did nothing).
Read and learned a lot about indexes. Candidly, I know very little about MYSQL, and am totally self-taught. I have learned a lot about memory on this foray, but need someone to step in and tell me where I have totally derailed. This database is for my own offline analysis. I am the only user.
I am happy to provide additional information that may help to analyze the issue. I'm at a total loss on this. The only thing I can come up with is that the system is doing full scans row by row... for every look-up in the update. Though, this could be completely false.
Your thoughts are much appreciated.
PM

Heavy writing to InnoDB

We are constructing, for every day, mappings from tweet user id to the list of tweet ids of tweets made by that user. The storage engine we are using is Percona xtraDB "5.1.63-rel13.4 Percona Server (GPL), 13.4, Revision 443"
We are unsatisfied with the maximal throughput in terms of row inserts per second. Our maximal throughput to process tweets with xtraDB is around 6000 ~ 8000 tweets per second. (for example, if we had to rebuild data for some day from scratch, we'll have to wait for almost a day)
For the most part we are able to do this realtime enough with the full amount of twitter data (which is roughly 4000 ~ 5000 tweets per second).
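The "almost a day" rebuild estimate can be sanity-checked from the rates quoted above. The Python sketch below is plain arithmetic and assumes both the incoming tweet rate and the insert throughput are sustained and constant for the whole day, which real traffic will not be; `rebuild_hours` is a hypothetical helper.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def rebuild_hours(incoming_per_sec: float, insert_per_sec: float) -> float:
    """Hours needed to re-insert one day's tweets at a given insert rate."""
    tweets_per_day = incoming_per_sec * SECONDS_PER_DAY
    return tweets_per_day / insert_per_sec / 3600


if __name__ == "__main__":
    # Rates from the question: 4000-5000 tweets/s incoming,
    # 6000-8000 tweets/s maximum insert throughput.
    best = rebuild_hours(4000, 8000)
    worst = rebuild_hours(5000, 6000)
    print(f"best case:  {best:.1f} hours")
    print(f"worst case: {worst:.1f} hours")
```

Under those assumptions a full-day rebuild lands somewhere between 12 and 20 hours.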
We have narrowed down the bottleneck of our application to the MySQL InnoDB insert. In our application, we read the feed from disk and parse it with Jackson (which happens at about 30,000 tweets per second). Our application then proceeds in batches of tweets. We partition the authors of these tweets into 8 groups (simple partitioning by user id modulo 8). A table is allocated for each group, and 1 thread is allocated to write the data to that table. Every day roughly 26 million unique users generate these tweets, and therefore each table has roughly 4 million rows.
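The modulo-8 partitioning described above can be sketched in a few lines of Python. This is an illustration rather than the actual application code (which runs on the JVM): `group_for`, `partition` and `table_name` are hypothetical names, and the `<day>_g<N>` table naming is an assumption based on the `2012_07_12_g0` table in the question.

```python
from collections import defaultdict

NUM_GROUPS = 8  # one table and one writer thread per group


def group_for(user_id: int) -> int:
    """Simple partitioning: user id modulo the number of groups."""
    return user_id % NUM_GROUPS


def partition(user_ids):
    """Split a batch of user ids into per-group lists."""
    groups = defaultdict(list)
    for uid in user_ids:
        groups[group_for(uid)].append(uid)
    return dict(groups)


def table_name(day: str, group: int) -> str:
    """Assumed naming scheme, e.g. 2012_07_12_g0 for group 0."""
    return f"{day}_g{group}"


if __name__ == "__main__":
    batch = [0, 1, 8, 9, 15]
    for group, uids in sorted(partition(batch).items()):
        print(table_name("2012_07_12", group), uids)
```

Because a given user id always maps to the same group, each writer thread only ever touches its own table, so the threads do not contend for the same rows.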
For a group of users, we only use one transaction for read and update. The group size is a runtime tunable. We have tried various sizes from 8 ~ 64000 , and we have determined 256 to be a good batch size.
The schema of our table is
CREATE TABLE `2012_07_12_g0` ( `userid` bigint(20) NOT NULL, `tweetId` longblob, PRIMARY KEY (`userid`)) ENGINE=InnoDB DEFAULT CHARSET=utf8
where tweetId is the list of tweet ids (long integers), compressed with Google Snappy.
Each thread uses
Select userid,tweetId from <tablename> where userid IN (....)
to resolve the userids to readback the data, and the threads use
INSERT INTO <tablename> (userid,tweetId) VALUES (...) ON DUPLICATE KEY UPDATE tweetId=VALUES(tweetId)
to update the rows with new tweetids.
We have tried setting various XtraDB parameters
innodb_log_buffer_size = 4M
innodb_flush_log_at_trx_commit = 2
innodb_max_dirty_pages_pct = 80
innodb_flush_method = O_DIRECT
innodb_doublewrite = 0
innodb_use_purge_thread = 1
innodb_thread_concurrency = 32
innodb_write_io_threads = 8
innodb_read_io_threads = 8
#innodb_io_capacity = 20000
#innodb_adaptive_flushing = 1
#innodb_flush_neighbor_pages = 0
The table size for each day is roughly 8G for all tables, and InnoDB is given 24GB to work with.
We are using:
6-disk (crucial m4 SSD, 512 GB, 000F firmware) software RAID5.
Mysql innodb data, table space on the SSD partition
ext4 mount with noatime,nodiratime,commit=60
centos 6.2
sun jdk 1.6.30
Any tips for making our insert go faster would be greatly appreciated, thanks.
InnoDB is given 24GB
Do you mean this is the innodb_buffer_pool_size? You didn't say how much memory you have nor what CPUs you are using. If so, then you should probably be using a larger innodb_log_buffer_size. What's your setting for innodb_log_file_size? It should probably be in the region of 96MB.
innodb_write_io_threads = 8
I seem to recall that ext3 has some concurrency problems with multiple writers, but I don't know about ext4.
Have you tried changing innodb_flush_method?
Which I/O scheduler are you using (in the absence of a smart disk controller, usually deadline is fastest, sometimes CFQ)?
Switching off the ext4 barriers will help with throughput. It's a bit more risky, so make sure you've got checksums enabled in JBD2. Similarly, setting innodb_flush_log_at_trx_commit=0 should give a significant increase but is more risky.
Since you're obviously not bothered about maintaining your data in a relational format, you might consider using a NoSQL database.
My initial suggestions would be:
As you don't have a RAID card with memory, you may want to comment out the innodb_flush_method = O_DIRECT line to let the system cache writes.
As you disabled the doublewrite buffer, you could also set innodb_flush_log_at_trx_commit to 0, which would be faster than 2.
Set innodb_log_buffer_size to cover at least one second of writes (approx. 12MB for 30K tweets).
If you use binary logs, make sure you have sync_binlog = 0.
On the hardware side, I would strongly suggest trying a RAID card with at least 256MB of RAM and a battery backup unit (BBU) to improve write speed. There are RAID cards on the market that support SSDs.
Hope this helps. Please let me know how it goes.

Mysql Lowering Cpu Usage through Buffering

My MySQL server is heavily loaded, now 300 qps on average.
It uses 50% CPU on average and just 700MB of RAM. My server has 8GB and over 3GB of it is free. The slow query log seems fine: slow queries are few and infrequent.
I want to be sure that it is returning cached results and does not touch the disk unnecessarily.
I think the Linux OS caches the InnoDB file, but can I rely on that?
And is there any good practice for lowering CPU usage through buffering or caching?
innodb_buffer_pool_size is set to the default value (8MB).
I have Innodb, MyIsam and Memory tables mixed.
Here is an output from a tuner script
INNODB STATUS
Current InnoDB index space = 238 M
Current InnoDB data space = 294 M
Current InnoDB buffer pool free = 0 %
Current innodb_buffer_pool_size = 8 M
KEY BUFFER
Current MyISAM index space = 113 M
Current key_buffer_size = 192 M
Key cache miss rate is 1 : 63
Key buffer free ratio = 74 %
Your key_buffer_size seems to be fine
QUERY CACHE
Query cache is enabled
Current query_cache_size = 256 M
Current query_cache_used = 19 M
Current query_cache_limit = 1 M
Current Query cache Memory fill ratio = 7.64 %
Current query_cache_min_res_unit = 4 K
Query Cache is 28 % fragmented
Since you have 3GB free, boost your innodb_buffer_pool_size to hold your entire InnoDB dataset (data + index).
Give it 1G so it has some breathing room. You won't regret it. :)
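As a quick check of that advice, the figures from the tuner output in the question can be added up. The Python sketch below is only arithmetic on those reported numbers; the variable names are made up for illustration.

```python
# Figures reported by the tuner script in the question (in MB).
innodb_index_mb = 238
innodb_data_mb = 294
current_pool_mb = 8

# Suggested setting from the answer: 1G.
proposed_pool_mb = 1024

dataset_mb = innodb_index_mb + innodb_data_mb
headroom_mb = proposed_pool_mb - dataset_mb

if __name__ == "__main__":
    print(f"InnoDB data + index: {dataset_mb} MB")
    print(f"fits in current pool:  {current_pool_mb >= dataset_mb}")
    print(f"fits in proposed pool: {proposed_pool_mb >= dataset_mb}")
    print(f"headroom at 1G: {headroom_mb} MB")
```

The whole InnoDB dataset is 532MB, so the 8MB pool can hold only a small fraction of it, while 1G leaves roughly 490MB of room for growth.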