WordPress database performance: Percona Server vs. MySQL w/o InnoDB

I don't want to ask a subjective "which DBMS is best?" or "which DBMS of these two is better?". This doesn't have to be a fanboy debate.
Rather, I welcome any benchmark test results or specific experiences when it comes to one specific criterion - performance - especially with respect to one particular application: WordPress.
I understand that WordPress doesn't use InnoDB, and so disabling InnoDB in MySQL can speed things up. On the other hand, Percona is a MySQL fork that replaces InnoDB with XtraDB and also claims to be highly efficient and high-performance.
How does each stack up on performance when it comes to running WordPress? (no need for competition...both might come out looking very well, for all I know)
I have tried searching generally on Google, but haven't come across so much as an intelligent discussion, let alone performance benchmark tests.
Would greatly appreciate if any of the experts here could share their experiences. Many thanks!
And please keep any smug, snide comments like "why don't YOU try" to yourself. If I could, I would. And the purpose of Stack Overflow is to share expertise and learn from each other, not to do everything yourself.

This question is less "MySQL vs. Percona Server" than "MyISAM vs. InnoDB/XtraDB". They both have their own performance characteristics, and which storage engine is right for you largely depends on your workload. Most WordPress sites are low-traffic and read-mostly, so as long as your data fits into your buffer pool (for InnoDB/XtraDB) or key cache (for MyISAM), I would expect not-too-dissimilar performance.
Having done a lot of work on WordPress database optimization, I can tell you that the performance of your WordPress site depends more upon the class of your hardware and your chosen plugins.
You should use a caching plugin so that you avoid a ton of database read requests
You should avoid plugins that issue expensive queries (sadly, this covers most plugins)
You should prune your comments (usually comments are 99+% spam, and the ones marked as spam just sit in your database taking up space); a cleanup sketch follows this list
Your host should have enough RAM for the hot dataset to fit in memory
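For the comment-pruning point, a minimal sketch, assuming the default wp_ table prefix (adjust for your installation):
-- Delete comments WordPress has already flagged as spam,
-- then rebuild the table to reclaim the space.
DELETE FROM wp_comments WHERE comment_approved = 'spam';
OPTIMIZE TABLE wp_comments;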
If you really want to go into detail about MyISAM vs. InnoDB/XtraDB, you can check out the following links:
http://www.mysqlperformanceblog.com/2009/01/12/should-you-move-from-myisam-to-innodb/
http://www.rackspace.com/knowledge_center/article/mysql-engines-myisam-vs-innodb
So, to make a long answer even longer: you'll need to profile your MySQL instance once you can generate production traffic. I know you said you couldn't, but... this question is kind of like me asking "What haircut would look best on me?" without including a picture.

WordPress can use InnoDB (or XtraDB) just fine. I have done consulting and training for sites that host WordPress at scale, using each of MyISAM, InnoDB, and XtraDB.
WordPress 3.5.1 creates tables without specifying the storage engine. So it honors the default storage engine on whatever instance of MySQL you're using. As of MySQL 5.5 (ca. December 2010), the default storage engine is InnoDB. I tested installing WordPress on a virtual host running MySQL 5.6.10, and it created tables using the InnoDB storage engine.
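If you want to verify this on your own instance, a quick check (MySQL 5.5+; the wordpress schema name here is just an example):
SHOW VARIABLES LIKE 'default_storage_engine';
SHOW TABLE STATUS FROM wordpress LIKE 'wp_posts';  -- the Engine column shows what was created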
I don't have any benchmarks to share, but those would be of limited use anyway, because performance depends so much on the given hardware, the traffic load, and other factors.
A CMS like WordPress tends to be heavily weighted toward read-only queries. This is where InnoDB should give good benefit, because it caches data pages as well as indexes. MyISAM only caches indexes, and relies on the filesystem cache to hold data.
So the key to making WordPress perform well is to allocate enough innodb_buffer_pool_size to hold the data and indexes for all your tables. The data size of a WordPress site (even one with hundreds of articles) isn't typically very large, so you probably only need a few GB of buffer pool to hold all frequently-requested data in the buffer. Once the data and index pages have populated the InnoDB buffer pool, 99.9% of your queries will be served out of RAM, and the site will have great performance.
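As a my.cnf sketch (the 2G figure is only a placeholder; measure your own data and index size first):
[mysqld]
# Size the buffer pool to hold the hot data and indexes of all your tables
innodb_buffer_pool_size = 2G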
As with any caching system, the real killer to performance is when your "hot" data is larger than the cache, forcing queries to incur disk I/O. A single disk I/O is worth a few thousand RAM accesses, so you want to serve content completely out of RAM as much as possible.
The improvements in XtraDB are designed to help as the number of Threads_running gets higher, or the buffer pool gets larger (e.g. dozens of GB). It's unlikely that a single WP site will exercise either MySQL or Percona Server so heavily that these improvements will offer more than a slight advantage, unless you're going to host hundreds of WP sites on a given server, like a hosting company does.
You may even find that the bottleneck ceases to be the database, and then you need to focus on front-end optimizations.

Related

MySQL: Speed over reliability config

For my development machine I need no data consistency in case of a crash. Is there a config for a Debian-like system that optimizes MySQL for speed (even if it sacrifices reliability)?
So something like: Cache the last 1 GB in RAM. Don't touch the disk with data until the 1 GB is used.
What kind of queries are going on? One of my mantras: "You cannot configure your way out of a performance problem."
Here's one thing that speeds up InnoDB with respect to transactions:
innodb_flush_log_at_trx_commit = 2
There is a simple way to speed up single-row inserts by a factor of 10: batch them into multi-row INSERT statements.
Some 'composite' indexes can speed up a SELECT by a factor of 100.
Reformulating a WHERE can sometimes speed up a query by a factor of 100.
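Sketches of those three ideas, using a hypothetical table t(a, b) with a hypothetical created_at column:
-- 1) Batch single-row inserts into one multi-row INSERT:
INSERT INTO t (a, b) VALUES (1, 'x'), (2, 'y'), (3, 'z');
-- 2) Add a composite index matching the WHERE and ORDER BY:
ALTER TABLE t ADD INDEX ix_a_b (a, b);
-- 3) Reformulate a WHERE so it can use an index; wrapping the
--    column in a function defeats the index:
--      slow: WHERE DATE(created_at) = '2020-01-01'
--      fast: WHERE created_at >= '2020-01-01'
--            AND created_at <  '2020-01-02'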
You can disable many of the InnoDB durability features, at the increased risk of losing data. But sometimes you want to operate the database in "running with scissors" mode, because the original data is safely stored somewhere else and the copy in your test database is easily recreated.
This blog describes Reducing MySQL durability for testing. You aren't going to see any official MySQL recommendation to do this for any purpose other than testing!
Here's a summary of changes you can make in your /etc/my.cnf:
[mysqld]
# log_bin (comment this out to disable the binary log)
# sync_binlog=0 (irrelevant if you don't use the binary log)
sync_frm=0
innodb_flush_log_at_trx_commit=0
innodb_doublewrite=0
innodb_checksums=0
innodb_support_xa=0
innodb_log_file_size=2048M # or more
He also recommends increasing innodb_buffer_pool_size, but the right size depends on your available RAM.
For what it's worth, I recently tried setting innodb_flush_log_at_trx_commit=0 in the configuration of the default Vagrant box I built for developers on my team, but I had to back out that change because it was causing too much lost time for developers who were getting corrupted databases. Just food for thought. Sometimes it's not a good tradeoff.
This doesn't do exactly what you asked (keep the last 1GB of data in RAM), as it still operates InnoDB with transaction logging and the log flushes to disk once per second. There's no way to turn that off in MySQL.
You could try using MyISAM, which uses buffered writes for data and indexes, and relies on the filesystem buffer. Therefore it could cache some of your data (in practice, I have found that the buffer flushes to disk pretty promptly, so you're unlikely to have a full 1 GB in RAM at any time). MyISAM has other problems, like lack of support for transactions. Developing with MyISAM and then using InnoDB in production can set you up for some awkward surprises.
Here are a couple of other changes you could make in your MySQL sessions for the sake of performance, but I don't recommend them even for development, because they can change your application's behavior:
set session unique_checks=0;       -- skip uniqueness checks on secondary unique indexes
set session foreign_key_checks=0;  -- skip foreign key validation on writes
Some people recommend using the MEMORY storage engine. That has its own problems, like size limits, table-locking, and lack of support for transactions.
I've also experimented with trying to put tables or tmpdir onto a tmpfs, but I found that didn't give nearly the performance boost you might expect. There's overhead in an RDBMS that is not directly related to disk I/O.
You might also like to experiment with MyRocks, Facebook's integration of the RocksDB storage engine into MySQL, released as open source. See Facebook rocks an open source storage engine for MySQL (InfoWorld). They promise it reduces I/O, compresses data, and does other neat things.
But again, it's a good rule of thumb to make your development environment as close as possible to your production environment. Using a different storage engine creates a risk of not discovering some bugs until your code reaches production.
Bottom line: Tuning MySQL isn't a magic bullet. Maybe you should consider designing your application to make more use of microservices, caches, and message queues, and less reliance on direct SQL queries.
Also, I'd recommend always supplying your developers with the fastest SSD-based workstation you can afford. Go for the top of the line on CPU, RAM, and disk speed.
@Bill Karwin's answer has useful MySQL settings to improve performance. I have used them all and was able to achieve roughly a 2x performance improvement.
However, what gave me the biggest performance boost (nearly 15x faster) for my use case, which was reloading a MySQL dump, was mounting the underlying filesystem (ext4) with the nobarrier option:
mount -o remount,nobarrier /
You should only consider this if you have a separate partition (or logical volume) mounted at /var/lib/mysql, so that you can make this tradeoff only for MySQL, not your entire system.
Although this answer may not address exactly the question you asked, consider creating your tables with the MEMORY engine, as documented here: http://dev.mysql.com/doc/refman/5.7/en/memory-storage-engine.html
A typical use case for the MEMORY engine involves these characteristics:
Operations involving transient, non-critical data such as session management or caching. When the MySQL server halts or restarts, the data in MEMORY tables is lost.
In-memory storage for fast access and low latency. Data volume can fit entirely in memory without causing the operating system to swap out virtual memory pages.
A read-only or read-mostly data access pattern (limited updates).
Give that a shot.
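A minimal sketch with hypothetical names (note that MEMORY tables don't support BLOB/TEXT columns, so use VARCHAR):
CREATE TABLE session_cache (
    session_id VARCHAR(64) NOT NULL PRIMARY KEY,
    payload    VARCHAR(1024)
) ENGINE=MEMORY;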
My recommendation, even for a development machine, would be to use the default InnoDB. If you use transactions, InnoDB will be helpful.
This blog post can help you run MySQL off of tmpfs: http://jotschi.de/2014/02/03/high-performance-mysql-testdatabase/. User Jotschi also discusses it in SO answer #10692398.

Which database for a web crawler, and how do I use MySQL in a distributed environment?

Which database engine should I use for a web crawler, InnoDB or MyISAM? I have two PCs, each with a 1 TB hard drive. If one fills up, I'd like writes to go to the other PC automatically, but reads should go to the correct PC; how do I do that?
As for the first part of your question, it rather depends on your precise implementation. If you are going to have a single crawler limited by network bandwidth, then MyISAM can be quicker. If you are using multiple crawlers, then InnoDB will give you advantages, such as transactions, which may help.
AFAIK MySQL doesn't support the hardware configuration you are suggesting. If you need large storage, you may want to look at MySQL Cluster.
MyISAM is the first choice, because you will have write-only operations, and the crawlers, even when run in parallel, will presumably be configured to crawl different domains/URLs. So you do not need to worry about access conflicts.
When writing a lot of data to MySQL (especially text!), avoid transactions, indexes, etc., because they will slow MySQL down drastically.

Maximum capabilities of MySQL

How do I know when a project is just too big for MySQL, and I should use something with a better reputation for scalability?
Is there a max database size for MySQL before degradation of performance occurs? What factors contribute to MySQL not being a viable option compared to a commercial DBMS like Oracle or SQL Server?
Google uses MySQL. Is your project bigger than Google?
Smart-alec comments aside, MySQL is a professional-level database. If your application puts a strain on MySQL, I bet it'll do the same to just about any other database.
If you are looking for a couple of examples:
Facebook moved to Cassandra only after it was storing over 7 Terabytes of inbox data. (Source: Lakshman, Malik: Cassandra - A Decentralized Structured Storage System.) (... Even though they were having quite a few issues at that stage.)
Wikipedia also handles hundreds of Gigabytes of text data in MySQL.
I work for a very large Internet company. MySQL can scale very, very large with very good performance, with a couple of caveats.
One problem you might run into is that an index greater than 4 gigabytes can't go into memory. I once spent a lot of time trying to improve MySQL's full-text performance by fiddling with index parameters, but you can't get around the fundamental problem: if your query hits disk for an index, it gets slow.
You might find some helper applications that can help solve your problem. For the full-text problem, there is Sphinx: http://www.sphinxsearch.com/
Jeremy Zawodny, who now works at Craigslist, has a blog on which he occasionally discusses the performance of large databases: http://blog.zawodny.com/
In summary, your project probably isn't too big for MySQL. It may be too big for some of the ways that you've used MySQL before, and you may need to adapt them.
Mostly it is table size.
I am assuming here that you will use Oracle's InnoDB plugin for MySQL as your engine. If you do not, that probably means you're using a commercial engine such as InfiniDB, InfoBright, or Tokutek, in which case your questions should be sent to them.
InnoDB gets a bit nasty with very large tables. You are advised to partition your tables if at all possible with very large instances. Essentially, if your (frequently used) indexes don't all fit into RAM, inserts will be very slow, as they need to touch a lot of pages not in RAM. This cannot be worked around.
You can use the MySQL 5.1 partitioning feature if it does what you want, or partition your tables at the application level if it does not. If you can get your tables' indexes to fit in RAM, and only load one table at a time, then you're onto a winner.
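A sketch of the 5.1 range-partitioning syntax on a hypothetical log table (note that every unique key, including the primary key, must contain the partitioning column):
CREATE TABLE access_log (
    id         BIGINT NOT NULL,
    created_at DATE NOT NULL,
    url        VARCHAR(255),
    PRIMARY KEY (id, created_at)
) ENGINE=InnoDB
PARTITION BY RANGE (YEAR(created_at)) (
    PARTITION p2009 VALUES LESS THAN (2010),
    PARTITION p2010 VALUES LESS THAN (2011),
    PARTITION pmax  VALUES LESS THAN MAXVALUE
);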
You can use the plugin's compression to make your RAM go a bit further (as the pages are compressed in RAM as well as on disk), but it cannot beat the fundamental limitation.
If your table's indexes don't all fit in RAM (or at least MOSTLY - if you have a few indexes that are NULL in 99.99% of cases, you might get away without those), insert speed will suck.
Database size is not a major issue, provided your tables individually fit in RAM while you're doing bulk loading (and, of course, you only load one at a time).
These limitations really happen with most row-based databases. If you need more, consider a column database.
InfoBright and InfiniDB both use a MySQL-based core and are column-based engines that can handle very large tables.
Tokutek is quite interesting too - you may want to contact them for an evaluation.
When you evaluate an engine's suitability, be sure to load it with very large data on production-grade hardware. There's no point in testing it with a (e.g.) 10 GB database; that won't prove anything.
MySQL is a commercial DBMS too; you simply have the option of buying the kind of support/monitoring that Oracle or Microsoft offer, or of relying on community support and community-provided monitoring software.
Size is not the only thing you should look at. Also critical are:
Scenarios for backup and restore.
Maintenance. Example: SQL Server Enterprise can rebuild an index WHILE THE OLD ONE IS AVAILABLE - transparently. This means no downtime for an index rebuild.
Availability (basically you do not want to have to restore a 5,000 GB database if a server dies) - mirroring is preferred; replication "sucks" (technically).
Whatever you go for, be careful with Oracle RAC (their cluster) - it is known to be "problematic" (to put it mildly). SQL Server is known to be a lot cheaper and to scale a lot worse (no "RAC" option), but it basically works without making admins want to commit suicide every hour (the "RAC" option seems to do that). Scalability "a lot worse" is still good enough for TerraServer (http://msdn.microsoft.com/en-us/library/aa226316(SQL.70).aspx).
There were some questions here recently from people having problems rebuilding indices on a 10 GB database or so.
So much for my 2 cents. I am sure some MySQL specialists will jump in on issues there.

Best storage engine for constantly changing data

I currently have an application that is using 130 MySQL tables, all with the MyISAM storage engine. Every table sees multiple queries every second, including SELECT/INSERT/UPDATE/DELETE queries, so the data and the indexes are constantly changing.
The problem I am facing is that the hard drive is unable to cope, with waiting times of 6+ seconds for I/O access given so many reads/writes being done by MySQL.
I was thinking of changing to just one table and making it memory-based. I've never used a MEMORY table for something with so many queries, though, so I am wondering if anyone can give me feedback on whether it would be the right thing to do?
One possibility is that there may be other issues causing performance problems - 6 seconds seems excessive for CRUD operations, even on a complex database. Bear in mind that (back in the day) ArsDigita could handle 30 hits per second on a two-way Sun Ultra 2 (IIRC) with fairly modest disk configuration. A modern low-mid range server with a sensible disk layout and appropriate tuning should be able to cope with quite a substantial workload.
Are you missing an index? Check the query plans of the slow queries for table scans where they shouldn't be (see the EXPLAIN sketch after this list).
What is the disk layout on the server? - do you need to upgrade your hardware or fix some disk configuration issues (e.g. not enough disks, logs on the same volume as data).
As the other poster suggests, you might want to use InnoDB on the heavily written tables.
Check the setup for memory usage on the database server. You may want to configure more cache.
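For the missing-index check, a quick sketch with a hypothetical query (a value of ALL in the type column means a full table scan):
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;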
Edit: Database logs should live on quiet disks of their own. They use a sequential access pattern with many small sequential writes. Where they share disks with a random-access workload like data files, the random disk access creates a big system performance bottleneck on the logs. Note that this is write traffic that needs to be completed (i.e. written to physical disk), so caching does not help with this.
I've now changed to a MEMORY table and everything is much better. In fact I now have extra spare resources on the server allowing for further expansion of operations.
Is there a specific reason you aren't using InnoDB? It may yield better performance due to caching and a different concurrency model. It will likely require more tuning, but may yield much better results.
http://www.mysqlperformanceblog.com/2009/01/12/should-you-move-from-myisam-to-innodb/
I think that your database structure is very wrong and needs to be optimized; this has nothing to do with the storage engine.

MySQL database optimization best practices

What are the best practices for optimizing a MySQL installation for best performance when handling somewhat larger tables (> 50k records with a total of around 100MB per table)? We are currently looking into rewriting DelphiFeeds.com (a news site for the Delphi programming community) and noticed that simple Update statements can take up to 50ms. This seems like a lot. Are there any recommended configuration settings that we should enable/set that are typically disabled on a standard MySQL installation (e.g. to take advantage of more RAM to cache queries and data and so on)?
Also, what performance implications does the choice of storage engines have? We are planning to go with InnoDB, but if MyISAM is recommended for performance reasons, we might use MyISAM.
The "best practice" is:
Measure performance, isolating the relevant subsystem as well as you can.
Identify the root cause of the bottleneck. Are you I/O bound? CPU bound? Memory bound? Waiting on locks?
Make changes to alleviate the root cause you discovered.
Measure again, to demonstrate that you fixed the bottleneck and by how much.
Go to step 2 and repeat as necessary until the system works fast enough.
Subscribe to the RSS feed at http://www.mysqlperformanceblog.com and read its historical articles too. It's a hugely useful resource for performance-related wisdom. For example, you asked about InnoDB vs. MyISAM. Their conclusion: InnoDB has ~30% higher performance than MyISAM on average, though there are also a few usage scenarios where MyISAM outperforms InnoDB.
InnoDB vs. MyISAM vs. Falcon benchmarks - part 1
The authors of that blog are also co-authors of "High Performance MySQL," the book mentioned by @Andrew Barnett.
Re comment from @ʞɔıu: How to tell whether you're I/O-bound versus CPU-bound versus memory-bound is platform-dependent. The operating system may offer tools such as ps, iostat, vmstat, or top. Or you may have to get a third-party tool if your OS doesn't provide one.
Basically, whichever resource is pegged at 100% utilization/saturation is likely to be your bottleneck. If your CPU load is low but your I/O load is at its maximum for your hardware, then you are I/O bound.
That's just one data point, however. The remedy may also depend on other factors. For instance, a complex SQL query may be doing a filesort, and this keeps I/O busy. Should you throw more/faster hardware at it, or should you redesign the query to avoid the filesort?
There are too many factors to summarize in a StackOverflow post, and the fact that many books exist on the subject supports this. Keeping databases operating efficiently and making best use of the resources is a full-time job requiring specialized skills and constant study.
Jeff Atwood just wrote a nice blog article about finding bottlenecks in a system:
The Computer Performance Shell Game
Go buy "High Performance MySQL" from O'Reilly. It's almost 700 pages on the topic, so I doubt you'll find a succinct answer on SO.
It's hard to broadbrush things, but a moderately high-level view is possible.
You need to evaluate read:write ratios. For tables with ratios lower than about 5:1, you will probably benefit from InnoDB because then inserts won't block selects. But if you aren't using transactions, you should change innodb_flush_log_at_trx_commit to 2 to get performance back over MyISAM.
Look at the memory parameters. MySQL's defaults are very conservative and some of the memory limits can be raised by a factor of 10 or more on even ordinary hardware. This will benefit your SELECTs rather than INSERTs.
MySQL can log things like queries that aren't using indexes, as well as queries that just take too long (the threshold is user-definable).
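A my.cnf sketch of those logging options for MySQL 5.1+ (the path and threshold are placeholders):
[mysqld]
slow_query_log = 1
slow_query_log_file = /var/log/mysql/slow.log
long_query_time = 2                 # seconds; the user-definable threshold
log_queries_not_using_indexes = 1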
The query cache can be useful, but you need to instrument it (i.e., see how much it is actually used). Cacti can do that, as can Munin.
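You can also inspect the query cache by hand (it exists in the MySQL versions discussed here; it was removed in MySQL 8.0):
SHOW VARIABLES LIKE 'query_cache%';
SHOW STATUS LIKE 'Qcache%';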
Application design is also important:
Lightly caching frequently fetched but smallish datasets will make a big difference (i.e. a cache lifetime of a few seconds).
Don't re-fetch data that you already have to hand.
Multi-step storage can help with a high volume of inserts into tables that are also busily read. The basic idea is that you have a table for ad-hoc inserts (INSERT DELAYED can also be useful), plus a batch process that moves rows within MySQL from there to where all the reads happen. There are variations of this; see the sketch after this list.
Don't forget that perspective and context are important, too: what you might think is a long time for an UPDATE to happen might actually be quite trivial if that "long" update only happens once a day.
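A sketch of that staging pattern with hypothetical table names (INSERT DELAYED applies to MyISAM and MEMORY tables only):
-- Ad-hoc inserts land in a staging table:
INSERT DELAYED INTO hits_staging (url, hit_at) VALUES ('/home', NOW());
-- A periodic batch job drains it into the read-heavy table:
INSERT INTO hits SELECT * FROM hits_staging;
TRUNCATE TABLE hits_staging;
-- Note: rows arriving between the last two statements would be lost;
-- production code would swap tables with RENAME TABLE instead.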
There are tons of best practices that have been discussed previously, so there is no reason to repeat them. For concrete advice on what to do, I would try running MySQL Tuner. It's a Perl script that you download and run on your database server; it will give you a bunch of statistics on how your database is performing (e.g. cache hits), along with concrete recommendations for which issues or config parameters need to be adjusted to improve performance.
While these statistics are all available in MySQL itself, I find that this tool presents them in a much easier-to-understand fashion. While it is important to note that YMMV with respect to the recommendations, I have found them to be generally pretty accurate. Just make sure that you have done a good job of exercising the database beforehand with realistic traffic.