How to measure mySQL bottlenecks? - mysql

What mySQL server variables should we be looking at and what thresholds are significant for the following problem scenarios:
CPU bound
Disk read bound
Disk write bound
And for each scenario, what solutions are recommended to improve them, short of getting better hardware or scaling the database to multiple servers?

This is a complicated area. The "thresholds" that will affect each of your three categories overlap quite a bit.
If you are having problems with your operations being CPU bound, then you definitely need to look at:
(a) The structure of your database - is it fully normalized. Bad DB structure leads to complex queries which hit the processor.
(b) Your indexes - is everything needed for your queries sufficiently indexed. Lack of indexes can hit both the processor and the memory VERY hard. To check indexes, do "EXPLAIN ...your query". Anything row in the resulting explanation that says it isn't using an index, you need to look at closely and if possible, add an index.
(c) Use prepared statements wherever possible. These can save the CPU from doing quite a bit of crunching.
(d) Use a better compiler with optimizations appropriate for your CPU. This is one for the dedicated types, but it can glean you the odd extra percent here and there.
If you are having problems with your operations being read bound
(a) Ensure that you are caching where possible. Check the configuration variables for query_cache_limit and query_cache_size. This isn't a magic fix, but raising these can help.
(b) As with above, check your indexes. Good indexes reduce the amount of data that needs to be read.
If you having problems with your operations being write bound
(a) See if you need all the indexes you currently have. Indexes are good, but the trade-off for them improving query time, is that maintaining those indexes can impact the time spent writing the data and keeping them up to date. Normally you want indexes if in doubt, but sometimes you're more interested in rapidly writing to a table than you are in reading from it.
(b) Make possible use of INSERT DELAYED to "queue" writes to the database. Note, this is not a magic fix and often inappropriate, but in the right circumstances can be of help.
(c) Check for tables that are heavily read from and written to at the same time, e.g. an access list that update's visitor's session data constantly and is read from just as much. It's easy to optimize a table for reading from, and writing to, but not really possible to design a table to be good at both. If you have such a case and it's a bottleneck, consider whether it's possible to split its functions or move any complex operations using that table to a temporary table that you can update as a block periodically.
Note, the only stuff in the above that has a major effect, is good query design / indexing. Beyond that, you want to start considering at better hardware. In particular, you can get a lot of benefit out of a RAID-0 array which doesn't do a lot for writing bound problems, but can do wonders for read-bound problems. And it can be a pretty cheap solution for a big boost.
You also missed two items off your list.
Memory bound. If you are hitting memory problems then you must check everything that can be usefully indexed is indexed. You can also look at greater connection pooling if for some reason you're using a lot of discrete connections to your DB.
Network bound. If you are hitting network bound problems... well you probably aren't, but if you are, you need another network card or a better network.
Note, that a convenient way to analyze your DB performance is to turn on the log_slow_queries option and set long_query_time to either 0 to get everything, or 0.3 or similar to catch anything that might be holding your database up. You can also turn on log-queries-not-using-indexes to see if anything interesting shows up. Note, this sort of logging can kill a busy live server. Try it on a development box to start.
Hope that's of some help. I'd be interested in anyone's comments on the above.

Related

Determine if mysql index is being used on production site, so it can be safely removed

We have some large indexes that we suspect are not being used in our Rails site, and would like to drop them to save space and computation. However, doing so could be catastrophic if it turns out they were being used. How can we confirm they are not being used?
One option is to log all queries for a time and run 'explain plan' on any of them that use the table in question. But I've heard 'explain plan' can occasionally be inaccurate. We would also have to collect queries for a few hours to be sure, which is quite a lot of log to store and process.
If there was a way to temporarily disable an index, we'd be willing to do that, as long as we could quickly enable it if problems arose. But I don't see a way to do that universally; you can only specify an 'ignore index' hint to individual sql statements.
Short Answer:
With MySQL 5.6 it is possible to do this by using the PERFORMANCE_SCHEMA and ps_helper.
ps_helper is a series of views and routines that present the data in PERFORMANCE_SCHEMA in more useful ways. The VIEW you are after is this one: http://www.markleith.co.uk/ps_helper/#schema_unused_indexes
More detailed:
The idea of disabling an index is called 'invisible indexes' in Oracle. MySQL doesn't support them, but I'd love to see this feature as well - I filed http://bugs.mysql.com/bug.php?id=70299 a couple of months ago on it.
Removing unused indexes is very important, as it can help optimizer performance. I have a war-story of using the ps_helper + unused_indexes view here: http://www.tocker.ca/2013/09/05/migrating-from-postgresql-to-mysql.html
There's only one procedure for this: Testing, testing, testing and benchmarking.
The primary function of indexes, apart form ensuring uniqueness, is to expedite data access. If all operations were O(1) there would be no need for indexes in the first place.
You need to have another instance of your application where you can experiment with adding, removing, and adjusting your indexes. It's impossible to replicate both real-world hardware and real-world loads, but you can come pretty close if you pay careful attention to how your hardware is configured, how your application is exercised, and can produce that's reasonably similar.
If you have a application logs that are sufficiently detailed, sometimes you can replay these operations. Read operations are easier to replay than writes, but both can be simulated if you've got enough time to invest in this.
For any application running at scale, you want to know where performance falls off a cliff. So long as your production load is well below this level you'll be okay. If you don't know where the cliff is, you might hit it without any warning.
Remember that indexes not only take up space, which is a minor issue, but the size of the index has an effect on how expensive it is to update, making writes more costly. It's ideal to have only the ones you need, but it's almost impossible to identify which are actually used. There are many that might be used in theory, but never are, and some that shouldn't be used, but which are because the query optimizer is a little dumb sometimes.

Is mongoDB or Cassandra better than MySQL for large datasets?

In our (currently MySQL) database there are over 120 million records, and we make frequent use of complex JOIN queries and application-level logic in PHP that touch the database. We're a marketing company that does data mining as our primary focus, so we have many large reports that need to be run on a daily, weekly, or monthly basis.
Concurrently, customer service operates on a replicated slave of the same database.
We would love to be able to make these reports happen in real time on the web instead of having to manually generate spreadsheets for them. However, many of our reports take a significant amount of time to pull data for (in some cases, over an hour).
We do not operate in the cloud, choosing instead to operate using two physical servers in our server room.
Given all this, what is our best option for a database?
I think you're going the wrong way about the problem.
Thinking if you drop in NoSQL that you'll get better performance is not really true. At the lowest level, you're writing and retrieving a fair chunk of data. That implies your bottleneck is (most likely) HDD I/O (which is the common bottleneck).
Sticking to the hardware you have momentarily and using a monolithic data storage isn't scalable and as you noticed - has implications when wanting to do something in real-time.
What are your options? You need to scale your server and software setup (which is what you'd have to do with any NoSQL anyway, stick in faster hard drives at some point).
You also might want to look into alternative storage engines (other than MyISAM and InnoDB - for example, one of better engines that seemingly turn random I/O to sequential I/O is TokuDB).
Implementing faster HDD subsystem would also aid to your needs (FusionIO if you have the resources to get it).
Without more information on your end (what the server setup is, what MySQL version you're using and what storage engines + data sizes you're operating with), it's all speculation.
Cassandra still needs Hadoop for MapReduce, and MongoDB has limited concurrency with regard to MapReduce...
... so ...
... 120 mio records is not that much, and MySQL should easily be able to handle that. My guess is an IO bottleneck, or you're doing lots of random reads instead of sequential reads. I'd rather hire a MySQL techie for a month or so to tune your schema and queries, instead of investing into a new solution.
If you provide more information about your cluster, we might be able to help you better. "NoSQL" by itself is not the solution to your problem.
As much as I'm not a fan of MySQL once your data gets large, I have to say that you're nowhere near needing to move to a NoSQL solution. 120M rows is not a big deal: the database I'm currently working with has ~600M in one table alone and we query it efficiently. Managing that much data from an ops perspective is the problem; querying it isn't.
It's all about proper indexes and the correct use of them when joining, and secondarily memory settings. Find your slow queries (mysql slow query log FTW!), and learn to use the explain keyword to understand whey they are slow. Then tweak your indexes so your queries are efficient. Further, make sure you understand MySQL's memory settings. There are great pages in the docs explaining how they work, and they aren't that hard to understand.
If you've done both of those things and you're still having problems, make sure disk I/O isn't an issue. Then you should look in to another solution for querying your data if it is.
NoSQL solutions like Cassandra have a lot of benefits. Cassandra is fantastic at writing data. Scaling your writes is very easy--just add more nodes! But the tradeoff is that it's harder to get the data back out. From a cost perspective, if you have expertise in MySQl, it's probably better to leverage that and scale your current solution until it hits a limit before completely switching your underlying architecture.

Are the consistency/data loss/query optimization issues I read about "that bad"?

As I've been looking into the differences between Postgres and MySQL, it has struck me that, if what I read is to be believed, MySQL should be (disclaimer: by reading the rest of this sentence, you agree to read the next paragraph as well) the laughingstock of the RMDB world: it doesn't enforce ACID by default, the net is rife with stories of MySQL-related data loss and by all accounts and the query optimizer is a joke.
But none of this seems to matter. It's not hard to tell that MySQL has about a million times* as much hype as Postgres (it's LAMP, not LAPP), big installations of MySQL are not unheard of (LJ? Digg?) and I haven't noticed a drop in MySQL's popularity.
This makes me wonder: are these "problems" with MySQL really that bad?
So, if you have used MySQL for a reasonably large project**, what was your experience like? Did you use Postgres as well? How was it worse? How was it better?
*: [citation needed]
**: I'm well aware that, for "small things" (blogs, what have you), MySQL (along with practically every other RDB) is just fine.
Since it's tagged [subjective], I'll be subjective. For me it's about the little things. PostgreSQL is more developer friendly and makes it easy to do the right thing regarding data integrity by default.
If you give MySQL an incorrect type, it will implicitly convert it even if the conversion is incorrect. PostgreSQL will complain.
EXPLAIN in PostgeSQL is way more useful than in MySQL. It gives you the exact structured query plan. What kind of algorithm will it use, what cost does does each step have, etc. This means that if the query optimizer in MySQL doesn't do what you think it does, you will have hard time to debug it.
If you ever wrote anything more complex in the MySQL stored procedure language, you will know how painful it is. PL/pgSQL is actually a nice language + you can use many other languages.
MySQL doesn't have sequences, so if you need them you have to roll your own. Most people will do it wrong and have race conditions in their code.
PostgreSQL exposes most of it's internal lock types to the developer. If you need to lock your table in a special way, you can do that.
Everything is programmable in PostgreSQL. For example, if you need your own data type for some specific data, you can add it. You can add casts and operators for the data types. Probably not worth the effort for small projects, but it's better than storing things as strings.
PostgreSQL adds every action including DDL changes to a transaction, unlike MySQL. If you have a conversion script that creates/drops tables, BEGIN/END won't help you in MySQL to keep it in consistent state.
That doesn't mean it's impossible to write good database applications with MySQL, it just requires more effort.
MySQL can be used for reasonably large applications, provided you really know what you do and don't trust the defaults.
MySQL defaults are optimized to be easy-to-use and to get started quickly and to provide best performance (usually). Other databases choose defaults that are at the very least ACID and are scalable (i.e. choose defaults that are not necessarily the best/fastest for small data sets)
Another item is that MySQL only learned to be a "real database" relatively recent, while almost all competing products started life with full ACID in mind.
MySQL had problems with almost all aspects of ACID at one time or another. Most of them are gone or can be configured away, but you will have to check each one. The problem with troubles in atomicity for example is that you will not notice them until you place your system under heavy load (which often coincides with it being a production system, unfortunately).
So my summary would be: MySQL is capable of working in this environments, but it takes work. And the path it took to get to that point cost it quite a few points in the confidence area.
Provided you know what its capabilities are, then it may fit your use case.
If used correctly, then it is ACID compliant. If used incorrectly, it is not. The trouble is, that people seem to assume that it's a good thing to have ACID compliance.
In reality ACID is often the enemy of performance (Particularly the D for durability). By relaxing durability very slightly, we can typically get a very large performance boost.
Likewise, even using the MyISAM engine (which doesn't have much by way of durability, and not a lot of the others either) is still appropriate to some problem domains.
We are using MySQL in some applications - and it is doing a pretty good job.
In the newer projects we are using the InnoDB engine - and albeit it may be slower than the default engine it is working well.
Right now we are using an ORM mapper - and so most of the complexity is hidden behind the ORM mapper (and working nice).
I think the infrastructure (Tools and information) is one of MySQL's big plusses: we are using really nice tools: Toad for MySQL and MySQL Administrator.
Altough I have to admit that I had a shocking experience last week when helping a friend with a SQL statement and the correleated subquery nearly stopped his MySQL server - but with the trick of enclosing it in another query - it worked really well.
This is nothing which REALLY shocks me - because I've used other DB systems which cost big bucks (I'm looking at you - DB2) - and they had other things to work around. (maybe not as drastic - but still you had to optimize for them).
I haven't used both for a single large project, but having used both I have some idea of how they compare.
In general almost all MySQL's problems can be worked around with good discipline. The issue is more that developer has to know all the gotchas and work around them. After working with PostgreSQL or Oracle this feels a bit like death by a thousand papercuts. You get that used to stuff just working.
This is a pretty significant issue in the types of stuff that I have worked on. Complex schemas with complex queries and lots of data. tight schedules with little time for performance engineering meaning that getting consistently reasonable performance without having to manually optimize queries is important. A good cost based optimizer is almost a requirement. Combine that with quite a lot of outsourcing with development teams that don't have the experience to catch all the gotchas in time and the little issues escalate to large QA problems. Hitting any of MySQL silent data corruption gotchas in production is something that really scares me. I'll take any declarative constraints at the database level that I can get to have atleast some safety net, MySQL unfortunately falls short on that.
PostgreSQL has the added benefit that it can run significantly more algorithms using more advanced data-structures in the database. Most of our large projects have a few cases where MySQL will hit its limits. Moving the algorithms outside the database requires considerably more effort with pretty tricky code involving correct locking and synchronization. In particular I have at one time or another hit the need for partial indexes, indexes on expressions, custom aggregate functions, set returning stored procedures, array and hash datatypes, inverted indexes on array values, update/delete-returning, deferrable foreign key constraints.
On the other hand MySQL has at least for now a better story for scale out. If I had to support a huge number users on a reasonably simple application, and had the team to build a heavily partitioned and replicated database with eventual consistency, I'd pick MySQL over PostgreSQL for the low level data storage building block. On the other the competitors in that space are the key-value databases.
are these "problems" with MySQL really that bad?
Actually, the pain MySQL will inflict on you can range from moderate to insane, and much of it depends on MyISAM.
I find a good rule of thumb is this :
are you backing up some MyISAM tables ?
MyISAM is great for data you don't really care about, like traffic logs and the like, or for data that you can easily restore in case of a problem since it's read-only and hence never changed since the time you loaded that 10GB dump. In those cases the compact row format of MyISAM brings great space savings (that however do not translate into faster seq scan speed, for some reason).
If the data you put in MyISAM tables is worth backing up, you are going to enter in a world of hurt when you realize some day that it is all inconsistent because of the lack of FK and constraint checks, and incidentally all your backups will contain inconsistent data too.
If you make lots of concurrent updates to MyISAM tables, then you are gonna go way past the world of hurt stage : when the load reaches a certain threshold, you are doomed. Of course the readers block writers which block readers which block queued writers, etc, so the performance is bad, load avg goes to 200, and your box is nuked, but also I could consistently crasy MyISAM tables in a benchmark I wrote 2 years ago just by hitting them with too much load. Random data ensued, sometimes crashing the mysql on selects or spewing random errors.
So, if you avoid MyISAM like the plague it is, the problems with MySQL aren't really that bad. InnoDB is robust. However, generally I find it inferior to Postgres, which is faster and has so many less gotchas, and Gets The Job Done easier and faster.
No, the issues you mention are NOT a big deal. See Google and Facebook as two examples of companies that are using MySQL to accomplish Herculean tasks you'll only ever dream of encountering.
I use the following rules when running a MySQL to prevent headaches down the line:
Take daily, weekly, monthly snapshots of database. More often than not the problems you'll run in to have nothing to do with MySQL, instead it's a boneheaded developer running:
DELETE FROM mytable; # Where is the WHERE?
Use InnoDB by default, the only reason to use MyISAM is for full text search.
Get your database schema under source control.

MySQL database optimization best practices

What are the best practices for optimizing a MySQL installation for best performance when handling somewhat larger tables (> 50k records with a total of around 100MB per table)? We are currently looking into rewriting DelphiFeeds.com (a news site for the Delphi programming community) and noticed that simple Update statements can take up to 50ms. This seems like a lot. Are there any recommended configuration settings that we should enable/set that are typically disabled on a standard MySQL installation (e.g. to take advantage of more RAM to cache queries and data and so on)?
Also, what performance implications does the choice of storage engines have? We are planning to go with InnoDB, but if MyISAM is recommended for performance reasons, we might use MyISAM.
The "best practice" is:
Measure performance, isolating the relevant subsystem as well as you can.
Identify the root cause of the bottleneck. Are you I/O bound? CPU bound? Memory bound? Waiting on locks?
Make changes to alleviate the root cause you discovered.
Measure again, to demonstrate that you fixed the bottleneck and by how much.
Go to step 2 and repeat as necessary until the system works fast enough.
Subscribe to the RSS feed at http://www.mysqlperformanceblog.com and read its historical articles too. That's a hugely useful resource for performance-related wisdom. For example, you asked about InnoDB vs. MyISAM. Their conclusion: InnoDB has ~30% higher performance than MyISAM on average. Though there are also a few usage scenarios where MyISAM out-performs InnoDB.
InnoDB vs. MyISAM vs. Falcon benchmarks - part 1
The authors of that blog are also co-authors of "High Performance MySQL," the book mentioned by #Andrew Barnett.
Re comment from #ʞɔıu: How to tell whether you're I/O bound versus CPU bound versus memory bound is platform-dependent. The operating system may offer tools such as ps, iostat, vmstat, or top. Or you may have to get a third-party tool if your OS doesn't provide one.
Basically, whichever resource is pegged at 100% utilization/saturation is likely to be your bottleneck. If your CPU load is low but your I/O load is at its maximum for your hardware, then you are I/O bound.
That's just one data point, however. The remedy may also depend on other factors. For instance, a complex SQL query may be doing a filesort, and this keeps I/O busy. Should you throw more/faster hardware at it, or should you redesign the query to avoid the filesort?
There are too many factors to summarize in a StackOverflow post, and the fact that many books exist on the subject supports this. Keeping databases operating efficiently and making best use of the resources is a full-time job requiring specialized skills and constant study.
Jeff Atwood just wrote a nice blog article about finding bottlenecks in a system:
The Computer Performance Shell Game
Go buy "High Performance MySQL" from O'Reilly. It's almost 700 pages on the topic, so I doubt you'll find a succinct answer on SO.
It's hard to broadbrush things, but a moderately high-level view is possible.
You need to evaluate read:write ratios. For tables with ratios lower than about 5:1, you will probably benefit from InnoDB because then inserts won't block selects. But if you aren't using transactions, you should change innodb_flush_log_at_trx_commit to 1 to get performance back over MyISAM.
Look at the memory parameters. MySQL's defaults are very conservative and some of the memory limits can be raised by a factor of 10 or more on even ordinary hardware. This will benefit your SELECTs rather than INSERTs.
MySQL can log things like queries that aren't using indices, as well as queries that just take too long (user-defineable).
The query cache can be useful, but you need to instrument it (i.e. see how much it is used). Cacti can do that; as can Munin.
Application design is also important:
Lightly caching frequently fetched but smallish datasets will have a big difference (i.e. cache lifetime of a few seconds).
Don't re-fetch data that you already have to hand.
Multi-step storage can help with a high volume of inserts into tables that are also busily read. The basic idea is that you can have a table for ad-hoc inserts (INSERT DELAYED can also be useful), but a batch process to move the updates within MySQL from there to where all the reads are happening. There are variations of this.
Don't forget that perspective and context are important, too: what you might think is a long time for an UPDATE to happen might actually be quite trivial if that "long" update only happens once a day.
There are tons of best practices which have been previously discussed so there is no reason to repeat them. For actually concrete advice on what to do, I would try running MySQL Tuner. Its a perl script that you can download and then run on your database server, it will give you a bunch of statistics on how your database is performing (e.g. cache hits) along with some concrete recommendations for what issues or config parameters need to be adjusted to improve performance.
While these statistics are all available in MySQL itself, I find that this tool provides them in a much easier to understand fashion. While it is important to note that YMMV with respect to the recommendations, I have found them to generally be pretty accurate. Just make sure that you have done a good job exercising the database beforehand with realistic traffic.

Concurrency handling using the filesystem VS an RDMBS (MySQL)

I'm building an English web dictionary where users can type in words and get definitions. I thought about this for a while and since the data is 100% static and I was only to retrieve one word at a time I was better off using the filesystem (ext3) as the database system instead of opting to use MySQL to store definitions. I figured there would be less overhead considering that you have to connect to MySQL and that in itself is a very slow operation.
My fear is that if my system were to get bombarded by let's say 500 word retrievals/sec, would I still be better off using the filesystem as the database? or will the increased filesystem reads hinder performance as opposed to something that MySQL might be doing under the hood?
Currently the hierarchy is segmented by first letter, second letter and third letter of the word. So if you were to search for the definition of "water", the script (PHP) will try to read from "../dict/w/a/t/water.word" (after cleaning up the word of problematic characters and lowercasing it)
Am I heading in the right direction with this or is there a faster solution (not counting storing definitions in memory using something like memcached)? Will the amount of files stored in any directory factor in performance? What's the rough benchmark for the number of files that I should store in a directory?
What are your grounds for your belief that this decision will matter to the overall performance of the solution? WHat does it do other than provide definitions?
Do you have MySQL as part of the solution anyway, or would you need to add it should you select it as the solution here?
Where is the definitive source of definitions? The (maybe replicated) filesystem, or some off line DB?
It seems like something that should be in a DB architecturally - filesystems are a strange place to map a large number of names to values (as is evidenced by your file system structure breaking things down by initial letters)
If it's in the DB, answering questions like "how many definitions are there?" is a lot easier, but if you don't care about such things for your application, this may not matter.
So to some extent this feels like looking to hyper optimise the performance of something whose performance won't actually make much difference to the overall solution.
I'm a fan of "make it correct, then make it fast", and "correct" would be more straightforward to achieve with a DB.
And of course, the ultimate answer would to be try both and see which one works best in your situation.
Paul
The type of lookups that a dictionary requires is exactly what a database is good at. I think the filesystem method you describe will be unworkable. Don't make it hard! Use a Database.
You can keep a connection pool around to speed up connecting to the DB.
Also, if this application needs to scale to multiple servers, the file system may be tricky to share between servers.
So, I third the suggestion. Use a DB.
But unless it's a fabulously large dictionary, caching would mean you're nearly alwys getting stuff from local memory, so I don't think this is going to be the biggest issue for your application :)
A DB sounds perfect for your needs.
I also don't see why memcached is relevant (how big is your data? Can't be more than a few GB... right?)
The data is approximately a couple of GBs. And my goal is speed, speed, speed (definitions will be loaded using XHR). The data as I said is static and is never going to change, and in no where would I using anything other than a single read operation for each request. So I'm having a pretty hard time getting convinced of using MySQL and all its bloat.
Which would be first to fail under high load using this strategy, the filesystem or MySQL? As for scaling replication is the answer since the data will never change and is only a couple of GBs.
Make it work first. Premature optimisation is bad.
Using a database enables easier refactoring of your schema, and you don't have to write an implementation of an index-based lookup, which in actual fact is nontrivial.
Saying that connecting to a database "is a very slow operation" overstates the problem. Actually connecting should not take very long, plus you can reuse connections anyway.
If you are worried about read-scaling, a 1G database is very small, so you can push readonly replicas of it to each web server and they can each read from their local copy. Provided the writes stay at a level which doesn't impact read performance, that gives you almost perfect read-scalability.
Moreover, 1G of data will fit into ram easily, so you can make it fast by loading the entire database into memory at startup time (before that node advertises itself to the load balancer).
500 lookups per second is trivially small. I would start worrying about 5000 per second per server, maybe. If you can't achieve 5000 key lookups per second on modern hardware (from a database which fits in RAM?!!), there is something seriously wrong with your implementation.
Agreeing that this is premature optimization, and that MySQL surely will be performant enough for this use case. I must add you can also use a file based database, like the very fast Tokyo Cabinet as a compromise. Sadly it doesn't have a PHP binding so you could use its grandfather, DBM.
That said, do not use a filesystem, there's no good reason to, as far as I can see.
Use a virtual Drive in your ram (google it for a how to for your distro) or if your data is provided by PHP use APC, memcache might work well with mysql. Personally I don't think the optimization you are doing here is really where you should be spending your time. 500 requests a second is massive, I think using mysql would give you better forward features for later. I think you need to concentrate on features and not speed if you want to differentiate yourself from your competitors. Also there are a few good talks about UI for the web, the server speed is only a small factor in the whole picture.
Good luck
You might also think about a no-sql database (like riak, mongo, or even redis) for something like this. They are all super-fast and help out with your replication. Mysql might be over-kill and hard-to-scale in an instance like this, but the other ones have some robust tools