Does MySQL scale on a single multi-processor machine?

My application's typical DB usage is to read/update on one large table. I wonder if MySQL scales read operations on a single multi-processor machine? How about write operations - are they able to utilize multi-processors?
By the way - unfortunately I am not able to optimize the table schema.
Thank you.
Setup details:
x64, quad core
Single hard disk (no RAID)
Plenty of memory (4GB+)
Linux 2.6
MySQL 5.5

If you're using conventional hard disks, you'll often find you run out of I/O bandwidth before you run out of CPU cores. The only way to saturate all four cores of a machine like this is a very high-performance striped SSD RAID array.
If you're not able to optimize the schema you have very limited options. This is like asking to tune a car without lifting the hood. Maybe you can change the tires or use better gasoline, but fundamental performance gains come from several factors, including, most notably, additional indexes and strategically de-normalizing data.
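For illustration, a hedged sketch of those two levers in SQL; the table and column names here are invented, not taken from the question:

-- An index turns full scans of the large table into lookups:
CREATE INDEX idx_status_updated ON big_table (status, updated_at);
-- De-normalizing copies a frequently joined value into the row
-- so hot reads can skip the join:
ALTER TABLE big_table ADD COLUMN owner_name VARCHAR(64);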
In database land, 4GB of memory is almost nothing; 8GB is the absolute minimum for a system under any real load, and a single disk is a very bad idea. At the very least you should have some form of mirroring for data-integrity reasons.

Related

Tuning MySQL Database

I have a MySQL database running on a dedicated Ubuntu server with 2GB RAM and a 500GB hard drive. I would appreciate it if anyone could help with fine-tuning the database to increase performance. The enhancements need to improve the database's CRUD tasks, including the performance of procedure calls and scheduled events.
I have searched the web and found various mechanisms, tools, and so on for doing this on various websites. But I need to know the proper way of improving the performance (e.g., the execution time of an SQL query) of a MySQL database itself, without using any third-party tools or software. The database configuration I have is listed below.
MySQL version: 5.5
Used storage engine: MyISAM
Operating system: Ubuntu 12
Hard disk capacity: 500GB
RAM: 2GB
Other: The database consists of Tables, Indexes, Stored Procedures, Scheduled Events and Views
You have said nothing about the specifics of your data, its distribution, the type of workload you use, the ratio of reads to writes, the variety of your queries, the complexity of your queries, and so on. This is a vital part of the tuning process for one simple reason:
Tuning is specific to your data and your workload.
The guys who make database platforms such as MySQL pay a lot of attention to making sure the default settings are good enough for the majority of users. If there was some easy route to improving the performance of a database, they'd already have done it at the factory.
The guys who make the third party tools, on the other hand, write code that reads your data and your logs to find out information about your tables, their contents, and your queries, and that code makes best-guess estimates about tuning based on your data and your workload. They're not perfect, but they sure beat having to do that stuff manually if you don't know how to.
Think of tuning a database like tuning a guitar: You start with an idea of what you want (Standard tuning? Drop D? DADGAD?) and then you make small adjustments to one string at a time, measuring it against your desired result. Once you've achieved the best possible result for that string, you move onto the next one and make small changes there etc. When you get to the final string, you might have adjusted the balance of the whole guitar so you might have to revisit the settings from the beginning to make tiny incremental changes until the whole lot is singing perfectly.
Read http://dev.mysql.com/doc/refman/5.5/en/server-parameters.html to get started on the most important "strings" to tune in MySQL 5.5. There are lots, but none of them are particularly difficult on their own.
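As a concrete example of working one "string" at a time (a sketch only: key_buffer_size matters for MyISAM, which this question uses, and the 256M value is purely illustrative, not a recommendation):

SHOW GLOBAL VARIABLES LIKE 'key_buffer_size';   -- current setting
SET GLOBAL key_buffer_size = 256 * 1024 * 1024; -- one small adjustment
SHOW GLOBAL STATUS LIKE 'Key_read%';            -- measure the effect, then iterate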
As a tangent, tuning your server away from the defaults might give you a 5-10% boost in performance. You'd be much better spending your time looking at your database design, data types, and the indexes you're using. You can often make 50%-100% improvements in performance by doing that sort of thing.
You should find http://www.mysqlcalculator.com/ helpful for starters.
It will show you some critical general defaults and let you enter your own values, as displayed by SHOW GLOBAL VARIABLES, to calculate MySQL's maximum memory usage.
This will only scratch the surface - and will be enlightening.
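For a starting point, a sketch of pulling the calculator's inputs in one statement (the exact variable list it uses may differ):

-- Per-connection buffers multiply by max_connections, which is how
-- several "small" settings can add up to more RAM than the machine has.
SHOW GLOBAL VARIABLES WHERE Variable_name IN
  ('key_buffer_size', 'innodb_buffer_pool_size', 'query_cache_size',
   'max_connections', 'sort_buffer_size', 'read_buffer_size',
   'read_rnd_buffer_size', 'join_buffer_size', 'tmp_table_size',
   'max_heap_table_size', 'thread_stack');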
There is NO simple answer.

Does MySQL consume significantly more resources compared to other DBMS?

There is a growing tendency to shift from MySQL to NoSQL, SQLite, etc. I have read many blogs and articles comparing the speed of MySQL with other types of DBMS. However, I believe that speed is not a problem with MySQL, as it is really fast; the problem is more connected to resource usage. It is common to face extreme server load due to slow MySQL queries. For instance, an advantage of Oracle over MySQL is having fewer problems associated with memory leaks.
Is it true that MySQL consumes significantly more resources (CPU and memory) compared with other databases such as SQLite, non-relational databases, and key/value stores? By significantly, I mean: is that the main reason for not using MySQL for large databases (to save server costs)?
If YES (to 1), what would be an estimate of the resource-usage advantage of a similar system such as SQLite compared with MySQL?
Note: Consider a simple system, as the advanced features of MySQL are not needed; just compare the performance of simple queries.
If you're only using "simple" queries, I don't think there's much of a difference in resource usage between MySQL and, for example, Oracle.
Those "professional" DBMSs do a lot of "magic" regarding caching, prefetching and data maintenance.
Of course MySQL does that as well, but it might not be as efficient for really complex databases and advanced queries.
Your choice of DBMS depends heavily on what you're planning to do, especially if you're choosing between SQL/NoSQL/key-value/..., which are for completely different scenarios; that's not so much a question of memory and CPU usage.
CPU and memory are never the reason, as they are cheap. The problem is I/O speed. NoSQL databases are used in write-intensive applications, as well as in applications that need a schema-less database (because changing the table schema in MySQL involves rewriting the table, which may be extremely slow). So trade-offs are made to optimize disk operations, which often leads to consuming more CPU, memory or disk space.
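To make the schema-change point concrete, a hedged sketch (the table and column are hypothetical, and online DDL only arrived in later MySQL versions):

-- On MySQL 5.5 this copies the entire table into a new one with the
-- extra column, blocking writes for the duration:
ALTER TABLE events ADD COLUMN source VARCHAR(64) NULL;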
Another reason could be pessimistic vs optimistic locks. Which is another topic.
But since the answer to the question "Is it true that MySQL consumes significantly more resources (CPU and memory) compared with other databases?" is NO, it is pointless to discuss it further :)

How to benchmark / test the speed of MySQL, Postgresql, MongoDB, where there are so many cache layers around?

I think that with SQLite3, at least, no pages stay cached, because there is no server and each write exits SQLite3 when the call returns, so it can't do any caching of its own.
But with MySQL, PostgreSQL, or MongoDB, there is a layer where data that is thought to be saved already is actually sitting in the DBMS's memory cache, to be written to disk later.
And even when it is written to the disk, there is an OS layer that holds sectors that are yet to be written to the disk.
And then there is the hard drive's own cache. At 8MB, if the test inserts enough data to create an 800MB database, the error it introduces may be 1% or less.
But what about the other layers? The data really needs to be flushed all the way down to the OS layer. Otherwise, with machines having 4GB or 8GB of RAM, the whole database can easily reside in RAM, making it look quite fast. How do we tell the test to flush the data all the way to the hard disk's physical layer, or at least out of the OS layer?
When you are benchmarking, you can never negate all the speed optimizations at every layer, down to the OS or even CPU level, including caching. You don't need to. What you can do is benchmark performance in the different lifecycle states of your system. Also, if you know approximately what data is cached and when, you can benchmark before and after that point. For example: clean start, first access to the DB dataset, subsequent DB access, etc. It is best to establish the bottlenecks first and then benchmark only those in greater detail. Another good practice is to simulate a real-life system load and benchmark that. Any synthetic benchmarks are practically pointless. Good luck ;)
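If you do want to approximate the "clean start" state without rebooting, here is a sketch of what can be flushed from inside MySQL itself; everything below the DBMS (OS page cache, drive cache) needs OS-level tools such as sync and Linux's /proc/sys/vm/drop_caches:

FLUSH TABLES;       -- close tables and flush MyISAM buffers
RESET QUERY CACHE;  -- empty the query cache
-- For InnoDB, force an fsync on every commit during the test run:
SET GLOBAL innodb_flush_log_at_trx_commit = 1;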

Scalability comparison between different DBMSs

By what factor does the performance (read queries/sec) increase when a machine is added to a cluster of machines running either:
a Bigtable-like database
MySQL?
Google's research paper on Bigtable suggests that "near-linear" scaling can be achieved with Bigtable. This page, featuring MySQL's marketing jargon, suggests that MySQL is capable of scaling linearly.
Where is the truth?
Having built and benchmarked several applications using VoltDB, I consistently measure between 90% and 95% additional transactional throughput as each new server is added to the cluster. So if an application performs 100,000 transactions per second (TPS) on a single server, I measure 190,000 TPS on 2 servers, 280,000 TPS on 3 servers, and so on. At some point we expect server-to-server networking to become a bottleneck, but our largest cluster (30 servers) is still above 90%.
If you don't do that many writes to the database, MySQL may be a good and easy solution, especially if coupled with memcached in order to increase the read speed.
OTOH, if your data is constantly changing, you should probably look somewhere else:
Cassandra
VoltDB
Riak
MongoDB
CouchDB
HBase
These systems have been designed to scale linearly with the number of computers added to the system.
A full list is available here.

Best storage engine for constantly changing data

I currently have an application that uses 130 MySQL tables, all with the MyISAM storage engine. Every table receives multiple queries every second, including select/insert/update/delete queries, so the data and the indexes are constantly changing.
The problem I am facing is that the hard drive is unable to cope, with waiting times of 6+ seconds for I/O access given how many reads/writes MySQL is doing.
I was thinking of changing to just one table and making it memory-based. I've never used a MEMORY table for something with so many queries, though, so I am wondering if anyone can give me feedback on whether it would be the right thing to do.
One possibility is that there may be other issues causing performance problems - 6 seconds seems excessive for CRUD operations, even on a complex database. Bear in mind that (back in the day) ArsDigita could handle 30 hits per second on a two-way Sun Ultra 2 (IIRC) with fairly modest disk configuration. A modern low-mid range server with a sensible disk layout and appropriate tuning should be able to cope with quite a substantial workload.
Are you missing an index? Check the query plans of the slow queries for table scans where they shouldn't be (see the sketch after this list).
What is the disk layout on the server? - do you need to upgrade your hardware or fix some disk configuration issues (e.g. not enough disks, logs on the same volume as data).
As the other poster suggests, you might want to use InnoDB on the heavily written tables.
Check the setup for memory usage on the database server. You may want to configure more cache.
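A sketch of the index check from the first point; the table, column, and index names are hypothetical:

-- type = ALL in the EXPLAIN output means a full table scan.
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;
-- If the plan shows ALL rather than ref/range, add the missing index:
ALTER TABLE orders ADD INDEX idx_customer (customer_id);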
Edit: Database logs should live on quiet disks of their own. They use a sequential access pattern of many small sequential writes. Where they share disks with a random-access workload such as data files, the random disk access creates a big system-performance bottleneck on the logs. Note that this is write traffic that needs to be completed (i.e. written to physical disk), so caching does not help.
I've now changed to a MEMORY table and everything is much better. In fact I now have extra spare resources on the server allowing for further expansion of operations.
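For anyone following along, a hedged sketch of what that conversion looks like, with its main caveats (the table name is hypothetical):

-- A MEMORY table loses its contents on server restart and cannot hold
-- BLOB/TEXT columns, so it only suits data you can rebuild.
ALTER TABLE hot_data ENGINE = MEMORY;
SET GLOBAL max_heap_table_size = 512 * 1024 * 1024; -- caps its growth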
Is there a specific reason you aren't using innodb? It may yield better performance due to caching and a different concurrency model. It likely will require more tuning, but may yield much better results.
should-you-move-from-myisam-to-innodb
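A minimal sketch of trying the switch on one heavily written table (the name is hypothetical):

-- InnoDB locks rows rather than whole tables, which is usually the
-- win over MyISAM for a mixed read/write workload.
ALTER TABLE busy_table ENGINE = InnoDB;
-- Note: innodb_buffer_pool_size is not dynamic on MySQL 5.5; size it
-- in my.cnf and restart for the buffer pool to pull its weight.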
I think that your database structure is very wrong and needs to be optimised; this has nothing to do with the storage engine.