How to speed up MariaDB queries using multiple DB servers, with Galera Cluster and MaxScale? - mysql

I tested with a stress-test tool (running many concurrent users and many queries). There is no speed gain when I use more DB servers (I use 5 servers).
I have checked each server, and I can see the queries are being distributed across all of them.
What should I do if I want to scale the database tier to get more query throughput?

Galera maxes out at somewhere around 5 nodes in the cluster, probably because each write is broadcast to every other node and the committing node must wait for replies.
There are many ways to scale MySQL; Galera is one of them, and it is perhaps the best today for write scaling.
For read scaling, replication slaves provide virtually unlimited scaling. You can hang traditional replication slaves off of each Galera node. This will let you offload the reads from your 5 nodes. Slaves can be cascaded (using "relays"), thereby giving unlimited scaling. A server can easily have 10 slaves hanging off it; do 6 levels of that, and you have a million servers. (I have not worked with more than 3 levels and 30+ slaves.)
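For illustration, hanging an async replica off one Galera node might look like this on the new replica (a sketch: host and credentials are placeholders, MariaDB syntax shown; the Galera node needs log_bin enabled, and intermediate relays need log_slave_updates):

    CHANGE MASTER TO
      MASTER_HOST = 'galera-node-1',   -- placeholder hostname
      MASTER_USER = 'repl',            -- placeholder replication account
      MASTER_PASSWORD = 'secret',
      MASTER_USE_GTID = slave_pos;     -- MariaDB GTID; MySQL uses MASTER_AUTO_POSITION=1
    START SLAVE;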
A common way of scaling is to look at the code. Composite indexes are unknown to a lot of novices. For inserting, batching and LOAD DATA are very effective. For data warehousing, summary tables can often speed up "reports" 10-fold. For high-speed ingestion, ping-ponging a staging table works very well. For GUID/UUID indexes, abandoning them is best; ditto for EAV. For huge deletes there are several approaches.
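A couple of those techniques sketched in SQL, with hypothetical table and column names:

    -- Composite index: one index serving a WHERE on both columns
    ALTER TABLE orders ADD INDEX idx_cust_date (customer_id, order_date);

    -- Batched insert: one statement and one transaction instead of three
    INSERT INTO orders (customer_id, order_date, amount) VALUES
      (17, '2024-01-01',  9.99),
      (17, '2024-01-02', 19.99),
      (42, '2024-01-02',  5.00);

    -- Bulk loading from a file is faster still
    LOAD DATA INFILE '/tmp/orders.csv'
      INTO TABLE orders
      FIELDS TERMINATED BY ',';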
And my tips on Galera.

Related

Optimize write performance for AWS Aurora instance

I've got an AWS Aurora DB cluster running that is 99.9% focused on writes. At its peak, it will be running 2-3k writes/sec.
I know Aurora is somewhat optimized by default for writes, but I wanted to ask as a relative newcomer to AWS - what are some best practices/tips for write performance with Aurora?
From my experience, Amazon Aurora is unsuited to running a database with heavy write traffic. At least in its implementation circa 2017. Maybe it'll improve over time.
I worked on some benchmarks for a write-heavy application earlier in 2017, and we found that RDS (non-Aurora) was far superior to Aurora on write performance, given our application and database. Basically, Aurora was two orders of magnitude slower than RDS. Amazon's claims of high performance for Aurora are apparently completely marketing-driven bullshit.
In November 2016, I attended the Amazon re:Invent conference in Las Vegas. I tried to find a knowledgeable Aurora engineer to answer my questions about performance. All I could find were junior engineers who had been ordered to repeat the claim that Aurora is magically 5-10x faster than MySQL.
In April 2017, I attended the Percona Live conference and saw a presentation about how to develop an Aurora-like distributed storage architecture using standard MySQL with CEPH for an open-source distributed storage layer. There's a webinar on the same topic here: https://www.percona.com/resources/webinars/mysql-and-ceph, co-presented by Yves Trudeau, the engineer I saw speak at the conference.
What became clear about using MySQL with Ceph is that the engineers had to disable the MySQL change buffer, because there is no way to cache changes to secondary indexes while also having the storage distributed. This caused huge performance problems for writes to tables with secondary (non-unique) indexes.
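For reference, disabling the change buffer in a stock MySQL server is a one-liner (I'm assuming the Ceph setup did the equivalent of this; innodb_change_buffering is a standard dynamic variable):

    SET GLOBAL innodb_change_buffering = 'none';  -- stop buffering secondary-index changes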
This was consistent with the performance problems we saw in benchmarking our application with Aurora. Our database had a lot of secondary indexes.
So if you absolutely have to use Aurora for a database with high write traffic, I recommend that the first thing you do is drop all your secondary indexes.
Obviously, this is a problem if the indexes are needed to optimize some of your queries: SELECT queries of course, but some UPDATE and DELETE queries may use secondary indexes too.
One strategy might be to make a non-Aurora read replica of your Aurora cluster, and create the secondary indexes only in the read replica to support your SELECT queries. I've never done this, but apparently it's possible, according to https://aws.amazon.com/premiumsupport/knowledge-center/enable-binary-logging-aurora/
But this still doesn't help cases where your UPDATE/DELETE statements need secondary indexes. I don't have any suggestion for that scenario. You might be out of luck.
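If you do try the replica-only index strategy, the idea is simply that with row-based replication the replica's indexes don't have to match the primary's, so you can add them there alone (a sketch; table and index names are hypothetical):

    -- Run on the external (non-Aurora) read replica ONLY:
    CREATE INDEX idx_orders_status ON orders (status);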
My conclusion is that I wouldn't choose to use Aurora for a write-heavy application. Maybe that will change in the future.
Update April 2021:
Since writing the above, I have run sysbench benchmarks against Aurora version 2. I can't share the specific numbers, but I conclude that the current Aurora is better for write-heavy workloads. I ran tests with lots of secondary indexes to make sure. But I encourage anyone serious about adopting Aurora to run their own benchmarks.
At the very least, Aurora is much better than conventional Amazon RDS for MySQL on EBS storage; that is probably the comparison behind the claim that Aurora is 5x faster than MySQL. But Aurora is no faster than some other alternatives I tested, and in fact cannot match:
MySQL Server installed myself on EC2 instances using local storage, especially i3 instances with locally-attached NVMe. I understand instance storage is not dependable, so one would need to run redundant nodes.
MySQL Server installed myself on physical hosts in our data center, using direct-attached SSD storage.
The value of using Aurora as a managed cloud database is not just about performance. It also has automated monitoring, backups, failover, upgrades, etc.
I had a relatively positive experience with Aurora for my use case. I believe (time has passed) we were pushing somewhere close to 20k DML statements per second on the largest instance type (I think db.r3.8xlarge?). Apologies for the vagueness; I no longer have the ability to get the metrics for that particular system.
What we did:
This system did not require an "immediate" response to a given insert, so writes were enqueued to a separate process. This process would collect N queries and split them into M batches, where each batch correlated with a target table. Those batches would be wrapped in a single transaction.
We did this to get the write efficiency of bulk writes and to avoid cross-table locking. There were 4 separate (I believe?) processes doing this dequeue-and-write behavior.
Due to this high write load, we absolutely had to push all reads to a read replica, as the primary generally sat at 50-60% CPU. We vetted this architecture in advance by simply creating random data-writer processes and modeling the general system behavior before we committed the actual application to it.
The writes were almost all INSERT ... ON DUPLICATE KEY UPDATE statements, and the tables had a number of secondary indexes.
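A dequeued batch might have looked something like this (a sketch with hypothetical tables and values, not our actual code):

    START TRANSACTION;
    -- one multi-row statement per target table, all in one transaction
    INSERT INTO metrics_a (id, val) VALUES (1, 'x'), (2, 'y'), (3, 'z')
      ON DUPLICATE KEY UPDATE val = VALUES(val);
    INSERT INTO metrics_b (id, val) VALUES (7, 'p'), (8, 'q')
      ON DUPLICATE KEY UPDATE val = VALUES(val);
    COMMIT;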
I suspect this approach worked for us simply because we were able to tolerate delay between when information appeared in the system, and when readers would actually need it, thus allowing us to batch at much higher amounts. YMMV.
For Googlers:
Aurora needs to write to multiple replicas in real time, so there must be a queue with locking, waiting, and checking mechanisms.
This behavior inevitably causes very high CPU utilization and lag under continuous write requests, which only succeed once multiple replicas are synced.
This has been the case since Aurora's inception, up until 2020 at least, and it is logically difficult if not impossible to solve while keeping the service's low storage cost and fair compute cost.
High-volume write performance on Aurora MySQL can be more than 10x worse than on RDS MySQL (from personal experience, confirmed by the answers above).
To mitigate the problem (it's more of a workaround):
BE CAREFUL with Aurora if more than 5% of your workload is writes.
BE CAREFUL with Aurora if you need near-real-time results for large-volume writes.
Drop secondary indexes, as Bill Karwin points out above, to improve writes.
Batching inserts and updates may improve write performance.
I say "BE CAREFUL" rather than "DO NOT USE", as many scenarios can be handled with clever architecture design, but database write performance is hard to depend on.

MySQL dual master replication -- is this scenario safe?

I currently have a MySQL dual master replication (A<->B) set up and everything seems to be running swimmingly. I drew on the basic ideas from here and here.
Server A is my web server (a VPS). User interaction with the application leads to updates to several fields in table X (which are replicated to server B). Server B is the heavy-lifter, where all the big calculations are done. A cron job on server B regularly adds rows to table X (which are replicated to server A).
So server A can update (but never add) rows, and server B can add rows. Server B can also update fields in X, but only after the user no longer has the ability to update that row.
What kinds of potential disasters can I expect with this scenario if I go to production with it? Or does this seem OK? I'm asking mostly because I'm ignorant about whether any simultaneous operation on the table (from either the A copy or the B copy) can cause problems or if it's just operations on the same row that get hairy.
Dual master replication is messy if you attempt to write to the same database on both masters.
One of the biggest points of contention (and high blood pressure) is the use of autoincrement keys.
As long as you remember to set auto_increment_increment and auto_increment_offset, you can look up any data you want and retrieve auto-incremented ids.
You just have to remember this rule: if you read an id from serverX, you must look up the needed data from serverX using that same id.
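For two masters, the settings look like this (set them in my.cnf as well so they survive a restart):

    -- serverA: generates ids 1, 3, 5, ...
    SET GLOBAL auto_increment_increment = 2;
    SET GLOBAL auto_increment_offset = 1;

    -- serverB: generates ids 2, 4, 6, ...
    SET GLOBAL auto_increment_increment = 2;
    SET GLOBAL auto_increment_offset = 2;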
Here is one saving grace for using dual master replication.
Suppose you have
two databases (db1 and db2)
two DB servers (serverA and serverB)
If you impose the following restrictions
all writes of db1 to serverA
all writes of db2 to serverB
then you are not required to set auto_increment_increment and auto_increment_offset.
I hope my answer clarifies the good, the bad, and the ugly of using dual master replication.
Here is a pictorial example of 4 masters using auto increment settings
Nice article from Percona on this subject
Master-master replication can be very tricky; are you sure it is the best solution for you? Usually it is used for load-balancing purposes (e.g. round-robin connections to your DB servers), and sometimes to avoid the effects of replication lag. A big known issue is the auto_increment problem, which is supposedly solved by using different offsets and increment values.
I think you should modify your configuration to simple master-slave by making A the master and B the slave, unless I am mistaken about the requirements of your system.
I think you can depend on Percona XtraDB Cluster Feature 2: Multi-Master replication, rather than regular MySQL replication.
They promise the following:
By multi-master I mean the ability to write to any node in your cluster without worrying that you will eventually get an out-of-sync situation, as regularly happens with regular MySQL replication if you imprudently write to the wrong server.
With the Cluster you can write to any node, and the Cluster guarantees consistency of writes: the write is either committed on all nodes or not committed at all.
There are two important consequences of the multi-master architecture.
First: we can have several appliers working in parallel. This gives us true parallel replication; a slave can have many parallel threads, which you can tune via the variable wsrep_slave_threads.
Second: there might be a small period of time when the slave is out of sync with the master. This happens because the master may apply events faster than the slave, so if you read from the slave you may read data that has not changed yet (the original article shows this in a diagram). You can change this behavior with the variable wsrep_causal_reads=ON; in that case a read on the slave will wait until the event is applied (which will increase the response time of the read). This gap between slave and master is the reason this replication is called "virtually synchronous replication" rather than true "synchronous replication".
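Turning that on is a one-line session setting; which variable applies depends on your Galera version, since wsrep_causal_reads was later superseded by wsrep_sync_wait:

    -- older Galera versions:
    SET SESSION wsrep_causal_reads = ON;
    -- or, on newer versions:
    SET SESSION wsrep_sync_wait = 1;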
The described behavior of COMMIT also has a second serious implication.
If you run write transactions to two different nodes, the cluster will use an optimistic locking model.
That means a transaction will not check for possible locking conflicts during individual queries, but rather at the COMMIT stage, and you may get an error response on COMMIT. I am highlighting this because it is one of the incompatibilities with regular InnoDB that you may run into. In InnoDB, DEADLOCK and LOCK TIMEOUT errors usually happen in response to a particular query, not on COMMIT. If you follow good practice you check the error code after COMMIT anyway, but I have seen many applications that do not.
So, if you plan to use the multi-master capabilities of XtraDB Cluster and run write transactions on several nodes, you need to make sure you handle the response to the COMMIT statement.
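A minimal illustration of where the error can surface (hypothetical table; the deadlock error shown is the standard InnoDB one, which Galera reuses for certification failures):

    START TRANSACTION;
    UPDATE accounts SET balance = balance - 10 WHERE id = 1;
    -- if another node committed a conflicting write first, the failure
    -- appears HERE, not on the UPDATE above:
    COMMIT;
    -- ERROR 1213 (40001): Deadlock found when trying to get lock
    -- the application must catch this and retry the whole transaction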
You can find it here, along with a pictorial explanation.
From my rather extensive experience on this topic I can say you will regret writing to more than one master someday. It may be soon, it may not be for a long time, but it will happen. You will have two servers that each have some correct data and some wrong data, and you will either pick one as the authoritative source and throw the other away (probably without really knowing what you're throwing away) or you'll reconcile the two. No matter how you design it, you cannot eliminate the possibility of this happening, so it's a mathematical certainty that it will happen someday.
Percona (my employer) has handled probably several hundred cases of recovery after doing what you're attempting. Some of them take hours, some take weeks, one I helped with took a few months -- and that's with excellent tools to help.
Use a different replication technology or find a different way to do what you want to do. MMM will not help -- it will bring catastrophe sooner. You cannot do this with standard MySQL replication, with or without external tools. You need a replacement replication technology such as Continuent Tungsten or Percona XtraDB Cluster.
It's often easier to just solve the real need in some other fashion and give up multi-master writes, if you want to use vanilla MySQL replication.
Thanks for sharing my master-master MySQL cluster article. As Rolando clarified, this configuration is not suitable for most production environments due to the limitations of auto-increment support.
The most adequate way to get a MySQL cluster is using NDB, which requires at least 4 servers (2 management nodes and 2 data nodes).
I have written a detailed article to get this running on two servers only, which is very similar to my previous article but using NDB instead.
http://www.hbyconsultancy.com/blog/mysql-cluster-ndb-up-and-running-7-4-and-6-3-on-ubuntu-server-trusty-14-04.html
Note that I always recommend analysing your needs and finding the most adequate solution; don't just look at the available solutions and try to figure out whether or not they fit your needs.
-Hatem
I would highly recommend looking into a tool that will manage this for you. Multi-master replication can be very troublesome if things go wrong.
I would suggest something like Percona XtraDB Cluster. I've been following this project, and it looks very cool. I definitely think it will be a game changer in the MySQL world. It's still in beta though.

Clustering, Sharding or simple Partition / Replication

We have created a Facebook application and it got a lot of virality. The problem is that our database started getting REALLY FULL (some tables have more than 25 million rows now). It got to the point that the app just stopped working because there was a queue of thousands and thousands of writes to be made.
I need to implement a solution to scale this app QUICKLY, but I'm not sure whether I should pursue sharding or clustering, since I'm not sure what the pros and cons of each are. I was also thinking of a partition/replication approach, but I think that doesn't help if the load is on the writes?
25 million rows is a completely reasonable size for a well-constructed relational database. Something you should bear in mind, however, is that the more indexes you have (and the more comprehensive they are), the slower your writes will be. Indexes are designed to improve query performance at the expense of write speed. Be sure that you're not over-indexed.
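A quick way to audit that (table name is hypothetical): list the indexes and ask which queries actually use each one, since every index must be maintained on every write:

    SHOW INDEX FROM posts;
    -- drop any index no query uses, e.g.:
    -- ALTER TABLE posts DROP INDEX idx_unused;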
What sort of hardware is powering this database? Do you have enough RAM? It's far easier to change these attributes than it is to try to implement complex RDBMS load balancing techniques, especially if you're under a time crunch.
Clustering/sharding/partitioning comes in when a single node has reached the point where its hardware cannot bear the load, but your hardware still has room to expand.
This is the first lesson I learnt when I started being hit by such issues.
Well, to understand that, you need to understand how MySQL handles clustering. There are two main ways to do it: master-master replication, or NDB (Network Database) clustering.
Master-master replication won't help with write loads, since both masters need to replay every single write issued (so you're not gaining anything).
NDB clustering will work very well for you if and only if you are doing mostly primary-key lookups (since only with PK lookups can NDB operate more efficiently than a regular master-master setup). All data is automatically partitioned among many servers. Like I said, I would only consider this if the vast majority of your queries are nothing more than PK lookups.
So that leaves two more options: sharding, and moving away from MySQL.
Sharding is a good option for handling a situation like this. However, to take full advantage of sharding, the application needs to be fully aware of it. So you would need to go back and rewrite all the database-access code to pick the right server to talk to for each query. And depending on how your system is currently set up, it may not be possible to shard effectively...
But another option, which I think may suit your needs best, is switching away from MySQL. Since you're going to need to rewrite your DB access code anyway, it shouldn't be too hard to switch to a NoSQL database (again, depending on your current setup). There are tons of NoSQL servers out there, but I like MongoDB. It should be able to withstand your write load without worry. Just beware that you really need a 64-bit server to use it properly (given your data volume).
Replication is for data backup, not for performance, so it's out of the question.
Well, 8 GB of RAM is still not that much; you can have many hundreds of GB of RAM and quite a lot of disk space, and MySQL would still work for you.
Clustering/sharding/partitioning comes in when a single node has reached the point where its hardware cannot bear the load, but your hardware still has room to expand.
If you don't want to upgrade your hardware, then you need to give more information about the database design and whether or not there are a lot of joins, so that the options named above can be considered in depth.

Best storage engine for constantly changing data

I currently have an application that is using 130 MySQL tables, all with the MyISAM storage engine. Every table has multiple queries every second, including select/insert/update/delete queries, so the data and the indexes are constantly changing.
The problem I am facing is that the hard drive is unable to cope, with waiting times of 6+ seconds for I/O access given so many reads and writes being done by MySQL.
I was thinking of changing to just 1 table and making it memory based. I've never used a memory table for something with so many queries though, so I am wondering if anyone can give me any feedback on whether it would be the right thing to do?
One possibility is that there may be other issues causing performance problems - 6 seconds seems excessive for CRUD operations, even on a complex database. Bear in mind that (back in the day) ArsDigita could handle 30 hits per second on a two-way Sun Ultra 2 (IIRC) with fairly modest disk configuration. A modern low-mid range server with a sensible disk layout and appropriate tuning should be able to cope with quite a substantial workload.
Are you missing an index? Check the query plans of the slow queries for table scans where they shouldn't be (see the EXPLAIN sketch after this list).
What is the disk layout on the server? - do you need to upgrade your hardware or fix some disk configuration issues (e.g. not enough disks, logs on the same volume as data).
As the other poster suggests, you might want to use InnoDB on the heavily written tables.
Check the setup for memory usage on the database server. You may want to configure more cache.
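For the first point above, checking a plan is a matter of prefixing the slow query with EXPLAIN (query and table are hypothetical):

    EXPLAIN SELECT * FROM orders WHERE customer_id = 42;
    -- type = ALL means a full table scan; with a usable index you should
    -- see ref or range and a much smaller "rows" estimate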
Edit: Database logs should live on quiet disks of their own. They use a sequential access pattern with many small sequential writes. Where they share disks with a random-access workload like data files, the random disk access creates a big system performance bottleneck on the logs. Note that this is write traffic that needs to be completed (i.e. written to physical disk), so caching does not help.
I've now changed to a MEMORY table and everything is much better. In fact I now have extra spare resources on the server allowing for further expansion of operations.
Is there a specific reason you aren't using innodb? It may yield better performance due to caching and a different concurrency model. It likely will require more tuning, but may yield much better results.
should-you-move-from-myisam-to-innodb
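The conversion itself is one statement per table (table name hypothetical; it rebuilds the table, so run it off-peak):

    ALTER TABLE sessions ENGINE = InnoDB;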
I think that your database structure is very wrong and needs to be optimised; this has nothing to do with the storage engine.

How to scale MySQL with multiple machines?

I have a web app running on LAMP. We recently had an increase in load and are now looking at solutions to scale. Scaling Apache is pretty easy: we are just going to have multiple machines hosting it and round-robin the incoming traffic.
However, each instance of Apache will talk to MySQL, and eventually MySQL will be overloaded. How do I scale MySQL across multiple machines in this setup? I have already looked at this, but we specifically need updates from the DB available immediately, so I don't think replication is a good strategy here. Also, hopefully this can be done with minimal code change.
PS. We have around a 1:1 read:write ratio.
There are only two strategies: replication and sharding. Replication is often used when you have little write traffic and a lot of read traffic, so you can redirect the reads to many slaves, with the pitfalls of growing replication traffic over time and the possibility of inconsistency.
With sharding you split your database tables across multiple machines (called functional sharding), which makes joins in particular much harder. If that no longer suffices, you also need to shard your rows across multiple machines, but that is no fun and requires a sharding layer between your application and the database.
Document-oriented databases and column stores do this work for you, but they are currently optimized for OLAP, not OLTP.
Depending on how the application backend handles PKs, transactions, and insert IDs, you might consider MASTER-MASTER replication with different auto_increment setups. This can be tricky and needs to be thoroughly tested, but it can work.
Also, MySQL 5.6 adds GTIDs (Global Transaction Identifiers), which generally help a lot in keeping replication in sync, especially in this scenario.
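A sketch of pointing a replica at a master using GTID auto-positioning in MySQL 5.6 (host and credentials are placeholders; both servers need gtid_mode=ON, enforce_gtid_consistency, and log_slave_updates in my.cnf):

    CHANGE MASTER TO
      MASTER_HOST = 'master2',      -- placeholder
      MASTER_USER = 'repl',
      MASTER_PASSWORD = 'secret',
      MASTER_AUTO_POSITION = 1;     -- GTID-based positioning
    START SLAVE;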
You should take a look at MySQL Performance Blog. Maybe you'll find something useful.
Well... good luck scaling all those writes to a really large scale. The database engine becomes the bottleneck: too many locks, buffer management, and so on.
The only way I have found that really works is scale-out, i.e. sharding. Unfortunately, sharding is not provided for MySQL "out of the box" (unlike in some NoSQL databases such as Mongo). ScaleBase (disclaimer: I work there) makes a complete scale-out solution, an "automatic sharding machine" if you like. ScaleBase analyzes your data and SQL stream, splits the data across DB nodes, and routes commands and aggregates results at runtime, so you won't have to!