How many concurrent connections can RDS handle? - mysql

I host two MySQL databases on Amazon RDS, one on a db.m3.medium instance and one on a db.r3.large, both running MySQL 5.6.27 with the InnoDB engine. Now I want to know: how many concurrent connections can these instances handle? How do I load test this? And what will be the impact if 1,000 concurrent users access the databases?

It depends on many factors: the MySQL server configuration, data volumes, the nature of the requests, and so on. The only way to tell for sure is to simulate your 1,000 users with a load testing tool.
A few examples:
Tsung -> MySQL
Apache JMeter -> The Real Secret to Building a Database Test Plan With JMeter
Grinder -> Grinding a database with JDBC
If you are happy with the databases' performance at 1,000 users, you can gradually increase the load to identify the breaking point.
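As a starting point, note that RDS derives the default max_connections from the instance's memory (for MySQL the formula is {DBInstanceClassMemory/12582880}), so the hard ceiling differs between your db.m3.medium and db.r3.large. Below is a minimal Python sketch that reads that limit and then opens connections until the server refuses them; the endpoint and credentials are placeholders, and holding idle connections open is only a crude probe, not a substitute for the tools above.

```python
# Crude connection-ceiling probe; assumes mysql-connector-python is
# installed and the endpoint/credentials below are placeholders.
import mysql.connector
from mysql.connector import Error

HOST = "mydb.abc123.us-east-1.rds.amazonaws.com"  # hypothetical endpoint
USER, PASSWORD = "loadtest", "secret"

# Check the server-side ceiling RDS configured for this instance class
probe = mysql.connector.connect(host=HOST, user=USER, password=PASSWORD)
cur = probe.cursor()
cur.execute("SHOW VARIABLES LIKE 'max_connections'")
print(cur.fetchone())

connections = []
try:
    for i in range(1, 1001):  # try to hold 1,000 connections open at once
        connections.append(mysql.connector.connect(
            host=HOST, user=USER, password=PASSWORD, connection_timeout=5))
        if i % 100 == 0:
            print(f"{i} connections open")
except Error as exc:
    # Usually fails with "Too many connections" once the limit is reached
    print(f"Refused after {len(connections)} connections: {exc}")
finally:
    for conn in connections:
        conn.close()
    probe.close()
```

Keep in mind that 1,000 open connections is not the same thing as 1,000 active users: real users issue queries, so the practical limit depends on what those queries do, which is exactly what JMeter, Tsung, or Grinder measure.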

Related

MySQL localhost vs Amazon RDS instance performance

I am running a Django REST API on an AWS EC2 server. Right now the APIs use a local MySQL database. Should I move my database from local MySQL to an Amazon RDS instance?
As far as I know, a remote server takes a little extra time to transmit requests, and its resources are shared. Would this little extra time be worth migrating my database from local MySQL to an Amazon RDS instance?
I read this answer, but it didn't help me much:
MySql localhost vs Amazon RDS instance
An answer with all possible pros and cons would really be appreciated.
Pros for local MySQL
Slightly faster, because of proximity to the application
Cons for local MySQL
Not easily scalable
If you want to use autoscaling for your application load and traffic, local MySQL becomes a nightmare: as you scale, every new node brings up its own MySQL server.
Pros for RDS
You don't have to worry about installing and maintaining MySQL server
You don't have to worry about scaling
You don't have to worry about load balancing
You don't have to worry about EC2 upgrades and patching
You don't have to worry about failure recovery, because when you provision a Multi-AZ DB instance, Amazon RDS synchronously replicates the data to a standby instance in a different Availability Zone (AZ)
Cons for RDS
Slightly slower due to network latency
It depends on how database-intensive your application is.
See this benchmark: the local database blew RDS out of the water on query latency under low load.
The answer is probably to use both: a local Redis/MySQL for quick queries, and an off-server RDS for long queries over large data sets, where paying the additional network latency makes sense.
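To make the hybrid approach concrete, here is a minimal cache-aside sketch in Python, assuming redis-py and mysql-connector-python; the hostnames, credentials, and products table are all made up for illustration.

```python
# Cache-aside sketch: local Redis in front of a remote RDS MySQL.
# All connection details and the schema are hypothetical.
import json
import mysql.connector
import redis

cache = redis.Redis(host="localhost", port=6379)
db = mysql.connector.connect(host="mydb.example.rds.amazonaws.com",
                             user="app", password="secret", database="shop")

def get_product(product_id: int):
    key = f"product:{product_id}"
    cached = cache.get(key)
    if cached is not None:          # quick path: local Redis, no network hop
        return json.loads(cached)
    cur = db.cursor(dictionary=True)
    cur.execute("SELECT id, name, price FROM products WHERE id = %s",
                (product_id,))
    row = cur.fetchone()            # slow path: round trip to remote RDS
    if row is not None:
        # default=str handles DECIMAL/date values; cache for 5 minutes
        cache.setex(key, 300, json.dumps(row, default=str))
    return row
```

Reads hit local Redis first and only pay the network round trip to RDS on a cache miss, which is where the latency gap in the benchmark above stops mattering for most requests.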
Also think about using SQLite on S3. If you can easily shard your data, and most queries are read-intensive, it could be a lot cheaper, especially with something like Redis on the server to cache frequent queries.
If you really want to eke out performance per $$, you can use a lot of Pang's tricks by having a hierarchy of SQLite files.

MySQL Group Replication, or is a single server enough?

I'm planning to create a system that tracks visitor clicks in a database. I'm expecting around 1M inserts/day.
On the backend, I'll have an analytics system which will analyze all the data that's been collected over the days/weeks/months/years.
My question is: is it practical to have 2 different MySQL servers + 1 web server? MySQL Server A would receive the click inserts and would be connected to MySQL Server B by group replication, so that when I build reports etc. on Server B, it doesn't load Server A heavily.
These 2 database servers would then be connected to the web server, which would handle all the click requests and also display the backend reports.
Is that a practical solution, or is it better to have one bigger server handle all the MySQL data? Or multiple MySQL servers that load-balance each other? Anything else, perhaps?
1M inserts/day is not a high load by modern standards: 1,000,000 / 86,400 seconds works out to fewer than 12 inserts per second on average.
On sufficiently powerful servers with fast storage and proper tuning of MySQL options, you can expect to support at least 100x that load with a single MySQL server.
A better reason to use multiple MySQL servers is redundancy. Inevitably, any MySQL server needs to be upgraded, or you might have hardware failures and need to replace a disk, or other components. To avoid downtime, you should have a standby database server, which stays in sync with the primary server, either using MySQL replication or by disk-level replication like DRBD.
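If you do go with two servers, the design you describe boils down to a read/write split. Here is a rough sketch of what the web tier's data access could look like, assuming mysql-connector-python, hypothetical hostnames, and a made-up clicks schema; note that for a one-way feed into a reporting server, classic asynchronous replication is often a simpler fit than Group Replication.

```python
# Read/write split sketch; hostnames, credentials, and schema are made up.
import mysql.connector

# Server A: receives the click inserts (the write path)
primary = mysql.connector.connect(host="mysql-a.internal", user="clicks",
                                  password="secret", database="analytics")

# Server B: replica used only for reports, so heavy scans never touch A
replica = mysql.connector.connect(host="mysql-b.internal", user="reports",
                                  password="secret", database="analytics")

def record_click(visitor_id: int, url: str) -> None:
    cur = primary.cursor()
    cur.execute("INSERT INTO clicks (visitor_id, url) VALUES (%s, %s)",
                (visitor_id, url))
    primary.commit()

def daily_report(day: str):
    # day is an ISO date string, e.g. "2017-06-01"
    cur = replica.cursor()
    cur.execute("SELECT url, COUNT(*) FROM clicks "
                "WHERE DATE(created_at) = %s "
                "GROUP BY url ORDER BY COUNT(*) DESC", (day,))
    return cur.fetchall()
```

Heavy analytical scans then run only against Server B, while Server A stays dedicated to the insert stream.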

Benchmarking a MySQL cluster using sysbench

When benchmarking a MySQL cluster using sysbench, do you have to install sysbench on every machine in the cluster to benchmark the cluster's performance? Is there a way to install sysbench on one machine and use it to benchmark MySQL servers on other machines?
If, for example, I have HAProxy as the load balancer for the cluster, configured on its own machine separate from the cluster nodes, can I use just the HAProxy machine to benchmark the entire cluster, since the HAProxy machine does the load balancing and acts as the window to all the other cluster nodes?
I am new to MySQL benchmarking and new to using sysbench.
Thanks.
Yes, you will need to install sysbench on every SQL node you intend to use and benchmark (not on the NDB data nodes).
HAProxy & ProxySQL are two different things but you can get the best of both worlds if you really want to.
HAProxy (High Availability Proxy) is a very fast and reliable open-source solution offering high availability, load balancing, and proxying for TCP and HTTP-based applications; it powers quite a number of the world's most visited web sites. It can load balance any TCP service and is particularly suited to HTTP load balancing, as it supports session persistence and layer 7 processing.
ProxySQL is an open-source MySQL proxy server, meaning it serves as an intermediary between a MySQL server and the applications that access its databases. ProxySQL can improve performance by distributing traffic among a pool of multiple database servers and also improve availability by automatically failing over to a standby if one or more of the database servers fail.
To run the sysbench benchmarks, follow this guide: https://wiki.gentoo.org/wiki/Sysbench
To set up ProxySQL: https://www.digitalocean.com/community/tutorials/how-to-use-proxysql-as-a-load-balancer-for-mysql-on-ubuntu-16-04
To set up highly available HAProxy servers with Keepalived and floating IPs: https://www.digitalocean.com/community/tutorials/how-to-set-up-highly-available-haproxy-servers-with-keepalived-and-floating-ips-on-ubuntu-14-04
To create a multi-node MySQL cluster: https://www.digitalocean.com/community/tutorials/how-to-create-a-multi-node-mysql-cluster-on-ubuntu-16-04
Set the storage engine for the sysbench test tables to NDBCLUSTER (ENGINE=NDBCLUSTER) rather than the default InnoDB.
You will need to create the test database and run sysbench's prepare step before running the benchmark. Good luck!
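If you prefer to drive the whole test from a single machine (for example the HAProxy box), that also works: sysbench speaks the MySQL protocol over TCP via --mysql-host, so it can benchmark remote servers through the load balancer; just keep in mind that the client machine and the proxy then become part of what you are measuring. A driver sketch in Python, assuming sysbench 1.0+ is installed on the benchmarking machine and that the hostnames and credentials are placeholders:

```python
# Orchestrates sysbench's prepare/run/cleanup phases against the cluster's
# HAProxy front end. Hostnames and credentials are hypothetical.
import subprocess

COMMON = [
    "--mysql-host=haproxy.internal",  # point at the load balancer, not a node
    "--mysql-port=3306",
    "--mysql-user=sbtest",
    "--mysql-password=secret",
    "--mysql-db=sbtest",
    "--tables=8",
    "--table-size=100000",
    # For NDB tables, sysbench's MySQL driver also accepts
    # --mysql-storage-engine=ndbcluster during the prepare phase.
]

def sysbench(phase: str, extra=()) -> None:
    cmd = ["sysbench", "oltp_read_write", *COMMON, *extra, phase]
    subprocess.run(cmd, check=True)

sysbench("prepare")                              # create and fill test tables
sysbench("run", ("--threads=64", "--time=300"))  # 64 clients for 5 minutes
sysbench("cleanup")                              # drop the test tables
```

Running it per SQL node (as above) isolates each node's capacity; running it through HAProxy measures the cluster as your application will actually see it.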

Scalable web application architecture

I have a really simple bookshop web application written in the Spring Framework, just to test its scalability.
I deployed this bookshop on one EC2 instance (t1.micro), with the database on Amazon RDS (t1.micro) using master/slave replication: one master instance and 3 slave instances (there are far more reads than writes). One t1.micro RDS instance can have at most 32 concurrent connections.
I then did stress testing with JMeter and found that the bottleneck is the database, precisely because of that 32-connection limit on a t1.micro RDS instance.
Should I auto-scale RDS database instances, given that creating a new replica impacts the master and it takes a long time for the replica to become available?
Instead of using RDS, should I create EC2 instances running a MySQL master/replica setup and auto-scale those instances?
Should I shard my database instead of replicating it?
The application also uses com.mysql.jdbc.ReplicationDriver to load balance between the master and slave instances. Should I use something different, like HAProxy?
Have you ever considered caching and partitioning? The web applications we have worked on use Memcache, and it really helps with performance issues. On the other hand, if you have tables with a great many records, you should consider partitioning; accessing those tables through their partitions can have a remarkable effect (a sketch follows below).
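For the partitioning side, here is a minimal sketch of a range-partitioned table, assuming mysql-connector-python and an entirely hypothetical orders table:

```python
# Range-partitioning sketch; connection details and schema are made up.
import mysql.connector

conn = mysql.connector.connect(host="mydb.example.rds.amazonaws.com",
                               user="app", password="secret",
                               database="bookshop")
cur = conn.cursor()
cur.execute("""
    CREATE TABLE orders (
        id      BIGINT NOT NULL AUTO_INCREMENT,
        created DATE   NOT NULL,
        total   DECIMAL(10,2) NOT NULL,
        -- MySQL requires the partitioning column in every unique key
        PRIMARY KEY (id, created)
    )
    PARTITION BY RANGE (YEAR(created)) (
        PARTITION p2012 VALUES LESS THAN (2013),
        PARTITION p2013 VALUES LESS THAN (2014),
        PARTITION pmax  VALUES LESS THAN MAXVALUE
    )
""")
conn.commit()
```

A query such as SELECT ... WHERE created BETWEEN '2013-01-01' AND '2013-12-31' is then pruned to partition p2013 instead of scanning the whole table.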

Is it normal for MySQL to be slow when connecting to a remote host?

Is it normal for MySQL to be slow when connecting to a remote host, or should it have the same performance as connecting to a local host?
I noticed a small performance difference when I tried to connect to a remote host, so I'm wondering if that's normal.
Assuming that the remote machine is equal in processing power to your local machine, the primary difference in speed should be network latency - the round-trip time for network traffic. If you are sending huge amounts of data (e.g., reading or writing large BLOBs), then network bandwidth can come into play as well and "slow" things down. But in general, the round-trip cost is often the biggest factor. If you are executing a large number of "small" queries, this cost difference can be fairly significant when comparing a local connection to a remote connection.
Out of curiosity, I just now ran a test that I had already built that simply runs a bunch of update queries. This is not using MySQL but another client/server DBMS. Thus the results would likely be different, but the idea is the same and I would imagine the relative differences would not be significantly different.
Local host (using IPC comm): 5.3 seconds
Remote host (UDP comm): 20.2 seconds
This involved about 50,000 operations. The remote host was 2 hops away on the LAN with (if I measured it correctly) a round-trip latency of approximately 0.25 ms for a packet with a 1-byte payload. That latency alone accounts for roughly 50,000 × 0.25 ms ≈ 12.5 s, which is in line with the ~15 s difference between the two runs.
It depends entirely on the network connection between the program and the MySQL database server. A slow network will make the database appear slow.
I'd expect a "small performance difference" (as you described it) to be normal for a remote connection.
By default the MySQL server performs a reverse DNS lookup the first time a given client connects to it, then stores the result in its cache. This can be a performance hit, depending on the speed of the reverse DNS resolution; it can be disabled with the skip-name-resolve option, at the cost of host-based grants having to use IP addresses.
It can depend on how many MySQL queries you're doing: Slow MySQL Remote Connection
You can often optimize your code by converting many small queries into fewer, larger ones, as in the sketch below.
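A minimal illustration of that idea in Python, assuming mysql-connector-python; the host, table, and column names are made up:

```python
# Batching sketch: many round trips vs. one. Connection details are fake.
import mysql.connector

conn = mysql.connector.connect(host="db.remote.example", user="app",
                               password="secret", database="shop")
cur = conn.cursor()
ids = list(range(1, 501))

# Slow over a remote link: 500 queries -> 500 network round trips
for i in ids:
    cur.execute("SELECT name FROM products WHERE id = %s", (i,))
    cur.fetchone()

# Fast: one query -> one round trip for the same rows
placeholders = ", ".join(["%s"] * len(ids))
cur.execute(f"SELECT id, name FROM products WHERE id IN ({placeholders})", ids)
rows = cur.fetchall()

# Same idea for writes: executemany batches the inserts
cur.executemany("INSERT INTO audit_log (product_id) VALUES (%s)",
                [(i,) for i in ids])
conn.commit()
```

Over a link with 0.25 ms of round-trip latency, the loop pays that cost 500 times while the IN() version pays it once; mysql-connector-python's executemany() similarly rewrites batched INSERTs into a single multi-row statement.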