Load protecting a multi database mariadb cluster with HAProxy - mysql

I have a setup were multiple databases (one per tenant) live within the same mariadb server (or cluster). The goal is to protect mariadb from to many connections but also from to many connections to each database. Basically throttle each tenant at the database level without affecting others.
Example: The tenant1 database is being hit hard and limited at a total of 10 connections. Other connections are queued. At the same time tenant2 can continue working as normal because it has not hit any limit and is therefore not affected by the queue.
I know HAProxy is great if you have one database being hit from multiple applications as you can have connections queued in HAProxy instead of hitting a hard limit in the database and having to deal with that in the application.
So the question, can HAProxy be used as a front for multiple databases within the same cluster (potentially with their own database credentials) and allow throttling connections per database. Or would you need multiple HAProxy servers for that (one per tenant)?

Another approach is to set up separate VMs, each with a MySQL instance. Then throttle access via CGroups. With this approach, HAProxy (etc) is not relevant unless you have some replication also.
CGroups has the feature that each VM can get "at least a certain percentage" of various resources (CPU, net, I/O). When the system is too busy, that percentage becomes a max. When otherwise idle, users can use more than their share.
You have set the VARIABLE max_user_connections?

Related

Google Cloud SQL MySQL 2nd Gen Concurrent Connections?

The pricing page only gives this information for the 1st gen. Does anybody know the concurrent connection limits for the 2nd gen?
Second Generation instances are configured to allow up to 4000 connections though it does not mean that you can safely run your workload at 4000 connections for a given instance size. Different workloads will have different demands so you still need to monitor/benchmark your application to choose the appropriate instance size.
e.g. You might be able to make 4000 concurrent connections to a n1-standard-1 instance but it's unlikely to perform well for many workloads

AWS reading mysql replicas instances keeps gettin Too Many Connections error

I've purchased a single VPC on AWS and initiated there 6 MySql databases, and foreach one I've created a reading replica, so that I can always run queries on the reading replicas quickly.
Most of the day, my writing instances (original instances) are fully loaded and their CPUs percentage is mostly 99%. However, the reading replicas shows something ~7-10% CPU usage, but sometimes I get an error when I run a service connecting to the reading replica "TOO MANY CONNECTIONS".
I'm not that expert with AWS, but is this happening because the writing replicas are fully loaded and they're on the same VPC?
this happening because the writing replicas are fully loaded and they're on the same VPC?
No, it isn't. This is unrelated to replication. In replication, the replica counts as exactly 1 connection on the master, but replication does not consume any connections on the replica itself. There is no impact on connections related to the intensity of the total workload from replication.
This issue simply means you have more clients connecting to the replica than are allowed by the parameter group based on your RDS instance type. Use the query SELECT ##MAX_CONNECTIONS; to see what this limit is. Use SHOW STATUS LIKE 'THREADS_CONNECTED'; to see how many connections exist currently, and use SHOW PROCESSLIST; (as the administrative user, or any user holding the PROCESS privilege) in order to see what all of these connections are doing.
If many of them show Sleep and have long values in Time (seconds spent in the current state) then the problem is that your application is somehow abandoning connections, rather than properly closing them after use or when they are otherwise no longer needed.

Max no of connections using web sockets

I am developing a web application using web-sockets which needs real time data.
The number of clients using the web application will be over 100 000.
Server side web socket coding is done in Java. Can a single web-socket server handle this amount of connections?
If not, how can I achieve this. I have to use web sockets only.
WebSocket servers, like any other TCP-based server, can open huge numbers of connections. They can be file-descriptor-based. You can find out the max (system-wide) FDs easily enough on Linux:
% cat /proc/sys/fs/file-max
165038
There are system-wide and there are kernel parameters for user limits (and shell-level things like "ulimit"). Btw, you'll need to edit /etc/sysctl.conf to increase your FD mods during a reboot.
And of course you can increase this number to whatever you want (with the proportional impact on kernel memory).
Or servers can do tricks to multiplex a single connection.
But the real question is, what is the profile of the data that will flow over the connection? Will you have 100K users getting 1 64-byte message per day? Or are those 100K users getting 50 1K messages a second? Can the WebSocket server shard its connections over multiple NICs (ie, spread the I/O load)? Are the messages all encrypted and therefore need a lot of CPU? How easily can you cluster your WebSocket server so failover is easy for you and painless for your users? Is your server mission/business critical?... that is, can you afford to have 100K users disappear if a disaster occurs? There are many questions to consider when you thinking about scalability of a WebSocket server.
In our labs, we can create millions of connections on a server (and many more in a cluster). In the real-world, there are other 'scale' factors to consider in a production deployment besides file descriptors. Hope this helps.
Full disclosure: I work for Kaazing, a WS vendor.
As FrankG explained above, the number of WebSocket connections is depended on the use case.
Here are two benchmarks using MigratoryData WebSocket Server for two very different use cases that also detail system configuration (let's note however that system configuration is only a detail and the high scalability is achieved by the architecture of the MigratoryData which has been designed for real-time websites with millions of users).
In one use case MigratoryData scaled up to 10 million concurrent connections (while delivering ~1 Gbps messaging):
https://mrotaru.wordpress.com/2016/01/20/migratorydata-makes-its-c10m-scalability-record-more-robust-with-zing-jvm-achieve-near-1-gbps-messaging-to-10-million-concurrent-users-with-only-15-milliseconds-consistent-latency/
In another use case MigratoryData scaled up to 192,000 (while delivering ~9 Gbps):
https://mrotaru.wordpress.com/2013/03/27/migratorydata-demonstrates-record-breaking-8x-higher-websocket-scalability-than-competition/
These numbers are achieved on a single instance of MigratoryData WebSocket Server. MigratoryData can be clustered so you can also scale horizontally to any number of subscribers in an effective way.
Full disclosure: I work for MigratoryData.

simultaneous connections to a mysql database

I made a program that receives user input and stores it on a MySQL database. I want to implement this program on several computers so users can upload information to the same database simoultaneously. The database is very simple, it has just seven columns and the user will only enter four of them.
There would be around two-three hundred computers uploading information (not always at the same time but it can happen). How reliable is this? Is that even possible?
It's my first script ever so I appreciate if you could point me in the right direction. Thanks in advance.
Having simultaneous connections from the same script depends on how you're processing the requests. The typical choices are by forking a new Python process (usually handled by a webserver), or by handling all the requests with a single process.
If you're forking processes (new process each request):
A single MySQL connection should be perfectly fine (since the total number of active connections will be equal to the number of requests you're handling).
You typically shouldn't worry about multiple connections since a single MySQL connection (and the server), can handle loads much higher than that (completely dependent upon the hardware of course). In which case, as #GeorgeDaniel said, it's more important that you focus on controlling how many active processes you have and making sure they don't strain your computer.
If you're running a single process:
Yet again, a single MySQL connection should be fast enough for all of those requests. If you want, you can look into grouping the inserts together, as well as multiple connections.
MySQL is fast and should be able to easily handle 200+ simultaneous connections that are writing/reading, regardless of how many active connections you have open. And yet again, the performance you get from MySQL is completely dependent upon your hardware.
Yes, it is possible to have up to that many number of mySQL connectins. It depends on a few variables. The maximum number of connections MySQL can support depends on the quality of the thread library on a given platform, the amount of RAM available, how much RAM is used for each connection, the workload from each connection, and the desired response time.
The number of connections permitted is controlled by the max_connections system variable. The default value is 151 to improve performance when MySQL is used with the Apache Web server.
The important part is to properly handle the connections and closing them appropriately. You do not want redundant connections occurring, as it can cause slow-down issues in the long run. Make sure when coding that you properly close connections.

Persistent vs non-Persistent - Which should I use?

My site has always used persistent connections, based on my understanding of them there's no reason not to. Why close the connection when it can be reused? I have a site that in total accesses about 7 databases. It's not a huge traffic site, but it's big enough. What's your take on persistent, should I use them?
With persistent connections:
You cannot build transaction processing effectively
impossible user sessions on the same connection
app are not scalable. With time you may need to extend it and it will require management/tracking of persistent connections
if the script, for whatever reason, could not release the lock on the table, then any following scripts will block indefinitely and one should restart the db server. Using transactions, transaction block will also pass to the next script (using the same connection) if script execution ends before the transaction block completes, etc.
Persistent connections do not bring anything you can do with non-persistent connections.
Then, why to use them, at all?
The only possible reason is performance, to use them when overhead of creating a link to your SQL Server is high. And this depends on many factors like:
database type
whether MySQl server is on the same machine and, if not, how far? might be out of your local network /domain?
how much overloaded by other processes the machine on which MySQL sits
One always can replace persistent connections with non-persistent connections. It might change the performance of the script, but not its behavior!
Commercial RDMS might be licensed by the number of concurrent opened connections and here the persistent connections can misserve
My knowledge on the area is kinda limited so I can't give you many details on the subject but, as far as I know, the process of creating connections and handing them to a thread really costs resources, so I would avoid it if I were you. Anyhow I think that most of this decisions can't be generalized and depend on the business.
If, for instance, your application communicates continuously with the Database and will only stop when the application is closed, then perhaps persistent connections are the way to go, for you avoid the process mentioned before.
However, if your application only communicates with the Database sporadically to get minor information then closing the connection might be more sane, for you won't waste resources on opened connections that are not being used.
Also there is a technique called "Connection Pooling", in which you create a series of connections a priori and keep them there for other applications to consume. In this case connections are persistent to the database but non-persistent to the applications.
Note: Connections in MSSQL are always persistent to the database because connection pooling is the default behavior.