I have a Saas application on AWS ECS and databases on AWS RDS. We are planning to implement AWS RDS Proxy for pooling implementation. From the RDS proxy documentation, I saw that we don't need to make any changes to the application code. Currently, we are using application side connection pooling. When we implement an RDS proxy for pooling, does the current pooling have any impact?
Do we need to remove the application side pooling to work with RDS effectively?
My main concern is, if I choose 100% pooling in RDS proxy and from application pooling configuration if we limit that to say 100 max connection. Will that be a bottleneck?
TLDR: keep the connection pool in your application, and size it to the number of connections required by that one instance of your application (e.g. the ECS task or EKS pod).
With a database proxy in the middle, there are two separate legs to a "connection":
First, there is a connection from the application to the proxy. What you called the "application side pooling" is this type of connection. Since there's still overhead associated with creating a new instance of this type of connection, continuing to use a connection pool in your application probably is a good idea.
Second, there is a connection from the proxy to the database. These connections are managed by the proxy. The number of connections of this type is controlled by a proxy configuration. If you set this configuration to 100%, then you're allowing the proxy to use up to the database's max_connections value, and other clients may be starved for connections.
So, when your application wants to use a connection, it needs to get a connection from its local pool. Then, the proxy needs to pair that with a connection to the database. The proxy will reuse connections to the database where possible (this technique also is called multiplexing). Or, quoting the official docs: "You can open many simultaneous connections to the proxy, and the proxy keeps a smaller number of connections open to the DB instance or cluster. Doing so further minimizes the memory overhead for connections on the database server. This technique also reduces the chance of "too many connections" errors."
As your container orchestrator (e.g. ECS or EKS) scales your application horizontally, your application will open/close connections to the proxy, but the proxy will prevent your database from becoming overwhelmed by these changes.
Related
I've created a Java Spring Boot application that launches 36 downloader droplets on digital ocean, which ssh tunnel to a database CPU Optimized droplet and downloads from an API into the database.
I've configured hikari as follows towards less pooling connections assuming the database may have trouble with too many and thinking they might not be required.
spring.datasource.hikari.maximumPoolSize=5
spring.datasource.hikari.connectionTimeout=200000
spring.datasource.hikari.maxLifetime=1800000
spring.datasource.hikari.validationTimeout=100000
I'm wondering if those settings may or may not be recommended and why. I've reduced the maximumPoolSize to 5 however I haven't found much information on whether that is considered too small for Java Spring Boot Application to run effectively.
Given each downloader is storing data in the database sequentially do I need to have more than a few pooling connections on each downloader?
I've configured the maximum connections in mysql to 250 and the maximum ssh connections on the database server to 200. I note that 114 sshD processes are created on the server. Can a server handle that many ssh tunneling connections?
Do you forsee any problems with this kind of distributed setup with Spring boot? One thing I have had to do before adjusting to these settings is place retry connection code around each database connection to prevent disconnection errors.
Thanks
Conteh
I am very new to cloud computing. I have never worked with MySQL outside of 1 instance. I am trying to understand how AWS RDS read replicas work with my application. For example say I have 1 master and 2 read replicas. I then from my application server send the query to AWS:
SELECT * FROM users where username = 'bob';
How does this work now? Do I need to include more into my code to choose a certain read replica or does AWS automatically reroute the request or how does it work?
Amazon does not currently provide any sort of load balancing or other traffic distribution across RDS servers. When you send queries to the primary RDS endpoint, 100% of that traffic goes to the primary RDS server. You would have to architect your system to open connections to each server and distribute the queries across the different database servers.
To do this in a way that is transparent to your application, you could setup an HAProxy instance between your application and the database that manages the traffic distribution.
Use of Elastic Load Balancers to distribute RDS traffic is an often requested feature, but Amazon has given no indication that they are working on this feature at this time.
I'm running Tomcat 7/MySQL 5.6 on Centos 6. It's time to separate the database to another server. What is the best approach to securing the connection between Tomcat and the backend MySQL server. It's Virtualized and I don't want to run the connection open over a shared network.
I'm thinking tunneling through ssh. SSL seems a lot of work. But what's the "recommended" approach?
You're right to be careful about sending traffic over an open network. The MySQL protocol by default is not encrypted at all, so if someone can capture packets on your network, then they can see all your data.
I prefer using either an ssh tunnel or a vpn connection. I just find it easier to configure.
My colleague Ernie Souhrada at Percona posted a couple of really good blog articles about the efficiency of using an ssh tunnel versus using MySQL client options to connect via SSL and bear the overhead of handshaking on every connection.
http://www.mysqlperformanceblog.com/2013/10/10/mysql-ssl-performance-overhead/
http://www.mysqlperformanceblog.com/2013/11/18/mysql-encryption-performance-revisited/
The performance impact of SSL handshake that Ernie reports won't be quite a much of an issue for a Tomcat environment, since you would typically have a connection pool, and therefore new connections would be made less frequently.
I'm using Apache HTTPClient (4.2.2) / Java7 to open many reusable connections to a tomcat 7 server (to simulate many users repeatedly hitting the service). Both client and server on Ubuntu 12 (but different machines). I made sure that systctl.conf and limits.conf allow this scenario.
This works well up to about 1500 simulated users / connections. The connections get reused as expected. Somewhere between 1500 and 1600 simulated users however, connections are no longer reused and closed/ re-opend all the time. Why might this be the case?
I don't think the problem is on the server side as when I start multiple simulation clients on different machines against the same server, the server has no problems reusing the connections as long as each client doesn't go beyond 1500 connections.
There can be various reasons as to why connections are not longer being re-used depending on the configuration of the connection manager OR server side configuration. The easiest way to find out the reason is to run HttpClient with context logging on as described in the 'context logging for connection management / request execution' example in the Logging Guide
You might need to increase the number of available workers,at least check if there are workers free when you run out of connections by going to server-status
Is there any limit on server on serving number of requests per second or number of requests serving simultaneously. [in configuration, not due to RAM, CPU etc hardware limitations]
Is there any limit on number of simultaneous requests on an instance of CouchbaseClient in Java servlet.
Is it best to create only one instance on CouchbaseClient and keep it open or to create multiple instances and destroy.
Is Moxi helpful with Couchbase 1.8.0 server/Couchbase java client 1.0.2
I need this info to setup application in production.
Thanks you
The memcached instance that runs behind Couchbase has a hard
connection limit of 10,000 connections. Couchbase in general
recommends that you should increase the number of nodes to address
the distrobution of traffic on that level.
The client itself does not have a hardcoded limit in regards to how
many connections it makes to a Couchbase cluster.
Couchbase generally recommends that you create a connection pool
from your application to the cluster and just re-use those
connections versus creation and destroying them over and over. On
heavier load applications, the creation and destruction of these
connections over and over can get very expensive from a resource
perspective.
Moxi is an integrated piece of Couchbase. However, it is generally
in place as an adapter layer for clients developers to specifically
use it or to give legacy access to applications designed to directly
access a memcached interface. If you are using the Couchbase client
driver you won't need to use the Moxi interface.