We have a Java web application that uses several connections to a variety of database servers. Somewhat recently, a couple of database servers (managed by a different department) were reconfigured to sit behind a firewall that cuts any connection that has been idle for 45 seconds. I was told there should be a keepAlive property we could set on our connections, but after some research that doesn't seem to be the case. Instead, I have come to the conclusion that I must use properties like:
validationQuery
testOnBorrow
timeBetweenEvictionRunsMillis
etc. (Source: Commons DBCP)
The short of it is that I have to ensure the application retains its connections to these servers regardless of the 45 second timeout on the other side. I've run through a few configurations with little success (for example: when the connection drops it is re-established sooner, but it still drops rather quickly).
Here is the original code for the BasicDataSource object:
private BasicDataSource createDefaultDataSource() {
BasicDataSource ds = new BasicDataSource();
ds.setInitialSize(3);
ds.setMinIdle(1);
ds.setMaxIdle(5);
ds.setMaxActive(100);
ds.setMaxWait(3 * 1000);
ds.setTimeBetweenEvictionRunsMillis(30 * 1000);
ds.setMinEvictableIdleTimeMillis(60 * 1000);
ds.setLogAbandoned(true);
ds.setRemoveAbandoned(true);
ds.setRemoveAbandonedTimeout(60);
return ds;
}
And here is what I assumed would be the solution, yet it doesn't seem to work the way I expect:
private BasicDataSource createDefaultDataSource() {
BasicDataSource ds = new BasicDataSource();
ds.setInitialSize(3);
ds.setMinIdle(1);
ds.setMaxIdle(5);
ds.setMaxActive(100);
ds.setMaxWait(3 * 1000);
ds.setTimeBetweenEvictionRunsMillis(15 * 1000); // was 30 seconds
ds.setMinEvictableIdleTimeMillis(30 * 1000); // was 60 seconds
ds.setLogAbandoned(true);
ds.setRemoveAbandoned(true);
ds.setRemoveAbandonedTimeout(60);
ds.setValidationQuery("select 1");
ds.setTestOnBorrow(true);
ds.setTestOnReturn(true);
ds.setTestWhileIdle(true);
return ds;
}
Maybe my timing is off, perhaps there are other properties I have to set in conjunction with these, or maybe I should be going about this a different way.
What properties should be set or changed to compensate for a 45 second idle drop policy?
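For what it's worth, here is the direction I'm currently leaning, as a sketch only: run the evictor well inside the 45 second window and have it validate every idle connection so the firewall always sees some traffic. numTestsPerEvictionRun is the one property not in my configurations above, and the exact timings are guesses on my part:
private BasicDataSource createKeepAliveDataSource() {
    BasicDataSource ds = new BasicDataSource();
    ds.setInitialSize(3);
    ds.setMinIdle(1);
    ds.setMaxIdle(5);
    ds.setMaxActive(100);
    ds.setMaxWait(3 * 1000);
    ds.setValidationQuery("select 1");
    ds.setTestOnBorrow(true);
    ds.setTestWhileIdle(true);                        // evictor runs the validation query on idle connections
    ds.setTimeBetweenEvictionRunsMillis(15 * 1000);   // well inside the 45 second firewall window
    ds.setNumTestsPerEvictionRun(-1);                 // -1 = examine every idle connection on each run
    ds.setMinEvictableIdleTimeMillis(5 * 60 * 1000);  // don't evict healthy idle connections too eagerly
    ds.setLogAbandoned(true);
    ds.setRemoveAbandoned(true);
    ds.setRemoveAbandonedTimeout(60);
    return ds;
}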
Related
My colleagues and I are working on a Spring Boot project; we work simultaneously and we all connect to the same MySQL database.
The trouble is that after a while some of us can no longer connect; the error is "Too many connections". I've spoken to the DB administrator and he raised the maximum number of connections to 500 (it was something like 150 before), but we still get the error. How can I fix this? The only configuration properties we use are these:
spring.datasource.url= ...
spring.datasource.username= ...
spring.datasource.password= ...
spring.datasource.hikari.maximum-pool-size=25
Maybe JPA opens a new connection every time it runs a query but doesn't close it? I don't know; I'm clueless here.
EDIT:
I've been asked to show some code regarding the interactions with the database so here it is:
@Autowired
private EmployeeDAO employeeDAO;
@Autowired
private LogDAO logDAO;
@Autowired
private ContrattoLavoroDAO contrattoLavoroDAO;
#Override
public void deleteEmployeeById(Long idEmployee, String username) {
contrattoLavoroDAO.deleteContrattoByEmpId(idEmployee);
employeeDAO.deleteById(idEmployee);
LogEntity log = new LogEntity();
LocalDateTime date = LocalDateTime.now();
log.setData(date);
log.setUser(username);
log.setCrud("delete");
log.setTabella("Employees");
log.setDescrizione("l'utente " + username + " ha rimosso il dipendente con matricola " + idEmployee);
logDAO.save(log);
}
and here's the model for a DAO:
public interface ContrattoLavoroDAO extends JpaRepository<ContrattoLavoroEntity, Long> {
@Modifying
@Query(value = "DELETE contratto_lavoro, employee_contratto FROM contratto_lavoro" + " INNER JOIN"
+ " employee_contratto ON employee_contratto.id_contratto = contratto_lavoro.id_contratto" + " WHERE"
+ " contratto_lavoro.id_contratto = ?1", nativeQuery = true)
public void deleteContrattoByEmpId(Long empId);
}
You set the Hikari maximum pool size to 25, which is pretty high for a development environment.
But that shouldn't be a problem, because it's only the maximum, right?
Well, HikariCP's documentation says:
🔢minimumIdle
This property controls the minimum number of idle connections that HikariCP tries to maintain in the pool. If the idle connections dip below this value and total connections in the pool are less than maximumPoolSize, HikariCP will make a best effort to add additional connections quickly and efficiently. However, for maximum performance and responsiveness to spike demands, we recommend not setting this value and instead allowing HikariCP to act as a fixed size connection pool. Default: same as maximumPoolSize
🔢maximumPoolSize
This property controls the maximum size that the pool is allowed to reach, including both idle and in-use connections. Basically this value will determine the maximum number of actual connections to the database backend. A reasonable value for this is best determined by your execution environment. When the pool reaches this size, and no idle connections are available, calls to getConnection() will block for up to connectionTimeout milliseconds before timing out. Please read about pool sizing. Default: 10
If I'm reading this correctly, it means each developer on your team opens 25 connections when they start the application, and another 25 when they start an integration test that spins up a new application context. If the tests in the test suite use different configurations, each configuration will hold its own set of 25 connections. If, say, six developers each have the application plus one test context running, that is already 6 × 50 = 300 connections, well past the original limit of 150.
The quick solution is to reduce the maximumPoolSize significantly for your development environment. I'd recommend 2. This is enough to allow for one normal transaction and one background process.
It will throw exceptions if the application requires more connections, which is probably a good thing since in most cases it shouldn't.
On top of that, you might want to set minimumIdle to 1 or 0, so an application that is only still running because nobody has shut it down yet doesn't consume shared resources.
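A minimal sketch of what that could look like in a development profile, assuming the standard spring.datasource.hikari.* property keys (the exact values are up to you):
spring.datasource.hikari.maximum-pool-size=2
spring.datasource.hikari.minimum-idle=0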
In the mid term you probably want to get rid of a central database for development.
With the availability of Testcontainers there really isn't a reason anymore not to have a local database for each developer.
As a nice side effect it will ensure that all the schema update scripts work properly.
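A local database per developer can be as small as the sketch below; it assumes JUnit 5, the org.testcontainers:mysql module, and Spring Boot's @DynamicPropertySource, and the class name and image tag are just examples:
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.test.context.DynamicPropertyRegistry;
import org.springframework.test.context.DynamicPropertySource;
import org.testcontainers.containers.MySQLContainer;
import org.testcontainers.junit.jupiter.Container;
import org.testcontainers.junit.jupiter.Testcontainers;

@SpringBootTest
@Testcontainers
class EmployeeRepositoryIT {

    // One throwaway MySQL container per test run instead of a shared dev database
    @Container
    static MySQLContainer<?> mysql = new MySQLContainer<>("mysql:8.0");

    // Point the Spring datasource at the container instead of the shared server
    @DynamicPropertySource
    static void datasourceProperties(DynamicPropertyRegistry registry) {
        registry.add("spring.datasource.url", mysql::getJdbcUrl);
        registry.add("spring.datasource.username", mysql::getUsername);
        registry.add("spring.datasource.password", mysql::getPassword);
    }

    // ... repository tests go here
}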
I have a Lambda that uses RDS. I wanted to improve it and use Lambda connection caching. I have found several articles and implemented it on my side, to the best of my knowledge. But now I am not sure it is the right way to go.
I have a Lambda (running Node 8) which has several files pulled in with require. I will start from the main function and work down to the MySQL initializer, following the exact path. I will keep it super simple, showing only the flow of the code that runs MySQL:
Main Lambda:
const jobLoader = require('./Helpers/JobLoader');
exports.handler = async (event, context) => {
const emarsysPayload = event.Records[0];
let validationSchema;
const body = jobLoader.loadJob('JobName');
...
return;
...//
Job Code:
const MySQLQueryBuilder = require('../Helpers/MySqlQueryBuilder');
exports.runJob = async (params) => {
const data = await MySQLQueryBuilder.getBasicUserData(userId);
MySQLBuilder:
const mySqlConnector = require('../Storage/MySqlConnector');
class MySqlQueryBuilder {
async getBasicUserData (id) {
let query = `
SELECT * from sometable WHERE id= ${id}
`;
return mySqlConnector.runQuery(query);
}
}
And finally, the connector itself:
const mySqlConnector = require('promise-mysql');
const pool = mySqlConnector.createPool({
host: process.env.MY_SQL_HOST,
user: process.env.MY_SQL_USER,
password: process.env.MY_SQL_PASSWORD,
database: process.env.MY_SQL_DATABASE,
port: 3306
});
exports.runQuery = async query => {
    const con = await pool.getConnection();
    try {
        // await the query so the connection isn't released before it finishes
        return await con.query(query);
    } finally {
        con.release();
    }
};
I know that measuring performance will show the actual results, but today is Friday and I will not be able to run this on Lambda until late next week... And really, it would be an awesome start to the weekend to know whether I am heading in the right direction... or not.
Thanks for the input.
The first thing would be to understand how require works in Node.js. I recommend you go through this article if you're interested in knowing more about it.
Now, once you have required your connection, you have it for good and it won't be required again. This matches what you're looking for, as you don't want to overwhelm your database by creating a new connection every time.
But, there is a problem...
Lambda Cold Starts
Whenever you invoke a Lambda function for the first time, it will spin up a container with your function inside it and keep it alive for approximately 5 mins. It's very likely (although not guaranteed) that you will hit the same container every time as long as you are making 1 request at a time. But what happens if you have 2 requests at the same time? Then another container will be spun up in parallel with the previous, already warmed up container. You have just created another connection on your database and now you have 2 containers. Now, guess what happens if you have 3 concurrent requests? Yes! One more container, which equals one more DB connection.
As long as there are new requests to your Lambda functions, by default, they will scale out to meet demand (you can configure it in the console to limit the execution to as many concurrent executions as you want - respecting your Account limits)
You cannot safely make sure you have a fixed amount of connections to your Database by simply requiring your code upon a Function's invocation. The good thing is that this is not your fault. This is just how Lambda functions behave.
...one other approach is to cache the data you want in a real caching system, like ElastiCache, for example. You could then have one Lambda function triggered by a CloudWatch Event that runs at a certain frequency. This function would query your DB and store the results in your external cache. This way you make sure your DB connection is only opened by one Lambda at a time, because it respects the CloudWatch Event, which runs only once per trigger.
EDIT: after the OP sent a link in the comment section, I have decided to add a bit more info to clarify what the mentioned article is saying.
From the article:
"Simple. You ARE able to store variables outside the scope of our
handler function. This means that you are able to create your DB
connection pool outside of the handler function, which can then be
shared with each future invocation of that function. This allows for
pooling to occur."
And this is exactly what you're doing. And this works! But the problem is if you have N connections (Lambda Requests) at the same time. If you don't set any limits, by default, up to 1000 Lambda functions can be spun up concurrently. Now, if you then make another 1000 requests simultaneously in the next 5 minutes, it's very likely you won't be opening any new connections, because they have already been opened on previous invocations and the containers are still alive.
Adding to the answer above by Thales Minussi, but for a Python Lambda: I am using PyMySQL, and to create a connection pool I added the connection code above the handler in a Lambda that fetches data. Once I did this, I stopped seeing any new data that was added to the DB after an instance of the Lambda had executed. I found bugs reported here and here that are related to this issue.
The solution that worked for me was to add a conn.commit() after the SELECT query execution in the Lambda.
According to the PyMySQL documentation, conn.commit() is supposed to commit any changes, but a SELECT does not make changes to the DB, so I am not sure exactly why this works.
I'm using Grails 2.5.3 and Tomcat 7, and 8 hours after the app is deployed our logs start blowing up with "connection already closed" issues. A reasonable assumption is that MySQL is killing the connections after its default wait_timeout of 8 hours.
Going by the docs, my pool seems to be configured correctly to keep the idle connections open, but that doesn't seem to be the case.
What might be wrong with my connection pool settings?
dataSource {
pooled = true
url = 'jdbc:mysql://******.**********.us-east-1.rds.amazonaws.com/*****'
driverClassName = 'com.mysql.jdbc.Driver'
username = '********'
password = '******************'
dialect = org.hibernate.dialect.MySQL5InnoDBDialect
loggingSql = false
properties {
jmxEnabled = true
initialSize = 5
timeBetweenEvictionRunsMillis = 10000
minEvictableIdleTimeMillis = 60000
validationQuery = "SELECT 1"
initSQL = "SELECT 1"
validationQueryTimeout = 10
testOnBorrow = true
testWhileIdle = true
testOnReturn = true
testOnConnect = true
removeAbandonedTimeout = 300
maxActive=100
maxIdle=10
minIdle=1
maxWait=30000
maxAge=900000
removeAbandoned="true"
jdbcInterceptors="org.apache.tomcat.jdbc.pool.interceptor.StatementCache;"
}
}
hibernate {
cache.use_second_level_cache=true
cache.use_query_cache=true
cache.region.factory_class = 'org.hibernate.cache.ehcache.EhCacheRegionFactory'
}
Also, I have confirmed that the dataSource at runtime is an instance of org.apache.tomcat.jdbc.pool.DataSource.
UPDATE 1 (NOT FIXED)
We think we may have found the problem! We were storing a domain class in the HTTP session, and after reading a bit about how the session factory works, we believe that the stored object was somehow bound to a connection. When a user accessed the domain class from the HTTP session after 8 hours, we think Hibernate still held a reference to the dead connection. It's in production now and we are monitoring.
UPDATE 2 (FIXED)
We finally found the problem. Removing removeAbandoned and removeAbandonedTimeout resolved all our issues. We're not entirely sure why, as our assumption was that these two properties exist to prevent exactly what was occurring. Our only thought is that our database was managing the abandoned connections more aggressively. It's been over 4 weeks with no issues.
I've had this issue with a completely different setup. It's really not fun to deal with. Basically it boils down to this:
You have some connection somewhere in your application just sitting around while Java is doing some sort of "other" processing. Here's a really basic way to reproduce:
Connection con = dataSource.getConnection(); // borrow a connection from whatever pool you use
Thread.sleep(330 * 1000);                    // sit idle past removeAbandonedTimeout (300 seconds)
con.close();
The code does nothing with the database connection above, so Tomcat detects it as abandoned at 300 seconds and returns it to the pool.
Your application has enough traffic that the same physical connection (both opened and abandoned in the code above) is handed out somewhere else, in a different part of the code.
Either the original code hits the 330 second mark and closes the connection, or the new code finishes and closes it. At this point two places are using the same connection and one of them has closed it.
The other piece of code using that connection then tries to use or close it, but the connection is already closed, producing the error above.
Suggested route to fix:
Use the setting logAbandoned="true" to find where the connections are being abandoned from.
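In the dataSource properties block above, that is just one extra line (a sketch; everything else stays as it is):
properties {
    // ... existing settings ...
    logAbandoned = true   // logs a stack trace of the code that borrowed each abandoned connection
}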
Our URL usually looks like this:
url = "jdbc:mysql://localhost/db?useUnicode=yes&characterEncoding=UTF-8&autoReconnect=true"
Also check your encoding parameters if you don't want to run into such an issue.
(See update 2 on the question.)
Removing removeAbandoned and removeAbandonedTimeout resolved all our problems. Someone may want to provide a more detailed answer on why it did, because we are not entirely sure.
I am writing a .NET 4.0 console app that:
Opens up a connection
Uses a DataReader to cursor through a list of keys
For each key read, calls a web service
Stores the result of the web service in the database
I then spawn multiple threads of this process in order to improve the maximum number of records that I can process per second.
When I up the process beyond about 30 or so threads, I get the following error:
System.InvalidOperationException: Timeout expired. The timeout period elapsed prior to obtaining a connection from the pool. This may have occurred because all pooled connections were in use and max pool size was reached.
Is there a server-side or client-side option I can tweak to allow me to obtain more connections from the connection pool?
I am calling a SQL Server 2008 R2 database.
Thanks.
This sounds like a design issue. What's your total record count from the database? Iterating through the reader will be really fast. Even if you have hundreds of thousands of rows, going through that reader will be quick. Here's a different approach you could take:
Iterate through the reader and store the data in a list of objects. Then iterate through your list of objects at a number of your choice (e.g. two at a time, three at a time, etc) and spawn that number of threads to make calls to your web service in parallel.
This way you won't be opening multiple connections to the database, and you're dealing with what is likely the true bottleneck (the HTTP call to the web service) in parallel.
Here's an example:
List<SomeObject> yourObjects = new List<SomeObject>();
if (yourReader.HasRows) {
    while (yourReader.Read()) {
        SomeObject foo = new SomeObject();
        foo.SomeProperty = yourReader.GetInt32(0);
        yourObjects.Add(foo);
    }
}
for (int i = 0; i < yourObjects.Count; i = i + 2) {
    // Kick off your web service calls in parallel. You will likely want to do something with the result.
    var tasks = new List<Task> {
        Task.Factory.StartNew(() => yourService.MethodName(yourObjects[i].SomeProperty))
    };
    if (i + 1 < yourObjects.Count) { // guard against an odd number of items
        tasks.Add(Task.Factory.StartNew(() => yourService.MethodName(yourObjects[i + 1].SomeProperty)));
    }
    Task.WaitAll(tasks.ToArray());
}
// Now do your database INSERT.
Opening up a new connection for all your requests is incredibly inefficient. If you simply want to use the same connection to keep requesting things, that is more than possible. You can open a connection, and then run as many SqlCommand commands through that one connection. Simply keep the ONE connection around, and dispose of it after all your threading is done.
Try restarting IIS; you should then be able to connect.
I'm using the latest version of XAMPP on 64-bit Windows 7.
The problem is that, when I use mysql_connect with "bool $new_link" set to true like so:
mysql_connect('localhost', 'root', 'my_password', TRUE);
script execution time increases dramatically (about 0.5 seconds per connection, and when I have 4 different objects using different connections, it takes ~2 seconds).
Is setting "bool $new_link" to true generally a bad idea, or could it just be some problem with my software configuration?
Thank you.
//Edit:
I'm using a new link because I have multiple objects that use MySQL connections (new objects can be created inside already existing objects, and so on). In the end, when it comes to unsetting objects (I have mysql_close inside my __destruct() functions), I figured the only way to correctly clean up loose ends would be for each object to have its own connection variable.
I just formatted my PC, so the configuration should be the defaults.
Don't open a new connection unless you have a need for one (for instance, accessing multiple databases simultaneously).
Also you don't have to explicitly call mysql_close. I usually just include a function to quickly retrieve an existing db link (or a new one if none exists yet).
function &getDBConn() {
global $DBConn;
if(!$DBConn) $DBConn = mysql_connect(...);
return $DBConn;
}
// now you can just call $dbconn = getDBConn(); whenever you need it
Use "127.0.0.1" instead of "localhost". It improved my performance with mysql_connect from ~1 sek to a couple of milliseconds.
Article about php/mysql_connect and IPv6 on windows: http://www.bluetopazgames.com/uncategorized/php-mysql_connect-is-slow-1-second-for-localhost-windows-7/