Timeout: Pool empty. Unable to fetch a connection in 10 seconds, none available[size:50; busy:50; idle:0; lastwait:10000]
Whenever we connect to the web app over a socket, it throws this error and the socket gets disconnected.
Even after doing the following things, the problem still persists:
Scaled up the AWS EC2 instance from micro to large
In /etc/my.cnf:
wait_timeout = 28800
interactive_timeout = 28800
Added the following configuration under both the development and production environments:
maxActive = 50
minIdle = 5
maxIdle = 25
maxWait = 10000
maxAge = 10 * 60000
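For reference, here is roughly how those properties would be set programmatically, assuming the pool is Tomcat's JDBC pool (which is what the error message format suggests); the URL and driver below are placeholders, not our real ones:

import org.apache.tomcat.jdbc.pool.DataSource;
import org.apache.tomcat.jdbc.pool.PoolProperties;

public class PoolConfig {
    public static DataSource build() {
        PoolProperties p = new PoolProperties();
        p.setUrl("jdbc:mysql://db.example.com:3306/app"); // placeholder endpoint
        p.setDriverClassName("com.mysql.jdbc.Driver");
        p.setMaxActive(50);      // size:50 in the error above
        p.setMinIdle(5);
        p.setMaxIdle(25);
        p.setMaxWait(10000);     // 10 s -> "Unable to fetch a connection in 10 seconds"
        p.setMaxAge(10 * 60000); // recycle connections older than 10 minutes
        DataSource ds = new DataSource();
        ds.setPoolProperties(p);
        return ds;
    }
}

(busy:50 in the error means every one of the 50 connections was already checked out when the 10 second wait expired.)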
Has anyone faced this problem?
Related
I have an application using the Tomcat 8.5 connection pool, Java 8, and a Multi-AZ AWS RDS MySQL database. In recent years we had a couple of database issues that led to failover. When a failover occurred, the pool was always able to detect that the connection was closed (No operations allowed after connection closed) and reconnect correctly a minute later, once the backup node was up.
A few days ago we had a failover that didn't follow this rule. Because of a hardware database issue, the database was unavailable and a failover took place. Then, when the backup node came up a couple of minutes later, we could connect correctly to the database from our desktop MySQL client.
Even several minutes after the failover took place and connectivity to the database was recovered, the application logs showed hundreds of exceptions like:
com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: No operations allowed after connection closed
...
Caused by: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
...
The last packet successfully received from the server was 20,017 milliseconds ago. The last packet sent successfully to the server was 20,016 milliseconds ago
...
Caused by: java.net.SocketTimeoutException: Read timed out
...
The application couldn't reconnect until we restarted the Tomcat servers.
Our pool is configured this way:
initialSize = 5
maxActive = 16
minIdle = 5
maxIdle = 8
maxWait = 10000
maxAge = 600000
timeBetweenEvictionRunsMillis = 5000
minEvictableIdleTimeMillis = 60000
validationQuery = "SELECT 1"
validationQueryTimeout = 3
validationInterval = 15000
testOnBorrow = true
testWhileIdle = true
testOnReturn = false
jdbcInterceptors = "ConnectionState;StatementCache(max=200)"
defaultTransactionIsolation = java.sql.Connection.TRANSACTION_READ_COMMITTED
And the JDBC connection URL has these parameters:
autoreconnect=true&socketTimeout=20000
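For reference, here is a rough sketch (not our actual code) of how the whole configuration maps onto Tomcat's programmatic PoolProperties; the RDS endpoint below is a placeholder:

import org.apache.tomcat.jdbc.pool.DataSource;
import org.apache.tomcat.jdbc.pool.PoolProperties;

public class RdsPoolConfig {
    public static DataSource build() {
        PoolProperties p = new PoolProperties();
        // Placeholder Multi-AZ RDS endpoint; note socketTimeout=20000 lines up with the
        // ~20,017 ms figures in the exception above
        p.setUrl("jdbc:mysql://mydb.abc123.us-east-1.rds.amazonaws.com:3306/app"
                + "?autoreconnect=true&socketTimeout=20000");
        p.setDriverClassName("com.mysql.jdbc.Driver");
        p.setInitialSize(5);
        p.setMaxActive(16);
        p.setMinIdle(5);
        p.setMaxIdle(8);
        p.setMaxWait(10000);
        p.setMaxAge(600000);                      // recycle connections older than 10 minutes
        p.setTimeBetweenEvictionRunsMillis(5000);
        p.setMinEvictableIdleTimeMillis(60000);
        p.setValidationQuery("SELECT 1");
        p.setValidationQueryTimeout(3);
        p.setValidationInterval(15000);           // validate a given connection at most every 15 s
        p.setTestOnBorrow(true);
        p.setTestWhileIdle(true);
        p.setTestOnReturn(false);
        p.setJdbcInterceptors("ConnectionState;StatementCache(max=200)");
        p.setDefaultTransactionIsolation(java.sql.Connection.TRANSACTION_READ_COMMITTED);
        DataSource ds = new DataSource();
        ds.setPoolProperties(p);
        return ds;
    }
}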
As I understand it, the validationQuery should have failed and the connection been discarded, so a new, valid connection should have been created. Also, according to maxAge, after 10 minutes all connections should have been discarded and new ones created.
The pool didn't recover even after 20 minutes. As said, we had to restart the Tomcat servers.
Is there any explanation for why the pool has always recovered correctly from a failover before, but couldn't in this case?
Try adding ENABLE=BROKEN to your connection string.
For example:
jdbc:oracle:thin:@(DESCRIPTION=(ENABLE=BROKEN)(ADDRESS=(PROTOCOL=tcp)(PORT=)(HOST=))(CONNECT_DATA=(SID=)))
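With purely hypothetical values filled in (1521 is just the default listener port; the host and SID are made up), a complete descriptor would look like:

jdbc:oracle:thin:@(DESCRIPTION=(ENABLE=BROKEN)(ADDRESS=(PROTOCOL=tcp)(PORT=1521)(HOST=db.example.com))(CONNECT_DATA=(SID=ORCL)))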
I ended up adding an AWS RDS Proxy, which resolved this issue.
I have been provoking DB failovers for an hour and everything worked fine, with outages of less than 20 seconds, and this without modifying my application code, only pointing it to the new proxy endpoint.
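To illustrate, "pointing to the new proxy endpoint" is just a change to the host portion of the JDBC URL; all names below are hypothetical:

# before: direct RDS instance endpoint
jdbc:mysql://mydb.abc123.us-east-1.rds.amazonaws.com:3306/app?socketTimeout=20000

# after: RDS Proxy endpoint, everything else unchanged
jdbc:mysql://mydb-proxy.proxy-abc123.us-east-1.rds.amazonaws.com:3306/app?socketTimeout=20000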
I deployed the latest Airflow on a CentOS 7.5 VM, updated sql_alchemy_conn and result_backend to Postgres databases on a PostgreSQL instance, and set my executor to CeleryExecutor. Without enabling any DAG at all, and even with no Airflow scheduler started, I see roughly one connection established every 5 seconds and then disposed of, just to run a SELECT 1 and a SELECT CAST('test plain returns' AS VARCHAR(60)) AS anon_1.
The number of short-lived connections increases drastically when one starts the scheduler and turns on DAGs. Does anyone know the reason for this? Is this a heartbeat check or a task status check? With sql_alchemy_pool_enabled = True in airflow.cfg, shouldn't these connections be longer-lived? Is there a log I can look at to pinpoint the source of these connections with sub-second lifetimes?
Config values used, for reference:
executor = CeleryExecutor
sql_alchemy_conn = postgres://..../db1
sql_alchemy_pool_enabled = True
sql_alchemy_pool_size = 5
sql_alchemy_max_overflow = 0
parallelism = 32
dag_concurrency = 16
max_active_runs_per_dag = 16
worker_concurrency = 16
broker_url = redis://...
result_backend = db+postgresql+psycopg2://.../db2
job_heartbeat_sec = 5
scheduler_heartbeat_sec = 5
Set AIRFLOW__CORE__SQL_ALCHEMY_POOL_PRE_PING to False.
That option checks the connection at the start of each connection pool checkout. Typically, this is a simple statement like SELECT 1.
More information here: https://docs.sqlalchemy.org/en/13/core/pooling.html#disconnect-handling-pessimistic
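For reference, a sketch of where that setting lives (the exact section can vary by Airflow version; the AIRFLOW__CORE__ prefix above implies the [core] section):

# as an environment variable
export AIRFLOW__CORE__SQL_ALCHEMY_POOL_PRE_PING=False

# or the equivalent entry in airflow.cfg
[core]
sql_alchemy_pool_pre_ping = False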
I'm running a WordPress site. If we send multiple requests using JMeter, or when search engine bots hit the site, the MySQL server goes down.
I have changed the configuration below on my server:
key_buffer = 25M
max_allowed_packet = 1M
thread_stack = 128K
table_cache = 25
innodb_buffer_pool_size = 512M (raised from 64M)
max_connections = 200
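Consolidated in my.cnf form (assuming they were applied under [mysqld]; key_buffer and table_cache are the older names for key_buffer_size and table_open_cache), the changes look roughly like this:

[mysqld]
key_buffer_size         = 25M
max_allowed_packet      = 1M
thread_stack            = 128K
table_open_cache        = 25
innodb_buffer_pool_size = 512M
max_connections         = 200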
How do I fix this issue?
It happens only under high traffic, around 200 to 500 requests at a time.
I'm using MySQL as the back end for my website. While running the site, at some point I got the error Too many connections.
I'm using a class for handling MySQL transactions. In that class I have a sub for closing the connection, which is as follows:
Public Sub CloseConn()
    ConnDB.Close()
    ConnDB.Dispose()
End Sub
After getting the error I have to restart MySQL to continue operations. In MySQL Administrator I can see that all the connections are in the Sleep state. How can I kill the sleeping connections programmatically?
mysqld will time out DB connections based on two (2) server options:
interactive_timeout
wait_timeout
Both are 28800 seconds (8 hours) by default.
You can set these options in /etc/my.cnf
If your connections are persistent (opened via mysql_pconnect) you could lower these numbers to something reasonable like 600 (10 min) or even 60 (1 min). Or, if your app works just fine, you can leave the default. This is up to you.
You must set these as follows in my.cnf (takes effect after mysql restart):
[mysqld]
interactive_timeout=180
wait_timeout=180
If you do not want to restart mysql, then run these two commands:
SET GLOBAL interactive_timeout = 180;
SET GLOBAL wait_timeout = 180;
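Note that lowering the timeouts only affects sessions created after the change; connections already sitting in the Sleep state can also be killed by hand (the id below is hypothetical):

-- find connections that have been idle (Sleep) for a while
SELECT id, user, host, time
FROM information_schema.PROCESSLIST
WHERE command = 'Sleep';

-- terminate one of them by its id
KILL 12345;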
Update: Rackspace got back to me and told me that their MySQL cloud uses a wait_timeout value of 120 seconds.
I've been banging my head against this, so I thought I'd ask you guys. Any ideas you might have would be appreciated.
util.JDBCExceptionReporter - SQL Error: 0, SQLState: 08S01
util.JDBCExceptionReporter - Communications link failure
The last packet successfully received from the server was 264,736 milliseconds
ago. The last packet sent successfully to the server was 32 milliseconds ago.
The error occurs intermittently, often just minutes after the server in question comes up. The DB is nowhere near capacity in terms of load or connections, and I've tried dozens of different configuration combinations.
The fact that this connection last received a packet from the server 264 seconds ago is revealing because it's well above the 120 second timeout put in place by Rackspace. I've also confirmed from the DB end that my 30 second limit is being respected.
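For reference, the server-side values can be confirmed directly in MySQL:

SHOW GLOBAL VARIABLES LIKE '%timeout%';  -- wait_timeout / interactive_timeout
SHOW PROCESSLIST;                        -- the Time column shows seconds in each thread's current state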
Things I've tried
Set DBCP to expire connections aggressively after 30 seconds, and verified that the MySQL instance reflects this behaviour via SELECT * FROM PROCESSLIST
Switched the connection string from hostname to IP address, so this isn't a DNS issue
Tried various combinations of connection settings
Tried declaring the connection pool settings in DataSources.groovy or resources.groovy; I'm fairly sure the settings are being respected, as the DB reflects them: anything over 30 seconds is quickly killed
Any ideas?
Right now my best guess is that something in Grails is holding onto a reference to a stale connection for long enough that the 120 second limit becomes a problem... it's a desperate theory and realistically I doubt it's true, but that leaves me short of ideas.
The latest config I've tried:
dataSource {
    pooled = true
    driverClassName = "com.mysql.jdbc.Driver"
    dialect = 'org.hibernate.dialect.MySQL5InnoDBDialect'
    properties {
        maxActive = 50
        maxIdle = 20
        minIdle = 5
        maxWait = 10000
        initialSize = 5
        minEvictableIdleTimeMillis = 1000 * 30
        timeBetweenEvictionRunsMillis = 1000 * 5
        numTestsPerEvictionRun = 50
        testOnBorrow = true
        testWhileIdle = true
        testOnReturn = true
        validationQuery = "SELECT 1"
    }
}
Stack trace:
2012-10-25 12:36:12,375 [http-bio-8080-exec-2] WARN util.JDBCExceptionReporter - SQL Error: 0, SQLState: 08S01
2012-10-25 12:36:12,375 [http-bio-8080-exec-2] ERROR util.JDBCExceptionReporter - Communications link failure
The last packet successfully received from the server was 264,736 milliseconds ago. The last packet sent successfully to the server was 32 milliseconds ago.
2012-10-25 12:36:12,433 [http-bio-8080-exec-2] ERROR errors.GrailsExceptionResolver - EOFException occurred when processing request: [GET] /cart
Can not read response from server. Expected to read 4 bytes, read 0 bytes before connection was unexpectedly lost.. Stacktrace follows:
org.hibernate.exception.JDBCConnectionException: could not execute query
at grails.orm.HibernateCriteriaBuilder.invokeMethod(HibernateCriteriaBuilder.java:1531)
at SsoRealm.hasRole(SsoRealm.groovy:30)
at org.apache.shiro.grails.RealmWrapper.hasRole(RealmWrapper.groovy:193)
at org.apache.shiro.authz.ModularRealmAuthorizer.hasRole(ModularRealmAuthorizer.java:374)
at org.apache.shiro.mgt.AuthorizingSecurityManager.hasRole(AuthorizingSecurityManager.java:153)
at org.apache.shiro.subject.support.DelegatingSubject.hasRole(DelegatingSubject.java:225)
at ShiroSecurityFilters$_closure1_closure4_closure6.doCall(ShiroSecurityFilters.groovy:98)
at grails.plugin.cache.web.filter.PageFragmentCachingFilter.doFilter(PageFragmentCachingFilter.java:195)
at grails.plugin.cache.web.filter.AbstractFilter.doFilter(AbstractFilter.java:63)
at org.apache.shiro.grails.SavedRequestFilter.doFilter(SavedRequestFilter.java:55)
at org.apache.shiro.web.servlet.AbstractShiroFilter.executeChain(AbstractShiroFilter.java:449)
at org.apache.shiro.web.servlet.AbstractShiroFilter$1.call(AbstractShiroFilter.java:365)
at org.apache.shiro.subject.support.SubjectCallable.doCall(SubjectCallable.java:90)
at org.apache.shiro.subject.support.SubjectCallable.call(SubjectCallable.java:83)
at org.apache.shiro.subject.support.DelegatingSubject.execute(DelegatingSubject.java:380)
at org.apache.shiro.web.servlet.AbstractShiroFilter.doFilterInternal(AbstractShiroFilter.java:362)
at org.apache.shiro.web.servlet.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:125)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
The last packet successfully received from the server was 264,736 milliseconds ago. The last packet sent successfully to the server was 32 milliseconds ago.
at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1116)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3589)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3478)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4019)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2490)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2651)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2683)
at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2144)
at com.mysql.jdbc.PreparedStatement.executeQuery(PreparedStatement.java:2310)
at org.apache.commons.dbcp.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:96)
at org.apache.commons.dbcp.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:96)
... 20 more
Caused by: java.io.EOFException: Can not read response from server. Expected to read 4 bytes, read 0 bytes before connection was unexpectedly lost.
at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:3039)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3489)
... 29 more
Figured this out. The grails-elasticsearch plugin was holding onto stale connections. This was a known issue in that plugin and a fix came in via this pull request: