How to deal with lost client connections in Apache Qpid + MRG

Using the C++ client, it seems that if the connection to the server is lost for some reason (for example power failure, manual termination, or a network failure), the server does not detect that the client is gone; an open (or half-open) connection is kept. How can this be prevented? Is there some server-side heartbeat option?

The client connection to the broker has a connection option called heartbeat. Its value is an integer representing a time in seconds: heartbeat keepalive frames are sent every N seconds, and if two successive heartbeats are missed the connection is considered lost.
See the connection options documentation.
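For example, with the C++ Messaging API the option can be passed when the connection is created. A minimal sketch, where the broker address is a placeholder:
#include <qpid/messaging/Connection.h>

int main() {
    // Request heartbeat frames every 10 seconds; after two missed heartbeats
    // the connection is treated as lost.
    qpid::messaging::Connection connection("broker.example.com:5672", "{heartbeat: 10}");
    connection.open();
    // ... create sessions, senders and receivers as usual ...
    connection.close();
    return 0;
}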

Related

Regarding MySQL Aborted connection

I'm looking into an aborted connection:
2022-11-21T20:10:43.215738Z 640870 [Note] Aborted connection 640870 to db: '' user: '' host: '10.0.0.**' (Got timeout reading communication packets)
My understanding is that I need to figure out whether it is an interactive connection or not, and increase wait_timeout (or interactive_timeout) accordingly. If that has no effect, then I'll need to adjust net_read_timeout or net_write_timeout and see.
I'd like to ask:
1. Is there a meta table that I can query for the connection type (interactive or not)?
2. There are how-tos on the internet about adjusting wait_timeout (or interactive_timeout), and all of them have rebooting the database as the last step. Is that really required? Given that an immediate effect is not required, sessions are supposed to come and go, and new sessions will pick up the new value (once the global value is set), I suppose it would be fine if there were a way to track how many connections are still running with the old value?
3. Finally, can someone suggest a blog (or strategy) for handling aborted connections or adjusting the timeout values?
Thank you!
RDS MySQL version 5.7
There is only one client that sets the interactive flag by default: the mysql command-line client. All other client tools and connectors do not set this flag by default. You can choose to set the interactive flag, because it's a flag in the MySQL client API mysql_real_connect(). So you would know if you did it. In some connectors, you aren't calling the MySQL client API directly, and it isn't even an option to set this flag.
So for practical purposes, you can ignore the difference between wait_timeout and interactive_timeout, unless you're trying to tune the timeout of the mysql client in a shell window.
You should never need to restart the MySQL Server. The timeout means the server closed the session after there was no activity for wait_timeout seconds. The default value is 28800 seconds, which is 8 hours.
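For example (the values are illustrative), the global value can be changed without a restart, and you can watch how long the remaining old sessions have been idle:
SET GLOBAL wait_timeout = 600;  -- only affects sessions opened after this statement
SHOW GLOBAL VARIABLES LIKE 'wait_timeout';
-- existing sessions keep the session value they started with; TIME is seconds idle
SELECT id, user, host, command, time FROM information_schema.processlist WHERE command = 'Sleep';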
The proper way of handling this in application code is to catch exceptions, reconnect if necessary, and then retry whatever query was interrupted.
Some connectors have an auto-reconnect option. Auto-reconnect does not automatically retry the query.
In many applications, you are borrowing a connection from a connection pool, and the connection pool manager is supposed to test the connection before returning it to the caller. For example running SELECT 1; is a common test. The action of testing the connection causes a reconnect if the connection was not used for 8 hours.
If you don't use a connection pool (for example if your client program is PHP, which doesn't support connection pools as far as I know), then your client opens a new connection on request, so naturally it can't be idle for 8 hours if it's a new connection. Then the connection is closed as the request finishes, and presumably this request lasts less than 8 hours.
So this comes up only if your client opens a long-lived MySQL connection that is inactive for periods of 8 hours or more. In such cases, it's your responsibility to test the connection and reopen it if necessary before running a query.
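As a rough sketch of "test, reconnect if necessary, retry" in plain JDBC (the URL, credentials, table, and the two-attempt policy are all illustrative assumptions, not something prescribed above):
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class RetryingQuery {
    private Connection conn;

    private Connection getConnection() throws SQLException {
        // Reconnect if the connection was never opened or was dropped by the server
        // (for example after wait_timeout seconds of inactivity).
        if (conn == null || !conn.isValid(2)) {
            conn = DriverManager.getConnection(
                    "jdbc:mysql://db.example.com:3306/exampledb", "user", "password");
        }
        return conn;
    }

    public int countRows() throws SQLException {
        SQLException last = null;
        for (int attempt = 0; attempt < 2; attempt++) {  // at most one retry after a reconnect
            try (Statement stmt = getConnection().createStatement();
                 ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM example_table")) {
                rs.next();
                return rs.getInt(1);
            } catch (SQLException e) {
                last = e;
                conn = null;  // force a fresh connection on the next attempt
            }
        }
        throw last;
    }
}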

Will AWS Lambda automatically close MySQL connections?

If we don't close the MySQL connection at the end of the handler function in Lambda, will the MySQL connection close automatically when the Lambda execution environment dies, and will it re-connect at the next cold start?
The connections won't be closed immediately, but eventually they will be. By default the connection timeout on MySQL is 8 hours, and max_connections is capped at 66 in this setup.
show variables like "wait_timeout"; -- 28800
show variables like "max_connections"; -- 66
When you create a connection to the MySQL server, a thread is created on the MySQL server to serve that connection.
show status where variable_name = 'threads_connected';
select * from information_schema.processlist;
After a Lambda executes a request and sends a response, the execution environment is not removed immediately; the same one may be used to serve other requests. This is your warm/hot Lambda, and in this case an already-open MySQL connection is genuinely useful for your function execution, which is only possible if you did not close the connection in the previous invocation.
Eventually, when there are no more requests, the Lambda execution environment is shut down and its resources are returned to the pool of AWS compute resources. When the execution environment shuts down, the TCP connection from the Lambda to the MySQL server also terminates. The MySQL server can then remove the thread associated with that Lambda, which in essence shrinks the set of active connections on the server. This also takes a bit of time, so if you are getting a lot of concurrent requests and the maximum number of connections is already in use, new requests will start failing.
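A minimal sketch of that reuse pattern in a Java handler; the JDBC URL, credentials, and query are placeholders, and it assumes aws-lambda-java-core and MySQL Connector/J are on the classpath:
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.Map;

public class WarmConnectionHandler implements RequestHandler<Map<String, Object>, String> {
    // Created once per execution environment and reused across warm invocations.
    private static Connection connection;

    private static Connection getConnection() throws Exception {
        // Re-open only if the environment is fresh or the server dropped the idle connection.
        if (connection == null || !connection.isValid(2)) {
            connection = DriverManager.getConnection(
                    "jdbc:mysql://db.example.com:3306/exampledb", "user", "password");
        }
        return connection;
    }

    @Override
    public String handleRequest(Map<String, Object> event, Context context) {
        // The connection is deliberately not closed here, so a warm invocation can reuse it.
        try (Statement stmt = getConnection().createStatement();
             ResultSet rs = stmt.executeQuery("SELECT 1")) {
            rs.next();
            return "ok: " + rs.getInt(1);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}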
I did some tests to see how long it really takes to reclaim the connections, and here is a snapshot. The X axis is in minutes and the Y axis is on a scale of 0-70, with a gridline every 10 units.
It takes roughly 10-15 minutes to reclaim the connections, but again, it depends on the Lambda usage pattern as well.
So should you close the connection on every invocation? Well, it depends!
Take a look at Lambda Runtime extensions and see if you can use the shutdown hook to close the connection. If you can, it means that while the Lambda execution environment was serving multiple requests you used a cached connection, and just before the execution environment was taken away from you, you closed the connection.
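If you do cache the connection like this, one hedged option on a Java runtime is a JVM shutdown hook that closes it; whether the hook actually gets to run depends on the execution environment receiving a shutdown signal, which is what the runtime-extensions shutdown phase mentioned above is about:
// Sketch: register once per execution environment, e.g. in a static initializer
// next to the cached connection from the handler sketch above.
Runtime.getRuntime().addShutdownHook(new Thread(() -> {
    try {
        if (connection != null) {
            connection.close();  // best effort; the environment is going away anyway
        }
    } catch (Exception ignored) {
    }
}));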
RDS Proxy is also an alternative, as mentioned in another answer, but it is not free. Before you take the RDS Proxy route, consider another serverless solution like AWS Fargate; in that case you would probably use a connection pool just like any long-running server-side application.
No, they will not be closed automatically unless you are doing something with your MySQL client that implicitly closes the connection when it goes out of scope.
The connection will stay open until it times out. Many people have reported problems in the past with poorly written Lambdas creating tons of open sessions/connections to relational databases, because the connections were not closed properly and had to wait to be timed out.
One feature that came out a year or so ago is RDS Proxy, which acts as an intermediary between clients and the MySQL server and implements connection pooling. This solves the problem of Lambdas not being able to use connection pooling effectively, since the RDS Proxy service can do that for serverless clients.

MySQL stuck or network issue?

We have a MySQL server (5.5.47) hosted on a physical machine. It listens on an external internet interface (with restricted user access) and is used intensively from different places (we use different libraries to communicate with MySQL). But sometimes the whole MySQL server (or the network) gets stuck and stops accepting connections, and clients fail with ETIMEDOUT (connect) / timeout (recv); even a direct connection to MySQL from the server itself with the mysql CLI does not work (it hangs without any response, seemingly while trying to establish the connection).
Our first thought was that it was related to the TCP backlog, so the MySQL backlog was increased, but this did not help at all.
The issue is not reproducible on demand, so the last time it happened we sniffed the traffic, and this is what we got:
http://grab.by/STwq — screenshot
*.*.27.65 is the client
*.*.20.80 is the MySQL server
From the session we can see that the TCP connection is established, but the server keeps retransmitting SYN/ACK to the client (from the dump we can see that the server receives the ACK, so why retransmit?), whereas in the normal case MySQL should generate the init packet and send it to the client once the connection is established.
This is a screenshot of just one session, but all the other sessions look mostly the same: SYN -> SYN/ACK -> ACK, and then the server retransmits SYN/ACK up to the retry count.
After restarting MySQL everything returns to normal immediately, so we are not sure whether this is related to the network or to MySQL.
Any thoughts would be appreciated.
Thank you!

Configure GlassFish JDBC connection pool to handle Amazon RDS Multi-AZ failover

I have a Java EE application running in GlassFish on EC2, with a MySQL database on Amazon RDS.
I am trying to configure the JDBC connection pool in order to minimize downtime in case of a database failover.
My current configuration isn't working correctly during a Multi-AZ failover, as the standby database instance appears to be available in a couple of minutes (according to the AWS console) while my GlassFish instance remains stuck for a long time (about 15 minutes) before resuming work.
The connection pool is configured like this:
asadmin create-jdbc-connection-pool --restype javax.sql.ConnectionPoolDataSource \
--datasourceclassname com.mysql.jdbc.jdbc2.optional.MysqlConnectionPoolDataSource \
--isconnectvalidatereq=true --validateatmostonceperiod=60 --validationmethod=auto-commit \
--property user=$DBUSER:password=$DBPASS:databaseName=$DBNAME:serverName=$DBHOST:port=$DBPORT \
MyPool
If I use a Single-AZ db.m1.small instance and reboot the database from the console, GlassFish will invalidate the broken connections, throw some exceptions and then reconnect as soon as the database is available. In this setup I get less than 1 minute of downtime.
If I use a Multi-AZ db.m1.small instance and reboot with failover from the AWS console, I see no exception at all. The server halts completely, with all incoming requests timing out. After 15 minutes I finally get this:
Communication failure detected when attempting to perform read query outside of a transaction. Attempting to retry query. Error was: Exception [EclipseLink-4002] (Eclipse Persistence Services - 2.3.2.v20111125-r10461): org.eclipse.persistence.exceptions.DatabaseException
Internal Exception: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
The last packet successfully received from the server was 940,715 milliseconds ago. The last packet sent successfully to the server was 935,598 milliseconds ago.
It appears as if each HTTP thread gets blocked on an invalid connection without getting an exception and so there's no chance to perform connection validation.
Downtime in the Multi-AZ case is always between 15-16 minutes, so it looks like a timeout of some sort but I was unable to change it.
Things I have tried without success:
connection leak timeout/reclaim
statement leak timeout/reclaim
statement timeout
using a different validation method
using MysqlDataSource instead of MysqlConnectionPoolDataSource
How can I set a timeout on stuck queries so that connections in the pool are reused, validated and replaced?
Or how can I let GlassFish detect a database failover?
As I commented before, this happens because the sockets that are open and connected to the database don't realize the connection has been lost, so they stay connected until the OS socket timeout is triggered, which I have read is usually around 30 minutes.
To solve the issue you need to override the socket timeout in your JDBC connection string or in the JNDI connection configuration/properties, setting the socketTimeout parameter to a smaller value.
Keep in mind that any connection on which an operation takes longer than the defined value will be killed, even if it is being used (I haven't been able to confirm this; it is what I read).
The other two parameters I mention in my comment are connectTimeout and autoReconnect.
Here's my JDBC Connection String:
jdbc:(...)&connectTimeout=15000&socketTimeout=60000&autoReconnect=true
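If you prefer to keep these settings in the GlassFish pool definition rather than in the URL, a hedged variant of the pool-creation command from the question would add them as driver properties (this assumes Connector/J's pooled data source exposes connectTimeout and socketTimeout as settable properties):
asadmin create-jdbc-connection-pool --restype javax.sql.ConnectionPoolDataSource \
--datasourceclassname com.mysql.jdbc.jdbc2.optional.MysqlConnectionPoolDataSource \
--isconnectvalidatereq=true --validateatmostonceperiod=60 --validationmethod=auto-commit \
--property user=$DBUSER:password=$DBPASS:databaseName=$DBNAME:serverName=$DBHOST:port=$DBPORT:connectTimeout=15000:socketTimeout=60000 \
MyPool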
I also disabled Java's DNS cache by doing
java.security.Security.setProperty("networkaddress.cache.ttl" , "0");
java.security.Security.setProperty("networkaddress.cache.negative.ttl" , "0");
I do this because Java doesn't honor DNS TTLs by default, and when the failover takes place the DNS name stays the same but the IP address changes.
Since you are using an application server, the parameters that disable the DNS cache must be passed to the JVM when starting GlassFish (as -Dnet... options) rather than set in the application itself.
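For GlassFish, one way to do that (an untested sketch; verify the exact property names for your JVM) is to add the JDK's DNS-cache system properties as JVM options, where sun.net.inetaddr.ttl is the system-property counterpart consulted when the networkaddress.cache.ttl security property is not set:
asadmin create-jvm-options "-Dsun.net.inetaddr.ttl=0"
asadmin create-jvm-options "-Dsun.net.inetaddr.negative.ttl=0"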

How can I configure HAProxy to work with server sent events?

I'm trying to add an endpoint to an existing application that sends Server-Sent Events. There may often be no event for ~5 minutes. I'm hoping to configure that endpoint so that HAProxy does not cut off my server even when the response has not completed within ~1 minute, while all other endpoints still time out if the server fails to respond.
Is there an easy way to support server sent events in HAProxy?
Here is my suggestion for HAProxy and SSE: HAProxy has plenty of timeout options, and two of them are of particular interest here.
timeout tunnel specifies the inactivity timeout for tunneled connections (used for WebSockets, SSE, or CONNECT) and takes precedence over both the server and client timeouts once the tunnel is established.
timeout client-fin handles the situation where a client loses its connection (network loss, disappearing before the ACK that ends the session, etc.).
In your haproxy.cfg, this is what you should do. First, in your defaults section:
# Set the max time to wait for a connection attempt to a server to succeed
timeout connect 30s
# Set the max allowed time to wait for a complete HTTP request
timeout client 50s
# Set the maximum inactivity time on the server side
timeout server 50s
Nothing special so far.
Now, still in the defaults section:
# handle the situation where a client suddenly disappears from the net
timeout client-fin 30s
Next, jump to your backend definition and add this:
timeout tunnel 10h
I suggest a high value; 10 hours seems fine.
You should also avoid the default option http-keep-alive, which SSE does not use; use option http-server-close instead.
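Putting the pieces together, a minimal haproxy.cfg sketch along these lines could look as follows (frontend/backend names, port, and server address are placeholders; in a real setup you would route only the SSE endpoint to this backend):
defaults
    mode http
    option http-server-close
    timeout connect 30s
    timeout client 50s
    timeout server 50s
    timeout client-fin 30s

frontend fe_http
    bind :80
    default_backend sse_backend

backend sse_backend
    # the tunnel timeout covers idle SSE streams and overrides client/server timeouts here
    timeout tunnel 10h
    server app1 127.0.0.1:8000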