When to call close on CloseableHttpClient instances - apache-httpclient-4.x

Following the documentation at https://hc.apache.org/httpcomponents-client-ga/tutorial/html/connmgmt.html
2.3.4. Connection manager shutdown
When an HttpClient instance is no longer needed and is about to go out of scope it is important to shut down its connection manager to ensure that all connections kept alive by the manager get closed and system resources allocated by those connections are released.
CloseableHttpClient httpClient = <...>
httpClient.close();
My confusion is about conflating the instance going out of scope with needing to shut down the connection manager.
In my use case I am using the pooling connection manager, so I want to keep the connections open, but of course return them to the pool.
In my client code I have
ResponseHandler<Integer> rh = new ResponseHandler<Integer>()
.... elided ....
CloseableHttpClient httpclient = this.httpClientBuilder.build();
Integer statusCode = httpclient.execute(httpPost, rh);
My understanding from the docs is that the usage of ResponseHandler takes care of returning the lease:
When using a ResponseHandler, HttpClient will automatically take care of ensuring release of the connection back to the connection manager

Your understanding is correct. One needs to shut down the connection manager and the underlying connection pool only once it is no longer needed, in order to ensure immediate shutdown and deallocation of the persistent connections kept alive in the pool.
ResponseHandler ensures that the connection leased from the pool gets released back to the manager no matter the outcome of request execution, but it is up to the manager either to close the connection or keep it alive for re-use by subsequent requests.
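As a minimal sketch (the pool sizes, URL, and shutdown structure here are assumptions for illustration, not taken from the question), the client can be built once around a pooling connection manager, executed many times with a ResponseHandler, and closed only when the application itself shuts down:

import java.io.IOException;
import org.apache.http.client.ResponseHandler;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;

public class PooledClientSketch {
    public static void main(String[] args) throws IOException {
        // One pooling manager and one client for the whole application lifetime.
        PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
        cm.setMaxTotal(50);              // illustrative limits, not from the question
        cm.setDefaultMaxPerRoute(10);

        CloseableHttpClient httpclient = HttpClients.custom()
                .setConnectionManager(cm)
                .build();
        try {
            // The ResponseHandler form guarantees the leased connection is
            // released back to the manager, whatever the outcome.
            ResponseHandler<Integer> rh = response -> response.getStatusLine().getStatusCode();
            HttpPost httpPost = new HttpPost("http://example.com/endpoint"); // hypothetical URL
            Integer statusCode = httpclient.execute(httpPost, rh);
            System.out.println("Status: " + statusCode);
        } finally {
            // Only at application shutdown: closing the client shuts down the
            // connection manager and releases the pooled persistent connections.
            httpclient.close();
        }
    }
}

Calling close() in the middle of the application's life would defeat the purpose of the pool; the point of the quoted documentation is only that the manager should be shut down once the client is permanently done.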

Related

EF Core MySql connections not closing

We have developed a project in .NET Core and Entity Framework Core using the MySql NuGet package.
The context is added to dependency injection using the following line:
services.AddDbContext<ReadWriteContext>(options => options.UseMySQL(Configuration["Machine:ReadWriteConnectionString"]));
Then in a controller, this is injected as such:
public class SystemController : Controller
{
private readonly ReadWriteContext _dataContext;
public SystemController(ReadWriteContext dataContext)
{
_dataContext = dataContext;
}
...
}
And used as such:
var hasServices = await _dataContext.Services.AnyAsync();
In the logs we see the opening and closing log lines:
Opening connection to database 'config_service' on server '10.211.55.5'.
Closing connection to database 'config_service' on server '10.211.55.5'.
However, when we look at the MySQL server and run "show full processlist", the connections are still showing as being in the sleep state and never close. When you stop the .NET process, the connections then close and disappear from the MySQL process list.
How do I get the connections to close when the request is finished? The AddDbContext registration should be scoped to the current request, but it does not appear to properly close the connections.
Any help would be great.
I'm not familiar with MySQL. That said, if you were in a SQL Server context you would be facing the "connection pool": https://learn.microsoft.com/en-us/dotnet/framework/data/adonet/sql-server-connection-pooling.
That is, your application keeps a set of active connections to speed up the process of reaching the data server. Thus, even if you tell the pool "I don't need this connection anymore", it does not believe you and keeps the connection open... just in case.
For how long? Well, that depends on logic beyond the scope of the application. As the saying goes: it is the connection pool's realm, just let it do its job.
So the answer to your question is: you can't explicitly close the connection to the server. The connection pool decides when to close it.

Rails: Does every http request creates a new connection pool?

I am reading this article https://polycrystal.org/posts/2012-05-25-active-record-connection-pool-fairness.html and it states that every HTTP request creates a new connection pool. Is that true?
If it is true, then what if an HTTP request spawns two threads that need to access the database? Will those two threads create two separate connection pools again, or will they use the connection pool created for the HTTP request?
Thanks,
Not every request, but every worker process. The whole concept of connection pooling is to eliminate the need to establish a DB connection on every request.

Couchbase/Membase: Moxi proxy downstream timeout SERVER_ERROR

I have a live Couchbase cluster on two Amazon EC2 instances (version 1.8.0) and about 5 application servers each running PHP with moxi clients on them. Once in a while, Moxi will return a SERVER_ERROR when attempting to access data. This happens about once every few minutes on average. The cluster processes about 500 operations per second.
After inspecting the moxi logs (with -vvv enabled), I notice the following at around the time I get a SERVER_ERROR:
2013-07-16 03:07:22: (cproxy.c.2680) downstream_timeout
2013-07-16 03:07:22: (cproxy.c.1911) 56: could not forward upstream to downstream
2013-07-16 03:07:22: (cproxy.c.2004) 56: upstream_error: SERVER_ERROR proxy downstream timeout^M
I tried increasing the downstream timeout in the moxi configs from 5000 to 25000, but that doesn't help at all. The errors still happen just as frequently.
Can someone suggest any ideas for me to discover the cause of the problem? Or if there's some likely culprit?
SERVER_ERROR proxy downstream timeout
In this error response, moxi reached a timeout while waiting for a
downstream server to respond to a request. That is, moxi did not see
any explicit errors such as a connection going down, but the response
is just taking too long. The downstream connection will also be closed
by moxi rather than putting the downstream connection back into a
connection pool. The default downstream_timeout configuration is 5000
(milliseconds).
Pretty straightforward error, but it can be caused by a few possible things.
Try getting the output of "stats proxy" from moxi:
echo stats proxy | nc HOST 11211
Obviously you have already figured out that you are concerned with these settings:
STAT 11211:default:pstd_stats:tot_downstream_timeout xxxx
STAT 11211:default:pstd_stats:tot_wait_queue_timeout nnnnn
Your downstream_timeout, as you've said, should appear as 5000,
but also check out:
STAT 11211:default:pstd_stats:tot_downstream_conn_queue_timeout 0
The following URL:
http://www.couchbase.com/docs/moxi-manual-1.8/moxi-dataflow.html
gives pretty much a perfect walkthrough of the way moxi operates.
To understand some of the configurable command-line flags in moxi
(concurrency, downstream_max, downstream_conn_max, downstream_timeout,
wait_queue_timeout, etc), it can be helpful to follow a request
through moxi...
The normal flow of data for moxi is as follows:
A client connects
A client creates a connection (an upstream conn) to moxi.
moxi's -c command-line parameter ultimately controls the limits on
the maximum number of connections.
In this -c parameter, moxi inherits the same behavior as memcached,
and will stop accept()'ing client connections until
existing connections are closed. When the count of existing
connections drops below the -c defined level, moxi will accept() more
client connections.
The client makes a request, which goes on the wait queue
Next, the client makes a request, such as a simple single-key
command (like set, add, append, or a single-key get).
At this point, moxi places the upstream conn onto the tail of a wait
queue. moxi's wait_queue_timeout parameter controls how long an
upstream conn should stay on the wait queue before moxi times it out
and responds to the client with a SERVER_ERROR response.
The concurrency parameter
Next, there's a configurable max limit to how many upstream conn
requests moxi will process concurrently off the head of the wait
queue. This configurable limit is called concurrency. (This was
formerly known, perhaps confusingly, as downstream_max. For
backwards compatibility, concurrency and downstream_max configuration
flags are treated as synonyms.)
The concurrency configuration is per-thread and per-bucket. That
is, the moxi process-level concurrency is actually concurrency X
num-worker-threads X num-buckets.
The default concurrency configuration value is 1024. This means moxi
will concurrently process 1024 upstream connection requests from
the head of the wait queue. (There are more queues in moxi, however,
before moxi actually forwards a request. This is discussed in later
sections.)
Taking the concurrency value of 1024 as an example, if you have 4
worker threads (the default, controlled by moxi's -t parameter) and 1
bucket (what most folks start out with, such as the "default" bucket),
you'll have a limit of 1024 x 4 x 1 or 4096 concurrently processed
client requests in that single moxi process.
The rationale behind the increase of concurrency to 1024 in moxi's
configuration (it used to be much lower) lies in the evolving design
of moxi. Originally, the wait queue was moxi's only internal
queue. As more later-stage queues were added during moxi's history,
we found that getting requests off the wait queue sooner and onto the
later stage queues was a better approach. We'll discuss these
later-stage queues below.
Next, let's discuss how client requests are matched to downstream connections.
Key hashing
The concurrently processed client requests (taken from the head
of the wait queue) now need to be matched up with downstream connections
to the Couchbase server. If the client's request comes with a key
(like a SET, DELETE, ADD, INCR, single-key GET), the request's key is
hashed to find the right downstream server "host:port:bucket" info.
For example, something like — "memcache1:11211:default". If the
client's request was a broadcast-style command (like FLUSH_ALL, or a
multi-key GET), moxi knows the downstream connections that it needs to
acquire.
The downstream conn pool
Next, there's a lookup using those host:port:bucket identifiers into
a downstream conn pool in order to acquire or reserve the
appropriate downstream conns. There's a downstream conn pool per
thread. Each downstream conn pool is just a hashmap keyed by
host:port:bucket with hash values of a linked-list of available
downstream conns. The max length of any downstream conn linked list is
controlled by moxi's downstream_conn_max configuration parameter.
The downstream_conn_max parameter
By default the downstream_conn_max value is 4. A value of 0 means no limit.
So, if you've set downstream_conn_max of 4, have 4 worker threads,
and have 1 bucket, you should see moxi create a maximum of 4
X 4 X 1 or 16 connections to any Couchbase server.
Connecting to a downstream server
If there isn't a downstream conn available, and the
downstream_conn_max wasn't reached, moxi creates a downstream conn by
doing a connect() and SASL auth as needed.
The connect_timeout and auth_timeout parameters
The connect() and SASL auth have their own configurable timeout
parameters, called connect_timeout and auth_timeout, and these
are in milliseconds. The default value for connect_timeout is 400
milliseconds, and the auth_timeout default is 100 milliseconds.
The downstream conn queue
If downstream_conn_max is reached, then the request must wait until a
downstream conn becomes available; the request therefore is
placed on a per-thread, per-host:port:bucket queue, which is called a
downstream conn queue. As downstream conns are released back into the
downstream conn pool, they will be assigned to any requests that are
waiting on the downstream conn queue.
The downstream_conn_queue_timeout parameter
There is another configurable timeout, downstream_conn_queue_timeout,
that defines how long a request should
stay on the downstream conn queue in milliseconds before timing out.
By default, the downstream_conn_queue_timeout is 200 milliseconds. A
value of 0 indicates no timeout.
A downstream connection is reserved
Finally, at this point, downstream conns are matched up for the
client's request. If you've configured moxi to track timing histogram
statistics, moxi will now get the official start time of the request.
moxi now starts asynchronously sending request message bytes to the
downstream conn and asynchronously awaits responses.
To turn on timing histogram statistics, use the "time_stats=1"
configuration flag. By default, time_stats is 0 or off.
The downstream_timeout parameter
Next, if you've configured a downstream_timeout, moxi starts a timer
for the request where moxi can limit the time it will spend
processing a request at this point. If the timer fires, moxi will
return a "SERVER_ERROR proxy downstream timeout" back to the client.
The downstream_timeout default value is 5000 milliseconds. If moxi sees
this time elapse, it will close any downstream connections that
were assigned to the request. Due to this simple behavior of closing
downstream connections on timeout, having a very short
downstream_timeout is not recommended, as a longer value helps avoid repetitive
connection creation, timeout, closing and reconnecting. On an
overloaded cluster, you may want to increase downstream_timeout so
that moxi does not constantly attempt to time out downstream
connections on an already overloaded cluster, or create even more
new connections when servers are already trying to process requests on
old, closed connections. If you see your servers greatly spiking, you
should consider making this adjustment.
Responses are received
When all responses are received from the downstream servers for a request (or the
downstream conn had an error), moxi asynchronously
sends those responses to the client's upstream conn. If you've
configured moxi to track timing histogram statistics, moxi now tracks
the official end time of the request. The downstream conn is now
released back to the per-thread downstream conn pool, and another
waiting client request (if any) is taken off the downstream conn queue
and assigned to use that downstream conn.
Backoff/Blacklisting
In the connection step described above, there's a case where a connect() attempt might fail. Moxi
can be configured to count up the number of connect() failures for a
downstream server, and will also track the time of the last failing
connect() attempt.
With the connect() failure counting, moxi can be configured to
blacklist a server if too many connect() failures are seen, which is
defined by the connect_max_errors configuration parameter. When more
than connect_max_errors number of connect() failures are seen, moxi
can be configured to temporarily stop making connect() attempts to
that server (or backoff) for a configured amount of time. The backoff
time is defined via the connect_retry_interval configuration, in
milliseconds.
The default for connect_max_errors is 5 and the default connect_retry_interval
is 30000 milliseconds, that is, 30 seconds.
If you use the connect_max_errors parameter, it should be set greater than
the downstream_conn_max configuration parameter.

DB connection pool expired, then what?

After my connection pool expires, when I try to open more connections in parallel than the maximum number of connections allowed in the pool, I start getting timeout exceptions when trying to obtain a connection from the pool.
That is expected; however, the pool seems to be left in that state, and everything else I do from that moment on gets the same timeout exceptions, as if each of the connections in the pool had been left busy and can't be reused. I would expect the connections to be freed up over time, and further connections then to be allowed, but this is not happening.
I'm using Play 1.2.5 with a JDBC driver for MySQL, and from the logs I reckon the pool is C3P0.
I'm not explicitly closing the connections, since I believe that is the right thing to do when using a pool, but I'm not 100% sure.
I don't know if this could be a connection leak in one of the framework/libraries I'm using, or if I'm doing something wrong or not doing something I should.
When I catch one of the timeout exceptions, what is the right thing to do?
You must explicitly close connections when you use a connection pool. A connection pool holds a collection of physical connections to a database. When you request a connection from the pool, it marks that physical connection as in-use and hands you a logical handle to it. This logical handle is essentially a wrapper or proxy that forwards most method calls (either directly or with some modification) to the physical connection.
When you call close() on this logical handle, the connection pool gets a signal that the physical connection is available again (that is: can be returned to the pool), the logical handle will from then on behave as a closed connection, but the actual physical connection is still open. If you don't call close(), the connection pool never gets this signal so the physical connection will remain in-use and won't be available for re-use.
Some advanced pool configurations allow the pool to detect this situation (e.g. using a timeout, or maybe with finalizers, etc.) and reclaim the connection, but you should not depend on that.
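For example (a sketch of c3p0's documented leak-detection knobs, not something Play 1.2.5 sets up for you; the JDBC URL is a placeholder), the pool can be told to reclaim connections that are never returned and to log where they were checked out:

import com.mchange.v2.c3p0.ComboPooledDataSource;

public class LeakDetectingPoolSketch {
    public static void main(String[] args) {
        ComboPooledDataSource cpds = new ComboPooledDataSource();
        cpds.setJdbcUrl("jdbc:mysql://localhost:3306/mydb"); // placeholder URL
        // Reclaim connections that stay checked out for more than 30 seconds...
        cpds.setUnreturnedConnectionTimeout(30);
        // ...and log the stack trace of the code that checked them out and leaked them.
        cpds.setDebugUnreturnedConnectionStackTraces(true);
    }
}

This is a debugging aid for finding leaks, not a substitute for closing connections properly.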
TL;DR: Always call close() on your connection when you are done with it, whether it comes from a connection pool, a non-pooled DataSource or DriverManager.
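A minimal sketch of that rule (the DataSource and query are placeholders, not from the question): try-with-resources calls close() for you, and for a pooled connection that close() simply hands the physical connection back to the pool:

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import javax.sql.DataSource;

public class PooledQuerySketch {
    // 'ds' would be whatever pool-backed DataSource your framework exposes.
    static int countRows(DataSource ds) throws SQLException {
        try (Connection con = ds.getConnection();
             Statement st = con.createStatement();
             ResultSet rs = st.executeQuery("SELECT COUNT(*) FROM some_table")) { // hypothetical table
            rs.next();
            return rs.getInt(1);
        }
        // Leaving the try block closes rs, st and con in reverse order; close() on
        // the logical connection returns the physical connection to the pool.
    }
}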

MySQL - Persistent connection vs connection pooling

In order to avoid the overhead of establishing a new connection each time a query needs to be fired against MySQL, there are two options available:
Persistent connections, whereby when a new connection is requested, a check is made to see if an 'identical' connection is already open and, if so, it is used.
Connection pooling, whereby the client maintains a pool of connections, so that each thread that needs a connection checks one out from the pool and returns it to the pool when done.
So, if I have a multi-threaded server application expected to handle thousands of requests per second, and each thread needs to fire a query against the database, then what is a better option?
From my understanding, with persistent connections, all the threads in my application will try to use the same persistent connection to the database because they are all using identical connections. So it is one connection shared across multiple application threads, and as a result the requests will soon block on the database side.
If I use a connection pooling mechanism, I will have all application threads share a pool of connections, so there is less possibility of a blocking request. However, with connection pooling, should an application thread wait to acquire a connection from the pool, or should it send a request on the connections in the pool anyway in a round-robin manner, and let the queuing, if any, happen on the database?
Having persistent connections does not imply that all threads use the same connection. It just "says" that you keep the connection open (as opposed to opening a connection each time you need one). Opening a connection is an expensive operation, so in general you try to avoid opening connections more often than necessary.
This is the reason why multithreaded applications often use connection pools. The pool takes care of opening and closing connections and every thread that needs a connection requests one from the pool. It is important to take care that the thread returns the connection as soon as possible to the pool, so that another thread can use it.
If your application has only a few long running threads that need connections you can also open a connection for each thread and keep this open.
Using just one connection (as you described it) is equal to a connection pool with a maximum size of one. This will sooner or later become your bottleneck, as all threads will have to wait for the connection. It could be an option for serializing the database operations (performing them in a certain order), although there are better ways to ensure serialization.
Update: The newer X Protocol supports asynchronous connections, and newer drivers like Node's can utilize this.
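To make the check-out/return idea concrete, here is a deliberately simplified fixed-size pool (a sketch only, not a production pool or any particular driver's API); a thread that finds no idle connection blocks in borrow() until another thread returns one, which is the waiting behaviour addressed in the next answer:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class SimpleConnectionPool {
    private final BlockingQueue<Connection> idle;

    public SimpleConnectionPool(String url, String user, String pass, int size) throws SQLException {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            // Physical connections are opened once, up front, and then reused.
            idle.add(DriverManager.getConnection(url, user, pass));
        }
    }

    // A thread that needs a connection checks one out; if none is idle it waits here.
    public Connection borrow() throws InterruptedException {
        return idle.take();
    }

    // Returning the connection promptly makes it available to the next waiting thread.
    public void giveBack(Connection con) {
        idle.add(con);
    }
}

In practice you would use an established pool (c3p0, HikariCP, or the one built into your framework) rather than rolling your own; the sketch only shows why each thread must return its connection as soon as it is done.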
Regarding your question about should the application server wait for a connection, the answer is yes.
MySQL connections are blocking. When you issue a request to the MySQL server over a connection, the connection will wait, idle, until a response is received from the server.
There is no way to send two requests on the same connection and see which returns first. You can only send one request at a time.
So, generally, a single slot in a connection pool consists of one client-side connection (in your case, the application server is the client) and one server-side connection (to the database).
Your application should wait for an available connection from the pool, allowing the pool to grow when needed and to shrink back to your default number of connections when it is less busy.