Do the Nagle Algorithm and Delayed ACKs affect TCP Connection Setup? - language-agnostic

Do the client's SYN and the server's SYN+ACK get delayed by Nagle? Will the client's ACK of the server's SYN get delayed?
Will connect() return after RTT + SPT, or will it take RTT + SPT + 2x the Nagle delay?
Or more generally, how do the Nagle Algorithm and Delayed ACKs affect TCP connection setup?

No, Nagle does not affect the SYN or SYN+ACK. Neither does delayed ACK (logically enough, you want the SYN+ACK to come back as quickly as possible, and there is no congestion state to cause a holdoff yet anyway).
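As a rough illustration, here is a minimal sketch (the host and port are hypothetical stand-ins) showing that connect() completes after the three-way handshake, roughly one RTT, and that Nagle only comes into play once data is written, where it can be disabled per socket with TCP_NODELAY:
import socket, time

HOST, PORT = "db.example.com", 3306   # hypothetical server to connect to

start = time.monotonic()
sock = socket.create_connection((HOST, PORT), timeout=5)   # returns once the 3-way handshake completes
print("connect() took %.1f ms (about one RTT)" % ((time.monotonic() - start) * 1000))

# Nagle affects subsequent data segments, not the handshake; it can be
# disabled per socket if small writes must not be coalesced.
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
sock.close()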

Related

MySQL stuck or network issue?

We have a MySQL server (5.5.47) hosted on a physical machine. It listens on an external internet interface (with restricted user access), and the MySQL server is used intensively from different places (we use different libraries to communicate with MySQL). But sometimes the whole MySQL server (or the network) gets stuck and stops accepting connections, and clients fail with ETIMEDOUT (connect) / timeout (recv); even a direct connection from the server itself to MySQL with the mysql CLI does not work (it hangs without any response and seems to be stuck trying to establish the connection).
Our first thought was that it was related to the TCP backlog, so the MySQL backlog was increased, but this did not help at all.
The issue is not reproducible, so the last time it happened we sniffed the traffic, and this is what we got:
http://grab.by/STwq (screenshot)
*.*.27.65 is the client
*.*.20.80 is the MySQL server
From the session we can assume that the TCP connection was established, but the server retransmits the SYN/ACK to the client (from the dump we see that the server receives the ACK, so why retransmit?), whereas in the normal case MySQL should generate its init packet and send it to the client once the connection is established.
This is only a screenshot of one session, but all the other sessions are mostly the same: SYN -> SYN/ACK -> ACK -> and then the server retransmits the SYN/ACK up to retries_count.
After restarting MySQL everything returns to normal immediately. So we are not sure whether it is related to the network or to MySQL.
Any thoughts would be appreciated.
Thank you!
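When a listener keeps retransmitting the SYN/ACK even though the client's ACK shows up in the capture, one common cause on Linux is a full accept queue: while the queue is full the kernel ignores the final ACK and keeps retransmitting the SYN/ACK until it gives up. A rough sketch, assuming a Linux host and the usual /proc/net/netstat layout, to watch the relevant cumulative counters while the problem is happening:
# Sketch: print the listen-queue overflow counters on a Linux host.
# ListenOverflows/ListenDrops are cumulative since boot; check whether they
# keep growing while connections are hanging.
with open("/proc/net/netstat") as f:
    lines = f.read().splitlines()

for header, values in zip(lines[::2], lines[1::2]):
    if header.startswith("TcpExt:"):
        stats = dict(zip(header.split()[1:], map(int, values.split()[1:])))
        print("ListenOverflows:", stats.get("ListenOverflows", 0))
        print("ListenDrops:", stats.get("ListenDrops", 0))
If these counters grow, the backlog that matters is the one the accepting application actually passes to listen(), not only the kernel-wide limits.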

How can I configure HAProxy to work with server sent events?

I'm trying to add an endpoint to an existing application that sends Server-Sent Events. There may often be no event for ~5 minutes. I'm hoping to configure that endpoint so it is not cut off even when the response has not completed within ~1 minute, while all other endpoints still time out if the server fails to respond.
Is there an easy way to support server sent events in HAProxy?
Here is my suggestion for HAProxy and SSE: you have plenty of custom timeout options in HAProxy, and there are two options that are interesting for you.
timeout tunnel specifies the timeout for tunnel connections, which are used for WebSockets, SSE or CONNECT. It bypasses both the server and client timeouts.
timeout client handles the situation where a client loses its connection (network loss, disappearing before the ACK ending the session, etc.).
In your haproxy.cfg, this is what you should do. First, in your defaults section:
# Set the max time to wait for a connection attempt to a server to succeed
timeout connect 30s
# Set the max allowed time to wait for a complete HTTP request
timeout client 50s
# Set the maximum inactivity time on the server side
timeout server 50s
Nothing special so far.
Now, still in the defaults section:
# handle the situation where a client suddenly disappears from the net
timeout client-fin 30s
Next, jump to your backend definition and add this:
timeout tunnel 10h
I suggest a high value, 10 hours seems ok.
You should also avoid using the default http-keep-alive mode, since SSE does not use it; use option http-server-close instead.
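If you want to sanity-check the configuration, here is a minimal sketch of a client that holds an idle SSE stream open and reports when it gets cut off; the host, port and /events path are hypothetical placeholders for your HAProxy frontend and endpoint:
# Sketch: verify that an idle SSE stream survives past "timeout client"/
# "timeout server" once "timeout tunnel" applies to it.
import http.client, time

conn = http.client.HTTPConnection("haproxy.example.com", 80, timeout=None)
conn.request("GET", "/events", headers={"Accept": "text/event-stream"})
resp = conn.getresponse()
print("status:", resp.status)

start = time.monotonic()
try:
    while True:
        chunk = resp.read1(1024)   # blocks until data arrives or the stream ends
        if not chunk:
            break
        print("got %d bytes after %.0fs" % (len(chunk), time.monotonic() - start))
except OSError:
    pass
print("stream closed after %.0fs" % (time.monotonic() - start))
With the settings above, the stream should only be closed by the server, by timeout client-fin on a dead client, or by timeout tunnel after 10 hours of inactivity.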

Couchbase/Membase: Moxi proxy downstream timeout SERVER_ERROR

I have a live Couchbase cluster on two Amazon EC2 instances (version 1.8.0) and about 5 application servers each running PHP with moxi clients on them. Once in a while, Moxi will return a SERVER_ERROR when attempting to access data. This happens about once every few minutes on average. The cluster processes about 500 operations per second.
After inspecting the moxi logs (with -vvv enabled), I notice the following at around the time I get a SERVER_ERROR:
2013-07-16 03:07:22: (cproxy.c.2680) downstream_timeout
2013-07-16 03:07:22: (cproxy.c.1911) 56: could not forward upstream to downstream
2013-07-16 03:07:22: (cproxy.c.2004) 56: upstream_error: SERVER_ERROR proxy downstream timeout^M
I tried increasing the downstream timeout in the moxi configs from 5000 to 25000, but that doesn't help at all. The errors still happen just as frequently.
Can someone suggest any ideas for me to discover the cause of the problem? Or if there's some likely culprit?
SERVER_ERROR proxy downstream timeout
In this error response, moxi reached a timeout while waiting for a
downstream server to respond to a request. That is, moxi did not see
any explicit errors such as a connection going down, but the response
is just taking too long. The downstream connection will also be closed
by moxi rather than being put back into a
connection pool. The default downstream_timeout configuration is 5000
(milliseconds).
This is a pretty straightforward error, but it can be caused by a few possible things.
Try getting the output of "stats proxy" from moxi:
echo stats proxy | nc HOST 11211
Obviously you have already figured out that you are concerned with these settings:
STAT 11211:default:pstd_stats:tot_downstream_timeout xxxx
STAT 11211:default:pstd_stats:tot_wait_queue_timeout nnnnn
Your downstream_timeout, as you've said, should appear as 5000,
but also check out:
STAT 11211:default:pstd_stats:tot_downstream_conn_queue_timeout 0
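If scripting the check is more convenient than nc, here is a rough Python equivalent of the command above (HOST is a placeholder for your moxi host) that prints just the timeout-related counters:
# Sketch: send "stats proxy" to moxi over the memcached text protocol and
# filter for counters with "timeout" in the name.
import socket

def moxi_stats(host, port=11211):
    with socket.create_connection((host, port), timeout=5) as s:
        s.sendall(b"stats proxy\r\n")
        data = b""
        while not data.endswith(b"END\r\n"):
            chunk = s.recv(4096)
            if not chunk:
                break
            data += chunk
    return data.decode("ascii", "replace").splitlines()

for line in moxi_stats("HOST"):
    if "timeout" in line:   # tot_downstream_timeout, tot_wait_queue_timeout, ...
        print(line)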
This is from the following URL, which is pretty much a perfect walkthrough of the way moxi operates:
http://www.couchbase.com/docs/moxi-manual-1.8/moxi-dataflow.html
To understand some of the configurable command-line flags in moxi
(concurrency, downstream_max, downstream_conn_max, downstream_timeout,
wait_queue_timeout, etc), it can be helpful to follow a request
through moxi...
The normal flow of data for moxi is as follows:
A client connects
A client creates a connection (an upstream conn) to moxi.
moxi's -c command-line parameter ultimately controls the limits on
the maximum number of connections.
In this -c parameter, moxi inherits the same behavior as memcached,
and will stop accept()'ing client connections until
existing connections are closed. When the count of existing
connections drops below the -c defined level, moxi will accept() more
client connections.
The client makes a request, which goes on the wait queue
Next, the client makes a request, such as a simple single-key
command (like set, add, append, or a single-key get).
At this point, moxi places the upstream conn onto the tail of a wait
queue. moxi's wait_queue_timeout parameter controls how long an
upstream conn should stay on the wait queue before moxi times it out
and responds to the client with a SERVER_ERROR response.
The concurrency parameter
Next, there's a configurable max limit to how many upstream conn
requests moxi will process concurrently off the head of the wait
queue. This configurable limit is called concurrency. (This was
formerly known, perhaps confusingly, as downstream_max. For
backwards compatibility, concurrency and downstream_max configuration
flags are treated as synonyms.)
The concurrency configuration is per-thread and per-bucket. That
is, the moxi process-level concurrency is actually concurrency X
num-worker-threads X num-buckets.
The default concurrency configuration value is 1024. This means moxi
will concurrently process 1024 upstream connection requests from
the head of the wait queue. (There are more queues in moxi, however,
before moxi actually forwards a request. This is discussed in later
sections.)
Taking the concurrency value of 1024 as an example, if you have 4
worker threads (the default, controlled by moxi's -t parameter) and 1
bucket (what most folks start out with, such as the "default" bucket),
you'll have a limit of 1024 x 4 x 1 or 4096 concurrently processed
client requests in that single moxi process.
The rationale behind the concurrency increase to 1024 for moxi's
configuration (it used to be much lower) is due to the evolving design
of moxi. Originally, the wait queue was moxi's only internal
queue. As more later-stage queues were added during moxi's history,
we found that getting requests off the wait queue sooner and onto the
later stage queues was a better approach. We'll discuss these
later-stage queues below.
Next, let's discuss how client requests are matched to downstream connections.
Key hashing
The concurrently processed client requests (taken from the head
of the wait queue) now need to be matched up with downstream connections
to the Couchbase server. If the client's request comes with a key
(like a SET, DELETE, ADD, INCR, single-key GET), the request's key is
hashed to find the right downstream server "host:port:bucket" info.
For example, something like "memcache1:11211:default". If the
client's request was a broadcast-style command (like FLUSH_ALL, or a
multi-key GET), moxi knows the downstream connections that it needs to
acquire.
The downstream conn pool
Next, there's a lookup using those host:port:bucket identifiers into
a downstream conn pool in order to acquire or reserve the
appropriate downstream conns. There's a downstream conn pool per
thread. Each downstream conn pool is just a hashmap keyed by
host:port:bucket with hash values of a linked-list of available
downstream conns. The max length of any downstream conn linked list is
controlled by moxi's downstream_conn_max configuration parameter.
The downstream_conn_max parameter
By default the downstream_conn_max value is 4. A value of 0 means no limit.
So, if you've set downstream_conn_max of 4, have 4 worker threads,
and have 1 bucket, you should see moxi create a maximum of 4
X 4 X 1 or 16 connections to any Couchbase server.
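As a quick sanity check on those two sizing formulas, here is a tiny sketch; the values below are the documented defaults, so substitute your own moxi settings:
# Sketch of the two sizing formulas described above.
concurrency         = 1024  # a.k.a. downstream_max
downstream_conn_max = 4
worker_threads      = 4     # moxi -t parameter
buckets             = 1

max_inflight_requests = concurrency * worker_threads * buckets
max_conns_per_server  = downstream_conn_max * worker_threads * buckets

print(max_inflight_requests)  # 4096 concurrently processed client requests
print(max_conns_per_server)   # 16 connections to any one Couchbase server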
Connecting to a downstream server
If there isn't a downstream conn available, and the
downstream_conn_max wasn't reached, moxi creates a downstream conn
by doing a connect() and SASL auth as needed.
The connect_timeout and auth_timeout parameters
The connect() and SASL auth have their own configurable timeout
parameters, called connect_timeout and auth_timeout, and these
are in milliseconds. The default value for connect_timeout is 400
milliseconds, and the auth_timeout default is 100 milliseconds.
The downstream conn queue
If downstream_conn_max is reached, then the request must wait until a
downstream conn becomes available; the request therefore is
placed on a per-thread, per-host:port:bucket queue, which is called a
downstream conn queue. As downstream conns are released back into the
downstream conn pool, they will be assigned to any requests that are
waiting on the downstream conn queue.
The downstream_conn_queue_timeout parameter
There is another configurable timeout, downstream_conn_queue_timeout,
that defines how long a request should
stay on the downstream conn queue in milliseconds before timing out.
By default, the downstream_conn_queue_timeout is 200 milliseconds. A
value of 0 indicates no timeout.
A downstream connection is reserved
Finally, at this point, downstream conn's are matched up for the
client's request. If you've configured moxi to track timing histogram
statistics, moxi will now get the official start time of the request.
moxi now starts asynchronously sending request message bytes to the
downstream conn and asynchronously awaits responses.
To turn on timing histogram statistics, use the "time_stats=1"
configuration flag. By default, time_stats is 0 or off.
The downstream_timeout parameter
Next, if you've configured a downstream_timeout, moxi starts a timer
for the request where moxi can limit the time it will spend
processing a request at this point. If the timer fires, moxi will
return a "SERVER_ERROR proxy downstream timeout" back to the client.
The downstream_timeout default value is 5000 milliseconds. If moxi sees
this time elapse, it will close any downstream connections that
were assigned to the request. Due to this simple behavior of closing
downstream connections on timeout, a very short downstream_timeout is
not recommended; avoiding one helps prevent repetitive connection
creation, timeouts, closing and reconnecting. On an
overloaded cluster, you may want to increase downstream_timeout so
that moxi does not constantly time out downstream
connections on an already overloaded cluster, or create even more
new connections when servers are already trying to process requests on
old, closed connections. If you see your servers greatly spiking, you
should consider making this adjustment.
Responses are received
When all responses are received from the downstream servers for a request (or the
downstream conn had an error), moxi asynchronously
sends those responses to the client's upstream conn. If you've
configured moxi to track timing histogram statistics, moxi now tracks
the official end time of the request. The downstream conn is now
released back to the per-thread downstream conn pool, and another
waiting client request (if any) is taken off the downstream conn queue
and assigned to use that downstream conn.
Backoff/Blacklisting
In the connection step described above, there's a case where a connect() attempt might fail. Moxi
can be configured to count up the number of connect() failures for a
downstream server, and will also track the time of the last failing
connect() attempt.
With the connect() failure counting, moxi can be configured to
blacklist a server if too many connect() failures are seen, which is
defined by the connect_max_errors configuration parameter. When more
than connect_max_errors number of connect() failures are seen, moxi
can be configured to temporarily stop making connect() attempts to
that server (or backoff) for a configured amount of time. The backoff
time is defined via the connect_retry_interval configuration, in
milliseconds.
The default for connect_max_errors is 5 and the connect_retry_interval
is 30000 milliseconds, that is, 30 seconds.
If you use the connect_max_errors parameter, it should be set greater than
the downstream_conn_max configuration parameter.

How to deal with lost client connections in Apache Qpid + MRG

Using the C++ client, it seems that if the connection to the server is lost for some reason, for example through power failure, manual termination, or a network outage, then the server does not detect that the client is gone. An open (or half-open) connection is kept. How can this be prevented? Is there some server-side heartbeat option?
The client connection to the broker has a connection option called heartbeat. The heartbeat option value is an integer representing a time in seconds. Heartbeat keepalive frames are sent every N seconds. If two successive heartbeats are missed, the connection is considered to be lost.
See connection options
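For illustration, a minimal sketch using the Python qpid.messaging client, where heartbeat is passed as a connection option (the C++ client accepts an equivalent heartbeat entry in its connection options; the broker URL here is a hypothetical placeholder):
# Sketch: request heartbeats every 10 seconds; if two successive heartbeats
# are missed, the connection is considered lost on both ends.
from qpid.messaging import Connection

conn = Connection("broker.example.com:5672", heartbeat=10)
conn.open()
try:
    session = conn.session()
    # ... create senders/receivers and do work on the session ...
finally:
    conn.close()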

MySQL TIME_WAIT; too many connections problem

When I was checking the MySQL load time on the site, I got results showing connections in TIME_WAIT, even though I close the connection on every page. Sometimes the site doesn't load, reporting too many connections. What could be the solution to this problem?
Thanks in advance for any replies or suggestions.
If a client connects to a MySQL server, it usually opens a local port, for example:
localhost:12345 -> mysqlserver:3306
If the client closes the connection, the client's side of the connection goes into TIME_WAIT. Due to TCP routing, a packet might arrive late on the temporary port. A connection in TIME_WAIT just discards these packets. Without a TIME_WAIT, the local port might be reused for another connection and might receive packets from a former connection.
On a high-frequency web application that opens a MySQL connection per request, a large number of TIME_WAIT connections is to be expected. There is nothing wrong with that.
Problems can occur if your local port range is too small, so that you cannot open outgoing connections any more. The usual timeout is 60 seconds, so with a small port range a problem can already occur at more than roughly 400 requests per second.
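Here is a back-of-the-envelope sketch of that limit; the numbers below are examples, so check your own ip_local_port_range and TIME_WAIT duration:
# Sketch: estimate how many new outgoing connections per second the local
# port range can sustain while old connections sit in TIME_WAIT.
low, high = 32768, 60999          # a common default ip_local_port_range
time_wait_seconds = 60            # typical TIME_WAIT duration

usable_ports = high - low + 1
max_new_connections_per_second = usable_ports / time_wait_seconds
print(round(max_new_connections_per_second))  # roughly 470/s before ports run out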
Check:
To check the amount of TIME_WAIT, you can use the following command:
$ cat /proc/net/sockstat
sockets: used 341
TCP: inuse 12 orphan 0 tw 33365 alloc 23 mem 16
UDP: inuse 9 mem 2
UDPLITE: inuse 0
RAW: inuse 0
FRAG: inuse 0 memory 0
The value after "tw", in this case 33365, shows the amount of TIME_WAIT.
Solutions:
a. TIME_WAIT tuning (Linux based OS examples):
Reduce the timeout for TIME_WAIT:
# small values are OK if your MySQL server is in the same local network
echo 15 > /proc/sys/net/ipv4/tcp_fin_timeout
Increase the port range for local ports:
# check what your highest listening ports are before setting this
echo 15000 65000 > /proc/sys/net/ipv4/ip_local_port_range
The settings /proc/sys/net/ipv4/tcp_tw_recycle and /proc/sys/net/ipv4/tcp_tw_reuse might be interesting, too. (But we experienced strange side effects with these settings, so better avoid them. More information in this answer)
b. Persistent Connections
Some programming languages and libraries support persistent connections. Another solution might be using a locally installed proxy like "ProxySQL". This reduces the number of newly opened and closed connections.
If you are getting a lot of TIME_WAIT connections on the MySQL server, that means the MySQL server is closing the connection. The most likely cause in this instance would be that a host or several hosts got onto a block list. You can clear this by running
mysqladmin flush-hosts
To get a list of the number of connections you have per IP, run:
netstat -nat | awk {'print $5'} | cut -d ":" -f1 | sort | uniq -c | sort -n
You can also confirm this is happening by going to one of your clients that is having trouble connecting and telnetting to port 3306. It will throw a message something like:
telnet mysqlserver 3306
Trying 192.168.1.102...
Connected to mysqlserver.
Escape character is '^]'.
sHost 'clienthost.local' is blocked because of many connection errors; unblock with 'mysqladmin flush-hosts'Connection closed by foreign host.
As @Zimbabao suggested in the comment, debug your code for any potential errors that may prevent the code that closes the MySQL connection from running.
If nothing works, check your my.cnf for a system variable called wait_timeout. If it's not present, add it to the [mysqld] section and restart your MySQL server.
[mysqld]
wait_timeout = 3600
It's the number of seconds the server waits for activity on a non-interactive connection before closing it. Further information can be found at http://dev.mysql.com/doc/refman/5.0/en/server-system-variables.html#sysvar_wait_timeout
Tune the figure 3600 (1 hour) to your requirements.
HTH