Apache2 in Freebsd Concurrent request cause Connection reset - json

I am currently trying to move my web server (php zendframework based) from Ubuntu to FreeBSD. Both the servers having the same hardware configuration. After migration, I did JMeter test (Http request (Json), Concurrent = 200) of the server, "Throughput" in FreeBSD server was double that of the Ubuntu server which is amazing.
However, when I increase the concurrent to 500, I see almost 50% request Failure due to "java.net.SocketException: Connection reset". But it works as normal in the Ubuntu server.
After many times testing, I found Ubuntu can handle 1500 concurrent httprequest without error, FreeBSD server can handle 200 concurrent request with double speed without error, but cannot handle more. In order to verify the result, I tried AB command. **ab -c 200 -n 5000 127.0.0.1/responseController. It fails and terminate if the ¬-c parameter is over 200, but works fine in Ubuntu.
For debugging I did following:
1. adjust httpd.conf, /boot/loader.conf, /etc/sysctl.conf somehow, but looks like nothing changed.
2. I try switch to mpn_worker_module in Apache configuration and its relevant configuration in php. Nothing changed but failure part log was different, which showed "request failure to respond" rather than "java.net.SocketException: Connection reset"
I did a lot of search but couldn't find the cause of this failure. I though the Json request would be waiting until response or timeout?
I am not sure which configuration file or parameter will make it work.
Please help.

Thanks for Michael Zhilin, yes "ipfw" did something to caused this, and yes "kern.ipc.soacceptqueue" is the bottleneck in this case.

Related

VerneMQ plugin_chain_exhausted Authentication MySQL

I have a running instance of VerneMQ (cluster of 2 nodes) on Google kubernets and using MySQL (CloudSQL) for Auth. Server accepts connections over TLS
It works fine, but after a few days i start seeing this message on the log:
can't authenticate client {[],<<"Client-id">>} from X.X.X.X:16609 due to plugin_chain_exhausted
The client app (paho) complains that the server refused the connection for being "not authorized (code=5 in paho error)"
after a few retry it finally connects. but every time it get's harder and harder until it just won't connect anymore
If i restart VerneMQ everything get's back to normal
I have only 3 clients currently connected at most, at the same time.
clients already connected have no issues in pub/sub.
In my configuration i have (among other things):
log.console.level=debug
plugins.vmq_diversity=on
vmq_diversity.mysql.* = all of them set
allow_anonymous=off
vmq_diversity.auth_mysql.enabled=on
it's like the server degrades over time. the status webpage reports no problem
My verne server was build from the git repository about a month ago and runs on a docker container
what could be the cause?
what else could i check to find posibles causes? maybe a diversity missconfiguration?
Tks
To quickly explain the plugin_chain_exhausted log: with Verne you can run multiple authentication/authorization plugins, and they will be checked in a chain. If one plugin allows the client, it will be in. If no plugin allows the client, you'll see the log above.
This does not explain the behaviour you describe, though. I don't think I have seen that.
In any case, the first thing to check is whether you actually run multiple plugins. For instance: have you disabled the vmq.passwd and the vmq.acl plugins?

Jmeter: Getting "java.net.ConnectException: Connection timed out: connect" error

I am trying to hit 350 users but Jmeter failing script by saying Connection timed out.
I have added following:
http.connection.stalecheck$Boolean=true in hc.parameter file
httpclient4.retrycount=1
hc.parameter.file=hc.parameter
Is there anything that I am missing to add on?
This normally indicates a problem on application under test side so I would recommend checking the logs of your application for anything suspicious.
If everything seems to be fine there - check logs of your web and database servers, for instance Apache HTTP Server allows 150 connections by default, MySQL - 100, etc. so you may need to identify whether you are suffering from this form of limits and what needs to be done to raise them
And finally it may be simply lack of CPU or free RAM on application under test side so next time you run your test keep an eye on baseline OS health metrics as application may respond slowly or even hang if it doesn't have any spare headroom. You can use JMeter PerfMon plugin to integrate this form of monitoring with your load test.

MySQL Query running even after losing connection

I've a MySQL 5.1.41 Server installed on a Ubuntu machine. I get connected to it through Workbench from my Windows machine over TCP/IP. I run a bigger query, after 900 seconds I got the below message, (there is no wait_timeout defined in the server's configuration file my.cnf)
Error Code: 2013. Lost connection to MySQL server during query
But when I look into the process list by using show processlist; command, I can still see my query running.
I got this link http://dev.mysql.com/doc/refman/5.0/en/gone-away.html where I found the below lines,
The problem on Windows is that in some cases MySQL does not get an
error from the OS when writing to the TCP/IP connection to the server,
but instead gets the error when trying to read the answer from the
connection.
I'm not sure whether this is the reason for my observation.
Please clarify me on this.
Thanks in advance!!
Closing connection is not a reason to stop a query. A query might be update, or kind of transaction, or select with output to remote (server) file.
Closed connection is just is just means, that you will not receive any data from DBMS after executing query (data, timings - nothing).
The reason of closing connection could be different, as SO-User posted. Try increasing
on server side:
wait_timeout
max_allowed_packet
on client side:
any kinds of timeout you find in your client (i.e. that SO-User suggests)
Do not forget to reload DBMS config and restart client (for sure)
In MySQL WorkBench we have an option to change timeout.
Find it under
Edit → Preferences → SQL Editor → DBMS connection read time out (in seconds): 600
Changed the value to 6000 or something higher.
Update
Lost connection to MySQL server
There are three likely causes for this error message.
Usually it indicates network connectivity trouble and you should check
the condition of your network if this error occurs frequently. If the
error message includes “during query,” this is probably the case you
are experiencing.
Sometimes the “during query” form happens when millions of rows are
being sent as part of one or more queries. If you know that this is
happening, you should try increasing net_read_timeout from its default
of 30 seconds to 60 seconds or longer, sufficient for the data
transfer to complete.
More rarely, it can happen when the client is attempting the initial
connection to the server. In this case, if your connect_timeout value
is set to only a few seconds, you may be able to resolve the problem
by increasing it to ten seconds, perhaps more if you have a very long
distance or slow connection. You can determine whether you are
experiencing this more uncommon cause by using SHOW GLOBAL STATUS LIKE
'Aborted_connects'. It will increase by one for each initial
connection attempt that the server aborts. You may see “reading
authorization packet” as part of the error message; if so, that also
suggests that this is the solution that you need.
If the cause is none of those just described, you may be experiencing
a problem with BLOB values that are larger than max_allowed_packet,
which can cause this error with some clients. Sometime you may see an
ER_NET_PACKET_TOO_LARGE error, and that confirms that you need to
increase max_allowed_packet.
Doc link: Error lost connection
and also check here

PHP MySQL Connection Timed Out on private network

I am getting intermittent 'Connection Timed Out' errors when a php script on my web server connects to the MySQL database server over the private network. However, if I tell the script to use the public network to connect, these errors do not appear.
My connection script is setup so that whenever I try to connect to mysql, it checks for errors, if there is an error, it sends me an email then automatically switches to the public network to try that connection. If the public connection fails, it sends me another email and displays a custom web page to the user.
I get about 5 to 10 connection errors every hour. There are hundreds of successful connections every minute.
These machines are dedicated machines. I contacted our hosting company and they tested the routers and cables and said everything is fine. I tried pinging the servers both ways and there are no errors at all for test periods over an hour.
I am using the latest Nginx with the latest PHP and PHP-FPM. Mysql is 5.5.27. These are Centos 6 64bit systems with that latest updates.
I've tried many network configuration options, adjustments to php-fpm & mysql config file and no matter what I do or change, nothing fixes it.
The weird thing is, everything works great over the public network and pings and file transfer work great over the private network between both machines.
Any ideas?
** UPDATE **
I made some changes to the PHP-FPM config file and to the MySQL config file and the errors are now about 2 to 3 per hour but still unresolved.
I'm not sure this is your case but still worth mentioning as it helped me in a similar situation. Basically, there is a cap on max number of connections in linux kernel: https://serverfault.com/questions/10852/what-limits-the-maximum-number-of-connections-on-a-linux-server
Not sure if it is shared between all the networks, but if you think it's worth checking I'd just raise those variable values say twice and see if it had any effect on how frequently the error happens.

A transport-level error has occurred when receiving results from the server [closed]

Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers.
This question was caused by a typo or a problem that can no longer be reproduced. While similar questions may be on-topic here, this one was resolved in a way less likely to help future readers.
Closed 6 years ago.
Improve this question
I'm getting a SQL Server error:
A transport-level error has occurred
when receiving results from the
server. (provider: Shared Memory
Provider, error: 0 - The handle is
invalid.)
I'm running Sql Server 2008 SP1, Windows 2008 Standard 64 bit.
It's a .Net 4.0 web application. It happens when a request is made to the server. It's intermittent. Any idea how I can resolve it?
The database connection is closed by the database server. The connection remains valid in the connection pool of your app; as a result, when you pickup the shared connection string and try to execute it's not able to reach the database. If you are developing Visual Studio, simply close the temporary web server on your task bar.
If it happens in production, resetting your application pool for your web site should recycle the connection pool.
Try the following command on the command prompt:
netsh interface tcp set global autotuning=disabled
This turns off the auto scaling abilities of the network stack
I had the same problem. I restarted Visual Studio and that fixed the problem
Transport level errors are often linked to the connection to sql server being broken ... usually network.
Timeout Expired is usually thrown when a sql query takes too long to run.
So few options can be :
Check for the connection in VPN (if used) or any other tool
Restart IIS
Restart machine
Optimize sql queries.
For those not using IIS, I had this issue when debugging with Visual Studio 2010. I ended all of the debugger processes: WebDev.WebServer40.EXE which solved the issue.
All you need is to Stop the ASP.NET Development Server and run the project again
If you are connected to your database via Microsoft SQL Server Management, close all your connections and retry.
Had this error when connected to another Azure Database, and worked for me when closed it.
Still don't know why ..
Look at the MSDN blog which details out this error:
Removing Connections
The connection pooler removes a connection from the pool after it has
been idle for a long time, or if the pooler detects that the
connection with the server has been severed.
Note that a severed connection can be detected only after attempting
to communicate with the server. If a connection is found that is no
longer connected to the server, it is marked as invalid.
Invalid connections are removed from the connection pool only when
they are closed or reclaimed.
If a connection exists to a server that has disappeared, this
connection can be drawn from the pool even if the connection pooler
has not detected the severed connection and marked it as invalid.
This is the case because the overhead of checking that the connection
is still valid would eliminate the benefits of having a pooler by
causing another round trip to the server to occur.
When this occurs, the first attempt to use the connection will detect
that the connection has been severed, and an exception is thrown.
Basically what you are seeing is that exception in the last sentence.
A connection is taken from the connection pool, the application does
not know that the physical connection is gone, an attempt to use it is
done under the assumption that the physical connection is still there.
And you get your exception.
There are a few common reasons for this.
The server has been restarted, this will close the existing connections.
In this case, have a look at the SQL Server log, usually found at:
C:\Program Files\Microsoft SQL Server\\MSSQL\LOG
If the timestamp for startup is very recent, then we can suspect that
this is what caused the error. Try to correlate this timestamp with
the time of exception.
2009-04-16 11:32:15.62 Server Logging SQL Server messages in file
‘C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\LOG\ERRORLOG’.
Someone or something has killed the SPID that is being used.
Again, take a look in the SQL Server log. If you find a kill, try to
correlate this timestamp with the time of exception.
2009-04-16 11:34:09.57 spidXX Process ID XX was killed by
hostname xxxxx, host process ID XXXX.
There is a failover (in a mirror setup for example) again, take a look in the SQL Server log.
If there is a failover, try to correlate this timestamp with the time
of exception.
2009-04-16 11:35:12.93 spidXX The mirrored database “” is changing roles from “PRINCIPAL” to “MIRROR” due to
Failover.
Was getting this, always after about 5 minutes of operation. Investigated and found that a warning from e1iexpress always occurred before the failure. This apparently is an error having to do with certain TCP/IP adapters. But changing from WiFi to hardwired didn't affect it.
So tried Plan B and restarted Visual Studio. Then it worked fine.
On closer study I noticed that, when working correctly, the message The Thread '<No Name>' has exited with code 0 occurred at almost exactly the time the run crashed in previous attempts. Some Googling reveals that that message comes up when (among other things) the server is trimming the thread pool.
Presumably there was a bogus thread in the thread pool and every time the server attempted to "trim" it it took the app down.
You get this message when your script make SQL Service stopped for some reasons. so if you start SQL Service again perhaps your problem will be resolved.
I know this may not help everyone (who knows, maybe yes), but I had the same problem and after some time, we realized that the cause was something out of the code itself.
The computer trying to reach the server, was in another network, the connection could be established but then dropped.
The way we used to fix it, was to add a static route to the computer, allowing direct access to the server without passing thru the firewall.
route add –p YourServerNetwork mask NetworkMask Router
Sample:
route add –p 172.16.12.0 mask 255.255.255.0 192.168.11.2
I hope it helps someone, it's better to have this, at least as a clue, so if you face it, you know how to solve it.
I got the same error in Visual Studion 2012 development environment, stopped the IIS Express and rerun the application, it started working.
I had the same issue. I solved it, truncating the SQL Server LOG.
Check doing that, and then tell us, if this solution helped you.
For me the solution was totally different.
In my case I had an objectsource which required a datetimestamp parameter. Even though that ODS parameter ConvertEmptyStringToNull was true 1/1/0001 was being passed to SelectMethod. That in turn caused a sql datetime overflow exception when that datetime was passed to the sql server.
Added an additional check for datetime.year != 0001 and that solved it for me.
Weird that it would throw a transport level error and not a datetime overflow error.
Anyways..
In my case the "SQL Server" Server service stopped. When I restarted the service that enabled me to run the query and eliminate the error.
Its also a good idea to examine your query to find out why the query made this service stop
For me the answer is to upgrade the OS from 2008R2 to 2012R2, the solution of iisreset or restart apppool didn't work for me.
I also tried to turn of TCP Chimney Offload setting, but I didn't restart the server because it is a production server, which didn't work either.
We encountered this error recently between our business server and our database server.
The solution for us was to disable "IP Offloading" on the network interfaces.
Then the error went away.
One of the reason I found for this error is 'Packet Size=xxxxx' in connection string. if the value of xxxx is too large, we will see this error. Either remove this value and let SQL server handle it or keep it low, depending on the network capabilities.
It happened to me when I was trying to restore a SQL database and checked following Check Box in Options tab,
As it's a stand alone database server just closing down SSMS and reopening it solved the issue for me.
This occurs when the database is dropped and re-created some shared resources is still considering the database still exists, so when you re-run execute query to create tables in the database after it was re-created the error will not show again and Command(s) completed successfully. message will show instead of the error message Msg 233, Level 20, State 0, Line 0 A transport-level error has occurred when sending the request to the server. (provider: Shared Memory Provider, error: 0 - No process is on the other end of the pipe.).
Simply ignore this error when you are dropping and recreating databases and re-execute your DDL queries with no worries.
I faced the same issue recently, but i was not able to get answer in google.
So thought of sharing it here, so that it can help someone in future.
Error:
While executing query the query will provide few output then it will throw below error.
"Transport level error has occurred when receiving output from
server(TCP:provider,error:0- specified network name is no longer
available"
Solution:
Check the provider of that linked server
In that provider properties ,Enable "Allow inprocess" option for that particular provider to fix the issue.