MySQL 5.5 : "Got an error reading communication packets" - mysql

I just upgraded MySQL from 5.1 to 5.5.
I fixed few issues running mysql_upgrade, and changing some deprecated configurations...
I also updated PHP, from 5.3.3-7 to 5.3.29-1.
But, since that, I'm having a reccurent problem (always thrown in this order) :
1. Client* - PHP Warning
Warning: Packets out of order. Expected 1 received 0. Packet size=1 in
/home/www/www.mywebsite.com/shared/vendor/doctrine/dbal/lib/Doctrine/DBAL/Connection.php
line 694
2. Client* - PHP Warning
Warning: PDOStatement::execute() [pdostatement.execute]: Error reading
result set's header in
/home/www/www.mywebsite.com/shared/vendor/doctrine/dbal/lib/Doctrine/DBAL/Connection.php
line 694
3. Server* - MySQL Warning :
150127 17:25:15 [Warning] Aborted connection 309 to db:
'my_database' user: 'root' host: '127.0.0.1' (Got an error
reading communication packets)
4. Client* - PHP Error
PDOStatement::execute() [pdostatement.execute]: MySQL server
has gone away in
/home/www/www.mywebsite.com/shared/vendor/doctrine/dbal/lib/Doctrine/DBAL/Connection.php
line 694
*NB: What I call "Client" is the PHP Application, and "Server" is the MySQL Server, even if they're both on the same localhost Server.
So, apparently, the origin of all those problems is the first one : "Packets out of order".
But when I search for this error I can't find many answers, and they are most of the time not related to my problem : I use Doctrine as an abstraction, so I don't write any query or fetch any result myself. Plus, it's almost never the same values as me, but in my case I always get those values ("Expected 1 received 0. Packet size=1").
The closest result would be this MySQL bug report, but "No feedback was provided for this bug for over a month, so it is
being suspended automatically"...
Plus, some of the "2." errors aren't thrown by my PHP Doctrine code (they're not executed from my localhost, but from another known external service, probably using some old PHP Propel code).
So that might mean there is a problem with my MySQL configuration itself, but I tried changing some parameters without obtaining any obvious effect (sometimes it takes more time after restarting MySQL to get the first errors for example).
Any help would be very much appreciated !
And here is my current configuration (I've got 2 MySQL instances, the second one using replication is mostly for read only).
I also checked most of the system resources with Munin and didn't see anything abnormal (the RAM usage for example is pretty high, but as there is 50Go on the server it's not full at all).
UPDATE
I isolated an SQL query that was repeatedly failing from my PHP Client. When I executed from my local with MySQL Workbench, it did exactly the same (closed the connexion with a MySQL server has gone away message). When I did it from the sql command line it also did the same. Then I executed it from the sql command line on the server host, and it succeded. But some time after when I tried again from Workbench/whatever it worked... So it looks like those "corrupted packets" are cached and disapear after some time.

Thanks, I fixed this issue doing :
RESET QUERY CACHE;
FLUSH QUERY CACHE;

Related

MariaDB stops working without reason on small centos8 server

I have small web/mail server with apache/mariadb. Last week we changed some of the WWW code and to make it work I changed in php.ini line :
max_input_vars to 5000 (now 4000, it was 1000 at the start)
And it seems changed something because our mariadb 10.3.28 starts making problems.
It just stops reciving any information.
Restart of mysql (and httpd) helps for 24h now ...
Log:
2022-10-05 14:28:58 2796199 [Warning] Aborted connection 2796199 to db: 'ACTIVEDB' user: 'USER' host: 'localhost' (Got an error reading communication packets)
This kind of warnings shows up sometimes but now we got dozens every hour.
In PHP i decrased max_input_vars, in my.cnf I addedd
max_allowed_packet = 124M
max_connections = 400
log_warnings = 3
Everything was at default values before.
Log level was for some time at level 4 but it stared to get too big without any time give to "crush".
Disk is nvme 500GB, Intel and shows no problems.
I would like to hear :
how to check/connect mariadb when it looks inactive
what and how (step by step) to check
Thanks all
This is not an answer, but too long for a comment.
The error "Aborted connection ... (Got an error reading communication packets)" occurs if a client disconnected without sending a COM_CLOSE notification to the server before.
This behavior is easily reproducible, e.g. by starting the command line client and killing the command line client from another session. Depending on the log_warning level, the server will write a log entry and increase the server status variable aborted_clients (or aborted_connects if this happens during connection handshake).
Here are only a few possible reasons:
Before 10.2.3 the default log_warning level was 1 (no logging of aborted_connections), since 10.2.4 default value is 2 (log aborted connections) - if the server was recently upgraded from 10.2.3 or lower the problem may have already existed in previous installation, but was not written into log file.
The PHP script(s) doesn't close the connection: As soon the script is ready with its work, make sure that all transactions were committed, memory (result sets) were freed, and the connection was closed properly.
A timeout occurred, e.g. wait_timeout was set too low and exceeded or PHP's max_execution_time exceeded (and script was killed) or net_read/write_timeouts occurred.
DNS problems: In this case enable skip-name-resolve, use IP's and verify against IP's.
Network or firewall problems
I would like to thank for sugestions and your time.
Looks like there was a problem with one table, I don't know how/why but after some time it was set up as READ-ONLY. There was no information in log about it up to level 3 (level 4 generated too much information for me :( )
The DB works some time fine (expect for one table) and in time it looks like it just hangs whole DB.
The case is still "under investigation".
About "damaged" table :
listed and read-only related thinks works fine
insert/change hangs the whole DB for 2-3 minutes and then gets back to work
after few "hangs" the DB just freezes
there was nothing strange in logs level 3
copy table to new and then change names to switch tables works (I hope so)
If anyone have any idea how to check table would be great (standard operatinos like check, analyse did nothing).
Thanks again.

Google MySQL fails with ERROR 2013 HY000 system error 2

I have a D8 google mysql instance. I'm running an etl process trying to push about 100GB of data in but the script keep stopping because of the error.
In order to get it working I have to restart the mysql instance and then re run the process from where it failed. Any help is greatly appreciated. I haven't found anything on google.
This error tends to mean the connection to the MySQL server was lost at some point. As mentioned in this 'Lost connection to MySQL server' article, this error may be encountered when writing a significant number or rows.
The article suggests setting the net_read_timeout to something greater than the default value of 30 seconds. You may also want to consider breaking up your write tasks as well.

MySQL Query running even after losing connection

I've a MySQL 5.1.41 Server installed on a Ubuntu machine. I get connected to it through Workbench from my Windows machine over TCP/IP. I run a bigger query, after 900 seconds I got the below message, (there is no wait_timeout defined in the server's configuration file my.cnf)
Error Code: 2013. Lost connection to MySQL server during query
But when I look into the process list by using show processlist; command, I can still see my query running.
I got this link http://dev.mysql.com/doc/refman/5.0/en/gone-away.html where I found the below lines,
The problem on Windows is that in some cases MySQL does not get an
error from the OS when writing to the TCP/IP connection to the server,
but instead gets the error when trying to read the answer from the
connection.
I'm not sure whether this is the reason for my observation.
Please clarify me on this.
Thanks in advance!!
Closing connection is not a reason to stop a query. A query might be update, or kind of transaction, or select with output to remote (server) file.
Closed connection is just is just means, that you will not receive any data from DBMS after executing query (data, timings - nothing).
The reason of closing connection could be different, as SO-User posted. Try increasing
on server side:
wait_timeout
max_allowed_packet
on client side:
any kinds of timeout you find in your client (i.e. that SO-User suggests)
Do not forget to reload DBMS config and restart client (for sure)
In MySQL WorkBench we have an option to change timeout.
Find it under
Edit → Preferences → SQL Editor → DBMS connection read time out (in seconds): 600
Changed the value to 6000 or something higher.
Update
Lost connection to MySQL server
There are three likely causes for this error message.
Usually it indicates network connectivity trouble and you should check
the condition of your network if this error occurs frequently. If the
error message includes “during query,” this is probably the case you
are experiencing.
Sometimes the “during query” form happens when millions of rows are
being sent as part of one or more queries. If you know that this is
happening, you should try increasing net_read_timeout from its default
of 30 seconds to 60 seconds or longer, sufficient for the data
transfer to complete.
More rarely, it can happen when the client is attempting the initial
connection to the server. In this case, if your connect_timeout value
is set to only a few seconds, you may be able to resolve the problem
by increasing it to ten seconds, perhaps more if you have a very long
distance or slow connection. You can determine whether you are
experiencing this more uncommon cause by using SHOW GLOBAL STATUS LIKE
'Aborted_connects'. It will increase by one for each initial
connection attempt that the server aborts. You may see “reading
authorization packet” as part of the error message; if so, that also
suggests that this is the solution that you need.
If the cause is none of those just described, you may be experiencing
a problem with BLOB values that are larger than max_allowed_packet,
which can cause this error with some clients. Sometime you may see an
ER_NET_PACKET_TOO_LARGE error, and that confirms that you need to
increase max_allowed_packet.
Doc link: Error lost connection
and also check here

ColdFusion 10 Communications link failure to MySQL

We are migrating some websites onto a cloud infrastructure running Windows 2008 virtual machines. These websites all run on ColdFusion with MySQL databases. They currently are running in our CoLo with no problems. Additionally, they are running on our development network in our offices with no problems.
We are setting up our cloud to match as closely as possible the configuration we currently use which is, essentially, CF10 + IIS on one server and MySQL on a separate machine. We are 99% finished and most things are running great. However....
We have run into a couple, as in 2, places where we click a link/button and are greeted with:
Error Executing Database Query.
Communications link failure The last packet successfully received from the server was 0 milliseconds ago. The last packet sent successfully to the server was 0 milliseconds ago.
Scanning the stack-trace I also find:
Caused by: java.net.SocketException: Connection reset
The communications link error is ALWAYS: 0ms.
What's most puzzling is the Queries that seem to be causing this are simple queries that are used ALL OVER the sites with no problems. Why they are failing at hese 2 particular places has us at wits end.
Our only clue is, looking at the CF Error description of what scripts are called, we can see the script where the query is failing is getting called twice? For example, one of the occurences is in our Application file:
>The error occurred in D:/Our_Web_Sites/oursite/Application.cfm: line 73
>Called from D:/Our_Web_Sites/oursite/Application.cfm: line 17
>Called from D:/Our_Web_Sites/oursite/Application.cfm: line 1
>Called from D:/Our_Web_Sites/oursite/Application.cfm: line 73
>Called from D:/Our_Web_Sites/oursite/Application.cfm: line 17
>Called from D:/Our_Web_Sites/oursite/Application.cfm: line 1
We can find nothing in our CF code that would be causing the script to be called twice so our guess is the first call is failing on the Query so CF tries again...only to fail and error.
Googling this issue I've found lots of posts about changing the MySQL timeouts. None of those worked and I didn't expect them to since what we're dealing with doesn't appear to be a timeout issue. These pages fail each and every time.
The closest we've come to a solution came from this blog posting:
http://www.talkingtree.com/blog/index.cfm/2011/1/12/Validation-Query-for-MySQL-communications-link-failure!
If we UNCHECK the "Maintain connections across client requests. " setting in CFAdmin then the error goes away. The blog suggests leaving that checked, which is our preference, and using Connection Validation of "SELECT 1;". Try that...same error.
We've also tried the JDBC AutoConnect=true option. No effect.
Downloaded latest JDBC Connector and used it instead of standard CF10-MySQL connector. No effect.
Again, 99% of the site works with the exception of these two links, both of which work just fine in all our other environments. Any other ideas?
I feel like I've had a similar problem every time I upgrade CF or MySQL. Usually a change in the JDBC driver or connection string helps, which I see you already tried.
Have you checked the MySQL error log for any hints? Ours is in /var/lib/mysql (whatever your 'datadir' variable is set to) and ends with a .err extension.
Also, maybe trying some of the other JDBC connection string options for your version? I see some extended logging you can enable.
http://dev.mysql.com/doc/refman/5.1/en/connector-j-reference-configuration-properties.html
Found the issue. We are running our network on Savvis' cloud infrastructure. The Windows server instances we were using from Savvis had Trend Micro Deep Security Agent installed. This is an intrusion protection system and it was the problem. Disabling the service cleared up all communication errors. I have no clue why it was rejecting some queries that it had just accepted previously. I am just glad to (finally) put this behind me!

how to understand "can't connect" mysql error messages?

I have the following error message:
SQLSTATE[HY000] [2003] Can't connect to MySQL server on
'192.168.50.45' (4)
How would I parse this (I have HY000, I have 2003 and I have the (4).
HY000 is a very general ODBC-level error code, and 2003 is the MySQL-specific error code that means that the initial server connection failed. 4 is the error code from the failed OS-level call that the MySQL driver tried to make. (For example, on Linux you will see "(111)" when the connection was refused, because the connect() call failed with the ECONNREFUSED error code, which has a value of 111.)
Using the perror tool that comes with MySQL:
shell> perror 4
OS error code 4: Interrupted system call
It might a bug where incorrect error is reported, in this case, it might a simple connection timeout (errno 111)
FWIW, having spent around 2-3 months looking into this in a variety of ways, we have come to the conclusion that (at least for us), the (4) error happen when the network is too full of data for the connection to complete in a sane amount of time. from our investigations, the (4) occurs midway through the handshaking process.
You can see this in a unix environment by using 'netem' to fake network congestion.
The quick solution is to up the connection timeout parameter. This will hide any (4) error, but may not be the solution to the issue.
The real solution is to see what is happeneing at the DB end at the time. If you are processing a lot of data when this happens, it may be a good ideas to see if you can split this into smaller chunks, or even pas the processing to a different server, if you have that luxury.
I happened to face this problem. Increase the connect_timeout worked out finally.
I was just struggling with the same issue.
Disable the DNS hostname lookups solved the issue for me.
[mysqld]
...
...
skip-name-resolve
Don't forget to restart MySQL to take effect.
#cdhowie While you may be right in other circumstances, with that particular error the (4) is a mysql client library error, caused by a failed handshake. Its actually visible in the source code. The normal reason is too much data causing an internal timeout. Making 'room' for the connection normally sorts it without masking the issue, like upping the timeout or increasing bandwidth.