mysql server has gone away - how do I debug the reason? - mysql

I'm getting "MySQL server has gone away" in an import script written in PHP.
The page http://dev.mysql.com/doc/refman/5.0/en/gone-away.html lists possible reasons. But how do I debug this to find out which of the reasons applies?
If a timeout fired, I want to know that this happened and which timeout it was, with the relevant details. If the query was broken, I want to know that this is the reason.
So, is there a way to get more detail than the very general message that the server "has gone away"?
Is there a way to log this, timeouts for example?

I wouldn't think that there's a MySQL magic bullet to find out. It sounds like you need to use operating system and/or network diagnostic tools. Maybe within PHP you can do some things like try pinging the server, checking if the MySQL daemon is running, etc. You might also want to check into using a tool such as Wireshark or tcpdump to see if you're seeing anything funky on the network, such as reset packets or other dropped connection indicators.
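Two of the causes on the linked page can at least be ruled in or out from the client side: the connection sat idle longer than the server's wait_timeout, or the query was larger than max_allowed_packet. Here is a minimal sketch of that comparison. The function name is made up for illustration, and it assumes you measure the idle time and query size yourself in the import script and read the two server values with SHOW VARIABLES.

```python
def likely_gone_away_causes(idle_seconds, query_bytes, wait_timeout, max_allowed_packet):
    """Return hints about why a 'server has gone away' error may have occurred.

    idle_seconds and query_bytes are measured by the client (assumption: the
    import script tracks them); wait_timeout and max_allowed_packet are the
    values the server reports for the variables of the same name.
    """
    hints = []
    if idle_seconds >= wait_timeout:
        hints.append("connection was idle %d s, wait_timeout is %d s"
                     % (idle_seconds, wait_timeout))
    if query_bytes > max_allowed_packet:
        hints.append("query was %d bytes, max_allowed_packet is %d bytes"
                     % (query_bytes, max_allowed_packet))
    if not hints:
        hints.append("neither limit was exceeded; check the network and the server log")
    return hints


# Hypothetical numbers: a connection left idle for 9 hours before a small query.
for hint in likely_gone_away_causes(9 * 3600, 2048, 28800, 1048576):
    print(hint)
```

If neither check fires, the network-level diagnostics are the next thing to try.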
If the MySQL server is remote, try doing a constant ping to it and see if you're seeing dropped or delayed packets. Make sure you check with large packets; I've seen systems on which a ping works fine, but routers screw up larger packet sizes due to MTU mismatches and such. On Windows, it is:
ping -t -l 1500 mysqlhost
On most Linux and Unix systems, it is:
ping -s 1500 mysqlhost
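To see why a 1500-byte ping is a useful test, here is the arithmetic as a small sketch. It assumes IPv4 without header options and a typical Ethernet MTU of 1500; the function names are just for illustration. The size given to ping is the ICMP payload, so the headers come on top of it.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options (assumption)
ICMP_HEADER_BYTES = 8   # ICMP echo request header


def ipv4_ping_packet_size(payload_bytes):
    """Total IP packet size for an ICMP echo carrying payload_bytes of data."""
    return payload_bytes + IPV4_HEADER_BYTES + ICMP_HEADER_BYTES


def needs_fragmentation(payload_bytes, mtu=1500):
    """True if the echo request does not fit into a single packet of size mtu."""
    return ipv4_ping_packet_size(payload_bytes) > mtu


def largest_unfragmented_payload(mtu=1500):
    """Largest ping payload that still fits into one packet of size mtu."""
    return mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES


print(ipv4_ping_packet_size(1500))     # size on the wire for the commands above
print(needs_fragmentation(1500))       # whether that exceeds a 1500-byte MTU
print(largest_unfragmented_payload())  # largest payload that fits in one packet
```

So the commands above force fragmentation on a standard link, which is exactly the case that broken MTU handling tends to trip over. If you want to probe the path MTU instead, you can lower the payload and, I believe, forbid fragmentation with -f on Windows or -M do on Linux.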

Related

Google compute engine, instance dead? How to reach?

I have a small instance running in GCE. I had some trouble with MongoDB, so after a few attempts I decided to reset the instance. But it didn't seem to come back online, so I stopped the instance and restarted it.
It is a Bitnami MEAN stack, which starts Apache and other services at startup.
But I can't reach the instance! No SCP, no SSH, no web service running. When I try to connect via SSH (in GCE) it times out; it can't make a connection on port 22. The information says 'The instance is booting up and sshd is not running yet', which is possible of course. But I can't reach the instance in any way, not even after waiting an hour. I'm not sure how to find out what's happening if I can't connect to it at all.
There is some activity in the console: some CPU usage, mostly 0%, and some incoming traffic but no outgoing.
I hope someone can give me a hint here!
Update 1
After the helpful tip from Serhii, I found this in the logs:
Booting from Hard Disk 0...
[ 0.872447] piix4_smbus 0000:00:01.3: SMBus base address uninitialized - upgrade BIOS or use force_addr=0xaddr
/dev/sda1 contains a file system with errors, check forced.
/dev/sda1: Inodes that were part of a corrupted orphan linked list found.
/dev/sda1: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
(i.e., without -a or -p options)
fsck exited with status code 4
The root filesystem on /dev/sda1 requires a manual fsck
Update 2...
So, I need to fsck the drive.
I created a snapshot, made a new disk from that snapshot, and added the new disk as an extra disk to another instance. Now that instance won't boot, with the same problem; removing the extra disk fixed it again. So adding the disk makes it crash even though it isn't the boot disk?
First, have a look at Compute Engine -> VM instances -> NAME_OF_YOUR_VM -> Logs -> Serial port 1 (console) and try to find errors and warnings that could be connected to a lack of free space or to SSH. It would be helpful if you updated your post with this information. If your instance has run out of free space, follow these instructions.
You can try to connect to your VM via Serial console by following this guide, but keep in mind that:
The interactive serial console does not support IP-based access
restrictions such as IP whitelists. If you enable the interactive
serial console on an instance, clients can attempt to connect to that
instance from any IP address.
You can find more details in the documentation.
Have a look at the Troubleshooting SSH guide and Known issues for SSH in browser. In addition, Google provides a troubleshooting script for Compute Engine to identify issues with SSH login/accessibility of your Linux based instance.
If you still have a problem try to use your disk on a new instance.
EDIT It looks like your test VM is trying to boot from the disk that you created from the snapshot. Try to follow this guide.
If you still have a problem, you can try to recreate the boot disk from a snapshot to resize it.
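As a side note on the "fsck exited with status code 4" line in the serial log from Update 1: fsck's exit status is a sum of flag bits documented in the fsck(8) man page. Here is a small sketch that decodes it; the function name is made up for illustration and the descriptions are paraphrased from the man page.

```python
# Exit-status bits as documented in fsck(8); the wording is paraphrased.
FSCK_FLAGS = {
    1: "filesystem errors corrected",
    2: "system should be rebooted",
    4: "filesystem errors left uncorrected",
    8: "operational error",
    16: "usage or syntax error",
    32: "checking canceled by user request",
    128: "shared-library error",
}


def decode_fsck_status(status):
    """Translate an fsck exit status into the list of conditions it encodes."""
    if status == 0:
        return ["no errors"]
    return [meaning for bit, meaning in sorted(FSCK_FLAGS.items()) if status & bit]


print(decode_fsck_status(4))
```

A status of 4 therefore means the automatic check found errors it would not fix on its own, which matches the "RUN fsck MANUALLY" message in the log.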

WordPress sends lots of sleep SQL queries to MySQL

My server got stuck last night because of a database connection error.
I found that it was caused by too many database connections. After searching Google and Stack Overflow, I didn't find any useful information. While I go through all the plugins one by one to see whether any of them has a bug that causes this, I would like to ask for your help.
First of all, when I log in to MySQL I can see a lot of SLEEP queries with NULL info. I tried to kill all the sleep queries from the command line, but more requests fill up all the connections right away.
The weird thing is that the Apache server is not actually getting a high volume of requests. I am using AWS RDS as my database server, so Apache and MySQL are not on the same server. The RDS server doesn't have public access, so I am sure all requests come only from my Apache server. The CPU usage on the Apache server is not high. I also searched Apache's access_log, and there were not a lot of requests at that time, and I cannot find anything wrong with those requests. In particular, none of the requests is performing an injection attack. I thought something in the code might have triggered it, so I searched all my code for 'SLEEP' but only found it in the W3 Total Cache plugin, in code blocks that are not easily reached. I turned off XML-RPC at the Apache level, so it shouldn't be an XML-RPC attack.
I know there are a lot of possibilities since I am using about twenty plugins on my site, but it is really weird that I cannot find any requests at the Apache level that could have caused this. Is it possible for requests to hit the server without being recorded in access_log?
I am pretty new to configuring Apache and MySQL on my own and am still learning these features. Thanks in advance for helping me!
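One note on what is being observed: in the process list, a row whose Command is Sleep and whose Info is NULL is an open connection that is currently idle, not a query calling SLEEP(). Here is a rough sketch for summarizing such rows per client host. It assumes the rows have already been fetched (for example from SHOW PROCESSLIST) into dictionaries keyed by the usual column names; the sample rows and the function name are invented for illustration.

```python
from collections import Counter


def summarize_sleeping(rows):
    """Map each client host to (number of idle connections, longest idle seconds)."""
    counts = Counter()
    longest = {}
    for row in rows:
        if row["Command"] != "Sleep":
            continue
        host = row["Host"].rsplit(":", 1)[0]  # drop the client port, if present
        counts[host] += 1
        longest[host] = max(longest.get(host, 0), row["Time"])
    return {host: (counts[host], longest[host]) for host in counts}


# Hypothetical process list rows, not taken from the question.
sample_rows = [
    {"Id": 1, "Host": "10.0.0.5:51234", "Command": "Sleep", "Time": 120, "Info": None},
    {"Id": 2, "Host": "10.0.0.5:51240", "Command": "Sleep", "Time": 30, "Info": None},
    {"Id": 3, "Host": "10.0.0.5:51301", "Command": "Query", "Time": 0, "Info": "SELECT 1"},
]
print(summarize_sleeping(sample_rows))
```

Long idle times from a single host would point to connections being opened and never closed, rather than to a flood of incoming requests.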

Is Expect code going to throw an error if I reboot the remote machine?

I am not sure what will happen, so I am asking this question, also because I haven't tested it. I have a function sendcommand which sends commands to a remote machine, and it works fine for normal commands. But what if it sends a reboot command, like below?
sendcommand reload
expect -re "$prompt"
send -- "exit\r"
expect eof
I mean, after the reload, how will the rest of the script execute? Will it throw an error, or will it work fine? Please guide.
It depends on exactly how you ask for the reboot to be done. Rebooting may be done by asking the system to restart, and the time to process that might allow you to exit. Or it might not; there's a race condition. You certainly need to drop the network connection though; when the OS comes back, it won't recognize it and you'll get a forced connection reset (if not before).
Or you could ask it to reboot a couple of seconds in the future (I forget the exact syntax for this) to give yourself time to disconnect. Some individual research and experimentation is likely to be needed; VMs are good for this as they restart much more rapidly…

MySQL has suddenly started regularly opening unsuccessful sockets

I've desperately tried to figure out what's happened here, but haven't seen this particular problem anywhere. I've 'inherited' (as in, not built any of it myself) management of a database server (remote, in a data warehouse, accessed by ssh) where some php daemons are running on a Linux server acting as data crawlers, inserting and processing information in a relatively steady stream into mysql.
A couple of days ago, the server crashed and came back on again. I logged in and restarted the mysql server and the crawlers, thinking no more of it. A day and a half later, the mysql server stopped working, and I couldn't diagnose it since I couldn't log into it, nor did it respond to "/etc/init.d/mysql stop" or varieties thereof. According to the log file, it kept throwing errors very regularly (once every four minutes and 16 seconds) and said that it had too many file handlers open. When I shut down the crawlers, however, I could log in again, but mysql kept throwing the errors. I checked lsof and it showed a lot of open sockets with a "can't identify protocol" error.
mysqld 28843 mysql 1990u sock 0,4 2856488 can't identify protocol
mysqld 28843 mysql 1989u sock 0,4 2857220 can't identify protocol
^Thousands of these rows
I thought it was something the crawlers had done, and I restarted mysql and the failed sockets disappeared. But I was surprised to see that mysql kept opening new ones, even when the crawlers weren't running. It did this very regularly, about two new failed sockets a minute, regardless of whether the crawlers were active or not. I increased the maximum amount of filehandlers allowed for mysql to buy some time, but I'm obviously looking for a diagnosis and permanent solution.
All descriptions of such errors (socket leaks) that I've found on forums seem to be about your own software leaking by not closing its sockets. But here it seems to be mysql itself that does it, and there has been no change in any of the code since it worked fine, just a server crash and restart.
Any ideas?
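To track how fast the leak grows, the lsof output can be counted per process. Here is a minimal sketch that works on lsof text captured beforehand; the function name is made up, and the sample is just the two rows quoted above.

```python
from collections import Counter


def count_unidentified_sockets(lsof_output):
    """Count "can't identify protocol" rows per (command, pid) in lsof output."""
    counts = Counter()
    for line in lsof_output.splitlines():
        if "can't identify protocol" not in line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # not a regular lsof row
        counts[(fields[0], fields[1])] += 1
    return counts


sample = """\
mysqld 28843 mysql 1990u sock 0,4 2856488 can't identify protocol
mysqld 28843 mysql 1989u sock 0,4 2857220 can't identify protocol
"""
for (command, pid), n in count_unidentified_sockets(sample).items():
    print(command, pid, n)
```

Running this against a fresh lsof capture every few minutes would show whether the count really grows by about two per minute regardless of crawler activity.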

Mysql resource temporarily unavailable

I'm seeing a few of these errors during high load times:
mysql_connect() [function.mysql-connect]: [2002] Resource temporarily unavailable (trying to connect via unix:///var/lib/mysql/mysql.sock)
From what I can tell the mysql server isn't hitting its max connections limit, but there's something else stopping it from serving the query. What other limits would MySQL be hitting?
I'm running RHEL 6.2 64bit with MySQL 5.5.21
Let's assume your system is currently Unix-based (as given in your problem statement). If this is correct, here's the set of issues you may be running into:
You've run out of memory available to MySQL.
This is the most likely problem you're facing. Each connection in MySQL's connection pool requires memory to function, and if this resource is exhausted, no further connections can be made. Of course, the memory footprints and maximum packet sizes of various operations can be tuned in your equivalent to my.cnf if you discover this to be an issue.
Here's an additional thread that can help there, but you may also consider using simpler profiling tools like top to get a good ballpark estimate of what's going on.
You've run out of file descriptors available to your MySQL user account.
Another common issue: if you're trying to service requests that require file IO above the 1,024 boundary (by default), you will run into cases where the operation simply fails. This is because most systems specify a soft and hard limit on the number of open file descriptors each user can have available at one time, and walking over this threshold can cause problems.
This will usually have a series of glaringly obvious signs expressed in your log files. Check /var/log/messages and your comparable directories (for example, /var/log/mysql) to see if you can find anything interesting.
You've run into a livelock or deadlock scenario where your thread is unsatisfiable.
Corollary to memory and file descriptor exhaustion, threads can time out if you've overstepped the computational load your system is capable of handling. It won't throw this error message, but this is something to watch out for in the future.
Your system is running out of PIDs available to fork.
Another common scenario: fork only has so many PIDs available for its use at any given time. If your system is simply overforked, it will cease to be able to service requests.
The easiest check for this is to see if any other services can connect through to the machine. For example, trying to SSH into the box and discovering that you cannot is a big clue.
An upstream proxy or connection manager has run out of resources and ceased servicing requests.
If you have any service layer between your client and MySQL, it bears inspecting to see if it has crashed, hung, or otherwise become unstable. The advice above applies.
Your port mapper has exhausted itself after 65,536 connections.
Unlikely, but again, a possible exhaustion case. Checking the trivial service connection as above is, ehm, also the best port of call here.
In short: this is a resource exhaustion scenario, inclusive of the server simply being "down". You're going to have to profile your system further to see what you're blocking on. All the error message gives us in this case is the fact the resource is unavailable to the client -- we'd need to see more information about the server to determine a more adequate remedy.
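For the file descriptor and process limits mentioned above, here is a quick way to see the soft and hard values. This is a sketch using Python's standard resource module (Unix only), and the helper name is made up. Note that it reports the limits of the process that runs it, so it needs to run as the MySQL user to be meaningful; on Linux, /proc/<pid>/limits shows the values for the running mysqld itself.

```python
import resource


def describe_limit(name, limit_id):
    """Format the soft and hard value of one resource limit for this process."""
    soft, hard = resource.getrlimit(limit_id)

    def fmt(value):
        return "unlimited" if value == resource.RLIM_INFINITY else str(value)

    return "%s: soft=%s hard=%s" % (name, fmt(soft), fmt(hard))


print(describe_limit("open files", resource.RLIMIT_NOFILE))
# RLIMIT_NPROC is not defined on every platform, so check before using it.
if hasattr(resource, "RLIMIT_NPROC"):
    print(describe_limit("processes", resource.RLIMIT_NPROC))
```

A soft limit of 1024 open files on the MySQL account would be consistent with the second scenario in the list.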
I still haven't found which limits it was hitting, but I did manage to work around the problem. There was a problem with our session table (in vbulletin) which uses the MEMORY engine. The indexes for this table were HASH and thus when vbulletin purged this table once an hour it would lock the table just long enough to hold up other queries and push mysql to the limit of its resources.
Changing the indexes to BTREE allowed MySQL to delete the rows from the session table a lot quicker and avoid whatever limits were being reached previously. The errors only started when we upgraded our master db server to MySQL 5.5, so I'm guessing MEMORY tables are handled differently in the latest release.
See http://www.mysqlperformanceblog.com/2008/02/01/performance-gotcha-of-mysql-memory-tables/ for information on speed increases from using BTREE indexes over HASH for MEMORY tables.
Geez, this could be so many things. It could be that the socket buffer space is exhausted. It could be that mysql is not accepting connections as fast as they are coming in and the backlog limit is reached (though I'd expect that to give you a "Connection Refused" error, I don't know for sure that's what you'll get for a Unix domain socket). It could be any of the things @MrGomez pointed out.
Since you are running Apache and MySQL on the same server and this is a problem under high load, it could well be that Apache is starving the system of some resource and you're just not seeing (noticing?) the dropped/failed incoming connections/requests in your logs.
Are you using connection pooling? If not, I'd start there.
I'd also look for errors in the Apache logs and syslog around the same time as the mysql_connect error and see what else turns up. I'd especially recommend getting MySQL moved over to its own separate dedicated server.
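To illustrate what I mean by connection pooling: a fixed set of connections is opened once and handed out on demand, so a burst of requests waits for a free connection instead of opening new ones. Here is a minimal sketch of the idea; the class, the connect callable and the fake connection are all placeholders for illustration, not a real driver API.

```python
import queue


class ConnectionPool:
    """Hand out a fixed number of pre-opened connections."""

    def __init__(self, connect, size):
        # connect is any zero-argument callable that returns a new connection.
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())

    def acquire(self, timeout=None):
        # Blocks until a connection is free; raises queue.Empty on timeout.
        return self._idle.get(timeout=timeout)

    def release(self, conn):
        # Return a connection to the pool so another request can reuse it.
        self._idle.put(conn)


# Demonstration with a stand-in for a real database connection.
demo_opened = []


def demo_connect():
    demo_opened.append(object())
    return demo_opened[-1]


demo_pool = ConnectionPool(demo_connect, size=2)
conn = demo_pool.acquire(timeout=1)
demo_pool.release(conn)
print("connections opened:", len(demo_opened))
```

The pool size caps how many connections the web tier can ever hold open against MySQL, which is the point under high load.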
In my case, I was working with JSON data types with PDO (PHP Driver).
I was using fetch to retrieve one item but forgot to add LIMIT 1 to the query. Adding it solved the problem.