I've desperately tried to figure out what's happened here, but haven't seen this particular problem anywhere. I've 'inherited' (as in, not built any of it myself) management of a database server (remote, in a data warehouse, accessed by ssh) where some php daemons are running on a Linux server acting as data crawlers, inserting and processing information in a relatively steady stream into mysql.
A couple of days ago, the server crashed and came back up. I logged in and restarted the mysql server and the crawlers, thinking no more of it. A day and a half later, the mysql server stopped working, and I couldn't diagnose it since I couldn't log into it, nor did it respond to "/etc/init.d/mysql stop" or variations thereof. According to the log file, it kept throwing errors very regularly (once every four minutes and 16 seconds), saying that it had too many file handles open. When I shut down the crawlers, however, I could log in again, but mysql kept throwing the errors. I checked lsof and it showed a lot of open sockets with the "can't identify protocol" error:
mysqld 28843 mysql 1990u sock 0,4 2856488 can't identify protocol
mysqld 28843 mysql 1989u sock 0,4 2857220 can't identify protocol
^Thousands of these rows
I thought it was something the crawlers had done, so I restarted mysql and the failed sockets disappeared. But I was surprised to see that mysql kept opening new ones, even when the crawlers weren't running. It did this very regularly, about two new failed sockets a minute, regardless of whether the crawlers were active or not. I increased the maximum number of file handles allowed for mysql to buy some time, but I'm obviously looking for a diagnosis and a permanent solution.
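One rough way to track how fast the leak grows is to count the unidentified sockets per process in saved lsof output. The sketch below assumes the column layout shown in the rows above (command first, PID second); the function name and the sample text are illustrative only.

```python
from collections import Counter


def count_unidentified_sockets(lsof_output: str) -> Counter:
    """Count "can't identify protocol" sockets per (command, pid)."""
    counts = Counter()
    for line in lsof_output.splitlines():
        if "can't identify protocol" not in line:
            continue
        fields = line.split()
        # Assumption: lsof prints COMMAND and PID as the first two columns.
        if len(fields) >= 2:
            counts[(fields[0], fields[1])] += 1
    return counts


# Sample input copied from the two rows quoted in the question.
sample = """\
mysqld 28843 mysql 1990u sock 0,4 2856488 can't identify protocol
mysqld 28843 mysql 1989u sock 0,4 2857220 can't identify protocol
"""

print(count_unidentified_sockets(sample))
```

Running this against lsof output captured a few minutes apart should show whether the count really climbs by about two per minute while the crawlers are stopped.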
All descriptions of such errors (socket leaks) that I've found on forums seem to be about your own software leaking by not closing its sockets. But here it seems to be mysql itself that does it, and there has been no change in any of the code since it worked fine, just a server crash and restart.
Any ideas?
My development system has suddenly been afflicted with this weird problem where every single SQL script takes exactly 31 seconds to execute on my Classic ASP site's connection to a mySQL (MariaDB) database.
Whether I connect to a local copy of the DB running on my system or to my live DB at a web host, it is all the same.
Everything from a simple
adoconn.Execute("SELECT * FROM users;")
or even
adoconn.Execute("SET sql_mode=''")
would take 31 seconds to execute. Each!
I can safely rule out any problems with the DB as connecting to it and running scripts from DBeaver shows no problems at all. The results come back instantly.
I can also rule out network problems as the local DB and the hosted DB have the same results and I have used WireShark to confirm that the MySQL packets are being responded to almost immediately from the hosted DB.
Stepping through my ASP code in the debugger, everything runs fine right up until the .Execute() call, at which point it takes 31 seconds, regardless of how complex the script is.
The strangest thing is that this problem came out of the blue, while my system was powered down, disconnected and untouched over the weekend. No updates, installations or changes were made to the system. On Friday I was doing my dev work perfectly fine. But on Monday morning, when I powered it back up, the DB connections were stuffed.
I've already tried configuring mySQL to use the "skip-name-resolve" and "bind-address = ::" settings.
I have tried rebuilding my IIS websites and reinstalling IIS itself.
I've also reinstalled mySQL ODBC drivers on my system to no avail.
What is going on here?
As it turns out, the cause of this whole issue was the McAfee software that came pre-installed in my Dell laptop.
No, I did disable the firewall and antivirus, mind you.
Those were the first steps I did and triple-checked routinely during my testing. Both McAfee's firewall and auto-protection were all fully disabled.
But apparently McAfee ignores this setting and was screwing up my DB connections over ODBC.
This problem finally only came to an end when I fully uninstalled this McAfee malware. There's no other way to describe it.
Let this post be a warning to anyone else naively believing this malware to be anything else.
My server got stuck last night because of a database connection error.
I found that it was caused by too many database connections. After some research on Google and Stack Overflow, I didn't find any useful information. While I am going through all the plugins one by one to see whether any of them has a bug that caused this, I would like to ask for your help.
First of all, when I log in to MySQL I can see a lot of SLEEP queries with NULL info. I tried to kill all the sleeping queries from the command line, but more requests fill up all the connections right away.
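Note that "Sleep" in the MySQL process list is the Command value of an idle connection waiting for its client to send something, not a SLEEP() call inside a query, which is also why the Info column is NULL. A minimal sketch of building KILL statements for long-idle connections follows. It assumes each row is a dict with the Id, Command and Time columns of SHOW PROCESSLIST; the function name, the 60-second threshold and the sample rows are hypothetical.

```python
def kill_statements(processlist, min_idle_seconds=60):
    """Return KILL statements for connections idle at least min_idle_seconds."""
    statements = []
    for row in processlist:
        # "Sleep" marks an idle connection; Time is how long it has been idle.
        if row["Command"] == "Sleep" and row["Time"] >= min_idle_seconds:
            statements.append(f"KILL {int(row['Id'])};")
    return statements


# Hypothetical rows shaped like SHOW PROCESSLIST output.
rows = [
    {"Id": 101, "Command": "Sleep", "Time": 300, "Info": None},
    {"Id": 102, "Command": "Query", "Time": 2, "Info": "SELECT 1"},
    {"Id": 103, "Command": "Sleep", "Time": 5, "Info": None},
]

print(kill_statements(rows))
```

Killing idle connections only treats the symptom: if the slots refill immediately, something on the web server is still opening connections and holding them. On RDS, the master user may need the mysql.rds_kill procedure instead of a plain KILL, which is worth checking against the RDS documentation.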
The weird thing is that the Apache server is not actually getting a high volume of requests. I am using AWS RDS as my database server, so Apache and MySQL are not on the same server. The RDS server doesn't have public access, so I am sure all requests come only from my Apache server. The CPU usage on the Apache server is not high. I also searched Apache's access_log, and there were not many requests at that time. I cannot find anything wrong with these requests; in particular, none of them is performing an injection attack. I thought something in the code might have triggered this, so I searched all my code for 'SLEEP', but only found it in the W3 Total Cache plugin, in code blocks that are not easily reached. I turned off XML-RPC at the Apache level, so it shouldn't be an XML-RPC attack.
I know there are a lot of possibilities, since I am using about twenty plugins on my site, but it is really weird that I cannot find any requests that could have caused this at the Apache level. Is it possible for requests to hit the server without being recorded in access_log?
I am pretty new to configuring Apache and MySQL on my own and am still learning these features. Thanks in advance for helping me!
I have a problem where a MySQL thread sometimes gets stuck in the "Writing to net" state.
I have 4 Apache servers (2.4), with requests load-balanced across them, and 1 MySQL server (MariaDB 10). Apache is running PHP 5.6 (php56). All Apache servers have the same configuration. All servers run CentOS 7. SELinux is disabled on the Apache servers for debugging reasons. There are no problems in the audit logs on the DB server. All servers are virtual and located on the same cluster (VMware).
The problem appears only on specific pages and with specific queries to the DB.
Usually there are around 100-200 separate queries per page, and most of them take 0.0001-0.0010 s. But then I have one query that takes around 1-2 s. The query itself takes much less time (around 0.0045 s).
The problematic query returns around 8984 rows, and when executed from the CLI in a debug script, it runs as fast as expected.
The strange thing is that at any given time some Apache servers execute that page quickly and some slowly, and this changes during the day. I also tried removing one Apache server from the cluster and then sending it the same request. If the server is not under any load, it usually responds fast.
All servers have enough resources (CPU and RAM), so it is definitely not a load issue. They usually have around 4-10 active Apache workers (prefork) and have capacity for 100 active workers.
I tried debugging with tcpdump, and when requesting the page I can see the packet flow for the fast queries; then it stops for a while and resumes. I am not sure whether the problem is on the MySQL server or on the Apache server.
My guess is that I am hitting some kind of limit, but I have no idea which one.
The solution is quite odd.
First few more details:
All Apache servers have the same application data (PHP files, images, etc.) mounted from NFS. The NFS share was working fine (low latency, no data corruption).
Solution:
When I was desperate, I went through every possible log. Then I noticed that iptables was dropping some packets from the NFS server. I said to myself that I should probably fix that, even if it's not related.
But after I allowed all traffic from the NFS server to my Apache servers, the MySQL "Writing to net" status disappeared and all websites started to respond quickly.
I am new to dealing with MySQL settings and admin-type issues. About 4-5 hours ago, I had two power outages within 30 minutes of each other. As a result, my computer shut down both times, in the middle of what I can only assume was around 20-30 commands running on MySQL at the time. After the first outage, MySQL was unaffected. But after the second, something happened: MySQL Server cannot stay up for more than a few seconds at a time (before the outage, this was not a problem). I am running MySQL Server 5.1.
I can manually start MySQL Server from the admin command line (I am running this on Windows): net start mysql. I get a message saying "The MySQL service was started successfully". Then I run one or (at most) two commands, and then everything stops working again with error 2013, "Lost connection to MySQL server during query". Then I have to restart MySQL Server all over again.
I have some important data in the database which I cannot reach because the connection times out before I can get it out. Is there a way I can fix this connection problem easily? I know my data is in there, because I have gotten a fair amount of it out.
Any help would be appreciated. Please let me know what other information you might need, and how I can get it. I have been trying to find the error log for mysql, and have not found it yet.
And, yes, if I get through this, and even if I don't, I will make sure to create a system to update the data on a regular basis so these types of failures aren't so catastrophic in the future.
Thanks in advance
I'm seeing a few of these errors during high load times:
mysql_connect() [function.mysql-connect]: [2002] Resource temporarily unavailable (trying to connect via unix:///var/lib/mysql/mysql.sock)
From what I can tell the mysql server isn't hitting its max connections limit, but there's something else stopping it from serving the query. What other limits would MySQL be hitting?
I'm running RHEL 6.2 64bit with MySQL 5.5.21
Let's assume your system is currently Unix-based (as given in your problem statement). If this is correct, here's the set of issues you may be running into:
You've run out of memory available to MySQL.
This is the most likely problem you're facing. Each connection in MySQL's connection pool requires memory to function, and if this resource is exhausted, no further connections can be made. Of course, the memory footprints and maximum packet sizes of various operations can be tuned in your equivalent to my.cnf if you discover this to be an issue.
Here's an additional thread that can help there, but you may also consider using simpler profiling tools like top to get a good ballpark estimate of what's going on.
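To make the memory argument concrete, a common back-of-the-envelope estimate adds the global buffers to the per-connection buffers multiplied by max_connections. The sketch below uses real MySQL variable names, but every value in it is a made-up example, not a recommendation, and the result is a worst-case ceiling, since not every connection allocates every buffer at once.

```python
MIB = 1024 * 1024


def estimate_max_memory(global_buffers, per_connection_buffers, max_connections):
    """Worst-case estimate: globals + max_connections * per-connection buffers."""
    return sum(global_buffers.values()) + max_connections * sum(
        per_connection_buffers.values()
    )


# Hypothetical settings; read the real ones from my.cnf or SHOW VARIABLES.
global_buffers = {
    "innodb_buffer_pool_size": 1024 * MIB,
    "key_buffer_size": 64 * MIB,
}
per_connection_buffers = {
    "sort_buffer_size": 2 * MIB,
    "read_buffer_size": 1 * MIB,
    "read_rnd_buffer_size": 1 * MIB,
    "join_buffer_size": 1 * MIB,
}

total = estimate_max_memory(global_buffers, per_connection_buffers, max_connections=200)
print(f"worst case: {total // MIB} MiB")
```

If the estimate is well above the physical RAM on the box, lowering max_connections or the per-connection buffers is the usual first lever.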
You've run out of file descriptors available to your MySQL user account.
Another common issue: if you're trying to service requests that require file IO above the 1,024 boundary (by default), you will run into cases where the operation simply fails. This is because most systems specify a soft and hard limit on the number of open file descriptors each user can have available at one time, and walking over this threshold can cause problems.
This will usually have a series of glaringly obvious signs expressed in your log files. Check /var/log/messages and your comparable directories (for example, /var/log/mysql) to see if you can find anything interesting.
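The soft and hard descriptor limits mentioned above can be inspected from Python with the standard resource module. This is a minimal sketch that reports the limits of the process running it, not of mysqld itself; the helper names are illustrative, and the /proc lookup assumes Linux.

```python
import os
import resource


def open_file_limits():
    """Return the (soft, hard) RLIMIT_NOFILE limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def count_open_fds(pid="self"):
    """Count open descriptors via /proc (Linux only); None if unavailable."""
    fd_dir = f"/proc/{pid}/fd"
    if not os.path.isdir(fd_dir):
        return None
    return len(os.listdir(fd_dir))


soft, hard = open_file_limits()
print(f"soft limit: {soft}, hard limit: {hard}, open now: {count_open_fds()}")
```

To check the server rather than your shell, pass the mysqld PID to count_open_fds (this typically needs root or the mysql user) and read /proc/<pid>/limits, since a daemon's limits can differ from those of an interactive session.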
You've run into a livelock or deadlock scenario where your thread is unsatisfiable.
Corollary to memory and file descriptor exhaustion, threads can time out if you've overstepped the computational load your system is capable of handling. It won't throw this error message, but this is something to watch out for in the future.
Your system is running out of PIDs available to fork.
Another common scenario: fork only has so many PIDs available for its use at any given time. If your system is simply overforked, it will cease to be able to service requests.
The easiest check for this is to see if any other services can connect through to the machine. For example, trying to SSH into the box and discovering that you cannot is a big clue.
An upstream proxy or connection manager has run out of resources and ceased servicing requests.
If you have any service layer between your client and MySQL, it bears inspecting to see if it has crashed, hung, or otherwise become unstable. The advice above applies.
Your port mapper has exhausted itself after 65,536 connections.
Unlikely, but again, a possible exhaustion case. Checking the trivial service connection as above is, ehm, also the best port of call here.
In short: this is a resource exhaustion scenario, inclusive of the server simply being "down". You're going to have to profile your system further to see what you're blocking on. All the error message gives us in this case is the fact the resource is unavailable to the client -- we'd need to see more information about the server to determine a more adequate remedy.
I still haven't found which limits it was hitting, but I did manage to work around the problem. There was a problem with our session table (in vbulletin) which uses the MEMORY engine. The indexes for this table were HASH and thus when vbulletin purged this table once an hour it would lock the table just long enough to hold up other queries and push mysql to the limit of its resources.
By changing the indexes to BTREE this allowed MySQL to delete the rows from the session table a lot quicker and avoid any limits there were reached previously. The errors only started when we upgraded our master db server to MySQL 5.5, so I'm guessing MEMORY tables are handled differently in the latest release.
See http://www.mysqlperformanceblog.com/2008/02/01/performance-gotcha-of-mysql-memory-tables/ for information on speed increases from using BTREE indexes over HASH for MEMORY tables.
Geez, this could be so many things. It could be that the socket buffer space is exhausted. It could be that mysql is not accepting connections as fast as they are coming in and the backlog limit is reached (though I'd expect that to give you a "Connection Refused" error, I don't know for sure that's what you'll get for a Unix domain socket). It could be any of the things @MrGomez pointed out.
Since you are running Apache and MySQL on the same server and this is a problem under high load, it could well be that Apache is starving the system of some resource and you're just not seeing (noticing?) the dropped/failed incoming connections/requests in your logs.
Are you using connection pooling? If not, I'd start there.
I'd also look for errors in the Apache logs and syslog around the same time as the mysql_connect error and see what else turns up. I'd especially recommend getting MySQL moved over to its own separate dedicated server.
In my case, I was working with JSON data types with PDO (PHP Driver).
I was using fetch to retrieve one item but forgot to add LIMIT 1 to the query. Adding it solved the problem.
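The same pattern can be sketched in Python, with the standard sqlite3 module standing in for MySQL and PDO (an assumption made purely so the example is self-contained; the table and column names are hypothetical). Without LIMIT 1, a client/server database may prepare and send the whole result set even though only one row is read.

```python
import sqlite3

# In-memory database with a hypothetical table holding JSON-like payloads.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO items (payload) VALUES (?)",
    [('{"n": %d}' % i,) for i in range(1000)],
)


def fetch_first_item(connection):
    """Fetch a single row; LIMIT 1 tells the server to stop after one row."""
    cursor = connection.execute(
        "SELECT id, payload FROM items ORDER BY id LIMIT 1"
    )
    return cursor.fetchone()


row = fetch_first_item(conn)
print(row)
```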