Fatal errors in live servers - language-agnostic

I'm writing some client/server software and I'm facing the following design issue. Normally, I use a VERIFY macro very liberally: if something is wrong on a user's machine, I want the software to fail and log the error so it can be fixed. I have never been a fan of ignoring errors of any kind.
However, I'm now writing a server. If the server dies, many clients go down, so the server should die as little as possible. Therefore, I don't know how to treat some conditions that I'd treat as fatal exceptions otherwise.
For example, I get a network packet from a user who isn't logged in. Even though it shouldn't happen, I have enough experience to know that "impossible" errors do happen from time to time. So I'm pretty sure that if I raise a fatal error in these cases, the server WILL crash eventually. On the other hand, I could log the error, ignore it and continue, but I'm afraid some bugs may go undetected this way.
What would you do in a situation like this one?

If you can recover from the error, then obviously it wasn't fatal. I can't see the benefit of failing if you can log the error and continue execution; the most important thing is that you've captured the error in the log. If you can recover and continue to operate as normal, then that is the best course.
In addition, you should implement a notification system (server monitoring) that, depending on the error level, notifies you with varying degrees of urgency, so that you pick up on anything time-critical as soon as possible. There are generic systems like that for servers, such as Nagios and Munin. You should have a look at what they do and see whether you can take something from them and integrate it into your system.
Regardless, you should try to make sure client instances are as sandboxed as possible. A client thread going down shouldn't take down the entire server - ever (at least in theory).
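To make the "log and continue" idea concrete, here is a minimal sketch in Python. It assumes a made-up session dictionary with `logged_in` and `peer` keys, and the names `ProtocolError`, `handle_packet` and `dispatch` are hypothetical; none of this comes from the asker's code. The point is only that a failure while handling one client's packet is logged and counted, not allowed to take the whole server down.

```python
import logging

logger = logging.getLogger("server")


class ProtocolError(Exception):
    """Raised when a client sends something the protocol forbids."""


def handle_packet(session, packet):
    # Hypothetical rule taken from the question: a packet from a user
    # who is not logged in is an "impossible" condition.
    if not session.get("logged_in"):
        raise ProtocolError("packet from user who is not logged in")
    return "ok"


def dispatch(session, packet, stats):
    """Process one packet; any failure stays confined to this client."""
    try:
        return handle_packet(session, packet)
    except ProtocolError as exc:
        # Expected kind of bad input: log it, count it, drop the packet.
        stats["protocol_errors"] = stats.get("protocol_errors", 0) + 1
        logger.error("dropping packet from %s: %s", session.get("peer"), exc)
        return None
    except Exception:
        # Genuinely unexpected bug: log with stack trace, keep serving others.
        stats["unexpected_errors"] = stats.get("unexpected_errors", 0) + 1
        logger.exception("unexpected error for %s", session.get("peer"))
        return None
```

The counters in `stats` are what a monitoring system could poll, so that a rising error count gets noticed even though the server keeps running.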

Related

Is there a way to keep track of the calls being done in mysql server by a web app?

I'm finishing a system at work that makes calls to a MySQL server. The arguments of those calls reveal information that I need to keep private, like vote(idUser, idCandidate). Of course, nothing in the db relates those two, nor does anything in "the visible part" of the back end. I think it can't be done, but I want to make sure that it is impossible to trace this sort of call through a log or something similar (calls that were made, or calls being made at the moment) while the system is in production and being used, just as it is impossible in most languages unless you specifically "debug" in a certain way. I hope the question is clear enough. Thanks.
How do I log thee? Let me count the ways.
MySQL query log. I can enable this per-session and send everything to a log file.
I can set up a slave server and have insertions sent to me by the master. This is a significant intervention and would leave a wide trace.
On the server, unbeknownst to both the Web app and MySQL's own logging, I can intercept the communication between the app and MySQL. I need administrative access to the machine, of course.
On the server, again with administrative access, I can both log the query calls and inject logging instrumentation into the SQL interface (the legitimate one is the MySQL Audit Plugin, but there are several alternatives, developed for various purposes by developers over the years).
What can you do? You can have the applications use a secure protocol, just for starters.
Then, you need to secure your machine so that administrator tricks do not work and, even if logging is activated, nobody can read the logs. You should also be notified of any new or modified file so that you can delete it promptly.

what exceptions can occur with mysql statements?

I'm developing a website in PHP and CodeIgniter with three colleagues; we're using a MySQL database.
I know that an insert can throw an exception due to a constraint violation, and connecting to the server can raise an exception too if the server is busy.
Now, what other exceptions might occur? I tried searching the web and was surprised not to find what I wanted. My web app is a link-sharing website with tags, votes, flags, comments, and search (by title and tags, no advanced search yet).
PS
Obviously we're not going to handle errors (like a bad sector), so exceptions are what we want here.
Other common errors are:
The various PHP-generated catchable fatal errors. See http://php.net/manual/en/errorfunc.constants.php
PHP's out-of-memory error, which you cannot catch.
PHP's maximum execution time error, which you also cannot catch.
All sorts of MySQL errors.
Many web application software developers create a last-chance error handler. It logs the error message and any available stack trace to a log file and presents a "sorry, that didn't work" page to the user.
As you might guess, it's best not to use MySQL to log errors, because if it's MySQL failing, it won't work.
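As a rough illustration of such a last-chance handler, here is a minimal sketch. It is written in Python rather than PHP, and the names `last_chance`, `SORRY_PAGE` and the `handler(request)` shape are assumptions made for the example, not anything from CodeIgniter. It logs to a plain stream (a file in practice) rather than to the database, for the reason given above, and it only covers errors that can be caught at all.

```python
import traceback
from datetime import datetime, timezone

SORRY_PAGE = "Sorry, that didn't work."


def last_chance(handler, request, log_stream):
    """Run handler(request); on any uncaught exception, log it to
    log_stream and return a generic error page instead."""
    try:
        return handler(request)
    except Exception:
        stamp = datetime.now(timezone.utc).isoformat()
        # Write the message and whatever stack trace is available.
        log_stream.write(f"{stamp} unhandled error for {request!r}\n")
        log_stream.write(traceback.format_exc())
        log_stream.flush()
        return SORRY_PAGE
```

In a real application `log_stream` would be something like a file opened in append mode, and the wrapper would sit at the outermost entry point so that nothing above it can throw.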

lost connection mysql in C

I've written a C program that runs multiple threads and uses MySQL. After some testing I repeatedly saw the error "MySQL server has gone away" (with hours between occurrences), so I raised MySQL's wait_timeout setting to its maximum. But now I get the error "Lost connection to MySQL server during query". These errors only occurred when I ran the program on a multi-core processor.
Maybe you guys know what's wrong or what I have to do to run my threaded program?
If you've got a multithreaded program that behaves differently on a 1-core system and a multicore system (works on 1-core and has bugs on multicore), it's written incorrectly: that's a sure sign of a race condition. It means the code is actually incorrect, and if scheduled just wrong will trample on its own data, and this is actually happening in practice on the multicore system and not on the 1-core system.
Actually, the same problem could happen on the 1-core system too, it's just less likely and more rare because the threads can't be scheduled truly simultaneously, so one thread has to preempt the other at just the wrong time, for you to see the buggy behavior. This is why if you're writing multithreaded code, you should always test and debug it on a multicore host. You're much more likely to actually see the evidence of race conditions; running on a 1-core host they can remain hidden for much longer.
I don't know what libraries you're using, but they don't look thread-safe or you're not using them in a thread-safe fashion.
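One common way to use a database client in a thread-safe fashion is to give every thread its own connection, so that no handle is ever touched by two threads at once. The sketch below shows that pattern in Python rather than C, and `FakeConnection` is a stand-in invented for the example, not a real MySQL API. I don't know whether this is what is wrong in the asker's program; it only illustrates the idea.

```python
import threading


class FakeConnection:
    """Stand-in for a real database handle (hypothetical, not a MySQL API)."""

    def __init__(self):
        self.queries = 0

    def query(self, sql):
        # Not safe to call from two threads at once, like many real handles.
        self.queries += 1
        return sql


def run(thread_count, queries_per_thread):
    """Run worker threads that each use their own private connection.
    Returns the number of queries issued on each connection."""
    local = threading.local()      # per-thread storage
    connections = []               # registry, only for inspection afterwards
    registry_lock = threading.Lock()

    def get_connection():
        conn = getattr(local, "conn", None)
        if conn is None:
            conn = FakeConnection()
            local.conn = conn
            with registry_lock:    # the shared list itself needs a lock
                connections.append(conn)
        return conn

    def worker():
        for _ in range(queries_per_thread):
            get_connection().query("SELECT 1")

    threads = [threading.Thread(target=worker) for _ in range(thread_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [c.queries for c in connections]
```

The alternative is to share one connection and wrap every use of it in a mutex, which is simpler but serializes all database work.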

What causes mysterious hanging threads in ColdFusion -> MySQL communication

One of the more interesting "features" in ColdFusion is how it handles external requests. The gist is that when a query is made to an external source through <cfquery> or any other external request of that kind, CF passes the request on to a specific driver, and from that point CF itself is unable to suspend it. Even if a timeout is specified on the query or in the cfsetting, it is flatly ignored for all external requests.
http://www.coldfusionmuse.com/index.cfm/2009/6/9/killing.threads
So with that in mind, the issue we've run into is that the communication between our CF server and our MySQL server sometimes goes awry and leaves behind hung threads. They have the following characteristics:
The hung thread shows up in CF and cannot be killed from FusionReactor.
There is no hung thread visible in MySQL, and no active running query (just the usual sleeps).
The database is responding to other calls and appears to be operating correctly.
Max connections have not been reached for the DB nor the user.
It seems to me the only likely candidate is that CF is making a request and MySQL is responding to it, but with an answer that CF ignores, so CF keeps the thread open waiting for a response from MySQL. That would explain why the database shows no signs of problems while CF keeps a thread open waiting for the mysterious answer.
Usually these hung threads appear randomly on otherwise working scripts (such as posting a comment on a news article). Even while one thread is hung for that script, other requests for that script will go through, which would imply that the script isn't necessarily at fault, but rather the condition faced when the script was executed.
We ran some tests to determine that it was not a MySQL-generated max_connections error: we created a user, gave it a limit of 1 connection, tied up that connection with a sleep(1000) query, and executed another query. Unfortunately, it correctly errored out without generating a hung thread.
So, I'm left at this point with absolutely no clue what is going wrong. Is there some other connection limit or timeout which could be causing the communication between the servers to go awry?
One of the things you should start to look at is the hardware between the two servers. It is possible that you have a router, bridge or NIC that is dropping occasional packets. This can result in the MySQL box thinking it has completed the task while the CF server sits there waiting indefinitely for a complete response, creating a hung thread.
3com has some details on testing for packet loss here: http://support.3com.com/infodeli/tools/netmgt/tncsunix/product/091500/c11ploss.htm#22128
We had a similar problem with a MS SQL server. There, the root cause was a known issue in which, for some reason, the server thinks it's shutting down, and the thread hangs (even though the server is, obviously, not shutting down).
We weren't able to eliminate the problem, but were able to reduce it by turning off pooled DB connections and fiddling with the connection refresh rate. (I think I got that label right -- no access to administrator at my new employment.) Both are in the connection properties in Administrator.
Just a note: The problem isn't entirely with CF. The problem, apparently, affects all Java apps. Which does not, in any way, reduce how annoyed I get by this.
Long story short, I believe the cause was ColdFusion's CF8 image processing. It was just buggy, and in CF9 I have never seen that problem again.

debugging Error establishing mySQL database connection under extreme load

Under high traffic, my MySQL 5.0.45 server (Apache2, CentOS 5) is getting "Error establishing mySQL database connection". I need to find the root cause.
I would very much appreciate any pointer to information about the procedure I should follow to find the cause (memory limit, thread limits, CPU load, slow queries, large dataset, wrong keys, etc.). I assume it involves looking at the relevant log files.
Thank you.
That particular error message sounds like it's being generated by your application, and not by a system library. MySQL has functionality to report the specific errors that are occurring, so your best bet would be to utilize that in some way.
For instance, if you were using PHP, there is a function called mysql_error() that returns specifics about the last error encountered (too many connections, etc). You would put in some error handling near your connection call, and log the mysql_error() results if it failed.
You didn't mention what language you were using, but the MySQL libraries would provide the same functionality to whichever you are using. I'd suggest modifying your application code to take advantage of it.
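Since the language wasn't mentioned, here is a language-neutral sketch of that advice, written in Python. The `connect` argument stands for whatever connect function your MySQL library provides, and `connect_or_log` is a hypothetical helper name; both are assumptions for the example. The idea is simply to log the driver's own error text next to the connection call instead of a generic "Error establishing connection" message.

```python
import logging

logger = logging.getLogger("db")


def connect_or_log(connect, **params):
    """Call connect(**params). On failure, log the driver's own error
    message (for example "Too many connections") and return None."""
    try:
        return connect(**params)
    except Exception as exc:
        # Keep credentials out of the log file.
        safe = {k: v for k, v in params.items() if k != "password"}
        logger.error(
            "connect failed (%s): %s | params=%s",
            type(exc).__name__, exc, safe,
        )
        return None
```

With the specific message in the log, it becomes much easier to tell a connection limit from a timeout or an authentication problem.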
I'm willing to bet this is because you're hitting the maximum user connection limit allowed by the MySQL server. In general, though, do print the MySQL errors, if not to the screen then at least to a log or an email.