Resumable exceptions: any real-life scenario? - exception

As the title says, I can hardly imagine where and when to use resumable exceptions in a real-life scenario, or what practical advantage we would gain from them.
What I can imagine is this: a subsystem is connected, let's say via RFC, to a session that is held open. The subsystem has to pass some shop-floor data to SAP in the usual way, at the frequency of every weight/piece/liter that is processed.
Somehow something fails.
I can get all of this done without resumable exceptions. So, apart from the fact that this kind of exception seems to keep track of the entire context (which does not seem to be a NEW feature), does anybody have a clue what this is really all about?

A non-resumable exception is an error of the kind "Something went wrong here, and I can't continue running the program as desired any more. TILT." The caller just has to deal with that.
A resumable exception still tells the caller that something went wrong, but it leaves the decision whether the program can continue to the caller. I expect there to be only a few scenarios where this might be useful. Mass updates might be one scenario: "You wanted me to update both the material price and the text; I've changed the price, but the text in language ZH doesn't exist. I don't know whether you'd like to abort the operation entirely (RETURN) or keep the updated price and disregard the missing text (RESUME)."
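To make the mass-update scenario a bit more concrete, here is a rough sketch in Python. Python has no resumable exceptions, so this swaps in a different technique: the callee asks a caller-supplied callback whether to resume or abort. All names (`update_material`, `Decision`, `on_problem`, the dict layout of a material) are made up for illustration.

```python
from enum import Enum


class Decision(Enum):
    RESUME = "resume"  # keep what was done so far and carry on
    ABORT = "abort"    # undo everything and give up


class UpdateAborted(Exception):
    """Raised when the caller decides the update must not continue."""


def update_material(material, price, texts, on_problem):
    """Update price and texts; let the caller decide what happens on a missing text.

    `material` is assumed to be a dict with 'price' and 'texts' keys.
    `on_problem` receives a message and returns a Decision.
    """
    original = {"price": material["price"], "texts": dict(material["texts"])}
    material["price"] = price
    for lang, text in texts.items():
        if lang not in material["texts"]:
            decision = on_problem(f"text in language {lang} doesn't exist")
            if decision is Decision.ABORT:
                # Roll back, then unwind like an ordinary exception.
                material["price"] = original["price"]
                material["texts"] = original["texts"]
                raise UpdateAborted(lang)
            continue  # RESUME: disregard the missing text and go on
        material["texts"][lang] = text
    return material
```

The point of the sketch is only that the callee reports the problem but does not decide: a caller passing `lambda msg: Decision.RESUME` keeps the new price, while one returning `Decision.ABORT` gets the original state back plus an exception.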

Related

Judging whether an exception is exceptional

It's a pretty popular and well known phrase that you should "only catch/throw exceptions which are exceptional". However, how is an "exceptional" exception determined?
For example, a bad password is very routine in logging into a service, so this is not exceptional. Statistics for a web app would probably show something like one bad login attempt for every 5 attempts (from no specific user). Likewise, attempting to go to a checkout with an empty basket in an online store could be very common (especially for new users). However, a file not found could go either way. I usually work along the lines that if a method is missing something it needs to do its work, it should throw an exception, but then it gets a little confusing here. In some cases, a file not found could be common (e.g. a file share used by many users with no tight controls), compared to a very locked-down production environment missing a file, which would be exceptional.
Is this the right way to judge whether an exception is exceptional or not? I can easily classify things like no network connection as exceptional, but some cases are hard to judge. Is it subjective?
Thanks
I think it's pretty subjective, honestly, so I prefer to avoid that method of figuring out when I should use exceptions.
Instead, I prefer to consider three things:
1) Is it likely that I might want to let the call stack unwind more than one level?
2) Is there another way? (Return null or an error code, etc.) If so, do I have even the slightest performance concern?
3) If neither of those leads to a clear decision, which is easier to read for someone who has to maintain the code?
If #1 is true, and I don't have a MAJOR performance concern, I will probably opt to use exceptions because it will speed up my development time not to have to code return codes (and manually code the logic to have them propagate up the call stack if needed). When you use exceptions, call stack unwinding is free of charge for development time.
If #2 is true, and either I'm not going more than one frame (maybe two?) up the call stack or I have a serious performance concern (in a tight loop, for example), then I'll try really hard to find another way that doesn't involve exceptions.
Exceptions are only a tool for programmers in a language which supports them. I don't believe they have to have any intrinsic value as to what is "exceptional" or not. Instead, I say use them when they are the best tool for the job.
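A minimal sketch of the "call stack unwinding is free" point, in Python. The function names and the settings dict are invented for the example; the only thing to notice is that the two middle layers contain no error-handling code at all.

```python
class ConfigError(Exception):
    """Raised when a required setting is missing."""


def read_setting(settings, key):
    if key not in settings:
        raise ConfigError(f"missing setting: {key}")
    return settings[key]


def build_connection_string(settings):
    # No error handling here: a ConfigError simply passes through.
    host = read_setting(settings, "host")
    port = read_setting(settings, "port")
    return f"{host}:{port}"


def start_service(settings):
    # Nor here. With return codes, both layers would need explicit checks.
    return "connecting to " + build_connection_string(settings)


def main(settings):
    # The only place that knows what to do about a bad configuration.
    try:
        return start_service(settings)
    except ConfigError as exc:
        return f"startup failed: {exc}"
```

With return codes, `build_connection_string` and `start_service` would each have to check and forward the failure by hand. That is the development-time cost the answer is talking about.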

What is the general consensus on user-error correction for web apps?

I'm building a RoR site, and today I got the pagination done. When I showed it to my coworker, his first question was what happens if you set the querystring to "?page=-1". It died with a runtime exception (error 500). He suggested that this should definitely be fixed before the site goes anywhere near live.
I happen to disagree with him (hear me out). Now, I've been in the web dev business for all of four months, so I very well could be wrong. But I would think that this isn't a big deal. I would think that, so long as said errors do not constitute a security risk, things like this shouldn't be a priority. The only way to cause this error is if you manually edit the query string, and, well, garbage in garbage out. If you're smart enough to know that you even can edit the querystring, you should be smart enough to not give it a negative number.
What is the general consensus on things like this? Do you completely idiot proof the site, so that no matter what the query string is, you never generate an error? Do you let things slide so long as it works the way it's supposed to (and doesn't expose a security risk)? Somewhere in the middle?
EDIT: Somehow my question didn't really come out completely as I intended it. The crux of my question was where to draw the line between proactively correcting for things versus not doing so. If there's invalid input in the GET string, for instance, would it be better practice to display a tasteful error as suggested in the posted replies, or to try to figure out what the user was doing, and do that? Or, as a more concrete example: if a user sets page=-1 in the GET string, would it be better to silently assume they meant page=0, or to display some kind of tasteful error page saying something like "invalid page specified"?
You should be error checking anything that comes in from the query string. If you get an invalid page number, you should have an error message that's a little more graceful than the Error 500 page. Maybe a sorry, bad request. Try this: <possible suggestions>. It's just plain sloppy and unprofessional to knowingly and deliberately leave an easily accessible error like that on a live site.
You say you're new to web apps, but if your previous dev experience was other GUI apps being used by the "general public" (non-developers, non-techies), would it have been OK to have stack traces thrown into the user's face as the app falls apart around them? In my experience, this is never really acceptable.
You make some good points, but an incorrect query string can have many reasons. For example, a link to a record that has since been deleted. Or a Google result pointing to a page that doesn't exist in the current result set any more.
In these cases, you should show the user something a bit more verbose than a 500 error.
If you have an error-page that looks nice, and gives a polite message, I'd say it's fine. Though I might consider responding with a 404 instead. Garbage in should preferably not produce an error.
I don't think a 500 error page is very meaningful to your average user. At least tell him something is wrong with your page and guide him back on the right track by providing a link to get back to your site.
Sometimes I redirect users to a page that is likely what they wanted. So when a page number goes below zero and this is not permitted, redirect your user to ?page=0 and maybe display a message at the top of that page. I think you should prefer this method because it is a better user experience than interrupting the user with modal windows.
I agree with you that error messages are necessary and useful, but you should try to differentiate, e.g. give a 404 where the user requested a page that doesn't exist.
It varies from project to project. How many users do you expect? If it's below 10K visitors a day it might not be so bad. What percentage of users do you expect will hit the problem? I don't expect that very many but you would know best.
The goal should be to ship the product and roll out improvements regularly. Hopefully the product is sound overall.
Regarding a solution, if it's a page not found, a 4xx error should be returned instead of a 5xx. 5xx errors typically warrant a deeper look, and while it's hard to write an air-tight application directly on launch, you should try to have a generic handler for 4xx and 5xx errors.
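One way the suggestions in this thread could fit together, sketched in Python rather than Rails. The function name `resolve_page` and the exact policy (400 for non-numeric input, redirect negatives to page 0, 404 past the end) are assumptions for illustration, not a recommendation from any of the answers.

```python
def resolve_page(raw_value, page_count):
    """Map a raw ?page= value to (http_status, page_index).

    Pages are zero-based here, matching the ?page=0 used in the thread.
    """
    try:
        page = int(raw_value)
    except (TypeError, ValueError):
        return 400, None      # not a number at all: bad request
    if page < 0:
        return 302, 0         # redirect to the first page
    if page >= page_count:
        return 404, None      # a page that doesn't exist
    return 200, page
```

Whatever policy you pick, the user-supplied value never reaches the code that would otherwise blow up with a 500.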
In the PCI game (Credit Card Verification / Validation) the rule is validate everything and allow for no idiots. So the answer depends on your application.

What's a good approach to writing error handling?

I hate writing error condition code. I guess I don't have a good approach to doing it:
Do you write all of your 'functional' code first then go back in and add error handling or vice versa?
How stupid do you assume your users are?
How granular do you make your exception throws?
Does anyone have sage advice to pass on to make this easier?
A lot of great answers guys, thank you. I had actually gotten more answers about dealing with the user than I thought. I'm actually more interested in error handling on the back end, dealing with database connection failures and potential effects on the front end, etc. Keep them coming!
I can answer one question: You don't need to assume your users are "stupid", you need to help them to use your application. Show nice prompts for things, validate data and explain why, so it's obvious to them, don't crash in their face if you can't handle what they've done (or more specifically, what you've let them do), show a nice page explaining what they can do instead, and so on.
Treat them with respect, and don't assume they know everything about your system, you are here to help them.
With respect to the first part: I generally write most error handling as I go, and add a little more later.
I generally don't throw that many exceptions.
Assume your users don't know anything and will break your system any way that it can possibly be broken.
Then write your error handling code accordingly.
First, and foremost, be clear to the user on what you expect. Second, test the input to verify it contains data within the boundaries you expect.
Prime example: I had a form with an email field. We weren't immediately using that data, so we didn't put any checking on it. Result: about 1% of the users put in their home address. The field was labeled "Email Address". Apparently the users were just reading the second word and ignoring the first.
The fix was to change the label to simply say "Email" and then test the input. For kicks we went ahead and logged what the users were initially typing in that field just to see if the label change helped. It did.
Also, as a general practice, your functions should test the inputs to verify they contain the data you expect. Use asserts or their equivalent in your language of choice.
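A small sketch of both halves of that advice in Python: test user input explicitly, and use an assert for what your own functions expect from their callers. `looks_like_email` is a deliberately crude, made-up check (it is not a real email validator), and `queue_welcome_mail` is hypothetical.

```python
def looks_like_email(value):
    """Very rough plausibility check: something@something.tld, no spaces."""
    if not isinstance(value, str):
        return False
    value = value.strip()
    local, sep, domain = value.partition("@")
    return (
        sep == "@"
        and bool(local)
        and "." in domain
        and "@" not in domain
        and " " not in value
    )


def queue_welcome_mail(email):
    # Internal contract: callers are expected to have validated the input.
    assert looks_like_email(email), "caller must validate the email first"
    return f"welcome mail queued for {email.strip()}"
```

Note that Python drops `assert` statements when run with `-O`, so asserts suit internal sanity checks, while anything a user typed should go through an explicit test like `looks_like_email`.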
When I code, there will be some exceptions which I expect, e.g. a file may be missing, or some XML serialisation may fail. Those exceptions I know will happen ahead of time, and I can put in handling for them.
There is a lot you cannot anticipate though, and nor should you try to. Put in a global error handler and logger, so that ultimately everything gets caught and logged. Then as your testers and/or users find situations that cause exceptions (i.e. bad input) then you can decide whether you want to put further handling in specifically for it, or maybe leave it as it was.
Summary: validate your input, but don't try to gaze into the crystal ball too much, as you will never anticipate every issue that the user may come up with. Have a global handler and logger, and then refine as necessary.
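For what it's worth, a global handler and logger can be quite small. Here is a sketch in Python using the standard `sys.excepthook` and `logging` modules; the logger name `"app"` and the function names are placeholders.

```python
import logging
import sys

logger = logging.getLogger("app")


def log_unhandled(exc_type, exc_value, exc_traceback):
    """Last-resort hook: log whatever nobody else caught."""
    if issubclass(exc_type, KeyboardInterrupt):
        # Let Ctrl+C behave as usual instead of logging it as a crash.
        sys.__excepthook__(exc_type, exc_value, exc_traceback)
        return
    logger.critical(
        "unhandled exception",
        exc_info=(exc_type, exc_value, exc_traceback),
    )


def install_global_handler():
    """Route all uncaught exceptions in the main thread through the logger."""
    sys.excepthook = log_unhandled
```

Call `install_global_handler()` once at startup. Specific handling for the cases your testers and users actually hit can then be added later, guided by what shows up in the log.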
I have two words for you: Defensive Coding
You have to assume your users are incredibly stupid. Someone will always find a way to give you input that you thought would never happen.
I try to make my exception throws as granular as possible to provide the best feedback when something goes wrong. If you lump everything together, you can't tell which error case caused the problem.
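A sketch of what "granular" can look like in Python, with an invented ordering example. Each failure gets its own exception class, and a shared base class still lets a caller catch them all at once.

```python
class OrderError(Exception):
    """Base class so callers can still catch everything order-related."""


class OutOfStock(OrderError):
    def __init__(self, item):
        super().__init__(f"{item} is out of stock")
        self.item = item


class PaymentDeclined(OrderError):
    pass


def place_order(item, stock, card_ok):
    if stock.get(item, 0) < 1:
        raise OutOfStock(item)
    if not card_ok:
        raise PaymentDeclined("card was declined")
    stock[item] -= 1


def checkout_message(item, stock, card_ok):
    # Because the error cases are distinct types, each gets its own feedback.
    try:
        place_order(item, stock, card_ok)
    except OutOfStock as exc:
        return f"Sorry, {exc.item} is sold out."
    except PaymentDeclined:
        return "Please check your payment details."
    return "Order placed."
```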
I usually try to handle error cases first (before getting functional code), but that's not necessarily a best practice.
Someone has already mentioned defensive programming. A few thoughts from a user experience perspective, though.
If the user's input is invalid, either (a) correct it if you're reasonably sure you knew what they meant or (b) display a message in line that tells them what corrective action they should take.
Avoid messages like, "FATAL SYSTEM ERROR CODE 02382981." Users (a) don't know what this means, even if you do, and (b) are intimidated and put off by seeing things like this.
Avoid pop-up messages for every—freaking—possible—error you can come up with. You shouldn't disrupt user flow unless you absolutely need them to resolve a problem before they can do anything else.
Log, log, log. When you show an error message to the user, put relevant information that might help you debug in either (a) a log file or (b) a database, depending on the type of application you're creating. This will ease the effort of hunting down information about the error without making the user cry.
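One possible shape for the "friendly message for the user, details in the log" idea, sketched in Python. The function name `report_failure`, the logger name and the reference-id scheme are all assumptions made for the example.

```python
import logging
import uuid

logger = logging.getLogger("app.errors")


def report_failure(exc, context):
    """Log the technical details and return a message that is safe to show."""
    reference = uuid.uuid4().hex[:8]  # short id tying the message to the log entry
    logger.error("ref=%s context=%s error=%r", reference, context, exc)
    return (
        "Sorry, something went wrong on our side. Please try again, "
        f"or contact support and mention reference {reference}."
    )
```

The user never sees "FATAL SYSTEM ERROR CODE 02382981", but whoever debugs the problem can search the log for the reference the user reports.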
Once you identify what your users should and should not be able to do, you'll be able to effectively write error handling code. You can make this easier on yourself with helper methods/classes.
In terms of your question about writing handling before/after/during business logic, think about it this way: if you're making 400,000 sandwiches, it might be faster to add all the mustard at the same time, but it's probably also a lot more boring than making each sandwich individually. Who knows, though, maybe you really like the smell of mustard...

Misunderstood Good Ideas

Sometimes someone has a great idea that solves a problem. But as time passes, people forget why it was a great idea and try to use it in ways that end up causing problems as bad (or worse) than what the idea was originally supposed to solve.
Example:
I'm sure that distributed source control is sufficiently counterintuitive that people try to establish conventions that defeat the point of distributed source control.
Example 2:
It's very natural to think that when you're writing some code, you should handle all errors that could possibly arise. But a function doesn't always have enough information to handle the error properly, so all it can do is somehow tell whoever called it that the error occurred. Passing errors up the call stack by hand is tedious, so exceptions were invented. With no extra typing on the part of the programmer, exceptions will bubble up the call stack until somebody can do something with them. It seems like checked exceptions, at least in practice, tarnish the awesomeness of exceptions. At best, the programmer has to tediously work her way up any possible call stack, specifying that every method throws a given exception up to the point where it can be handled (if it can be handled). Worse, she might swallow the exception to avoid the chore!
What are some other examples where an approach that seems like the common-sense thing to do is actually recreating a problem that had been solved in some way?
Point of this question: internalizing what is wrong with the common-sense "obvious" solution is a very good way of developing an intuition for how and why to use the initially counterintuitive elegant solution.
Hmmm..... let me think... what's with the foundation on which the web works - stateless HTTP on which many stateful frameworks have been built (ASP.NET, JSF etc.) that completely discard the stateless nature of the protocol? Well, not discard it in their implementation but discard it for their users - developers, who not even knowing anything about basic web elements try to pack megabytes of serialized objects into pages which leads to performance loss and tremendous consumption of traffic and server resources.
Would that fit your definition?
You're right that there are many examples of this with DVCS. The most common one is to use a DVCS like Subversion, by always pushing on every commit or by going days without even bothering to commit.

To what extent should code try to explain fatal exceptions?

I suspect that all non-trivial software is likely to experience situations where it hits an external problem it cannot work around and thus needs to fail. This might be due to bad configuration, an external server being down, disk full, etc.
In these situations, especially if the software is running in non-interactive mode, I expect that all one can really do is log an error and wait for the admin to read the logs and fix the problem. If someone happens to interact with the software in the meantime, e.g. a request comes in to a server that failed to initialize properly, then perhaps an appropriate hint can be given to check the logs and maybe even the error can be echoed (depending on whether you can tell if they're a technical guy as opposed to a business user). For the moment though let's not think too hard about this part.
My question is, to what extent should the software be responsible for trying to explain the meaning of the fatal error? In general, how much competence/knowledge are you allowed to presume on the part of administrators of the software, and how much troubleshooting information and how many potential resolution steps should you include when logging fatal errors? Of course if there's something that's unique to the runtime context this should definitely be logged; but let's assume your software needs to talk to Active Directory via LDAP and gets back an error "[LDAP: error code 49 - 80090308: LdapErr: DSID-0C090334, comment: AcceptSecurityContext error, data 525, vece]". Is it reasonable to assume that the maintainers will be able to Google the error code and work out what it means, or should the software try to parse the error code and log that this is caused by an incorrect user DN in the LDAP config?
I don't know if there is a definitive best-practices answer for this, so I'm keen to hear a variety of views.
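To show what "parse the error code" might involve, here is a hedged sketch in Python. The hint table lists sub-status codes as they are commonly documented for Active Directory bind failures; treat it as an assumption to verify against your own environment. The function and table names are made up.

```python
import re

# Sub-status codes commonly documented for AD bind failures (LDAP error 49).
# Assumption: verify against your directory's documentation before relying on it.
AD_BIND_HINTS = {
    "525": "user not found (check the bind user DN)",
    "52e": "invalid credentials (wrong password)",
    "532": "password expired",
    "533": "account disabled",
    "775": "account locked out",
}

_DATA_CODE = re.compile(r"\bdata\s+([0-9a-fA-F]+)")


def annotate_ldap_error(raw_message):
    """Append a hint when the sub-status is recognised; never drop the raw text."""
    match = _DATA_CODE.search(raw_message)
    hint = AD_BIND_HINTS.get(match.group(1).lower()) if match else None
    if hint is None:
        return raw_message
    return f"{raw_message} (possible cause: {hint})"
```

The raw message is always kept, so if the table is wrong or out of date the administrator can still look the original code up.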
The approach I tend to agree with is that you should explain as much as possible if the fatal error is caused by code that is your own responsibility (i.e. not third party). Otherwise, if the error is caused "further down", for example at the database level, then the error returned should be passed up to the administrators without adding much further information. So if the database server dies, then your connector will throw some exception, and you would log the error code in the exception.
The administrator or support staff should then have sufficient knowledge to resolve the issue with the provided information.
When you provide too many details on errors which are not caused by your own code, you run the risk of having error details NOT matching the cause of the actual error, especially if the error codes stop matching between versions.
Of course, there are exceptions. We have worked with open source libraries that were so poorly documented that we ended up writing wrappers around the libraries just to provide decent logging of what actually is going on.
Just my 2c
The answer, as for all broad questions, is "it depends."
If you're looking at a configuration error, then by all means you should try to explain what was wrong (in the logs). If it's an out-of-memory error, there's not much you can do -- and you may not even be able to write a log message.
One thing you said concerns me:
If someone happens to interact with the software in the meantime, e.g. a request comes in to a server that failed to initialize properly, then perhaps an appropriate hint can be given to check the logs
If this is truly a fatal error, the server should not be running, and therefore any incoming request should fail with absolutely no warning or explanation.
You should at least provide the message from the exception and a stack trace so you can find out where in the code it occurred. If possible, you should also explain what you were attempting to do and what you think may have happened depending on the exception type.
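In Python, one way to keep the original exception and its stack trace while also saying what you were attempting is exception chaining with `raise ... from`. A minimal sketch, with a hypothetical `StartupError` and `load_config`:

```python
class StartupError(Exception):
    """Fatal error during startup, phrased for whoever reads the log."""


def load_config(path):
    try:
        with open(path, encoding="utf-8") as handle:
            return handle.read()
    except OSError as exc:
        # Say what we were trying to do; the original error stays attached
        # as __cause__, so its message and traceback are not lost.
        raise StartupError(
            f"could not load configuration from {path!r}; "
            "check that the file exists and is readable"
        ) from exc
```

When such an exception is logged with its traceback, both the explanation and the underlying error show up, so the reader gets the "what were you doing" and the "what actually happened" together.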
I guess it depends on how much time you have before delivering the software to your customers.
Yes, it would be nice to parse the error and give a more explicit message but, in this day and age, Google is not always very far.
So unless, you have time to create the code to parse errors, I would leave them as is.
IMHO you can never provide too much information in these cases.
In the real world it comes down to cost-benefit analysis, though. What's the impact of the error to you, your app, your business, etc. How much time is it worth spending on it.
In a business critical app my first point applies. Everything else is a sliding scale.
I think it depends on who is using the application.
If the application is used by tech savvy people then show more technical details, so they will be able to troubleshoot the problem if they want. I've had some users go to great lengths to solve issues. It can be very helpful, especially for issues that are specific to certain configurations.
If your user base is more of the average Joe then technical details will confuse them in most cases. You should show them a simple error message, and try to offer some solutions if possible.
You could also merge the two techniques. Show a simple error message by default and allow the user to view more detailed error information if they want.
You just don't want to overwhelm the user with too much information that they don't understand. It just frustrates and confuses them in the majority of cases.
There are two aspects I think all errors and exceptions should have:
1) Enough information in the error to help debug the problem. Stacktrace, class/method name, type of exception etc fall in this category.
2) A human-understandable message, ideally clear enough for, say, an Ops team or a sysadmin to know who to call or where to forward that error message. Typically it is of the form "so and so module failed" or "network call failed" etc. Something that comes as close as possible to you explaining the problem to a customer in non-technical language.
Now with all the time constraints etc it may not be possible to have both messages programmed in. Then I would go out on a limb and say we should have the second type of error message. Remember, the sysadmin would probably be able to call you and since you helped write the code you can maybe pinpoint the error. But if the customer is on phone asking about the error, the sysadmin better be able to explain the possible cause :)
On a different note, all products need a clear exception/error handling mechanism decided at the architecture level. And the exceptions NEED to adhere to that design. There are few things more frustrating than trying to debug an error based on a design, only to find out a day later that it's a one-of-a-kind error message based on a completely different design.