Turning an exception into an IObservable exception - exception

I have the following code:
var s = Observable
.StartAsync(tnk => CERNWebAccess.GetWebResponse(reqUri))
.SelectMany(resp => Observable.StartAsync(tkn => resp.Content.ReadAsStringAsync()))
.Select(ParseToMD);
The ParseToMD is pretty simple:
private static IDocumentMetadata ParseToMD(string marc21XML)
{
return MARC21Parser.ParseForMetadata(marc21XML);
}
Unfortunately, it is quite legal for the ParseForMetadata to throw an exception. I'd very much like to be able to use the normal Rx techniques to deal with the exception. For example:
var goodOrEmpty = s.Catch(Observable.Empty<Tuple<PaperStub, PaperFullInfo>>());
How can I properly protect that Select call so exceptions are correctly turned into IObservable On Error? I'm also going to need to do it for the others (StartAsync).

A pattern I follow is a little different from the standard way of dealing with exceptions. Just because a select throws an exception doesn't necessarily mean the subscription is bad. The next event might be fine.
Note that in RX, when OnError is triggered it means the subscription must be terminated.
What I do is wrap my selects in a monadic Exceptional type to do the following
Exceptional<IDocumentMetadata> s = Observable
.StartAsync(tnk => CERNWebAccess.GetWebResponse(reqUri))
.SelectMany(resp => Observable.StartAsync(tkn => resp.Content.ReadAsStringAsync()))
.SelectExceptional(ParseToMD);
If you just want to skip the bad ones
s.SelectMany(s=>s)
or you can project it like
var Exceptional<ProcessedDocumentType> =
s.Select(document => ProcessDocument(document))
If you need to extract the exception from the Exceptional type you can do this with properties
bool HasException
Exception Exception
To get the value you can access the property
T Value
However you should use Select and SelectMany if you are just operating on the good values and don't need the exceptions.
My implementation was based on
Exception or Either monad in C#

From your comments, I have to wonder if you are seeing the effect of using StartAsync rather than FromAsync. These methods differ in one important detail; the former runs once as soon as it is evaluated - i.e. it runs exactly once regardless of and before any number of subscribers. If there are no subscribers and it throws you will have an unobserved exception. Contrast with FromAsync which is called per subscriber on subscription.

Related

Throw extra exception to avoid code duplication

First of all, I know the standard answer will be that exceptions are never to be used for flow control. While I perfectly agree with this, I've been thinking a long time about something I sometimes did, which I'll describe with the following pseudo-code:
try
string keyboardInput = read()
int number = int.parse(keyboardInput)
//the conversion succeeds
if(number >= 1000)
//That's not what I asked for. The message to display to the user
//is already in the catch-block below.
throw new NumberFormatException() //well, there IS something wrong with the number...
catch(NumberFormatException ex) //the user entered text
print("Please enter a valid number below 1000.")
First of all, take this example in a very abstract way. This does not necessarily have to happen. The situation simply is:
A user input needs to be constrained and can go wrong in 2 ways,
either
by a thrown exception the language defines, or by a check. Both errors
are reported by the user in the same way, because they do not need to know
the technical difference of what caused it.
I have thought of several ways to solve it. To begin with, it would be better to throw a custom made exception. The problem I then face is, if I catch it locally, what to do with the other exception? In se, the custom exception would be cause for a second catch-block, in which the message would be copied into just as well. My solution:
//number is wrong
throw new MyException()
catch(NumberFormatException ex)
throw new MyException()
catch(MyException ex) {
print("Please enter...")
The meaning of the exceptions' names is everything here. This application of custom-made exceptions is widely accepted, but essentially I didn't do anything different from the first way: I forced to go into a catch-block, albeit by throwing a custom exception rather than a standard-library one.
The same way applied to throwing the exception on to the calling method (thus not having a catch block for the custom exception) seems to make more sense. My method can go wrong in what is technically two ways, but essentially one way: wrong user input. Therefore, one would write a UserInputException and make the method throw this. New problem: what if this is the main method of an application?
I'm not currently struggling with a specific application to implement this kind of behaviour, my question is purely theoretical and non-language specific.
What is the best way to approach this?
I would consider the first exception to be low-level, and I would handle it (by translation in this case) at the point of call. I find that this leads to code that is easier to maintain and refactor later, as you have less types of exceptions to handle.
try
string keyboardInput = read()
try
int number = int.parse(keyboardInput)
catch(NumberFormatException ex)
throw MyException("Input value was not a number")
//the conversion succeeds
if(number >= 1000)
throw MyException("Input value was out of range")
catch(MyException ex) //the user entered text
print( ex.ToString() )
print("Please enter a valid number below 1000.")
I think you have essentially a few ways to go about it with minimal code duplication in mind:
Use a boolean variable/store the exception: If there was an error anywhere in the the general logic of the specific task you are performing, you exit on the first sign of error and handle that in a separate error handling branch.
Advantages: only one place to handle the error; you can use any custom exception/error condition you like.
Disadvantages: the logic of what you are trying to achieve might be hard to discover.
Create a general function that you can use to inform the user about the error (pre-calculating/storing all information that describes the general error, e.g. the message to display the user), so you can just make one function call when an error condition happens.
Advantages: the logic of your intent might be clearer for readers of the code; you can use anu custom exception/error conditon you like.
Disadvantages: the error will have to be handled in separate places (although with the pre-computed/stored values, there is not much copy-paste, however complex the informing the user part).
If the intent is clear, I don't think throwing exceptions from within your try block explicitly is a bad idea. If you do not want to throw one of the system provided exceptions, you can always create your own that derives from one of them, so you only need a minimal number (preferably one) of catch blocks.
Advantages: only one place to handle error condition -- if there is essentially only one type of exception thrown in try-block.
Disadvantages: if more than one type of exception is thrown, you need nested try-catch blocks (to propagate the exceptions to the most outward one) or a very general (e.g. Exception) catch block to avoid having to duplicate error reporting.
The way I see it is this:
Assuming there's no other way to parse your int that doesn't throw an exception, your code as it is now, is correct and elegant.
The only issue would be if your code was in some kind of loop, in which case you might worry about the overhead of throwing and catching unnecessary exceptions. In that case, you will have to compromise some of your code's beauty in favor of only handling exceptions whenever necessary.
error=false;
try {
string keyboardInput = read();
int number = int.parse(keyboardInput);
//the conversion succeeds
if(number >= 1000) {
//That's not what I asked for. The message to display to the user
//is already in the catch-block below.
error=true;
} catch(NumberFormatException ex) { //the user entered text
error=true;
}
if (error)
print("Please enter a valid number below 1000.");
Also you can think about why you're trying to aggregate two errors into one.
Instead you could inform the user as to what error they did, which might be more helpful in some cases:
try {
string keyboardInput = read();
int number = int.parse(keyboardInput);
//the conversion succeeds
if(number >= 1000) {
//That's not what I asked for. The message to display to the user
//is already in the catch-block below.
print("Please enter a number below 1000.");
} catch(NumberFormatException ex) { //the user entered text
print("Please enter a valid number.");
}
You do not need any exceptions in this particular example.
int number;
if (int.TryParse(keyboardInput, out number) && number < 1000) // success
else // error
However, the situation you describe is common in business software, and throwing an exception to reach a uniform handler is quite common.
One such pattern is XML validation followed by XSLT. In some systems, invalid XML is handled through catching validation exceptions. In these systems, it is pretty natural to reuse the existing exception handling in XSLT (which can naturally detect some classes of data errors that a particular validation language cannot):
<xsl:if test="#required = 'yes' and #prohibited = 'yes'>
<xsl:message terminate='yes'>Error message</xsl:message>
</xsl:if>
It is important to see that if such conditions are extremely rare (expected to occur only during early integration testing, and disappear as defects in other modules get fixed), most of the typical concerns around not using exceptions for flow control do not really apply.
What about approaching this validation problem by writing several validator classes that take in an input and return errors, or no errors. As far as your struggle with exceptions: put that logic into each validator and deal with it there on a case by case basis.
after that you figure out the correct validators to use for your input, collect their errors and handle them.
the benefits of this are:
Validators do one thing, validate a single case
Its up to the validation function to decide how to handle the errors. Do you break on first validation error or do you collect them all and then deal with them?
You can write your code is such a way that the main validation function can validate different types of input using the same code, just picking the correct validators using your favorite technique.
and disadvantages:
You will end up writing more code (but if you are using java, this should be put into the 'benefits' bucket)
here is some example pseudo-code:
validate(input):
validators = Validator.for(input.type)
errors = []
for validator in validators:
errors.push(validator.validate(input))
if errors:
throw PoopException
and some validators:
MaxValidator extends IntValidator:
validate(input):
errors = []
errors.push(super.validate(input))
if input > 1000:
errors.push("bleee!!!! to big!")
return errors
IntValidator:
validate(input):
try:
int.parse(input)
catch NumberFormatException:
return ['not an int']
return []
of course you would need to do some trickery to make the parent validator possibly return you a valid version of the input, in this case string "123" converted to an int so the max validator can handle it, but this can be easily accomplished by making the validators statefull or some other magic.
I can't see this answer anywhere in here, so I'll just post it as another point of view.
As we all know, you can actually break the rules if you know them well enough, so you can use throwing an Exception for flow control if you know it's the best solution for your situation. From what I've seen, it happens usually with some dumb frameworks...
That said, before Java 7 (which brought us the mighty multicatch construct), this was my approach to avoid code repetition:
try {
someOffendingMethod();
} catch (Exception e) {
if (e instanceof NumberFormatException || e instanceof MyException) {
System.out.println("Please enter a valid number.");
}
}
It's a valid technique in C#, too.

Returning values vs returning error codes?

This is a general programming question, not pertaining to any specific language.
A new programmer typically will write a method that calculates some value, then returns the value:
public Integer doSomething()
{
// Calculate something
return something;
}
public void main()
{
Integer myValue = doSomething();
}
But when an exception occurs during the calculation of something, what is the best way to handle the exception, especially when giving the user feedback? If you do a try/catch of the calculation of something and if an exception is caught, what do you return? Nothing was calculated, so do you return null? And once you return it (whatever it may be), do you need to do another try/catch in the parent method that checks to see if a valid value was returned? And if not, then make sure the user is given some feedback?
I have heard arguments on both sides of the table about never returning values at all, but instead setting calculated values as pointers or global variables and instead returning only error codes from methods, and then (in the parent method) simply handling the error codes accordingly.
Is there a best practice or approach to this? Are there any good resources that one could access to learn more about the best way to handle this?
UPDATE for Clarification
Consider the following code:
public void main()
{
int myValue = getMyValue();
MyUIObject whatever = new MyUIObject();
whatever.displayValue(myValue); // Display the value in the UI or something
}
public Integer getMyValue()
{
try
{
// Calculate some value
} catch (exception e) {
// ??
}
return value;
}
I call the method to get some int value, then I return it. Back in main(), I do something with the value, like show it in the Log in this case. Usually I would display the value in the UI for the user.
Anyways, if an exception is caught in getMyValue(), so does value get returned but it's null? What happens in main() then? Do I have to test if it's a valid value in main() as well?
I need the program to handle the error accordingly and continue. Someone below suggested displaying the appropriate information in the UI from within the getMyValue() method. I see two potential issues:
It seems like I would be mixing the business logic with (in this case) the logic for the UI.
I would have to pass a reference of the MyUIObject to the getMyValue() or something so I could access it from within the function. In the above simple example that is no big deal, but if there is a BUNCH of UI elements that need to be updated or changed based off of what happens in getMyValue(), passing them all might be a bit much...
I've read a bunch about the fundamentals of all of this but I can't seem to find a straight answer for the above situation. I appreciate any help or insight.
I think you do not quite understand exceptions.
If you throw an exception, you do not return from the function normally:
public Integer doSomething()
{
throw new my_exception();
// The following code does NOT get run
return something;
}
public void main()
{
Integer myValue = doSomething();
}
The main advantages of exceptions are:
You can write your code as though everything is succeeding, which is usually clearer
Exceptions are hard to ignore. If an exception is unhandled, typically an obvious and loud error will be given, with a stack trace. This contrasts with error codes, where it is much easier to ignore the error handling than not.
I recommend this post by Eric Lippert, which discusses exceptions and when it is and is not appropriate to handle them.
UPDATE (in response to comment):
You can absolutely handle an exception and continue, you do this by catching the exception.
eg:
try
{
// Perform calculation
}
catch (ExceptionType ex)
{
// A problem of type 'ExceptionType' occurred - you can do whatever you
// want here.
// You could log it to a list, which will be later shown to the user,
// you could set a flag to pop up a dialog box later, etc
}
// The code here will still get run even if ExceptionType was thrown inside
// the try {} block, because we caught and handled that exception.
The nice thing about this is that you know what kind of thing went wrong (from the exception type), as well as details (by looking into the information in ex), so you
hopefully have the information you need to do the right thing.
UPDATE 2 in response to your edit:
You should handle the exception at the layer where you are able to respond in the way you want. For your example, you are correct, you should not be catching the exception so deep down in the code, since you don't have access to the UI, etc and you thus can't really do anything useful.
How about this version of your example code:
public void main()
{
int myValue = -1; // some default value
String error = null; // or however you do it in Java (:
try
{
getMyValue();
}
catch (exception e)
{
error = "Error calculating value. Check your input or something.";
}
if (error != null)
{
// Display the error message to the user, or maybe add it to a list of many
// errors to be displayed later, etc.
// Note: if we are just adding to a list, we could do that in the catch().
}
// Run this code regardless of error - will display default value
// if there was error.
// Alternatively, we could wrap this in an 'else' if we don't want to
// display anything in the case of an error.
MyUIObject whatever = new MyUIObject();
whatever.displayValue(myValue); // Display the value in the UI or something
}
public Integer getMyValue()
{
// Calculate some value, don't worry about exceptions since we can't
// do anything useful at this level.
return value;
}
Exceptions is a property of object oriented languages (OOL). If you use OOL, you should prefer exceptions. It is much better than to return error codes. You can find nice examples how the error-codes approach generates a vastly longer source code than exceptions based code. For example if you want to read a file and do something with it and save in a different format. You can do it in C without exceptions, but your code will be full of if(error)... statements, maybe you will try to use some goto statements, maybe some macros to make it shorter. But also absolutely nontransparent and hard to understand. Also you can often simply forget to test the return value so you don't see the error and program goes on. That's not good. On the other hand if you write in OOL and use exceptions, your source code focuses on "what to do when there is no error", and error handling is in a different place. Just one single error handling code for many possible file errors. The source code is shorter, more clear etc.
I personally would never try to return error codes in object oriented languages. One exception is C++ where the system of exceptions have some limitations.
You wrote:
I have heard arguments on both sides of the table about never returning values at all, but instead setting calculated values as pointers or global variables and instead returning only error codes from methods, and then (in the parent method) simply handling the error codes accordingly.
[EDIT]
Actually, excetion can be seen as error code, which comes along with the relative message, and you as the programmer should know where your exception must be caught, handled and eventually display to the user. As long as you let the exception to propagate (going down in the stack of called functions) no return values are used, so you don't have to care about handling related missed values. Good exception handling is a quite tough issue.
As jwd replied, I don't see the point to raise an exception in a method and then handle the excpetion in the same method just to return an error value. To clarify:
public Integer doSomething(){
try{
throw new my_exception();}
catch{ return err_value;}
}
is pointless.

Is it good practice to throw an exception sooner than later?

Today I encountered the following situation: ("pseudo code")
class MyClass {
public void workOnArray(Object[] data) {
for (Object item : data) {
workOnItem(item);
}
}
public void workOnItem(Object item) {
if (item == null) throw new NullPointerException();
}
}
Now if the caller calls workOnArray with an array containing null items, the caller would get a NullPointerException because of workOnItem.
But I could insert an additional check in workOnArray, in other words, the problem can be detected sooner.
(Please note that this is a trivial example, in a real life application this can be much less obvious).
On the pro side I'd say an additional check could offer more high-level diagnostic information. Also failing early is always a good thing.
On the contra side I'd ask myself "If I do that, shouldn't I also validate each and every parameter passed into the core API of my programming language and throw the exact same exceptions?"
Is there some rule of thumb when to throw exceptions early and when to just let the callee throw it?
In the case of a loop processing items like that, there's one thing that would definitely make me want to pre-validate the whole array of items first; If it would be bad for some of the items to be processed before an exception was thrown, leaving any remaining items un-processed.
Barring some sort of transaction mechanism wrapping the code, I would usually want to have some assurance that the items in the collection were valid before beginning to process them.
In this example, the workOnItem method is the one that cares whether or not item is null. The workOnArray method doesn't care whether or not items are null and so IMO shouldn't validate whether or not any items are null. The workOnItem method does care and so should perform the check.
I would also consider throwing a more appropriate exception type from workOnItem. A NullPointerException (or in C#, NullReferenceException) often indicates some unexpected flaw in the operation of a method. In C#, I would be more inclined to throw an ArgumentNullException that includes the name of the null parameter. This more clearly indicates that workOnItem can't continue because it cannot handle receiving a null argument.

Is using an if() coupled with an immediate return an accepted practise?

Is using an if coupled with an immediate return like in the example below an acceptable practise instead of having a if with a block of code inside {} ? Are these equivalent in practise or is there a disadvantage to one of the approaches ?
An example in Java:
protected void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
ServletContext sc = this.getServletContext();
// Throw exception for fatal error (Servlet not defined in web.xml ?)
if( sc == null )
return; // old-style programming
// Careful with silent bugs ! Correct way of handling this is:
// throw new RuntimeException( "BookDetail: ServletContext is null" );
BookList bookList = WebUtil.getBookList( sc );
Martin Fowler would favour the early return, and calls the idea a Guard Clause.
Personally, I don't like it in Java, as I prefer one return per method. However this is subjective and I may be in the minority.
I've blogged about this for and against.
That is not a return, it's an exception. The code is perfectly ok tho.
Even if you'd replace that throw with a "return something", it would still be ok.
I think it comes down to readability. The code should function the same either way.
I use stuff like this all the time
Function Blah() As Boolean
If expr Then
Return False
End If
Do Other work...
Return result
End Function
For error conditions, generally it's best to throw an exception - exception handling was invented to get rid of the manual return code style error checking in C that comprises about 30% of a C program.
However, early returns are fine - they are far more readable than adding an extra scope with curly braces.
if (!_cache.has_key(key))
return null;
return _cache[key]
Is better than:
if (_cache_has_key(key))
{
return _cache[key]
}
else
return null;
And it only gets more obvious the more early returns that you add, 5 early returns beats the hell out of 5 nested if statements.
Note that I didn't return null on an error condition, it's expected that often the key won't be in the cache - but it still means the caller has to write code to check the result. In .NET there's a better pattern of returning a boolean, and setting the result via an out parameter. The methods beginning with Try usually follow this pattern:
Foo foo;
if (!TryGetCachedFoo("myfoo", foo))
{
foo = new Foo(...);
AddToCache("myfoo", foo);
}
// do something with foo
As long as you're using them for conditional escapes as the first thing in the routine. I think the fact that they are obvious in that location, and avoid at least one level of indentation outweighs the negative of having a multiple returns.
In the example you give, I'd favor throwing an exception because a null ServletContext is usually a sign that something has gone wrong. However, there are times when checking whether a parameter is null and returning immediately from the method is both useful and valid.
For instance, if you are gathering contact information about a user and the user has the option of providing a phone number. In that case, you may have a method that validates that the phone number contains all numbers, has the correct number of digits, etc, but which would immediately return if the phone number was empty or null.
public void validatePhone(String phoneNumber) throws ValidationException {
if (phoneNumber == null || phoneNumber.equals("")) {
return;
}
//do validation stuff, throwing exception if not valid
In your example, there is no return after the if statement; you are throwing an exception. (edit: I see you have changed the code since I posted this answer).
There are purists who think that you should have only one return statement in a method (at the end of the method). There's some merit to that idea - it makes the code more clear, it makes it easier to see what can be returned for the method, and especially when you need to cleanup resources (especially in a language without garbage collection; or in Java where you need to close for example an InputStream) it's more clear and easier if you have just one return at the bottom, and do the cleanup code just before the return.
I would not have any objection against the code in your example, however.
I have a few (subjective, or not) remarks:
I always use accolades with a if even the block contains only one line
I don't like to have many return on one method
I don't think that this null check is necessary. If getServletContext() returns null, then you have a much bigger problem with your webapp that should definitely be fixed. In that case, having a NullPointerException later in the code is an exceptional error so I wouldn't bother handling it.

Exception handling: Contract vs Exceptional approach

I know two approaches to Exception handling, lets have a look at them.
Contract approach.
When a method does not do what it says it will do in the method header, it will throw an exception. Thus the method "promises" that it will do the operation, and if it fails for some reason, it will throw an exception.
Exceptional approach.
Only throw exceptions when something truly weird happens. You should not use exceptions when you can resolve the situation with normal control flow (If statements). You don't use Exceptions for control flow, as you might in the contract approach.
Lets use both approaches in different cases:
We have a Customer class that has a method called OrderProduct.
contract approach:
class Customer
{
public void OrderProduct(Product product)
{
if((m_credit - product.Price) < 0)
throw new NoCreditException("Not enough credit!");
// do stuff
}
}
exceptional approach:
class Customer
{
public bool OrderProduct(Product product)
{
if((m_credit - product.Price) < 0)
return false;
// do stuff
return true;
}
}
if !(customer.OrderProduct(product))
Console.WriteLine("Not enough credit!");
else
// go on with your life
Here I prefer the exceptional approach, as it is not truly Exceptional that a customer has no money assuming he did not win the lottery.
But here is a situation I err on the contract style.
Exceptional:
class CarController
{
// returns null if car creation failed.
public Car CreateCar(string model)
{
// something went wrong, wrong model
return null;
}
}
When I call a method called CreateCar, I damn wel expect a Car instance instead of some lousy null pointer, which can ravage my running code a dozen lines later. Thus I prefer contract to this one:
class CarController
{
public Car CreateCar(string model)
{
// something went wrong, wrong model
throw new CarModelNotKnownException("Model unkown");
return new Car();
}
}
Which do style do you use? What do you think is best general approach to Exceptions?
I favor what you call the "contract" approach. Returning nulls or other special values to indicate errors isn't necessary in a language that supports exceptions. I find it much easier to understand code when it doesn't have a bunch of "if (result == NULL)" or "if (result == -1)" clauses mixed in with what could be very simple, straightforward logic.
My usual approach is to use contract to handle any kind of error due to "client" invocation, that is, due to an external error (i.e ArgumentNullException).
Every error on the arguments is not handled. An exception is raised and the "client" is in charge of handling it. On the other hand, for internal errors always try to correct them (as if you can't get a database connection for some reason) and only if you can't handle it reraise the exception.
It's important to keep in mind that most unhandled exception at such level will not be able to be handled by the client anyway so they will just probably go up to the most general exception handler, so if such an exception occurs you are probably FUBAR anyway.
I believe that if you are building a class which will be used by an external program (or will be reused by other programs) then you should use the contract approach. A good example of this is an API of any kind.
If you are actually interested in exceptions and want to think about how to use them to construct robust systems, consider reading Making reliable distributed systems in the presence of software errors.
Both approaches are right. What that means is that a contract should be written in such a way as to specify for all cases that are not truly exceptional a behavior that does not require throwing an exception.
Note that some situations may or may not be exceptional based upon what the caller of the code is expecting. If the caller is expecting that a dictionary will contain a certain item, and absence of that item would indicate a severe problem, then failure to find the item is an exceptional condition and should cause an exception to be thrown. If, however, the caller doesn't really know if an item exists, and is equally prepared to handle its presence or its absence, then absence of the item would be an expected condition and should not cause an exception. The best way to handle such variations in caller expectation is to have a contract specify two methods: a DoSomething method and a TryDoSomething method, e.g.
TValue GetValue(TKey Key);
bool TryGetValue(TKey Key, ref TValue value);
Note that, while the standard 'try' pattern is as illustrated above, some alternatives may also be helpful if one is designing an interface which produces items:
// In case of failure, set ok false and return default<TValue>.
TValue TryGetResult(ref bool ok, TParam param);
// In case of failure, indicate particular problem in GetKeyErrorInfo
// and return default<TValue>.
TValue TryGetResult(ref GetKeyErrorInfo errorInfo, ref TParam param);
Note that using something like the normal TryGetResult pattern within an interface will make the interface invariant with respect to the result type; using one of the patterns above will allow the interface to be covariant with respect to the result type. Also, it will allow the result to be used in a 'var' declaration:
var myThingResult = myThing.TryGetSomeValue(ref ok, whatever);
if (ok) { do_whatever }
Not quite the standard approach, but in some cases the advantages may justify it.