Which contract (Design by contract) is better? - exception

Suppose I have a method
public Patient(int id)
{
----
}
that returns Patient object given an id.. I could define contract in 2 ways
Method would return null if patient does not exist
Method would throw an exception if patient does not exist. In this case I would also define a query method that returns true if the Patient exist in the database or false otherwise...
Which contract should I use? Any other suggestions?
Update: Please comment on this case too...
If it is not an database assigned Id and it is something a user enter in UI.. like SSN .. then which one is better..
Comment about Null pattern from Steve that I think is valid:
probably not a good idea here, as it would be really useful to know immediately when an ID did not exist.
And I also think Null pattern here would be somewhat heavy weight
Comment from Rob Wells on throwing exception because its bad Id:
i don't think a typo in a patient's name is an exceptional circumstance" IMHO

Keep in mind that going "over the wire" to another tier (whether a database or an application server) is one of the most expensive activities you can do - typically a network call will take several orders of magnitude longer than in-memory calls.
It's therefore worth while structuring your API to avoid redundant calls.
Consider, if your API is like this:
// Check to see if a given patient exists
public bool PatientExists(int id);
// Load the specified patient; throws exception if not found
public Patient GetPatient(int id);
Then you are likely to hit the database twice - or to be reliant on good caching to avoid this.
Another consideration is this: In some places your code may have a "known-good" id, in other places not. Each location requires a different policy on whether an exception should be thrown.
Here's a pattern that I've used to good effect in the past - have two methods:
// Load the specified patient; throws exception if not found
public Patient GetExistingPatient(int id);
// Search for the specified patient; returns null if not found
public Patient FindPatient(int id);
Clearly, GetExistingPatient() can be built by calling FindPatient().
This allows your calling code to get the appropriate behaviour, throwing an exception if something has gone wrong, and avoiding exception handling in cases where it is not needed.

Another option would be the Null Object pattern.

You should probably throw an exception. If you have an id that doesn't point to a valid patient, where did it come from? Something very bad has likely happened. It is an exceptional circumstance.
EDIT: If you're doing something other than an integer-based retrieval, like a search based on text, then returning null is fine. Especially since in that case you are returning a set of results, which could be more than one (more than one patient with the same name, same birth date, or whatever your criteria is).
A search function should have a different contract from a retrieval function.

It depends:
If you consider the normal operation will lead to a pation number not matching a file in the DB then an empty (NULL) record should be returned.
But if you expect that a given ID should always hit a record then when one is not found (which should be rare) then use an exception.
Other things like a DB connection error should generate an exception.
As you expect under normal situations the query to the DB to always work (though it may return 0 records or not).
P.S. I would not return a pointer. (Who owns the pointer??)
I would return an object that may or may not have the record. But that you can interogated for the existance of the record within. Potentially a smart pointer or somthing slightly smarter than a smart pointer that understands the cotext.

For this circumstance, I would have the method return null for a non-existent patient.
I tend to prefer using exceptions to assist graeful degradation when there is a problem with the system itself.
In this instance, it is mosdt probably:
a typo in the patient's ID if it was entered into a search form,
a data entry error, or
a workflow issue in that he patient's record hasn't been entered yet.
Hence, returning a null rather than an exception.
If there was a problem contacting the database, then I would have the method raise an exception.
Edit: Just saw that the patient ID in the signature was an integer, thanks Steven Lowe, so I've corrected my list of reasons.
My basic point about delineating when to use exceptions (for system errors) versus other methods of returning an error (for simple data entry typos) still stands though. IMHO.
HTH
cheers,
Rob

In a simple situation like this 1. seems to be more than sufficient. You may want to implement something like a callback method that the client calls to know why it returned null. Just a suggestion.

taking your descriptiong at face value, you probably need both:
bad IDs are errors/exceptions, as Adam pointed out, but
if you are given IDs elsewhere that might have disappeared, you will need the query method to check for them

Assuming I read that correctly...
When you call Patient(100) it will return an object reference for a Patient with an id of 100.
If no patient with an id of 100 exists, I think it should return null. Exceptions are overused IMO and this case doesn't call for it. The function simply returned a null. It didn't create some errored case that can crash your application (unless of course, you ended up not handling that null and passed it around to some other part of your application).
I would definitely have that function return 'null', especially if it was part of some search, where a user would search for a patient with a particular ID and if the object reference ended up being null, it would simply state that no patient with that id exists.

Throw an exception.
If you return null, code like this:
Console.WriteLine(Patient(id).Name);
would fail with a NullReferenceException if the id doesn't exist, which is not as helpful as a say a PatientNotFoundException(id). In this example, it's still relatively easy to track down, but consider:
somePatient = Patient(id)
// much later, in a different function:
Console.WriteLine(somePatient);
About adding a function that checks whether a patient exists: Note this won't prevent PatientNotFoundExceptions completely. For example:
if (PatientExists(id))
Console.WriteLine(Patient(id).Name);
-- another thread or another process could delete the patient between the calls to PatientExists and Patient. Also, this would mean two database queries instead of one. Usually, it's better to just try the call, and handle the exception.
Note that the situation is different for queries that return multiple values, e.g. as a list; here, it is appropriate to return an empty list if there are no matches.

Related

Why use an exception instead of if...else

For example, in the case of "The array index out of bound" exception, why don't we check the array length in advance:
if(array.length < countNum)
{
//logic
}
else
{
//replace using exception
}
My question is, why choose to use an exception? and when to use an exception, instead of if-else
Thanks.
It depends on acceptable practices for a given language.
In Java, the convention is to always check conditions whenever possible and not to use exceptions for flow control. But, for example, in Python not only using exception in this manner is acceptable, but it is also a preferred practice.
They are used to inform the code that calls your code an exceptional condition occurred. Exceptions are more expensive than well formed if/else logic so you use them in exceptional circumstances such as reaching a condition in your code you cannot handle locally, or to support giving the caller of your code the choice of how to handle the error condition.
Usually if you find yourself throwing and catching exceptions in your own function or method, you can probably find a more efficient way of doing it.
There are many answers to that question. As a single example, from Java, when you are using multiple threads, sometimes you need to interrupt a thread, and the thread will see this when an InterruptedException is thrown.
Other times, you will be using an API that throws certain exceptions. You won't be able to avoid it. If the API throws, for example, an IOException, then you can catch it, or let it bubble up.
Here's an example where it would actually be better to use an exception instead of a conditional.
Say you had a list of 10,000 strings. Now, you only want those items which are integers. Now, you know that a very small number of them won't be integers (in string form). So should you check to see if every string is an integer before trying to convert them? Or should you just try to convert them and throw and catch an exception if you get one that isn't an integer? The second way is more efficient, but if they were mostly non-integers then it would be more efficient to use an if-statement.
Most of the time, however, you should not use exceptions if you can replace them with a conditional.
As someone has already said, 'Exceptions' in programming languages are for exceptional cases and not to set logical flow of your program. For example, in the case of given code snippet of your question, you have to see what the enclosing method's or function's intention is. Is checking array.length < countNum part of the business logic or not. If yes, then putting a pair of if/else there is the way to go. If that condition is not part of the business logic and the enclosing method's intention is something else, then write code for that something else and throw exception instead of going the if/else way. For example you develop an application for a school and in your application you have a method GetClassTopperGrades which is responsible for the business logic part which requires to return the highest marks of the student in a certain class. the method/function definition would be something like this:
int GetClassTopperGrades(string classID)
In this case the method's intention is to return the grades, for a valid class, which will always be a positive integer, according to the business logic of the application. Now if someone calls your method and passes a garbage string or null, what should it do? If should throw an exception e.g. ArgumentException or 'ArgumentNullException' because this was an exceptional case in this particular context. The method assumed that always a valid class ID will be passed and NULL or empty string is NOT a valid class ID (a deviation from the business logic).
Apart from that, in some conditions there is no prior knowledge about the outcome of a given code and no defined way to prevent an exceptional situation. For example, querying some remote database, if the network goes down, you don't have any other option there apart from throwing an exception. Would you check network connectivity before issuing every SQL query to the remote database?
There is strong and indisputable reason why to use exceptions - no matter of language. I strongly believe that decision about if to use exceptions or not have nothing to do with particular language used.
Using exceptions is universal method to notify other part of code that something wrong happened in kind of loosely coupled way. Let imagine that if you would like to handle some exceptional condition by using if.. nad else.. you need to insert into different part of your code some arbitrary variables and other stuff which probably would easily led to have spaghetti code soon after.
Let next imagine that you are using any external library/package and it's author decided to put in his/her code other arbitrary way to handle wrong states - it would force you to adjust to its way of dealing with it - for example you would need to check if particular methods returns true or false or whatever. Using exceptions makes handling errors much more easy - you just assume that if something goes wrong - the other code will throw exception, so you just wrap the code in try block and handle possible exception on your own way.

assert() vs enforce(): Which to choose?

I'm having a hard time choosing whether I should "enforce" a condition or "assert" a condition in D. (This is language-neutral, though.)
Theoretically, I know that you use assertions to find bugs, and you enforce other conditions in order to check for atypical conditions. E.g. you might say assert(count >= 0) for an argument to your method, because that indicates that there's a bug with the caller, and that you would say enforce(isNetworkConnected), because that's not a bug, it's just something that you're assuming that could very well not be true in a legitimate situation beyond your control.
Furthermore, assertions can be removed from code as an optimization, with no side effects, but enforcements cannot be removed because they must always execute their condition code. Hence if I'm implementing a lazy-filled container that fills itself on the first access to any of its methods, I say enforce(!empty()) instead of assert(!empty()), because the check for empty() must always occur, since it lazily executes code inside.
So I think I know that they're supposed to mean. But theory is easier than practice, and I'm having a hard time actually applying the concepts.
Consider the following:
I'm making a range (similar to an iterator) that iterates over two other ranges, and adds the results. (For functional programmers: I'm aware that I can use map!("a + b") instead, but I'm ignoring that for now, since it doesn't illustrate the question.) So I have code that looks like this in pseudocode:
void add(Range range1, Range range2)
{
Range result;
while (!range1.empty)
{
assert(!range2.empty); //Should this be an assertion or enforcement?
result += range1.front + range2.front;
range1.popFront();
range2.popFront();
}
}
Should that be an assertion or an enforcement? (Is it the caller's fault that the ranges don't empty at the same time? It might not have control of where the range came from -- it could've come from a user -- but then again, it still looks like a bug, doesn't it?)
Or here's another pseudocode example:
uint getFileSize(string path)
{
HANDLE hFile = CreateFile(path, ...);
assert(hFile != INVALID_HANDLE_VALUE); //Assertion or enforcement?
return GetFileSize(hFile); //and close the handle, obviously
}
...
Should this be an assertion or an enforcement? The path might come from a user -- so it might not be a bug -- but it's still a precondition of this method that the path should be valid. Do I assert or enforce?
Thanks!
I'm not sure it is entirely language-neutral. No language that I use has enforce(), and if I encountered one that did then I would want to use assert and enforce in the ways they were intended, which might be idiomatic to that language.
For instance assert in C or C++ stops the program when it fails, it doesn't throw an exception, so its usage may not be the same as what you're talking about. You don't use assert in C++ unless you think that either the caller has already made an error so grave that they can't be relied on to clean up (e.g. passing in a negative count), or else some other code elsewhere has made an error so grave that the program should be considered to be in an undefined state (e.g. your data structure appears corrupt). C++ does distinguish between runtime errors and logic errors, though, which may roughly correspond but I think are mostly about avoidable vs. unavoidable errors.
In the case of add you'd use a logic error if the author's intent is that a program which provides mismatched lists has bugs and needs fixing, or a runtime exception if it's just one of those things that might happen. For instance if your function were to handle arbitrary generators, that don't necessarily have a means of reporting their length short of destructively evaluating the whole sequence, you'd be more likely consider it an unavoidable error condition.
Calling it a logic error implies that it's the caller's responsibility to check the length before calling add, if they can't ensure it by the exercise of pure reason. So they would not be passing in a list from a user without explicitly checking the length first, and in all honesty should count themselves lucky they even got an exception rather than undefined behavior.
Calling it a runtime error expresses that it's "reasonable" (if abnormal) to pass in lists of different lengths, with the exception indicating that it happened on this occasion. Hence I think an enforcement rather than an assertion.
In the case of filesize: for the existence of a file, you should if possible treat that as a potentially recoverable failure (enforcement), not a bug (assertion). The reason is simply that there is no way for the caller to be certain that a file exists - there's always someone with more privileges who can come along and remove it, or unmount the entire fielsystem, in between a check for existence and a call to filesize. It's therefore not necessarily a logical flaw in the calling code when it doesn't exist (although the end-user might have shot themselves in the foot). Because of that fact it's likely there will be callers who can treat it as just one of those things that happens, an unavoidable error condition. Creating a file handle could also fail for out-of-memory, which is another unavoidable error on most systems, although not necessarily a recoverable one if for example over-committing is enabled.
Another example to consider is operator[] vs. at() for C++'s vector. at() throws out_of_range, a logic error, not because it's inconceivable that a caller might want to recover, or because you have to be some kind of numbskull to make the mistake of accessing an array out of range using at(), but because the error is entirely avoidable if the caller wants it to be - you can always check the size() before access if you have no other way of knowing whether your index is good or not. And so operator[] doesn't guarantee any checks at all, and in the name of efficiency an out of range access has undefined behavior.
assert should be considered a "run-time checked comment" indicating an assumption that the programmer makes at that moment. The assert is part of the function implementation. A failed assert should always be considered a bug at the point where the wrong assumption is made, so at the code location of the assert. To fix the bug, use a proper means to avoid the situation.
The proper means to avoid bad function inputs are contracts, so the example function should have a input contract that checks that range2 is at least as long as range1. The assertion inside the implementation could then still remain in place. Especially in longer more complex implementations, such an assert may inprove understandability.
An enforce is a lazy approach to throwing runtime exceptions. It is nice for quick-and-dirty code because it is better to have a check in there rather then silently ignoring the possibility of a bad condition. For production code, it should be replaced by a proper mechanism that throws a more meaningful exception.
I believe you have partly answered your question yourself. Assertions are bound to break the flow. If your assertion is wrong, you will not agree to continue with anything. If you enforce something you are making a decision to allow something to happen based on the situation. If you find that the conditions are not met, you can enforce that the entry to a particular section is denied.

How to explain method calls?

let's consider a small method:
int MyFunction(string foo, int bar)
{
...
}
and some calls:
MyFunction("",0)
int x = MyFunction(foo1,bar1)
How would you explain this to a non-technical persons? Has anybody a nice metaphor?
I tried to explain method calling (or function application) several times, but I failed. Seems I can't find the right words here.
Regards,
forki
UPDATE: It is important for me to explain how the parameters are passed / matched.
(Highly non-technical solution)
It's like making an order:
Calling the method = dialing the right number
Passing the arguments = giving your details
the method does is job
Getting a return value = getting what you ordered
You could tell function is a process available into an object that could be called by other. Lets say "You" is an object with function "Work". Your "Boss" is the caller object. Your Boss then can call you to Work with different type (which is parameter).
In the end Your "Boss" can ask "You" to Work("encode this") or Work("check email") or Work("finish deadline"), etc.
How about delegating a task? Imagine you’re baking a cake and ran out of flour. Instead of buying some yourself you could just send your kid with instructions to buy flour. Input: money, output: flour.
It's difficult to understand the "method call" concept if you don't understand first the
flow of control.
A simple explanation is that methods, or routines, is a construct for packeting instructions
in order to reuse them and make the code more readable.
Calling a method, temporarily, switches the execution flow to that method.
C:: do(a ,b)
You are telling C to do something , given the condition a and b.
The best approach is probably to come up with a domain specific example which the person you are explaining to can relate to. If she is working with the post office, you should describe the function "send letter with this text to this recipient", where recipient is a parameter (containing the address) and message is the parameter for the textual content.
Parameter order is not important as long as you have a name for each parameter. Trying to explain why order is important in some arcane programming language is fruitless.
How about
Calling a function: Ask the software to perform xxx task
Returning value type function: Ask your software to perform xxx task and tell you the outcome of the operation
Calling a function with param: Given X is this value and Y is thisvalue, ask your software to perform xxx task (and tell you the outcome of the operation)
Think of the system as a teller at a desk. To call a function you fill in a form to ask the system to do something, hand it to the teller. They go off and do the work, then hand you back a piece of paper with the result written on it. Depending on what you want the system to do, you pick an appropriate form.
The form for MyMethod says:
MYMETHOD REQUISITION FORM:
String _______
int _______
This analogy can be extended in all kinds of ways. Wouldn't it be handy if the form told you what the purpose of the String and int were? That's where languages with named parameters come in.
For OO, instead of having one desk for the whole system, each object is its own teller, you hand a form to them, and in order to get the job done, they hand a lot more forms back and forth between each other. Etc.

Should functions that only output return anything?

I'm rewriting a series of PHP functions to a container class. Many of these functions do a bit of processing, but in the end, just echo content to STDOUT.
My question is: should I have a return value within these functions? Is there a "best practice" as far as this is concerned?
In systems that report errors primarily through exceptions, don't return a return value if there isn't a natural one.
In systems that use return values to indicate errors, it's useful to have all functions return the error code. That way, a user can simply assume that every single function returns an error code and develop a pattern to check them that they follow everywhere. Even if the function can never fail right now, return a success code. That way if a future change makes it possible to have an error, users will already be checking errors instead of implicitly silently ignoring them (and getting really confused why the system is behaving oddly).
Can the processing fail? If so, should the caller know about that? If either of these is no, then I don't see value in a return. However, if the processing can fail, and that can make a difference to the caller, then I'd suggest returning a status or error code.
Do not return a value if there is no value to return. If you have some value you need to convey to the caller, then return it but that doesn't sound like the case in this instance.
I will often "return: true;" in these cases, as it provides a way to check that the function worked. Not sure about best practice though.
Note that in C/C++, the output functions (including printf()) return the number of bytes written, or -1 if this fails. It may be worth investigating this further to see why it's been done like this. I confess that
I'm not sure that writing to stdout could practically fail (unless you actively close your STDOUT stream)
I've never seen anyone collect this value, let alone do anything with it.
Note that this is distinct from writing to file streams - I'm not counting stream redirection in the shell.
To do the "correct" thing, if the point of the method is only to print the data, then it shouldn't return anything.
In practice, I often find that having such functions return the text that they've just printed can often be useful (sometimes you also want to send an error message via email or feed it to some other function).
In the end, the choice is yours. I'd say it depends on how much of a "purist" you are about such things.
You should just:
return;
In my opinion the SRP (single responsibility principle) is applicable for methods/functions as well, and not only for objects. One method should do one thing, if it outputs data it shouldn't do any data processing - if it doesn't do processing it shouldn't return data.
There is no need to return anything, or indeed to have a return statement. It's effectively a void function, and it's comprehensible enough that these have no return value. Putting in a 'return;' solely to have a return statement is noise for the sake of pedantry.

api documentation and "value limits": do they match?

Do you often see in API documentation (as in 'javadoc of public functions' for example) the description of "value limits" as well as the classic documentation ?
Note: I am not talking about comments within the code
By "value limits", I mean:
does a parameter can support a null value (or an empty String, or...) ?
does a 'return value' can be null or is guaranteed to never be null (or can be "empty", or...) ?
Sample:
What I often see (without having access to source code) is:
/**
* Get all readers name for this current Report. <br />
* <b>Warning</b>The Report must have been published first.
* #param aReaderNameRegexp filter in order to return only reader matching the regexp
* #return array of reader names
*/
String[] getReaderNames(final String aReaderNameRegexp);
What I like to see would be:
/**
* Get all readers name for this current Report. <br />
* <b>Warning</b>The Report must have been published first.
* #param aReaderNameRegexp filter in order to return only reader matching the regexp
* (can be null or empty)
* #return array of reader names
* (null if Report has not yet been published,
* empty array if no reader match criteria,
* reader names array matching regexp, or all readers if regexp is null or empty)
*/
String[] getReaderNames(final String aReaderNameRegexp);
My point is:
When I use a library with a getReaderNames() function in it, I often do not even need to read the API documentation to guess what it does. But I need to be sure how to use it.
My only concern when I want to use this function is: what should I expect in term of parameters and return values ? That is all I need to know to safely setup my parameters and safely test the return value, yet I almost never see that kind of information in API documentation...
Edit:
This can influence the usage or not for checked or unchecked exceptions.
What do you think ? value limits and API, do they belong together or not ?
I think they can belong together but don't necessarily have to belong together. In your scenario, it seems like it makes sense that the limits are documented in such a way that they appear in the generated API documentation and intellisense (if the language/IDE support it).
I think it does depend on the language as well. For example, Ada has a native data type that is a "restricted integer", where you define an integer variable and explicitly indicate that it will only (and always) be within a certain numeric range. In that case, the datatype itself indicates the restriction. It should still be visible and discoverable through the API documentation and intellisense, but wouldn't be something that a developer has to specify in the comments.
However, languages like Java and C# don't have this type of restricted integer, so the developer would have to specify it in the comments if it were information that should become part of the public documentation.
I think those kinds of boundary conditions most definitely belong in the API. However, I would (and often do) go a step further and indicate WHAT those null values mean. Either I indicate it will throw an exception, or I explain what the expected results are when the boundary value is passed in.
It's hard to remember to always do this, but it's a good thing for users of your class. It's also difficult to maintain it if the contract the method presents changes (like null values are changed to no be allowed)... you have to be diligent also to update the docs when you change the semantics of the method.
Question 1
Do you often see in API documentation (as in 'javadoc of public functions' for example) the description of "value limits" as well as the classic documentation?
Almost never.
Question 2
My only concern when I want to use this function is: what should I expect in term of parameters and return values ? That is all I need to know to safely setup my parameters and safely test the return value, yet I almost never see that kind of information in API documentation...
If I used a function not properly I would expect a RuntimeException thrown by the method or a RuntimeException in another (sometimes very far) part of the program.
Comments like #param aReaderNameRegexp filter in order to ... (can be null or empty) seems to me a way to implement Design by Contract in a human-being language inside Javadoc.
Using Javadoc to enforce Design by Contract was used by iContract, now resurrected into JcontractS, that let you specify invariants, preconditions, postconditions, in more formalized way compared to the human-being language.
Question 3
This can influence the usage or not for checked or unchecked exceptions.
What do you think ? value limits and API, do they belong together or not ?
Java language doesn't have a Design by Contract feature, so you might be tempted to use Execption but I agree with you about the fact that you have to be aware about When to choose checked and unchecked exceptions. Probably you might use unchecked IllegalArgumentException, IllegalStateException, or you might use unit testing, but the major problem is how to communicate to other programmers that such code is about Design By Contract and should be considered as a contract before changing it too lightly.
I think they do, and have always placed comments in the header files (c++) arcordingly.
In addition to valid input/output/return comments, I also note which exceptions are likly to be thrown by the function (since I often want to use the return value for...well returning a value, I prefer exceptions over error codes)
//File:
// Should be a path to the teexture file to load, if it is not a full path (eg "c:\example.png") it will attempt to find the file usign the paths provided by the DataSearchPath list
//Return: The pointer to a Texture instance is returned, in the event of an error, an exception is thrown. When you are finished with the texture you chould call the Free() method.
//Exceptions:
//except::FileNotFound
//except::InvalidFile
//except::InvalidParams
//except::CreationFailed
Texture *GetTexture(const std::string &File);
#Fire Lancer: Right! I forgot about exception, but I would like to see them mentioned, especially the unchecked 'runtime' exception that this public method could throw
#Mike Stone:
you have to be diligent also to update the docs when you change the semantics of the method.
Mmmm I sure hope that the public API documentation is at the very least updated whenever a change -- that affects the contract of the function -- takes place. If not, those API documentations could be drop altogether.
To add food to yours thoughts (and go with #Scott Dorman), I just stumble upon the future of java7 annotations
What does that means ? That certain 'boundary conditions', rather than being in the documentation, should be better off in the API itself, and automatically used, at compilation time, with appropriate 'assert' generated code.
That way, if a '#CheckForNull' is in the API, the writer of the function might get away with not even documenting it! And if the semantic change, its API will reflect that change (like 'no more #CheckForNull' for instance)
That kind of approach suggests that documentation, for 'boundary conditions', is an extra bonus rather than a mandatory practice.
However, that does not cover the special values of the return object of a function. For that, a complete documentation is still needed.