api documentation and "value limits": do they match? - language-agnostic

Do you often see in API documentation (as in 'javadoc of public functions' for example) the description of "value limits" as well as the classic documentation ?
Note: I am not talking about comments within the code
By "value limits", I mean:
does a parameter can support a null value (or an empty String, or...) ?
does a 'return value' can be null or is guaranteed to never be null (or can be "empty", or...) ?
Sample:
What I often see (without having access to source code) is:
/**
* Get all readers name for this current Report. <br />
* <b>Warning</b>The Report must have been published first.
* #param aReaderNameRegexp filter in order to return only reader matching the regexp
* #return array of reader names
*/
String[] getReaderNames(final String aReaderNameRegexp);
What I like to see would be:
/**
* Get all readers name for this current Report. <br />
* <b>Warning</b>The Report must have been published first.
* #param aReaderNameRegexp filter in order to return only reader matching the regexp
* (can be null or empty)
* #return array of reader names
* (null if Report has not yet been published,
* empty array if no reader match criteria,
* reader names array matching regexp, or all readers if regexp is null or empty)
*/
String[] getReaderNames(final String aReaderNameRegexp);
My point is:
When I use a library with a getReaderNames() function in it, I often do not even need to read the API documentation to guess what it does. But I need to be sure how to use it.
My only concern when I want to use this function is: what should I expect in term of parameters and return values ? That is all I need to know to safely setup my parameters and safely test the return value, yet I almost never see that kind of information in API documentation...
Edit:
This can influence the usage or not for checked or unchecked exceptions.
What do you think ? value limits and API, do they belong together or not ?

I think they can belong together but don't necessarily have to belong together. In your scenario, it seems like it makes sense that the limits are documented in such a way that they appear in the generated API documentation and intellisense (if the language/IDE support it).
I think it does depend on the language as well. For example, Ada has a native data type that is a "restricted integer", where you define an integer variable and explicitly indicate that it will only (and always) be within a certain numeric range. In that case, the datatype itself indicates the restriction. It should still be visible and discoverable through the API documentation and intellisense, but wouldn't be something that a developer has to specify in the comments.
However, languages like Java and C# don't have this type of restricted integer, so the developer would have to specify it in the comments if it were information that should become part of the public documentation.

I think those kinds of boundary conditions most definitely belong in the API. However, I would (and often do) go a step further and indicate WHAT those null values mean. Either I indicate it will throw an exception, or I explain what the expected results are when the boundary value is passed in.
It's hard to remember to always do this, but it's a good thing for users of your class. It's also difficult to maintain it if the contract the method presents changes (like null values are changed to no be allowed)... you have to be diligent also to update the docs when you change the semantics of the method.

Question 1
Do you often see in API documentation (as in 'javadoc of public functions' for example) the description of "value limits" as well as the classic documentation?
Almost never.
Question 2
My only concern when I want to use this function is: what should I expect in term of parameters and return values ? That is all I need to know to safely setup my parameters and safely test the return value, yet I almost never see that kind of information in API documentation...
If I used a function not properly I would expect a RuntimeException thrown by the method or a RuntimeException in another (sometimes very far) part of the program.
Comments like #param aReaderNameRegexp filter in order to ... (can be null or empty) seems to me a way to implement Design by Contract in a human-being language inside Javadoc.
Using Javadoc to enforce Design by Contract was used by iContract, now resurrected into JcontractS, that let you specify invariants, preconditions, postconditions, in more formalized way compared to the human-being language.
Question 3
This can influence the usage or not for checked or unchecked exceptions.
What do you think ? value limits and API, do they belong together or not ?
Java language doesn't have a Design by Contract feature, so you might be tempted to use Execption but I agree with you about the fact that you have to be aware about When to choose checked and unchecked exceptions. Probably you might use unchecked IllegalArgumentException, IllegalStateException, or you might use unit testing, but the major problem is how to communicate to other programmers that such code is about Design By Contract and should be considered as a contract before changing it too lightly.

I think they do, and have always placed comments in the header files (c++) arcordingly.
In addition to valid input/output/return comments, I also note which exceptions are likly to be thrown by the function (since I often want to use the return value for...well returning a value, I prefer exceptions over error codes)
//File:
// Should be a path to the teexture file to load, if it is not a full path (eg "c:\example.png") it will attempt to find the file usign the paths provided by the DataSearchPath list
//Return: The pointer to a Texture instance is returned, in the event of an error, an exception is thrown. When you are finished with the texture you chould call the Free() method.
//Exceptions:
//except::FileNotFound
//except::InvalidFile
//except::InvalidParams
//except::CreationFailed
Texture *GetTexture(const std::string &File);

#Fire Lancer: Right! I forgot about exception, but I would like to see them mentioned, especially the unchecked 'runtime' exception that this public method could throw
#Mike Stone:
you have to be diligent also to update the docs when you change the semantics of the method.
Mmmm I sure hope that the public API documentation is at the very least updated whenever a change -- that affects the contract of the function -- takes place. If not, those API documentations could be drop altogether.
To add food to yours thoughts (and go with #Scott Dorman), I just stumble upon the future of java7 annotations
What does that means ? That certain 'boundary conditions', rather than being in the documentation, should be better off in the API itself, and automatically used, at compilation time, with appropriate 'assert' generated code.
That way, if a '#CheckForNull' is in the API, the writer of the function might get away with not even documenting it! And if the semantic change, its API will reflect that change (like 'no more #CheckForNull' for instance)
That kind of approach suggests that documentation, for 'boundary conditions', is an extra bonus rather than a mandatory practice.
However, that does not cover the special values of the return object of a function. For that, a complete documentation is still needed.

Related

Why is my %h is List = 1,2; a valid assignment?

While finalizing my upcoming Raku Advent Calendar post on sigils, I decided to double-check my understanding of the type constraints that sigils create. The docs describe sigil type constraints with the table
below:
Based on this table (and my general understanding of how sigils and containers work), I strongly expected this code
my %percent-sigil is List = 1,2;
my #at-sigil is Map = :k<v>;
to throw an error.
Specifically, I expected that is List would attempt to bind the %-sigiled variable to a List, and that this would throw an X::TypeCheck::Binding error – the same error that my %h := 1,2 throws.
But it didn't error. The first line created a List that seemed perfectly ordinary in every way, other than the sigil on its variable. And the second created a seemingly normal Map. Neither of them secretly had Scalar intermediaries, at least as far as I could tell with VAR and similar introspection.
I took a very quick look at the World.nqp source code, and it seems at least plausible that discarding the % type constraint with is List is intended behavior.
So, is this behavior correct/intended? If so, why? And how does that fit in with the type constraints and other guarantees that sigils typically provide?
(I have to admit, seeing an %-sigiled variable that doesn't support Associative indexing kind of shocked me…)
I think this is a grey area, somewhere between DIHWIDT (Docter, It Hurts When I Do This) and an oversight in implementation.
Thing is, you can create your own class and use that in the is trait. Basically, that overrides the type with which the object will be created from the default Hash (for %) and Array (for # sigils). As long as you provide the interface methods, it (currently) works. For example:
class Foo {
method AT-KEY($) { 42 }
}
my %h is Foo;
say %h<a>; # 42
However, if you want to pass such an object as an argument to a sub with a % sigil in the signature, it will fail because the class did not consume the Associatve role:
sub bar(%) { 666 }
say bar(%h);
===SORRY!=== Error while compiling -e
Calling bar(A) will never work with declared signature (%)
I'm not sure why the test for Associative (for the % sigil) and Positional (for #) is not enforced at compile time with the is trait. I would assume it was an oversight, maybe something to be fixed in 6.e.
Quoting the Parameters and arguments section of the S06 specification/speculation document about the related issue of binding arguments to routine parameters:
Array and hash parameters are simply bound "as is". (Conjectural: future versions ... may do static analysis and forbid assignments to array and hash parameters that can be caught by it. This will, however, only happen with the appropriate use declaration to opt in to that language version.)
Sure enough the Rakudo compiler implemented some rudimentary static analysis (in its AOT compilation optimization pass) that normally (but see footnote 3 in this SO answer) insists on binding # routine parameters to values that do the Positional role and % ones to Associatives.
I think this was the case from the first official Raku supporting release of Rakudo, in 2016, but regardless, I'm pretty sure the "appropriate use declaration" is any language version declaration, including none. If your/our druthers are static typing for the win for # and % sigils, and I think they are, then that's presumably very appropriate!
Another source is the IRC logs. A quick search quickly got me nothing.
Hmm. Let's check the blame for the above verbiage so I can find when it was last updated and maybe spot contemporaneous IRC discussion. Oooh.
That is an extraordinary read.
"oversight" isn't the right word.
I don't have time tonight to search the IRC logs to see what led up to that commit, but I daresay it's interesting. The previous text was talking about a PL design I really liked the sound of in terms of immutability, such that code could become increasingly immutable by simply swapping out one kind of scalar container for another. Very nice! But reality is important, and Jonathan switched the verbiage to the implementation reality. The switch toward static typing certainty is welcome, but has it seriously harmed the performance and immutability options? I don't know. Time for me to go to sleep and head off for seasonal family visits. Happy holidays...

What are better ways to create a method that takes many arguments? (10+?)

I was looking at some code of a fellow developer, and almost cried. In the method definition there are 12 arguments. From my experience..this isn't good. If it were me, I would have sent in an object of some sort.
Is there another / more preferred way to do this (in other words, what's the best way to fix this and explain why)?
public long Save (
String today,
String name,
String desc,
int ID,
String otherNm,
DateTime dt,
int status,
String periodID,
String otherDt,
String submittedDt
)
ignore my poor variable names - they are examples
It highly depends on the language.
In a language without compile-time typechecking (e.g. python, javascript, etc.) you should use keyword arguments (common in python: you can access them like a dictionary passed in as an argument) or objects/dictionaries you manually pass in as arguments (common in javascript).
However the "argument hell" you described is sometimes "the right way to do things" for certain languages with compile-time typechecking, because using objects will obfuscate the semantics from the typechecker. The solution then would be to use a better language with compile-time typechecking which allows pattern-matching of objects as arguments.
Yes, use objects. Also, the function is probably doing too much if it needs all of this information, so use smaller functions.
Use objects.
class User { ... }
User user = ...
Save(user);
It decision provides easy way for adding new parameters.
It depends on how complex the function is. If it does something non-trivial with each of those arguments, it should probably be split. If it just passes them through, they should probably be collected in an object. But if it just creates a row in a table, it's not really big deal. It's less of a deal if your language supports keyword arguments.
I imagine the issue you're experiencing is being able to look at the method call and know what argument is receiving what value. This is a pernicious problem in a language like Java, which lacks something like keyword arguments or JSON hashes to pass named arguments.
In this situation, the Builder pattern is a useful solution. It's more objects, three total, but leads to more comprehensible code for the problem you're describing. So the three objects in this case would be as such:
Thing: stateful entity, typically immutable (i.e. getters only)
ThingBuilder: factory class, creates a Thing entity and sets its values.
ThingDAO: not necessary for using the Builder pattern, but addresses your question.
Interaction
/*
ThingBuilder is a static inner class of Thing, where each of its
"set" method calls returns the ThingBuilder instance being worked with
while the final "build()" call returns the instantiated Thing instance.
*/
Thing thing = Thing.createBuilder().
.setToday("2012/04/01")
.setName("Example")
// ...etc...
.build();
// the Thing instance as get methods for each property
thing.getName();
// get your reference to thingDAO however it's done
thingDAO.save(thing);
The result is you get named arguments and an immutable instance.

Why use an exception instead of if...else

For example, in the case of "The array index out of bound" exception, why don't we check the array length in advance:
if(array.length < countNum)
{
//logic
}
else
{
//replace using exception
}
My question is, why choose to use an exception? and when to use an exception, instead of if-else
Thanks.
It depends on acceptable practices for a given language.
In Java, the convention is to always check conditions whenever possible and not to use exceptions for flow control. But, for example, in Python not only using exception in this manner is acceptable, but it is also a preferred practice.
They are used to inform the code that calls your code an exceptional condition occurred. Exceptions are more expensive than well formed if/else logic so you use them in exceptional circumstances such as reaching a condition in your code you cannot handle locally, or to support giving the caller of your code the choice of how to handle the error condition.
Usually if you find yourself throwing and catching exceptions in your own function or method, you can probably find a more efficient way of doing it.
There are many answers to that question. As a single example, from Java, when you are using multiple threads, sometimes you need to interrupt a thread, and the thread will see this when an InterruptedException is thrown.
Other times, you will be using an API that throws certain exceptions. You won't be able to avoid it. If the API throws, for example, an IOException, then you can catch it, or let it bubble up.
Here's an example where it would actually be better to use an exception instead of a conditional.
Say you had a list of 10,000 strings. Now, you only want those items which are integers. Now, you know that a very small number of them won't be integers (in string form). So should you check to see if every string is an integer before trying to convert them? Or should you just try to convert them and throw and catch an exception if you get one that isn't an integer? The second way is more efficient, but if they were mostly non-integers then it would be more efficient to use an if-statement.
Most of the time, however, you should not use exceptions if you can replace them with a conditional.
As someone has already said, 'Exceptions' in programming languages are for exceptional cases and not to set logical flow of your program. For example, in the case of given code snippet of your question, you have to see what the enclosing method's or function's intention is. Is checking array.length < countNum part of the business logic or not. If yes, then putting a pair of if/else there is the way to go. If that condition is not part of the business logic and the enclosing method's intention is something else, then write code for that something else and throw exception instead of going the if/else way. For example you develop an application for a school and in your application you have a method GetClassTopperGrades which is responsible for the business logic part which requires to return the highest marks of the student in a certain class. the method/function definition would be something like this:
int GetClassTopperGrades(string classID)
In this case the method's intention is to return the grades, for a valid class, which will always be a positive integer, according to the business logic of the application. Now if someone calls your method and passes a garbage string or null, what should it do? If should throw an exception e.g. ArgumentException or 'ArgumentNullException' because this was an exceptional case in this particular context. The method assumed that always a valid class ID will be passed and NULL or empty string is NOT a valid class ID (a deviation from the business logic).
Apart from that, in some conditions there is no prior knowledge about the outcome of a given code and no defined way to prevent an exceptional situation. For example, querying some remote database, if the network goes down, you don't have any other option there apart from throwing an exception. Would you check network connectivity before issuing every SQL query to the remote database?
There is strong and indisputable reason why to use exceptions - no matter of language. I strongly believe that decision about if to use exceptions or not have nothing to do with particular language used.
Using exceptions is universal method to notify other part of code that something wrong happened in kind of loosely coupled way. Let imagine that if you would like to handle some exceptional condition by using if.. nad else.. you need to insert into different part of your code some arbitrary variables and other stuff which probably would easily led to have spaghetti code soon after.
Let next imagine that you are using any external library/package and it's author decided to put in his/her code other arbitrary way to handle wrong states - it would force you to adjust to its way of dealing with it - for example you would need to check if particular methods returns true or false or whatever. Using exceptions makes handling errors much more easy - you just assume that if something goes wrong - the other code will throw exception, so you just wrap the code in try block and handle possible exception on your own way.

ActionScript * datatype - what is it?

In Flex I can specify a * as datatype.
What is it and how should I use it.
I've written a inArray() method that takes as argument something to search for and something to search in. The "something to search for" looks like a great candidate for the * datatype.
Should I not use it?
Thank you.
* indicates an untyped identifier. You'd use it when you can't guarantee that the type of an object will conform to any specific class or interface definition.
Whether * is appropriate in your example depends on what operations you use in your search. If you just rely on common equality operations, you should be fine, and in fact it would probably be the best choice.
The * is to AS3 as Object is to Java. See the following:
http://livedocs.adobe.com/flex/3/html/help.html?content=03_Language_and_Syntax_11.html
The * actually explicitly says, that something has no type. In fact it's the default type, all untyped variables or functions fall back to.
It basically tells the compiler: "I don't want you to perform any typecheck on this, because this simply gets handled at runtime".
In fact * is no type. There is simply no value corresponding to it.
Please note, that only expressions of type * can return the value of undefined. Anything, that is of type Object (i.e. anything that is of a different type, than no type), can only hold null.
You will see that many Array methods use that type. That's specifically, because Array elements are of no type and can thus be undefined.
Therefore, I suppose you could have a slight performance difference based on whether your parameter is typed * or Object (or whether you use strict comparison or not, for that matter).
Personally, I would advise you use * in this case, mostly, because it is consistent with Array methods.
you have to use * when strict mode of compilation is enabled. It's just 'no type', and it will be checked at runtime.

Are hard-coded STRINGS ever acceptable?

Similar to Is hard-coding literals ever acceptable?, but I'm specifically thinking of "magic strings" here.
On a large project, we have a table of configuration options like these:
Name Value
---- -----
FOO_ENABLED Y
BAR_ENABLED N
...
(Hundreds of them).
The common practice is to call a generic function to test an option like this:
if (config_options.value('FOO_ENABLED') == 'Y') ...
(Of course, this same option may need to be checked in many places in the system code.)
When adding a new option, I was considering adding a function to hide the "magic string" like this:
if (config_options.foo_enabled()) ...
However, colleagues thought I'd gone overboard and objected to doing this, preferring the hard-coding because:
That's what we normally do
It makes it easier to see what's going on when debugging the code
The trouble is, I can see their point! Realistically, we are never going to rename the options for any reason, so about the only advantage I can think of for my function is that the compiler would catch any typo like fo_enabled(), but not 'FO_ENABLED'.
What do you think? Have I missed any other advantages/disadvantages?
If I use a string once in the code, I don't generally worry about making it a constant somewhere.
If I use a string twice in the code, I'll consider making it a constant.
If I use a string three times in the code, I'll almost certainly make it a constant.
if (config_options.isTrue('FOO_ENABLED')) {...
}
Restrict your hard coded Y check to one place, even if it means writing a wrapper class for your Map.
if (config_options.isFooEnabled()) {...
}
Might seem okay until you have 100 configuration options and 100 methods (so here you can make a judgement about future application growth and needs before deciding on your implementation). Otherwise it is better to have a class of static strings for parameter names.
if (config_options.isTrue(ConfigKeys.FOO_ENABLED)) {...
}
I realise the question is old, but it came up on my margin.
AFAIC, the issue here has not been identified accurately, either in the question, or the answers. Forget about 'harcoding strings" or not, for a moment.
The database has a Reference table, containing config_options. The PK is a string.
There are two types of PKs:
Meaningful Identifiers, that the users (and developers) see and use. These PKs are supposed to be stable, they can be relied upon.
Meaningless Id columns which the users should never see, that the developers have to be aware of, and code around. These cannot be relied upon.
It is ordinary, normal, to write code using the absolute value of a meaningful PK IF CustomerCode = "IBM" ... or IF CountryCode = "AUS" etc.
referencing the absolute value of a meaningless PK is not acceptable (due to auto-increment; gaps being changed; values being replaced wholesale).
.
Your reference table uses meaningful PKs. Referencing those literal strings in code is unavoidable. Hiding the value will make maintenance more difficult; the code is no longer literal; your colleagues are right. Plus there is the additional redundant function that chews cycles. If there is a typo in the literal, you will soon find that out during Dev testing, long before UAT.
hundreds of functions for hundreds of literals is absurd. If you do implement a function, then Normalise your code, and provide a single function that can be used for any of the hundreds of literals. In which case, we are back to a naked literal, and the function can be dispensed with.
the point is, the attempt to hide the literal has no value.
.
It cannot be construed as "hardcoding", that is something quite different. I think that is where your issue is, identifying these constructs as "hardcoded". It is just referencing a Meaningfull PK literally.
Now from the perspective of any code segment only, if you use the same value a few times, you can improve the code by capturing the literal string in a variable, and then using the variable in the rest of the code block. Certainly not a function. But that is an efficiency and good practice issue. Even that does not change the effect IF CountryCode = #cc_aus
I really should use constants and no hard coded literals.
You can say they won't be changed, but you may never know. And it is best to make it a habit. To use symbolic constants.
In my experience, this kind of issue is masking a deeper problem: failure to do actual OOP and to follow the DRY principle.
In a nutshell, capture the decision at startup time by an appropriate definition for each action inside the if statements, and then throw away both the config_options and the run-time tests.
Details below.
The sample usage was:
if (config_options.value('FOO_ENABLED') == 'Y') ...
which raises the obvious question, "What's going on in the ellipsis?", especially given the following statement:
(Of course, this same option may need to be checked in many places in the system code.)
Let's assume that each of these config_option values really does correspond to a single problem domain (or implementation strategy) concept.
Instead of doing this (repeatedly, in various places throughout the code):
Take a string (tag),
Find its corresponding other string (value),
Test that value as a boolean-equivalent,
Based on that test, decide whether to perform some action.
I suggest encapsulating the concept of a "configurable action".
Let's take as an example (obviously just as hypthetical as FOO_ENABLED ... ;-) that your code has to work in either English units or metric units. If METRIC_ENABLED is "true", convert user-entered data from metric to English for internal computation, and convert back prior to displaying results.
Define an interface:
public interface MetricConverter {
double toInches(double length);
double toCentimeters(double length);
double toPounds(double weight);
double toKilograms(double weight);
}
which identifies in one place all the behavior associated with the concept of METRIC_ENABLED.
Then write concrete implementations of all the ways those behaviors are to be carried out:
public class NullConv implements MetricConverter {
double toInches(double length) {return length;}
double toCentimeters(double length) {return length;}
double toPounds(double weight) {return weight;}
double toKilograms(double weight) {return weight;}
}
and
// lame implementation, just for illustration!!!!
public class MetricConv implements MetricConverter {
public static final double LBS_PER_KG = 2.2D;
public static final double CM_PER_IN = 2.54D
double toInches(double length) {return length * CM_PER_IN;}
double toCentimeters(double length) {return length / CM_PER_IN;}
double toPounds(double weight) {return weight * LBS_PER_KG;}
double toKilograms(double weight) {return weight / LBS_PER_KG;}
}
At startup time, instead of loading a bunch of config_options values, initialize a set of configurable actions, as in:
MetricConverter converter = (metricOption()) ? new MetricConv() : new NullConv();
(where the expression metricOption() above is a stand-in for whatever one-time-only check you need to make, including looking at the value of METRIC_ENABLED ;-)
Then, wherever the code would have said:
double length = getLengthFromGui();
if (config_options.value('METRIC_ENABLED') == 'Y') {
length = length / 2.54D;
}
// do some computation to produce result
// ...
if (config_options.value('METRIC_ENABLED') == 'Y') {
result = result * 2.54D;
}
displayResultingLengthOnGui(result);
rewrite it as:
double length = converter.toInches(getLengthFromGui());
// do some computation to produce result
// ...
displayResultingLengthOnGui(converter.toCentimeters(result));
Because all of the implementation details related to that one concept are now packaged cleanly, all future maintenance related to METRIC_ENABLED can be done in one place. In addition, the run-time trade-off is a win; the "overhead" of invoking a method is trivial compared with the overhead of fetching a String value from a Map and performing String#equals.
I believe that the two reasons you have mentioned, Possible misspelling in string, that cannot be detected until run time and the possibility (although slim) of a name change would justify your idea.
On top of that you can get typed functions, now it seems you only store booleans, what if you need to store an int, a string etc. I would rather use get_foo() with a type, than get_string("FOO") or get_int("FOO").
I think there are two different issues here:
In the current project, the convention of using hard-coded strings is already well established, so all the developers working on the project are familiar with it. It might be a sub-optimal convention for all the reasons that have been listed, but everybody familiar with the code can look at it and instinctively knows what the code is supposed to do. Changing the code so that in certain parts, it uses the "new" functionality will make the code slightly harder to read (because people will have to think and remember what the new convention does) and thus a little harder to maintain. But I would guess that changing over the whole project to the new convention would potentially be prohibitively expensive unless you can quickly script the conversion.
On a new project, symbolic constants are the way IMO, for all the reasons listed. Especially because anything that makes the compiler catch errors at compile time that would otherwise be caught by a human at run time is a very useful convention to establish.
Another thing to consider is intent. If you are on a project that requires localization hard coded strings can be ambiguous. Consider the following:
const string HELLO_WORLD = "Hello world!";
print(HELLO_WORLD);
The programmer's intent is clear. Using a constant implies that this string does not need to be localized. Now look at this example:
print("Hello world!");
Here we aren't so sure. Did the programmer really not want this string to be localized or did the programmer forget about localization while he was writing this code?
I too prefer a strongly-typed configuration class if it is used through-out the code. With properly named methods you don't lose any readability. If you need to do conversions from strings to another data type (decimal/float/int), you don't need to repeat the code that does the conversion in multiple places and can cache the result so the conversion only takes place once. You've already got the basis of this in place already so I don't think it would take much to get used to the new way of doing things.