Related
I am trying to do a simple addition of data to a database table (PostgreSQL). At first, I couldn't even get a simple
$my_item = $_item_class->new(...);
to work. I discovered I had spelled a field differently in my code from what I had in my "model" code.
But, now, this is working, but when I try:
$my_item->save;
it seems an exception is thrown. All this is occurring in an eval {...} structure and I would like to catch the exception and see what is going wrong, but I don't know how to do that.
Why would something like the "save" be failing here? I have checked everything, and all seems right (of course!).
And, how do I catch the exception that seems to be being thrown?
Thank you!
I figured all this out myself. It was simple. I had duplicated a field in my class somehow when I had done an edit to it. That was all. The class just had two identically named fields specified in the hash table in the class, both with identical characteristics. When I removed one of these, the code worked.
With regard to my second question about how to catch the exception, I had to learn how to have an
if ($#) {
.
.
.
}
right after my "eval {...}" structure. Because I am new to Perl, I didn't understand that. But, it was actually pretty easy to figure out. My problem was that I was working from some code as a model for me that didn't do that but named specific exceptions that were thrown in its "eval {...}" code. So, I thought that I had to have the names of exceptions that could be thrown by Rose::DB::Object calls, but I couldn't find any such exceptions in the documentation. When I learned about "if ($#) {...}", I was able to print out the reported exception in $# and from that I was able to see the problem with the duplicate field I mentioned above.
That was all there was to it. Everything is working just fine now.
Take the following code:
private var m_iQuanitity:int;
public function get quantity():int
{
return m_iQuantity;
}
That seems to make perfect sense. You can see what the quantity is from an outside class without any problems, but you can't really mess with it at all. Now take the following code:
private var m_acUsers:ArrayCollection = new ArrayCollection();
public function get users():ArrayCollection
{
return m_acUsers;
}
In that case you can't really set the variable directly, but you can still do just about everything else under the sun to it without any problems. You can call its AddItem and RemoveItemAt functions, which can do quite a bit to "set" the variable.
Does it still make sense to do this? I know you can create a duplicate ArrayCollection and just pass the duplicate back to avoid allowing it to be set, but doing stuff like that all over the place, purely for defensive programming, can waste a lot of CPU time. So I guess I'm asking if it still makes sense anyway, how so, and if I'm missing the point of using get and set completely? Thanks!
Syntactically there is nothing wrong with what you've got, but the second example does break down the concept of 'get' by making more than a read only property. If you need to adhere to a read only policy, then you've broken that since now you can manipulate the ArrayCollection.
In the end it comes down to what it is you're tying to do. Does it matter for the project that you can change the value? If you're working on a project with more than a few people, this type of coding will require you to either add a comment or have you explain what you're doing. When ever you do something outside of the norm, that can add confusion, so it's always best to simplify and stick to what is expected, avoiding having to explain something.
Also, I can think of a few ways this could cause problems - changing values outside of the function if you pass the returned property off to other classes that don't know where it came from and having internal code in the original class fail.
Here is my specific scenario.
I have a class QueryQueue that wraps the QueryTask class within the ArcGIS API for Flex. This enables me to easily queue up multiple query tasks for execution. Calling QueryQueue.execute() iterate through all the tasks in my queue and call their execute method.
When all the results have been received and processed QueryQueue will dispatch the completed event. The interface to my class is very simple.
public interface IQueryQueue
{
function get inProgress():Boolean;
function get count():int;
function get completed():ISignal;
function get canceled():ISignal;
function add(query:Query, url:String, token:Object = null):void;
function cancel():void;
function execute():void;
}
For the QueryQueue.execute method to be considered successful several things must occur.
task.execute must be called on each query task once and only once
inProgress = true while the results are pending
inProgress = false when the results have been processed
completed is dispatched when the results have been processed
canceled is never called
The processing done within the queue correctly processes and packages the query results
What I am struggling with is breaking these tests into readable, logical, and maintainable tests.
Logically I am testing one state, that is the successful execution state. This would suggest one unit test that would assert #1 through #6 above are true.
[Test] public mustReturnQueryQueueEventArgsWithResultsAndNoErrorsWhenAllQueriesAreSuccessful:void
However, the name of the test is not informative as it does not describe all the things that must be true in order to be considered a passing test.
Reading up online (including here and at programmers.stackexchange.com) there is a sizable camp that asserts that unit tests should only have one assertion (as a guideline). As a result when a test fails you know exactly what failed (i.e. inProgress not set to true, completed displayed multiple times, etc.) You wind up with potentially a lot more (but in theory simpler and clearer) tests like so:
[Test] public mustInvokeExecuteForEachQueryTaskWhenQueueIsNotEmpty():void
[Test] public mustBeInProgressWhenResultsArePending():void
[Test] public mustNotInProgressWhenResultsAreProcessedAndSent:void
[Test] public mustDispatchTheCompletedEventWhenAllResultsProcessed():void
[Test] public mustNeverDispatchTheCanceledEventWhenNotCanceled():void
[Test] public mustReturnQueryQueueEventArgsWithResultsAndNoErrorsWhenAllQueriesAreSuccessful:void
// ... and so on
This could wind up with a lot of repeated code in the tests, but that could be minimized with appropriate setup and teardown methods.
While this question is similar to other questions I am looking for an answer for this specific scenario as I think it is a good representation of a complex unit testing scenario exhibiting multiple states and behaviors that need to be verified. Many of the other questions have, unfortunately, no examples or the examples do not demonstrate complex state and behavior.
In my opinion, and there will probably be many, there are a couple of things here:
If you must test so many things for one method, then it could mean your code might be doing too much in one single method (Single Responsibility Principle)
If you disagree with the above, then the next thing I would say is that what you are describing is more of an integration/acceptance test. Which allows for multiple asserts, and you have no problems there. But, keep in mind that this might need to be relegated to a separate section of tests if you are doing automated tests (safe versus unsafe tests)
And/Or, yes, the preferred method is to test each piece separately as that is what a unit test is. The closest thing I can suggest, and this is about your tolerance for writing code just to have perfect tests...Is to check an object against an object (so you would do one assert that essentially tests this all in one). However, the argument against this is that, yes it passes the one assert per test test, but you still lose expressiveness.
Ultimately, your goal should be to strive towards the ideal (one assert per unit test) by focusing on the SOLID principles, but ultimately you do need to get things done or else there is no real point in writing software (my opinion at least :)).
Let's focus on the tests you have identified first. All except the last one (mustReturnQueryQueueEventArgs...) are good ones and I could immediatelly tell what's being tested there (and that's very good sign, indicating they're descriptive and most likely simple).
The only problem is your last test. Note that extensive use of words "and", "with", "or" in test name usually rings problems bell. It's not very clear what it's supposed to do. Return correct results comes to mind first, but one might argue it's vague term? This holds true, it is vague. However you'll often find out that this is indeed pretty common requirement, described in details by method/operation contract.
In your particular case, I'd simplify last test to verify whether correct results are returned and that would be all. You tested states, events and stuff that lead to results building already, so there is no need to that again.
Now, advices in links you provided are quite good ones actually, and generally, I suggest sticking to them (single assertion for one test). The question is, what single assertion really stands for? 1 line of code at the end of test? Let's consider this simple example then:
// a method which updates two fields of our custom entity, MyEntity
public void Update(MyEntity entity)
{
entity.Name = "some name";
entity.Value = "some value";
}
This method contract is to perform those 2 operations. By success, we understand entity to be correctly updated. If one of them for some reasons fails, method as a unit is considered to fail. You can see where this is going; you'll either have two assertions or write custom comparer purely for testing purposes.
Don't be tricked by single assertion; it's not about lines of code or number of asserts (however, in majority of tests you'll write this will indeed map 1:1), but about asserting single unit (in the example above, update is considered to be an unit). And unit might be in reality multiple things that don't make any sense at all without eachother.
And this is exactly what one of questions you linked quotes (by Roy Osherove):
My guideline is usually that you test one logical CONCEPT per test. you can have multiple asserts on the same object. they will usually be the same concept being tested.
It's all about concept/responsibility; not the number of asserts.
I am not familiar with flex, but I think I have good experience in unit testing, so you have to know that unit test is a philosophy, so for the first answer, yes you can make a multiple assert but if you test the same behavior, the main point always in unit testing is to be very maintainable and simple code, otherwise the unit test will need unit test to test it! So my advice to you is, if you are new in unit testing, don't use multiple assert, but if you have good experience with unit testing, you will know when you will need to use them
One thing I've sometimes wondered is which is the better style out of the two shown below (if any)? Is it better to return immediately if a guard condition hasn't been satisfied, or should you only do the other stuff if the guard condition is satisfied?
For the sake of argument, please assume that the guard condition is a simple test that returns a boolean, such as checking to see if an element is in a collection, rather than something that might affect the control flow by throwing an exception. Also assume that methods/functions are short enough not to require editor scrolling.
// Style 1
public SomeType aMethod() {
SomeType result = null;
if (!guardCondition()) {
return result;
}
doStuffToResult(result);
doMoreStuffToResult(result);
return result;
}
// Style 2
public SomeType aMethod() {
SomeType result = null;
if (guardCondition()) {
doStuffToResult(result);
doMoreStuffToResult(result);
}
return result;
}
I prefer the first style, except that I wouldn't create a variable when there is no need for it. I'd do this:
// Style 3
public SomeType aMethod() {
if (!guardCondition()) {
return null;
}
SomeType result = new SomeType();
doStuffToResult(result);
doMoreStuffToResult(result);
return result;
}
Having been trained in Jackson Structured Programming in the late '80s, my ingrained philosophy was always "a function should have a single entry-point and a single exit-point"; this meant I wrote code according to Style 2.
In the last few years I have come to realise that code written in this style is often overcomplex and hard to read/maintain, and I have switched to Style 1.
Who says old dogs can't learn new tricks? ;)
Style 1 is what the Linux kernel indirectly recommends.
From https://www.kernel.org/doc/Documentation/process/coding-style.rst, chapter 1:
Now, some people will claim that having 8-character indentations makes
the code move too far to the right, and makes it hard to read on a
80-character terminal screen. The answer to that is that if you need
more than 3 levels of indentation, you're screwed anyway, and should fix
your program.
Style 2 adds levels of indentation, ergo, it is discouraged.
Personally, I like style 1 as well. Style 2 makes it harder to match up closing braces in functions that have several guard tests.
I don't know if guard is the right word here. Normally an unsatisfied guard results in an exception or assertion.
But beside this I'd go with style 1, because it keeps the code cleaner in my opinion. You have a simple example with only one condition. But what happens with many conditions and style 2? It leads to a lot of nested ifs or huge if-conditions (with || , &&). I think it is better to return from a method as soon as you know that you can.
But this is certainly very subjective ^^
Martin Fowler refers to this refactoring as :
"Replace Nested Conditional with Guard Clauses"
If/else statements also brings cyclomatic complexity. Hence harder to test cases. In order to test all the if/else blocks you might need to input lots of options.
Where as if there are any guard clauses, you can test them first, and deal with the real logic inside the if/else clauses in a clearer fashion.
If you dig through the .net-Framework using .net-Reflector you will see the .net programmers use style 1 (or maybe style 3 already mentioned by unbeli).
The reasons are already mentioned by the answers above. and maybe one other reason is to make the code better readable, concise and clear.
the most thing this style is used is when checking the input parameters, you always have to do this if you program a kind of frawework/library/dll.
first check all input parameters than work with them.
It sometimes depends on the language and what kinds of "resources" that you are using (e.g. open file handles).
In C, Style 2 is definitely safer and more convenient because a function has to close and/or release any resources that it obtained during execution. This includes allocated memory blocks, file handles, handles to operating system resources such as threads or drawing contexts, locks on mutexes, and any number of other things. Delaying the return until the very end or otherwise restricting the number of exits from a function allows the programmer to more easily ensure that s/he properly cleans up, helping to prevent memory leaks, handle leaks, deadlock, and other problems.
In C++ using RAII-style programming, both styles are equally safe, so you can pick one that is more convenient. Personally I use Style 1 with RAII-style C++. C++ without RAII is like C, so, again, Style 2 is probably better in that case.
In languages like Java with garbage collection, the runtime helps smooth over the differences between the two styles because it cleans up after itself. However, there can be subtle issues with these languages, too, if you don't explicitly "close" some types of objects. For example, if you construct a new java.io.FileOutputStream and do not close it before returning, then the associated operating system handle will remain open until the runtime garbage collects the FileOutputStream instance that has fallen out of scope. This could mean that another process or thread that needs to open the file for writing may be unable to until the FileOutputStream instance is collected.
Although it goes against best practices that I have been taught I find it much better to reduce the nesting of if statements when I have a condition such as this. I think it is much easier to read and although it exits in more than one place it is still very easy to debug.
I would say that Style1 became more used because is the best practice if you combine it with small methods.
Style2 look a better solution when you have big methods. When you have them ... you have some common code that you want to execute no matter how you exit. But the proper solution is not to force a single exit point but to make the methods smaller.
For example if you want to extract a sequence of code from a big method, and this method has two exit points you start to have problems, is hard to do it automatically. When i have a big method written in style1 i usually transform it in style2, then i extract methods then in each of them i should have Style1 code.
So Style1 is best but is compatible with small methods.
Style2 is not so good but is recommended if you have big methods that you don't want, have time to split.
I prefer to use method #1 myself, it is logically easier to read and also logically more similar to what we are trying to do. (if something bad happens, exit function NOW, do not pass go, do not collect $200)
Furthermore, most of the time you would want to return a value that is not a logically possible result (ie -1) to indicate to the user who called the function that the function failed to execute properly and to take appropriate action. This lends itself better to method #1 as well.
I would say "It depends on..."
In situations where I have to perform a cleanup sequence with more than 2 or 3 lines before leaving a function/method I would prefer style 2 because the cleanup sequence has to be written and modified only once. That means maintainability is easier.
In all other cases I would prefer style 1.
Number 1 is typically the easy, lazy and sloppy way. Number 2 expresses the logic cleanly. What others have pointed out is that yes it can become cumbersome. This tendency though has an important benefit. Style #1 can hide that your function is probably doing too much. It doesn't visually demonstrate the complexity of what's going on very well. I.e. it prevents the code from saying to you "hey this is getting a bit too complex for this one function". It also makes it a bit easier for other developers that don't know your code to miss those returns sprinkled here and there, at first glance anyway.
So let the code speak. When you see long conditions appearing or nested if statements it is saying that maybe it would be better to break this stuff up into multiple functions or that it needs to be rewritten more elegantly.
The code base I'm currently working on is littered with hard-coded values.
I view all hard coded values as a code smell and I try to eliminate them where possible...however there are some cases that I am unsure about.
Here are two examples that I can think of that make me wonder what the best practice is:
1. MyTextBox.Text = someCondition ? "Yes" : "No"
2. double myPercentage = myValue / 100;
In the first case, is the best thing to do to create a class that allows me to do MyHelper.Yes and MyHelper.No or perhaps something similar in a config file (though it isn't likely to change and who knows if there might ever be a case where its usage would be case sensitive).
In the second case, finding a percentage by dividing by 100 isn't likely to ever change unless the laws of mathematics change...but I still wonder if there is a better way.
Can anyone suggest an appropriate way to deal with this sort of hard coding? And can anyone think of any places where hard coding is an acceptable practice?
And can anyone think of any places where hard coding is an acceptable practice?
Small apps
Single man projects
Throw aways
Short living projects
For short anything that won't be maintained by others.
Gee I've just realized how much being maintainer coder hurt me in the past :)
The real question isn't about hard coding, but rather repetition. If you take the excellent advice found in "The Pragmatic Programmer", simply Don't Repeat Yourself (DRY).
Taking the principle of DRY, it is fine to hardcode something at any point. However, once you use that particular value again, refactor so this value is only hardcoded once.
Of course hard-coding is sometimes acceptable. Following dogma is rarely as useful a practice as using your brain.
(For an example of this, perhaps it's interesting to go back to the goto wars. How many programmers do you know that will swear by all things holy that goto is evil? Why then does Steve McConnell devote a dozen pages to a measured discussion of the subject in Code Complete?)
Sure, there's a lot of hard-gained experience that tells us that small throw-away applications often mutate into production code, but that's no reason for zealotry. The agilists tell us we should do the simplest thing that could possibly work and refactor when needed.
That's not to say that the "simplest thing" shouldn't be readable code. It may make perfect sense, even in a throw-away spike to write:
const MAX_CACHE_RECORDS = 50
foo = GetNewCache(MAX_CACHE_RECORDS)
This is regardless of the fact that in three iterations time, someone might ask for the number of cache records to be configurable, and you might end up refactoring the constant away.
Just remember, if you go to the extremes of stuff like
const ONE_HUNDRED = 100
const ONE_HUNDRED_AND_ONE = 101
we'll all come to The Daily WTF and laugh at you. :-)
Think! That's all.
It's never good and you just proved it...
double myPercentage = myValue / 100;
This is NOT percentage. What you wanted to write is :
double myPercentage = (myValue / 100) * 100;
Or more correctly :
double myPercentage = (myValue / myMaxValue) * 100;
But this hard coded 100 messed with your mind... So go for the getPercentage method that Colen suggested :)
double getpercentage(double myValue, double maxValue)
{
return (myValue / maxValue) * 100;
}
Also as ctacke suggested, in the first case you will be in a world of pain if you ever need to localize these literals. It's never too much trouble to add a couple more variables and/or functions
The first case will kill you if you ever need to localize. Moving it to some static or constant that is app-wide would at least make localizing it a little easier.
Case 1: When should you hard-code stuff: when you have no reason to think that it will ever change. That said, you should NEVER hard code stuff in-line. Take the time to make static variables or global variables or whatever your language gives you. Do them in the class in question, and if you notice that two classes or areas of your code share the same value FOR THE SAME REASON (meaning it's not just coincidence), point them to the same place.
Case 2: For case case 2, you're correct: the laws of "percentage" will not change (being reasonable, here), so you can hard code inline.
Case 3: The third case is where you think the thing could change but you don't want to/have time to bother loading ResourceBundles or XML or whatever. In that case, you use whatever centralizing mechanism you can -- the hated Singleton class is a good one -- and go with that until you actually have need to deal with the problem.
The third case is tricky, though: it's extraordinarily hard to internationalize an application without really doing it... so you will want to hard-code stuff and just hope that, when the i18n guys come knocking, your code is not the worst-tasting code around :)
Edit: Let me mention that I've just finished a refactoring project in which the prior developer had placed the MySql connect strings in 100+ places in the code (PHP). Sometimes they were uppercase, sometimes they were lower case, etc., so they were hard to search and replace (though Netbeans and PDT did help a lot). There are reasons why he/she did this (a project called POG basically forces this stupidity), but there is just nothing that seems less like good code than repeating the same thing in a million places.
The better way for your second example would be to define an inline function:
double getpercentage(double myValue)
{
return(myValue / 100);
}
...
double myPercentage = getpercentage(myValue);
That way it's a lot more obvious what you're doing.
Hardcoded literals should appear in unit tests for the test values, unless there is so much reuse of a value within a single test class that a local constant is useful.
The unit tests are a description of expected values without any abstraction or redirection.
Imagine yourself reading the test - you want the information literally in front of you.
The only time I use constants for test values is when many tests repeat a value (itself a bit suspicious) and the value may be subject to change.
I do use constants for things like names of test files to compare.
I don't think that your second is really an example of hardcoding. That's like having a Halve() method that takes in a value to use to divide by; doesn't make sense.
Beyond that, example 1, if you want to change the language for your app, you don't want to have to change the class, so it should absolutely be in a config.
Hard coding should be avoided like Dracula avoids the sun. It'll come back to bite you in the ass eventually.
"hardcoding" is the wrong thing to worry about. The point is not whether special values are in code or in config files, the point is:
If the value could ever change, how much work is that and how hard is it to find? Putting it in one place and referring to that place elsewhere is not much work and therefore a way to play it safe.
Will maintainance programmers definitely understand why the value is what it is? If there is any doubt whatsoever, use a named constant that explains the meaning.
Both of these goals can be achieved without any need for config files; in fact I'd avoid those if possible. "putting stuff in config files means it's easier to change" is a myth, unless either
you actually want to support customers changing the values themselves
no value that could possibly be put in the config file can cause a bug (buffer overflow, anyone?)
your build and deployment process sucks
The text for the conditions should be in a resource file; that's what it's there for.
Not normally (Are hard-coding literals acceptable)
Another way at looking at this is how using a good naming convention
for constants used in-place of hard coded literals provides additional
documentation in the program.
Even if the number is used only once, it can still be hard to recognized
and may even be hard to find for future changes.
IMHO, making programs easier to read should be second nature to a
seasoned software professional. Raw numbers rarely communicate
meaningfully.
The extra time taken to use a well named constant will make the
code readability (easy to recall to the mind) and useful for future
re-mining (code re-use).
I tend to view it in terms of the project's scope and size.
Some simple projects that I am a solo dev on? Sure, I hard code lots of things. Tools I write that only I will ever use? Sure, if it gets the job done.
But, in working on larger, team projects? I agree, they are suspect and usually the product of laziness. Tag them for review and see if you can spot a pattern where they can be abstracted away.
In your example, the text box should be localizable, so why not a class that handles that?
Remember that you WILL forget the meaning of any non-obvious hard-coded value.
So be certain to put a short comment after each to remind you.
A Delphi example:
Length := Length * 0.3048; { 0.3048 converts feet to meters }
no.
What is a simple throw away app today will be driving your entire enterprise tomorrow. Always use best practices or you'll regret it.
Code always evolves. When you initially write stuff hard coding is the easiest way to go. Later when a need arrives to change the value it can be improved. In some cases the need never comes.
The need can arrive in many forms:
The value is used in many places and it needs to be changed by a programmer. In this case a constant is clearly needed.
User needs to be able to change the value.
I don't see the need to avoid hard coding. I do see the need to change things when there is a clear need.
Totally separate issue is that of course the code needs to be readable and this means that there might be a need for a comment for the hard coded value.
For the first value, it really depends. If you don't anticipate any kind of wide-spread adoption of your application and internationalization will never be an issue, I think it's mostly fine. However, if you are writing some kind of open source software or something with a larger audience consider the fact that it may one day need to be translated. In that case, you may be better off using string resources.
It's okay as long as you don't do refactoring, unit-testing, peer code reviews. And, you don't want repeat customers. Who cares?
I once had a boss who refused to not hardcode something because in his mind it gave him full control over the software and the items related to the software. Problem was, when the hardware died that ran the software the server got renamed... meaning he had to find his code. That took a while. I simply found a hex editor and hacked around it instead of waiting.
I normally add a set of helper methods for strings and numbers.
For example when I have strings such as 'yes' and 'no' I have a function called __ so I call __('yes'); which starts out in the project by just returning the first parameter but when I need to do more complex stuff (such as internationaizaton) it's already there and the param can be used a key.
Another example is VAT (form of UK tax) in online shops, recently it changed from 17.5% to 15%. Any one who hard coded VAT by doing:
$vat = $price * 0.175;
had to then go through all references and change it to 0.15, instead the super usefull way of doing it would be to have a function or variable for VAT.
In my opinion anything that could change should be written in a changeable way. If I find myself doing the same thing more than 5 times in the same day then it becomes a function or a config var.
Hard coding should be banned forever. Althought in you very simple examples i don't see anything wrong using them in any kind of project.
In my opinion hard coding is when you believe that a variable/value/define etc. will never change and create all your code based on that belief.
Example of such hard coding is the book Teach Yourself C in 24 Hours that everybody should avoid.