How do you write a (simple) variable "toggle"? - language-agnostic

Given the following idioms:
1)
    variable = value1
    if condition
        variable = value2
2)
    variable = value2
    if not condition
        variable = value1
3)
    if condition
        variable = value2
    else
        variable = value1
4)
    if not condition
        variable = value1
    else
        variable = value2
Which do you prefer, and why?
We assume the most common execution path to be that of condition being false.
I tend to lean towards using 1), although I'm not exactly sure why I like it more.
Note: The following examples may be simpler—and thus possibly more readable—but not all languages provide such syntax, and they are not suitable for extending the variable assignment to include more than one statement in the future.
variable = condition ? value2 : value1
...
variable = value2 if condition else value1
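For concreteness, the four idioms and the conditional-expression form might be written in Python roughly as follows. This is only an illustrative sketch: the function names and the placeholder strings standing in for value1 and value2 are invented here. All five return the same value for a given condition; they differ only in how the intent reads.

```python
# Placeholder values standing in for value1 / value2 from the question.
VALUE1 = "value1"
VALUE2 = "value2"


def idiom1(condition):
    # Assign the default first, then overwrite it if the condition holds.
    variable = VALUE1
    if condition:
        variable = VALUE2
    return variable


def idiom2(condition):
    # Same shape as idiom 1, but with the test negated.
    variable = VALUE2
    if not condition:
        variable = VALUE1
    return variable


def idiom3(condition):
    # Exactly one assignment runs on each path.
    if condition:
        variable = VALUE2
    else:
        variable = VALUE1
    return variable


def idiom4(condition):
    # Idiom 3 with the test negated and the branches swapped.
    if not condition:
        variable = VALUE1
    else:
        variable = VALUE2
    return variable


def as_expression(condition):
    # Python's conditional expression, as in the note above.
    return VALUE2 if condition else VALUE1
```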

In theory, I prefer #3 as it avoids having to assign a value to the variable twice. In the real world though I use any of the four above that would be more readable or would express more clearly my intention.

I prefer method 3 because it is more concise and forms a logical unit. It sets the value only once, it can be moved around as a block, and it is less error-prone. Errors creep in especially with method 1, when the initial assignment of value1 and the check that optionally assigns value2 become separated by other statements.

3) is the clearest expression of what you want to happen. I think all the others require some extra thinking to determine which value is going to end up in the variable.
In practice, I would use the ternary operator (?:) if I was using a language that supported it. I prefer to write in functional or declarative style over imperative whenever I can.

I tend to use #1 a lot myself. if condition reads easier than if !condition, especially if you accidentally miss the '!', at least to my mind.
Most coding I do is in C#, but I still tend to steer clear of the ternary operator, unless I'm working with (mostly) local variables. Lines tend to get long VERY quickly in a ternary operator if you're calling three layers deep into some structure, which quickly decreases the readability again.

Note: The following examples may be simpler—and thus possibly more readable—but not all languages provide such syntax
This is no argument for not using them in languages that do provide such a syntax. Incidentally, that includes all current mainstream languages after my last count.
and they are not suitable for extending the variable assignment to include more than one statement in the future.
This is true. However, it's often certain that such an extension will absolutely never take place because the condition will always yield one of two possible cases.
In such situations I will always prefer the expression variant over the statement variant because it reduces syntactic clutter and improves expressiveness. In other situations I tend to go with the switch statement mentioned before – if the language allows this usage. If not, I fall back to a generic if.

switch statement also works. If it's simple and more than 2 or 3 options, that's what I use.
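In Python, which only gained a match statement in version 3.10, a dictionary lookup is a common stand-in for a simple switch over several options. A small sketch, with an invented function name and invented codes and labels:

```python
def log_level_name(code):
    # Hypothetical mapping; the codes and names are made up for illustration.
    names = {0: "debug", 1: "info", 2: "warning", 3: "error"}
    # The second argument of get() plays the role of the default branch.
    return names.get(code, "unknown")
```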

In a situation where the condition might not happen, I would go with 1 or 2. Otherwise it's just based on what I want the code to do (i.e. I agree with cruizer).

I tend to use if not...return.
But that's if you are looking to return a variable. Getting disqualifiers out of the way first tends to make it more readable. It really depends on the context of the statement and also the language. A case statement might work better and be readable most of the time, but performance suffers under VB so a series of if/else statements makes more sense in that specific case.

Method 1 or method 3 for me. Method 1 can avoid an extra scope entrance/exit, but method 3 avoids an extra assignment. I'd tend to avoid Method 2 as I try to keep condition logic as simple as possible (in this case, the ! is extraneous as it could be rewritten as method 1 without it) and the same reason applies for method 4.

It depends on what the condition is I'm testing.
If it's an error flag condition then I'll use 1): set the error flag up front to catch the error, and then clear it if the condition is successful. That way there's no chance of missing an error condition.
For everything else I'd use 3).
The NOT logic just adds to confusion when reading the code - well in my head, can't speak for everyone else :-)
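The error-flag variant of idiom 1 could look something like this in Python. It is a sketch under assumptions: parse_port and its validation rules are hypothetical and serve only to show the flag starting out as "error" and being cleared on the single success path.

```python
def parse_port(text):
    # Pessimistic default: assume failure until every check has passed.
    error = True
    port = None
    if text.isascii() and text.isdigit():
        value = int(text)
        if 0 < value < 65536:
            port = value
            error = False  # the only place where the flag is cleared
    return port, error
```

Any path that forgets to touch the flag reports an error rather than a silent success.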

If the variable has a natural default value I would go with #1. If either value is equally (in)appropriate for a default then I would go with #2.

It depends. I like the ternary operators, but sometimes it's clearer if you use an 'if' statement. Which of the four alternatives you choose depends on the context, but I tend to go for whichever makes the code's function clearer, and that varies from situation to situation.

Related

Best practice for interface design

I am wondering which version is the best one to implement.
The parameters are states that have 2 possible values.
This is an abstract example of the actual problem.
I am programming in a language that is procedural (without classes) and does not have typed variables.
I just read an article stating that version 1 is bad for readability and the caller. Personally I don't like version 2 either. Maybe there is a better option?
Version 1:
doSth(par1, par2)
Not redundant +
Single Method for a task +
More complex implementation -
Wrong parameters can be passed easily -
Version 2:
doSthWithPar1Is1AndPar2Is1()
doSthWithPar1Is1AndPar2Is2()
doSthWithPar1Is2AndPar2Is1()
doSthWithPar1Is2AndPar2Is2()
Redundant -
Too many methods (especially with more parameters) -
Long Method Names -
Simple implementation +
No parameters that could be passed wrong +
That you have already considered V1 feasible tells me that the different argument value combinations have something in common with regard to how the values are to be processed.
In V2 you simply have to type and read more, which I'd say is the single most frequent reason for introducing errors/incorrectness and losing track of your requirements.
In V2 you have to repeat what is common in the individual implementations and if you make a mistake, the overall logic will be inconsistent at best. And if you want to fix it, you probably have to fix it in several places.
But, you can optimize code safety based on V1: choose a more "verbose" name for the procedure, like
doSomethingVerySpecificWithPar1OfTypeXAppliedToPar2OfTypeY(par1, par2)
(I am exaggerating a bit...) so you see immediately what you have originally intended.
You could even take the best out of V2 and introduce the individual functions, which simply redirect to the common function of V1 (so you avoid the redundancy). The gain in clarity almost always outweighs the slight loss of efficiency.
doSthWithPar1Is1AndPar2Is1()
{
    doSomethingVerySpecificWithPar1OfTypeXAppliedToPar2OfTypeY(1, 1);
}
Always remember David Wheeler: "All problems in computer science can be solved by another level of indirection".
Btw: I don't consider long method names a problem but rather a benefit (up to a certain length of course).
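The combination described above (one common procedure plus thin, explicitly named wrappers) might be sketched in Python like this. The names and the returned string are placeholders, and the range check is an added assumption about what "wrong parameters" means here.

```python
def do_something(par1, par2):
    # Common implementation (V1). Each parameter is assumed to be 1 or 2.
    if par1 not in (1, 2) or par2 not in (1, 2):
        raise ValueError("par1 and par2 must each be 1 or 2")
    return f"par1={par1}, par2={par2}"


# Thin wrappers (V2 naming) that only redirect to the common procedure,
# so the shared logic exists in exactly one place.
def do_sth_with_par1_is_1_and_par2_is_1():
    return do_something(1, 1)


def do_sth_with_par1_is_2_and_par2_is_1():
    return do_something(2, 1)
```

Callers that use only the wrappers cannot pass a wrong value, while a fix to the shared logic still has to be made only once.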

Is a simple ternary case okay for use in program flow as long as it does not hurt readability?

After reading "To ternary or not to ternary?" and "Is this a reasonable use of the ternary operator?", I gathered that simple uses of the ternary operator are generally accepted, because they do not hurt readability. I also gathered that having one side of the ternary block return null when you don't want it to do something is a complete waste. However, I ran across this case while refactoring my site that made me wrinkle my nose:
if ($success) {
    $database->commit();
} else {
    $database->rollback();
}
I refactored this down to
$success ? $database->commit() : $database->rollback();
And I was pretty satisfied with it, but something inside me made me come here for input. Exception catching aside, would you consider this an okay use case? Am I wondering if this is an okay use because I have never done this before, or because it really is bad practice? This doesn't seem difficult to me, but would this seem difficult to understand for anyone else? Does it depend on the language, as in, would this be more/less wrong in C, C++, or Java?
No, it is not OK. You are turning something that should look like a statement into something that looks like an expression. In fact, if commit() and rollback() return void, this will not compile in Java at least (not sure about the others mentioned).
If you want a one-liner, you should rather create another method on the $database object such as $database->endTransaction($success) that does the if statement internally.
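As a rough illustration of that suggestion, here is a Python sketch in which a stand-in Database class gains an end_transaction method. The class is hypothetical and merely records which call was made; a real database object would talk to the server instead.

```python
class Database:
    """Hypothetical stand-in for the $database object in the question."""

    def __init__(self):
        self.calls = []

    def commit(self):
        self.calls.append("commit")

    def rollback(self):
        self.calls.append("rollback")

    def end_transaction(self, success):
        # The if statement lives here, so call sites stay one-liners.
        if success:
            self.commit()
        else:
            self.rollback()
```

The call site then reads database.end_transaction(success), which is a plain statement rather than an expression whose value is thrown away.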
I would be more inclined to use it in case the two actions are mutually-exclusive and/or opposite (yet related to each other), for example:
$success ? go_up() : go_down();
For two unrelated actions I would be less inclined to use it, the reason being that there is a higher probability for one of the branches to need expanding in the future. If that's the case, you will again need to rewrite it as an if-else statement. Imagine that you have:
$success ? do_abc() : do_xyz();
If at some point you decide that the first branch needs to do_def() as well, you'll need to rewrite the whole thing to an if-else statement again.
The more frequent usage of the ternary operator, however, is:
$var = $success ? UP : DOWN;
This way you are evaluating it as an expression, not as a statement.
The real question is, "Is the ternary form more or less readable than the if form?". I'd say it isn't. But this is a question of style, not of function.

Programming style: should you return early if a guard condition is not satisfied?

One thing I've sometimes wondered is which is the better style out of the two shown below (if any)? Is it better to return immediately if a guard condition hasn't been satisfied, or should you only do the other stuff if the guard condition is satisfied?
For the sake of argument, please assume that the guard condition is a simple test that returns a boolean, such as checking to see if an element is in a collection, rather than something that might affect the control flow by throwing an exception. Also assume that methods/functions are short enough not to require editor scrolling.
// Style 1
public SomeType aMethod() {
    SomeType result = null;
    if (!guardCondition()) {
        return result;
    }
    doStuffToResult(result);
    doMoreStuffToResult(result);
    return result;
}
// Style 2
public SomeType aMethod() {
    SomeType result = null;
    if (guardCondition()) {
        doStuffToResult(result);
        doMoreStuffToResult(result);
    }
    return result;
}
I prefer the first style, except that I wouldn't create a variable when there is no need for it. I'd do this:
// Style 3
public SomeType aMethod() {
    if (!guardCondition()) {
        return null;
    }
    SomeType result = new SomeType();
    doStuffToResult(result);
    doMoreStuffToResult(result);
    return result;
}
Having been trained in Jackson Structured Programming in the late '80s, my ingrained philosophy was always "a function should have a single entry-point and a single exit-point"; this meant I wrote code according to Style 2.
In the last few years I have come to realise that code written in this style is often overcomplex and hard to read/maintain, and I have switched to Style 1.
Who says old dogs can't learn new tricks? ;)
Style 1 is what the Linux kernel indirectly recommends.
From https://www.kernel.org/doc/Documentation/process/coding-style.rst, chapter 1:
Now, some people will claim that having 8-character indentations makes
the code move too far to the right, and makes it hard to read on a
80-character terminal screen. The answer to that is that if you need
more than 3 levels of indentation, you're screwed anyway, and should fix
your program.
Style 2 adds levels of indentation, ergo, it is discouraged.
Personally, I like style 1 as well. Style 2 makes it harder to match up closing braces in functions that have several guard tests.
I don't know if guard is the right word here. Normally an unsatisfied guard results in an exception or assertion.
But beside this I'd go with style 1, because it keeps the code cleaner in my opinion. You have a simple example with only one condition. But what happens with many conditions and style 2? It leads to a lot of nested ifs or huge if-conditions (with || , &&). I think it is better to return from a method as soon as you know that you can.
But this is certainly very subjective ^^
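To make the "many conditions" case concrete, here is a small Python sketch of the same logic written both ways. The user dictionary and its "active" and "email" keys are invented for the example.

```python
def email_nested(user):
    # Style 2: each additional condition adds another level of nesting.
    result = None
    if user is not None:
        if user.get("active"):
            if user.get("email"):
                result = user["email"].lower()
    return result


def email_guarded(user):
    # Style 1: each guard returns as soon as the answer is known.
    if user is None:
        return None
    if not user.get("active"):
        return None
    if not user.get("email"):
        return None
    return user["email"].lower()
```

Both functions return the same values; the guarded version simply keeps the main path at a single indentation level.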
Martin Fowler refers to this refactoring as "Replace Nested Conditional with Guard Clauses".
If/else statements also add cyclomatic complexity, which makes the code harder to test: to cover all the if/else blocks you might need to supply many combinations of inputs.
Whereas if there are guard clauses, you can test them first and deal with the real logic inside the if/else clauses in a clearer fashion.
If you dig through the .NET Framework using .NET Reflector you will see that the .NET programmers use style 1 (or maybe style 3, already mentioned by unbeli).
The reasons are already given in the answers above. Maybe one other reason is to make the code more readable, concise and clear.
This style is used most often when checking input parameters, which you always have to do if you write a framework/library/DLL.
First check all input parameters, then work with them.
It sometimes depends on the language and what kinds of "resources" that you are using (e.g. open file handles).
In C, Style 2 is definitely safer and more convenient because a function has to close and/or release any resources that it obtained during execution. This includes allocated memory blocks, file handles, handles to operating system resources such as threads or drawing contexts, locks on mutexes, and any number of other things. Delaying the return until the very end or otherwise restricting the number of exits from a function allows the programmer to more easily ensure that s/he properly cleans up, helping to prevent memory leaks, handle leaks, deadlock, and other problems.
In C++ using RAII-style programming, both styles are equally safe, so you can pick one that is more convenient. Personally I use Style 1 with RAII-style C++. C++ without RAII is like C, so, again, Style 2 is probably better in that case.
In languages like Java with garbage collection, the runtime helps smooth over the differences between the two styles because it cleans up after itself. However, there can be subtle issues with these languages, too, if you don't explicitly "close" some types of objects. For example, if you construct a new java.io.FileOutputStream and do not close it before returning, then the associated operating system handle will remain open until the runtime garbage collects the FileOutputStream instance that has fallen out of scope. This could mean that another process or thread that needs to open the file for writing may be unable to until the FileOutputStream instance is collected.
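Languages with scoped cleanup constructs (try/finally, Python's with statement, Java's try-with-resources) let an early return coexist with guaranteed release. A minimal Python sketch, using a made-up Resource class in place of a real file handle or lock:

```python
from contextlib import closing


class Resource:
    """Hypothetical stand-in for a file handle, lock or similar resource."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


def early_return_with_finally(resource, guard_ok):
    try:
        if not guard_ok:
            return None  # early exit; the finally block still runs
        return "result"
    finally:
        resource.close()


def early_return_with_context_manager(resource, guard_ok):
    # closing() calls resource.close() when the with block is left,
    # whichever return statement leaves it.
    with closing(resource):
        if not guard_ok:
            return None
        return "result"
```

Either way the cleanup code is written once, even though the function has two exits.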
Although it goes against best practices that I have been taught I find it much better to reduce the nesting of if statements when I have a condition such as this. I think it is much easier to read and although it exits in more than one place it is still very easy to debug.
I would say that Style 1 became more widely used because it is the best practice when you combine it with small methods.
Style 2 looks like a better solution when you have big methods. When you have them, you have some common code that you want to execute no matter how you exit. But the proper solution is not to force a single exit point but to make the methods smaller.
For example, if you want to extract a sequence of code from a big method and this method has two exit points, you start to have problems: it is hard to do automatically. When I have a big method written in Style 1, I usually transform it to Style 2, then extract methods, and then each of them should contain Style 1 code.
So Style 1 is best, but it works well only with small methods.
Style 2 is not so good, but it is recommended if you have big methods that you don't want to split or don't have time to split.
I prefer to use method #1 myself, it is logically easier to read and also logically more similar to what we are trying to do. (if something bad happens, exit function NOW, do not pass go, do not collect $200)
Furthermore, most of the time you would want to return a value that is not a logically possible result (ie -1) to indicate to the user who called the function that the function failed to execute properly and to take appropriate action. This lends itself better to method #1 as well.
I would say "It depends on..."
In situations where I have to perform a cleanup sequence with more than 2 or 3 lines before leaving a function/method I would prefer style 2 because the cleanup sequence has to be written and modified only once. That means maintainability is easier.
In all other cases I would prefer style 1.
Number 1 is typically the easy, lazy and sloppy way. Number 2 expresses the logic cleanly. What others have pointed out is that yes it can become cumbersome. This tendency though has an important benefit. Style #1 can hide that your function is probably doing too much. It doesn't visually demonstrate the complexity of what's going on very well. I.e. it prevents the code from saying to you "hey this is getting a bit too complex for this one function". It also makes it a bit easier for other developers that don't know your code to miss those returns sprinkled here and there, at first glance anyway.
So let the code speak. When you see long conditions appearing or nested if statements it is saying that maybe it would be better to break this stuff up into multiple functions or that it needs to be rewritten more elegantly.

why can't conditional operator be used as a statement

Why can't the conditional operator be used as a statement?
I would like to do something like:
boolean isXyz = ...;
...
isXyz ? doXyz() : doAbc();
where doXyz and doAbc return void.
Note that this is not the same as other operators, for example doXyz() + doAbc() intrinsically needs that doXyz and doAbc return a number-like something to operate (or strings to concatenate, or whatever, but the point is that + actually needs values to operate on).
Is there something deep or is it just an arbitrary decision.
Note: I come from the Java world, but I would like to know if this is possible in your favourite programming language.
C and C++ do allow such constructs, as long as doXyz() and doAbc() return the same type, including void.
What would be the point? Why not just use an if statement (which, in my opinion, looks cleaner)?
Because it would reduce readability and introduce a potential for errors.
Languages offer means of doing what you wish by using the keyword "if".
// Is not much longer than the line below
// but significantly more transparent
if (isXyz) doXyz(); else doAbc();
isXyz ? doXyz() : doAbc();
A statement is supposed to just perform some operations.
A conditional operator is meant to return a value.
As a novelty, mIRCscripting allows you to do this
alias canI? {
$iif($1 == 1,doThis,doThat)
}
alias doThis echo -a this can.
alias doThat echo -a that can.
calling it with /canI? 1 will echo this can.
calling it with /canI? 2 will echo that can.
Wouldn't this be exactly the same as the if statement?
if (isXyz) doXyz(); else doAbc();
Some languages do allow you to use the conditional operator as a statement. Perl comes to mind.
In many languages (C, C++, Perl and JavaScript among them) any expression, including a conditional expression, may be used on its own as a statement. Java is stricter: only assignments, increments and decrements, method calls and object creation expressions may stand alone as statements, so a bare conditional expression is rejected even when both branches return a value.
Java additionally forbids a void method call as an operand of the conditional operator. It's sometimes considered bad taste to hide active (non-idempotent; with side-effects) code inside an expression, and active functions often return void. So because Java is a prescriptive language, it disallows using a void function in an expression. Many other languages are more permissive and will allow it.
You could get around it by having doAbc and doXyz return something (zero, a boolean, anything: it doesn't matter as long as they're the same type for both) and assigning the result of the conditional to a throwaway variable. But I don't really know why you'd want to; as others have said, this is indeed a case where doing it in an expression is in poor taste and largely pointless.
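Python is one of the more permissive languages: a conditional expression may stand alone as a statement, and functions without a return statement simply evaluate to None, which is then discarded. A sketch using snake_case versions of the names from the question (the calls list exists only to make the effect observable):

```python
calls = []


def do_xyz():
    calls.append("xyz")


def do_abc():
    calls.append("abc")


def dispatch(is_xyz):
    # A conditional expression used purely for its side effect;
    # its value (None in both branches) is thrown away.
    do_xyz() if is_xyz else do_abc()
```

That it is allowed does not make it idiomatic; most Python style guides would still prefer a plain if/else here.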
I think your question is the wrong way round.... the conditional operator was added because the "IF THEN" statement couldn't be used as an evaluation statement.
In my opinion you should only use the conditional operator when conditionally evaluating as it is inherently less clear than using "IF THEN" constructs when purely implementing a condition.
Conditional operators cannot typically contain blocks of multiple instructions on each condition result, "IF THEN" can.

Should I always/ever/never initialize object fields to default values?

Code styling question here.
I looked at this question which asks if the .NET CLR will really always initialize field values. (The answer is yes.) But it strikes me that I'm not sure that it's always a good idea to have it do this. My thinking is that if I see a declaration like this:
int myBlorgleCount = 0;
I have a pretty good idea that the programmer expects the count to start at zero, and is okay with that, at least for the immediate future. On the other hand, if I just see:
int myBlorgleCount;
I have no real immediate idea if 0 is a legal or reasonable value. And if the programmer just starts reading and modifying it, I don't know whether the programmer meant to start using it before they set a value to it, or if they were expecting it to be zero, etc.
On the other hand, some fairly smart people, and the Visual Studio code cleanup utility, tell me to remove these redundant declarations. What is the general consensus on this? (Is there a consensus?)
I marked this as language agnostic, but if there is an odd case out there where it's specifically a good idea to go against the grain for a particular language, that's probably worth pointing out.
EDIT: While I did put that this question was language agnostic, it obviously doesn't apply to languages like C, where no value initialization is done.
EDIT: I appreciate John's answer, but it is exactly what I'm not looking for. I understand that .NET (or Java or whatever) will do the job and initialize the values consistently and correctly. What I'm saying is that if I see code that is modifying a value that hasn't been previously explicitly set in code, I, as a code maintainer, don't know if the original coder meant it to be the default value, or just forgot to set the value, or was expecting it to be set somewhere else, etc.
Think long term maintenance.
Keep the code as explicit as possible.
Don't rely on language specific ways to initialize if you don't have to. Maybe a newer version of the language will work differently?
Future programmers will thank you.
Management will thank you.
Why obfuscate things even the slightest?
Update: Future maintainers may come from a different background. It really isn't about what is "right" it is more what will be easiest in the long run.
You are always safe in assuming the platform works the way the platform works. The .NET platform initializes all fields to default values. If you see a field that is not initialized by the code, it means the field is initialized by the CLR, not that it is uninitialized.
This concern is valid for platforms which do not guarantee initialization, but not here. In .NET, it more often indicates ignorance on the part of the developer, who thinks initialization is necessary.
Another unnecessary hangover from the past is the following:
string foo = null;
foo = MethodCall();
I've seen that from people who should know better.
I think that it makes sense to initialize the values if it clarifies the developer's intent.
In C#, there's no overhead as the values are all initialized anyway. In C/C++, uninitialized values will contain garbage/unknown values (whatever was in the memory location), so initialization was more important.
I think it should be done if it really helps to make the code more understandable.
But I think this is a general problem with all language features. My opinion on that is: If it is an official feature of the language, you can use it. (Of course there are some anti-features which should be used with caution or avoided at all, like a missing option explicit in Visual Basic or diamond inheritance in C++)
There was a time when I was very paranoid and added all kinds of unnecessary initializations, explicit casts, über-paranoid try-finally blocks, ... I once even thought about ignoring auto-boxing and replacing all occurrences with explicit type conversions, just "to be on the safe side".
The problem is: There is no end. You can avoid almost all language features, because you do not want to trust them.
Remember: It's only magic until you understand it :)
I agree with you; it may be verbose, but I like to see:
int myBlorgleCount = 0;
Now, I always initialize strings though:
string myString = string.Empty;
(I just hate null strings.)
In the case where I cannot immediately set it to something useful
int myValue = SomeMethod();
I will set it to 0. That is more to avoid having to think about what the value would be otherwise. For me, the fact that integers are always set to 0 is not on the tip of my fingers, so when I see
int myValue;
it will take me a second to pull up that fact and remember what it will be set to, disrupting my thought process.
For someone who has that knowledge readily available, they will encounter
int myValue = 0;
and wonder why the hell is that person setting it to zero, when the compiler would just do it for them. This thought would interrupt their thought process.
So do which ever makes the most sense for both you and the team you are working in. If the common practice is to set it, then set it, otherwise don't.
In my experience I've found that explicitly initializing local variables (in .NET) adds more clutter than clarity.
Class-wide variables, on the other hand should always be initialized. In the past we defined system-wide custom "null" values for common variable types. This way we could always know what was uninitialized by error and what was initialized on purpose.
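One way to approximate such a custom "null" value in Python is a dedicated sentinel object. This is only a sketch of the idea; the Settings class, its fields and the describe helper are all invented for illustration.

```python
# A unique sentinel that no legitimate value can be identical to.
UNSET = object()


class Settings:
    def __init__(self):
        self.timeout = UNSET  # not decided yet; must be assigned before use
        self.retries = 0      # zero is a deliberate, meaningful default


def describe(value):
    # Identity comparison separates "never assigned" from "assigned 0".
    if value is UNSET:
        return "never assigned"
    return f"set to {value!r}"
```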
I always initialize fields explicitly in the constructor. For me, it's THE place to do it.
I think a lot of that comes down to past experiences.
In older and unmanaged languages, the expectation is that the value is unknown. This expectation is retained by programmers coming from these languages.
Almost all modern or managed languages have defined values for recently created variables, whether that's from class constructors or language features.
For now, I think it's perfectly fine to initialize a value; what was once implicit becomes explicit. In the long run, say, in the next 10 to 20 years, people may start learning that a default value is possible, expected, and known - especially if they stay consistent across languages (eg, empty string for strings, 0 for numerics).
You should do it. There is no strict need to, but it is better if you do, because you never know whether the language you are using initializes the values. By doing it yourself, you ensure your values are both initialized and set to standard predefined values.
There is nothing wrong with doing it except perhaps a bit of 'time wasted'. I would recommend it strongly. While the comment by John is quite informative, in general use it is better to take the safe path.
I usually do it for strings and in some cases collections where I don't want nulls floating around.
The general consensus where I work is "Not to do it explicitly for value types."
I wouldn't do it. C# initializes an int to zero anyways, so the two lines are functionally equivalent. One is just longer and redundant, although more descriptive to a programmer who doesn't know C#.
This is tagged as language-agnostic but most of the answers are regarding C#.
In C and C++, the best practice is to always initialize your values. There are some cases where this will be done for you such as static globals, but there shouldn't be a performance hit of any kind for redundantly initializing these values with most compilers.
I wouldn't initialise them. If you keep the declaration as close as possible to the first use, then there shouldn't be any confusion.
Another thing to remember is, if you are gonna use automatic properties, you have to rely on implicit values, like:
public int Count { get; set; }
http://www.geekherocomic.com/2009/07/27/common-pitfalls-initialize-your-variables/
If a field will often have new values stored into it without regard for what was there previously, and if it should behave as though a zero was stored there initially but there's nothing "special" about zero, then the value should be stored explicitly.
If the field represents a count or total which will never have a non-zero value written to it directly, but will instead always have other amounts added or subtracted, then zero should be considered an "empty" value, and thus need not be explicitly stated.
To use a crude analogy, consider the following two conditions:
if (xposition != 0) ...
if ((flags & WoozleModes.deluxe) != 0) ...
In the former scenario, comparison to the literal zero makes sense because it is checking for a position which is semantically no different from any other. In the second scenario, however, I would suggest that the comparison to the literal zero adds nothing to readability because code isn't really interested in whether the value of the expression (flags & WoozleModes.deluxe) happens to be a number other than zero, but rather whether it's "non-empty".
I don't know of any programming languages that provide separate ways of distinguishing numeric values for "zero" and "empty", other than by not requiring the use of literal zeros when indicating emptiness.
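Python's enum.Flag is an example of the "no literal zero required" approach: a flag value with no bits set is falsy, so emptiness can be tested directly. The sketch below borrows the WoozleModes name from the answer; the member names are assumptions, written in upper case per Python convention.

```python
from enum import Flag, auto


class WoozleModes(Flag):
    BASIC = auto()
    DELUXE = auto()
    TURBO = auto()


def is_deluxe(flags):
    # The intersection is an empty (falsy) flag when DELUXE is not set,
    # so no comparison against a literal zero is needed.
    return bool(flags & WoozleModes.DELUXE)
```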