Related
I sometimes use temporary variables to shorten the identifiers:
private function doSomething() {
$db = $this->currentDatabase;
$db->callMethod1();
$db->callMethod2();
$db->callMethod3();
$db->...
}
Although this is a PHP example, I'm asking in general:
Is this bad practice? Are there any drawbacks?
This example is perfectly fine, since you are using it in functions/methods.
The variable will be unset right after the method/function ends - so there's not much of a memory leak or what.
Also by doing this, you "sort of" implemented DRY - don't repeat yourself.
Why write so many $this->currentDatabase when you can write $db. And what if you have to change $this->currentDatabase to some other values?
Actually, you're not trying to avoid typing (otherwise, you'd use a completion mechanism in your editor), but you're just making your function more readable (by using "abbreviations") which is a good thing.
Drawbacks will show up when you start doing this to avoid typing (and sacrifice readability)
It depends what is the contract on $this->currentDatabase. Can it change at any time, after any method call? If it changes, are you supposed to keep on using the object you did when you made your first db call, or are you supposed to always us the current value? This dictates if you must always use $this->currentDatabase, or if you must always store it in a variable before using.
So, strictly speaking, this is not a style question at all.
But, assuming the member is never changed during function calls such as this, it makes no difference. I'd say storing it in a variable is slightly better, as it is easier to read and avoids a member access on an object at every operation. The compiler may optimize it away if it's good, but in many languages such optimizations are very difficult - and accessing a local variable is almost invariably faster than accessing a member of an object.
In general :
Both $db as $this->currentDatabase point to exactly the same object.
The little space allocated for $db is freed (or elligeable for garbage collection) when the function ends
so I'd say : no, it's not bad practice.
I seem to remember that Steve McConnell recommends against using temporary variables in "Code Complete". At the risk of committing heresy, I have to disagree. I prefer the additional readability introduced. I also find myself adding them to aid single-step debugging, then seeing no reason to remove them.
I don't think there is a performance penalty if you use the original variable instead of skipping the first dereference ($this->currentDatabase).
However, as readability is much improved using the abbreviation, go for it!
Of course it also will depend on your team's coding conventions.
If you do this carefully it is absolutely fine. As long as you only use a few of this variables in a small amount of code and inside of small functions I think this is ok.
If you have a lot of this variables and they are badly named like i,j,l and f in the same function the understandability of your code will suffer. If this is the case I would rather type a little bit more then have not understandable code. This is one reason a good IDE has automatic code completion.
No, I think, this is ok. Often performance if not as critical as clean readable code.
Also, you are trading memory a small allocation hit on the stack for faster method calls by avoiding extra dereferencing.
A getter will solve your problem:
private function doSomething() {
getDB()->callMethod1();
getDB()->callMethod2();
getDB()->callMethod3();
}
by clean code N.
Closed. This question is opinion-based. It is not currently accepting answers.
Closed 11 months ago.
Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
Sorry for the waffly title - if I could come up with a concise title, I wouldn't have to ask the question.
Suppose I have an immutable list type. It has an operation Foo(x) which returns a new immutable list with the specified argument as an extra element at the end. So to build up a list of strings with values "Hello", "immutable", "world" you could write:
var empty = new ImmutableList<string>();
var list1 = empty.Foo("Hello");
var list2 = list1.Foo("immutable");
var list3 = list2.Foo("word");
(This is C# code, and I'm most interested in a C# suggestion if you feel the language is important. It's not fundamentally a language question, but the idioms of the language may be important.)
The important thing is that the existing lists are not altered by Foo - so empty.Count would still return 0.
Another (more idiomatic) way of getting to the end result would be:
var list = new ImmutableList<string>().Foo("Hello")
.Foo("immutable")
.Foo("word");
My question is: what's the best name for Foo?
EDIT 3: As I reveal later on, the name of the type might not actually be ImmutableList<T>, which makes the position clear. Imagine instead that it's TestSuite and that it's immutable because the whole of the framework it's a part of is immutable...
(End of edit 3)
Options I've come up with so far:
Add: common in .NET, but implies mutation of the original list
Cons: I believe this is the normal name in functional languages, but meaningless to those without experience in such languages
Plus: my favourite so far, it doesn't imply mutation to me. Apparently this is also used in Haskell but with slightly different expectations (a Haskell programmer might expect it to add two lists together rather than adding a single value to the other list).
With: consistent with some other immutable conventions, but doesn't have quite the same "additionness" to it IMO.
And: not very descriptive.
Operator overload for + : I really don't like this much; I generally think operators should only be applied to lower level types. I'm willing to be persuaded though!
The criteria I'm using for choosing are:
Gives the correct impression of the result of the method call (i.e. that it's the original list with an extra element)
Makes it as clear as possible that it doesn't mutate the existing list
Sounds reasonable when chained together as in the second example above
Please ask for more details if I'm not making myself clear enough...
EDIT 1: Here's my reasoning for preferring Plus to Add. Consider these two lines of code:
list.Add(foo);
list.Plus(foo);
In my view (and this is a personal thing) the latter is clearly buggy - it's like writing "x + 5;" as a statement on its own. The first line looks like it's okay, until you remember that it's immutable. In fact, the way that the plus operator on its own doesn't mutate its operands is another reason why Plus is my favourite. Without the slight ickiness of operator overloading, it still gives the same connotations, which include (for me) not mutating the operands (or method target in this case).
EDIT 2: Reasons for not liking Add.
Various answers are effectively: "Go with Add. That's what DateTime does, and String has Replace methods etc which don't make the immutability obvious." I agree - there's precedence here. However, I've seen plenty of people call DateTime.Add or String.Replace and expect mutation. There are loads of newsgroup questions (and probably SO ones if I dig around) which are answered by "You're ignoring the return value of String.Replace; strings are immutable, a new string gets returned."
Now, I should reveal a subtlety to the question - the type might not actually be an immutable list, but a different immutable type. In particular, I'm working on a benchmarking framework where you add tests to a suite, and that creates a new suite. It might be obvious that:
var list = new ImmutableList<string>();
list.Add("foo");
isn't going to accomplish anything, but it becomes a lot murkier when you change it to:
var suite = new TestSuite<string, int>();
suite.Add(x => x.Length);
That looks like it should be okay. Whereas this, to me, makes the mistake clearer:
var suite = new TestSuite<string, int>();
suite.Plus(x => x.Length);
That's just begging to be:
var suite = new TestSuite<string, int>().Plus(x => x.Length);
Ideally, I would like my users not to have to be told that the test suite is immutable. I want them to fall into the pit of success. This may not be possible, but I'd like to try.
I apologise for over-simplifying the original question by talking only about an immutable list type. Not all collections are quite as self-descriptive as ImmutableList<T> :)
In situations like that, I usually go with Concat. That usually implies to me that a new object is being created.
var p = listA.Concat(listB);
var k = listA.Concat(item);
I'd go with Cons, for one simple reason: it means exactly what you want it to.
I'm a huge fan of saying exactly what I mean, especially in source code. A newbie will have to look up the definition of Cons only once, but then read and use that a thousand times. I find that, in the long term, it's nicer to work with systems that make the common case easier, even if the up-front cost is a little bit higher.
The fact that it would be "meaningless" to people with no FP experience is actually a big advantage. As you pointed out, all of the other words you found already have some meaning, and that meaning is either slightly different or ambiguous. A new concept should have a new word (or in this case, an old one). I'd rather somebody have to look up the definition of Cons, than to assume incorrectly he knows what Add does.
Other operations borrowed from functional languages often keep their original names, with no apparent catastrophes. I haven't seen any push to come up with synonyms for "map" and "reduce" that sound more familiar to non-FPers, nor do I see any benefit from doing so.
(Full disclosure: I'm a Lisp programmer, so I already know what Cons means.)
Actually I like And, especially in the idiomatic way. I'd especially like it if you had a static readonly property for the Empty list, and perhaps make the constructor private so you always have to build from the empty list.
var list = ImmutableList<string>.Empty.And("Hello")
.And("Immutable")
.And("Word");
Whenever I'm in a jam with nomenclature, I hit up the interwebs.
thesaurus.com returns this for "add":
Definition: adjoin, increase; make
further comment
Synonyms: affix,
annex, ante, append, augment, beef
up, boost, build up, charge up,
continue, cue in, figure in, flesh
out, heat up, hike, hike up, hitch on,
hook on, hook up with, include, jack
up, jazz up, join together, pad,
parlay, piggyback, plug into, pour it
on, reply, run up, say further, slap
on, snowball, soup up, speed up,
spike, step up, supplement, sweeten,
tack on, tag
I like the sound of Adjoin, or more simply Join. That is what you're doing, right? The method could also apply to joining other ImmutableList<>'s.
Personally, I like .With(). If I was using the object, after reading the documentation or the code comments, it would be clear what it does, and it reads ok in the source code.
object.With("My new item as well");
Or, you add "Along" with it.. :)
object.AlongWith("this new item");
I ended up going with Add for all of my Immutable Collections in BclExtras. The reason being is that it's an easy predictable name. I'm not worried so much about people confusing Add with a mutating add since the name of the type is prefixed with Immutable.
For awhile I considered Cons and other functional style names. Eventually I discounted them because they're not nearly as well known. Sure functional programmers will understand but they're not the majority of users.
Other Names: you mentioned:
Plus: I'm wishy/washing on this one. For me this doesn't distinguish it as being a non-mutating operation anymore than Add does
With: Will cause issues with VB (pun intended)
Operator overloading: Discoverability would be an issue
Options I considered:
Concat: String's are Immutable and use this. Unfortunately it's only really good for adding to the end
CopyAdd: Copy what? The source, the list?
AddToNewList: Maybe a good one for List. But what about a Collection, Stack, Queue, etc ...
Unfortunately there doesn't really seem to be a word that is
Definitely an immutable operation
Understandable to the majority of users
Representable in less than 4 words
It gets even more odd when you consider collections other than List. Take for instance Stack. Even first year programmers can tell you that Stacks have a Push/Pop pair of methods. If you create an ImmutableStack and give it a completely different name, lets call it Foo/Fop, you've just added more work for them to use your collection.
Edit: Response to Plus Edit
I see where you're going with Plus. I think a stronger case would actually be Minus for remove. If I saw the following I would certainly wonder what in the world the programmer was thinking
list.Minus(obj);
The biggest problem I have with Plus/Minus or a new pairing is it feels like overkill. The collection itself already has a distinguishing name, the Immutable prefix. Why go further by adding vocabulary whose intent is to add the same distinction as the Immutable prefix already did.
I can see the call site argument. It makes it clearer from the standpoint of a single expression. But in the context of the entire function it seems unnecessary.
Edit 2
Agree that people have definitely been confused by String.Concat and DateTime.Add. I've seen several very bright programmers hit this problem.
However I think ImmutableList is a different argument. There is nothing about String or DateTime that establishes it as Immutable to a programmer. You must simply know that it's immutable via some other source. So the confusion is not unexpected.
ImmutableList does not have that problem because the name defines it's behavior. You could argue that people don't know what Immutable is and I think that's also valid. I certainly didn't know it till about year 2 in college. But you have the same issue with whatever name you choose instead of Add.
Edit 3: What about types like TestSuite which are immutable but do not contain the word?
I think this drives home the idea that you shouldn't be inventing new method names. Namely because there is clearly a drive to make types immutable in order to facilitate parallel operations. If you focus on changing the name of methods for collections, the next step will be the mutating method names on every type you use that is immutable.
I think it would be a more valuable effort to instead focus on making types identifiable as Immutable. That way you can solve the problem without rethinking every mutating method pattern out there.
Now how can you identify TestSuite as Immutable? In todays environment I think there are a few ways
Prefix with Immutable: ImmutableTestSuite
Add an Attribute which describes the level of Immutablitiy. This is certainly less discoverable
Not much else.
My guess/hope is development tools will start helping this problem by making it easy to identify immutable types simply by sight (different color, stronger font, etc ...). But I think that's the answer though over changing all of the method names.
I think this may be one of those rare situations where it's acceptable to overload the + operator. In math terminology, we know that + doesn't append something to the end of something else. It always combines two values together and returns a new resulting value.
For example, it's intuitively obvious that when you say
x = 2 + 2;
the resulting value of x is 4, not 22.
Similarly,
var empty = new ImmutableList<string>();
var list1 = empty + "Hello";
var list2 = list1 + "immutable";
var list3 = list2 + "word";
should make clear what each variable is going to hold. It should be clear that list2 is not changed in the last line, but instead that list3 is assigned the result of appending "word" to list2.
Otherwise, I would just name the function Plus().
To be as clear as possible, you might want to go with the wordier CopyAndAdd, or something similar.
I would call it Extend() or maybe ExtendWith() if you feel like really verbose.
Extends means adding something to something else without changing it. I think this is very relevant terminology in C# since this is similar to the concept of extension methods - they "add" a new method to a class without "touching" the class itself.
Otherwise, if you really want to emphasize that you don't modify the original object at all, using some prefix like Get- looks like unavoidable to me.
Added(), Appended()
I like to use the past tense for operations on immutable objects. It conveys the idea that you aren't changing the original object, and it's easy to recognize when you see it.
Also, because mutating method names are often present-tense verbs, it applies to most of the immutable-method-name-needed cases you run into. For example an immutable stack has the methods "pushed" and "popped".
I like mmyers suggestion of CopyAndAdd. In keeping with a "mutation" theme, maybe you could go with Bud (asexual reproduction), Grow, Replicate, or Evolve? =)
EDIT: To continue with my genetic theme, how about Procreate, implying that a new object is made which is based on the previous one, but with something new added.
This is probably a stretch, but in Ruby there is a commonly used notation for the distinction: add doesn't mutate; add! mutates. If this is an pervasive problem in your project, you could do that too (not necessarily with non-alphabetic characters, but consistently using a notation to indicate mutating/non-mutating methods).
Join seems appropriate.
Maybe the confusion stems from the fact that you want two operations in one. Why not separate them? DSL style:
var list = new ImmutableList<string>("Hello");
var list2 = list.Copy().With("World!");
Copy would return an intermediate object, that's a mutable copy of the original list. With would return a new immutable list.
Update:
But, having an intermediate, mutable collection around is not a good approach. The intermediate object should be contained in the Copy operation:
var list1 = new ImmutableList<string>("Hello");
var list2 = list1.Copy(list => list.Add("World!"));
Now, the Copy operation takes a delegate, which receives a mutable list, so that it can control the copy outcome. It can do much more than appending an element, like removing elements or sorting the list. It can also be used in the ImmutableList constructor to assemble the initial list without intermediary immutable lists.
public ImmutableList<T> Copy(Action<IList<T>> mutate) {
if (mutate == null) return this;
var list = new List<T>(this);
mutate(list);
return new ImmutableList<T>(list);
}
Now there's no possibility of misinterpretation by the users, they will naturally fall into the pit of success.
Yet another update:
If you still don't like the mutable list mention, even now that it's contained, you can design a specification object, that will specify, or script, how the copy operation will transform its list. The usage will be the same:
var list1 = new ImmutableList<string>("Hello");
// rules is a specification object, that takes commands to run in the copied collection
var list2 = list1.Copy(rules => rules.Append("World!"));
Now you can be creative with the rules names and you can only expose the functionality that you want Copy to support, not the entire capabilities of an IList.
For the chaining usage, you can create a reasonable constructor (which will not use chaining, of course):
public ImmutableList(params T[] elements) ...
...
var list = new ImmutableList<string>("Hello", "immutable", "World");
Or use the same delegate in another constructor:
var list = new ImmutableList<string>(rules =>
rules
.Append("Hello")
.Append("immutable")
.Append("World")
);
This assumes that the rules.Append method returns this.
This is what it would look like with your latest example:
var suite = new TestSuite<string, int>(x => x.Length);
var otherSuite = suite.Copy(rules =>
rules
.Append(x => Int32.Parse(x))
.Append(x => x.GetHashCode())
);
A few random thoughts:
ImmutableAdd()
Append()
ImmutableList<T>(ImmutableList<T> originalList, T newItem) Constructor
DateTime in C# uses Add. So why not use the same name? As long the users of your class understand the class is immutable.
I think the key thing you're trying to get at that's hard to express is the nonpermutation, so maybe something with a generative word in it, something like CopyWith() or InstancePlus().
I don't think the English language will let you imply immutability in an unmistakable way while using a verb that means the same thing as "Add". "Plus" almost does it, but people can still make the mistake.
The only way you're going to prevent your users from mistaking the object for something mutable is by making it explicit, either through the name of the object itself or through the name of the method (as with the verbose options like "GetCopyWith" or "CopyAndAdd").
So just go with your favourite, "Plus."
First, an interesting starting point:
http://en.wikipedia.org/wiki/Naming_conventions_(programming) ...In particular, check the "See Also" links at the bottom.
I'm in favor of either Plus or And, effectively equally.
Plus and And are both math-based in etymology. As such, both connote mathematical operation; both yield an expression which reads naturally as expressions which may resolve into a value, which fits with the method having a return value. And bears additional logic connotation, but both words apply intuitively to lists. Add connotes action performed on an object, which conflicts with the method's immutable semantics.
Both are short, which is especially important given the primitiveness of the operation. Simple, frequently-performed operations deserve shorter names.
Expressing immutable semantics is something I prefer to do via context. That is, I'd rather simply imply that this entire block of code has a functional feel; assume everything is immutable. That might just be me, however. I prefer immutability to be the rule; if it's done, it's done a lot in the same place; mutability is the exception.
How about Chain() or Attach()?
I prefer Plus (and Minus). They are easily understandable and map directly to operations involving well known immutable types (the numbers). 2+2 doesn't change the value of 2, it returns a new, equally immutable, value.
Some other possibilities:
Splice()
Graft()
Accrete()
How about mate, mateWith, or coitus, for those who abide. In terms of reproducing mammals are generally considered immutable.
Going to throw Union out there too. Borrowed from SQL.
Apparently I'm the first Obj-C/Cocoa person to answer this question.
NNString *empty = [[NSString alloc] init];
NSString *list1 = [empty stringByAppendingString:#"Hello"];
NSString *list2 = [list1 stringByAppendingString:#"immutable"];
NSString *list3 = [list2 stringByAppendingString:#"word"];
Not going to win any code golf games with this.
I think "Add" or "Plus" sounds fine. The name of the list itself should be enough to convey the list's immutability.
Maybe there are some words which remember me more of making a copy and add stuff to that instead of mutating the instance (like "Concatenate"). But i think having some symmetry for those words for other actions would be good to have too. I don't know of a similar word for "Remove" that i think of the same kind like "Concatenate". "Plus" sounds little strange to me. I wouldn't expect it being used in a non-numerical context. But that could aswell come from my non-english background.
Maybe i would use this scheme
AddToCopy
RemoveFromCopy
InsertIntoCopy
These have their own problems though, when i think about it. One could think they remove something or add something to an argument given. Not sure about it at all. Those words do not play nice in chaining either, i think. Too wordy to type.
Maybe i would just use plain "Add" and friends too. I like how it is used in math
Add 1 to 2 and you get 3
Well, certainly, a 2 remains a 2 and you get a new number. This is about two numbers and not about a list and an element, but i think it has some analogy. In my opinion, add does not necessarily mean you mutate something. I certainly see your point that having a lonely statement containing just an add and not using the returned new object does not look buggy. But I've now also thought some time about that idea of using another name than "add" but i just can't come up with another name, without making me think "hmm, i would need to look at the documentation to know what it is about" because its name differs from what I would expect to be called "add". Just some weird thought about this from litb, not sure it makes sense at all :)
Looking at http://thesaurus.reference.com/browse/add and http://thesaurus.reference.com/browse/plus I found gain and affix but I'm not sure how much they imply non-mutation.
I think that Plus() and Minus() or, alternatively, Including(), Excluding() are reasonable at implying immutable behavior.
However, no naming choice will ever make it perfectly clear to everyone, so I personally believe that a good xml doc comment would go a very long way here. VS throws these right in your face when you write code in the IDE - they're hard to ignore.
Append - because, note that names of the System.String methods suggest that they mutate the instance, but they don't.
Or I quite like AfterAppending:
void test()
{
Bar bar = new Bar();
List list = bar.AfterAppending("foo");
}
list.CopyWith(element)
As does Smalltalk :)
And also list.copyWithout(element) that removes all occurrences of an element, which is most useful when used as list.copyWithout(null) to remove unset elements.
I would go for Add, because I can see the benefit of a better name, but the problem would be to find different names for every other immutable operation which might make the class quite unfamiliar if that makes sense.
Similar to Is hard-coding literals ever acceptable?, but I'm specifically thinking of "magic strings" here.
On a large project, we have a table of configuration options like these:
Name Value
---- -----
FOO_ENABLED Y
BAR_ENABLED N
...
(Hundreds of them).
The common practice is to call a generic function to test an option like this:
if (config_options.value('FOO_ENABLED') == 'Y') ...
(Of course, this same option may need to be checked in many places in the system code.)
When adding a new option, I was considering adding a function to hide the "magic string" like this:
if (config_options.foo_enabled()) ...
However, colleagues thought I'd gone overboard and objected to doing this, preferring the hard-coding because:
That's what we normally do
It makes it easier to see what's going on when debugging the code
The trouble is, I can see their point! Realistically, we are never going to rename the options for any reason, so about the only advantage I can think of for my function is that the compiler would catch any typo like fo_enabled(), but not 'FO_ENABLED'.
What do you think? Have I missed any other advantages/disadvantages?
If I use a string once in the code, I don't generally worry about making it a constant somewhere.
If I use a string twice in the code, I'll consider making it a constant.
If I use a string three times in the code, I'll almost certainly make it a constant.
if (config_options.isTrue('FOO_ENABLED')) {...
}
Restrict your hard coded Y check to one place, even if it means writing a wrapper class for your Map.
if (config_options.isFooEnabled()) {...
}
Might seem okay until you have 100 configuration options and 100 methods (so here you can make a judgement about future application growth and needs before deciding on your implementation). Otherwise it is better to have a class of static strings for parameter names.
if (config_options.isTrue(ConfigKeys.FOO_ENABLED)) {...
}
I realise the question is old, but it came up on my margin.
AFAIC, the issue here has not been identified accurately, either in the question, or the answers. Forget about 'harcoding strings" or not, for a moment.
The database has a Reference table, containing config_options. The PK is a string.
There are two types of PKs:
Meaningful Identifiers, that the users (and developers) see and use. These PKs are supposed to be stable, they can be relied upon.
Meaningless Id columns which the users should never see, that the developers have to be aware of, and code around. These cannot be relied upon.
It is ordinary, normal, to write code using the absolute value of a meaningful PK IF CustomerCode = "IBM" ... or IF CountryCode = "AUS" etc.
referencing the absolute value of a meaningless PK is not acceptable (due to auto-increment; gaps being changed; values being replaced wholesale).
.
Your reference table uses meaningful PKs. Referencing those literal strings in code is unavoidable. Hiding the value will make maintenance more difficult; the code is no longer literal; your colleagues are right. Plus there is the additional redundant function that chews cycles. If there is a typo in the literal, you will soon find that out during Dev testing, long before UAT.
hundreds of functions for hundreds of literals is absurd. If you do implement a function, then Normalise your code, and provide a single function that can be used for any of the hundreds of literals. In which case, we are back to a naked literal, and the function can be dispensed with.
the point is, the attempt to hide the literal has no value.
.
It cannot be construed as "hardcoding", that is something quite different. I think that is where your issue is, identifying these constructs as "hardcoded". It is just referencing a Meaningfull PK literally.
Now from the perspective of any code segment only, if you use the same value a few times, you can improve the code by capturing the literal string in a variable, and then using the variable in the rest of the code block. Certainly not a function. But that is an efficiency and good practice issue. Even that does not change the effect IF CountryCode = #cc_aus
I really should use constants and no hard coded literals.
You can say they won't be changed, but you may never know. And it is best to make it a habit. To use symbolic constants.
In my experience, this kind of issue is masking a deeper problem: failure to do actual OOP and to follow the DRY principle.
In a nutshell, capture the decision at startup time by an appropriate definition for each action inside the if statements, and then throw away both the config_options and the run-time tests.
Details below.
The sample usage was:
if (config_options.value('FOO_ENABLED') == 'Y') ...
which raises the obvious question, "What's going on in the ellipsis?", especially given the following statement:
(Of course, this same option may need to be checked in many places in the system code.)
Let's assume that each of these config_option values really does correspond to a single problem domain (or implementation strategy) concept.
Instead of doing this (repeatedly, in various places throughout the code):
Take a string (tag),
Find its corresponding other string (value),
Test that value as a boolean-equivalent,
Based on that test, decide whether to perform some action.
I suggest encapsulating the concept of a "configurable action".
Let's take as an example (obviously just as hypthetical as FOO_ENABLED ... ;-) that your code has to work in either English units or metric units. If METRIC_ENABLED is "true", convert user-entered data from metric to English for internal computation, and convert back prior to displaying results.
Define an interface:
public interface MetricConverter {
double toInches(double length);
double toCentimeters(double length);
double toPounds(double weight);
double toKilograms(double weight);
}
which identifies in one place all the behavior associated with the concept of METRIC_ENABLED.
Then write concrete implementations of all the ways those behaviors are to be carried out:
public class NullConv implements MetricConverter {
double toInches(double length) {return length;}
double toCentimeters(double length) {return length;}
double toPounds(double weight) {return weight;}
double toKilograms(double weight) {return weight;}
}
and
// lame implementation, just for illustration!!!!
public class MetricConv implements MetricConverter {
public static final double LBS_PER_KG = 2.2D;
public static final double CM_PER_IN = 2.54D
double toInches(double length) {return length * CM_PER_IN;}
double toCentimeters(double length) {return length / CM_PER_IN;}
double toPounds(double weight) {return weight * LBS_PER_KG;}
double toKilograms(double weight) {return weight / LBS_PER_KG;}
}
At startup time, instead of loading a bunch of config_options values, initialize a set of configurable actions, as in:
MetricConverter converter = (metricOption()) ? new MetricConv() : new NullConv();
(where the expression metricOption() above is a stand-in for whatever one-time-only check you need to make, including looking at the value of METRIC_ENABLED ;-)
Then, wherever the code would have said:
double length = getLengthFromGui();
if (config_options.value('METRIC_ENABLED') == 'Y') {
length = length / 2.54D;
}
// do some computation to produce result
// ...
if (config_options.value('METRIC_ENABLED') == 'Y') {
result = result * 2.54D;
}
displayResultingLengthOnGui(result);
rewrite it as:
double length = converter.toInches(getLengthFromGui());
// do some computation to produce result
// ...
displayResultingLengthOnGui(converter.toCentimeters(result));
Because all of the implementation details related to that one concept are now packaged cleanly, all future maintenance related to METRIC_ENABLED can be done in one place. In addition, the run-time trade-off is a win; the "overhead" of invoking a method is trivial compared with the overhead of fetching a String value from a Map and performing String#equals.
I believe that the two reasons you have mentioned, Possible misspelling in string, that cannot be detected until run time and the possibility (although slim) of a name change would justify your idea.
On top of that you can get typed functions, now it seems you only store booleans, what if you need to store an int, a string etc. I would rather use get_foo() with a type, than get_string("FOO") or get_int("FOO").
I think there are two different issues here:
In the current project, the convention of using hard-coded strings is already well established, so all the developers working on the project are familiar with it. It might be a sub-optimal convention for all the reasons that have been listed, but everybody familiar with the code can look at it and instinctively knows what the code is supposed to do. Changing the code so that in certain parts, it uses the "new" functionality will make the code slightly harder to read (because people will have to think and remember what the new convention does) and thus a little harder to maintain. But I would guess that changing over the whole project to the new convention would potentially be prohibitively expensive unless you can quickly script the conversion.
On a new project, symbolic constants are the way IMO, for all the reasons listed. Especially because anything that makes the compiler catch errors at compile time that would otherwise be caught by a human at run time is a very useful convention to establish.
Another thing to consider is intent. If you are on a project that requires localization hard coded strings can be ambiguous. Consider the following:
const string HELLO_WORLD = "Hello world!";
print(HELLO_WORLD);
The programmer's intent is clear. Using a constant implies that this string does not need to be localized. Now look at this example:
print("Hello world!");
Here we aren't so sure. Did the programmer really not want this string to be localized or did the programmer forget about localization while he was writing this code?
I too prefer a strongly-typed configuration class if it is used through-out the code. With properly named methods you don't lose any readability. If you need to do conversions from strings to another data type (decimal/float/int), you don't need to repeat the code that does the conversion in multiple places and can cache the result so the conversion only takes place once. You've already got the basis of this in place already so I don't think it would take much to get used to the new way of doing things.
The code base I'm currently working on is littered with hard-coded values.
I view all hard coded values as a code smell and I try to eliminate them where possible...however there are some cases that I am unsure about.
Here are two examples that I can think of that make me wonder what the best practice is:
1. MyTextBox.Text = someCondition ? "Yes" : "No"
2. double myPercentage = myValue / 100;
In the first case, is the best thing to do to create a class that allows me to do MyHelper.Yes and MyHelper.No or perhaps something similar in a config file (though it isn't likely to change and who knows if there might ever be a case where its usage would be case sensitive).
In the second case, finding a percentage by dividing by 100 isn't likely to ever change unless the laws of mathematics change...but I still wonder if there is a better way.
Can anyone suggest an appropriate way to deal with this sort of hard coding? And can anyone think of any places where hard coding is an acceptable practice?
And can anyone think of any places where hard coding is an acceptable practice?
Small apps
Single man projects
Throw aways
Short living projects
For short anything that won't be maintained by others.
Gee I've just realized how much being maintainer coder hurt me in the past :)
The real question isn't about hard coding, but rather repetition. If you take the excellent advice found in "The Pragmatic Programmer", simply Don't Repeat Yourself (DRY).
Taking the principle of DRY, it is fine to hardcode something at any point. However, once you use that particular value again, refactor so this value is only hardcoded once.
Of course hard-coding is sometimes acceptable. Following dogma is rarely as useful a practice as using your brain.
(For an example of this, perhaps it's interesting to go back to the goto wars. How many programmers do you know that will swear by all things holy that goto is evil? Why then does Steve McConnell devote a dozen pages to a measured discussion of the subject in Code Complete?)
Sure, there's a lot of hard-gained experience that tells us that small throw-away applications often mutate into production code, but that's no reason for zealotry. The agilists tell us we should do the simplest thing that could possibly work and refactor when needed.
That's not to say that the "simplest thing" shouldn't be readable code. It may make perfect sense, even in a throw-away spike to write:
const MAX_CACHE_RECORDS = 50
foo = GetNewCache(MAX_CACHE_RECORDS)
This is regardless of the fact that in three iterations time, someone might ask for the number of cache records to be configurable, and you might end up refactoring the constant away.
Just remember, if you go to the extremes of stuff like
const ONE_HUNDRED = 100
const ONE_HUNDRED_AND_ONE = 101
we'll all come to The Daily WTF and laugh at you. :-)
Think! That's all.
It's never good and you just proved it...
double myPercentage = myValue / 100;
This is NOT percentage. What you wanted to write is :
double myPercentage = (myValue / 100) * 100;
Or more correctly :
double myPercentage = (myValue / myMaxValue) * 100;
But this hard coded 100 messed with your mind... So go for the getPercentage method that Colen suggested :)
double getpercentage(double myValue, double maxValue)
{
return (myValue / maxValue) * 100;
}
Also as ctacke suggested, in the first case you will be in a world of pain if you ever need to localize these literals. It's never too much trouble to add a couple more variables and/or functions
The first case will kill you if you ever need to localize. Moving it to some static or constant that is app-wide would at least make localizing it a little easier.
Case 1: When should you hard-code stuff: when you have no reason to think that it will ever change. That said, you should NEVER hard code stuff in-line. Take the time to make static variables or global variables or whatever your language gives you. Do them in the class in question, and if you notice that two classes or areas of your code share the same value FOR THE SAME REASON (meaning it's not just coincidence), point them to the same place.
Case 2: For case case 2, you're correct: the laws of "percentage" will not change (being reasonable, here), so you can hard code inline.
Case 3: The third case is where you think the thing could change but you don't want to/have time to bother loading ResourceBundles or XML or whatever. In that case, you use whatever centralizing mechanism you can -- the hated Singleton class is a good one -- and go with that until you actually have need to deal with the problem.
The third case is tricky, though: it's extraordinarily hard to internationalize an application without really doing it... so you will want to hard-code stuff and just hope that, when the i18n guys come knocking, your code is not the worst-tasting code around :)
Edit: Let me mention that I've just finished a refactoring project in which the prior developer had placed the MySql connect strings in 100+ places in the code (PHP). Sometimes they were uppercase, sometimes they were lower case, etc., so they were hard to search and replace (though Netbeans and PDT did help a lot). There are reasons why he/she did this (a project called POG basically forces this stupidity), but there is just nothing that seems less like good code than repeating the same thing in a million places.
The better way for your second example would be to define an inline function:
double getpercentage(double myValue)
{
return(myValue / 100);
}
...
double myPercentage = getpercentage(myValue);
That way it's a lot more obvious what you're doing.
Hardcoded literals should appear in unit tests for the test values, unless there is so much reuse of a value within a single test class that a local constant is useful.
The unit tests are a description of expected values without any abstraction or redirection.
Imagine yourself reading the test - you want the information literally in front of you.
The only time I use constants for test values is when many tests repeat a value (itself a bit suspicious) and the value may be subject to change.
I do use constants for things like names of test files to compare.
I don't think that your second is really an example of hardcoding. That's like having a Halve() method that takes in a value to use to divide by; doesn't make sense.
Beyond that, example 1, if you want to change the language for your app, you don't want to have to change the class, so it should absolutely be in a config.
Hard coding should be avoided like Dracula avoids the sun. It'll come back to bite you in the ass eventually.
"hardcoding" is the wrong thing to worry about. The point is not whether special values are in code or in config files, the point is:
If the value could ever change, how much work is that and how hard is it to find? Putting it in one place and referring to that place elsewhere is not much work and therefore a way to play it safe.
Will maintainance programmers definitely understand why the value is what it is? If there is any doubt whatsoever, use a named constant that explains the meaning.
Both of these goals can be achieved without any need for config files; in fact I'd avoid those if possible. "putting stuff in config files means it's easier to change" is a myth, unless either
you actually want to support customers changing the values themselves
no value that could possibly be put in the config file can cause a bug (buffer overflow, anyone?)
your build and deployment process sucks
The text for the conditions should be in a resource file; that's what it's there for.
Not normally (Are hard-coding literals acceptable)
Another way at looking at this is how using a good naming convention
for constants used in-place of hard coded literals provides additional
documentation in the program.
Even if the number is used only once, it can still be hard to recognized
and may even be hard to find for future changes.
IMHO, making programs easier to read should be second nature to a
seasoned software professional. Raw numbers rarely communicate
meaningfully.
The extra time taken to use a well named constant will make the
code readability (easy to recall to the mind) and useful for future
re-mining (code re-use).
I tend to view it in terms of the project's scope and size.
Some simple projects that I am a solo dev on? Sure, I hard code lots of things. Tools I write that only I will ever use? Sure, if it gets the job done.
But, in working on larger, team projects? I agree, they are suspect and usually the product of laziness. Tag them for review and see if you can spot a pattern where they can be abstracted away.
In your example, the text box should be localizable, so why not a class that handles that?
Remember that you WILL forget the meaning of any non-obvious hard-coded value.
So be certain to put a short comment after each to remind you.
A Delphi example:
Length := Length * 0.3048; { 0.3048 converts feet to meters }
no.
What is a simple throw away app today will be driving your entire enterprise tomorrow. Always use best practices or you'll regret it.
Code always evolves. When you initially write stuff hard coding is the easiest way to go. Later when a need arrives to change the value it can be improved. In some cases the need never comes.
The need can arrive in many forms:
The value is used in many places and it needs to be changed by a programmer. In this case a constant is clearly needed.
User needs to be able to change the value.
I don't see the need to avoid hard coding. I do see the need to change things when there is a clear need.
Totally separate issue is that of course the code needs to be readable and this means that there might be a need for a comment for the hard coded value.
For the first value, it really depends. If you don't anticipate any kind of wide-spread adoption of your application and internationalization will never be an issue, I think it's mostly fine. However, if you are writing some kind of open source software or something with a larger audience consider the fact that it may one day need to be translated. In that case, you may be better off using string resources.
It's okay as long as you don't do refactoring, unit-testing, peer code reviews. And, you don't want repeat customers. Who cares?
I once had a boss who refused to not hardcode something because in his mind it gave him full control over the software and the items related to the software. Problem was, when the hardware died that ran the software the server got renamed... meaning he had to find his code. That took a while. I simply found a hex editor and hacked around it instead of waiting.
I normally add a set of helper methods for strings and numbers.
For example when I have strings such as 'yes' and 'no' I have a function called __ so I call __('yes'); which starts out in the project by just returning the first parameter but when I need to do more complex stuff (such as internationaizaton) it's already there and the param can be used a key.
Another example is VAT (form of UK tax) in online shops, recently it changed from 17.5% to 15%. Any one who hard coded VAT by doing:
$vat = $price * 0.175;
had to then go through all references and change it to 0.15, instead the super usefull way of doing it would be to have a function or variable for VAT.
In my opinion anything that could change should be written in a changeable way. If I find myself doing the same thing more than 5 times in the same day then it becomes a function or a config var.
Hard coding should be banned forever. Althought in you very simple examples i don't see anything wrong using them in any kind of project.
In my opinion hard coding is when you believe that a variable/value/define etc. will never change and create all your code based on that belief.
Example of such hard coding is the book Teach Yourself C in 24 Hours that everybody should avoid.
Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
I know what Hungarian refers to - giving information about a variable, parameter, or type as a prefix to its name. Everyone seems to be rabidly against it, even though in some cases it seems to be a good idea. If I feel that useful information is being imparted, why shouldn't I put it right there where it's available?
See also: Do people use the Hungarian naming conventions in the real world?
vUsing adjHungarian nnotation vmakes nreading ncode adjdifficult.
Most people use Hungarian notation in a wrong way and are getting wrong results.
Read this excellent article by Joel Spolsky: Making Wrong Code Look Wrong.
In short, Hungarian Notation where you prefix your variable names with their type (string) (Systems Hungarian) is bad because it's useless.
Hungarian Notation as it was intended by its author where you prefix the variable name with its kind (using Joel's example: safe string or unsafe string), so called Apps Hungarian has its uses and is still valuable.
Joel is wrong, and here is why.
That "application" information he's talking about should be encoded in the type system. You should not depend on flipping variable names to make sure you don't pass unsafe data to functions requiring safe data. You should make it a type error, so that it is impossible to do so. Any unsafe data should have a type that is marked unsafe, so that it simply cannot be passed to a safe function. To convert from unsafe to safe should require processing with some kind of a sanitize function.
A lot of the things that Joel talks of as "kinds" are not kinds; they are, in fact, types.
What most languages lack, however, is a type system that's expressive enough to enforce these kind of distinctions. For example, if C had a kind of "strong typedef" (where the typedef name had all the operations of the base type, but was not convertible to it) then a lot of these problems would go away. For example, if you could say, strong typedef std::string unsafe_string; to introduce a new type unsafe_string that could not be converted to a std::string (and so could participate in overload resolution etc. etc.) then we would not need silly prefixes.
So, the central claim that Hungarian is for things that are not types is wrong. It's being used for type information. Richer type information than the traditional C type information, certainly; it's type information that encodes some kind of semantic detail to indicate the purpose of the objects. But it's still type information, and the proper solution has always been to encode it into the type system. Encoding it into the type system is far and away the best way to obtain proper validation and enforcement of the rules. Variables names simply do not cut the mustard.
In other words, the aim should not be "make wrong code look wrong to the developer". It should be "make wrong code look wrong to the compiler".
I think it massively clutters up the source code.
It also doesn't gain you much in a strongly typed language. If you do any form of type mismatch tomfoolery, the compiler will tell you about it.
Hungarian notation only makes sense in languages without user-defined types. In a modern functional or OO-language, you would encode information about the "kind" of value into the datatype or class rather than into the variable name.
Several answers reference Joels article. Note however that his example is in VBScript, which didn't support user-defined classes (for a long time at least). In a language with user-defined types you would solve the same problem by creating a HtmlEncodedString-type and then let the Write method accept only that. In a statically typed language, the compiler will catch any encoding-errors, in a dynamically typed you would get a runtime exception - but in any case you are protected against writing unencoded strings. Hungarian notations just turns the programmer into a human type-checker, with is the kind of job that is typically better handled by software.
Joel distinguishes between "systems hungarian" and "apps hungarian", where "systems hungarian" encodes the built-in types like int, float and so on, and "apps hungarian" encodes "kinds", which is higher-level meta-info about variable beyound the machine type, In a OO or modern functional language you can create user-defined types, so there is no distinction between type and "kind" in this sense - both can be represented by the type system - and "apps" hungarian is just as redundant as "systems" hungarian.
So to answer your question: Systems hungarian would only be useful in a unsafe, weakly typed language where e.g. assigning a float value to an int variable will crash the system. Hungarian notation was specifically invented in the sixties for use in BCPL, a pretty low-level language which didn't do any type checking at all. I dont think any language in general use today have this problem, but the notation lived on as a kind of cargo cult programming.
Apps hungarian will make sense if you are working with a language without user defined types, like legacy VBScript or early versions of VB. Perhaps also early versions of Perl and PHP. Again, using it in a modern languge is pure cargo cult.
In any other language, hungarian is just ugly, redundant and fragile. It repeats information already known from the type system, and you should not repeat yourself. Use a descriptive name for the variable that describes the intent of this specific instance of the type. Use the type system to encode invariants and meta info about "kinds" or "classes" of variables - ie. types.
The general point of Joels article - to have wrong code look wrong - is a very good principle. However an even better protection against bugs is to - when at all possible - have wrong code to be detected automatically by the compiler.
I always use Hungarian notation for all my projects. I find it really helpful when I'm dealing with 100s of different identifier names.
For example, when I call a function requiring a string I can type 's' and hit control-space and my IDE will show me exactly the variable names prefixed with 's' .
Another advantage, when I prefix u for unsigned and i for signed ints, I immediately see where I am mixing signed and unsigned in potentially dangerous ways.
I cannot remember the number of times when in a huge 75000 line codebase, bugs were caused (by me and others too) due to naming local variables the same as existing member variables of that class. Since then, I always prefix members with 'm_'
Its a question of taste and experience. Don't knock it until you've tried it.
You're forgetting the number one reason to include this information. It has nothing to do with you, the programmer. It has everything to do with the person coming down the road 2 or 3 years after you leave the company who has to read that stuff.
Yes, an IDE will quickly identify types for you. However, when you're reading through some long batches of 'business rules' code, it's nice to not have to pause on each variable to find out what type it is. When I see things like strUserID, intProduct or guiProductID, it makes for much easier 'ramp up' time.
I agree that MS went way too far with some of their naming conventions - I categorize that in the "too much of a good thing" pile.
Naming conventions are good things, provided you stick to them. I've gone through enough old code that had me constantly going back to look at the definitions for so many similarly-named variables that I push "camel casing" (as it was called at a previous job). Right now I'm on a job that has many thousand of lines of completely uncommented classic ASP code with VBScript and it's a nightmare trying to figure things out.
Tacking on cryptic characters at the beginning of each variable name is unnecessary and shows that the variable name by itself isn't descriptive enough. Most languages require the variable type at declaration anyway, so that information is already available.
There's also the situation where, during maintenance, a variable type needs to change. Example: if a variable declared as "uint_16 u16foo" needs to become a 64-bit unsigned, one of two things will happen:
You'll go through and change each variable name (making sure not to hose any unrelated variables with the same name), or
Just change the type and not change the name, which will only cause confusion.
Joel Spolsky wrote a good blog post about this.
http://www.joelonsoftware.com/articles/Wrong.html
Basically it comes down to not making your code harder to read when a decent IDE will tell you want type the variable is if you can't remember. Also, if you make your code compartmentalized enough, you don't have to remember what a variable was declared as three pages up.
Isn't scope more important than type these days, e.g.
* l for local
* a for argument
* m for member
* g for global
* etc
With modern techniques of refactoring old code, search and replace of a symbol because you changed its type is tedious, the compiler will catch type changes, but often will not catch incorrect use of scope, sensible naming conventions help here.
There is no reason why you should not make correct use of Hungarian notation. It's unpopularity is due to a long-running back-lash against the mis-use of Hungarian notation, especially in the Windows APIs.
In the bad-old days, before anything resembling an IDE existed for DOS (odds are you didn't have enough free memory to run the compiler under Windows, so your development was done in DOS), you didn't get any help from hovering your mouse over a variable name. (Assuming you had a mouse.) What did you did have to deal with were event callback functions in which everything was passed to you as either a 16-bit int (WORD) or 32-bit int (LONG WORD). You then had to cast those parameter to the appropriate types for the given event type. In effect, much of the API was virtually type-less.
The result, an API with parameter names like these:
LRESULT CALLBACK WindowProc(HWND hwnd,
UINT uMsg,
WPARAM wParam,
LPARAM lParam);
Note that the names wParam and lParam, although pretty awful, aren't really any worse than naming them param1 and param2.
To make matters worse, Window 3.0/3.1 had two types of pointers, near and far. So, for example, the return value from memory management function LocalLock was a PVOID, but the return value from GlobalLock was an LPVOID (with the 'L' for long). That awful notation then got extended so that a long pointer string was prefixed lp, to distinguish it from a string that had simply been malloc'd.
It's no surprise that there was a backlash against this sort of thing.
Hungarian Notation can be useful in languages without compile-time type checking, as it would allow developer to quickly remind herself of how the particular variable is used. It does nothing for performance or behavior. It is supposed to improve code readability and is mostly a matter a taste and coding style. For this very reason it is criticized by many developers -- not everybody has the same wiring in the brain.
For the compile-time type-checking languages it is mostly useless -- scrolling up a few lines should reveal the declaration and thus type. If you global variables or your code block spans for much more than one screen, you have grave design and reusability issues. Thus one of the criticisms is that Hungarian Notation allows developers to have bad design and easily get away with it. This is probably one of the reasons for hatered.
On the other hand, there can be cases where even compile-time type-checking languages would benefit from Hungarian Notation -- void pointers or HANDLE's in win32 API. These obfuscates the actual data type, and there might be a merit to use Hungarian Notation there. Yet, if one can know the type of data at build time, why not to use the appropriate data type.
In general, there are no hard reasons not to use Hungarian Notation. It is a matter of likes, policies, and coding style.
As a Python programmer, Hungarian Notation falls apart pretty fast. In Python, I don't care if something is a string - I care if it can act like a string (i.e. if it has a ___str___() method which returns a string).
For example, let's say we have foo as an integer, 12
foo = 12
Hungarian notation tells us that we should call that iFoo or something, to denote it's an integer, so that later on, we know what it is. Except in Python, that doesn't work, or rather, it doesn't make sense. In Python, I decide what type I want when I use it. Do I want a string? well if I do something like this:
print "The current value of foo is %s" % foo
Note the %s - string. Foo isn't a string, but the % operator will call foo.___str___() and use the result (assuming it exists). foo is still an integer, but we treat it as a string if we want a string. If we want a float, then we treat it as a float. In dynamically typed languages like Python, Hungarian Notation is pointless, because it doesn't matter what type something is until you use it, and if you need a specific type, then just make sure to cast it to that type (e.g. float(foo)) when you use it.
Note that dynamic languages like PHP don't have this benefit - PHP tries to do 'the right thing' in the background based on an obscure set of rules that almost no one has memorized, which often results in catastrophic messes unexpectedly. In this case, some sort of naming mechanism, like $files_count or $file_name, can be handy.
In my view, Hungarian Notation is like leeches. Maybe in the past they were useful, or at least they seemed useful, but nowadays it's just a lot of extra typing for not a lot of benefit.
The IDE should impart that useful information. Hungarian might have made some sort (not a whole lot, but some sort) of sense when IDE's were much less advanced.
Apps Hungarian is Greek to me--in a good way
As an engineer, not a programmer, I immediately took to Joel's article on the merits of Apps Hungarian: "Making Wrong Code Look Wrong". I like Apps Hungarian because it mimics how engineering, science, and mathematics represent equations and formulas using sub- and super-scripted symbols (like Greek letters, mathematical operators, etc.). Take a particular example of Newton's Law of Universal Gravity: first in standard mathematical notation, and then in Apps Hungarian pseudo-code:
frcGravityEarthMars = G * massEarth * massMars / norm(posEarth - posMars)
In the mathematical notation, the most prominent symbols are those representing the kind of information stored in the variable: force, mass, position vector, etc. The subscripts play second fiddle to clarify: position of what? This is exactly what Apps Hungarian is doing; it's telling you the kind of thing stored in the variable first and then getting into specifics--about the closest code can get to mathematical notation.
Clearly strong typing can resolve the safe vs. unsafe string example from Joel's essay, but you wouldn't define separate types for position and velocity vectors; both are double arrays of size three and anything you're likely to do to one might apply to the other. Furthermore, it make perfect sense to concatenate position and velocity (to make a state vector) or take their dot product, but probably not to add them. How would typing allow the first two and prohibit the second, and how would such a system extend to every possible operation you might want to protect? Unless you were willing to encode all of math and physics in your typing system.
On top of all that, lots of engineering is done in weakly typed high-level languages like Matlab, or old ones like Fortran 77 or Ada.
So if you have a fancy language and IDE and Apps Hungarian doesn't help you then forget it--lots of folks apparently have. But for me, a worse than a novice programmer who is working in weakly or dynamically typed languages, I can write better code faster with Apps Hungarian than without.
It's incredibly redundant and useless is most modern IDEs, where they do a good job of making the type apparent.
Plus -- to me -- it's just annoying to see intI, strUserName, etc. :)
If I feel that useful information is being imparted, why shouldn't I put it right there where it's available?
Then who cares what anybody else thinks? If you find it useful, then use the notation.
Im my experience, it is bad because:
1 - then you break all the code if you need to change the type of a variable (i.e. if you need to extend a 32 bits integer to a 64 bits integer);
2 - this is useless information as the type is either already in the declaration or you use a dynamic language where the actual type should not be so important in the first place.
Moreover, with a language accepting generic programming (i.e. functions where the type of some variables is not determine when you write the function) or with dynamic typing system (i.e. when the type is not even determine at compile time), how would you name your variables? And most modern languages support one or the other, even if in a restricted form.
In Joel Spolsky's Making Wrong Code Look Wrong he explains that what everybody thinks of as Hungarian Notation (which he calls Systems Hungarian) is not what was it was really intended to be (what he calls Apps Hungarian). Scroll down to the I’m Hungary heading to see this discussion.
Basically, Systems Hungarian is worthless. It just tells you the same thing your compiler and/or IDE will tell you.
Apps Hungarian tells you what the variable is supposed to mean, and can actually be useful.
I've always thought that a prefix or two in the right place wouldn't hurt. I think if I can impart something useful, like "Hey this is an interface, don't count on specific behaviour" right there, as in IEnumerable, I oughtta do it. Comment can clutter things up much more than just a one or two character symbol.
It's a useful convention for naming controls on a form (btnOK, txtLastName etc.), if the list of controls shows up in an alphabetized pull-down list in your IDE.
I tend to use Hungarian Notation with ASP.NET server controls only, otherwise I find it too hard to work out what controls are what on the form.
Take this code snippet:
<asp:Label ID="lblFirstName" runat="server" Text="First Name" />
<asp:TextBox ID="txtFirstName" runat="server" />
<asp:RequiredFieldValidator ID="rfvFirstName" runat="server" ... />
If someone can show a better way of having that set of control names without Hungarian I'd be tempted to move to it.
Joel's article is great, but it seems to omit one major point:
Hungarian makes a particular 'idea' (kind + identifier name) unique,
or near-unique, across the codebase - even a very large codebase.
That's huge for code maintenance.
It means you can use good ol' single-line text search
(grep, findstr, 'find in all files') to find EVERY mention of that 'idea'.
Why is that important when we have IDE's that know how to read code?
Because they're not very good at it yet. This is hard to see in a small codebase,
but obvious in a large one - when the 'idea' might be mentioned in comments,
XML files, Perl scripts, and also in places outside source control (documents, wikis,
bug databases).
You do have to be a little careful even here - e.g. token-pasting in C/C++ macros
can hide mentions of the identifier. Such cases can be dealt with using
coding conventions, and anyway they tend to affect only a minority of the identifiers in the
codebase.
P.S. To the point about using the type system vs. Hungarian - it's best to use both.
You only need wrong code to look wrong if the compiler won't catch it for you. There are plenty of cases where it is infeasible to make the compiler catch it. But where it's feasible - yes, please do that instead!
When considering feasibility, though, do consider the negative effects of splitting up types. e.g. in C#, wrapping 'int' with a non-built-in type has huge consequences. So it makes sense in some situations, but not in all of them.
Debunking the benefits of Hungarian Notation
It provides a way of distinguishing variables.
If the type is all that distinguishes the one value from another, then it can only be for the conversion of one type to another. If you have the same value that is being converted between types, chances are you should be doing this in a function dedicated to conversion. (I have seen hungarianed VB6 leftovers use strings on all of their method parameters simply because they could not figure out how to deserialize a JSON object, or properly comprehend how to declare or use nullable types.) If you have two variables distinguished only by the Hungarian prefix, and they are not a conversion from one to the other, then you need to elaborate on your intention with them.
It makes the code more readable.
I have found that Hungarian notation makes people lazy with their variable names. They have something to distinguish it by, and they feel no need to elaborate to its purpose. This is what you will typically find in Hungarian notated code vs. modern: sSQL vs. groupSelectSql (or usually no sSQL at all because they are supposed to be using the ORM that was put in by earlier developers.), sValue vs. formCollectionValue (or usually no sValue either, because they happen to be in MVC and should be using its model binding features), sType vs. publishSource, etc.
It can't be readability. I see more sTemp1, sTemp2... sTempN from any given hungarianed VB6 leftover than everybody else combined.
It prevents errors.
This would be by virtue of number 2, which is false.
In the words of the master:
http://www.joelonsoftware.com/articles/Wrong.html
An interesting reading, as usual.
Extracts:
"Somebody, somewhere, read Simonyi’s paper, where he used the word “type,” and thought he meant type, like class, like in a type system, like the type checking that the compiler does. He did not. He explained very carefully exactly what he meant by the word “type,” but it didn’t help. The damage was done."
"But there’s still a tremendous amount of value to Apps Hungarian, in that it increases collocation in code, which makes the code easier to read, write, debug, and maintain, and, most importantly, it makes wrong code look wrong."
Make sure you have some time before reading Joel On Software. :)
Several reasons:
Any modern IDE will give you the variable type by simply hovering your mouse over the variable.
Most type names are way long (think HttpClientRequestProvider) to be reasonably used as prefix.
The type information does not carry the right information, it is just paraphrasing the variable declaration, instead of outlining the purpose of the variable (think myInteger vs. pageSize).
I don't think everyone is rabidly against it. In languages without static types, it's pretty useful. I definitely prefer it when it's used to give information that is not already in the type. Like in C, char * szName says that the variable will refer to a null terminated string -- that's not implicit in char* -- of course, a typedef would also help.
Joel had a great article on using hungarian to tell if a variable was HTML encoded or not:
http://www.joelonsoftware.com/articles/Wrong.html
Anyway, I tend to dislike Hungarian when it's used to impart information I already know.
Of course when 99% of programmers agree on something, there is something wrong. The reason they agree here is because most of them have never used Hungarian notation correctly.
For a detailed argument, I refer you to a blog post I have made on the subject.
http://codingthriller.blogspot.com/2007/11/rediscovering-hungarian-notation.html
I started coding pretty much the about the time Hungarian notation was invented and the first time I was forced to use it on a project I hated it.
After a while I realised that when it was done properly it did actually help and these days I love it.
But like all things good, it has to be learnt and understood and to do it properly takes time.
The Hungarian notation was abused, particularly by Microsoft, leading to prefixes longer than the variable name, and showing it is quite rigid, particularly when you change the types (the infamous lparam/wparam, of different type/size in Win16, identical in Win32).
Thus, both due to this abuse, and its use by M$, it was put down as useless.
At my work, we code in Java, but the founder cames from MFC world, so use similar code style (aligned braces, I like this!, capitals to method names, I am used to that, prefix like m_ to class members (fields), s_ to static members, etc.).
And they said all variables should have a prefix showing its type (eg. a BufferedReader is named brData). Which shown as being a bad idea, as the types can change but the names doesn't follow, or coders are not consistent in the use of these prefixes (I even see aBuffer, theProxy, etc.!).
Personally, I chose for a few prefixes that I find useful, the most important being b to prefix boolean variables, as they are the only ones where I allow syntax like if (bVar) (no use of autocast of some values to true or false).
When I coded in C, I used a prefix for variables allocated with malloc, as a reminder it should be freed later. Etc.
So, basically, I don't reject this notation as a whole, but took what seems fitting for my needs.
And of course, when contributing to some project (work, open source), I just use the conventions in place!