Acronyms in CamelCase [closed] - language-agnostic

Closed. This question is opinion-based. It is not currently accepting answers.
Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
I have a question about CamelCase. Suppose you have this acronym: Unesco = United Nations Educational, Scientific and Cultural Organization.
You should write: unitedNationsEducationalScientificAndCulturalOrganization
But what if you need to write the acronym? Something like:
getUnescoProperties();
Which is the right way to write it: getUnescoProperties() or getUNESCOProperties()?

There are legitimate criticisms of the Microsoft advice from the accepted answer.
Inconsistent treatment of acronyms/initialisms depending on number of characters:
playerID vs playerId vs playerIdentifier.
The question of whether two-letter acronyms should still be capitalized if they appear at the start of the identifier:
USTaxes vs usTaxes
Difficulty in distinguishing multiple acronyms:
i.e. USID vs usId (or parseDBMXML in Wikipedia's example).
So I'll post this answer as an alternative to the accepted answer. All acronyms should be treated consistently; acronyms should be treated like any other word. Quoting Wikipedia:
...some programmers prefer to treat abbreviations as if they were lower case words...
So re: OP's question, I agree with the accepted answer; this is correct: getUnescoProperties()
But I think I'd reach a different conclusion in these examples:
US Taxes → usTaxes
Player ID → playerId
So vote for this answer if you think two-letter acronyms should be treated like other acronyms.
Camel Case is a convention, not a specification. So I guess popular opinion rules.
(EDIT: Removing this suggestion that votes should decide this issue; as @Brian David says, Stack Overflow is not a "popularity contest", and this question was closed as "opinion based".)
Even though many prefer to treat acronyms like any other word, the more common practice may be to put acronyms in all caps (even though it leads to "abominations").
See "EDXML" in this XML schema
See "SFAS158" in this XBRL schema
Other Resources:
Note some people distinguish between abbreviation and acronyms
Note Microsoft guidelines distinguish between two-character acronyms, and "acronyms more than two characters long"
Note some people recommend to avoid abbreviations / acronyms altogether
Note some people recommend to avoid camelCase / PascalCase altogether
Note people use "consistency" in two senses: some mean rules that are internally consistent (i.e. not treating two-character acronyms differently from three-character acronyms); others mean "applying the same rule consistently" (even if the rule is internally inconsistent)
Framework Design Guidelines
Microsoft Guidelines

Some guidelines Microsoft has written about camelCase are:
When using acronyms, use Pascal case or camel case for acronyms more than two characters long. For example, use HtmlButton or htmlButton. However, you should capitalize acronyms that consist of only two characters, such as System.IO instead of System.Io.
Do not use abbreviations in identifiers or parameter names. If you must use abbreviations, use camel case for abbreviations that consist of more than two characters, even if this contradicts the standard abbreviation of the word.
Summing up:
When you use an abbreviation or acronym that is two characters long, put them all in caps;
When the acronym is longer than two chars, use a capital for the first character.
So, in your specific case, getUnescoProperties() is correct.
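The summary above can be sketched in a few lines of Python. This is only an illustration of the two rules as stated here, not Microsoft's own tooling: the function name `ms_pascal_case` and the `acronyms` parameter (a set of lower-case words the caller considers acronyms) are made up for this example, and it only produces Pascal case.

```python
def ms_pascal_case(words, acronyms=frozenset()):
    """Join words in Pascal case using the two rules summarised above.

    `acronyms` is a hypothetical, caller-supplied set of lower-case acronyms.
    """
    parts = []
    for word in words:
        if word.lower() in acronyms and len(word) == 2:
            # Two-character acronym: all caps (IO, UI).
            parts.append(word.upper())
        else:
            # Ordinary word or longer acronym: capital first letter only.
            parts.append(word.capitalize())
    return "".join(parts)


print(ms_pascal_case(["html", "button"], {"html"}))  # HtmlButton
print(ms_pascal_case(["io", "stream"], {"io"}))      # IOStream
```

Note that the function cannot guess which words are acronyms; someone still has to decide that, which is part of why the rule draws criticism.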

To convert to CamelCase, there is also Google's (nearly) deterministic Camel case algorithm:
Beginning with the prose form of the name:
1. Convert the phrase to plain ASCII and remove any apostrophes. For example, "Müller's algorithm" might become "Muellers algorithm".
2. Divide this result into words, splitting on spaces and any remaining punctuation (typically hyphens). Recommended: if any word already has a conventional camel case appearance in common usage, split this into its constituent parts (e.g., "AdWords" becomes "ad words"). Note that a word such as "iOS" is not really in camel case per se; it defies any convention, so this recommendation does not apply.
3. Now lowercase everything (including acronyms), then uppercase only the first character of:
   … each word, to yield upper camel case, or
   … each word except the first, to yield lower camel case
4. Finally, join all the words into a single identifier.
Note that the casing of the original words is almost entirely disregarded.
For example, "XML HTTP request" is correctly transformed to XmlHttpRequest; XMLHTTPRequest is incorrect.
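The steps quoted above are mechanical enough to sketch in Python. This is a rough approximation, not Google's implementation: the name `to_camel_case` is made up here, accents are simply stripped (so "Müller" becomes "Muller" rather than "Mueller"), and the recommended splitting of words like "AdWords" is left out.

```python
import re
import unicodedata


def to_camel_case(phrase, upper=False):
    """Approximate the quoted algorithm (hypothetical helper)."""
    # Step 1: reduce to plain ASCII and remove apostrophes.
    # Assumption: accents are dropped rather than transliterated.
    ascii_text = unicodedata.normalize("NFKD", phrase)
    ascii_text = ascii_text.encode("ascii", "ignore").decode("ascii")
    ascii_text = ascii_text.replace("'", "")
    # Step 2: split on spaces and any remaining punctuation.
    words = [w for w in re.split(r"[^A-Za-z0-9]+", ascii_text) if w]
    # Step 3: lowercase everything, then capitalise each word...
    words = [w.lower().capitalize() for w in words]
    # ...except the first one when lower camel case is wanted.
    if words and not upper:
        words[0] = words[0].lower()
    # Step 4: join the words into a single identifier.
    return "".join(words)


print(to_camel_case("XML HTTP request", upper=True))  # XmlHttpRequest
print(to_camel_case("XML HTTP request"))              # xmlHttpRequest
```

Because the casing of the input is discarded in step 3, "xml http request" and "XML HTTP Request" give the same identifier, which is what makes the scheme nearly deterministic.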

getUnescoProperties() should be the best solution...
When possible, just follow pure camelCase; when you have acronyms, leave them in upper case where that stays readable, and otherwise fall back to camelCase.
Generally in OO programming, variables should start with a lower-case letter (lowerCamelCase) and classes should start with an upper-case letter (UpperCamelCase).
When in doubt just go pure camelCase ;)
parseXML is fine, parseXml is also camelCase
XMLHTTPRequest should be XmlHttpRequest or xmlHttpRequest. Consecutive upper-case acronyms are a dead end: it is not clear in every case where one ends and the next begins.
For example, how do you read HTTPSSLRequest: as HTTP + SSL, or as HTTPS + SL (which doesn't mean anything, but still)? In that case follow the camel case convention and go for httpSslRequest or httpsSlRequest. It may be less pretty, but it is definitely clearer.
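To make the ambiguity concrete, here is a small Python sketch that splits an identifier back into words. The regular expression and the name `split_words` are assumptions chosen for illustration; other splitters exist and behave differently on edge cases.

```python
import re

# One word is either an optional capital followed by lower-case letters or
# digits, or a run of capitals that is not followed by a lower-case letter.
WORD = re.compile(r"[A-Z]?[a-z0-9]+|[A-Z]+(?![a-z])")


def split_words(identifier):
    """Split a camel-cased identifier into its words (hypothetical helper)."""
    return WORD.findall(identifier)


print(split_words("httpSslRequest"))  # ['http', 'Ssl', 'Request']
print(split_words("httpsSlRequest"))  # ['https', 'Sl', 'Request']
print(split_words("HTTPSSLRequest"))  # ['HTTPSSL', 'Request']
```

With the acronyms camel-cased, both readings can be spelled and recovered; with consecutive capitals the boundary between the two acronyms is lost to a machine just as it is to a human reader.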

The Airbnb JavaScript Style Guide on GitHub, which has a lot of stars (~57.5k at the time of writing), includes guidance on acronyms:
Acronyms and initialisms should always be all capitalized, or all
lowercased.
Why? Names are for readability, not to appease a computer algorithm.
// bad
import SmsContainer from './containers/SmsContainer';
// bad
const HttpRequests = [
// ...
];
// good
import SMSContainer from './containers/SMSContainer';
// good
const HTTPRequests = [
// ...
];
// also good
const httpRequests = [
// ...
];
// best
import TextMessageContainer from './containers/TextMessageContainer';
// best
const requests = [
// ...
];

In addition to what @valex has said, I want to recap a couple of things from the answers given to this question.
I think the general answer is: it depends on the programming language that you are using.
C#
Microsoft has written some guidelines, according to which HtmlButton is the right way to name a class in such cases.
JavaScript
JavaScript has some globals whose names contain acronyms, and it writes them in all upper case (though, funnily enough, not always consistently). Here are some examples:
encodeURIComponent
XMLHttpRequest
toJSON
toISOString

Currently I am using the following rules:
Capital case for acronyms: XMLHTTPRequest, xmlHTTPRequest, requestIPAddress.
Camel case for abbreviations: ID[entifier], Exe[cutable], App[lication].
ID is an exception, sorry but true.
When I see a capital letter I assume an acronym, i.e. a separate word for each letter. Abbreviations do not have separate words for each letter, so I use camel case.
XMLHTTPRequest is ambiguous, but such cases are rare and the ambiguity is mild, so it's OK; rules and logic are more important than beauty.

The JavaScript Airbnb style guide talks a bit about this. Basically:
// bad
const HttpRequests = [ req ];
// good
const httpRequests = [ req ];
// also good
const HTTPRequests = [ req ];
Because I typically read a leading capital letter as a class, I tend to avoid that. At the end of the day, it's all preference.

Disclaimer: English is not my mother tongue. But I've thought about this problem for a long time, especially when using Node (camelCase style) to work with a database, where field names are snake_cased. These are my thoughts:
There are 2 kinds of 'acronyms' for a programmer:
in natural language, e.g. UNESCO
in programming languages, e.g. tmc for textMessageContainer, which usually appears as a local variable.
In the programming world, all natural-language acronyms should be treated as words, for these reasons:
When we program, we should name a variable either entirely in acronym style or entirely in non-acronym style. If we name a function getUNESCOProperties, it means UNESCO is an acronym (otherwise it shouldn't be all uppercase letters), but evidently get and properties are not acronyms. So we would have to name this function either gunescop or getUnitedNationsEducationalScientificAndCulturalOrganizationProperties, and both are unacceptable.
Natural language evolves continuously, and today's acronyms will become words tomorrow, but programs should be independent of this trend and stand forever.
By the way, in the most-voted answer, IO is an acronym in the programming sense (it stands for InputOutput), but I don't like the name: I think programming acronyms should only be used to name local variables, not top-level classes or functions, so InputOutput should be used instead of IO.

There is also another camelcase convention that tries to favor readability for acronyms by using either uppercase (HTML), or lowercase (html), but avoiding both (Html).
So in your case you could write getUNESCOProperties. You could also write unescoProperties for a variable, or UNESCOProperties for a class (the convention for classes is to start with uppercase).
This rule gets tricky if you want to put together two acronyms, for example for a class named XML HTTP Request. It would start with uppercase, but since XMLHTTPRequest would not be easy to read (is it XMLH TTP Request?), and XMLhttpRequest would break the camelcase convention (is it XM Lhttp Request?), the best option would be to mix case: XMLHttpRequest, which is actually what the W3C used. However, this sort of naming is discouraged. For this example, HTTPRequest would be a better name.
Since the official English word for identification/identity seems to be ID, although it is not an acronym, you could apply the same rules there.
This convention seems to be pretty popular out there, but it's just a convention and there is no right or wrong. Just try to stick to a convention and make sure your names are readable.

UNESCO is a special case, as it is usually (in English) read as a word rather than spelled out letter by letter: like UEFA, RADA, BAFTA and unlike BBC, HTML, SSL


Explain the difference between Docstring and Comment with an appropriate example in python? [duplicate]

I'm a bit confused over the difference between docstrings and comments in python.
In my class my teacher introduced something known as a 'design recipe', a set of steps that will supposedly help us students plot and organize our coding better in Python. From what I understand, the below is an example of the steps we follow - this so-called design recipe (the stuff in the quotations):
def term_work_mark(a0_mark, a1_mark, a2_mark, ex_mark, midterm_mark):
    ''' (float, float, float, float, float) -> float
    Takes your marks on a0_mark, a1_mark, a2_mark, ex_mark and midterm_mark,
    calculates their respective weight contributions and sums these
    contributions to deliver your overall term mark out of a maximum of 55 (This
    is because the exam mark is not taken account of in this function)
    >>> term_work_mark(5, 5, 5, 5, 5)
    11.8
    >>> term_work_mark(0, 0, 0, 0, 0)
    0.0
    '''
    a0_component = contribution(a0_mark, a0_max_mark, a0_weight)
    a1_component = contribution(a1_mark, a1_max_mark, a1_weight)
    a2_component = contribution(a2_mark, a2_max_mark, a2_weight)
    ex_component = contribution(ex_mark, exercises_max_mark, exercises_weight)
    mid_component = contribution(midterm_mark, midterm_max_mark, midterm_weight)
    return (a0_component + a1_component + a2_component + ex_component +
            mid_component)
As far as I understand this is basically a docstring, and in our version of a docstring it must include three things: a description, examples of what your function should do if you enter it in the python shell, and a 'type contract', a section that shows you what types you enter and what types the function will return.
Now this is all good and done, but our assignments require us to also have comments which explain the nature of our functions, using the token '#' symbol.
So, my question is, haven't I already explained what my function will do in the description section of the docstring? What's the point of adding comments if I'll essentially be telling the reader the exact same thing?
It appears your teacher is a fan of How to Design Programs ;)
I'd tackle this as writing for two different audiences who won't always overlap.
First there are the docstrings; these are for people who are going to be using your code without needing or wanting to know how it works. Docstrings can be turned into actual documentation. Consider the official Python documentation - What's available in each library and how to use it, no implementation details (Unless they directly relate to use)
Secondly there are in-code comments; these are to explain what is going on to people (generally you!) who want to extend the code. These will not normally be turned into documentation as they are really about the code itself rather than usage. Now there are about as many opinions on what makes for good comments (or lack thereof) as there are programmers. My personal rules of thumb for adding comments are to explain:
Parts of the code that are necessarily complex. (Optimisation comes to mind)
Workarounds for code you don't have control over, that may otherwise appear illogical
I'll admit to TODOs as well, though I try to keep that to a minimum
Where I've made a choice of a simpler algorithm where a better performing (but more complex) option can go if performance in that section later becomes critical
Since you're coding in an academic setting, and it sounds like your lecturer is going for verbose, I'd say just roll with it. Use code comments to explain how you are doing what you say you are doing in the design recipe.
I believe it's worth mentioning what PEP 8 itself says on the subject.
Docstrings
Conventions for writing good documentation strings (a.k.a. "docstrings") are immortalized in PEP 257.
Write docstrings for all public modules, functions, classes, and methods. Docstrings are not necessary for non-public methods, but you should have a comment that describes what the method does. This comment should appear after the def line.
PEP 257 describes good docstring conventions. Note that most importantly, the """ that ends a multiline docstring should be on a line by itself, e.g.:
"""Return a foobang
Optional plotz says to frobnicate the bizbaz first.
"""
For one liner docstrings, please keep the closing """ on the same line.
Comments
Block comments
Generally apply to some (or all) code that follows them, and are indented to the same level as that code. Each line of a block comment starts with a # and a single space (unless it is indented text inside the comment).
Paragraphs inside a block comment are separated by a line containing a single #.
Inline Comments
Use inline comments sparingly.
An inline comment is a comment on the same line as a statement. Inline comments should be separated by at least two spaces from the statement. They should start with a # and a single space.
Inline comments are unnecessary and in fact distracting if they state the obvious.
Don't do this:
x = x + 1  # Increment x
But sometimes, this is useful:
x = x + 1  # Compensate for border
Reference
https://www.python.org/dev/peps/pep-0008/#documentation-strings
https://www.python.org/dev/peps/pep-0008/#inline-comments
https://www.python.org/dev/peps/pep-0008/#block-comments
https://www.python.org/dev/peps/pep-0257/
First of all, for formatting your posts you can use the help options above the text area you type your post.
And about comments and doc strings, the doc string is there to explain the overall use and basic information of the methods. On the other hand comments are meant to give specific information on blocks or lines, #TODO is used to remind you what you want to do in future, definition of variables and so on. By the way, in IDLE the doc string is shown as a tool tip when you hover over the method's name.
Quoting from this page http://www.pythonforbeginners.com/basics/python-docstrings/
Python documentation strings (or docstrings) provide a convenient way
of associating documentation with Python modules, functions, classes,
and methods.
An object's docstring is defined by including a string constant as the
first statement in the object's definition.
It's specified in source code that is used, like a comment, to
document a specific segment of code.
Unlike conventional source code comments the docstring should describe
what the function does, not how.
All functions should have a docstring
This allows the program to inspect these comments at run time, for
instance as an interactive help system, or as metadata.
Docstrings can be accessed by the __doc__ attribute on objects.
Docstrings can be accessed through a program (__doc__), whereas inline comments cannot be accessed.
Interactive help systems like those in bpython and IPython can use docstrings to display documentation during development, so that you don't have to open the source every time.
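A short Python sketch makes the difference visible. The body of `contribution` below is an assumption made up for illustration (the question never shows it); what matters is that the docstring survives on the function object while the comment does not.

```python
def contribution(mark, max_mark, weight):
    """Return the weighted contribution of mark to the overall grade.

    >>> contribution(5, 10, 20)
    10.0
    """
    # Comment: explains HOW. Scale to a fraction first, then apply the
    # weight. This text is discarded and cannot be read at run time.
    return mark / max_mark * weight


# The docstring explains WHAT and is stored on the function itself,
# which is where help() and IDE tooltips read it from.
print(contribution.__doc__)
```

Calling `help(contribution)` in an interactive session shows the same docstring, but never the `#` comment.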

Regex getting the tags from an <a href= ...> </a> and the likes

I've tried the answers I found on SO, but none of them is supported on https://regexr.com
I essentially have an .OPML file with a large number of podcasts and descriptions,
in the following format:
<outline text="Software Engineering Daily" type="rss" xmlUrl="http://softwareengineeringdaily.com/feed/podcast/" htmlUrl="http://softwareengineeringdaily.com" />
What regex can I use to get just the title and the link:
Software Engineering Daily
http://softwareengineeringdaily.com/feed/podcast/
Brief
There are many ways to go about this. The best way is likely using an XML parser. I would definitely read this post that discusses use of regex, especially with XML.
As you can see there are many answers to your question. It also depends on which language you are using since regex engines differ. Some accept backreferences, whilst others do not. I'll post multiple methods below that work in different circumstances/for different regex flavours. You can probably piece together from the multiple regex methods below which parts work best for you.
Code
Method 1
This method works in almost any regex flavour (at least the normal ones).
This method only checks against the attribute value opening and closing marks of " and doesn't include the possibility for whitespace before or after the = symbol. This is the simplest solution to get the values you want.
See regex in use here
\b(text|xmlUrl)="[^"]*"
Similarly, the following methods add more value to the above expression
\b(text|xmlUrl)\s*=\s*"[^"]*" Allows whitespace around =
\b(text|xmlUrl)=(?:"[^"]*"|'[^']*') Allows for ' to be used as attribute value delimiter
As another alternative (following the comments below my answer), if you wanted to grab every attribute except specific ones, you can use the following. Note that I use \w, which should cover most attributes, but you can just replace this with whatever valid characters you want. \S can be used to specify any non-whitespace characters or a set such as [\w-] may be used to specify any word or hyphen character. The negation of the specific attributes occurs with (?!text|xmlUrl), which says don't match those characters. Also, note that the word boundary \b at the beginning ensures that we're matching the full attribute name of text and not the possibility of other attributes with the same termination such as subtext.
\b((?!text|xmlUrl)\w+)="[^"]*"
Method 2
This method only works with regex flavours that allow backreferences. Apparently JGsoft applications, Delphi, Perl, Python, Ruby, PHP, R, Boost, and Tcl support single-digit backreferences. Double-digit backreferences are supported by JGsoft applications, Delphi, Python, and Boost. This information is taken from the article about numbered backreferences on Regular-Expressions.info.
This method uses a backreference to ensure the same closing mark is used at the start and end of the attribute's value and also includes the possibility of whitespace surrounding the = symbol. This doesn't allow the possibility for attributes with no delimiter specified (using xmlUrl=http://softwareengineeringdaily.com/feed/podcast/ may also be valid).
See regex in use here
\b(text|xmlUrl)\s*=\s*(["'])(.*?)\2
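As a quick check of the Method 2 pattern, here it is in Python's `re` module, which supports backreferences. The variable names are made up for this sketch; the input line is the one from the question.

```python
import re

# Method 2: group 1 is the attribute name, group 2 the quote character,
# group 3 the value; \2 requires the same quote character to close it.
ATTR = re.compile(r"""\b(text|xmlUrl)\s*=\s*(["'])(.*?)\2""")

line = (
    '<outline text="Software Engineering Daily" type="rss" '
    'xmlUrl="http://softwareengineeringdaily.com/feed/podcast/" '
    'htmlUrl="http://softwareengineeringdaily.com" />'
)

pairs = [(m.group(1), m.group(3)) for m in ATTR.finditer(line)]
for name, value in pairs:
    print(name, value)
```

The leading `\b` is what keeps `htmlUrl` and `type` out of the results: neither contains `text` or `xmlUrl` starting at a word boundary.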
Method 3
This method is the same as Method 2 but also allows attributes with no delimiters (note that delimiters are now considered to be space characters, thus, it will only match until the next space).
See regex in use here
\b(text|xmlUrl)\s*=\s*(?:(["'])(.*?)\2|(\S*))
Method 4
While Method 3 works, some people might complain that the attribute value might end up in either of 2 groups. This can be fixed by either of the following methods.
Method 4.A
Branch reset groups are only possible in a few languages, notably JGsoft V2, PCRE 7.2+, PHP, Delphi, R (with PCRE enabled), Boost 1.42+ according to Regular-Expressions.info
This also shows the method you would use if backreferences aren't possible and you wanted to match multiple delimiters ("([^"]*)"|'([^']*)')
See regex in use here
\b(text|xmlUrl)\s*=\s*(?|"([^"]*)"|'([^']*)'|(\S*))
Method 4.B
Duplicate subpatterns are not often supported. See this Regular-Expressions.info article for more information
This method uses the J regex flag, which allows duplicate subpattern names ((?<v>) is in there twice)
See regex in use here
\b(text|xmlUrl)\s*=\s*(?:(["'])(?<v>.*?)\2|(?<v>\S*))
Results
Input
<outline text="Software Engineering Daily" type="rss" xmlUrl="http://softwareengineeringdaily.com/feed/podcast/" htmlUrl="http://softwareengineeringdaily.com" />
Output
Each line below represents a different group. New matches are separated by two lines.
text
Software Engineering Daily
xmlUrl
http://softwareengineeringdaily.com/feed/podcast/
Explanation
I'll explain the different parts of the regexes used in the Code section so that you understand the usage of each of these parts. This is more of a reference to the methods above.
"[^"]*" This is the fastest method possible (to the best of my knowledge) to grab anything between two " symbols. Note that it does not handle escaped quotes; it will match any non-" character between two ". Whilst "(.*?)" can also be used, it's slightly slower
(["'])(.*?)\2 is basically shorthand for "(.*?)"|'(.*?)'. You can use any of the following methods to get the same result:
(?:"(.*?)"|'(.*?)')
(?:"([^"]*)"|'([^']*)') <-- slightly faster than line above
(?|) This is a branch reset group. When you place groups inside it like (?|(x)|(y)) it returns the same group index for both matches. This means that if x is captured, it'll get group index of 1, and if y is captured, it'll also get a group index of 1.
For simple HTML strings you might get along with
Url=(['"])(.+?)\1
Here, take group $2, see a demo on regex101.com.
Obligatory: consider using a parser instead (see here).
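For comparison, a parser-based version is short as well. This sketch uses Python's standard `xml.etree.ElementTree`; the surrounding `<opml>` and `<body>` elements and the helper name `feeds` are assumptions, since the question only shows a single `<outline>` line.

```python
import xml.etree.ElementTree as ET

# Assumed wrapper around the one <outline> element shown in the question.
OPML = """<opml version="2.0">
  <body>
    <outline text="Software Engineering Daily" type="rss" xmlUrl="http://softwareengineeringdaily.com/feed/podcast/" htmlUrl="http://softwareengineeringdaily.com" />
  </body>
</opml>"""


def feeds(opml_text):
    """Return (title, feed URL) pairs for every outline that has an xmlUrl."""
    root = ET.fromstring(opml_text)
    return [
        (outline.get("text"), outline.get("xmlUrl"))
        for outline in root.iter("outline")
        if outline.get("xmlUrl")
    ]


for title, url in feeds(OPML):
    print(title)
    print(url)
```

The parser takes care of quoting style, whitespace around `=`, attribute order and entity escapes, which are exactly the cases the regex methods above handle one by one.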

What is your system for avoiding keyword naming clashes?

Typically languages have keywords that you are unable to use directly with the exact same spelling and case for naming things (variables,functions,classes ...) in your program. Yet sometimes a keyword is the only natural choice for naming something. What is your system for avoiding/getting around this clash in your chosen technology?
I just avoid the name, usually. Either find a different name or change it slightly - e.g. clazz instead of class in C# or Java. In C# you can use the @ prefix, but it's horrible:
int @int = 5; // Ick!
There is nothing intrinsically all-encompassing about a keyword, in that it should stop you from being able to name your variables. Since all names are just generalized instances of some type to one degree or another, you can always go up or down in the abstraction to find another useful name.
For example, if you're writing a system that tracks students and you want an object to represent their study in a specific field, i.e. they've taken a "class" in something, if you can't use the term directly, or the plural "classes", or an alternative like "studies", you might find a more "instanced" variation: studentClass, currentClass, etc. or a higher perspective: "courses", "courseClass" or a specific type attribute: dailyClass, nightClass, etc.
Lots of options, you should just prefer the simplest and most obvious one, that's all.
I always like to listen to the users talk, because the scope of their language helps define the scope of the problem. Often, if you listen long enough, you'll find they have multiple terms for the same underlying things (with only subtle differences). They usually have the answer ...
Paul.
My system is don't use keywords period!
If I have a function/variable/class and it only seems logical to name it with a keyword, I'll put a descriptive word in front of the keyword (adjectiveNoun format), e.g. personName instead of Name where "Name" is a keyword.
I just use a more descriptive name. For instance, 'id' becomes identifier, 'string' becomes 'descriptionString,' and so on.
In Python I usually use proper namespacing on my modules to avoid name clashes.
import re
re.compile()
instead of:
from re import *
compile()
Sometimes, when I can't avoid keyword name clashes I simply drop the last letter off the name of my variable.
for fil in files:
    pass
As stated before either change class to clazz in Java/C#, or use some underscore as a prefix, for example
int _int = 0;
There should be no reason to use keywords as variable names. Either use a more detailed word or use a thesaurus. Capitalizing certain letters of the word to make it not exactly like the keyword is not going to help much to someone inheriting your code later.
Happy those with a language without ANY keywords...
But joking apart, I think in the rare situations where "Yet sometimes a keyword is the only natural choice for naming something." applies, you can get along by prefixing it with "my", "do", "_" or similar.
I honestly can't really think of many such instances where the keyword alone makes a good name ("int", "for" and "if" are definitely bad anyway). The only few in the C-language family which might make sense are "continue" (make it "doContinue"), "break" (how about "breakWhenEOFIsreached" or similar ?) and the already mentioned "class" (how about "classOfThingy" ?).
In other words: make the names more reasonable.
And always remember: code is WRITTEN only once, but usually READ very often.
Typically I follow Hungarian Notation. So if, for whatever reason, I wanted to use 'End' as a variable of type integer I would declare it as 'iEnd'. A string would be 'strEnd', etc. This usually gives me some room as far as variables go.
If I'm working on a particular personal project that other people will only ever look at to see what I did, for example, when making an add-on to a game using the UnrealEngine, I might use my initials somewhere in the name. 'DS_iEnd' perhaps.
I write my own [vim] syntax highlighters for each language, and I give all keywords an obvious colour so that I notice them when I'm coding. Languages like PHP and Perl use $ for variables, making it a non-issue.
Developing in Ruby on Rails I sometime look up this list of reserved words.
In 15 years of programming, I've rarely had this problem.
One place I can immediately think of, is perhaps a css class, and in that case, I'd use a more descriptive name. So instead of 'class', I might use 'targetClass' or something similar.
In Python the generally accepted method is to append an underscore ('_'):
class -> class_
or -> or_
and -> and_
you can see this exemplified in the operator module.
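A minimal Python sketch of the trailing-underscore convention follows. `keyword.iskeyword` and the `operator` functions are standard library; the helper name `safe_name` is made up for this example.

```python
import keyword
import operator


def safe_name(name):
    """Append an underscore when name clashes with a Python keyword."""
    return name + "_" if keyword.iskeyword(name) else name


print(safe_name("class"))   # class_
print(safe_name("files"))   # files

# The operator module uses the same convention for `and` and `or`.
print(operator.and_(6, 3))  # 2 (bitwise and)
print(operator.or_(6, 3))   # 7 (bitwise or)
```

The suffix keeps the natural word readable and sorts next to it in completion lists, which is part of why it is preferred over misspellings like clazz in Python code.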
I switched to a language which doesn't restrict identifier names at all.
First of all, most code conventions prevent such a thing from happening.
If not, I usually add a descriptive prose prefix or suffix:
the_class or theClass infix_or (prefix_or(class_param, in_class) , a_class) or_postfix
A practice, that is usually in keeping with every code style advice you can find ("long names don't kill", "Longer variable names don't take up more space in memory, I promise.")
Generally, if you think the keyword is the best description, a slightly worse one would be better.
Note that, by the very premise of your question you introduce ambiguity, which is bad for the reader, be it a compiler or human. Even if it is a custom to use class, clazz or klass and even if that custom is not so custom that it is a custom: it takes a word word, precisely descriptive as word may be, and distorts it, effectively shooting w0rd's precision in the "wrd". Somebody used to another w_Rd convention or language might have a few harsh wordz for your wolds.
Most of us have more to say about things than "Flower", "House" or "Car", so there's usually more to say about typeNames, decoratees, class_params, BaseClasses and typeReferences.
This is where my personal code obfuscation tolerance ends:
Never(!!!) rely on scoping or arcane syntax rules to prevent name clashes with "key words". (Don't know any compiler that would allow that, but, these days, you never know...).
Try that and someone will w**d you in the wörd so __rd, Word will look like TeX to you!
My system in Java is to capitalize the second letter of the word, so for example:
int dEfault;
boolean tRansient;
Class cLass;

Why shouldn't I use "Hungarian Notation"?

Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
I know what Hungarian refers to - giving information about a variable, parameter, or type as a prefix to its name. Everyone seems to be rabidly against it, even though in some cases it seems to be a good idea. If I feel that useful information is being imparted, why shouldn't I put it right there where it's available?
See also: Do people use the Hungarian naming conventions in the real world?
vUsing adjHungarian nnotation vmakes nreading ncode adjdifficult.
Most people use Hungarian notation in a wrong way and are getting wrong results.
Read this excellent article by Joel Spolsky: Making Wrong Code Look Wrong.
In short, Hungarian Notation where you prefix your variable names with their type (e.g. string), so-called Systems Hungarian, is bad because it's useless.
Hungarian Notation as it was intended by its author, where you prefix the variable name with its kind (using Joel's example: safe string or unsafe string), so-called Apps Hungarian, has its uses and is still valuable.
Joel is wrong, and here is why.
That "application" information he's talking about should be encoded in the type system. You should not depend on flipping variable names to make sure you don't pass unsafe data to functions requiring safe data. You should make it a type error, so that it is impossible to do so. Any unsafe data should have a type that is marked unsafe, so that it simply cannot be passed to a safe function. To convert from unsafe to safe should require processing with some kind of a sanitize function.
A lot of the things that Joel talks of as "kinds" are not kinds; they are, in fact, types.
What most languages lack, however, is a type system that's expressive enough to enforce these kind of distinctions. For example, if C had a kind of "strong typedef" (where the typedef name had all the operations of the base type, but was not convertible to it) then a lot of these problems would go away. For example, if you could say, strong typedef std::string unsafe_string; to introduce a new type unsafe_string that could not be converted to a std::string (and so could participate in overload resolution etc. etc.) then we would not need silly prefixes.
So, the central claim that Hungarian is for things that are not types is wrong. It's being used for type information. Richer type information than the traditional C type information, certainly; it's type information that encodes some kind of semantic detail to indicate the purpose of the objects. But it's still type information, and the proper solution has always been to encode it into the type system. Encoding it into the type system is far and away the best way to obtain proper validation and enforcement of the rules. Variable names simply do not cut the mustard.
In other words, the aim should not be "make wrong code look wrong to the developer". It should be "make wrong code look wrong to the compiler".
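For what it's worth, here is a minimal sketch of that idea in Python, using typing.NewType as a rough stand-in for the "strong typedef" wished for above. The names UnsafeStr, SafeStr, sanitize and render are made up for illustration. The check only happens if you run a static checker such as mypy; at runtime both types are plain str.

```python
import html
from typing import NewType

# Two distinct static types that are both ordinary strings at runtime.
UnsafeStr = NewType("UnsafeStr", str)
SafeStr = NewType("SafeStr", str)


def sanitize(raw: UnsafeStr) -> SafeStr:
    """The only sanctioned way to turn unsafe text into safe text."""
    return SafeStr(html.escape(raw))


def render(fragment: SafeStr) -> str:
    """Accepts only text that has been through sanitize()."""
    return "<p>" + fragment + "</p>"


user_input = UnsafeStr("<script>alert(1)</script>")
page = render(sanitize(user_input))

# render(user_input) would run, but a static type checker is expected
# to reject it, because UnsafeStr is not SafeStr.
```

No prefix is needed on user_input; the checker, not the reader, is the one tracking which kind of string it is.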
I think it massively clutters up the source code.
It also doesn't gain you much in a strongly typed language. If you do any form of type mismatch tomfoolery, the compiler will tell you about it.
Hungarian notation only makes sense in languages without user-defined types. In a modern functional or OO-language, you would encode information about the "kind" of value into the datatype or class rather than into the variable name.
Several answers reference Joel's article. Note however that his example is in VBScript, which didn't support user-defined classes (for a long time at least). In a language with user-defined types you would solve the same problem by creating an HtmlEncodedString type and then letting the Write method accept only that. In a statically typed language, the compiler will catch any encoding errors; in a dynamically typed one you would get a runtime exception - but in any case you are protected against writing unencoded strings. Hungarian notation just turns the programmer into a human type-checker, which is the kind of job that is typically better handled by software.
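A rough sketch of the dynamically typed variant, in Python. HtmlEncodedString is the type named above; the Response class and its write method are hypothetical stand-ins for whatever output API you actually have.

```python
import html


class HtmlEncodedString:
    """Text that is HTML-encoded on construction."""

    def __init__(self, raw_text: str) -> None:
        self._encoded = html.escape(raw_text)

    def __str__(self) -> str:
        return self._encoded


class Response:
    """Hypothetical output sink that refuses unencoded strings."""

    def __init__(self) -> None:
        self.body = []

    def write(self, fragment: HtmlEncodedString) -> None:
        # Runtime check: a plain str never reaches the output.
        if not isinstance(fragment, HtmlEncodedString):
            raise TypeError("write() only accepts HtmlEncodedString")
        self.body.append(str(fragment))


response = Response()
response.write(HtmlEncodedString("<b>Tom & Jerry</b>"))

try:
    response.write("<b>raw user input</b>")
    rejected = False
except TypeError:
    rejected = True
```

The raw string is rejected with an exception regardless of what the variable holding it happens to be called.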
Joel distinguishes between "systems hungarian" and "apps hungarian", where "systems hungarian" encodes the built-in types like int, float and so on, and "apps hungarian" encodes "kinds", which is higher-level meta-info about the variable beyond the machine type. In an OO or modern functional language you can create user-defined types, so there is no distinction between type and "kind" in this sense - both can be represented by the type system - and "apps" hungarian is just as redundant as "systems" hungarian.
So to answer your question: Systems hungarian would only be useful in an unsafe, weakly typed language where e.g. assigning a float value to an int variable will crash the system. Hungarian notation was specifically invented in the sixties for use in BCPL, a pretty low-level language which didn't do any type checking at all. I don't think any language in general use today has this problem, but the notation lived on as a kind of cargo cult programming.
Apps hungarian will make sense if you are working with a language without user-defined types, like legacy VBScript or early versions of VB. Perhaps also early versions of Perl and PHP. Again, using it in a modern language is pure cargo cult.
In any other language, hungarian is just ugly, redundant and fragile. It repeats information already known from the type system, and you should not repeat yourself. Use a descriptive name for the variable that describes the intent of this specific instance of the type. Use the type system to encode invariants and meta info about "kinds" or "classes" of variables - ie. types.
The general point of Joel's article - to have wrong code look wrong - is a very good principle. However, an even better protection against bugs is to - when at all possible - have wrong code detected automatically by the compiler.
I always use Hungarian notation for all my projects. I find it really helpful when I'm dealing with 100s of different identifier names.
For example, when I call a function requiring a string I can type 's' and hit control-space and my IDE will show me exactly the variable names prefixed with 's' .
Another advantage, when I prefix u for unsigned and i for signed ints, I immediately see where I am mixing signed and unsigned in potentially dangerous ways.
I cannot remember the number of times when in a huge 75000 line codebase, bugs were caused (by me and others too) due to naming local variables the same as existing member variables of that class. Since then, I always prefix members with 'm_'
It's a question of taste and experience. Don't knock it until you've tried it.
You're forgetting the number one reason to include this information. It has nothing to do with you, the programmer. It has everything to do with the person coming down the road 2 or 3 years after you leave the company who has to read that stuff.
Yes, an IDE will quickly identify types for you. However, when you're reading through some long batches of 'business rules' code, it's nice to not have to pause on each variable to find out what type it is. When I see things like strUserID, intProduct or guiProductID, it makes for much easier 'ramp up' time.
I agree that MS went way too far with some of their naming conventions - I categorize that in the "too much of a good thing" pile.
Naming conventions are good things, provided you stick to them. I've gone through enough old code that had me constantly going back to look at the definitions for so many similarly-named variables that I push "camel casing" (as it was called at a previous job). Right now I'm on a job that has many thousands of lines of completely uncommented classic ASP code with VBScript and it's a nightmare trying to figure things out.
Tacking on cryptic characters at the beginning of each variable name is unnecessary and shows that the variable name by itself isn't descriptive enough. Most languages require the variable type at declaration anyway, so that information is already available.
There's also the situation where, during maintenance, a variable type needs to change. Example: if a variable declared as "uint_16 u16foo" needs to become a 64-bit unsigned, one of two things will happen:
You'll go through and change each variable name (making sure not to hose any unrelated variables with the same name), or
Just change the type and not change the name, which will only cause confusion.
Joel Spolsky wrote a good blog post about this.
http://www.joelonsoftware.com/articles/Wrong.html
Basically it comes down to not making your code harder to read when a decent IDE will tell you what type the variable is if you can't remember. Also, if you make your code compartmentalized enough, you don't have to remember what a variable was declared as three pages up.
Isn't scope more important than type these days, e.g.
* l for local
* a for argument
* m for member
* g for global
* etc
With modern techniques for refactoring old code, searching and replacing a symbol because you changed its type is tedious. The compiler will catch type changes, but often will not catch incorrect use of scope; sensible naming conventions help here.
There is no reason why you should not make correct use of Hungarian notation. Its unpopularity is due to a long-running backlash against the misuse of Hungarian notation, especially in the Windows APIs.
In the bad old days, before anything resembling an IDE existed for DOS (odds are you didn't have enough free memory to run the compiler under Windows, so your development was done in DOS), you didn't get any help from hovering your mouse over a variable name. (Assuming you had a mouse.) What you did have to deal with were event callback functions in which everything was passed to you as either a 16-bit int (WORD) or 32-bit int (LONG WORD). You then had to cast those parameters to the appropriate types for the given event type. In effect, much of the API was virtually type-less.
The result, an API with parameter names like these:
LRESULT CALLBACK WindowProc(HWND hwnd,
UINT uMsg,
WPARAM wParam,
LPARAM lParam);
Note that the names wParam and lParam, although pretty awful, aren't really any worse than naming them param1 and param2.
To make matters worse, Windows 3.0/3.1 had two types of pointers, near and far. So, for example, the return value from memory management function LocalLock was a PVOID, but the return value from GlobalLock was an LPVOID (with the 'L' for long). That awful notation then got extended so that a long pointer string was prefixed lp, to distinguish it from a string that had simply been malloc'd.
It's no surprise that there was a backlash against this sort of thing.
Hungarian Notation can be useful in languages without compile-time type checking, as it would allow a developer to quickly remind herself of how the particular variable is used. It does nothing for performance or behavior. It is supposed to improve code readability and is mostly a matter of taste and coding style. For this very reason it is criticized by many developers -- not everybody has the same wiring in the brain.
For the compile-time type-checking languages it is mostly useless -- scrolling up a few lines should reveal the declaration and thus the type. If you have global variables or your code block spans much more than one screen, you have grave design and reusability issues. Thus one of the criticisms is that Hungarian Notation allows developers to have bad design and easily get away with it. This is probably one of the reasons for the hatred.
On the other hand, there can be cases where even compile-time type-checking languages would benefit from Hungarian Notation -- void pointers or HANDLEs in the win32 API. These obfuscate the actual data type, and there might be merit in using Hungarian Notation there. Yet, if one can know the type of data at build time, why not use the appropriate data type?
In general, there are no hard reasons not to use Hungarian Notation. It is a matter of likes, policies, and coding style.
As a Python programmer, I find that Hungarian Notation falls apart pretty fast. In Python, I don't care if something is a string - I care if it can act like a string (i.e. if it has a __str__() method which returns a string).
For example, let's say we have foo as an integer, 12
foo = 12
Hungarian notation tells us that we should call that iFoo or something, to denote it's an integer, so that later on, we know what it is. Except in Python, that doesn't work, or rather, it doesn't make sense. In Python, I decide what type I want when I use it. Do I want a string? well if I do something like this:
print("The current value of foo is %s" % foo)
Note the %s - string. foo isn't a string, but the % operator will call foo.__str__() and use the result (assuming it exists). foo is still an integer, but we treat it as a string if we want a string. If we want a float, then we treat it as a float. In dynamically typed languages like Python, Hungarian Notation is pointless, because it doesn't matter what type something is until you use it, and if you need a specific type, then just make sure to cast it to that type (e.g. float(foo)) when you use it.
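To make the "acts like a string" point concrete, here is a small sketch. The Celsius class is made up for the example; the only thing %s needs from it is a __str__() method.

```python
class Celsius:
    """Hypothetical value type; it only needs __str__ to work with %s."""

    def __init__(self, degrees: float) -> None:
        self.degrees = degrees

    def __str__(self) -> str:
        return "%.1f C" % self.degrees


foo = 12
message_from_int = "The current value of foo is %s" % foo
message_from_obj = "The current value of foo is %s" % Celsius(21.5)

# foo itself is unchanged; convert only at the point of use.
foo_as_float = float(foo)
```

Neither value would gain anything from an i or obj prefix: the formatting code never asks what type it was handed.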
Note that dynamic languages like PHP don't have this benefit - PHP tries to do 'the right thing' in the background based on an obscure set of rules that almost no one has memorized, which often results in catastrophic messes unexpectedly. In this case, some sort of naming mechanism, like $files_count or $file_name, can be handy.
In my view, Hungarian Notation is like leeches. Maybe in the past they were useful, or at least they seemed useful, but nowadays it's just a lot of extra typing for not a lot of benefit.
The IDE should impart that useful information. Hungarian might have made some sort (not a whole lot, but some sort) of sense when IDE's were much less advanced.
Apps Hungarian is Greek to me--in a good way
As an engineer, not a programmer, I immediately took to Joel's article on the merits of Apps Hungarian: "Making Wrong Code Look Wrong". I like Apps Hungarian because it mimics how engineering, science, and mathematics represent equations and formulas using sub- and super-scripted symbols (like Greek letters, mathematical operators, etc.). Take a particular example of Newton's Law of Universal Gravity: first in standard mathematical notation, and then in Apps Hungarian pseudo-code:
F_EarthMars = G * m_Earth * m_Mars / |r_Earth - r_Mars|^2
frcGravityEarthMars = G * massEarth * massMars / norm(posEarth - posMars)^2
In the mathematical notation, the most prominent symbols are those representing the kind of information stored in the variable: force, mass, position vector, etc. The subscripts play second fiddle to clarify: position of what? This is exactly what Apps Hungarian is doing; it's telling you the kind of thing stored in the variable first and then getting into specifics--about the closest code can get to mathematical notation.
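As a runnable illustration of that naming style, here is a small Python sketch of the same formula. The kind prefixes (frc, mass, pos) come from the pseudo-code; the norm helper and the Earth-Mars separation are assumptions chosen only to make the example self-contained, not real ephemeris data.

```python
import math

G = 6.674e-11  # gravitational constant, N m^2 / kg^2


def norm(vec):
    """Euclidean length of a 3-vector given as a tuple."""
    return math.sqrt(sum(component * component for component in vec))


massEarth = 5.972e24  # kg
massMars = 6.417e23   # kg
posEarth = (0.0, 0.0, 0.0)     # m, illustrative origin
posMars = (2.25e11, 0.0, 0.0)  # m, illustrative separation only

# pos - pos is still a pos-like vector; its norm is a plain distance.
posDelta = tuple(e - m for e, m in zip(posEarth, posMars))
frcGravityEarthMars = G * massEarth * massMars / norm(posDelta) ** 2
```

Reading the last line, the prefixes let you check the kinds by eye: mass times mass over distance squared, scaled by G, gives a force.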
Clearly strong typing can resolve the safe vs. unsafe string example from Joel's essay, but you wouldn't define separate types for position and velocity vectors; both are double arrays of size three and anything you're likely to do to one might apply to the other. Furthermore, it makes perfect sense to concatenate position and velocity (to make a state vector) or take their dot product, but probably not to add them. How would typing allow the first two and prohibit the second, and how would such a system extend to every possible operation you might want to protect? Unless you were willing to encode all of math and physics in your typing system.
On top of all that, lots of engineering is done in weakly typed high-level languages like Matlab, or old ones like Fortran 77 or Ada.
So if you have a fancy language and IDE and Apps Hungarian doesn't help you then forget it--lots of folks apparently have. But for me, a worse-than-novice programmer who is working in weakly or dynamically typed languages, I can write better code faster with Apps Hungarian than without.
It's incredibly redundant and useless in most modern IDEs, where they do a good job of making the type apparent.
Plus -- to me -- it's just annoying to see intI, strUserName, etc. :)
If I feel that useful information is being imparted, why shouldn't I put it right there where it's available?
Then who cares what anybody else thinks? If you find it useful, then use the notation.
In my experience, it is bad because:
1 - then you break all the code if you need to change the type of a variable (e.g. if you need to extend a 32-bit integer to a 64-bit integer);
2 - this is useless information as the type is either already in the declaration or you use a dynamic language where the actual type should not be so important in the first place.
Moreover, with a language accepting generic programming (i.e. functions where the type of some variables is not determined when you write the function) or with a dynamic typing system (i.e. when the type is not even determined at compile time), how would you name your variables? And most modern languages support one or the other, even if in a restricted form.
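A tiny sketch of the generic case in Python. The function first and the type variable T are made up for the example; the point is only that no single type prefix fits the parameter.

```python
from typing import Sequence, TypeVar

T = TypeVar("T")


def first(items: Sequence[T]) -> T:
    """Return the first element, whatever its type turns out to be."""
    return items[0]


first_number = first([3, 1, 2])
first_letter = first("abc")
first_pair = first([("x", 1), ("y", 2)])
```

Inside first, the element could be an int, a str or a tuple depending on the caller, so a prefix like i or s on the result would be wrong for most call sites.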
In Joel Spolsky's Making Wrong Code Look Wrong he explains that what everybody thinks of as Hungarian Notation (which he calls Systems Hungarian) is not what it was really intended to be (what he calls Apps Hungarian). Scroll down to the I’m Hungary heading to see this discussion.
Basically, Systems Hungarian is worthless. It just tells you the same thing your compiler and/or IDE will tell you.
Apps Hungarian tells you what the variable is supposed to mean, and can actually be useful.
I've always thought that a prefix or two in the right place wouldn't hurt. I think if I can impart something useful, like "Hey this is an interface, don't count on specific behaviour" right there, as in IEnumerable, I oughtta do it. Comments can clutter things up much more than just a one or two character symbol.
It's a useful convention for naming controls on a form (btnOK, txtLastName etc.), if the list of controls shows up in an alphabetized pull-down list in your IDE.
I tend to use Hungarian Notation with ASP.NET server controls only, otherwise I find it too hard to work out what controls are what on the form.
Take this code snippet:
<asp:Label ID="lblFirstName" runat="server" Text="First Name" />
<asp:TextBox ID="txtFirstName" runat="server" />
<asp:RequiredFieldValidator ID="rfvFirstName" runat="server" ... />
If someone can show a better way of having that set of control names without Hungarian I'd be tempted to move to it.
Joel's article is great, but it seems to omit one major point:
Hungarian makes a particular 'idea' (kind + identifier name) unique,
or near-unique, across the codebase - even a very large codebase.
That's huge for code maintenance.
It means you can use good ol' single-line text search
(grep, findstr, 'find in all files') to find EVERY mention of that 'idea'.
Why is that important when we have IDE's that know how to read code?
Because they're not very good at it yet. This is hard to see in a small codebase,
but obvious in a large one - when the 'idea' might be mentioned in comments,
XML files, Perl scripts, and also in places outside source control (documents, wikis,
bug databases).
You do have to be a little careful even here - e.g. token-pasting in C/C++ macros
can hide mentions of the identifier. Such cases can be dealt with using
coding conventions, and anyway they tend to affect only a minority of the identifiers in the
codebase.
P.S. To the point about using the type system vs. Hungarian - it's best to use both.
You only need wrong code to look wrong if the compiler won't catch it for you. There are plenty of cases where it is infeasible to make the compiler catch it. But where it's feasible - yes, please do that instead!
When considering feasibility, though, do consider the negative effects of splitting up types. e.g. in C#, wrapping 'int' with a non-built-in type has huge consequences. So it makes sense in some situations, but not in all of them.
Debunking the benefits of Hungarian Notation
It provides a way of distinguishing variables.
If the type is all that distinguishes the one value from another, then it can only be for the conversion of one type to another. If you have the same value that is being converted between types, chances are you should be doing this in a function dedicated to conversion. (I have seen hungarianed VB6 leftovers use strings on all of their method parameters simply because they could not figure out how to deserialize a JSON object, or properly comprehend how to declare or use nullable types.) If you have two variables distinguished only by the Hungarian prefix, and they are not a conversion from one to the other, then you need to elaborate on your intention with them.
It makes the code more readable.
I have found that Hungarian notation makes people lazy with their variable names. They have something to distinguish it by, and they feel no need to elaborate to its purpose. This is what you will typically find in Hungarian notated code vs. modern: sSQL vs. groupSelectSql (or usually no sSQL at all because they are supposed to be using the ORM that was put in by earlier developers.), sValue vs. formCollectionValue (or usually no sValue either, because they happen to be in MVC and should be using its model binding features), sType vs. publishSource, etc.
It can't be readability. I see more sTemp1, sTemp2... sTempN from any given hungarianed VB6 leftover than everybody else combined.
It prevents errors.
This would be by virtue of number 2, which is false.
In the words of the master:
http://www.joelonsoftware.com/articles/Wrong.html
An interesting reading, as usual.
Extracts:
"Somebody, somewhere, read Simonyi’s paper, where he used the word “type,” and thought he meant type, like class, like in a type system, like the type checking that the compiler does. He did not. He explained very carefully exactly what he meant by the word “type,” but it didn’t help. The damage was done."
"But there’s still a tremendous amount of value to Apps Hungarian, in that it increases collocation in code, which makes the code easier to read, write, debug, and maintain, and, most importantly, it makes wrong code look wrong."
Make sure you have some time before reading Joel On Software. :)
Several reasons:
Any modern IDE will give you the variable type by simply hovering your mouse over the variable.
Most type names are way too long (think HttpClientRequestProvider) to be reasonably used as a prefix.
The type information does not carry the right information, it is just paraphrasing the variable declaration, instead of outlining the purpose of the variable (think myInteger vs. pageSize).
I don't think everyone is rabidly against it. In languages without static types, it's pretty useful. I definitely prefer it when it's used to give information that is not already in the type. Like in C, char * szName says that the variable will refer to a null terminated string -- that's not implicit in char* -- of course, a typedef would also help.
Joel had a great article on using hungarian to tell if a variable was HTML encoded or not:
http://www.joelonsoftware.com/articles/Wrong.html
Anyway, I tend to dislike Hungarian when it's used to impart information I already know.
Of course when 99% of programmers agree on something, there is something wrong. The reason they agree here is because most of them have never used Hungarian notation correctly.
For a detailed argument, I refer you to a blog post I have made on the subject.
http://codingthriller.blogspot.com/2007/11/rediscovering-hungarian-notation.html
I started coding pretty much about the time Hungarian notation was invented and the first time I was forced to use it on a project I hated it.
After a while I realised that when it was done properly it did actually help and these days I love it.
But like all things good, it has to be learnt and understood and to do it properly takes time.
The Hungarian notation was abused, particularly by Microsoft, leading to prefixes longer than the variable name, and showing it is quite rigid, particularly when you change the types (the infamous lparam/wparam, of different type/size in Win16, identical in Win32).
Thus, both due to this abuse, and its use by M$, it was put down as useless.
At my work, we code in Java, but the founder comes from the MFC world, so we use a similar code style (aligned braces, which I like!; capitalized method names, which I am used to; prefixes like m_ for class members (fields), s_ for static members, etc.).
They also said all variables should have a prefix showing their type (e.g. a BufferedReader is named brData). This proved to be a bad idea, as the types can change without the names following, or coders are not consistent in the use of these prefixes (I even see aBuffer, theProxy, etc.!).
Personally, I chose a few prefixes that I find useful, the most important being b to prefix boolean variables, as they are the only ones where I allow syntax like if (bVar) (no use of autocast of some values to true or false).
When I coded in C, I used a prefix for variables allocated with malloc, as a reminder it should be freed later. Etc.
So, basically, I don't reject this notation as a whole, but took what seems fitting for my needs.
And of course, when contributing to some project (work, open source), I just use the conventions in place!

How to name variables

What rules do you use to name your variables?
Where are single letter vars allowed?
How much info do you put in the name?
How about for example code?
What are your preferred meaningless variable names? (after foo & bar)
Why are they spelled "foo" and "bar" rather than FUBAR?
function startEditing(){
    if (user.canEdit(currentDocument)){
        editorControl.setEditMode(true);
        setButtonDown(btnStartEditing);
    }
}
Should read like a narrative work.
One rule I always follow is this: if a variable encodes a value that is in some particular units, then those units have to be part of the variable name. Example:
int postalCodeDistanceMiles;
decimal reactorCoreTemperatureKelvin;
decimal altitudeMsl;
int userExperienceWongBakerPainScale;
I will NOT be responsible for crashing any Mars landers (or the equivalent failure in my boring CRUD business applications).
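A short sketch of the same rule in Python (snake_case instead of camelCase, since that is the Python convention). The helper functions are made up for the example; the idea is that every conversion is visible in the names on both sides of the assignment.

```python
KM_PER_MILE = 1.609344  # exact, by definition of the international mile


def miles_to_km(distance_miles: float) -> float:
    """Unit in, unit out: both are spelled out in the names."""
    return distance_miles * KM_PER_MILE


def kelvin_to_celsius(temperature_kelvin: float) -> float:
    return temperature_kelvin - 273.15


postal_code_distance_miles = 10
postal_code_distance_km = miles_to_km(postal_code_distance_miles)

reactor_core_temperature_kelvin = 600.0
reactor_core_temperature_celsius = kelvin_to_celsius(reactor_core_temperature_kelvin)
```

An assignment like postal_code_distance_km = postal_code_distance_miles looks wrong at a glance, which is the whole point.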
Well, it all depends on the language you are developing in. As I am currently using C#, I tend to use the following.
camelCase for variables.
camelCase for parameters.
PascalCase for properties.
m_PascalCase for member variables.
Where are single letter vars allowed?
I tend to do this in for loops but feel a bit guilty whenever I do so. But with foreach and lambda expressions, for loops are not really that common now.
How much info do you put in the name?
If the code is a bit difficult to understand, write a comment. Don't turn a variable name into a comment, i.e.
int theTotalAccountValueIsStoredHere
is not required.
what are your preferred meaningless variable names? (after foo & bar)
i or x. foo and bar are a bit too university text book example for me.
why are they spelled "foo" and "bar" rather than FUBAR?
Tradition
These are all C# conventions.
Variable-name casing
Case indicates scope. Pascal-cased variables are fields of the owning class. Camel-cased variables are local to the current method.
I have only one prefix-character convention. Backing fields for class properties are Pascal-cased and prefixed with an underscore:
private int _Foo;
public int Foo { get { return _Foo; } set { _Foo = value; } }
There's some C# variable-naming convention I've seen out there - I'm pretty sure it was a Microsoft document - that inveighs against using an underscore prefix. That seems crazy to me. If I look in my code and see something like
_Foo = GetResult();
the very first thing that I ask myself is, "Did I have a good reason not to use a property accessor to update that field?" The answer is often "Yes, and you'd better know what that is before you start monkeying around with this code."
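The same signal exists in Python, where a leading underscore conventionally marks the backing attribute of a property. The Thermostat class below is a made-up example, not the C# code discussed above; it just shows why a direct write to the underscored name deserves a second look.

```python
class Thermostat:
    """Hypothetical class with a validated property and a backing field."""

    def __init__(self) -> None:
        self._target = 20  # backing field for the target property

    @property
    def target(self) -> int:
        return self._target

    @target.setter
    def target(self, value: int) -> None:
        # The accessor enforces an invariant the raw field does not.
        if not 5 <= value <= 30:
            raise ValueError("target out of range")
        self._target = value


thermostat = Thermostat()
thermostat.target = 22   # goes through the accessor and its validation
thermostat._target = 99  # bypasses it; the underscore flags this line
```

Whether the bypass is a bug or a deliberate choice, the prefix is what makes you stop and ask.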
Single-letter (and short) variable names
While I tend to agree with the dictum that variable names should be meaningful, in practice there are lots of circumstances under which making their names meaningful adds nothing to the code's readability or maintainability.
Loop iterators and array indices are the obvious places to use short and arbitrary variable names. Less obvious, but no less appropriate in my book, are nonce usages, e.g.:
XmlWriterSettings xws = new XmlWriterSettings();
xws.Indent = true;
XmlWriter xw = XmlWriter.Create(outputStream, xws);
That's from C# 2.0 code; if I wrote it today, of course, I wouldn't need the nonce variable:
XmlWriter xw = XmlWriter.Create(
outputStream,
    new XmlWriterSettings() { Indent = true });
But there are still plenty of places in C# code where I have to create an object that I'm just going to pass elsewhere and then throw away.
A lot of developers would use a name like xwsTemp in those circumstances. I find that the Temp suffix is redundant. The fact that I named the variable xws in its declaration (and I'm only using it within visual range of that declaration; that's important) tells me that it's a temporary variable.
Another place I'll use short variable names is in a method that's making heavy use of a single object. Here's a piece of production code:
internal void WriteXml(XmlWriter xw)
{
    if (!Active)
    {
        return;
    }
    xw.WriteStartElement(Row.Table.TableName);
    xw.WriteAttributeString("ID", Row["ID"].ToString());
    xw.WriteAttributeString("RowState", Row.RowState.ToString());
    for (int i = 0; i < ColumnManagers.Length; i++)
    {
        ColumnManagers[i].Value = Row.ItemArray[i];
        xw.WriteElementString(ColumnManagers[i].ColumnName, ColumnManagers[i].ToXmlString());
    }
    ...
There's no way in the world that code would be easier to read (or safer to modify) if I gave the XmlWriter a longer name.
Oh, how do I know that xw isn't a temporary variable? Because I can't see its declaration. I only use temporary variables within 4 or 5 lines of their declaration. If I'm going to need one for more code than that, I either give it a meaningful name or refactor the code using it into a method that - hey, what a coincidence - takes the short variable as an argument.
How much info do you put in the name?
Enough.
That turns out to be something of a black art. There's plenty of information I don't have to put into the name. I know when a variable's the backing field of a property accessor, or temporary, or an argument to the current method, because my naming conventions tell me that. So my names don't.
Here's why it's not that important.
In practice, I don't need to spend much energy figuring out variable names. I put all of that cognitive effort into naming types, properties and methods. This is a much bigger deal than naming variables, because these names are very often public in scope (or at least visible throughout the namespace). Names within a namespace need to convey meaning the same way.
There's only one variable in this block of code:
RowManager r = (RowManager)sender;
// if the settings allow adding a new row, add one if the context row
// is the last sibling, and it is now active.
if (Settings.AllowAdds && r.IsLastSibling && r.Active)
{
    r.ParentRowManager.AddNewChildRow(r.RecordTypeRow, false);
}
The property names almost make the comment redundant. (Almost. There's actually a reason why the property is called AllowAdds and not AllowAddingNewRows that a lot of thought went into, but it doesn't apply to this particular piece of code, which is why there's a comment.) The variable name? Who cares?
Pretty much every modern language that has wide use has its own coding standards. These are a great starting point. If all else fails, just use whatever is recommended. There are exceptions of course, but these are general guidelines. If your team prefers certain variations, as long as you agree with them, then that's fine as well.
But at the end of the day it's not necessarily what standards you use, but the fact that you have them in the first place and that they are adhered to.
I only use single character variables for loop control or very short functions.
for (int i = 0; i < endPoint; i++) {...}
int max(int a, int b) {
    if (a > b)
        return a;
    return b;
}
The amount of information depends on the scope of the variable: the more places it could be used, the more information I want the name to carry to keep track of its purpose.
When I write example code, I try to use variable names as I would in real code (although functions might get useless names like foo or bar).
See Etymology of "Foo"
What rules do you use to name your variables?
Typically, as I am a C# developer, I follow the variable naming conventions as specified by the IDesign C# Coding Standard for two reasons
1) I like it, and find it easy to read.
2) It is the default that comes with the Code Style Enforcer AddIn for Visual Studio 2005 / 2008 which I use extensively these days.
Where are single letter vars allowed?
There are a few places where I will allow single letter variables. Usually these are simple loop indexers, OR mathematical concepts like X,Y,Z coordinates. Other than that, never! (Everywhere else I have used them, I have typically been bitten by them when rereading the code).
How much info do you put in the name?
Enough to know PRECISELY what the variable is being used for. As Robert Martin says:
The name of a variable, function, or
class, should answer all the big
questions. It should tell you why it
exists, what it does, and how it is
used. If a name requires a comment,
then the name does not reveal its
intent.
From Clean Code - A Handbook of Agile Software Craftsmanship
I never use meaningless variable names like foo or bar, unless, of course, the code is truly throw-away.
For loop variables, I double up the letter so that it's easier to search for the variable within the file. For example,
for (int ii=0; ii < array.length; ii++)
{
    int element = array[ii];
    printf("%d", element);
}
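To see why the doubled letter helps with searching, here is a quick Python check run against the loop above as plain text. It is only a sketch of what a text search would hit; the variable names are made up.

```python
import re

# The loop from above, flattened to one line of source text.
source = "for (int ii=0; ii < array.length; ii++) { int element = array[ii]; }"

# Every position where a plain-text search for each pattern would stop.
hits_for_ii = [match.start() for match in re.finditer("ii", source)]
hits_for_i = [match.start() for match in re.finditer("i", source)]

# Searching for "ii" lands only on the loop variable, while searching
# for "i" also stops inside unrelated words such as "int".
```

In this snippet the "ii" search finds exactly the four uses of the loop variable, and the single-letter search finds more than that.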
What rules do you use to name your variables? I've switched between underscore between words (load_vars), camel casing (loadVars) and no spaces (loadvars). Classes are always CamelCase, capitalized.
Where are single letter vars allowed? Loops, mostly. Temporary vars in throwaway code.
How much info do you put in the name? Enough to remind me what it is while I'm coding. (Yes this can lead to problems later!)
what are your preferred meaningless variable names? (after foo & bar) temp, res, r. I actually don't use foo and bar much.
What rules do you use to name your variables?
I need to be able to understand it in a year's time. Should also conform with preexisting style.
Where are single letter vars allowed?
ultra-obvious things. E.g. char c; c = getc(); Loop indices (i, j, k).
How much info do you put in the name?
Plenty and lots.
how about for example code?
Same as above.
what are your preferred meaningless variable names? (after foo & bar)
I don't like having meaningless variable names. If a variable doesn't mean anything, why is it in my code?
why are they spelled "foo" and "bar" rather than FUBAR
Tradition.
The rules I adhere to are:
Does the name fully and accurately describe what the variable represents?
Does the name refer to the real-world problem rather than the programming language solution?
Is the name long enough that you don't have to puzzle it out?
Are computed value qualifiers, if any, at the end of the name?
Are they instantiated only at the point where they are first required?
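The question about computed-value qualifiers may be easier to see in code. A minimal Python sketch, assuming some made-up revenue figures; all names and numbers here are hypothetical, chosen only to show the qualifier (total, average, max) placed at the end of the name:

```python
# Hypothetical sales figures, invented for illustration only.
monthly_revenue = [1200, 950, 1430]

# The qualifier goes last, so the thing being measured (revenue)
# leads the name and the related variables read as a family.
revenue_total = sum(monthly_revenue)
revenue_average = revenue_total / len(monthly_revenue)
revenue_max = max(monthly_revenue)

print(revenue_total, revenue_average, revenue_max)
```

Names like `total_revenue` and `revenue_total` carry the same information; the point of the rule is to pick one position for the qualifier and keep it there.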
What rules do you use to name your variables?
camelCase for all important variables, CamelCase for all classes
Where are single letter vars allowed?
In loop constructs and in mathematical functions where the single letter var name is consistent with the mathematical definition.
How much info do you put in the name?
You should be able to read the code like a book. Function names should tell you what the function does (scalarProd(), addCustomer(), etc)
How about for example code?
what are your preferred meaningless variable names? (after foo & bar)
temp, tmp, input, I never really use foo and bar.
I would say try to name them as clearly as possible. Never use single letter variables and only use 'foo' and 'bar' if you're just testing something out (e.g., in interactive mode) and won't use it in production.
I like to prefix my variables with what they're going to be: str = String, int = Integer, bool = Boolean, etc.
Using a single letter is quick and easy in Loops: For i = 0 to 4...Loop
Variables are made to be a short but descriptive substitute for what you're using. If the variable is too short, you might not understand what it's for. If it's too long, you'll be typing forever for a variable that represents 5.
Foo & Bar are used for example code to show how the code works. You can use just about any different nonsensical characters to use instead. I usually just use i, x, & y.
My personal opinion of foo bar vs. fu bar is that it's too obvious and no one likes 2-character variables, 3 is much better!
In DSLs and other fluent interfaces, the variable name and method name taken together often form a lexical entity. For example, I personally like the (admittedly heretical) naming pattern where the verb is put into the variable name rather than the method name. #see 6th Rule of Variable Naming
Also, I like the spartan use of $ as variable name for the main variable of a piece of code. For example, a class that pretty prints a tree structure can use $ for the StringBuffer inst var. #see This is Verbose!
Otherwise I refer to the Programmer's Phrasebook by Einar Hoest. #see http://www.nr.no/~einarwh/phrasebook/
I always use single letter variables in for loops, it's just nicer-looking and easier to read.
A lot of it depends on the language you're programming in, too. I don't name variables the same in C++ as I do in Java (Java lends itself better to excessively long variable names, imo, but this could just be a personal preference. Or it may have something to do with how Java built-ins are named...).
locals: fooBar;
members/types/functions FooBar
interfaces: IFooBar
As for me, single letters are only valid if the name is classic: i/j/k only for local loop indexes, x, y, z for vector parts.
vars have names that convey meaning but are short enough to not wrap lines
foo,bar,baz. Pickle is also a favorite.
I learned not to ever use single-letter variable names back in my VB3 days. The problem is that if you want to search everywhere that a variable is used, it's kinda hard to search on a single letter!
The newer versions of Visual Studio have intelligent variable searching functions that avoid this problem, but old habits and all that. Anyway, I prefer to err on the side of ridiculous.
for (int firstStageRocketEngineIndex = 0; firstStageRocketEngineIndex < firstStageRocketEngines.Length; firstStageRocketEngineIndex++)
{
    firstStageRocketEngines[firstStageRocketEngineIndex].Ignite();
    Thread.Sleep(100); // Don't start them all at once. That would be bad.
}
It's pretty much unimportant how you name variables. You really don't need any rules, other than those specified by the language, or at minimum, those enforced by your compiler.
It's considered polite to pick names you think your teammates can figure out, but style rules don't really help with that as much as people think.
Since I work as a contractor, moving among different companies and projects, I prefer to avoid custom naming conventions. They make it more difficult for a new developer, or a maintenance developer, to become acquainted with (and follow) the standard being used.
So, while one can find points in them to disagree with, I look to the official Microsoft .NET guidelines for a consistent set of naming conventions.
With some exceptions (Hungarian notation), I think consistent usage may be more useful than any arbitrary set of rules. That is, do it the same way every time.
I work in MathCAD and I'm happy because MathCAD gives me incredible possibilities in naming, and I use them a lot. I can't understand how to program without this.
To distinguish one var from another I have to include a lot of information in the name, for example:
1. In the first place, what it is: N for quantity, F for force and so on.
2. In the second, additional indices, for example the direction of a force.
3. In the third, indexing inside a vector or matrix var. For convenience I put the var name in {} or [] brackets to show its dimensions.
So, in conclusion, my var names look like
N.dirs / Fx i.row / {F}.w.(i,j.k) / {F}.w.(k,i.j).
Sometimes I have to add name of coordinate system for vector values
{F}.{GCS}.w.(i,j.k) / {F}.{LCS}.w.(i,j.k)
And as final step I add name of the external module in BOLD at the end of external function or var like Row.MTX.f([M]) because MathCAD doesn't have help string for function.
Use variable names that clearly describe what the variable contains. If the class is going to get big, or if the variable is in the public scope, the name needs to be more precise. Of course, good naming helps you and other people understand the code better.
for example: use "employeeNumber" instead of just "number".
use Btn or Button at the end of the name of variables referring to buttons, str for strings and so on.
Start variables with lower case, start classes with uppercase.
example of class "MyBigClass", example of variable "myStringVariable"
Use upper case to indicate a new word for better readability. Don't use "_", because it looks uglier and takes longer time to write.
for example: use "employeeName".
Only use single character variables in loops.
Updated
First off, naming depends on existing conventions, whether from language, framework, library, or project. (When in Rome...) Example: Use the jQuery style for jQuery plugins, use the Apple style for iOS apps. The former example requires more vigilance (since JavaScript can get messy and isn't automatically checked), while the latter example is simpler since the standard has been well-enforced and followed. YMMV depending on the leaders, the community, and especially the tools.
I will set aside all my naming habits to follow any existing conventions.
In general, I follow these principles, all of which center around programming being another form of interpersonal communication through written language.
Readability - important parts should have solid names; but these names should not be a replacement for proper documentation of intent. The test for code readability is whether you can come back to it months later and still understand it well enough not to toss the entire thing on first impression. This means avoiding abbreviation; see the case against Hungarian notation.
Writeability - common areas and boilerplate should be kept simple (esp. if there's no IDE), so code is easier and more fun to write. This is a bit inspired by Rob Pike's style.
Maintainability - if I add the type to my name like arrItems, then it would suck if I changed that property to be an instance of a CustomSet class that extends Array. Type notes should be kept in documentation, and only if appropriate (for APIs and such).
Standard, common naming - For dumb environments (text editors): Classes should be in ProperCase, variables should be short and if needed be in snake_case and functions should be in camelCase.
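The maintainability point can be sketched in a few lines. This is a minimal Python illustration, not the original JavaScript: `CustomSet` here is a hypothetical list subclass standing in for "a CustomSet class that extends Array", and the variable names are invented for the example.

```python
class CustomSet(list):
    """Hypothetical list subclass that silently drops duplicates on append."""

    def append(self, item):
        if item not in self:
            super().append(item)


# Type baked into the name: accurate only while the value is a plain list.
arr_items = ["apple", "pear"]

# Later the value becomes a CustomSet. The "arr" prefix is now misleading,
# and fixing it means renaming every use site.
arr_items = CustomSet(arr_items)
arr_items.append("apple")  # duplicate, ignored by CustomSet

# A role-based name survives the same change untouched.
items = CustomSet(["apple", "pear"])
items.append("plum")

print(arr_items, items)
```

The second name says what the value is for rather than what it is made of, so the type can change without the name going stale.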
For JavaScript, it's a classic case of the restraints of the language and the tools affecting naming. It helps to distinguish variables from functions through different naming, since there's no IDE to hold your hand while this and prototype and other boilerplate obscure your vision and confuse your differentiation skills. It's also not uncommon to see all the unimportant or globally-derived vars in a scope be abbreviated. The language has no import [path] as [alias];, so local vars become aliases. And then there's the slew of different whitespacing conventions. The only solution here (and anywhere, really) is proper documentation of intent (and identity).
Also, the language itself is based around function level scope and closures, so that amount of flexibility can make blocks with variables in 2+ scope levels feel very messy, so I've seen naming where _ is prepended for each level in the scope chain to the vars in that scope.
I do a lot of PHP nowadays. It was not always like that, though, and I have learned a couple of tricks when it comes to variable naming.
//this is my string variable
$strVar = "";
//this would represent an array
$arrCards = array();
//this is for an integer
$intTotal = NULL;
//object
$objDB = new database_class();
//boolean
$blValid = true;