Why do programming languages not allow spaces in identifiers? - language-agnostic

This may seem like a dumb question, but still I don't know the answer.
Why do programming languages not allow spaces in the names ( for instance method names )?
I understand it is to facilitate ( allow ) the parsing, and at some point it would be impossible to parse anything if spaces were allowed.
Nowadays we are so used to it that not seeing spaces is the norm.
For instance:
object.saveData( data );
object.save_data( data )
object.SaveData( data );
[object saveData:data];
etc.
Could be written as:
object.save data( data ) // looks ugly, but that's the "natural" way.
If it is only for parsing, I guess the identifier could be whatever sits between the . and the (. Of course, procedural languages wouldn't be able to use this because there is no '.', but OO languages could.
I wonder if parsing is the only reason, and if it is, how important it is ( I assume that it will be and it will be impossible to do it otherwise, unless all the programming language designers just... forget the option )
EDIT
I'm OK with the idea that spaces in identifiers in general ( as in the Fortran example ) are a bad idea. Narrowing it down to OO languages and specifically to methods, I don't see ( I don't mean there is not ) a reason why it should be that way. After all, the . and the first ( may be used as delimiters.
And forget the saveData method , consider this one:
key.ToString().StartsWith("TextBox")
as:
key.to string().starts with("textbox");

Be cause i twoul d makepa rsing suc hcode reallydif ficult.

I used an implementation of ALGOL (c. 1978) which—extremely annoyingly—required quoting of what is now known as reserved words, and allowed spaces in identifiers:
"proc" filter = ("proc" ("int") "bool" p, "list" l) "list":
"if" l "is" "nil" "then" "nil"
"elif" p(hd(l)) "then" cons(hd(l), filter(p,tl(l)))
"else" filter(p, tl(l))
"fi";
Also, FORTRAN (the capitalized form means F77 or earlier), was more or less insensitive to spaces. So this could be written:
799 S = FLO AT F (I A+I B+I C) / 2 . 0
A R E A = SQ R T ( S *(S - F L O ATF(IA)) * (S - FLOATF(IB)) *
+ (S - F LOA TF (I C)))
which was syntactically identical to
799 S = FLOATF (IA + IB + IC) / 2.0
AREA = SQRT( S * (S - FLOATF(IA)) * (S - FLOATF(IB)) *
+ (S - FLOATF(IC)))
With that kind of history of abuse, why make parsing difficult for humans? Let alone complicate computer parsing.

Yes, it's the parsing - both human and computer. It's easier to read and easier to parse if you can safely assume that whitespace doesn't matter. Otherwise, you can have potentially ambiguous statements, statements where it's not clear how things go together, statements that are hard to read, etc.

Such a change would make for an ambiguous language in the best of cases. For example, in a C99-like language:
if not foo(int x) {
...
}
is that equivalent to:
A function definition of foo that returns a value of type ifnot:
ifnot foo(int x) {
...
}
A call to a function called notfoo with a variable named intx:
if notfoo(intx) {
...
}
A negated call to a function called foo (with C99's not which means !):
if not foo(intx) {
...
}
This is just a small sample of the ambiguities you might run into.
Update: I just noticed that obviously, in a C99-like language, the condition of an if statement would be enclosed in parentheses. Extra punctuation can help with ambiguities if you choose to ignore whitespace, but your language will end up having lots of extra punctuation wherever you would normally have used whitespace.
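To get a feel for how quickly this blows up, here is a small Python sketch. The function name groupings is my own invention for illustration. It lists every way a run of space-separated words could be glued into identifiers if spaces were allowed inside names; it assumes a non-empty word list and ignores keywords and grammar entirely.

```python
from itertools import product

def groupings(words):
    """Return every way to glue adjacent words into identifiers."""
    results = []
    # Each gap between two words is either a separator (False) or part of a name (True).
    for joins in product([False, True], repeat=len(words) - 1):
        tokens = [words[0]]
        for word, join in zip(words[1:], joins):
            if join:
                tokens[-1] += word
            else:
                tokens.append(word)
        results.append(tokens)
    return results

candidates = groupings(["if", "not", "foo"])
for tokens in candidates:
    print(tokens)
```

With n words there are 2^(n-1) candidate readings, so the parser (or the human reader) needs extra rules or punctuation to pick one.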

Before the interpreter or compiler can build a parse tree, it must perform lexical analysis, turning the stream of characters into a stream of tokens. Consider how you would want the following parsed:
a = 1.2423 / (4343.23 * 2332.2);
And how your rule above would work on it. Hard to know how to lexify it without understanding the meaning of the tokens. It would be really hard to build a parser that did lexification at the same time.
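As a rough illustration of that lexical-analysis step, here is a minimal tokenizer sketch in Python. The token classes (NUMBER, IDENT, OP) are hypothetical names I picked, not taken from any real compiler. Because whitespace is only ever a separator, the lexer can split the text without knowing what any token means.

```python
import re

# Hypothetical token classes for a tiny expression language.
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[=+\-*/();]"),
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(text):
    """Split text into (kind, value) pairs, discarding whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

tokens = tokenize("a = 1.2423 / (4343.23 * 2332.2);")
```

If spaces could also appear inside IDENT, the SKIP rule could no longer be applied blindly, and the lexer would need help from the parser to decide where one name ends.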

There are a few languages which allow spaces in identifiers. Nearly all languages constrain the set of characters in identifiers because parsing is easier that way and most programmers are accustomed to the compact no-whitespace style.
I don’t think there’s a real reason.

Check out Stroustrup's classic Generalizing Overloading for C++2000.

We were allowed to put spaces in filenames back in the 1960's, and computers still don't handle them very well (everything used to break, then most things, now it's just a few things - but they still break).
We simply can't wait another 50 years before our code will work again.
:-)
(And what everyone else said, of course. In English, we use spaces and punctuation to separate the words. The same is true for computer languages, except that computer parsers define "words" in a slightly different sense)

Using space as part of an identifier makes parsing really murky (is that a syntactic space or part of an identifier?), but the same sort of "natural reading" behavior is achieved with keyword arguments. object.save(data: something, atomically: true)
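For comparison, a sketch of the same idea with Python keyword arguments. The save function here is hypothetical and only returns a description rather than writing anything.

```python
def save(data, atomically=False):
    """Hypothetical save routine; returns a description instead of touching disk."""
    mode = "atomically" if atomically else "in place"
    return f"saved {len(data)} bytes {mode}"

# The call site reads almost like a phrase, with no spaces inside any identifier.
message = save(data=b"hello", atomically=True)
print(message)
```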

The TikZ language for creating graphics in LaTeX allows whitespace in parameter names (also known as 'keys'). For instance, you see things like
\shade[
top color=yellow!70,
bottom color=red!70,
shading angle={45},
]
In this restricted setting of a comma-separated list of key-value pairs, there's no parsing difficulty. In fact, I think it's much easier to read than the alternatives like topColor, top_color or topcolor.
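To show why this restricted setting is easy, here is a naive Python sketch of such a key-value parser. This is not how TikZ or pgfkeys is actually implemented, and parse_options is a made-up name. It also does not handle commas nested inside braces, which the real thing has to.

```python
def parse_options(text):
    """Parse 'key=value, key=value' where keys may contain spaces."""
    options = {}
    for item in text.split(","):
        item = item.strip()
        if not item:  # tolerate a trailing comma
            continue
        key, _, value = item.partition("=")
        options[key.strip()] = value.strip()
    return options

opts = parse_options("top color=yellow!70, bottom color=red!70, shading angle={45},")
```

The comma and the equals sign do all the delimiting, so a space inside a key is never ambiguous.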

Related

Operators and Functions [duplicate]

Is there any substantial difference between operators and methods?
The only difference I see is the way they are called; do they have other differences?
For example in Python concatenation, slicing, indexing are defined as operators, while (referring to strings) upper(), replace(), strip() and so on are methods.
If I understand the question correctly...
In a nutshell, everything is a method of an object. You can find the methods behind the "expression operators" among Python's magic class methods and in the operator module.
So why does Python have "sexy" things like [x:y], [x], +, -? Because they are familiar to most developers, and even to people unfamiliar with development: math symbols like + and - catch the eye, and the reader knows what is happening. The same goes for indexing, which is common syntax in many languages.
But there is no special way to express the upper, replace, and strip methods, so there are no "expression operators" for them.
So what is the difference between "expression operators" and methods? I'd say just the way they look.
Your question is rather broad. For your examples, concatenation, slicing, and indexing are defined on strings and lists using special syntax (e.g., []). But other types may do things differently.
In fact, the behavior of most (I think all) of the operators is controlled by magic methods, so really when you write something like x + y a method is called under the hood.
From a practical perspective, one of the main differences is that the set of available syntactic operators is fixed and new ones cannot be added by your Python code. You can't write your own code to define a new operator called $ and then have x $ y work. On the other hand, you can define as many methods as you want. This means that you should choose carefully what behavior (if any) you assign to operators; since there are only a limited number of operators, you want to be sure that you don't "waste" them on uncommon operations.
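A quick way to see this for yourself (a sketch; is_valid_expression and Money are names I made up): Python's own parser rejects x $ y before any of your code gets a say, while an existing operator such as + can be given meaning for a new class.

```python
def is_valid_expression(source):
    """Return True if Python's parser accepts the text as an expression."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True

class Money:
    """Hypothetical class that assigns a meaning to the existing + operator."""
    def __init__(self, cents):
        self.cents = cents

    def __add__(self, other):
        return Money(self.cents + other.cents)

print(is_valid_expression("x + y"))  # parses fine, names are not looked up
print(is_valid_expression("x $ y"))  # rejected by the parser
total = Money(150) + Money(250)
```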
Is there any substantial difference between operators and methods?
Practically speaking, there is no difference because each operator is mapped to a specific Python special method. Moreover, whenever Python encounters the use of an operator, it calls its associated special method implicitly. For example:
1 + 2
implicitly calls int.__add__, which makes the above expression equivalent [1] to:
(1).__add__(2)
Below is a demonstration:
>>> class Foo:
...     def __add__(self, other):
...         print("Foo.__add__ was called")
...         return other + 10
...
>>> f = Foo()
>>> f + 1
Foo.__add__ was called
11
>>> f.__add__(1)
Foo.__add__ was called
11
>>>
Of course, actually using (1).__add__(2) in place of 1 + 2 would be inefficient (and ugly!) because it involves an unnecessary name lookup with the . operator.
That said, I do not see a problem with generally regarding the operator symbols (+, -, *, etc.) as simply shorthands for their associated method names (__add__, __sub__, __mul__, etc.). After all, they each end up doing the same thing by calling the same method.
[1] Well, roughly equivalent. As documented in the Python data model reference, there is a set of special methods prefixed with the letter r that handle reflected operands. For example, the following expression:
A + B
may actually be equivalent to:
B.__radd__(A)
if A does not implement __add__ (or its __add__ returns NotImplemented) but B implements __radd__.
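A small sketch of the reflected case, using a made-up Meters class: int does not know how to add a Meters, so Python falls back to Meters.__radd__.

```python
class Meters:
    """Hypothetical class that only implements the reflected addition."""
    def __init__(self, value):
        self.value = value

    def __radd__(self, other):
        # Called for `other + self` when `other` cannot handle the addition.
        return Meters(other + self.value)

# int.__add__ gives up on a Meters operand by returning NotImplemented...
gave_up = (5).__add__(Meters(2)) is NotImplemented
# ...so the + operator tries Meters.__radd__(Meters(2), 5) instead.
total = 5 + Meters(2)
```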

Uses of & and && operator

Same question goes for | and ||.
What are the uses of the & and && operators? The only uses I can think of are:
Bitwise AND for integer base types (but not floats/decimals), using &
Logical short-circuiting for bools/functions that return bool, usually using the && operator
I can't think of any other cases where I have used them.
Does anyone know of other uses?
-edit- To clarify, I am asking about any language. I have seen DateTime use '-' to return a timespan, strings use '+' to create new strings, etc. I don't remember any custom datatype using && and &. So I am asking what they might (reasonably) be used for. I don't know of an example.
In most C-based languages the meanings of these operators are:
&& - boolean AND. Used in boolean expressions such as if statements.
|| - boolean OR. Used in boolean expressions such as if statements.
& - bitwise AND. Used to AND the bits of both operands.
| - bitwise OR. Used to OR the bits of both operands.
However, these are not guaranteed to be such. Since every language defines its own operators, these tokens can be defined as anything in a different language.
From your edit, you seem to be using C#. The above description is right for C#, with | and & also acting as non-short-circuiting logical operators when their operands are bool.
As for what you are saying about DateTime and the + operator - this is not related to the other operators you mentioned and their meaning.
If you're asking about all languages then I don't think it's reasonable to talk about "the & operator". The token & could have all sorts of meanings in different languages, operator and otherwise.
For example in C alone there are two distinct & operators (unary address-of and binary bitwise-and). Unary & in C and related languages is the only example I can immediately think of, of a use I've encountered that meets your criteria.
However, C++ adds operator overloading so that they can mean anything you like for user-defined classes, and in addition the & character has meaning in type declarations. In C++0x the && token has meaning in type declarations too.
A language along the lines of APL or J could "reasonably" use an & operator to mean pretty much anything, since there is no expectation that code in those languages bears any resemblance at all to C-like languages. Not sure if either of those two does in fact use either & or &&.
What meanings it's "reasonable" for a binary & operator overload to have in C++ is a matter of taste - normally it would be something that's analogous to bitwise & in some way, because the values represented by your class can be considered as a sequence of bits in some way. Doesn't have to be, though, as long as it's something that makes sense in the domain. Normally it's fairly "unreasonable" to use an & overload just because & happens to be unused. But if your class represents something fairly abstruse in mathematics and you need a third binary operator after + and *, I suppose you'd start looking around. If what you want is something with even lower precedence than +, binary & is a candidate. I can't for the moment think of any structures in abstract algebra that want such a thing, but that doesn't mean there aren't any.
Overloading operator&& in C++ is moderately antisocial, since the un-overloaded version of the operator short-circuits and overloaded versions don't. C++ programmers are used to writing expressions like if (p && *p != 0), so by overloading operator&& you're in effect messing with a control structure.
Overloading unary operator& in C++ is extremely antisocial. It stops people taking pointers to your objects. IIRC there are some awkward cases where common implementations of standard templates require of their template parameters that unary operator& results in a pointer (or at least a very pointer-like thing). This is not documented in the requirements for the argument, but is either almost or completely unavoidable when the library-writer comes to implement the template. So the overload would place restrictions on the use of the class that can't be deduced from the standard, and there'd better be a very good reason for that.
[Edit: what I didn't know when I wrote this, but do know now, is that template-writers could work around the need to use unary operator& with template parameters where the standard doesn't specify what & does for that type (i.e. all of them). You can do what boost::addressof does, which is:
reinterpret_cast<Foo*>(&reinterpret_cast<char&>(foo))
The standard doesn't require much of reinterpret_cast, but since we're talking about standard templates they know exactly what it does in the implementation, and anyway it's legal to reinterpret an object as chars. I think this is guaranteed to work - but if not the implementation can ensure that it does work if necessary to write fully conforming standard templates.
But, if your implementation doesn't go to these lengths to avoid calling an overloaded operator&, the original problem remains.]
As your previous question about these operators was about C#, I assume that this one is too.
Generally you want to use the short-circuit version of the conditional operators to avoid unnecessary operations. If the value of the first operand is enough to determine the result, the second operand needn't be evaluated.
When a condition relies on the previous condition being true, only the short-circuit operators work, for example doing a null check and property comparison:
if (myObj != null && myObj.State == "active")
Using the & operator in that case would not keep the second operand from being evaluated, and it would cause a null reference exception.
The non-shortcircuit operators are useful when you want both operands to always be evaluated, for example when they have a side effect:
if (DoSomeWork() & DoOtherWork())
Using the && operator would prevent the second method from being called if the first returned false.
The & and | operators are also bitwise operators, but since || and && are not, there is no ambiguity when you use them as bitwise operators.
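The same distinction can be sketched in Python, where `and` short-circuits and & on bools evaluates both operands. The function names mirror the C# example above and are otherwise hypothetical.

```python
calls = []

def do_some_work():
    calls.append("some")
    return False

def do_other_work():
    calls.append("other")
    return True

# Short-circuit: the right-hand side is skipped because the left is already False.
short_circuit = do_some_work() and do_other_work()
calls_after_and = list(calls)

calls.clear()

# Non-short-circuit: both functions run before the results are combined.
eager = do_some_work() & do_other_work()
calls_after_amp = list(calls)
```

Both expressions yield False; the only difference is whether the second function ran.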
Very general question and I'm assuming you're talking in Java, C#, or another similar syntax. In VB it's the equivalent of + on strings, but that's another story I assume.
As far as I know, your statement is correct if you're talking in terms of C#.
If it's Javascript then please look at this answer: Using &&'s short-circuiting as an if statement?
There is a short discussion on C# uses there too.
Java has a few more operators, such as |= : What does "|=" mean in Java?
C and C++ use & as a unary operator on any data type to get the address of the data,
for example (in C++):
int i = 5;
cout << &i; // prints the address of i
Some languages allow you to override such operators to make them do anything you want!

Is there a programming language with no control structures or operators?

Like Smalltalk or Lisp?
EDIT
Where control structures are like:
Java                      Python
if( condition ) {         if cond:
    doSomething               doSomething
}
Or
Java                      Python
while( true ) {           while True:
    print("Hello");           print "Hello"
}
And operators
Java, Python
1 + 2 // + operator
2 * 5 // * op
In Smalltalk ( if I'm correct ) that would be:
condition ifTrue:[
doSomething
]
True whileTrue:[
"Hello" print
]
1 + 2 // + is a method of 1 and the parameter is 2 like 1.add(2)
2 * 5 // same thing
how come you've never heard of lisp before?
You mean without special syntax for achieving the same?
Lots of languages have control structures and operators that are "really" some form of message passing or function call system that can be redefined. Most "pure" object languages and pure functional languages fit the bill. But they are all still going to have your "+" and some form of code block (including Smalltalk!), so your question is a little misleading.
Assembly
Befunge
Prolog*
*I cannot be held accountable for any frustration and/or headaches caused by trying to get your head around this technology, nor am I liable for any damages caused by you due to aforementioned conditions including, but not limited to, broken keyboard, punched-in screen and/or head-shaped dents in your desk.
Pure lambda calculus? Here's the grammar for the entire language:
e ::= x | e1 e2 | \x . e
All you have are variables, function application, and function creation. It's equivalent in power to a Turing machine. There are well-known codings (typically "Church encodings") for such constructs as
If-then-else
while-do
recursion
and such datatypes as
Booleans
integers
records
lists, trees, and other recursive types
Coding in lambda calculus can be a lot of fun—our students will do it in the undergraduate languages course next spring.
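For a taste of those Church encodings, here is a sketch using nothing but single-argument Python lambdas. The to_int helper is only there so we can inspect the result and is not part of the encoding. Note that Python evaluates arguments eagerly, so this IF evaluates both branches, unlike a real if statement.

```python
# Church booleans: a boolean is a function that selects one of two arguments.
TRUE = lambda t: lambda f: t
FALSE = lambda t: lambda f: f
IF = lambda cond: lambda then: lambda otherwise: cond(then)(otherwise)

# Church numerals: the numeral n applies a function n times.
ZERO = lambda f: lambda x: x
SUCC = lambda n: lambda f: lambda x: f(n(f)(x))
ADD = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))

def to_int(numeral):
    """Decode a Church numeral into a Python int (for inspection only)."""
    return numeral(lambda k: k + 1)(0)

ONE = SUCC(ZERO)
TWO = SUCC(ONE)
three = to_int(ADD(TWO)(ONE))
choice = IF(TRUE)("yes")("no")
```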
Forth may qualify, depending on exactly what you mean by "no control structures or operators". Forth may appear to have them, but really they are all just symbols, and the "control structures" and "operators" can be defined (or redefined) by the programmer.
What about Logo or more specifically, Turtle Graphics? I'm sure we all remember that, PEN UP, PEN DOWN, FORWARD 10, etc.
The SMITH programming language:
http://esolangs.org/wiki/SMITH
http://catseye.tc/projects/smith/
It has no jumps and is Turing complete. I've also made a Haskell interpreter for this bad boy a few years back.
I'll be first to mention brain**** then.
In Tcl, there's no control structures; there's just commands and they can all be redefined. Every last one. There's also no operators. Well, except for in expressions, but that's really just an imported foreign syntax that isn't part of the language itself. (We can also import full C or Fortran or just about anything else.)
How about FRACTRAN?
FRACTRAN is a Turing-complete esoteric programming language invented by the mathematician John Conway. A FRACTRAN program is an ordered list of positive fractions together with an initial positive integer input n. The program is run by updating the integer (n) as follows:
for the first fraction f in the list for which nf is an integer, replace n by nf
repeat this rule until no fraction in the list produces an integer when multiplied by n, then halt.
Of course there is an implicit control structure in rule 2.
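The two rules fit in a few lines of Python. This is my own sketch, not a reference implementation; the max_steps guard is an addition so that a non-halting program cannot hang the interpreter.

```python
from fractions import Fraction

def run_fractran(program, n, max_steps=10_000):
    """Run a FRACTRAN program (a list of Fractions) on the integer n."""
    for _ in range(max_steps):
        for f in program:
            product = n * f
            if product.denominator == 1:  # n*f is an integer: take it and restart
                n = product.numerator
                break
        else:
            return n  # no fraction produced an integer: halt
    raise RuntimeError("step limit reached")

# The one-fraction program 3/2 adds exponents: 2^a * 3^b becomes 3^(a+b).
result = run_fractran([Fraction(3, 2)], 2**3 * 3**2)
```

Here the input 72 = 2^3 * 3^2 goes through 108 and 162 and halts at 243 = 3^5.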
D (used in DTrace)?
APT - (Automatic Programmed Tool) used extensively for programming NC machine tools.
The language also has no IO capabilities.
XSLT (or XSL, some say) has control structures like if and for, but you should generally avoid them and deal with everything by writing rules with the correct level of specificity. So the control structures are there, but are implied by the default thing the translation engine does: apply potentially-recursive rules.
For and if (and some others) do exist, but in many many situations you can and should work around them.
How about Whenever?
Programs consist of "to-do list" - a series of statements which are executed in random order. Each statement can contain a prerequisite, which if not fulfilled causes the statement to be deferred until some (random) later time.
I'm not entirely clear on the concept, but I think PostScript meets the criteria, although it calls all of its functions operators (the way LISP calls all of its operators functions).
Makefile syntax doesn't seem to have any operators or control structures. I'd say it's a programming language but it isn't Turing Complete (without extensions to the POSIX standard anyway)
So... you're looking for a super-simple language? How about Batch programming? If you have any version of Windows, then you have access to a Batch interpreter. It's also more useful than you'd think, since you can carry out basic file functions (copy, rename, make directory, delete file, etc.)
http://www.csulb.edu/~murdock/dosindex.html
Example
Open notepad and make a .Bat file on your Windows box.
Open the .Bat file with notepad
In the first line, type "echo off"
In the second line, type "echo hello world"
In the third line, type "pause"
Save and run the file.
If you're looking for a way to learn some very basic programming, this is a good way to start. (Just be careful with the Delete and Format commands. Don't experiment with those.)

Using magic strings or constants in processing punctuation?

We do a lot of lexical processing with arbitrary strings which include arbitrary punctuation. I am divided as to whether to use magic characters/strings or symbolic constants.
The examples should be read as language-independent although most are Java.
There are clear examples where punctuation has a semantic role and should be identified as a constant:
File.separator not "/" or "\\"; // a no-brainer as it is OS-dependent
and I write XML_PREFIX_SEPARATOR = ":";
However let's say I need to replace all examples of "" with an empty string ``. I can write:
s = s.replaceAll("\"\"", "");
or
s = s.replaceAll(S_QUOT+S_QUOT, S_EMPTY);
(I have defined all common punctuation as S_FOO (string) and C_FOO (char))
In favour of magic strings/characters:
It's shorter
It's natural to read (sometimes)
The named constants may not be familiar (C_APOS vs '\'')
In favour of constants
It's harder to make typos (e.g. contrast "''" + '"' with S_APOS+S_APOS + C_QUOT)
It removes escaping problems. Should a regex be "\\s+" or "\s+" or "\\\\s+"?
It's easy to search the code for punctuation
(There is a limit to this - I would not write regexes this way even though regex syntax is one of the most cognitively dysfunctional parts of all programming. I think we need a better syntax.)
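For concreteness, here are the two styles side by side as a Python sketch. The S_ names follow the question's convention; note that Python's str.replace takes a literal string, whereas Java's replaceAll takes a regex.

```python
# Named punctuation constants, following the S_FOO convention from the question.
S_QUOT = '"'
S_APOS = "'"
S_EMPTY = ""

def strip_doubled_quotes_named(s):
    """Remove every pair of adjacent double quotes, using named constants."""
    return s.replace(S_QUOT + S_QUOT, S_EMPTY)

def strip_doubled_quotes_literal(s):
    """Same operation written with magic strings."""
    return s.replace('""', "")

sample = 'say ""hi"" to ' + S_APOS + "them" + S_APOS
cleaned = strip_doubled_quotes_named(sample)
```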
If the definitions may change over time or between installations, I tend to put these things in a config file, and pick up the information at startup or on-demand (depending on the situation). Then provide a static class with read-only interface and clear names on the properties for exposing the information to the system.
Usage could look like this:
s = s.replaceAll(CharConfig.Quotation + CharConfig.Quotation, CharConfig.EmptyString);
For general string processing, I wouldn't use special symbols. A space is always going to be a space, and it's just more natural to read (and write!):
s.replace("String", " ");
Than:
s.replace("String", S_SPACE);
I would take special care to use things like "\t" to represent tabs, for example, since they can't easily be distinguished from spaces in a string.
As for things like XML_PREFIX_SEPARATOR or FILE_SEPARATOR, you should probably never have to deal with constants like that, since you should use a library to do the work for you. For example, you shouldn't be hand-writing: dir + FILE_SEPARATOR + filename, but rather be calling: file_system_library.join(dir, filename) (or whatever equivalent you're using).
This way, you'll not only have an answer for things like the constants, you'll actually get much better handling of various edge cases which you probably aren't thinking about right now.
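In Python, for example, the standard library already plays the role of that file_system_library. A short sketch follows; the directory and file names are made up.

```python
import os
from pathlib import PurePosixPath, PureWindowsPath

# os.path.join inserts the separator of the platform the code runs on.
path = os.path.join("reports", "2024", "summary.txt")

# The Pure* path classes make the platform explicit, which is handy in tests.
posix = PurePosixPath("reports") / "summary.txt"
windows = PureWindowsPath("reports") / "summary.txt"

print(path)
print(posix)
print(windows)
```

No separator constant appears anywhere in the calling code.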

How to name variables

What rules do you use to name your variables?
Where are single letter vars allowed?
How much info do you put in the name?
How about for example code?
What are your preferred meaningless variable names? (after foo & bar)
Why are they spelled "foo" and "bar" rather than FUBAR?
function startEditing(){
    if (user.canEdit(currentDocument)){
        editorControl.setEditMode(true);
        setButtonDown(btnStartEditing);
    }
}
Should read like a narrative work.
One rule I always follow is this: if a variable encodes a value that is in some particular units, then those units have to be part of the variable name. Example:
int postalCodeDistanceMiles;
decimal reactorCoreTemperatureKelvin;
decimal altitudeMsl;
int userExperienceWongBakerPainScale;
I will NOT be responsible for crashing any Mars landers (or the equivalent failure in my boring CRUD business applications).
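A sketch of how the unit suffix pays off, in Python with hypothetical names: the conversion function and both variables carry their units, so a mismatch is visible at the call site.

```python
KM_PER_MILE = 1.609344  # exact, by the definition of the international mile

def miles_to_km(distance_miles):
    """Convert a distance in miles to kilometres."""
    return distance_miles * KM_PER_MILE

postal_code_distance_miles = 10
postal_code_distance_km = miles_to_km(postal_code_distance_miles)

# Passing postal_code_distance_km to miles_to_km would look wrong at a glance.
```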
Well, it all depends on the language you are developing in. As I am currently using C#, I tend to use the following.
camelCase for variables.
camelCase for parameters.
PascalCase for properties.
m_PascalCase for member variables.
Where are single letter vars allowed?
I tend to do this in for loops but feel a bit guilty whenever I do so. But with foreach and lambda expressions for loops are not really that common now.
How much info do you put in the name?
If the code is a bit difficult to understand, write a comment. Don't turn a variable name into a comment, i.e.
int theTotalAccountValueIsStoredHere
is not required.
what are your preferred meaningless variable names? (after foo & bar)
i or x. foo and bar are a bit too university text book example for me.
why are they spelled "foo" and "bar" rather than FUBAR?
Tradition
These are all C# conventions.
Variable-name casing
Case indicates scope. Pascal-cased variables are fields of the owning class. Camel-cased variables are local to the current method.
I have only one prefix-character convention. Backing fields for class properties are Pascal-cased and prefixed with an underscore:
private int _Foo;
public int Foo { get { return _Foo; } set { _Foo = value; } }
There's some C# variable-naming convention I've seen out there - I'm pretty sure it was a Microsoft document - that inveighs against using an underscore prefix. That seems crazy to me. If I look in my code and see something like
_Foo = GetResult();
the very first thing that I ask myself is, "Did I have a good reason not to use a property accessor to update that field?" The answer is often "Yes, and you'd better know what that is before you start monkeying around with this code."
Single-letter (and short) variable names
While I tend to agree with the dictum that variable names should be meaningful, in practice there are lots of circumstances under which making their names meaningful adds nothing to the code's readability or maintainability.
Loop iterators and array indices are the obvious places to use short and arbitrary variable names. Less obvious, but no less appropriate in my book, are nonce usages, e.g.:
XmlWriterSettings xws = new XmlWriterSettings();
xws.Indent = true;
XmlWriter xw = XmlWriter.Create(outputStream, xws);
That's from C# 2.0 code; if I wrote it today, of course, I wouldn't need the nonce variable:
XmlWriter xw = XmlWriter.Create(
outputStream,
new XmlWriterSettings() { Indent = true });
But there are still plenty of places in C# code where I have to create an object that I'm just going to pass elsewhere and then throw away.
A lot of developers would use a name like xwsTemp in those circumstances. I find that the Temp suffix is redundant. The fact that I named the variable xws in its declaration (and I'm only using it within visual range of that declaration; that's important) tells me that it's a temporary variable.
Another place I'll use short variable names is in a method that's making heavy use of a single object. Here's a piece of production code:
internal void WriteXml(XmlWriter xw)
{
    if (!Active)
    {
        return;
    }
    xw.WriteStartElement(Row.Table.TableName);
    xw.WriteAttributeString("ID", Row["ID"].ToString());
    xw.WriteAttributeString("RowState", Row.RowState.ToString());
    for (int i = 0; i < ColumnManagers.Length; i++)
    {
        ColumnManagers[i].Value = Row.ItemArray[i];
        xw.WriteElementString(ColumnManagers[i].ColumnName, ColumnManagers[i].ToXmlString());
    }
    ...
There's no way in the world that code would be easier to read (or safer to modify) if I gave the XmlWriter a longer name.
Oh, how do I know that xw isn't a temporary variable? Because I can't see its declaration. I only use temporary variables within 4 or 5 lines of their declaration. If I'm going to need one for more code than that, I either give it a meaningful name or refactor the code using it into a method that - hey, what a coincidence - takes the short variable as an argument.
How much info do you put in the name?
Enough.
That turns out to be something of a black art. There's plenty of information I don't have to put into the name. I know when a variable's the backing field of a property accessor, or temporary, or an argument to the current method, because my naming conventions tell me that. So my names don't.
Here's why it's not that important.
In practice, I don't need to spend much energy figuring out variable names. I put all of that cognitive effort into naming types, properties and methods. This is a much bigger deal than naming variables, because these names are very often public in scope (or at least visible throughout the namespace). Names within a namespace need to convey meaning the same way.
There's only one variable in this block of code:
RowManager r = (RowManager)sender;
// if the settings allow adding a new row, add one if the context row
// is the last sibling, and it is now active.
if (Settings.AllowAdds && r.IsLastSibling && r.Active)
{
    r.ParentRowManager.AddNewChildRow(r.RecordTypeRow, false);
}
The property names almost make the comment redundant. (Almost. There's actually a reason why the property is called AllowAdds and not AllowAddingNewRows that a lot of thought went into, but it doesn't apply to this particular piece of code, which is why there's a comment.) The variable name? Who cares?
Pretty much every modern language that has wide use has its own coding standards. These are a great starting point. If all else fails, just use whatever is recommended. There are exceptions of course, but these are general guidelines. If your team prefers certain variations, as long as you agree with them, then that's fine as well.
But at the end of the day it's not necessarily what standards you use, but the fact that you have them in the first place and that they are adhered to.
I only use single character variables for loop control or very short functions.
for(int i = 0; i< endPoint; i++) {...}
int max( int a, int b) {
    if (a > b)
        return a;
    return b;
}
The amount of information depends on the scope of the variable: the more places it could be used, the more information I want in the name to keep track of its purpose.
When I write example code, I try to use variable names as I would in real code (although functions might get useless names like foo or bar).
See Etymology of "Foo"
What rules do you use to name your variables?
Typically, as I am a C# developer, I follow the variable naming conventions as specified by the IDesign C# Coding Standard for two reasons
1) I like it, and find it easy to read.
2) It is the default that comes with the Code Style Enforcer AddIn for Visual Studio 2005 / 2008 which I use extensively these days.
Where are single letter vars allowed?
There are a few places where I will allow single letter variables. Usually these are simple loop indexers, OR mathematical concepts like X,Y,Z coordinates. Other than that, never! (Everywhere else I have used them, I have typically been bitten by them when rereading the code).
How much info do you put in the name?
Enough to know PRECISELY what the variable is being used for. As Robert Martin says:
The name of a variable, function, or class, should answer all the big questions. It should tell you why it exists, what it does, and how it is used. If a name requires a comment, then the name does not reveal its intent.
From Clean Code - A Handbook of Agile Software Craftsmanship
I never use meaningless variable names like foo or bar, unless, of course, the code is truly throw-away.
For loop variables, I double up the letter so that it's easier to search for the variable within the file. For example,
for (int ii=0; ii < array.length; ii++)
{
    int element = array[ii];
    printf("%d", element);
}
What rules do you use to name your variables? I've switched between underscore between words (load_vars), camel casing (loadVars) and no spaces (loadvars). Classes are always CamelCase, capitalized.
Where are single letter vars allowed? Loops, mostly. Temporary vars in throwaway code.
How much info do you put in the name? Enough to remind me what it is while I'm coding. (Yes this can lead to problems later!)
what are your preferred meaningless variable names? (after foo & bar) temp, res, r. I actually don't use foo and bar much.
What rules do you use to name your variables?
I need to be able to understand it in a year's time. Should also conform with preexisting style.
Where are single letter vars allowed?
Ultra-obvious things. E.g. char c; c = getc(); Loop indices (i,j,k).
How much info do you put in the name?
Plenty and lots.
how about for example code?
Same as above.
what are your preferred meaningless variable names? (after foo & bar)
I don't like having meaningless variable names. If a variable doesn't mean anything, why is it in my code?
why are they spelled "foo" and "bar" rather than FUBAR
Tradition.
The rules I adhere to are;
Does the name fully and accurately describe what the variable represents?
Does the name refer to the real-world problem rather than the programming language solution?
Is the name long enough that you don't have to puzzle it out?
Are computed value qualifiers, if any, at the end of the name?
Are they instantiated only at the point where they are first required?
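As a rough illustration of that checklist, here is a small Python sketch. The Order record, the summarize_revenue function, and every variable name in it are invented for this example; they only show problem-domain names, computed-value qualifiers at the end of the name, and variables introduced where they are first needed.

```python
from dataclasses import dataclass


@dataclass
class Order:
    """Hypothetical problem-domain record, invented for this example."""
    unit_price_cents: int
    quantity: int


def summarize_revenue(orders: list[Order]) -> dict[str, float]:
    # Problem-domain name ("order_revenues_cents") rather than a
    # solution-domain one such as "int_list" or "values".
    order_revenues_cents = [order.unit_price_cents * order.quantity for order in orders]

    # Computed-value qualifiers (total, max, average) sit at the end of the
    # name, so related values share a common prefix.
    revenue_total_cents = sum(order_revenues_cents)
    revenue_max_cents = max(order_revenues_cents, default=0)

    # Introduced only here, at the point where it is first needed.
    revenue_average_cents = revenue_total_cents / len(orders) if orders else 0.0

    return {
        "total": revenue_total_cents,
        "max": revenue_max_cents,
        "average": revenue_average_cents,
    }


if __name__ == "__main__":
    print(summarize_revenue([Order(250, 2), Order(1000, 1)]))
```

The names are longer than tot or avg, but each one can be read without looking up where it was assigned.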
What rules do you use to name your variables?
camelCase for all important variables, CamelCase for all classes
Where are single letter vars allowed?
In loop constructs and in mathematical functions where the single letter var name is consistent with the mathematical definition.
How much info do you put in the name?
You should be able to read the code like a book. Function names should tell you what the function does (scalarProd(), addCustomer(), etc)
How about for example code?
what are your preferred meaningless variable names? (after foo & bar)
temp, tmp, input, I never really use foo and bar.
I would say try to name them as clearly as possible. Never use single letter variables and only use 'foo' and 'bar' if you're just testing something out (e.g., in interactive mode) and won't use it in production.
I like to prefix my variables with what they're going to be: str = String, int = Integer, bool = Boolean, etc.
Using a single letter is quick and easy in Loops: For i = 0 to 4...Loop
Variables are made to be a short but descriptive substitute for what you're using. If the variable is too short, you might not understand what it's for. If it's too long, you'll be typing forever for a variable that represents 5.
Foo & Bar are used in example code to show how the code works. You can use just about any other nonsensical names instead. I usually just use i, x, & y.
My personal opinion of foo bar vs. fu bar is that it's too obvious and no one likes 2-character variables, 3 is much better!
In DSLs and other fluent interfaces often variable- and method-name taken together form a lexical entity. For example, I personally like the (admittedly heretic) naming pattern where the verb is put into the variable name rather than the method name. #see 6th Rule of Variable Naming
Also, I like the spartan use of $ as variable name for the main variable of a piece of code. For example, a class that pretty prints a tree structure can use $ for the StringBuffer inst var. #see This is Verbose!
Otherwise I refer to the Programmer's Phrasebook by Einar Hoest. #see http://www.nr.no/~einarwh/phrasebook/
I always use single letter variables in for loops, it's just nicer-looking and easier to read.
A lot of it depends on the language you're programming in too; I don't name variables the same in C++ as I do in Java (Java lends itself better to the excessively long variable names imo, but this could just be a personal preference. Or it may have something to do with how Java built-ins are named...).
locals: fooBar;
members/types/functions: FooBar
interfaces: IFooBar
As for me, single letters are only valid if the name is classic: i/j/k only for local loop indexes, x, y, z for vector parts.
vars have names that convey meaning but are short enough to not wrap lines
foo,bar,baz. Pickle is also a favorite.
I learned not to ever use single-letter variable names back in my VB3 days. The problem is that if you want to search everywhere that a variable is used, it's kinda hard to search on a single letter!
The newer versions of Visual Studio have intelligent variable searching functions that avoid this problem, but old habits and all that. Anyway, I prefer to err on the side of ridiculous.
for (int firstStageRocketEngineIndex = 0; firstStageRocketEngineIndex < firstStageRocketEngines.Length; firstStageRocketEngineIndex++)
{
firstStageRocketEngines[firstStageRocketEngineIndex].Ignite();
Thread.Sleep(100); // Don't start them all at once. That would be bad.
}
It's pretty much unimportant how you name variables. You really don't need any rules, other than those specified by the language, or at minimum, those enforced by your compiler.
It's considered polite to pick names you think your teammates can figure out, but style rules don't really help with that as much as people think.
Since I work as a contractor, moving among different companies and projects, I prefer to avoid custom naming conventions. They make it more difficult for a new developer, or a maintenance developer, to become acquainted with (and follow) the standard being used.
So, while one can find points in them to disagree with, I look to the official Microsoft .NET guidelines for a consistent set of naming conventions.
With some exceptions (Hungarian notation), I think consistent usage may be more useful than any arbitrary set of rules. That is, do it the same way every time.
I work in MathCAD and I'm happy because MathCAD gives me incredible possibilities in naming, and I use them a lot. I can't understand how to program without them.
To distinguish one variable from another I have to include a lot of information in the name, for example:
1. In the first place, what it is: N for quantity, F for force, and so on.
2. In the second place, additional indices, for example the direction of a force.
3. In the third place, the indexing inside a vector or matrix variable. For convenience I put the variable name in {} or [] brackets to show its dimensions.
So, in conclusion, my variable names look like
N.dirs / Fx i.row / {F}.w.(i,j.k) / {F}.w.(k,i.j).
Sometimes I have to add name of coordinate system for vector values
{F}.{GCS}.w.(i,j.k) / {F}.{LCS}.w.(i,j.k)
And as final step I add name of the external module in BOLD at the end of external function or var like Row.MTX.f([M]) because MathCAD doesn't have help string for function.
Use variable names that clearly describe what the variable contains. If the class is going to get big, or if the variable is in the public scope, the name needs to be more precise. Of course, good naming helps you and other people understand the code better.
For example: use "employeeNumber" instead of just "number".
Put Btn or Button at the end of the names of variables referring to buttons, str for strings, and so on.
Start variables with lower case, start classes with uppercase.
example of class "MyBigClass", example of variable "myStringVariable"
Use upper case to indicate a new word for better readability. Don't use "_", because it looks uglier and takes longer to write.
for example: use "employeeName".
Only use single character variables in loops.
Updated
First off, naming depends on existing conventions, whether from language, framework, library, or project. (When in Rome...) Example: Use the jQuery style for jQuery plugins, use the Apple style for iOS apps. The former example requires more vigilance (since JavaScript can get messy and isn't automatically checked), while the latter example is simpler since the standard has been well-enforced and followed. YMMV depending on the leaders, the community, and especially the tools.
I will set aside all my naming habits to follow any existing conventions.
In general, I follow these principles, all of which center around programming being another form of interpersonal communication through written language.
Readability - important parts should have solid names; but these names should not be a replacement for proper documentation of intent. The test for code readability is if you can come back to it months later and still be understanding enough to not toss the entire thing upon first impression. This means avoiding abbreviation; see the case against Hungarian notation.
Writeability - common areas and boilerplate should be kept simple (esp. if there's no IDE), so code is easier and more fun to write. This is a bit inspired by Rob Pike's style.
Maintainability - if I add the type to my name like arrItems, then it would suck if I changed that property to be an instance of a CustomSet class that extends Array. Type notes should be kept in documentation, and only if appropriate (for APIs and such).
Standard, common naming - For dumb environments (text editors): Classes should be in ProperCase, variables should be short and if needed be in snake_case and functions should be in camelCase.
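The maintainability point can be sketched in a few lines of Python. This is only an illustration under assumptions: the CustomSet class here is a hypothetical stand-in that extends the built-in list (mirroring the "extends Array" case above), and arr_items and cart_items are made-up names.

```python
class CustomSet(list):
    """Hypothetical collection that extends list and ignores duplicate appends."""

    def append(self, item):
        if item not in self:
            super().append(item)


# Type-prefixed name: accurate only while the value really is a plain array/list.
arr_items = ["apple", "pear"]

# After a refactor the value is a CustomSet, so the "arr" prefix now misleads.
# Either every use gets renamed, or the name stays wrong.
arr_items = CustomSet(["apple", "pear"])

# Intent-based name: it survives the type change untouched.
cart_items = CustomSet(["apple", "pear"])
cart_items.append("apple")  # duplicate, ignored by the hypothetical CustomSet

if __name__ == "__main__":
    print(type(arr_items).__name__, list(cart_items))
```

The type still matters to callers, but it lives in the documentation or type annotations, where changing it does not force a rename.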
For JavaScript, it's a classic case of the restraints of the language and the tools affecting naming. It helps to distinguish variables from functions through different naming, since there's no IDE to hold your hand while this and prototype and other boilerplate obscure your vision and confuse your differentiation skills. It's also not uncommon to see all the unimportant or globally-derived vars in a scope be abbreviated. The language has no import [path] as [alias];, so local vars become aliases. And then there's the slew of different whitespacing conventions. The only solution here (and anywhere, really) is proper documentation of intent (and identity).
Also, the language itself is based around function level scope and closures, so that amount of flexibility can make blocks with variables in 2+ scope levels feel very messy, so I've seen naming where _ is prepended for each level in the scope chain to the vars in that scope.
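The underscore-per-scope-level convention described above might look roughly like this. The sketch uses Python closures instead of JavaScript, and the function and variable names are invented for illustration; it is not taken from any particular codebase.

```python
def make_report(title):                  # outermost scope: no prefix
    def add_section(_heading):           # one level down: one underscore
        def add_line(__text):            # two levels down: two underscores
            # The prefix shows at a glance which scope each name comes from.
            return f"{title} / {_heading} / {__text}"
        return add_line
    return add_section


if __name__ == "__main__":
    add_line = make_report("Q3")("Sales")
    print(add_line("up 4%"))
```

In Python, leading underscores already carry other conventions (private names, name mangling inside classes), so this is only a demonstration of the idea and not a recommendation for Python code.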
I do a lot of PHP nowadays. It was not always like that, though, and I have learned a couple of tricks when it comes to variable naming.
//this is my string variable
$strVar = "";
//this would represent an array
$arrCards = array();
//this is for an integer
$intTotal = NULL;
//object
$objDB = new database_class();
//boolean
$blValid = true;