I basically don't look for an answer on how to do something but I found how to do it, yet want more information. Hope this kind of question is OK here.
Since I just discovered this the code of a game I'm modding I don't have any idea what should I google for.
In Lua, I can have for example:
Account = {balance = 0}
function Account.withdraw (v)
self.balance = self.balance - v
end
I can have (in another lua file)
function Account.withdrawBetter (v)
if self.balance > v then
self.balance = self.balance - v
end
end
....
--somewhere in some function, with an Account instance:
a1.withdraw = a1.withdrawBetter
`
What's the name for this "technique" so I can find some more information about it (possible pitfalls, performance considerations vs. override/overwrite, etc)? note I'm only changing withdraw for the particular instance (a1), not for every Account instance.
Bonus question: Any other oo programming languages with such facility?
Thanks
OO in Lua
First of all, it should be pointed out that Lua does not implement Object Oriented Programming; it has no concept of objects, classes, inheritance, etc.
If you want OOP in Lua, you have to implement it yourself. Usually this is done by creating a table that acts as a "class", storing the "instance methods", which are really just functions that accept the instance as its first argument.
Inheritance is then achieved by having the "constructor" (also just a function) create a new table and set its metatable to one with an __index field pointing to the class table. When indexing the "instance" with a key it doesn't have, it will then search for that key in the class instead.
In other words, an "instance" table may have no functions at all, but indexing it with, for example, "withdraw" will just try indexing the class instead.
Now, if we take a single "instance" table and add a withdraw field to it, Lua will see that it has that field and not bother looking it up in the class. You could say that this value shadows the one in the class table.
What's the name for this "technique"
It doesn't really have one, but you should definitely look into metatables.
In languages that do support this sort of thing, like in Ruby (see below) this is often done with singleton classes, meaning that they only have a single instance.
Performance considerations
Indexing tables, including metatables takes some time. If Lua finds a method in the instance table, then that's a single table lookup; if it doesn't, it then needs to first get the metatable and index that instead, and if that doesn't have it either and has its own metatable, the chain goes on like that.
So, in other words, this is actually faster. It does use up some more space, but not really that much (technically it could be quite a lot, but you really shouldn't worry about that. Nonetheless, here's where you can read up on that, if you want to).
Any other oo programming languages with such facility?
Yes, lots of 'em. Ruby is a good example, where you can do something like
array1 = [1, 2, 3]
array2 = [4, 5, 6]
def array1.foo
puts 'bar'
end
array1.foo # prints 'bar'
array2.foo # raises `NoMethodError`
I came across this yesterday and was wondering what the relationship of pass-by-value, pass-by-reference and pass-reference-by-value is. How do they all behave? Which programming language uses which?
Additionally:
I have also occasionally heard other terms like pass-by-pointer, pass-by-copy, pass-by-object...how do they all fit in?
It's a made-up term that refers to something that is actually pass-by-value, but where people always get confused because the only values in the language are references (pointers to things). Many modern languages work this way, including Java (for non-primitives), JavaScript, Python, Ruby, Scheme, Smalltalk, etc.
Beginners always ask questions on StackOverflow about why it's not pass-by-reference, because they don't understand what pass-by-reference really is. So to satisfy them some people make up a new term that they also don't understand, but makes it seem like they learned something, and which sounds vaguely like both pass-by-value and pass-by-reference.
Use of terminology varies widely between different language communities. For example, the Java community almost always describes this semantics as pass-by-value, while the Python and Ruby communities rarely describe it as pass-by-value, even though they are semantically identical.
Let's start with the basics. You probably already know this or at least think you do. Surprisingly many people are actually misusing these terms quite a bit (which certainly doesn't help with the confusion).
What is pass-by-value and pass-by-reference?
I am not going to tell you the answer just yet and explain it using examples instead. You will see why later.
Imagine you are trying to learn a programming language and a friend tells you that he knows a great book on the topic. You have two options:
borrow the book from him or
buy your own copy.
Case 1: If you borrow his book and you make notes in the margins (and add some bookmarks, references, drawings, ...) your friend can later benefit from those notes and bookmarks as well. Sharing is caring. The world is great!
Case 2: But if, on the other hand, your friend is a bit of a neat freak or holds his books very dear he might not appreciate your notes and even less the occasional coffee stain or how you fold the corners for bookmarking. You should really get your own book and everybody can happily treat his book the way he likes. The world is great!
These two cases are - you guessed it - just like pass-by-reference and pass-by-value.
The difference is whether the caller of the function can see changes that the callee makes.
Let's look at a code example.
Say you have the following code snippet:
function largePrint(text):
text = makeUpperCase(text)
print(text)
myText = "Hello"
largePrint(myText)
print(myText)
and you run it in a pass-by-value language the output will be
HELLO
Hello
while in a pass-by-reference language the output will be
HELLO
HELLO
If you pass-by-reference the variable "myText" gets changed by the function. If you just pass the value of the variable the function cannot change it.
If you need another example, this one using websites got quite a few upvotes.
Okay, now that we have the basics down let's move on to...
What is pass-reference-by-value?
Let's use our book example again. Assume that you have decided to borrow the book from your friend although you know how much he loves his books. Let us also assume that you then - being the total klutz that you are - spilled a gallon of coffee over it. Oh, my...
Not to worry, your brain starts up and realizes that you can just buy your friend a new book. It was in such good condition he won't even notice!
What does that story have to do with pass-reference-by-value?
Well, imagine on the other hand that your friend loved his books so muchs he doesn't want to lend it to you. He does however agree to bring the book and show it to you. The problem is, you can still spill a gallon of coffee over it but now you can't buy him a new book anymore. He would notice!
While this may sound dreadful (and probably has you sweating as if you had drunk that gallon of coffee) but it is actually quite common and a perfectly good way to share books call functions. It is sometimes called pass-reference-by-value.
How does the code example fare?
In a pass-reference-by-value function the output of our snippet is
HELLO
Hello
just like in the pass-by-value case. But if we change the code a little bit:
function largePrint(text):
text.toUpperCase()
print(text)
myText = "Hello"
largePrint(myText)
print(myText)
The output becomes:
HELLO
HELLO
just like in the pass-by-reference case. The reason is the same as in the book example: We can change the variable (or book) by calling text.toUpperCase() (or by spilling coffee on it). BUT we cannot change the object that the variable references anymore (text = makeUppercase(text) has no effect); in other words we cannot use a different book.
So summarizing we get:
pass-by-value:
You get your own book, your own copy of the variable and whatever you do with it the original owner will never know.
pass-by-reference:
You use the same book as your friend or the same variable as the caller. If you change the variable it changes outside your function.
pass-reference-by-value
You use the same book as your friend but he doesn't let it out of his sight. You can change the book but not EXchange it. In variable terms that means that you cannot change what the variable references but you can change that which the variable references.
So now when I show you the following Python code:
def appendSeven(list):
list.append(7)
def replaceWithNewList(list):
list = ["a", "b", "c"]
firstList = [1, 2, 3, 4, 5, 6]
print(firstList)
appendSeven(firstList)
print(firstList)
secondList = ["x", "y", "z"]
print(secondList)
replaceWithNewList(secondList)
print(secondList)
and tell you that the output is this
[1, 2, 3, 4, 5, 6]
[1, 2, 3, 4, 5, 6, 7]
['x', 'y', 'z']
['x', 'y', 'z']
what do you say?
Indeed, you are correct! Python uses pass-reference-by-value! (And so is Java and Ruby and Scheme and...)
Conclusion:
Was this extremely difficult? No.
Then why are so many people confused by it? (I'm not going to link to all the other questions about "How do I pass a variable by reference in Python" or "Is Java pass-by-reference?" and so on...)
Well, I see several reasons:
Saying pass-value-by-reference is really long (and it behaves almost the same as pass-by-reference anyway so get of my back and stop splitting hairs!)
There are nasty special cases that complicate the issue
like primitive types (which are often pass-by-value in pass-value-by-reference languages)
immutable types (which cannot be changed so passing them by-reference doesn't get you anywhere)
Some people just like to use different names:
pass-by-copy for pass-by-value because if you paid attention, you need your own book, so a copy has to be made. (But that is an implementation detail, other tricks like copy-on-write might be used to only sometimes make a copy.)
pass-by-reference for pass-reference-by-value because, after all, you did get a reference and that means you can change the object that the caller had (even if you cannot EXchange it)
pass-by-reference for pass-reference-by-value because almost no language actually has pass-by-reference (notable Exceptions (that I know): C++, C# but you have to use special syntax)
pass-by-sharing for pass-reference-by-value (remember our book example? that name is actually way better and everybody should use it!)
pass-by-object for pass-reference-by-value because allegedly Barbara Liskov first named it and used that, so that's its birthname)
pass-by-pointer for pass-by-reference because C doesn't have pass-by-reference and that is how it allows you to accomplish the same thing.
If there is anything you think I should add, please let me know. If I made a mistake, PLEASE LET ME KNOW!
For those of you who still haven't got enough: http://en.wikipedia.org/wiki/Evaluation_strategy
This may seem like a dumb question, but still I don't know the answer.
Why do programming languages not allow spaces in the names ( for instance method names )?
I understand it is to facilitate ( allow ) the parsing, and at some point it would be impossible to parse anything if spaces were allowed.
Nowadays we are so use to it that the norm is not to see spaces.
For instance:
object.saveData( data );
object.save_data( data )
object.SaveData( data );
[object saveData:data];
etc.
Could be written as:
object.save data( data ) // looks ugly, but that's the "nature" way.
If it is only for parsing, I guess the identifier could be between . and ( of course, procedural languages wouldn't be able to use it because there is no '.' but OO do..
I wonder if parsing is the only reason, and if it is, how important it is ( I assume that it will be and it will be impossible to do it otherwise, unless all the programming language designers just... forget the option )
EDIT
I'm ok with identifiers in general ( as the fortran example ) is bad idea. Narrowing to OO languages and specifically to methods, I don't see ( I don't mean there is not ) a reason why it should be that way. After all the . and the first ( may be used.
And forget the saveData method , consider this one:
key.ToString().StartsWith("TextBox")
as:
key.to string().starts with("textbox");
Be cause i twoul d makepa rsing suc hcode reallydif ficult.
I used an implementation of ALGOL (c. 1978) which—extremely annoyingly—required quoting of what is now known as reserved words, and allowed spaces in identifiers:
"proc" filter = ("proc" ("int") "bool" p, "list" l) "list":
"if" l "is" "nil" "then" "nil"
"elif" p(hd(l)) "then" cons(hd(l), filter(p,tl(l)))
"else" filter(p, tl(l))
"fi";
Also, FORTRAN (the capitalized form means F77 or earlier), was more or less insensitive to spaces. So this could be written:
799 S = FLO AT F (I A+I B+I C) / 2 . 0
A R E A = SQ R T ( S *(S - F L O ATF(IA)) * (S - FLOATF(IB)) *
+ (S - F LOA TF (I C)))
which was syntactically identical to
799 S = FLOATF (IA + IB + IC) / 2.0
AREA = SQRT( S * (S - FLOATF(IA)) * (S - FLOATF(IB)) *
+ (S - FLOATF(IC)))
With that kind of history of abuse, why make parsing difficult for humans? Let alone complicate computer parsing.
Yes, it's the parsing - both human and computer. It's easier to read and easier to parse if you can safely assume that whitespace doesn't matter. Otherwise, you can have potentially ambiguous statements, statements where it's not clear how things go together, statements that are hard to read, etc.
Such a change would make for an ambiguous language in the best of cases. For example, in a C99-like language:
if not foo(int x) {
...
}
is that equivalent to:
A function definition of foo that returns a value of type ifnot:
ifnot foo(int x) {
...
}
A call to a function called notfoo with a variable named intx:
if notfoo(intx) {
...
}
A negated call to a function called foo (with C99's not which means !):
if not foo(intx) {
...
}
This is just a small sample of the ambiguities you might run into.
Update: I just noticed that obviously, in a C99-like language, the condition of an if statement would be enclosed in parentheses. Extra punctuation can help with ambiguities if you choose to ignore whitespace, but your language will end up having lots of extra punctuation wherever you would normally have used whitespace.
Before the interpreter or compiler can build a parse tree, it must perform lexical analysis, turning the stream of characters into a stream of tokens. Consider how you would want the following parsed:
a = 1.2423 / (4343.23 * 2332.2);
And how your rule above would work on it. Hard to know how to lexify it without understanding the meaning of the tokens. It would be really hard to build a parser that did lexification at the same time.
There are a few languages which allow spaces in identifiers. The fact that nearly all languages constrain the set of characters in identifiers is because parsing is more easy and most programmers are accustomed to the compact no-whitespace style.
I don’t think there’s real reason.
Check out Stroustrup's classic Generalizing Overloading for C++2000.
We were allowed to put spaces in filenames back in the 1960's, and computers still don't handle them very well (everything used to break, then most things, now it's just a few things - but they still break).
We simply can't wait another 50 years before our code will work again.
:-)
(And what everyone else said, of course. In English, we use spaces and punctuation to separate the words. The same is true for computer languages, except that computer parsers define "words" in a slightly different sense)
Using space as part of an identifier makes parsing really murky (is that a syntactic space or an identifier?), but the same sort "natural reading" behavior is achieved with keyword arguments. object.save(data: something, atomically: true)
The TikZ language for creating graphics in LaTeX allows whitespace in parameter names (also known as 'keys'). For instance, you see things like
\shade[
top color=yellow!70,
bottom color=red!70,
shading angle={45},
]
In this restricted setting of a comma-separated list of key-value pairs, there's no parsing difficulty. In fact, I think it's much easier to read than the alternatives like topColor, top_color or topcolor.
I am curious to know about this.
whenever I write a function which have to return multiple values, either I have to use pass by reference or create an array store values in it and pass them.
Why all the Object Orinented languages functions are not allowed to return multiple parameters as we pass them as input. Like is there anything inbuilt structure of the language which is restricting from doing this.
Dont you think it will be fun and easy if we are allowed to do so.
It's not true that all Object-Oriented languages follow this paradigm.
e.g. in Python (from here):
def quadcube (x):
return x**2, x**3
a, b = quadcube(3)
a will be 9 and b will be 27.
The difference between the traditional
OutTypeA SomeFunction(out OutTypeB, TypeC someOtherInputParam)
and your
{ OutTypeA, OutTypeB } SomeFunction(TypeC someOtherInputParam)
is just syntactic sugar. Also, the tradition of returning one single parameter type allows writing in the easy readable natural language of result = SomeFunction(...). It's just convenience and ease of use.
And yes, as others said, you have tuples in some languages.
This is likely because of the way processors have been designed and hence carried over to modern languages such as Java or C#. The processor can load multiple things (pointers) into parameter registers but only has one return value register that holds a pointer.
I do agree that not all OOP languages only support returning one value, but for the ones that "apparently" do, this I think is the reason why.
Also for returning a tuple, pair or struct for that matter in C/C++, essentially, the compiler is returning a pointer to that object.
First answer: They don't. many OOP languages allow you to return a tuple. This is true for instance in python, in C++ you have pair<> and in C++0x a fully fledged tuple<> is in TR1.
Second answer: Because that's the way it should be. A method should be short and do only one thing and thus can be argued, only need to return one thing.
In PHP, it is like that because the only way you can receive a value is by assigning the function to a variable (or putting it in place of a variable). Although I know array_map allows you to do return something & something;
To return multiple parameters, you return an single object that contains both of those parameters.
public MyResult GetResult(x)
{
return new MyResult { Squared = Math.Pow(x,2), Cubed = Math.Pow(x,3) };
}
For some languages you can create anonymous types on the fly. For others you have to specify a return object as a concrete class. One observation with OO is you do end up with a lot of little classes.
The syntactic niceties of python (see #Cowan's answer) are up to the language designer. The compiler / runtime could creating an anonymous class to hold the result for you, even in a strongly typed environment like the .net CLR.
Yes it can be easier to read in some circumstances, and yes it would be nice. However, if you read Eric Lippert's blog, you'll often read dialogue's and hear him go on about how there are many nice features that could be implemented, but there's a lot of effort that goes into every feature, and some things just don't make the cut because in the end they can't be justified.
It's not a restriction, it is just the architecture of the Object Oriented and Structured programming paradigms. I don't know if it would be more fun if functions returned more than one value, but it would be sure more messy and complicated. I think the designers of the above programming paradigms thought about it, and they probably had good reasons not to implement that "feature" -it is unnecessary, since you can already return multiple values by packing them in some kind of collection. Programming languages are designed to be compact, so usually unnecessary features are not implemented.
Closed. This question is opinion-based. It is not currently accepting answers.
Closed 11 months ago.
Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
Sorry for the waffly title - if I could come up with a concise title, I wouldn't have to ask the question.
Suppose I have an immutable list type. It has an operation Foo(x) which returns a new immutable list with the specified argument as an extra element at the end. So to build up a list of strings with values "Hello", "immutable", "world" you could write:
var empty = new ImmutableList<string>();
var list1 = empty.Foo("Hello");
var list2 = list1.Foo("immutable");
var list3 = list2.Foo("word");
(This is C# code, and I'm most interested in a C# suggestion if you feel the language is important. It's not fundamentally a language question, but the idioms of the language may be important.)
The important thing is that the existing lists are not altered by Foo - so empty.Count would still return 0.
Another (more idiomatic) way of getting to the end result would be:
var list = new ImmutableList<string>().Foo("Hello")
.Foo("immutable")
.Foo("word");
My question is: what's the best name for Foo?
EDIT 3: As I reveal later on, the name of the type might not actually be ImmutableList<T>, which makes the position clear. Imagine instead that it's TestSuite and that it's immutable because the whole of the framework it's a part of is immutable...
(End of edit 3)
Options I've come up with so far:
Add: common in .NET, but implies mutation of the original list
Cons: I believe this is the normal name in functional languages, but meaningless to those without experience in such languages
Plus: my favourite so far, it doesn't imply mutation to me. Apparently this is also used in Haskell but with slightly different expectations (a Haskell programmer might expect it to add two lists together rather than adding a single value to the other list).
With: consistent with some other immutable conventions, but doesn't have quite the same "additionness" to it IMO.
And: not very descriptive.
Operator overload for + : I really don't like this much; I generally think operators should only be applied to lower level types. I'm willing to be persuaded though!
The criteria I'm using for choosing are:
Gives the correct impression of the result of the method call (i.e. that it's the original list with an extra element)
Makes it as clear as possible that it doesn't mutate the existing list
Sounds reasonable when chained together as in the second example above
Please ask for more details if I'm not making myself clear enough...
EDIT 1: Here's my reasoning for preferring Plus to Add. Consider these two lines of code:
list.Add(foo);
list.Plus(foo);
In my view (and this is a personal thing) the latter is clearly buggy - it's like writing "x + 5;" as a statement on its own. The first line looks like it's okay, until you remember that it's immutable. In fact, the way that the plus operator on its own doesn't mutate its operands is another reason why Plus is my favourite. Without the slight ickiness of operator overloading, it still gives the same connotations, which include (for me) not mutating the operands (or method target in this case).
EDIT 2: Reasons for not liking Add.
Various answers are effectively: "Go with Add. That's what DateTime does, and String has Replace methods etc which don't make the immutability obvious." I agree - there's precedence here. However, I've seen plenty of people call DateTime.Add or String.Replace and expect mutation. There are loads of newsgroup questions (and probably SO ones if I dig around) which are answered by "You're ignoring the return value of String.Replace; strings are immutable, a new string gets returned."
Now, I should reveal a subtlety to the question - the type might not actually be an immutable list, but a different immutable type. In particular, I'm working on a benchmarking framework where you add tests to a suite, and that creates a new suite. It might be obvious that:
var list = new ImmutableList<string>();
list.Add("foo");
isn't going to accomplish anything, but it becomes a lot murkier when you change it to:
var suite = new TestSuite<string, int>();
suite.Add(x => x.Length);
That looks like it should be okay. Whereas this, to me, makes the mistake clearer:
var suite = new TestSuite<string, int>();
suite.Plus(x => x.Length);
That's just begging to be:
var suite = new TestSuite<string, int>().Plus(x => x.Length);
Ideally, I would like my users not to have to be told that the test suite is immutable. I want them to fall into the pit of success. This may not be possible, but I'd like to try.
I apologise for over-simplifying the original question by talking only about an immutable list type. Not all collections are quite as self-descriptive as ImmutableList<T> :)
In situations like that, I usually go with Concat. That usually implies to me that a new object is being created.
var p = listA.Concat(listB);
var k = listA.Concat(item);
I'd go with Cons, for one simple reason: it means exactly what you want it to.
I'm a huge fan of saying exactly what I mean, especially in source code. A newbie will have to look up the definition of Cons only once, but then read and use that a thousand times. I find that, in the long term, it's nicer to work with systems that make the common case easier, even if the up-front cost is a little bit higher.
The fact that it would be "meaningless" to people with no FP experience is actually a big advantage. As you pointed out, all of the other words you found already have some meaning, and that meaning is either slightly different or ambiguous. A new concept should have a new word (or in this case, an old one). I'd rather somebody have to look up the definition of Cons, than to assume incorrectly he knows what Add does.
Other operations borrowed from functional languages often keep their original names, with no apparent catastrophes. I haven't seen any push to come up with synonyms for "map" and "reduce" that sound more familiar to non-FPers, nor do I see any benefit from doing so.
(Full disclosure: I'm a Lisp programmer, so I already know what Cons means.)
Actually I like And, especially in the idiomatic way. I'd especially like it if you had a static readonly property for the Empty list, and perhaps make the constructor private so you always have to build from the empty list.
var list = ImmutableList<string>.Empty.And("Hello")
.And("Immutable")
.And("Word");
Whenever I'm in a jam with nomenclature, I hit up the interwebs.
thesaurus.com returns this for "add":
Definition: adjoin, increase; make
further comment
Synonyms: affix,
annex, ante, append, augment, beef
up, boost, build up, charge up,
continue, cue in, figure in, flesh
out, heat up, hike, hike up, hitch on,
hook on, hook up with, include, jack
up, jazz up, join together, pad,
parlay, piggyback, plug into, pour it
on, reply, run up, say further, slap
on, snowball, soup up, speed up,
spike, step up, supplement, sweeten,
tack on, tag
I like the sound of Adjoin, or more simply Join. That is what you're doing, right? The method could also apply to joining other ImmutableList<>'s.
Personally, I like .With(). If I was using the object, after reading the documentation or the code comments, it would be clear what it does, and it reads ok in the source code.
object.With("My new item as well");
Or, you add "Along" with it.. :)
object.AlongWith("this new item");
I ended up going with Add for all of my Immutable Collections in BclExtras. The reason being is that it's an easy predictable name. I'm not worried so much about people confusing Add with a mutating add since the name of the type is prefixed with Immutable.
For awhile I considered Cons and other functional style names. Eventually I discounted them because they're not nearly as well known. Sure functional programmers will understand but they're not the majority of users.
Other Names: you mentioned:
Plus: I'm wishy/washing on this one. For me this doesn't distinguish it as being a non-mutating operation anymore than Add does
With: Will cause issues with VB (pun intended)
Operator overloading: Discoverability would be an issue
Options I considered:
Concat: String's are Immutable and use this. Unfortunately it's only really good for adding to the end
CopyAdd: Copy what? The source, the list?
AddToNewList: Maybe a good one for List. But what about a Collection, Stack, Queue, etc ...
Unfortunately there doesn't really seem to be a word that is
Definitely an immutable operation
Understandable to the majority of users
Representable in less than 4 words
It gets even more odd when you consider collections other than List. Take for instance Stack. Even first year programmers can tell you that Stacks have a Push/Pop pair of methods. If you create an ImmutableStack and give it a completely different name, lets call it Foo/Fop, you've just added more work for them to use your collection.
Edit: Response to Plus Edit
I see where you're going with Plus. I think a stronger case would actually be Minus for remove. If I saw the following I would certainly wonder what in the world the programmer was thinking
list.Minus(obj);
The biggest problem I have with Plus/Minus or a new pairing is it feels like overkill. The collection itself already has a distinguishing name, the Immutable prefix. Why go further by adding vocabulary whose intent is to add the same distinction as the Immutable prefix already did.
I can see the call site argument. It makes it clearer from the standpoint of a single expression. But in the context of the entire function it seems unnecessary.
Edit 2
Agree that people have definitely been confused by String.Concat and DateTime.Add. I've seen several very bright programmers hit this problem.
However I think ImmutableList is a different argument. There is nothing about String or DateTime that establishes it as Immutable to a programmer. You must simply know that it's immutable via some other source. So the confusion is not unexpected.
ImmutableList does not have that problem because the name defines it's behavior. You could argue that people don't know what Immutable is and I think that's also valid. I certainly didn't know it till about year 2 in college. But you have the same issue with whatever name you choose instead of Add.
Edit 3: What about types like TestSuite which are immutable but do not contain the word?
I think this drives home the idea that you shouldn't be inventing new method names. Namely because there is clearly a drive to make types immutable in order to facilitate parallel operations. If you focus on changing the name of methods for collections, the next step will be the mutating method names on every type you use that is immutable.
I think it would be a more valuable effort to instead focus on making types identifiable as Immutable. That way you can solve the problem without rethinking every mutating method pattern out there.
Now how can you identify TestSuite as Immutable? In todays environment I think there are a few ways
Prefix with Immutable: ImmutableTestSuite
Add an Attribute which describes the level of Immutablitiy. This is certainly less discoverable
Not much else.
My guess/hope is development tools will start helping this problem by making it easy to identify immutable types simply by sight (different color, stronger font, etc ...). But I think that's the answer though over changing all of the method names.
I think this may be one of those rare situations where it's acceptable to overload the + operator. In math terminology, we know that + doesn't append something to the end of something else. It always combines two values together and returns a new resulting value.
For example, it's intuitively obvious that when you say
x = 2 + 2;
the resulting value of x is 4, not 22.
Similarly,
var empty = new ImmutableList<string>();
var list1 = empty + "Hello";
var list2 = list1 + "immutable";
var list3 = list2 + "word";
should make clear what each variable is going to hold. It should be clear that list2 is not changed in the last line, but instead that list3 is assigned the result of appending "word" to list2.
Otherwise, I would just name the function Plus().
To be as clear as possible, you might want to go with the wordier CopyAndAdd, or something similar.
I would call it Extend() or maybe ExtendWith() if you feel like really verbose.
Extends means adding something to something else without changing it. I think this is very relevant terminology in C# since this is similar to the concept of extension methods - they "add" a new method to a class without "touching" the class itself.
Otherwise, if you really want to emphasize that you don't modify the original object at all, using some prefix like Get- looks like unavoidable to me.
Added(), Appended()
I like to use the past tense for operations on immutable objects. It conveys the idea that you aren't changing the original object, and it's easy to recognize when you see it.
Also, because mutating method names are often present-tense verbs, it applies to most of the immutable-method-name-needed cases you run into. For example an immutable stack has the methods "pushed" and "popped".
I like mmyers suggestion of CopyAndAdd. In keeping with a "mutation" theme, maybe you could go with Bud (asexual reproduction), Grow, Replicate, or Evolve? =)
EDIT: To continue with my genetic theme, how about Procreate, implying that a new object is made which is based on the previous one, but with something new added.
This is probably a stretch, but in Ruby there is a commonly used notation for the distinction: add doesn't mutate; add! mutates. If this is an pervasive problem in your project, you could do that too (not necessarily with non-alphabetic characters, but consistently using a notation to indicate mutating/non-mutating methods).
Join seems appropriate.
Maybe the confusion stems from the fact that you want two operations in one. Why not separate them? DSL style:
var list = new ImmutableList<string>("Hello");
var list2 = list.Copy().With("World!");
Copy would return an intermediate object, that's a mutable copy of the original list. With would return a new immutable list.
Update:
But, having an intermediate, mutable collection around is not a good approach. The intermediate object should be contained in the Copy operation:
var list1 = new ImmutableList<string>("Hello");
var list2 = list1.Copy(list => list.Add("World!"));
Now, the Copy operation takes a delegate, which receives a mutable list, so that it can control the copy outcome. It can do much more than appending an element, like removing elements or sorting the list. It can also be used in the ImmutableList constructor to assemble the initial list without intermediary immutable lists.
public ImmutableList<T> Copy(Action<IList<T>> mutate) {
if (mutate == null) return this;
var list = new List<T>(this);
mutate(list);
return new ImmutableList<T>(list);
}
Now there's no possibility of misinterpretation by the users, they will naturally fall into the pit of success.
Yet another update:
If you still don't like the mutable list mention, even now that it's contained, you can design a specification object, that will specify, or script, how the copy operation will transform its list. The usage will be the same:
var list1 = new ImmutableList<string>("Hello");
// rules is a specification object, that takes commands to run in the copied collection
var list2 = list1.Copy(rules => rules.Append("World!"));
Now you can be creative with the rules names and you can only expose the functionality that you want Copy to support, not the entire capabilities of an IList.
For the chaining usage, you can create a reasonable constructor (which will not use chaining, of course):
public ImmutableList(params T[] elements) ...
...
var list = new ImmutableList<string>("Hello", "immutable", "World");
Or use the same delegate in another constructor:
var list = new ImmutableList<string>(rules =>
rules
.Append("Hello")
.Append("immutable")
.Append("World")
);
This assumes that the rules.Append method returns this.
This is what it would look like with your latest example:
var suite = new TestSuite<string, int>(x => x.Length);
var otherSuite = suite.Copy(rules =>
rules
.Append(x => Int32.Parse(x))
.Append(x => x.GetHashCode())
);
A few random thoughts:
ImmutableAdd()
Append()
ImmutableList<T>(ImmutableList<T> originalList, T newItem) Constructor
DateTime in C# uses Add. So why not use the same name? As long the users of your class understand the class is immutable.
I think the key thing you're trying to get at that's hard to express is the nonpermutation, so maybe something with a generative word in it, something like CopyWith() or InstancePlus().
I don't think the English language will let you imply immutability in an unmistakable way while using a verb that means the same thing as "Add". "Plus" almost does it, but people can still make the mistake.
The only way you're going to prevent your users from mistaking the object for something mutable is by making it explicit, either through the name of the object itself or through the name of the method (as with the verbose options like "GetCopyWith" or "CopyAndAdd").
So just go with your favourite, "Plus."
First, an interesting starting point:
http://en.wikipedia.org/wiki/Naming_conventions_(programming) ...In particular, check the "See Also" links at the bottom.
I'm in favor of either Plus or And, effectively equally.
Plus and And are both math-based in etymology. As such, both connote mathematical operation; both yield an expression which reads naturally as expressions which may resolve into a value, which fits with the method having a return value. And bears additional logic connotation, but both words apply intuitively to lists. Add connotes action performed on an object, which conflicts with the method's immutable semantics.
Both are short, which is especially important given the primitiveness of the operation. Simple, frequently-performed operations deserve shorter names.
Expressing immutable semantics is something I prefer to do via context. That is, I'd rather simply imply that this entire block of code has a functional feel; assume everything is immutable. That might just be me, however. I prefer immutability to be the rule; if it's done, it's done a lot in the same place; mutability is the exception.
How about Chain() or Attach()?
I prefer Plus (and Minus). They are easily understandable and map directly to operations involving well known immutable types (the numbers). 2+2 doesn't change the value of 2, it returns a new, equally immutable, value.
Some other possibilities:
Splice()
Graft()
Accrete()
How about mate, mateWith, or coitus, for those who abide. In terms of reproducing mammals are generally considered immutable.
Going to throw Union out there too. Borrowed from SQL.
Apparently I'm the first Obj-C/Cocoa person to answer this question.
NNString *empty = [[NSString alloc] init];
NSString *list1 = [empty stringByAppendingString:#"Hello"];
NSString *list2 = [list1 stringByAppendingString:#"immutable"];
NSString *list3 = [list2 stringByAppendingString:#"word"];
Not going to win any code golf games with this.
I think "Add" or "Plus" sounds fine. The name of the list itself should be enough to convey the list's immutability.
Maybe there are some words which remember me more of making a copy and add stuff to that instead of mutating the instance (like "Concatenate"). But i think having some symmetry for those words for other actions would be good to have too. I don't know of a similar word for "Remove" that i think of the same kind like "Concatenate". "Plus" sounds little strange to me. I wouldn't expect it being used in a non-numerical context. But that could aswell come from my non-english background.
Maybe i would use this scheme
AddToCopy
RemoveFromCopy
InsertIntoCopy
These have their own problems though, when i think about it. One could think they remove something or add something to an argument given. Not sure about it at all. Those words do not play nice in chaining either, i think. Too wordy to type.
Maybe i would just use plain "Add" and friends too. I like how it is used in math
Add 1 to 2 and you get 3
Well, certainly, a 2 remains a 2 and you get a new number. This is about two numbers and not about a list and an element, but i think it has some analogy. In my opinion, add does not necessarily mean you mutate something. I certainly see your point that having a lonely statement containing just an add and not using the returned new object does not look buggy. But I've now also thought some time about that idea of using another name than "add" but i just can't come up with another name, without making me think "hmm, i would need to look at the documentation to know what it is about" because its name differs from what I would expect to be called "add". Just some weird thought about this from litb, not sure it makes sense at all :)
Looking at http://thesaurus.reference.com/browse/add and http://thesaurus.reference.com/browse/plus I found gain and affix but I'm not sure how much they imply non-mutation.
I think that Plus() and Minus() or, alternatively, Including(), Excluding() are reasonable at implying immutable behavior.
However, no naming choice will ever make it perfectly clear to everyone, so I personally believe that a good xml doc comment would go a very long way here. VS throws these right in your face when you write code in the IDE - they're hard to ignore.
Append - because, note that names of the System.String methods suggest that they mutate the instance, but they don't.
Or I quite like AfterAppending:
void test()
{
Bar bar = new Bar();
List list = bar.AfterAppending("foo");
}
list.CopyWith(element)
As does Smalltalk :)
And also list.copyWithout(element) that removes all occurrences of an element, which is most useful when used as list.copyWithout(null) to remove unset elements.
I would go for Add, because I can see the benefit of a better name, but the problem would be to find different names for every other immutable operation which might make the class quite unfamiliar if that makes sense.