Law of Demeter violation proves useful. Am I missing something? [closed] - language-agnostic

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 4 years ago.
Improve this question
I have some code like this in my application. It writes out some XML:-
public void doStuff( Business b, XMLElement x)
{
Foo f = b.getFoo();
// Code doing stuff with f
// b is not mentioned again.
}
As I understand it, the Law of Dementer would say this is bad. "Code Complete" says this is increasing coupling. This method should take "f" in the first place.
public void doStuff( Foo f, XMLElement x)
{
// Code doing stuff with f
}
However, now I have come to change this code, and I do actually need to access a different method on b.
public void doStuff( Business b, XMLElement x)
{
Foo f = b.getFoo();
// Code doing stuff with f
// A different method is called on b.
}
This interface has made life easier as the change is entirely inside the method. I do not have to worry about the many places it is called from around the application.
This suggests to me that the original design was correct. Do you agree? What am I missing?
PS. I do not think the behaviour belongs in b itself, as domain objects do not know about the external representation as XML in this system.

First thing is that this isn't necessarily a Law of Demeter violation, unless you are actually calling methods on the Foo object f in doStuff. If you are not, then you are probably fine; you're only using the interface of the Business object b. So I'll assume that you are calling at least one method on 'f'.
One thing you might be "missing" is testability, specifically unit tests. If you have:
public void doStuff( Business b, XMLElement x)
{
Foo f = b.getFoo();
// stuff using f.someMethod
// business stuff with b
// presumably something with x
}
... then if you want to test that doStuff does the right thing for different Foo, you have to first create (or mock) a new Business object with each Foo 'f' that you want, then plug that object into doStuff (even if the rest of the Business-specific stuff is identical). You're testing at one remove from your method, and while your source code may stay simple, your test code gets messier. So, if you really need both f and b in doStuff, then arguably they should both be parameters.
For more reading on it, this person is one of the most emphatic Law of Demeter campaigners I've come across; frequently provides rationales for it.

I think it's difficult to give you a clear-cut answer because
1) the problem statement is pretty abstract, and
2) there is no "absolute" good design - it depends on what's around your classes too, and what was a good design initially might evolve into something you want to refactor as your system grows and evolves, and your understanding of the domain becomes more refined.
I don't see the first example as a "massive" violation of the Demeter principle, but again everything is in the details, it depends on how much is going on in your commented section - you can always add more indirection if you need to. You could for instance have your method "DoStuff" on a WriteBusinessObjectToXmlService class, and if the amount of work you were doing involving f was growing, you could extract it into its method "DoStuffWithF(f, x)", or even create a separate class WriteFToXmlService, with DoStuff(f, x).

If we follow with this logic further we will come up with an idea that a global-everything-objects-repository object (or service locator) should be used which contains links to everything in the system. Than we will not need to change method signatures at all because this repository is all we need.
The problem is that the purpose of the method has changed, but the signature has not. If Foo is everything the method needs than it should accept Foo only. This way we can tell that it operates solely on Foo. This will communicate the purpose of the method more clearly. If it suddenly needs Business too we need to change the method signature because it should indicate other method purpose and requirements

Maybe it is now justified to pass in Business or the method needs a third parameter: the return type of the other method called on the Business object. It depends on the rest of the body of the doStuff method.

Related

Adding user defined functions to a simple calculator YACC [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 6 years ago.
Improve this question
I've been searching all over the internet for a comprehensible example to how you can define and call a function in a simple calculator interpreter. Maybe I've found the answer but since I'm not familiar with YACC I couldn't see it.
So the question is, how do you set up a symbol table for user defined functions and how do you store/call these functions in a calculator interpreter?
I'm basically looking to achieve something like this:
def sum(a,b) { a + b }
sum(5,5)
result:
10
Any pointers or examples would be appreciated
That's definitely diving in to the concepts required to interpret (or compile) a programming language, which makes it difficult to provide an answer in a format suitable for StackOverflow. Here's a quick outline:
You need a symbol table which can hold both functions and variables. Or two symbol tables. In the first case, the mapped value will be some kind of variant type, such as a discriminated union; you might need that anyway if you have more than one type of variable. In the second case, you can use a specific type for the mapped value of function names. I'd go for the first option, because it allows functions to be first-class objects.
You need some kind of type which represents the "value" of a function definition. The obvious type is the Abstract Syntax Tree (AST) of an expression (or a program), and doing that will generally simplify your code so I'd highly recommend it. That means that the calculator/parser will not actually evaluate 5+5 (even if that is the literal input) or a+b, but rather will return an AST to whoever called the parser. Consequently, you will need:
A function which can evaluate an AST. That's usually straightforward to write, since it's just a depth-first tree walk. But now you need to worry about variable scope because when you do evaluate they body of your function sum, you probably want to only locally set the values of the parameters.
If you manage all that, you will have gone several steps beyond the usual "let's build a calculator with flex and bison" project, and I definitely encourage you to do so. You may want to take a look at the classic text Structure and Interpretation of Computer Programs (Abelson & Sussman, 1996; often referred to simply as SICP).

Multiple logics (N number of clients) handled with only one function. All call the same function, HOW TO? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Want to improve this question? Add details and clarify the problem by editing this post.
Closed 8 years ago.
Improve this question
I have a corporate application written in python/Django (no python experience required to answer this). Its SAAS basically.
Few of the clients seems to have different requirement for a few modules.
Lets say there is a URL
www.xyz.com/groups
which is used by all clients but a few of the clients want to have different output on call of the same URL.
I want to know how can i do that without writing new function for each client or writing conditions in a single function.
Its a silly question i know but there must be some solution to it, i know.
If your code is required to do "A" for case "a" and "B" for case "b" and "C" for case "c", then regardless of what solution you pick, somewhere in the code has to exists something that decides whever or not case 'a/b/c' occurs, and something must exist that will dispatch correct 'A/B/C' action for that case, and of course those A/B/C actions have to be written somewhere in the code, too.
Step outside of the code and think about it. If it is specified and must happen - it has to be coded somewhere. You cannot escape that. Now, if the cases/actions are trivial and typical, you might find some more-or-even-more configurable library that accidentally allows you to configure such cases and actions, and off you go, you have it "with no code" and no "clutter". But looking formally, the code is deep there in the library. So, the decider, dispatcher and actions are coded. Just not by you - by someone that guessed your needs.
But if your needs are nontrivial and highly specific, for example, if it require your various conditions to decide which a/b/c case is it - then most probably you will have to code the 'decider' part for yourself. That means lots of tree-of-IFs, nested-switches, rules-n-loops, or whatever you like or feel adequate. After this, you are left with dispatch/execute phase, and this can be realized in a multitude of ways - i.e. strategy pattern - it is exactly it: dispatch (by concrete class related to case) and execute (the concrete strategy has the concrete code for the case).
Let's try something-like-OO approach:
For example, if you have cases a/b/c for UserTypes U1,U2,U3, you could introduce three classes:
UserType1 inherits from abstract UserType or implements "DoAThing" interface
UserType2 inherits from abstract UserType or implements "DoAThing" interface
UserType3 inherits from abstract UserType or implements "DoAThing" interface
UserType1 implements virtual method 'doTheThing' that executes actionA
UserType2 implements virtual method 'doTheThing' that executes actionB
UserType3 implements virtual method 'doTheThing' that executes actionC
your Users stop keeping "UserType" of type "int" equal to '1/2/3' - now their type is an object: UserType1, UserType2 or UserType3
whenever you must do the thing for a given user, you now just:
result = user.getType().doTheThing( ..params..)
So, instead of iffing/switching, you use OO: tell, don't ask rule. If the action-to-do is dependent solely on UserType, then let the UserType perform it. The resulting code is as short as possible - but at the cost of number of classes to create and, well, ...
... the decider, dispatcher and actions are still in the code. Actions - obvious - in the various usertype clasess. Dispatch - obvious - virtual call by common abstract base method. And decider..? Well: someone at some point had to choose and construct the correct UserType object for the user. If user was stored in the database, if "usertype" is just an integer 1/2/3, then somewhere in your ORM layer those 1/2/3 numbers had to be decoded and translated into UserType1/2/3 classes/objects. That means, that you'd need there a tree-of-ifs or a switch or etc. Or, if you have an ORM smart enough - you just set up a bunch of rules and it did it for you, but that's just again delegating part of the job to more-or-even-more configurable library. Not mentioning that your UserType1/2/3 classes in fact became somewhat .. strategies.
Ok, let's attack the 'choose' part.
You can build a tree of ifs or switches somewhere to decide and assign, but imperative seems to smell. Or, with OO, you can try to polymorphize something so that "it will just do the right thing", but it will not solve anything since again you will have to choose the object type somewhere. So, let's try data-driven: let's use lookups.
we've got five implementations of an action
create a hash/dictionary/map
add usertype1->caseA to the map
add usertype2->caseC to the map
add usertype3->caseB to the map
add usertype4->caseA to the map
add usertype5->caseE to the map
....
now, whenever you have a user and need to decide, just look it up. Instead of a "case" you may hold a ready to use object of a strategy. Or a callable method. Or a typename. Or whatever you need. The point is that instead of writing
if( user.type == 1) { ... }
else if( user.type == 2) ...
or switching, you just look it up:
thing = map[ user.type ]
if ( thing is null ) ???
but, mind that without some care, you might sometimes NOT find a match in the map. And also, the map must be PREDEFINED for ALL CASES. So, simple if X < 100 may turn up into a hundred of entries 0..99 inside the map.
Of course, instead of a map, you may use some rule-engine and you could define a mapping like
X<100 -> caseA
X>=100 -> caseB
and then 'run' the rules against your usertype and obtain a 'result' that will tell you "caseA".
And so on.
Each of the parts - decide, dispatch, execute - you may implement in various ways, shorter or longer, more or less extensible, more or less configurable, as OO/imperative/datadriven/functional/etc - but you cannot escape them:
you have to define the discriminant of the cases
you have to define the implementation of the actions
you have to define the mapping case-to-action
How to do them, is a matter of your aesthetics, language features, frameworks, libraries and .. time you want to spend on creating and mantaining it.

OOP Design Question - Linking to precursors

I'm writing a program to do a search and export the output.
I have three primary objects:
Request
SearchResults
ExportOutput
Each of these objects links to its precursor.
Ie: ExportOutput -> SearchResults -> Request
Is this ok? Should they somehow be more loosely coupled?
Clarification:
Processes later on do use properties and methods on the precursor objects.
Ie:
SendEmail(output.SearchResults.Request.UserEmail, BODY, SUBJECT);
This has a smell even to me. The only way I can think to fix it is have hiding properties in each one, that way I'm only accessing one level
MailAddress UserEmail
{
get { return SearchResults.UserEmail; }
}
which would yeild
SendEmail(output.UserEmail, BODY, SUBJECT);
But again, that's just hiding the problem.
I could copy everything out of the precursor objects into their successors, but that would make ExportOutput really ugly. Is their a better way to factor these objects.
Note: SearchResults implements IDisposable because it links to unmanaged resources (temp files), so I really don't want to just duplicate that in ExportOutput.
If A uses B directly, you cannot:
Reuse A without also reusing B
Test A in isolation from B
Change B without risking breaking A
If instead you designed/programmed to interfaces, you could:
Reuse A without also reusing B - you just need to provide something that implements the same interface as B
Test A in isolation from B - you just need to substitute a Mock Object.
Change B without risking breaking A - because A depends on an interface - not on B
So, at a minimum, I recommend extracting interfaces. Also, this might be a good read for you: the Dependency Inversion Principle (PDF file).
Without knowing your specifics, I would think that results in whatever form would simply be returned from a Request's method (might be more than one such method from a configured Request, like find_first_instance vs. find_all_instances). Then, an Exporter's output method(s) would take results as input. So, I am not envisioning the need to link the objects at all.

Is it bad to perform two different tasks in the same loop? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 3 years ago.
Improve this question
I'm working on a highly-specialized search engine for my database. When the user submits a search request, the engine splits the search terms into an array and loops through. Inside the loop, each search term is examined against several possible scenarios to determine what it could mean. When a search term matches a scenario, a WHERE condition is added to the SQL query. Some terms can have multiple meanings, and in those cases the engine builds a list of suggestions to help the user to narrow the results.
Aside: In case anyone is interested to know, ambigous terms are refined by prefixing them with a keyword. For example, 1954 could be a year or a serial number. The engine will suggest both of these scenarios to the user and modify the search term to either year:1954 or serial:1954.
Building the SQL query and the refine suggestions in the same loop feels somehow wrong to me, but to separate them would add more overhead because I would have to loop through the same array twice and test all the same scenarios twice. What is the better course of action?
I'd probably factor out the two actions into their own functions. Then you'd have
foreach (term in terms) {
doThing1();
doThing2();
}
which is nice and clean.
No. It's not bad. I would think looping twice would be more confusing.
Arguably some of the tasks might be put into functions if the tasks are decoupled enough from each other, however.
I don't think it makes sense to add multiple loops for the sake of theoretical purity, especially given that if you're going to add a loop against multiple scenarios you're going from an O(n) -> O(n*#scenarios). Another way to break this out without falling into the "God Method" trap would be to have a method that runs a single loop and returns an array of matches, and another that runs the search for each element in the match array.
Using the same loop seems as a valid optimization to me, try to keep the code of the two tasks independent so this optimization can be changed if necessary.
Your scenario fits the builder pattern and if each operation is fairly complex then it would serve you well to break things up a bit. This is waaaaaay over engineering if all your logic fits in 50 lines of code, but if you have dependencies to manage and complex logic, then you should be using a proven design pattern to achieve separation of concerns. It might look like this:
var relatedTermsBuilder = new RelatedTermsBuilder();
var whereClauseBuilder = new WhereClauseBuilder();
var compositeBuilder = new CompositeBuilder()
.Add(relatedTermsBuilder)
.Add(whereClauseBuilder);
var parser = new SearchTermParser(compositeBuilder);
parser.Execute("the search phrase");
string[] related = relatedTermsBuilder.Result;
string whereClause = whereClauseBuilder.Result;
The supporting objects would look like:
public interface ISearchTermBuilder {
void Build(string term);
}
public class SearchTermParser {
private readonly ISearchTermBuilder builder;
public SearchTermParser(ISearchTermBuilder builder) {
this.builder = builder;
}
public void Execute(string phrase) {
foreach (var term in Parse(phrase)) {
builder.Build(term);
}
}
private static IEnumerable<string> Parse(string phrase) {
throw new NotImplementedException();
}
}
I'd call it a code smell, but not a very bad one. I would separate out the functionality inside the loop, putting one of the things first, and then after a blank line and/or comment the other one.
I would look to it as if it were an instance of the observer pattern: each time you loop you raise an event, and as many observers as you want can subscribe to it. Of course it would be overkill to do it as the pattern but the similarities tell me that it is just fine to execute two or three or how many actions you want.
I don't think it's wrong to make two actions in one loop. I'd even suggest to make two methods that are called from inside the loop, like:
for (...) {
refineSuggestions(..)
buildQuery();
}
On the other hand, O(n) = O(2n)
So don't worry too much - it isn't such a performance sin.
You could certainly run two loops.
If a lot of this is business logic, you could also create some kind of data structure in the first loop, and then use that to generate the SQL, something like
search_objects = []
loop through term in terms
search_object = {}
search_object.string = term
// suggestion & rules code
search_object.suggestion = suggestion
search_object.rule = { 'contains', 'term' }
search_objects.push(search_object)
loop through search_object in search_objects
//generate SQL based on search_object.rule
This at least saves you from having to do if/then/elses in both loops, and I think it is a bit cleaner to move SQL code creation outside of the first loop.
If the things you're doing in the loop are related, then fine. It probably makes sense to code "the stuff for each iteration" and then wrap it in a loop, since that;s probably how you think of it in your head.
Add a comment and if it gets too long, look at splitting it or using simple utility methods.
I think one could argue that this may not exactly be language-agnostic; it's also highly dependent on what you're trying to accomplish. If you're putting multiple tasks in a loop in such a way that they cannot be easily parallelized by the compiler for a parallel environment, then it is definitely a code smell.

What's the best name for a non-mutating "add" method on an immutable collection? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 11 months ago.
Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
Sorry for the waffly title - if I could come up with a concise title, I wouldn't have to ask the question.
Suppose I have an immutable list type. It has an operation Foo(x) which returns a new immutable list with the specified argument as an extra element at the end. So to build up a list of strings with values "Hello", "immutable", "world" you could write:
var empty = new ImmutableList<string>();
var list1 = empty.Foo("Hello");
var list2 = list1.Foo("immutable");
var list3 = list2.Foo("word");
(This is C# code, and I'm most interested in a C# suggestion if you feel the language is important. It's not fundamentally a language question, but the idioms of the language may be important.)
The important thing is that the existing lists are not altered by Foo - so empty.Count would still return 0.
Another (more idiomatic) way of getting to the end result would be:
var list = new ImmutableList<string>().Foo("Hello")
.Foo("immutable")
.Foo("word");
My question is: what's the best name for Foo?
EDIT 3: As I reveal later on, the name of the type might not actually be ImmutableList<T>, which makes the position clear. Imagine instead that it's TestSuite and that it's immutable because the whole of the framework it's a part of is immutable...
(End of edit 3)
Options I've come up with so far:
Add: common in .NET, but implies mutation of the original list
Cons: I believe this is the normal name in functional languages, but meaningless to those without experience in such languages
Plus: my favourite so far, it doesn't imply mutation to me. Apparently this is also used in Haskell but with slightly different expectations (a Haskell programmer might expect it to add two lists together rather than adding a single value to the other list).
With: consistent with some other immutable conventions, but doesn't have quite the same "additionness" to it IMO.
And: not very descriptive.
Operator overload for + : I really don't like this much; I generally think operators should only be applied to lower level types. I'm willing to be persuaded though!
The criteria I'm using for choosing are:
Gives the correct impression of the result of the method call (i.e. that it's the original list with an extra element)
Makes it as clear as possible that it doesn't mutate the existing list
Sounds reasonable when chained together as in the second example above
Please ask for more details if I'm not making myself clear enough...
EDIT 1: Here's my reasoning for preferring Plus to Add. Consider these two lines of code:
list.Add(foo);
list.Plus(foo);
In my view (and this is a personal thing) the latter is clearly buggy - it's like writing "x + 5;" as a statement on its own. The first line looks like it's okay, until you remember that it's immutable. In fact, the way that the plus operator on its own doesn't mutate its operands is another reason why Plus is my favourite. Without the slight ickiness of operator overloading, it still gives the same connotations, which include (for me) not mutating the operands (or method target in this case).
EDIT 2: Reasons for not liking Add.
Various answers are effectively: "Go with Add. That's what DateTime does, and String has Replace methods etc which don't make the immutability obvious." I agree - there's precedence here. However, I've seen plenty of people call DateTime.Add or String.Replace and expect mutation. There are loads of newsgroup questions (and probably SO ones if I dig around) which are answered by "You're ignoring the return value of String.Replace; strings are immutable, a new string gets returned."
Now, I should reveal a subtlety to the question - the type might not actually be an immutable list, but a different immutable type. In particular, I'm working on a benchmarking framework where you add tests to a suite, and that creates a new suite. It might be obvious that:
var list = new ImmutableList<string>();
list.Add("foo");
isn't going to accomplish anything, but it becomes a lot murkier when you change it to:
var suite = new TestSuite<string, int>();
suite.Add(x => x.Length);
That looks like it should be okay. Whereas this, to me, makes the mistake clearer:
var suite = new TestSuite<string, int>();
suite.Plus(x => x.Length);
That's just begging to be:
var suite = new TestSuite<string, int>().Plus(x => x.Length);
Ideally, I would like my users not to have to be told that the test suite is immutable. I want them to fall into the pit of success. This may not be possible, but I'd like to try.
I apologise for over-simplifying the original question by talking only about an immutable list type. Not all collections are quite as self-descriptive as ImmutableList<T> :)
In situations like that, I usually go with Concat. That usually implies to me that a new object is being created.
var p = listA.Concat(listB);
var k = listA.Concat(item);
I'd go with Cons, for one simple reason: it means exactly what you want it to.
I'm a huge fan of saying exactly what I mean, especially in source code. A newbie will have to look up the definition of Cons only once, but then read and use that a thousand times. I find that, in the long term, it's nicer to work with systems that make the common case easier, even if the up-front cost is a little bit higher.
The fact that it would be "meaningless" to people with no FP experience is actually a big advantage. As you pointed out, all of the other words you found already have some meaning, and that meaning is either slightly different or ambiguous. A new concept should have a new word (or in this case, an old one). I'd rather somebody have to look up the definition of Cons, than to assume incorrectly he knows what Add does.
Other operations borrowed from functional languages often keep their original names, with no apparent catastrophes. I haven't seen any push to come up with synonyms for "map" and "reduce" that sound more familiar to non-FPers, nor do I see any benefit from doing so.
(Full disclosure: I'm a Lisp programmer, so I already know what Cons means.)
Actually I like And, especially in the idiomatic way. I'd especially like it if you had a static readonly property for the Empty list, and perhaps make the constructor private so you always have to build from the empty list.
var list = ImmutableList<string>.Empty.And("Hello")
.And("Immutable")
.And("Word");
Whenever I'm in a jam with nomenclature, I hit up the interwebs.
thesaurus.com returns this for "add":
Definition: adjoin, increase; make
further comment
Synonyms: affix,
annex, ante, append, augment, beef
up, boost, build up, charge up,
continue, cue in, figure in, flesh
out, heat up, hike, hike up, hitch on,
hook on, hook up with, include, jack
up, jazz up, join together, pad,
parlay, piggyback, plug into, pour it
on, reply, run up, say further, slap
on, snowball, soup up, speed up,
spike, step up, supplement, sweeten,
tack on, tag
I like the sound of Adjoin, or more simply Join. That is what you're doing, right? The method could also apply to joining other ImmutableList<>'s.
Personally, I like .With(). If I was using the object, after reading the documentation or the code comments, it would be clear what it does, and it reads ok in the source code.
object.With("My new item as well");
Or, you add "Along" with it.. :)
object.AlongWith("this new item");
I ended up going with Add for all of my Immutable Collections in BclExtras. The reason being is that it's an easy predictable name. I'm not worried so much about people confusing Add with a mutating add since the name of the type is prefixed with Immutable.
For awhile I considered Cons and other functional style names. Eventually I discounted them because they're not nearly as well known. Sure functional programmers will understand but they're not the majority of users.
Other Names: you mentioned:
Plus: I'm wishy/washing on this one. For me this doesn't distinguish it as being a non-mutating operation anymore than Add does
With: Will cause issues with VB (pun intended)
Operator overloading: Discoverability would be an issue
Options I considered:
Concat: String's are Immutable and use this. Unfortunately it's only really good for adding to the end
CopyAdd: Copy what? The source, the list?
AddToNewList: Maybe a good one for List. But what about a Collection, Stack, Queue, etc ...
Unfortunately there doesn't really seem to be a word that is
Definitely an immutable operation
Understandable to the majority of users
Representable in less than 4 words
It gets even more odd when you consider collections other than List. Take for instance Stack. Even first year programmers can tell you that Stacks have a Push/Pop pair of methods. If you create an ImmutableStack and give it a completely different name, lets call it Foo/Fop, you've just added more work for them to use your collection.
Edit: Response to Plus Edit
I see where you're going with Plus. I think a stronger case would actually be Minus for remove. If I saw the following I would certainly wonder what in the world the programmer was thinking
list.Minus(obj);
The biggest problem I have with Plus/Minus or a new pairing is it feels like overkill. The collection itself already has a distinguishing name, the Immutable prefix. Why go further by adding vocabulary whose intent is to add the same distinction as the Immutable prefix already did.
I can see the call site argument. It makes it clearer from the standpoint of a single expression. But in the context of the entire function it seems unnecessary.
Edit 2
Agree that people have definitely been confused by String.Concat and DateTime.Add. I've seen several very bright programmers hit this problem.
However I think ImmutableList is a different argument. There is nothing about String or DateTime that establishes it as Immutable to a programmer. You must simply know that it's immutable via some other source. So the confusion is not unexpected.
ImmutableList does not have that problem because the name defines it's behavior. You could argue that people don't know what Immutable is and I think that's also valid. I certainly didn't know it till about year 2 in college. But you have the same issue with whatever name you choose instead of Add.
Edit 3: What about types like TestSuite which are immutable but do not contain the word?
I think this drives home the idea that you shouldn't be inventing new method names. Namely because there is clearly a drive to make types immutable in order to facilitate parallel operations. If you focus on changing the name of methods for collections, the next step will be the mutating method names on every type you use that is immutable.
I think it would be a more valuable effort to instead focus on making types identifiable as Immutable. That way you can solve the problem without rethinking every mutating method pattern out there.
Now how can you identify TestSuite as Immutable? In todays environment I think there are a few ways
Prefix with Immutable: ImmutableTestSuite
Add an Attribute which describes the level of Immutablitiy. This is certainly less discoverable
Not much else.
My guess/hope is development tools will start helping this problem by making it easy to identify immutable types simply by sight (different color, stronger font, etc ...). But I think that's the answer though over changing all of the method names.
I think this may be one of those rare situations where it's acceptable to overload the + operator. In math terminology, we know that + doesn't append something to the end of something else. It always combines two values together and returns a new resulting value.
For example, it's intuitively obvious that when you say
x = 2 + 2;
the resulting value of x is 4, not 22.
Similarly,
var empty = new ImmutableList<string>();
var list1 = empty + "Hello";
var list2 = list1 + "immutable";
var list3 = list2 + "word";
should make clear what each variable is going to hold. It should be clear that list2 is not changed in the last line, but instead that list3 is assigned the result of appending "word" to list2.
Otherwise, I would just name the function Plus().
To be as clear as possible, you might want to go with the wordier CopyAndAdd, or something similar.
I would call it Extend() or maybe ExtendWith() if you feel like really verbose.
Extends means adding something to something else without changing it. I think this is very relevant terminology in C# since this is similar to the concept of extension methods - they "add" a new method to a class without "touching" the class itself.
Otherwise, if you really want to emphasize that you don't modify the original object at all, using some prefix like Get- looks like unavoidable to me.
Added(), Appended()
I like to use the past tense for operations on immutable objects. It conveys the idea that you aren't changing the original object, and it's easy to recognize when you see it.
Also, because mutating method names are often present-tense verbs, it applies to most of the immutable-method-name-needed cases you run into. For example an immutable stack has the methods "pushed" and "popped".
I like mmyers suggestion of CopyAndAdd. In keeping with a "mutation" theme, maybe you could go with Bud (asexual reproduction), Grow, Replicate, or Evolve? =)
EDIT: To continue with my genetic theme, how about Procreate, implying that a new object is made which is based on the previous one, but with something new added.
This is probably a stretch, but in Ruby there is a commonly used notation for the distinction: add doesn't mutate; add! mutates. If this is an pervasive problem in your project, you could do that too (not necessarily with non-alphabetic characters, but consistently using a notation to indicate mutating/non-mutating methods).
Join seems appropriate.
Maybe the confusion stems from the fact that you want two operations in one. Why not separate them? DSL style:
var list = new ImmutableList<string>("Hello");
var list2 = list.Copy().With("World!");
Copy would return an intermediate object, that's a mutable copy of the original list. With would return a new immutable list.
Update:
But, having an intermediate, mutable collection around is not a good approach. The intermediate object should be contained in the Copy operation:
var list1 = new ImmutableList<string>("Hello");
var list2 = list1.Copy(list => list.Add("World!"));
Now, the Copy operation takes a delegate, which receives a mutable list, so that it can control the copy outcome. It can do much more than appending an element, like removing elements or sorting the list. It can also be used in the ImmutableList constructor to assemble the initial list without intermediary immutable lists.
public ImmutableList<T> Copy(Action<IList<T>> mutate) {
if (mutate == null) return this;
var list = new List<T>(this);
mutate(list);
return new ImmutableList<T>(list);
}
Now there's no possibility of misinterpretation by the users, they will naturally fall into the pit of success.
Yet another update:
If you still don't like the mutable list mention, even now that it's contained, you can design a specification object, that will specify, or script, how the copy operation will transform its list. The usage will be the same:
var list1 = new ImmutableList<string>("Hello");
// rules is a specification object, that takes commands to run in the copied collection
var list2 = list1.Copy(rules => rules.Append("World!"));
Now you can be creative with the rules names and you can only expose the functionality that you want Copy to support, not the entire capabilities of an IList.
For the chaining usage, you can create a reasonable constructor (which will not use chaining, of course):
public ImmutableList(params T[] elements) ...
...
var list = new ImmutableList<string>("Hello", "immutable", "World");
Or use the same delegate in another constructor:
var list = new ImmutableList<string>(rules =>
rules
.Append("Hello")
.Append("immutable")
.Append("World")
);
This assumes that the rules.Append method returns this.
This is what it would look like with your latest example:
var suite = new TestSuite<string, int>(x => x.Length);
var otherSuite = suite.Copy(rules =>
rules
.Append(x => Int32.Parse(x))
.Append(x => x.GetHashCode())
);
A few random thoughts:
ImmutableAdd()
Append()
ImmutableList<T>(ImmutableList<T> originalList, T newItem) Constructor
DateTime in C# uses Add. So why not use the same name? As long the users of your class understand the class is immutable.
I think the key thing you're trying to get at that's hard to express is the nonpermutation, so maybe something with a generative word in it, something like CopyWith() or InstancePlus().
I don't think the English language will let you imply immutability in an unmistakable way while using a verb that means the same thing as "Add". "Plus" almost does it, but people can still make the mistake.
The only way you're going to prevent your users from mistaking the object for something mutable is by making it explicit, either through the name of the object itself or through the name of the method (as with the verbose options like "GetCopyWith" or "CopyAndAdd").
So just go with your favourite, "Plus."
First, an interesting starting point:
http://en.wikipedia.org/wiki/Naming_conventions_(programming) ...In particular, check the "See Also" links at the bottom.
I'm in favor of either Plus or And, effectively equally.
Plus and And are both math-based in etymology. As such, both connote mathematical operation; both yield an expression which reads naturally as expressions which may resolve into a value, which fits with the method having a return value. And bears additional logic connotation, but both words apply intuitively to lists. Add connotes action performed on an object, which conflicts with the method's immutable semantics.
Both are short, which is especially important given the primitiveness of the operation. Simple, frequently-performed operations deserve shorter names.
Expressing immutable semantics is something I prefer to do via context. That is, I'd rather simply imply that this entire block of code has a functional feel; assume everything is immutable. That might just be me, however. I prefer immutability to be the rule; if it's done, it's done a lot in the same place; mutability is the exception.
How about Chain() or Attach()?
I prefer Plus (and Minus). They are easily understandable and map directly to operations involving well known immutable types (the numbers). 2+2 doesn't change the value of 2, it returns a new, equally immutable, value.
Some other possibilities:
Splice()
Graft()
Accrete()
How about mate, mateWith, or coitus, for those who abide. In terms of reproducing mammals are generally considered immutable.
Going to throw Union out there too. Borrowed from SQL.
Apparently I'm the first Obj-C/Cocoa person to answer this question.
NNString *empty = [[NSString alloc] init];
NSString *list1 = [empty stringByAppendingString:#"Hello"];
NSString *list2 = [list1 stringByAppendingString:#"immutable"];
NSString *list3 = [list2 stringByAppendingString:#"word"];
Not going to win any code golf games with this.
I think "Add" or "Plus" sounds fine. The name of the list itself should be enough to convey the list's immutability.
Maybe there are some words which remember me more of making a copy and add stuff to that instead of mutating the instance (like "Concatenate"). But i think having some symmetry for those words for other actions would be good to have too. I don't know of a similar word for "Remove" that i think of the same kind like "Concatenate". "Plus" sounds little strange to me. I wouldn't expect it being used in a non-numerical context. But that could aswell come from my non-english background.
Maybe i would use this scheme
AddToCopy
RemoveFromCopy
InsertIntoCopy
These have their own problems though, when i think about it. One could think they remove something or add something to an argument given. Not sure about it at all. Those words do not play nice in chaining either, i think. Too wordy to type.
Maybe i would just use plain "Add" and friends too. I like how it is used in math
Add 1 to 2 and you get 3
Well, certainly, a 2 remains a 2 and you get a new number. This is about two numbers and not about a list and an element, but i think it has some analogy. In my opinion, add does not necessarily mean you mutate something. I certainly see your point that having a lonely statement containing just an add and not using the returned new object does not look buggy. But I've now also thought some time about that idea of using another name than "add" but i just can't come up with another name, without making me think "hmm, i would need to look at the documentation to know what it is about" because its name differs from what I would expect to be called "add". Just some weird thought about this from litb, not sure it makes sense at all :)
Looking at http://thesaurus.reference.com/browse/add and http://thesaurus.reference.com/browse/plus I found gain and affix but I'm not sure how much they imply non-mutation.
I think that Plus() and Minus() or, alternatively, Including(), Excluding() are reasonable at implying immutable behavior.
However, no naming choice will ever make it perfectly clear to everyone, so I personally believe that a good xml doc comment would go a very long way here. VS throws these right in your face when you write code in the IDE - they're hard to ignore.
Append - because, note that names of the System.String methods suggest that they mutate the instance, but they don't.
Or I quite like AfterAppending:
void test()
{
Bar bar = new Bar();
List list = bar.AfterAppending("foo");
}
list.CopyWith(element)
As does Smalltalk :)
And also list.copyWithout(element) that removes all occurrences of an element, which is most useful when used as list.copyWithout(null) to remove unset elements.
I would go for Add, because I can see the benefit of a better name, but the problem would be to find different names for every other immutable operation which might make the class quite unfamiliar if that makes sense.