Python equivalent of PHP's @ - exception

Is there a Python equivalent of PHP's @?
@function_which_is_doomed_to_fail();
I've always used this block:
try:
    foo()
except:
    pass
But I know there has to be a better way.
Does anyone know how I can Pythonicify that code?
I think adding some context to that code would be appropriate:
for line in blkid:
    line = line.strip()
    partition = Partition()
    try:
        partition.identifier = re.search(r'^(/dev/[a-zA-Z0-9]+)', line).group(0)
    except:
        pass
    try:
        partition.label = re.search(r'LABEL="((?:[^"\\]|\\.)*)"', line).group(1)
    except:
        pass
    try:
        partition.uuid = re.search(r'UUID="((?:[^"\\]|\\.)*)"', line).group(1)
    except:
        pass
    try:
        partition.type = re.search(r'TYPE="((?:[^"\\]|\\.)*)"', line).group(1)
    except:
        pass
    partitions.add(partition)

What you are looking for is anti-pythonic, because:
The Zen of Python, by Tim Peters
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than right now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
In your case, I would use something like this:
match = re.search(r'^(/dev/[a-zA-Z0-9]+)', line)
if match:
    partition.identifier = match.group(0)
And you have 3 lines instead of 4.

There is no better way. Silently ignoring errors is bad practice in any language, so it's naturally not Pythonic.

Building upon Gabi Purcanu's answer and your desire to condense to one-liners, you could encapsulate his solution into a function and reduce your example:
def cond_match(regexp, line, grp):
    match = re.search(regexp, line)
    if match:
        return match.group(grp)
    else:
        return None
for line in blkid:
    line = line.strip()
    partition = Partition()
    partition.identifier = cond_match(r'^(/dev/[a-zA-Z0-9]+)', line, 0)
    partition.label = cond_match(r'LABEL="((?:[^"\\]|\\.)*)"', line, 1)
    partition.uuid = cond_match(r'UUID="((?:[^"\\]|\\.)*)"', line, 1)
    partition.type = cond_match(r'TYPE="((?:[^"\\]|\\.)*)"', line, 1)
    partitions.add(partition)

Please don't ask for Python to be like PHP. You should always explicitly trap the most specific error you can. Catching and ignoring all errors like that is not good practice, because it can hide other problems and make bugs harder to find. But in the case of REs, you should really check for the None value that re.search returns. For example, your code:
label = re.search(r'LABEL="((?:[^"\\]|\\.)*)"', line).group(1)
raises an AttributeError if there is no match, because re.search returns None in that case. But what if there was a match but you had a typo in your code:
label = re.search(r'LABEL="((?:[^"\\]|\\.)*)"', line).roup(1)
This also raises an AttributeError, even if there was a match. But using the catchall exception and ignoring it would mask that error from you. You will never match a label in that case, and you would never know it until you found it some other way, such as by eventually noticing that your code never matches a label (but hopefully you have unit tests for that case...)
For REs, the usual pattern is this:
matchobj = re.search(r'LABEL="((?:[^"\\]|\\.)*)"', line)
if matchobj:
    label = matchobj.group(1)
No need to try and catch an exception here since there would not be one. Except... when there was an exception caused by a similar typo.
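To make the masking problem concrete, here is a small sketch. The sample line and the two helper names are made up for illustration; the .roup typo is the one discussed above.

```python
import re

LABEL_RE = r'LABEL="((?:[^"\\]|\\.)*)"'
# Made-up sample of what one blkid output line might look like.
line = '/dev/sda1: LABEL="boot" UUID="1234-ABCD" TYPE="ext4"'


def label_with_bare_except(text):
    """Bare except: the .roup typo is swallowed, so no label is ever returned."""
    try:
        return re.search(LABEL_RE, text).roup(1)  # deliberate typo for .group
    except:
        return None


def label_with_check(text):
    """Explicit None check: only the 'no match' case is handled."""
    match = re.search(LABEL_RE, text)
    if match:
        return match.group(1)
    return None


print(label_with_bare_except(line))  # None, although the line has a label
print(label_with_check(line))        # boot
```

With the explicit check, the same typo would surface immediately as an AttributeError instead of quietly producing partitions without labels.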

Use data-driven design instead of repeating yourself. Naming the relevant group also makes it easier to avoid group indexing bugs:
_components = dict(
    identifier=re.compile(r'^(?P<value>/dev/[a-zA-Z0-9]+)'),
    label=re.compile(r'LABEL="(?P<value>(?:[^"\\]|\\.)*)"'),
    uuid=re.compile(r'UUID="(?P<value>(?:[^"\\]|\\.)*)"'),
    type=re.compile(r'TYPE="(?P<value>(?:[^"\\]|\\.)*)"'),
)
for line in blkid:
    line = line.strip()
    partition = Partition()
    # .items() is needed here: iterating the dict directly yields only the keys
    for name, pattern in _components.items():
        match = pattern.search(line)
        value = match.group('value') if match else None
        setattr(partition, name, value)
    partitions.add(partition)

There is warnings control in Python - http://docs.python.org/library/warnings.html
After edit:
You probably want to check if it is not None before trying to get the groups.
Also use len() on the groups to see how many groups you have got. "Pass"ing the error is definitely not the way to go.

Related

Does JSON.stringify a string protect against (My)SQL injection?

I've run across some node.js code that gets a user-supplied string, calls JSON.stringify(str) and injects the value directly into an SQL statement.
e.g.
var x = JSON.stringify(UNSAFE_USER_STRING);
mysql_execute('UPDATE foo SET v = ' + x + ' WHERE id = 1');
Obviously this is an abuse of JSON.stringify; however, this is not my code and the authors would like to see an attack vector before they patch it. Because UNSAFE_USER_STRING is a string, not an object, and JSON.stringify escapes the obvious " and \ characters, it's not obvious whether there is a serious problem.
Is this code safe? And if not, could someone demonstrate what would be unsafe input?
Thanks!
If you are sure x is a string, then I'm 99% sure this makes it impossible to conduct an SQL injection attack. My confidence goes down to 90% when you are unsure of the type for x. That said, considering all of the following, it should not pose a vulnerability:
Null, NaN, Infinity, -Infinity all seem to come back as null which is safe.
Undefined comes back as the value undefined, not a string, so I'm not sure about that. I think it would just be considered invalid SQL rather than pose a vulnerability.
Date in node.js JSON.stringify(new Date()) returns '"2015-11-09T18:53:46.198Z"' which is exactly what you'd want.
Arrays and Objects should result in invalid SQL although a smart conversion could enable successful use of SQL arrays. That said, there might be some tricky way to fill the array with Objects that might cause a vulnerability, but I doubt it.
Hex seems to just convert it to an integer.
Buffers and Uint8Arrays seem to come back as objects. Again, there might be some way to populate the Object with something that would be a vulnerability, but I doubt it.
Even if characters like " are being escaped, character combinations used for comments, like -- or #, could still cause the WHERE clause to be ignored.

Nokogiri returns "no method error"

I keep getting the same error in my program. I've written a method that takes some messy HTML and turns it into neater strings. This works fine on its own, however when I run the whole program I get the following error:
kamer.rb:9:in `normalise_instrumentation': undefined method `split' for #<Nokogiri::XML::NodeSet:0x007f92cb93bfb0> (NoMethodError)
I'd be really grateful for any info or advice on why this happens and how to stop it.
The code is here:
require 'nokogiri'
require 'open-uri'
def normalise_instrumentation(instrumentation)
  messy_array = instrumentation.split('.')
  normal_array = []
  messy_array.each do |section|
    if section =~ /\A\d+\z/
      normal_array << section
    end
  end
  return normal_array
end
doc = Nokogiri::HTML(open('http://www.cs.vu.nl/~rutger/vuko/nl/lijst_van_ooit/complete-solo.html'))
table = doc.css('table[summary=works] tr')
work_value = []
work_hash = {}
table.each do |row|
  piece = [row.css('td[1]'), row.css('td[2]'), row.css('td[3]')].map { |r|
    r.text.strip!
  }
  work_value = work_value.push(piece)
  work_key = normalise_instrumentation(row.css('td[3]'))
  work_hash[work_key] = work_value
end
puts work_hash
The problem is here:
row.css('td[3]')
Here's why:
row.css('td[3]').class
# => Nokogiri::XML::NodeSet < Object
You're creating your piece array which then becomes an array of NodeSets, which is probably not what you want, because text against a NodeSet often returns a weird String of concatenated text from multiple nodes. You're not seeing that happen here because you're searching inside a row (<tr>) but if you were to look one level up, in the <table>, you'd have a cocked gun pointed at your foot.
Passing a NodeSet to your normalise_instrumentation method is a problem because NodeSet doesn't have a split method, which is the error you're seeing.
But, it gets worse before it gets better. css, like search and xpath, returns a NodeSet, which is akin to an Array. Passing an array-like critter to the method will still result in confusion, because you really want just the Node found, not a set of Nodes. So I'd probably use:
row.at('td[3]')
which will return only the node.
At this point you probably want the text of that node, something like
row.at('td[3]').text
would make more sense because then the method would receive a String, which does have a split method.
However, it appears there are additional problems, because some of the cells you want don't exist, so you'll get nil values also.
This isn't one of my better answers, because I'm still trying to grok what you're doing. Providing us with a minimal example of the HTML you need to parse, and the output you want to capture, will help us fine-tune your code to get what you want.
I had a similar error (undefined method) for a different reason; in my case it was due to an extra dot (put there by mistake), like this:
status = data.css.("status font-large").text
It was fixed by removing the extra dot after css, as shown below:
status = data.css("status font-large").text
I hope this helps someone else.

Elegant way to handle "impossible" code paths

Occasionally I'll have a situation where I've written some code and, based on its logic, a certain path is impossible. For example:
activeGames = [10, 20, 30]
limit = 4
def getBestActiveGameStat():
    if not activeGames: return None
    return max(activeGames)
def bah():
    if limit == 0: return "Limit is 0"
    if len(activeGames) >= limit:
        somestat = getBestActiveGameStat()
        if somestat is None:
            print("The universe has exploded")
        #etc...
What would go in the universe exploding line? If limit is 0, then the function returns. If len(activeGames) >= limit, then there must be at least one active game, so getBestActiveGameStat() can't return None. So, should I even check for it?
The same also happens with something like a while loop which always returns in the loop:
def hmph():
    while condition:
        if foo: return "yep"
        doStuffToMakeFooTrue()
    raise SingularityFlippedMyBitsError()
Since I "know" it's impossible, should anything even be there?
If len(activeGames) >= limit, then there must be at least one active game, so getBestActiveGameStat() can't return None. So, should I even check for it?
Sometimes we make mistakes. You could have a program error now -- or someone could create one later.
Those errors might result in exceptions or failed unit tests. But debugging is expensive; it's useful to have multiple ways to detect errors.
A quickly written assert statement can express an expected invariant to human readers. And when debugging, a failed assertion can pinpoint an error quickly.
Sutter and Alexandrescu address this issue in "C++ Coding Standards." Despite the title, their arguments and guidelines are language agnostic.
Assert liberally to document internal assumptions and invariants
... Use assert or an equivalent liberally to document assumptions internal to a module ... that must always be true and otherwise represent programming errors.
For example, if the default case in a switch statement cannot occur, add the case with assert(false).
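Applied to the example from the question, the guideline might look like the sketch below. The functions take the list and limit as parameters and return placeholder strings; both are my own choices for illustration, since the original elides what bah() does next.

```python
def get_best_active_game_stat(games):
    """Return the largest stat, or None when there are no active games."""
    if not games:
        return None
    return max(games)


def bah(games, limit):
    if limit == 0:
        return "Limit is 0"
    if len(games) >= limit:
        somestat = get_best_active_game_stat(games)
        # Expected invariant, assuming limit is never negative:
        # len(games) >= limit > 0, so games cannot be empty here.
        assert somestat is not None, "no best stat although games should be non-empty"
        return "Best stat: %d" % somestat  # placeholder for the real work
    return "Below limit"  # placeholder


print(bah([10, 20, 30], 3))  # Best stat: 30
print(bah([10, 20, 30], 4))  # Below limit
```

Note that a negative limit with an empty list would trip the assertion, which is exactly the kind of overlooked case it is there to surface. Keep in mind that Python skips assert statements when run with -O, so they document and check assumptions during development rather than guard production code.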
IMHO, the first example is really more a question of how catastrophic failures are presented to the user. In the event that someone does something really silly and sets activeGames to none, most languages will throw a NullPointer/InvalidReference type of exception. If you have a good system for catching these kinds of errors and handling them elegantly, then I would argue that you leave these guards out entirely.
If you have a decent set of unit tests, they will ensure with a high degree of certainty that this kind of problem does not escape the developer's machine.
As for the second one, what you're really guarding against is a race condition. What if the "doStuffToMakeFooTrue()" method never makes foo true? This code will eventually run itself into the ground. Rather than risk that, I'll usually put code like this on a timer. If your language has closures or function pointers (honestly not sure about Python...), you can hide the implementation of the timing logic in a nice helper method, and call it this way:
withTiming(hmph, 30) // run for 30 seconds, then fail
If you don't have closures or function pointers, you'll have to do it the long way everywhere:
stopwatch = new Stopwatch(30)
stopwatch.start()
while stopwatch.elapsedTimeInSeconds() < 30
    hmph()
raise OperationTimedOutError()
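For what it's worth, Python functions are first-class objects, so the helper version is possible there. Below is a rough sketch of such a helper. The name with_timing, the poll_interval parameter, and the convention that the operation returns None while it has not succeeded yet are all assumptions made for this example.

```python
import time


class OperationTimedOutError(Exception):
    """Raised when the operation does not succeed within the time budget."""


def with_timing(operation, seconds, poll_interval=0.01):
    """Call operation() until it returns something other than None, or time out."""
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        result = operation()
        if result is not None:
            return result
        time.sleep(poll_interval)  # avoid spinning at full speed between attempts
    raise OperationTimedOutError("gave up after %s seconds" % seconds)


attempts = {"count": 0}


def try_once():
    """Stand-in for one pass of hmph(): succeeds on the third attempt."""
    attempts["count"] += 1
    return "yep" if attempts["count"] >= 3 else None


print(with_timing(try_once, 1))  # yep
```

The loop can no longer run forever: either an attempt succeeds, or OperationTimedOutError is raised once the deadline passes.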

What's the best name for a non-mutating "add" method on an immutable collection? [closed]

Sorry for the waffly title - if I could come up with a concise title, I wouldn't have to ask the question.
Suppose I have an immutable list type. It has an operation Foo(x) which returns a new immutable list with the specified argument as an extra element at the end. So to build up a list of strings with values "Hello", "immutable", "world" you could write:
var empty = new ImmutableList<string>();
var list1 = empty.Foo("Hello");
var list2 = list1.Foo("immutable");
var list3 = list2.Foo("world");
(This is C# code, and I'm most interested in a C# suggestion if you feel the language is important. It's not fundamentally a language question, but the idioms of the language may be important.)
The important thing is that the existing lists are not altered by Foo - so empty.Count would still return 0.
Another (more idiomatic) way of getting to the end result would be:
var list = new ImmutableList<string>().Foo("Hello")
                                      .Foo("immutable")
                                      .Foo("world");
My question is: what's the best name for Foo?
EDIT 3: As I reveal later on, the name of the type might not actually be ImmutableList<T>, which makes the position clear. Imagine instead that it's TestSuite and that it's immutable because the whole of the framework it's a part of is immutable...
(End of edit 3)
Options I've come up with so far:
Add: common in .NET, but implies mutation of the original list
Cons: I believe this is the normal name in functional languages, but meaningless to those without experience in such languages
Plus: my favourite so far, it doesn't imply mutation to me. Apparently this is also used in Haskell but with slightly different expectations (a Haskell programmer might expect it to add two lists together rather than adding a single value to the other list).
With: consistent with some other immutable conventions, but doesn't have quite the same "additionness" to it IMO.
And: not very descriptive.
Operator overload for + : I really don't like this much; I generally think operators should only be applied to lower level types. I'm willing to be persuaded though!
The criteria I'm using for choosing are:
Gives the correct impression of the result of the method call (i.e. that it's the original list with an extra element)
Makes it as clear as possible that it doesn't mutate the existing list
Sounds reasonable when chained together as in the second example above
Please ask for more details if I'm not making myself clear enough...
EDIT 1: Here's my reasoning for preferring Plus to Add. Consider these two lines of code:
list.Add(foo);
list.Plus(foo);
In my view (and this is a personal thing) the latter is clearly buggy - it's like writing "x + 5;" as a statement on its own. The first line looks like it's okay, until you remember that it's immutable. In fact, the way that the plus operator on its own doesn't mutate its operands is another reason why Plus is my favourite. Without the slight ickiness of operator overloading, it still gives the same connotations, which include (for me) not mutating the operands (or method target in this case).
EDIT 2: Reasons for not liking Add.
Various answers are effectively: "Go with Add. That's what DateTime does, and String has Replace methods etc which don't make the immutability obvious." I agree - there's precedent here. However, I've seen plenty of people call DateTime.Add or String.Replace and expect mutation. There are loads of newsgroup questions (and probably SO ones if I dig around) which are answered by "You're ignoring the return value of String.Replace; strings are immutable, a new string gets returned."
Now, I should reveal a subtlety to the question - the type might not actually be an immutable list, but a different immutable type. In particular, I'm working on a benchmarking framework where you add tests to a suite, and that creates a new suite. It might be obvious that:
var list = new ImmutableList<string>();
list.Add("foo");
isn't going to accomplish anything, but it becomes a lot murkier when you change it to:
var suite = new TestSuite<string, int>();
suite.Add(x => x.Length);
That looks like it should be okay. Whereas this, to me, makes the mistake clearer:
var suite = new TestSuite<string, int>();
suite.Plus(x => x.Length);
That's just begging to be:
var suite = new TestSuite<string, int>().Plus(x => x.Length);
Ideally, I would like my users not to have to be told that the test suite is immutable. I want them to fall into the pit of success. This may not be possible, but I'd like to try.
I apologise for over-simplifying the original question by talking only about an immutable list type. Not all collections are quite as self-descriptive as ImmutableList<T> :)
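The "ignored return value" pitfall described above is not specific to C#. Here is a minimal Python sketch of the same situation, using Python's own immutable str plus a toy ImmutableList. The class and its plus method are hypothetical and only mirror the Foo operation from the question.

```python
class ImmutableList:
    """Toy immutable list; `plus` stands in for the Foo operation."""

    def __init__(self, items=()):
        self._items = tuple(items)

    def plus(self, item):
        # Build and return a new list; self is left untouched.
        return ImmutableList(self._items + (item,))

    def __len__(self):
        return len(self._items)

    def __iter__(self):
        return iter(self._items)


text = "Hello word"
text.replace("word", "world")           # result discarded: text is unchanged
fixed = text.replace("word", "world")   # result kept

empty = ImmutableList()
empty.plus("Hello")                     # result discarded: empty is still empty
chained = empty.plus("Hello").plus("immutable").plus("world")

print(text, "|", fixed)            # Hello word | Hello world
print(len(empty), list(chained))   # 0 ['Hello', 'immutable', 'world']
```

Both discarded calls run without any error or warning, which is why the name is being asked to carry the hint.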
In situations like that, I usually go with Concat. That usually implies to me that a new object is being created.
var p = listA.Concat(listB);
var k = listA.Concat(item);
I'd go with Cons, for one simple reason: it means exactly what you want it to.
I'm a huge fan of saying exactly what I mean, especially in source code. A newbie will have to look up the definition of Cons only once, but then read and use that a thousand times. I find that, in the long term, it's nicer to work with systems that make the common case easier, even if the up-front cost is a little bit higher.
The fact that it would be "meaningless" to people with no FP experience is actually a big advantage. As you pointed out, all of the other words you found already have some meaning, and that meaning is either slightly different or ambiguous. A new concept should have a new word (or in this case, an old one). I'd rather somebody have to look up the definition of Cons, than to assume incorrectly he knows what Add does.
Other operations borrowed from functional languages often keep their original names, with no apparent catastrophes. I haven't seen any push to come up with synonyms for "map" and "reduce" that sound more familiar to non-FPers, nor do I see any benefit from doing so.
(Full disclosure: I'm a Lisp programmer, so I already know what Cons means.)
Actually I like And, especially in the idiomatic way. I'd especially like it if you had a static readonly property for the Empty list, and perhaps make the constructor private so you always have to build from the empty list.
var list = ImmutableList<string>.Empty.And("Hello")
                                      .And("Immutable")
                                      .And("Word");
Whenever I'm in a jam with nomenclature, I hit up the interwebs.
thesaurus.com returns this for "add":
Definition: adjoin, increase; make further comment
Synonyms: affix, annex, ante, append, augment, beef up, boost, build up, charge up, continue, cue in, figure in, flesh out, heat up, hike, hike up, hitch on, hook on, hook up with, include, jack up, jazz up, join together, pad, parlay, piggyback, plug into, pour it on, reply, run up, say further, slap on, snowball, soup up, speed up, spike, step up, supplement, sweeten, tack on, tag
I like the sound of Adjoin, or more simply Join. That is what you're doing, right? The method could also apply to joining other ImmutableList<>'s.
Personally, I like .With(). If I was using the object, after reading the documentation or the code comments, it would be clear what it does, and it reads ok in the source code.
object.With("My new item as well");
Or, you add "Along" with it.. :)
object.AlongWith("this new item");
I ended up going with Add for all of my Immutable Collections in BclExtras. The reason is that it's an easy, predictable name. I'm not worried so much about people confusing Add with a mutating add since the name of the type is prefixed with Immutable.
For a while I considered Cons and other functional style names. Eventually I discounted them because they're not nearly as well known. Sure, functional programmers will understand, but they're not the majority of users.
Other names you mentioned:
Plus: I'm wishy-washy on this one. For me this doesn't distinguish it as being a non-mutating operation any more than Add does.
With: Will cause issues with VB (pun intended)
Operator overloading: Discoverability would be an issue
Options I considered:
Concat: Strings are immutable and use this. Unfortunately it's only really good for adding to the end.
CopyAdd: Copy what? The source, the list?
AddToNewList: Maybe a good one for List. But what about a Collection, Stack, Queue, etc ...
Unfortunately there doesn't really seem to be a word that is
Definitely an immutable operation
Understandable to the majority of users
Representable in less than 4 words
It gets even more odd when you consider collections other than List. Take for instance Stack. Even first year programmers can tell you that Stacks have a Push/Pop pair of methods. If you create an ImmutableStack and give it a completely different name, let's call it Foo/Fop, you've just added more work for them to use your collection.
Edit: Response to Plus Edit
I see where you're going with Plus. I think a stronger case would actually be Minus for remove. If I saw the following I would certainly wonder what in the world the programmer was thinking
list.Minus(obj);
The biggest problem I have with Plus/Minus or a new pairing is it feels like overkill. The collection itself already has a distinguishing name, the Immutable prefix. Why go further by adding vocabulary whose intent is to add the same distinction as the Immutable prefix already did?
I can see the call site argument. It makes it clearer from the standpoint of a single expression. But in the context of the entire function it seems unnecessary.
Edit 2
Agree that people have definitely been confused by String.Concat and DateTime.Add. I've seen several very bright programmers hit this problem.
However I think ImmutableList is a different argument. There is nothing about String or DateTime that establishes it as Immutable to a programmer. You must simply know that it's immutable via some other source. So the confusion is not unexpected.
ImmutableList does not have that problem because the name defines its behavior. You could argue that people don't know what Immutable is and I think that's also valid. I certainly didn't know it till about year 2 in college. But you have the same issue with whatever name you choose instead of Add.
Edit 3: What about types like TestSuite which are immutable but do not contain the word?
I think this drives home the idea that you shouldn't be inventing new method names. Namely because there is clearly a drive to make types immutable in order to facilitate parallel operations. If you focus on changing the name of methods for collections, the next step will be the mutating method names on every type you use that is immutable.
I think it would be a more valuable effort to instead focus on making types identifiable as Immutable. That way you can solve the problem without rethinking every mutating method pattern out there.
Now how can you identify TestSuite as Immutable? In today's environment I think there are a few ways:
Prefix with Immutable: ImmutableTestSuite
Add an Attribute which describes the level of immutability. This is certainly less discoverable.
Not much else.
My guess/hope is that development tools will start helping with this problem by making it easy to identify immutable types simply by sight (different color, stronger font, etc ...). I think that's the answer, rather than changing all of the method names.
I think this may be one of those rare situations where it's acceptable to overload the + operator. In math terminology, we know that + doesn't append something to the end of something else. It always combines two values together and returns a new resulting value.
For example, it's intuitively obvious that when you say
x = 2 + 2;
the resulting value of x is 4, not 22.
Similarly,
var empty = new ImmutableList<string>();
var list1 = empty + "Hello";
var list2 = list1 + "immutable";
var list3 = list2 + "word";
should make clear what each variable is going to hold. It should be clear that list2 is not changed in the last line, but instead that list3 is assigned the result of appending "word" to list2.
Otherwise, I would just name the function Plus().
To be as clear as possible, you might want to go with the wordier CopyAndAdd, or something similar.
I would call it Extend() or maybe ExtendWith() if you feel like being really verbose.
Extends means adding something to something else without changing it. I think this is very relevant terminology in C# since this is similar to the concept of extension methods - they "add" a new method to a class without "touching" the class itself.
Otherwise, if you really want to emphasize that you don't modify the original object at all, using some prefix like Get- looks unavoidable to me.
Added(), Appended()
I like to use the past tense for operations on immutable objects. It conveys the idea that you aren't changing the original object, and it's easy to recognize when you see it.
Also, because mutating method names are often present-tense verbs, it applies to most of the immutable-method-name-needed cases you run into. For example an immutable stack has the methods "pushed" and "popped".
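Python's built-ins happen to follow a similar tense convention: sorted(xs) returns a new list, while xs.sort() sorts in place. The sketch below shows that, followed by a hypothetical immutable stack with the pushed/popped names mentioned above. The ImmutableStack class and its top method are my own illustration, not an existing library.

```python
numbers = [3, 1, 2]
ordered = sorted(numbers)   # past participle: returns a new list
print(numbers, ordered)     # [3, 1, 2] [1, 2, 3]
numbers.sort()              # imperative verb: sorts in place, returns None


class ImmutableStack:
    """Hypothetical immutable stack using past-tense method names."""

    def __init__(self, items=()):
        self._items = tuple(items)

    def pushed(self, item):
        """Return a new stack with item on top; self is unchanged."""
        return ImmutableStack(self._items + (item,))

    def popped(self):
        """Return a new stack without the top item; self is unchanged."""
        if not self._items:
            raise IndexError("popped() on an empty stack")
        return ImmutableStack(self._items[:-1])

    def top(self):
        """Return the top item without removing it."""
        if not self._items:
            raise IndexError("top() on an empty stack")
        return self._items[-1]

    def __len__(self):
        return len(self._items)


base = ImmutableStack().pushed(1).pushed(2)
smaller = base.popped()
print(len(base), base.top(), len(smaller), smaller.top())  # 2 2 1 1
```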
I like mmyers' suggestion of CopyAndAdd. In keeping with a "mutation" theme, maybe you could go with Bud (asexual reproduction), Grow, Replicate, or Evolve? =)
EDIT: To continue with my genetic theme, how about Procreate, implying that a new object is made which is based on the previous one, but with something new added.
This is probably a stretch, but in Ruby there is a commonly used notation for the distinction: add doesn't mutate; add! mutates. If this is a pervasive problem in your project, you could do that too (not necessarily with non-alphabetic characters, but consistently using a notation to indicate mutating/non-mutating methods).
Join seems appropriate.
Maybe the confusion stems from the fact that you want two operations in one. Why not separate them? DSL style:
var list = new ImmutableList<string>("Hello");
var list2 = list.Copy().With("World!");
Copy would return an intermediate object, that's a mutable copy of the original list. With would return a new immutable list.
Update:
But, having an intermediate, mutable collection around is not a good approach. The intermediate object should be contained in the Copy operation:
var list1 = new ImmutableList<string>("Hello");
var list2 = list1.Copy(list => list.Add("World!"));
Now, the Copy operation takes a delegate, which receives a mutable list, so that it can control the copy outcome. It can do much more than appending an element, like removing elements or sorting the list. It can also be used in the ImmutableList constructor to assemble the initial list without intermediary immutable lists.
public ImmutableList<T> Copy(Action<IList<T>> mutate) {
    if (mutate == null) return this;
    var list = new List<T>(this);
    mutate(list);
    return new ImmutableList<T>(list);
}
Now there's no possibility of misinterpretation by the users, they will naturally fall into the pit of success.
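The same idea can be tried out quickly in Python. This is only a rough analogue of the C# Copy method above, with a toy ImmutableList class written for this example.

```python
class ImmutableList:
    """Toy immutable list whose copy() accepts a function that edits a scratch list."""

    def __init__(self, *elements):
        self._items = tuple(elements)

    def copy(self, mutate=None):
        if mutate is None:
            return self  # nothing to change, so sharing the instance is safe
        scratch = list(self._items)  # mutable copy, visible only to `mutate`
        mutate(scratch)
        return ImmutableList(*scratch)

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)


list1 = ImmutableList("Hello")
list2 = list1.copy(lambda items: items.append("World!"))
list3 = list2.copy(lambda items: items.sort(reverse=True))

print(list(list1))  # ['Hello']
print(list(list2))  # ['Hello', 'World!']
print(list(list3))  # ['World!', 'Hello']
```

The mutable scratch list never escapes copy(), so the originals stay unchanged no matter what the passed function does.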
Yet another update:
If you still don't like the mutable list mention, even now that it's contained, you can design a specification object, that will specify, or script, how the copy operation will transform its list. The usage will be the same:
var list1 = new ImmutableList<string>("Hello");
// rules is a specification object, that takes commands to run in the copied collection
var list2 = list1.Copy(rules => rules.Append("World!"));
Now you can be creative with the rules names and you can only expose the functionality that you want Copy to support, not the entire capabilities of an IList.
For the chaining usage, you can create a reasonable constructor (which will not use chaining, of course):
public ImmutableList(params T[] elements) ...
...
var list = new ImmutableList<string>("Hello", "immutable", "World");
Or use the same delegate in another constructor:
var list = new ImmutableList<string>(rules =>
    rules
        .Append("Hello")
        .Append("immutable")
        .Append("World")
);
This assumes that the rules.Append method returns this.
This is what it would look like with your latest example:
var suite = new TestSuite<string, int>(x => x.Length);
var otherSuite = suite.Copy(rules =>
    rules
        .Append(x => Int32.Parse(x))
        .Append(x => x.GetHashCode())
);
A few random thoughts:
ImmutableAdd()
Append()
ImmutableList<T>(ImmutableList<T> originalList, T newItem) Constructor
DateTime in C# uses Add. So why not use the same name? As long as the users of your class understand the class is immutable.
I think the key thing you're trying to get at that's hard to express is the non-mutation, so maybe something with a generative word in it, something like CopyWith() or InstancePlus().
I don't think the English language will let you imply immutability in an unmistakable way while using a verb that means the same thing as "Add". "Plus" almost does it, but people can still make the mistake.
The only way you're going to prevent your users from mistaking the object for something mutable is by making it explicit, either through the name of the object itself or through the name of the method (as with the verbose options like "GetCopyWith" or "CopyAndAdd").
So just go with your favourite, "Plus."
First, an interesting starting point:
http://en.wikipedia.org/wiki/Naming_conventions_(programming) ...In particular, check the "See Also" links at the bottom.
I'm in favor of either Plus or And, effectively equally.
Plus and And are both math-based in etymology. As such, both connote mathematical operation; both yield an expression which reads naturally as expressions which may resolve into a value, which fits with the method having a return value. And bears additional logic connotation, but both words apply intuitively to lists. Add connotes action performed on an object, which conflicts with the method's immutable semantics.
Both are short, which is especially important given the primitiveness of the operation. Simple, frequently-performed operations deserve shorter names.
Expressing immutable semantics is something I prefer to do via context. That is, I'd rather simply imply that this entire block of code has a functional feel; assume everything is immutable. That might just be me, however. I prefer immutability to be the rule; if it's done, it's done a lot in the same place; mutability is the exception.
How about Chain() or Attach()?
I prefer Plus (and Minus). They are easily understandable and map directly to operations involving well known immutable types (the numbers). 2+2 doesn't change the value of 2, it returns a new, equally immutable, value.
Some other possibilities:
Splice()
Graft()
Accrete()
How about mate, mateWith, or coitus, for those who abide. In terms of reproducing mammals are generally considered immutable.
Going to throw Union out there too. Borrowed from SQL.
Apparently I'm the first Obj-C/Cocoa person to answer this question.
NSString *empty = [[NSString alloc] init];
NSString *list1 = [empty stringByAppendingString:@"Hello"];
NSString *list2 = [list1 stringByAppendingString:@"immutable"];
NSString *list3 = [list2 stringByAppendingString:@"word"];
Not going to win any code golf games with this.
I think "Add" or "Plus" sounds fine. The name of the list itself should be enough to convey the list's immutability.
Maybe there are some words that remind me more of making a copy and adding to it instead of mutating the instance (like "Concatenate"). But I think it would be good to have symmetric words for the other actions too. I don't know of a counterpart for "Remove" that feels the same way "Concatenate" does. "Plus" sounds a little strange to me; I wouldn't expect it to be used in a non-numerical context. But that could just as well come from my non-English background.
Maybe I would use this scheme:
AddToCopy
RemoveFromCopy
InsertIntoCopy
These have their own problems though, when I think about it. One could think they remove something from, or add something to, the argument given. I'm not sure about it at all. Those words don't play nicely in chaining either, I think, and they are too wordy to type.
Maybe I would just use plain "Add" and friends too. I like how it is used in math:
Add 1 to 2 and you get 3
Well, certainly, a 2 remains a 2 and you get a new number. This is about two numbers and not about a list and an element, but I think the analogy holds. In my opinion, add does not necessarily mean you mutate something. I certainly see your point that a lonely statement containing just an add, and not using the returned new object, does not look buggy. But I've now also spent some time trying to think of a name other than "add", and I just can't come up with one that doesn't make me think "hmm, I would need to look at the documentation to know what it is about", because its name differs from what I would expect to be called "add". Just some weird thoughts about this from litb, not sure they make sense at all :)
Looking at http://thesaurus.reference.com/browse/add and http://thesaurus.reference.com/browse/plus I found gain and affix but I'm not sure how much they imply non-mutation.
I think that Plus() and Minus() or, alternatively, Including(), Excluding() are reasonable at implying immutable behavior.
However, no naming choice will ever make it perfectly clear to everyone, so I personally believe that a good xml doc comment would go a very long way here. VS throws these right in your face when you write code in the IDE - they're hard to ignore.
Append - because, note that names of the System.String methods suggest that they mutate the instance, but they don't.
Or I quite like AfterAppending:
void test()
{
    Bar bar = new Bar();
    List list = bar.AfterAppending("foo");
}
list.CopyWith(element)
As does Smalltalk :)
And also list.copyWithout(element) that removes all occurrences of an element, which is most useful when used as list.copyWithout(null) to remove unset elements.
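As a rough Python sketch of the same naming idea, using plain tuples and two hypothetical helper functions, copy_with and copy_without (these names mirror the Smalltalk-style methods mentioned above and are not part of Python's standard library):

```python
def copy_with(items, element):
    # Return a new tuple with element appended; items is left as it was.
    return tuple(items) + (element,)


def copy_without(items, element):
    # Return a new tuple with every occurrence of element removed.
    return tuple(x for x in items if x != element)


readings = (3, None, 7, None)
cleaned = copy_without(readings, None)   # drops the unset entries
extended = copy_with(cleaned, 9)
```

The word "copy" in the name signals that the result must be captured; readings itself still contains its None entries afterwards.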
I would go for Add. I can see the benefit of a better name, but the problem would be finding different names for every other immutable operation, which might make the class quite unfamiliar, if that makes sense.

How do you write a (simple) variable "toggle"?

Given the following idioms:
1)
variable = value1
if condition
    variable = value2
2)
variable = value2
if not condition
    variable = value1
3)
if condition
    variable = value2
else
    variable = value1
4)
if not condition
    variable = value1
else
    variable = value2
Which do you prefer, and why?
We assume the most common execution path to be that of condition being false.
I tend to lean towards using 1), although I'm not exactly sure why I like it more.
Note: The following examples may be simpler—and thus possibly more readable—but not all languages provide such syntax, and they are not suitable for extending the variable assignment to include more than one statement in the future.
variable = condition ? value2 : value1
...
variable = value2 if condition else value1
In theory, I prefer #3 as it avoids having to assign a value to the variable twice. In the real world though I use any of the four above that would be more readable or would express more clearly my intention.
I prefer method 3 because it is more concise and a logical unit. It sets the value only once, it can be moved around as a block, and it's not that error-prone (which happens, esp. in method 1 if setting-to-value1 and checking-and-optionally-setting-to-value2 are separated by other statements)
3) is the clearest expression of what you want to happen. I think all the others require some extra thinking to determine which value is going to end up in the variable.
In practice, I would use the ternary operator (?:) if I was using a language that supported it. I prefer to write in functional or declarative style over imperative whenever I can.
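For illustration, here is idiom 3 next to the expression form in Python, whose conditional expression plays the role of ?: in C-like languages. The function names pick_statement and pick_expression are made up for this sketch:

```python
def pick_statement(condition, value1, value2):
    # Idiom 3: an if/else statement that assigns the variable exactly once.
    if condition:
        variable = value2
    else:
        variable = value1
    return variable


def pick_expression(condition, value1, value2):
    # The same choice written as a Python conditional expression.
    return value2 if condition else value1
```

Both forms make the two possible outcomes visible in one place, so no reader has to track an earlier default assignment.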
I tend to use #1 a lot myself. if condition reads more easily than if !condition, especially if you accidentally miss the '!', at least to my mind.
Most coding I do is in C#, but I still tend to steer clear of the ternary operator, unless I'm working with (mostly) local variables. Lines tend to get long VERY quickly in a ternary operator if you're calling three layers deep into some structure, which quickly decreases the readability again.
Note: The following examples may be simpler—and thus possibly more readable—but not all languages provide such syntax
This is no argument for not using them in languages that do provide such a syntax. Incidentally, that includes all current mainstream languages, by my last count.
and they are not suitable for extending the variable assignment to include more than one statement in the future.
This is true. However, it's often certain that such an extension will absolutely never take place because the condition will always yield one of two possible cases.
In such situations I will always prefer the expression variant over the statement variant because it reduces syntactic clutter and improves expressiveness. In other situations I tend to go with the switch statement mentioned before – if the language allows this usage. If not, fall-back to generic if.
switch statement also works. If it's simple and more than 2 or 3 options, that's what I use.
In a situation where the condition might not happen, I would go with 1 or 2. Otherwise it's just based on what I want the code to do (i.e. I agree with cruizer).
I tend to use if not...return.
But that's if you are looking to return a variable. Getting disqualifiers out of the way first tends to make it more readable. It really depends on the context of the statement and also the language. A case statement might work better and be readable most of the time, but performance suffers under VB so a series of if/else statements makes more sense in that specific case.
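A small Python sketch of the "if not...return" style, using a made-up shipping_cost function whose rules and values are purely hypothetical:

```python
def shipping_cost(order_total, is_member):
    # Guard clauses: get the disqualifying cases out of the way first.
    if order_total <= 0:
        return None  # nothing to ship
    if not is_member:
        return 5.0  # hypothetical flat rate for non-members
    return 0.0  # members ship free in this made-up example
```

Each early return removes one case from consideration, so the final line only has to handle the remaining one.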
Method 1 or method 3 for me. Method 1 can avoid an extra scope entrance/exit, but method 3 avoids an extra assignment. I'd tend to avoid Method 2 as I try to keep condition logic as simple as possible (in this case, the ! is extraneous as it could be rewritten as method 1 without it) and the same reason applies for method 4.
It depends on what the condition is I'm testing.
If it's an error flag condition then I'll use 1): set the error flag first so the error is caught by default, and then clear the flag if the condition is successful. That way there's no chance of missing an error condition.
For everything else I'd use 3)
The NOT logic just adds to confusion when reading the code - well in my head, can't speak for everyone else :-)
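The error-flag use of idiom 1 might look like this in Python. This is only a sketch: load_config and its parse argument are hypothetical, and parse is assumed to return None on failure:

```python
def load_config(parse, text):
    # Idiom 1 applied to an error flag: assume failure until proven otherwise.
    error = True
    result = parse(text)
    if result is not None:
        error = False  # cleared only on the success path
    return result, error


def parse_number(text):
    # Hypothetical parser: returns an int, or None if text is not all digits.
    return int(text) if text.isdigit() else None
```

Because the flag starts out set, any path that forgets to clear it reports an error rather than silently reporting success.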
If the variable has a natural default value I would go with #1. If either value is equally (in)appropriate for a default then I would go with #2.
It depends. I like the ternary operators, but sometimes it's clearer if you use an 'if' statement. Which of the four alternatives you choose depends on the context, but I tend to go for whichever makes the code's function clearer, and that varies from situation to situation.