Languages where ¬(a = b) and (a ≠ b) can be different - language-agnostic

In C++ one can overload the == and != operators for user types, but the language doesn't care how you do it. You can overload both to return true no matter what, so !(a==b) and (a!=b) don't necessarily have to evaluate to the same thing. How many other languages have a situation where ¬(a = b) and (a ≠ b) can be different? Is it a common thing?
It's not just an issue of overloads, but of strange corner cases even for primitive types. NaN in C and C++ doesn't compare equal to anything, including NaN. It is true that NaN != NaN in C, but maybe there are similar cases in other languages that cause ¬(a = b) and (a ≠ b) to be different?

Guy L. Steele famously said
...the ability to define your own operator functions means that a simple statement such as x=a+b; in an inner loop might involve the sending of e-mail to Afghanistan.
And as corsiKa says, just because you can do it, doesn't make it a good idea.

I know for a fact that Python and Ruby can and Java and PHP can not. (In Java == determines if two objects are the same thing in memory, not just semantically equivalent values. In PHP...never mind.) I'd also imagine that Lisp and JS can and C can not, but that's a bit more speculative.
It's nothing unusual to be able to overload operators. It is very rare for !(a==b) and (a!=b) to have different results though.

Related

In CUDA, is there a way to ensure the consistence of FP maths in the same program?

Is there a way to ensure that:
if a==b then devfun(a)==devfun(b);
where devfun() is a device function involves some floating point maths ops (e.g. polynomials) and returns floating point results, a and b are floating point variables.
I don't care about cross-implentation consistence (e.g. different compiler/different OS/different driver versions or different compiler options), I only care about, within the same building/program, at runtime, can it ensure that during each function call, the result returned by devfun() are consistent in a way such that as long as a==b, devfun(a)==devfun(b)?
I am talking about SM2.0+ hardware and CUDA 5.0+, just in case being relevant.
Let's assume that your numbers a and b represent properly normalized IEEE-754 representation floating point numbers and that niether a nor b is a NaN value. Let's also assume a and b are both 32-bit, or else a and b are both 64-bit (IEEE-754 floating point representations).
In that case, I believe the (ISO C/C++, or CUDA C/C++) floating point test for equality (==) will return TRUE when the two numbers a and b are bitwise identical (and FALSE otherwise).
Under the TRUE case, with one exception, I believe it is safe to assume that devfun(a) == devfun(b) without any additional conditions except the obvious ones: there is no difference in the behavior of devfun on either side of the == operation, that is, it's the same code, compiled in the same way, executed under the same conditions (e.g. other variables that may be taking part in devfun, same GPU type, etc.), just as you've indicated in your question: "same building/program".
The one exception is if the result of devfun(a) is NaN, since (IEEE-754) NaN != NaN.
It would be interesting (to me) if you think you have a piece of code that disproves this assertion.
Perhaps floating point ninjas will come along and correct me.
Perhaps also I would be remiss if I did not say something about the hazards of floating point comparisons. If you're not familiar with this (most folks would never recommend performing a test a==b on two floating point numbers) you can find many questions about it on SO.
For the same reasons that floating point equality comparison (==) in general is unwise, I think relying on the above assertion, even if it's true, is unwise. Let me give you one example.
Suppose you compile code for architecture sm_20. Now you run the code on an sm_21 device. This one simple variation could result in a JIT-compile at runtime. Now you are no longer running the same code, and all bets are off.
So, again, even if the above is true, I think it's unwise for you to rely on such a statement:
if a==b, then devfun(a) == devfun(b)

Clojure - test for equality of function expression?

Suppose I have the following clojure functions:
(defn a [x] (* x x))
(def b (fn [x] (* x x)))
(def c (eval (read-string "(defn d [x] (* x x))")))
Is there a way to test for the equality of the function expression - some equivalent of
(eqls a b)
returns true?
It depends on precisely what you mean by "equality of the function expression".
These functions are going to end up as bytecode, so I could for example dump the bytecode corresponding to each function to a byte[] and then compare the two bytecode arrays.
However, there are many different ways of writing semantically equivalent methods, that wouldn't have the same representation in bytecode.
In general, it's impossible to tell what a piece of code does without running it. So it's impossible to tell whether two bits of code are equivalent without running both of them, on all possible inputs.
This is at least as bad, computationally speaking, as the halting problem, and possibly worse.
The halting problem is undecidable as it is, so the general-case answer here is definitely no (and not just for Clojure but for every programming language).
I agree with the above answers in regards to Clojure not having a built in ability to determine the equivalence of two functions and that it has been proven that you can not test programs functionally (also known as black box testing) to determine equality due to the halting problem (unless the input set is finite and defined).
I would like to point out that it is possible to algebraically determine the equivalence of two functions, even if they have different forms (different byte code).
The method for proving the equivalence algebraically was developed in the 1930's by Alonzo Church and is know as beta reduction in Lambda Calculus. This method is certainly applicable to the simple forms in your question (which would also yield the same byte code) and also for more complex forms that would yield different byte codes.
I cannot add to the excellent answers by others, but would like to offer another viewpoint that helped me. If you are e.g. testing that the correct function is returned from your own function, instead of comparing the function object you might get away with just returning the function as a 'symbol.
I know this probably is not what the author asked for but for simple cases it might do.

Pattern matching with associative and commutative operators

Pattern matching (as found in e.g. Prolog, the ML family languages and various expert system shells) normally operates by matching a query against data element by element in strict order.
In domains like automated theorem proving, however, there is a requirement to take into account that some operators are associative and commutative. Suppose we have data
A or B or C
and query
C or $X
Going by surface syntax this doesn't match, but logically it should match with $X bound to A or B because or is associative and commutative.
Is there any existing system, in any language, that does this sort of thing?
Associative-Commutative pattern matching has been around since 1981 and earlier, and is still a hot topic today.
There are lots of systems that implement this idea and make it useful; it means you can avoid write complicated pattern matches when associtivity or commutativity could be used to make the pattern match. Yes, it can be expensive; better the pattern matcher do this automatically, than you do it badly by hand.
You can see an example in a rewrite system for algebra and simple calculus implemented using our program transformation system. In this example, the symbolic language to be processed is defined by grammar rules, and those rules that have A-C properties are marked. Rewrites on trees produced by parsing the symbolic language are automatically extended to match.
The maude term rewriter implements associative and commutative pattern matching.
http://maude.cs.uiuc.edu/
I've never encountered such a thing, and I just had a more detailed look.
There is a sound computational reason for not implementing this by default - one has to essentially generate all combinations of the input before pattern matching, or you have to generate the full cross-product worth of match clauses.
I suspect that the usual way to implement this would be to simply write both patterns (in the binary case), i.e., have patterns for both C or $X and $X or C.
Depending on the underlying organisation of data (it's usually tuples), this pattern matching would involve rearranging the order of tuple elements, which would be weird (particularly in a strongly typed environment!). If it's lists instead, then you're on even shakier ground.
Incidentally, I suspect that the operation you fundamentally want is disjoint union patterns on sets, e.g.:
foo (Or ({C} disjointUnion {X})) = ...
The only programming environment I've seen that deals with sets in any detail would be Isabelle/HOL, and I'm still not sure that you can construct pattern matches over them.
EDIT: It looks like Isabelle's function functionality (rather than fun) will let you define complex non-constructor patterns, except then you have to prove that they are used consistently, and you can't use the code generator anymore.
EDIT 2: The way I implemented similar functionality over n commutative, associative and transitive operators was this:
My terms were of the form A | B | C | D, while queries were of the form B | C | $X, where $X was permitted to match zero or more things. I pre-sorted these using lexographic ordering, so that variables always occurred in the last position.
First, you construct all pairwise matches, ignoring variables for now, and recording those that match according to your rules.
{ (B,B), (C,C) }
If you treat this as a bipartite graph, then you are essentially doing a perfect marriage problem. There exist fast algorithms for finding these.
Assuming you find one, then you gather up everything that does not appear on the left-hand side of your relation (in this example, A and D), and you stuff them into the variable $X, and your match is complete. Obviously you can fail at any stage here, but this will mostly happen if there is no variable free on the RHS, or if there exists a constructor on the LHS that is not matched by anything (preventing you from finding a perfect match).
Sorry if this is a bit muddled. It's been a while since I wrote this code, but I hope this helps you, even a little bit!
For the record, this might not be a good approach in all cases. I had very complex notions of 'match' on subterms (i.e., not simple equality), and so building sets or anything would not have worked. Maybe that'll work in your case though and you can compute disjoint unions directly.

Why do different operators have different associativity?

I've got to the section on operators in The Ruby Programming Language, and it's made me think about operator associativity. This isn't a Ruby question by the way - it applies to all languages.
I know that operators have to associate one way or the other, and I can see why in some cases one way would be preferable to the other, but I'm struggling to see the bigger picture. Are there some criteria that language designers use to decide what should be left-to-right and what should be right-to-left? Are there some cases where it "just makes sense" for it to be one way over the others, and other cases where it's just an arbitrary decision? Or is there some grand design behind all of this?
Typically it's so the syntax is "natural":
Consider x - y + z. You want that to be left-to-right, so that you get (x - y) + z rather than x - (y + z).
Consider a = b = c. You want that to be right-to-left, so that you get a = (b = c), rather than (a = b) = c.
I can't think of an example of where the choice appears to have been made "arbitrarily".
Disclaimer: I don't know Ruby, so my examples above are based on C syntax. But I'm sure the same principles apply in Ruby.
Imagine to write everything with brackets for a century or two.
You will have the experience about which operator will most likely bind its values together first, and which operator last.
If you can define the associativity of those operators, then you want to define it in a way to minimize the brackets while writing the formulas in easy-to-read terms. I.e. (*) before (+), and (-) should be left-associative.
By the way, Left/Right-Associative means the same as Left/Right-Recursive. The word associative is the mathematical perspective, recursive the algorihmic. (see "end-recursive", and look at where you write the most brackets.)
Most of operator associativities in comp sci is nicked directly from maths. Specifically symbolic logic and algebra.

Can you implement any pure LISP function using the ten primitives? (ie no type predicates)

This site makes the following claim:
http://hyperpolyglot.wikidot.com/lisp#ten-primitives
McCarthy introduced the ten primitives of lisp in 1960. All other pure lisp
functions (i.e. all functions which don't do I/O or interact with the environment)
can be implemented with these primitives. Thus, when implementing or porting lisp,
these are the only functions which need to be implemented in a lower language. The
way the non-primitives of lisp can be constructed from primitives is analogous to
the way theorems can be proven from axioms in mathematics.
The primitives are: atom, quote, eq, car, cdr, cons, cond, lambda, label, apply.
My question is - can you really do this without type predicates such as numberp? Surely there is a point when writing a higher level function that you need to do a numeric operation - which the primitives above don't allow for.
Some numbers can be represented with just those primitives, it's just rather inconvenient and difficult the conceptualize the first time you see it.
Similar to how the natural numbers are represented with sets increasing in size, they can be simulated in Lisp as nested cons cells.
Zero would be the empty list, or (). One would be the singleton cons cell, or (() . ()). Two would be one plus one, or the successor of one, where we define the successor of x to be (cons () x) , which is of course (() . (() . ())). If you accept the Infinity Axiom (and a few more, but mostly the Infinity Axiom for our purposes so far), and ignore the memory limitations of real computers, this can accurately represent all the natural numbers.
It's easy enough to extend this to represent all the integers and then the rationals [1], but representing the reals in this notation would be (I think) impossible. Fortunately, this doesn't dampen our fun, as we can't represent the all the reals on our computers anyway; we make do with floats and doubles. So our representation is just as powerful.
In a way, 1 is just syntactic sugar for (() . ()).
Hurray for set theory! Hurray for Lisp!
EDIT Ah, for further clarification, let me address your question of type predicates, though at this point it could be clear. Since your numbers have a distinct form, you can test these linked lists with a function of your own creation that tests for this particular structure. My Scheme isn't good enough anymore to write it in Scheme, but I can attempt to in Clojure.
Regardless, you may be saying that it could give you false positives: perhaps you're simply trying to represent sets and you end up having the same structure as a number in this system. To that I reply: well, in that case, you do in fact have a number.
So you can see, we've got a pretty decent representation of numbers here, aside from how much memory they take up (not our concern) and how ugly they look when printed at the REPL (also, not our concern) and how inefficient it will be to operate on them (e.g. we have to define our addition etc. in terms of list operations: slow and a bit complicated.) But none of these are out concern: the speed really should and could depend on the implementation details, not what you're doing this the language.
So here, in Clojure (but using only things we basically have access to in our simple Lisp, is numberp. (I hope; feel free to correct me, I'm groggy as hell etc. excuses etc.)
(defn numberp
[x]
(cond
(nil? x) true
(and (coll? x) (nil? (first x))) (numberp (second x))
:else false))
[1] For integers, represent them as cons cells of the naturals. Let the first element in the cons cell be the "negative" portion of the integer, and the second element be the "positive" portion of the integer. In this way, -2 can be represented as (2, 0) or (4, 2) or (5, 3) etc. For the rationals, let them be represented as cons cells of the integers: e.g. (-2, 3) etc. This does give us the possibility of having the same data structure representing the same number: however, this can be remedied by writing functions that test two numbers to see if they're equivalent: we'd define these functions in terms of the already-existing equivalence relations set theory offers us. Fun stuff :)