ClojureScript floats hashed as ints - clojurescript

At first I thought this was a bug, but looking at the source code it's clearly intentional. Does anybody know why this is being done? It's inconsistent with Clojure and a subtle source of bugs.
(hash 1) ; => 1
(hash 1.5) ; => 1
https://github.com/clojure/clojurescript/blob/master/src/main/cljs/cljs/core.cljs#L985
(defn hash
  "Returns the hash code of its argument. Note this is the hash code
  consistent with =."
  [o]
  (cond
    (implements? IHash o)
    (bit-xor (-hash ^not-native o) 0)
    (number? o)
    (if (js/isFinite o)
      (js-mod (Math/floor o) 2147483647)
      (case o
        Infinity
        2146435072
        -Infinity
        -1048576
        2146959360))
    ...))

JavaScript has only one number type: an IEEE 754 64-bit float, which represents integers exactly only between -(2^53 - 1) and 2^53 - 1. However, bitwise operations work on 32-bit signed integers, so a lossy conversion is needed when a float is turned into a hash that has to work with bitwise operators. The magic number 2147483647 used for the modulo operation in core.cljs/hash is the largest value a 32-bit signed integer can hold (2^31 - 1). Note that there is also special handling for non-finite values: Infinity, -Infinity, and a default branch that covers NaN.
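To see why 1 and 1.5 collide, here is a rough Python sketch of just the number branch of the hash function quoted above. It is only an illustration, not the ClojureScript implementation: the name cljs_number_hash is made up, and math.fmod is used on the assumption that it matches the sign behaviour of JavaScript's % operator (the result takes the sign of the dividend).

```python
import math

INT32_MAX = 2147483647  # 2**31 - 1, the modulus used in the quoted source


def cljs_number_hash(o: float) -> int:
    """Hypothetical Python mirror of the (number? o) branch shown above."""
    if math.isfinite(o):
        # Math/floor drops the fractional part, so 1 and 1.5 both become 1.
        # math.fmod keeps the sign of the dividend, like JavaScript's %.
        return int(math.fmod(math.floor(o), INT32_MAX))
    if o == math.inf:
        return 2146435072
    if o == -math.inf:
        return -1048576
    return 2146959360  # default branch of the case form, reached for NaN


print(cljs_number_hash(1), cljs_number_hash(1.5))
```

Every float with the same floor ends up with the same hash, which is legal for a hash function (unequal values may share a hash) but differs from what Clojure on the JVM does.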

Is it possible to check/get function type or its signature at runtime in SBCL/Common Lisp?

(deftype binary-number-func ()
`(function (number number) number))
(declaim (ftype binary-number-func my-add))
(defun my-add (a b)
(+ (the number a) (the number b)))
;; FAIL:
(assert (typep #'my-add 'binary-number-func))
;; Function types are not a legal argument to TYPEP:
;; (FUNCTION (NUMBER NUMBER) (VALUES NUMBER &REST T))
;; [Condition of type SIMPLE-ERROR]
;; FAIL:
(typep #'my-add '(function (number number) number))
;; Function types are not a legal argument to TYPEP:
;; (FUNCTION (NUMBER NUMBER) (VALUES NUMBER &REST T))
;; [Condition of type SIMPLE-ERROR]
Is there any way to check the compound type of a function value?
(In Common Lisp, I'm using SBCL sbcl-1.5.0-x86-64-linux)
Thanks in advance.
Because Common Lisp allows you to write functions even if the compiler can’t determine a smallest function type for them, the compiler is not always able to check if a function has a certain type. Therefore the only sane behaviour is to not check the type.
Secondarily, such a minimal type may not exist. Consider this function:
(defun foo (key)
(getf '(:red 1 :blue 2 :green 3 :yellow 4) key))
Does it have type (function (T) T), or (function (T) (or null (integer 1 4))), or (function ((member :red :blue :green :yellow)) (integer 1 4))? Note that both the second and third types are correct, but neither is a subtype of the other.
Also note that to check the third type above, one would need to precisely know the behaviour of getf which is unlikely to be true in this case and won’t be true at all in the general case.
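To make the two incomparable descriptions concrete, here is a small Python stand-in for foo. This is only a sketch under stated assumptions: Python strings play the role of the Lisp keywords, and the helper names fits_second_type and fits_third_type are invented for illustration.

```python
from typing import Optional

# Stand-in for the property list '(:red 1 :blue 2 :green 3 :yellow 4).
_COLORS = {"red": 1, "blue": 2, "green": 3, "yellow": 4}


def foo(key: object) -> Optional[int]:
    """Python analogue of the Lisp foo: look the key up, None if absent."""
    if isinstance(key, str):
        return _COLORS.get(key)
    return None


def fits_second_type(key: object) -> bool:
    # (function (T) (or null (integer 1 4))): any key, result None or 1..4.
    result = foo(key)
    return result is None or 1 <= result <= 4


def fits_third_type(key: object) -> bool:
    # (function ((member ...)) (integer 1 4)): only promises something
    # for the four known keys, and then the result must be 1..4.
    if isinstance(key, str) and key in _COLORS:
        result = foo(key)
        return result is not None and 1 <= result <= 4
    return True  # outside the declared domain the type says nothing


samples = ["red", "yellow", "purple", 42, None]
print(all(fits_second_type(k) for k in samples))
print(all(fits_third_type(k) for k in samples))
```

Both checks pass on the samples, but passing on a handful of inputs proves nothing about all inputs, which is why a runtime typep on function types cannot give a meaningful answer.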
It is ok for the compiler to check function types because the compiler is allowed to complain or give up. It would be completely unportable for different implementations to have runtime type checking functions which would have quite different behaviours.

Representing Functions as Types

A function can be a highly nested structure:
function a(x) {
  return b(c(x), d(e(f(x), g())))
}
First, I'm wondering whether a function has an instance, that is, whether an evaluation of the function can be seen as an instance of it. In that sense, the type is the function, and the instance is the evaluation of it. If so, how would you model a function as a type (in some type-theory-oriented language like Haskell or Coq)?
It's almost like:
type a {
  field: x
  constructor b {
    constructor c {
      parameter: x
    },
    ...
  }
}
But I'm not sure if I'm on the right track. I know you can say a function has a [return] type. But I'm wondering if a function can be considered a type, and if so, how to model it as a type in a type-theory-oriented language, where the type models the actual implementation of the function.
I think the problem is that types based directly on the implementation (let's call them "i-types") don't seem very useful, and we already have good ways of modelling them (called "programs" -- ha ha).
In your specific example, the full i-type of your function, namely:
type a {
  field: x
  constructor b {
    constructor c {
      parameter: x
    },
    constructor d {
      constructor e {
        constructor f {
          parameter: x
        }
        constructor g {
        }
      }
    }
  }
}
is just a verbose, alternative syntax for the implementation itself. That is, we could write this i-type (in a Haskell-like syntax) as:
itype a :: a x = b (c x) (d (e (f x) g))
On the other hand, we could convert your function implementation to Haskell term-level syntax directly to write it as:
a x = b (c x) (d (e (f x) g))
and the i-type and the implementation are exactly the same thing.
How would you use these i-types? The compiler might use them by deriving argument and return types to type-check the program. (Fortunately, there are well known algorithms, such as Algorithm W, for simultaneously deriving and type-checking argument and return types from i-types of this sort.) Programmers probably wouldn't use i-types directly -- they're too complicated to use for refactoring or reasoning about program behavior. They'd probably want to look at the types derived by the compiler for the arguments and return type.
In particular, "modelling" these i-types at the type level in Haskell doesn't seem productive. Haskell can already model them at the term level. Just write your i-types as a Haskell program:
import Data.Char (ord)

a x = b (c x) (d (e (f x) g))
b s t = sqrt $ fromIntegral $ length (s ++ t)
c = show
d = reverse
e c ds = show (sum ds + fromIntegral (ord c))
f n = if even n then 'E' else 'O'
g = [1.5..5.5]
and don't run it. Congratulations, you've successfully modelled these i-types! You can even use GHCi to query derived argument and return types:
> :t a
a :: Floating a => Integer -> a -- "a" takes an Integer and returns a float
>
Now, you are perhaps imagining that there are situations where the implementation and i-type would diverge, maybe when you start introducing literal values. For example, maybe you feel like the function f above:
f n = if even n then 'E' else 'O'
should be assigned a type something like the following, that doesn't depend on the specific literal values:
type f {
  field: n
  if_then_else {
    constructor even { -- predicate
      parameter: n
    }
    literal Char -- then-branch
    literal Char -- else-branch
  }
}
Again, though, you'd be better off defining an arbitrary term-level Char, like:
someChar :: Char
someChar = undefined
and modeling this i-type at the term-level:
f n = if even n then someChar else someChar
Again, as long as you don't run the program, you've successfully modelled the i-type of f, can query its argument and return types, type-check it as part of a bigger program, etc.
I'm not clear exactly what you are aiming at, so I'll try to point at some related terms that you might want to read about.
A function has not only a return type, but a type that describes its arguments as well. So the (Haskell) type of f reads "f takes an Int and a Float, and returns a List of Floats."
f :: Int -> Float -> [Float]
f i x = replicate i x
Types can also describe much more of the specification of a function. Here, we might want the type to spell out that the length of the list will be the same as the first argument, or that every element of the list will be the same as the second argument. Length-indexed lists (often called Vectors) are a common first example of Dependent Types.
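To give a feel for what such a richer type would promise, here is a rough Python sketch that checks the two properties at runtime instead. Python has no dependent types, so this only illustrates the specification and does not encode it in a type; the assert statements stand in for what a dependently typed language would verify at compile time.

```python
from typing import List, TypeVar

T = TypeVar("T")


def replicate(i: int, x: T) -> List[T]:
    """Python analogue of the Haskell f above, with its spec checked at runtime."""
    count = max(i, 0)  # like Haskell's replicate, a negative count gives []
    result = [x] * count
    # Property 1: the length of the list equals the first argument.
    assert len(result) == count
    # Property 2: every element is the second argument.
    assert all(item is x for item in result)
    return result


print(replicate(3, 1.5))
```

In a language with length-indexed vectors, the first assertion would move into the return type, so a wrong implementation would be rejected before the program runs.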
You might also be interested in functions that take types as arguments, and return types. These are sometimes called "type-level functions". In Coq or Idris, they can be defined the same way as more familiar functions. In Haskell, we usually implement them using Type Families, or using Type Classes with Functional Dependencies.
Returning to the first part of your question, Beta Reduction is the process of filling in concrete values for each of the function's arguments. I've heard people describe expressions as "after reduction" or "fully reduced" to emphasize some stage in this process. This is similar to a function Call Site, but emphasizes the expression & arguments, rather than the surrounding context.

understand syntax in the sml language

Hello, I have started writing SML and I am having some difficulty understanding a particular function.
I have this function:
fun isInRow (r:int) ((x,y)) = x=r;
I would be happy to get an explanation of the following points:
What the function accepts and what it returns.
What is the relationship between (r: int) ((x, y)).
Thanks very much !!!
The function isInRow has two arguments. The first is named r. The second is a pair (x, y). The type ascription (r: int) says that r must be an int.
This function is curried, which is a little unusual for SML. What this means roughly speaking is that it accepts arguments given separately rather than supplied as a pair.
So, the function accepts an int and a pair whose first element is an int. These are accepted as separate arguments. It returns a boolean value (the result of the comparison x = r).
A call to the function would look like this:
isInRow 3 (3, 4)
There is more to say about currying (which is kind of cool), but I hope this is enough to get you going.
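For readers who know Python better than SML, the curried shape can be sketched roughly like this. It is an analogy only: the name is_in_row is a Python respelling of isInRow, and Python needs an explicit inner function where SML curries directly.

```python
from typing import Any, Callable, Tuple


def is_in_row(r: int) -> Callable[[Tuple[int, Any]], bool]:
    """Take r first, then return a function that takes the pair (x, y)."""
    def check(pair: Tuple[int, Any]) -> bool:
        x, _y = pair  # y is ignored, just as in the SML definition
        return x == r
    return check


# The two arguments are supplied one after the other, not as one tuple.
print(is_in_row(3)((3, 4)))

# Supplying only the first argument gives a reusable predicate.
in_row_3 = is_in_row(3)
print([in_row_3(p) for p in [(3, 0), (2, 9), (3, 7)]])
```

The second use is the main practical benefit of currying: the partially applied function can be passed to something like a filter without writing a wrapper.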
In addition to what Jeffrey has said,
You don't need the extra set of parentheses:
fun isInRow (r:int) (x,y) = x=r;
You don't need to specify the type :int. If you instead write:
fun isInRow r (x,y) = x=r;
then the function's type changes from int -> (int * 'a) -> bool to ''a -> (''a * 'b) -> bool, meaning that r and x can have any type that can be compared for equality (not just int), and y can still be anything since it is still disregarded.
Polymorphic functions are one of the strengths of typed, functional languages like SML.
You could even refrain from giving y a name:
fun isInRow r (x,_) = x=r;

PolyML Functions and Types

[...] a pair of functions tofun : int -> ('a -> 'a) and fromfun : ('a -> 'a) ->
int such that (fromfun o tofun) n evaluates to n for every n : int.
Anyone able to explain to me what this is actually asking for? I'm looking for more of an explanation of that than an actual solution to this.
What this is asking for is:
1) A higher-order function tofun which when given an integer returns a polymorphic function, one which has type 'a->'a, meaning that it can be applied to values of any type, returning a value of the same type. An example of such a function is:
- fun id x = x;
val id = fn : 'a -> 'a
For example, id "cat" = "cat" and id () = (). The latter value is of type unit, which is a type with only one value. Note that there is only one total function from unit to unit (namely, id or something equivalent). This underscores the difficulty of defining tofun: it returns a function of type 'a -> 'a, and it is hard to think of any such function other than the identity. On the other hand, such functions can fail to terminate or can raise an error and still have type 'a -> 'a.
2) fromfun is supposed to take a function of type 'a ->'a and return an integer. So e.g. fromfun id might evaluate to 0 (or if you want to get tricky it might never terminate or it might raise an error)
3) These are supposed to be inverses of each other so that, e.g. fromfun (tofun 5) needs to evaluate to 5.
Intuitively, this should be impossible in a sufficiently pure functional language. If it is possible in SML, my guess is that it would be by using some of the impure features of SML (which allow for side effects) to violate referential transparency. Or, the trick might involve raising and handling errors (which is also an impure feature of SML). If you find an answer which works in SML it would be interesting to see if it could be translated to the annoyingly pure functional language Haskell. My guess is that it wouldn't translate.
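To illustrate what "using side effects" could mean here, the following is a rough Python sketch of the idea. It is not an SML solution: Python is untyped, so the 'a -> 'a constraint is not enforced, and the module-level variable _last_n is an invented name standing in for an SML ref cell.

```python
from typing import Callable, TypeVar

T = TypeVar("T")

_last_n = 0  # hidden mutable state, playing the role of an SML ref cell


def tofun(n: int) -> Callable[[T], T]:
    """Remember n as a side effect and return the identity function."""
    global _last_n
    _last_n = n
    return lambda x: x


def fromfun(f: Callable[[T], T]) -> int:
    """Ignore f entirely and read back the remembered integer."""
    return _last_n


print(fromfun(tofun(5)))
```

The round trip works only because fromfun runs right after tofun; the returned function itself carries no information, which is exactly the violation of referential transparency mentioned above.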
You can devise the following property:
fun prop_inverse f g n = (f o g) n = n
And with definitions for tofun and fromfun,
fun tofun n = ...
fun fromfun f = ...
You can test that they uphold the property:
val prop_test_1 =
  List.all
    (fn i => prop_inverse fromfun tofun i handle _ => false)
    [0, ~1, 1, valOf Int.maxInt, valOf Int.minInt]
And as John suggests, those functions must be impure. I'd also go with exceptions.

Can languages with char counts be described by context free grammars?

I am looking at the German HBCI/FinTS protocol. One peculiarity of this protocol is that it can contain binary blobs, which are prefixed by #NUM_OF_BINARY_CHARS#. Otherwise the protocol is quite simple; a grammar could be described as follows (a bit simplified, terminals are quoted by "):
message = segment+
segment = elements "'"
elements = element "+" elements | element
element = items
items = item ":" items | item
item = [a-zA-Z0-9,._-]* | escaped item
escaped = ?[-#?_-a-zA-Z0-9,.]
The # is missing here!
A sample message could look something like this
FirstSegment+Elem1+Item1:Item2+#4#:'+#+The_last_four_chars_are_binary+Elem4'SecondSegment+Elem5'
Can this language (with the escaping of binary strings) be described by a context free grammar?
No, this language is not context-free. The format you're describing is essentially equivalent to this language
{ #n#w | n is a natural number and |w| = n }
You can show that this isn't context-free by using the pumping lemma for context-free languages. Let the pumping length be p and consider the string s = #1^p#a^N, where 1^p means the digit 1 repeated p times, N is the decimal number 11...1 (p ones), and a is any payload character. In other words, s encodes a binary blob whose declared length is 11...1 (p ones). Now split s into u, v, x, y, z where |vy| ≥ 1 and |vxy| ≤ p. If v or y contains a # sign, then uv^0xy^0z isn't in the language because it doesn't have enough # signs. If v and y are both contained in the length field 1^p, then pumping up changes the declared length without changing the payload, so the result isn't in the language. Similarly, if v and y are both contained in the payload a^N, pumping up or down makes the payload the wrong size. Finally, if v is in the length field and y is in the payload, pumping both up still leaves the payload with the wrong length: the length field is written in decimal, so each extra digit multiplies the required payload size by roughly ten, while each pump adds only |y| characters to the payload.
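Not being context-free does not make the format hard to process by hand: a scanner that reads the count and then skips that many characters is enough. Here is a minimal Python sketch of that idea, based only on the simplified grammar in the question. The function name split_segments is made up, and treating ? as "the next character is literal" is an assumption about the escaping rule.

```python
from typing import List


def split_segments(message: str) -> List[str]:
    """Split a message on ' while skipping over #n#-prefixed binary blobs."""
    segments: List[str] = []
    current: List[str] = []
    i = 0
    while i < len(message):
        ch = message[i]
        if ch == "?":
            # Assumed escape rule: keep the escape and the next char as-is.
            current.append(message[i:i + 2])
            i += 2
        elif ch == "#":
            # Read the decimal count between the two # signs ...
            end = message.index("#", i + 1)
            length = int(message[i + 1:end])
            # ... then copy exactly that many payload chars uninterpreted.
            blob_end = end + 1 + length
            current.append(message[i:blob_end])
            i = blob_end
        elif ch == "'":
            segments.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    return segments


sample = ("FirstSegment+Elem1+Item1:Item2+#4#:'+#"
          "+The_last_four_chars_are_binary+Elem4'SecondSegment+Elem5'")
for segment in split_segments(sample):
    print(segment)
```

The counting step (int followed by a slice of that length) is the part a context-free grammar cannot express, so in practice it is usually handled in the lexer and the rest of the message is parsed with an ordinary grammar.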
Hope this helps!