Basically, I wonder if a language exists where this code would be invalid because, even though counter and distance are both int under the hood, they represent incompatible types in the real world:
#include <stdio.h>
typedef int counter;
typedef int distance;
int main() {
    counter pies = 1;
    distance lengthOfBiscuit = 4;
    printf("total pies: %d\n", pies + lengthOfBiscuit);
    return 0;
}
That compiles with no warnings under "gcc -pedantic -Wall", and the equivalent code compiles cleanly in every other language I've tried. It seems like it would be a good idea to disallow accidentally adding a counter to a distance, so where is the language support?
(Incidentally, the real-life example that prompted this question was web dev work in PHP and Python -- I was trying to make "HTML-escaped string", "SQL-escaped string" and "raw dangerous user input" mutually incompatible, but the best I can seem to get is Apps Hungarian notation as suggested here: http://www.joelonsoftware.com/articles/Wrong.html -- and that still relies on human checking ("wrong code looks wrong") rather than compiler support ("wrong code is wrong").)
Haskell can do this. With GeneralizedNewtypeDeriving you can treat wrapped values as the underlying type, whilst only exposing what you need:
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
newtype Counter = Counter Int deriving Num
newtype Distance = Distance Int deriving Num
main :: IO ()
main = print $ Counter 1 + Distance 2
Now you get the error:
Add.hs:6:28:
Couldn't match expected type ‘Counter’ with actual type ‘Distance’
In the second argument of ‘(+)’, namely ‘Distance 2’
In the second argument of ‘($)’, namely ‘Counter 1 + Distance 2’
You can still "force" the underlying data type with "coerce", or by unwrapping the Ints explicitly.
I should add that any language with "real" types should be able to do this.
In Ada you can have types that use the same representation, but are still distinct types. This is what a "strong typedef" would be (if it existed) in C or C++.
In your case, you could do
type counter is new Integer;
type distance is new Integer;
to create two new types that behave like integers, but cannot be mixed.
Derived types and subtypes in Ada
You could create an object wrapping the underlying type in a member variable and define operations (even in the form of functions) that make sense on that type (e.g. Length would define "plus" allowing addition to another Length, but not to an Angle).
A drawback of this approach is that you have to create a wrapper for each underlying type you care about and define the appropriate operations for each sensible combination, which might be tedious and possibly error-prone. A minimal sketch follows.
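For instance, a hedged Haskell sketch of this wrapper idea (the names Length, addLength and scaleLength are illustrative, not from the answer): instead of deriving a full Num instance, you export only the operations that make sense for the type.
module Length (Length, addLength, scaleLength) where

newtype Length = Length Double

addLength :: Length -> Length -> Length
addLength (Length a) (Length b) = Length (a + b)

scaleLength :: Double -> Length -> Length
scaleLength k (Length a) = Length (k * a)

-- deliberately no Length * Length, and no way to combine a Length with an Angle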
In C++, you could check out Boost's support for dimensional analysis (Boost.Units). The example given is designed primarily for physical dimensions, but I think you could adapt it to many others as well.
The concept of lambdas (anonymous functions) is very clear to me. And I'm aware of polymorphism in terms of classes, with runtime/dynamic dispatch used to call the appropriate method based on the instance's most derived type. But how exactly can a lambda be polymorphic? I'm yet another Java programmer trying to learn more about functional programming.
You will observe that I don't talk about lambdas much in the following answer. Remember that in functional languages, any function is simply a lambda bound to a name, so what I say about functions translates to lambdas.
Polymorphism
Note that polymorphism doesn't really require the kind of "dispatch" that OO languages implement through derived classes overriding virtual methods. That's just one particular kind of polymorphism, subtyping.
Polymorphism itself simply means that a function doesn't allow just one particular type of argument, but is able to act accordingly for any of the allowed types. The simplest example: you don't care about the type at all, but simply pass on whatever is handed in. Or, to make it not quite so trivial, wrap it in a single-element container. You could implement such a function in, say, C++:
template<typename T> std::vector<T> wrap1elem( T val ) {
    return std::vector<T>{ val };
}
but you couldn't implement it as a lambda, because C++ (time of writing: C++11) doesn't support polymorphic lambdas.
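(Jumping ahead a little for contrast: in Haskell the same wrapper can be written directly as a lambda, and it is polymorphic. A one-line sketch, reusing the wrap1elem name from above:)
wrap1elem :: a -> [a]
wrap1elem = \val -> [val]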
Untyped values
...At least not in this way, that is. C++ templates implement polymorphism in rather an unusual way: the compiler actually generates a monomorphic function for every type that anybody passes to the function, in all the code it encounters. This is necessary because of C++'s value semantics: when a value is passed in, the compiler needs to know the exact type (its size in memory, possible child nodes etc.) in order to make a copy of it.
In most newer languages, almost everything is just a reference to some value, and when you call a function it doesn't get a copy of the argument objects but just a reference to the already-existing ones. Older languages require you to explicitly mark arguments as reference / pointer types.
A big advantage of reference semantics is that polymorphism becomes much easier: pointers always have the same size, so the same machine code can deal with references to any type at all. That makes, very uglily¹, a polymorphic container-wrapper possible even in C:
#include <stdlib.h>

typedef struct {
    void** contents;
    int size;
} vector;

vector wrap1elem_by_voidptr(void* ptr) {
    vector v;
    v.contents = malloc(sizeof(void*));  /* room for one stored pointer */
    v.contents[0] = ptr;
    v.size = 1;
    return v;
}

#define wrap1elem(val) wrap1elem_by_voidptr(&(val))
Here, void* is just a pointer to any unknown type. The obvious problem thus arising: vector doesn't know what type(s) of elements it "contains"! So you can't really do anything useful with those objects. Except if you do know what type it is!
int sum_contents_int(vector v) {
    int acc = 0, i;
    for(i = 0; i < v.size; ++i) {
        acc += * (int*) (v.contents[i]);
    }
    return acc;
}
Obviously, this is extremely laborious. What if the type is double? What if we want the product, not the sum? Of course, we could write each case by hand. Not a nice solution.
What would be better is if we had a generic function that takes the instruction for what to do as an extra argument! C has function pointers:
int accum_contents_int(vector v, void (*combine)(int*, int)) {
    int acc = 0, i;
    for(i = 0; i < v.size; ++i) {
        combine(&acc, * (int*) (v.contents[i]));
    }
    return acc;
}
That could then be used like
void multon(int* acc, int x) {
    *acc *= x;
}

int main() {
    int a = 3, b = 5;
    vector v = wrap2elems(a, b);  /* assume a two-element analogue of wrap1elem */
    printf("%i\n", accum_contents_int(v, multon));
}
Apart from still being cumbersome, all the above C code has one huge problem: it's completely unchecked whether the container elements actually have the right type! The casts from void* will happily fire on any type, but if the type is wrong the result will be complete garbage².
Classes & Inheritance
That problem is one of the main issues which OO languages solve by trying to bundle all operations you might perform right together with the data, in the object, as methods. While compiling your class, the types are monomorphic so the compiler can check the operations make sense. When you try to use the values, it's enough if the compiler knows how to find the method. In particular, if you make a derived class, the compiler knows "aha, it's ok to call that method from the base class even on a derived object".
Unfortunately, that would mean all you achieve by polymorphism is equivalent to composing data and simply calling the (monomorphic) methods on a single field. To actually get different behaviour (but controlledly!) for different types, OO languages need virtual methods. What this amounts to is basically that the class has extra fields with pointers to the method implementations, much like the pointer to the combine function I used in the C example – with the difference that you can only supply an overriding method by adding a derived class, for which the compiler again knows the type of all the data fields etc., so you're safe and all.
Sophisticated type systems, checked parametric polymorphism
While inheritance-based polymorphism obviously works, I can't help saying it's just crazy stupid³ – sure, a bit limiting. If you want to use just one particular operation that happens not to be implemented as a class method, you need to make an entire derived class. Even if you just want to vary an operation in some way, you need to derive and override a slightly different version of the method.
Let's revisit our C code. On the face of it, we notice it should be perfectly possible to make it type-safe, without any method-bundling nonsense. We just need to make sure no type information is lost – not during compile-time, at least. Imagine (Read ∀T as "for all types T")
∀T: {
    typedef struct {
        T* contents;
        int size;
    } vector<T>;
}

∀T: {
    vector<T> wrap1elem(T* elem) {
        vector<T> v;
        v.contents = malloc(sizeof(T));
        v.contents[0] = *elem;
        v.size = 1;
        return v;
    }
}

∀T: {
    void accum_contents(vector<T> v, void (*combine)(T*, const T*), T* acc) {
        int i;
        for(i = 0; i < v.size; ++i) {
            combine(acc, &v.contents[i]);
        }
    }
}
Observe how, even though the signatures look a lot like the C++ template thing at the top of this post (which, as I said, really is just auto-generated monomorphic code), the implementation actually is pretty much just plain C. There are no T values in there, just pointers to them. No need to compile multiple versions of the code: at runtime, the type information isn't needed, we just handle generic pointers. At compile time, we do know the types and can use the function signature to make sure they match. I.e., if you wrote
void evil_sumon (int* acc, double* x) { *acc += *x; }
and tried to do
vector<float> v; char acc;
accum_contents(v, evil_sumon, acc);
the compiler would complain because the types don't match: in the declaration of accum_contents it says the type may vary, but all occurrences of T do need to resolve to the same type.
And that is exactly how parametric polymorphism works in languages of the ML family as well as Haskell: the functions really don't know anything about the polymorphic data they're dealing with. But they are given, as arguments, the specialised operations which do have this knowledge.
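As a small Haskell illustration of that idea (accumContents, sumInts and prodDoubles are illustrative names; this is essentially the Prelude's foldl): the generic function knows nothing about the element type, and is handed a combining operation that does.
-- knows nothing about a; the caller supplies the operation that does
accumContents :: (acc -> a -> acc) -> acc -> [a] -> acc
accumContents combine = go
  where
    go acc []     = acc
    go acc (x:xs) = go (combine acc x) xs

sumInts :: [Int] -> Int
sumInts = accumContents (+) 0

prodDoubles :: [Double] -> Double
prodDoubles = accumContents (*) 1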
In a language like Java (prior to lambdas), parametric polymorphism doesn't gain you much: since the compiler makes it deliberately hard to define "just a simple helper function" in favour of having only class methods, you can simply go the derive-from-class way right away. But in functional languages, defining small helper functions is the easiest thing imaginable: lambdas!
And so you can write incredibly terse code in Haskell:
Prelude> foldr (+) 0 [1,4,6]
11
Prelude> foldr (\x y -> x+y+1) 0 [1,4,6]
14
Prelude> let f start = foldr (\_ (xl,xr) -> (xr, xl)) start
Prelude> :t f
f :: (t, t) -> [a] -> (t, t)
Prelude> f ("left", "right") [1]
("right","left")
Prelude> f ("left", "right") [1, 2]
("left","right")
Note how, in the lambda I defined as a helper for f, I didn't have any clue about the type of xl and xr; I merely wanted to swap a tuple of these elements, which only requires the two types to be the same. So that would be a polymorphic lambda, with the type
\_ (xl, xr) -> (xr, xl) :: ∀ a t. a -> (t,t) -> (t,t)
¹ Apart from the weird explicit malloc stuff, type safety etc.: code like that is extremely hard to work with in languages without a garbage collector, because somebody always needs to clean up memory once it's not needed anymore, while making sure nobody still holds a reference to the data and might in fact need it still. That's nothing you have to worry about in Java, Lisp, Haskell...
² There is a completely different approach to this: the one dynamic languages choose. In those languages, every operation needs to make sure it works with any type (or, if that's not possible, raise a well-defined error). Then you can arbitrarily compose polymorphic operations, which is on one hand "nicely trouble-free" (not as trouble-free as with a really clever type system like Haskell's, though) but OTOH incurs quite a heavy overhead, since even primitive operations need type-decisions and safeguards around them.
³ I'm of course being unfair here. The OO paradigm has more to it than just type-safe polymorphism; it enables many things that e.g. old ML with its Hindley-Milner type system couldn't do (ad-hoc polymorphism: Haskell has type classes for that, SML has modules), and even some things that are pretty hard in Haskell (mainly, storing values of different types in a variable-size container). But the more you get accustomed to functional programming, the less need you will feel for such stuff.
In C++, starting from C++14, a polymorphic (or generic) lambda is a lambda that can take any type as an argument. Basically, it's a lambda with an auto parameter type:
auto lambda = [](auto){};
Is there a context that you've heard the term "polymorphic lambda"? We might be able to be more specific.
The simplest way that a lambda can be polymorphic is to accept arguments whose type is (partly-)irrelevant to the final result.
e.g. the lambda
\(head:tail) -> tail
has the type [a] -> [a] -- e.g. it's fully-polymorphic in the inner type of the list.
Other simple examples are the likes of
\_ -> 5 :: Num n => a -> n
\x f -> f x :: a -> (a -> b) -> b
\n -> n + 1 :: Num n => n -> n
etc.
(Notice the Num n examples which involve typeclass dispatch)
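To make that last point concrete, here is a small sketch (inc, demoInt and demoDouble are illustrative names): the same constrained lambda is used at two different Num instances, and the typeclass machinery supplies the right (+) each time.
inc :: Num n => n -> n
inc = \n -> n + 1

demoInt :: Int
demoInt = inc 41        -- uses the Int instance of Num; 42

demoDouble :: Double
demoDouble = inc 2.5    -- uses the Double instance of Num; 3.5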
I have been learning Haskell for about one year, and I came up with a question: could the talented compiler writers add a new feature, which I call "subset" types, to enhance Haskell's type system so that it catches many errors, including IOExceptions, at compile time? I'm a novice in the theory of types, so forgive my wishful thinking.
My initial purpose is not to work out how to solve the problem, but to find out whether a related solution already exists and, if so, why it has not been introduced into Haskell.
Haskell is nearly perfect in my mind except for some little things, and I will express my wish for the Haskell of the future in the following lines.
The following is the major one:
Suppose we could define a type that is just a "subset" of Int, assuming Haskell allowed us to do that, like below:
data IntNotZero = Int {except `0`} -- certainly not legal Haskell; I just assume Haskell lets us define a type as a "subset" of an already existing type. I'm a novice in type theory, so forgive me.
If a function needs an Int parameter, then a value of IntNotZero, being just a "subset" of Int, could also be passed to it. But if a function needs an IntNotZero, then passing an Int would be illegal.
For example:
div' :: Int -> IntNotZero -> Int
div' = div

aFunction :: Int -> Int -> Int -- if we carelessly write this, the compiler will complain about a type conflict
aFunction = div'

aFunction2 :: Int -> Int -> Int -- we have to distinguish between `Int` and `IntNotZero`
aFunction2 m n = type n of -- an assumed syntax like `case ... of` to separate a "subset" from its complement; `case ... of` only works on different patterns
    IntNotZero -> m `div` n
    otherwise  -> m + n
For a more useful example:
data HandleNotClosed = Handle {not closed} -- this type denotes a Handle that is not closed

hGetContents' :: HandleNotClosed -> IO String -- this function needs a HandleNotClosed; a plain Handle would be a type conflict
hGetContents' = hGetContents
wrongMain = do
    ...
    h <- openFile "~/xxx/aa" ReadMode
    ... -- we do many tasks with h, and we may have accidentally closed h
    contents <- hGetContents' h -- this will raise a type conflict, because h has type Handle, not HandleNotClosed
    ...

rightMain = do
    ...
    h <- openFile "~/xxx/aa" ReadMode
    ... -- we do many tasks with h, and we may have accidentally closed h
    type h of -- the new syntax
        HandleNotClosed -> do
            contents <- hGetContents' h
            ...
        otherwise -> ...
If we could combine ordinary IO with Exception into a new "superset" in this way, then we might get free of IOErrors.
What you want sounds similar to "refinement types" à la Liquid Haskell. This is an external tool that allows you to "refine" your Haskell types by specifying additional predicates that hold over your types. To check that these hold, you use an SMT solver to verify all the constraints have been satisfied.
The following code snippets are taken from their introductory blog post.
For example, you could write a type saying that zero is exactly 0:
{-@ zero :: { v : Int | v = 0 } @-}
zero :: Int
zero = 0
You'll notice that the syntax for types looks just like set notation from math: you're defining a new type as a subset of the old one. In this case, you're defining the type of Ints that are equal to 0.
You can use this system to write a safe divide function:
{-@ divide :: Int -> { v : Int | v != 0 } -> Int @-}
divide :: Int -> Int -> Int
divide n 0 = error "Cannot divide by 0."
divide n d = n `div` d
When you actually try to compile this program, Liquid Haskell will see that having 0 as the denominator violates the predicate and so the call to error cannot happen. Moreover, when you try to use divide, it will check that the argument you pass in cannot be 0.
Of course, to make this useful, you have to be able to add information about the postconditions of your functions, not just the preconditions. You can just do this by refining the result type of the function; for example, you can imagine the following type for abs:
{-@ abs :: Int -> { v : Int | 0 <= v } @-}
Now the type system knows that the result of calling abs will never be negative, and it can take advantage of this fact when it needs to verify your program.
Like other people mentioned, using this sort of type system means you will have to have proofs in your code. The advantage of Liquid Haskell is that it uses an SMT solver to generate the proof for you automatically--you just have to write the assertions.
Liquid Haskell is still a research project, and it's limited by what can reasonably be done with an SMT solver. I haven't used it myself, but it looks really awesome and seems to be exactly what you want. One thing I'm not sure about is how it interacts with custom types and IO--something you might want to look into yourself.
The problem is that the compiler can't determine if something is of type IntNotZero. For example:
f :: Int -> IntNotZero
f x = someExtremelyComplexComputation
the compiler would have to prove that someExtremelyComplexComputation doesn't produce a zero result, which is in general impossible.
One way to approach this in plain Haskell is to create a module that hides the representation of IntNotZero and publishes only a smart constructor, such as
module MyMod (IntNotZero(), intNotZero) where
newtype IntNotZero = IntNotZero Int
intNotZero :: Int -> IntNotZero
intNotZero 0 = error "Zero argument"
intNotZero x = IntNotZero x
-- etc
The obvious drawback is that the constraint is checked only at runtime.
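To sketch how this plays out (divNotZero and the example values are illustrative names, not from the answer; divNotZero would have to live inside MyMod, or MyMod would need to export an accessor, since the constructor is hidden):
divNotZero :: Int -> IntNotZero -> Int
divNotZero n (IntNotZero d) = n `div` d

good :: Int
good = 10 `divNotZero` intNotZero 2   -- 5

bad :: Int
bad = 10 `divNotZero` intNotZero 0    -- type-checks, but fails with "Zero argument" at runtime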
There are more complex systems than Haskell that use dependent types. These are types that depend on values, and they allow you to express just what you want. Unfortunately these systems are rather complex and not very widespread. If you're interested in the subject, I suggest you read Certified Programming with Dependent Types by Adam Chlipala.
We have types and we have values. A type is a (possibly infinite) set of values. String is a type, and all the possible string values are part of the String set. Now, the most important distinction between types and values is this: types are about compile time and values are available at runtime.
Your first example asks for a new type that is a subtype (or subset) of Int such that "the value of the Int can't be zero"; in other words, you want to define a type which puts a restriction on a value. But types are compile-time things and values are runtime things: a compile-time thing can't restrict a runtime thing, because the runtime thing is not there yet for the compile-time thing to consume.
Similarly, the handle value is a runtime thing, and only at runtime can you know whether it is closed; for that you have functions to check whether the handle is closed or not.
IO is all about runtime, and you can't use a type system to get free of IOErrors.
For modeling runtime failures you can use data types like Maybe or Either to indicate that the function may not be able to do what it was supposed to do, and since these data types implement Functor, Monad and other computation patterns, you can easily compose them.
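A small sketch of that approach (safeDiv and ratioOfRatios are illustrative names): the possibility of failure is recorded in the result type, and Maybe's Monad instance composes the failing steps.
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv n d = Just (n `div` d)

-- all three divisions must succeed for the whole computation to be Just
ratioOfRatios :: Int -> Int -> Int -> Int -> Maybe Int
ratioOfRatios a b c d = do
  x <- safeDiv a b
  y <- safeDiv c d
  safeDiv x y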
A type system is more of a structuring/design tool which makes things more explicit and clear and makes you think more about your design, but it can't do what functions are supposed to do.
The film is: Typed Lambda Calculus. Lambda in the lead role, Typed in a supporting role :)
To expand on @augustss's comment, it's quite possible using dependently typed languages. Haskell is not exactly dependently typed, but it's close enough that dependent types can be "faked".
People today don't commonly use dependent types, for several reasons:
they're still very much research topics
they complicate and weaken type inference
they're somewhat more difficult to use in some circumstances
they can cause type-checking to take much longer or even fail to terminate, and
it's just more difficult to create production dependently-typed compilers.
That said, proponents of dependent typing find the error reduction you're looking for quite tenable. They also anticipate better safety and faster compiled binaries. Finally, dependently typed systems can be used as "proof systems" for mathematics. For a very current example consider the "Homotopy Type Theory" Agda code which formally proves many of the assertions of a new field of dependent typing math.
For a taste of dependent typing you can read/explore either Pierce's Software Foundations or Chlipala's Certified Programming with Dependent Types.
With dependent types you might introduce a type like this
div :: Int -> (x :: Int) -> (Inequal x 0) -> Int
where the second argument introduces a dependency of the type on the actual value of the argument and the third argument demands a proof of the proposition that x /= 0. With such a proof in hand (so long as nobody cheats and uses undefined as that proof) it's possible to feel confident dividing by the second argument could never be undefined.
The challenge comes from creating (automatically or manually) a value to pass in as the third argument. For such a simple example it may not be too difficult, but it becomes possible to encode demands for proofs that are very difficult to generate, or even impossible.
As an example of another advantage, consider
fold1 :: (f :: a -> a -> a) -> Associative f -> [a] -> a
which, ignoring the second argument, is just a regular fold. The second argument could be a proof that f associates and thus allows us to use a tree-like merging algorithm with log complexity instead of linear. But, in order to "prove" Associative we need to embed a theory of application and association into our types and have the competency to create proofs within it.
Simpler invariants exist, such as the ubiquitous Vec type of "fixed-length vectors". These are lists where the length of the list (a value) is included in the type, allowing us to have nice things like
(++) :: Vec n a -> Vec m a -> Vec (n + m) a
which, given some good theories of addition (or, more generally, magmas, monoids, and groups) in our type system, isn't too difficult to write: the result type holds information about the way lengths of Vecs interact under concatenation.
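For a taste, here is a hedged GHC Haskell sketch of such a Vec, faking the dependent length with type-level naturals (the names Nat, Vec, Add and append are illustrative, and the required language extensions are listed in the pragma):
{-# LANGUAGE DataKinds, GADTs, TypeFamilies, KindSignatures #-}

data Nat = Z | S Nat

-- a list whose length is tracked in its type
data Vec (n :: Nat) a where
  VNil  :: Vec 'Z a
  VCons :: a -> Vec n a -> Vec ('S n) a

-- type-level addition: the "theory of addition" mentioned above
type family Add (n :: Nat) (m :: Nat) :: Nat where
  Add 'Z     m = m
  Add ('S n) m = 'S (Add n m)

-- concatenation: the result length is the sum of the input lengths, by construction
append :: Vec n a -> Vec m a -> Vec (Add n m) a
append VNil         ys = ys
append (VCons x xs) ys = VCons x (append xs ys)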
How is this possible? What is going on there?
Is there a name for this?
What other languages have this same behaviour?
Any without a strong typing system?
This behaviour is really simple and intuitive if you look at the types. To avoid the complications of infix operators like +, I'm going to use the function plus instead. I'm also going to specialise plus to work only on Int, to reduce the typeclass line noise.
Say we have a function plus, of type Int -> Int -> Int. One way to read that is "a function of two Ints that returns an Int". But that notation is a little clumsy for that reading, isn't it? The return type isn't singled out specially anywhere. Why would we write function type signatures this way? Because the -> is right associative, an equivalent type would be Int -> (Int -> Int). This looks much more like it's saying "a function from an Int to (a function from an Int to an Int)". But those two types are in fact exactly the same, and the latter interpretation is the key to understanding how this behaviour works.
Haskell views all functions as being from a single argument to a single result. There may be computations you have in mind where the result depends on two or more inputs (such as plus). Haskell says that the function plus is a function that takes a single input and produces an output which is another function. This second function takes a single input and produces an output which is a number. Because the second function was computed by the first (and will be different for different inputs to the first function), the "final" output can depend on both inputs, so we can implement computations with multiple inputs out of functions that each take only a single input.
I promised this would be really easy to understand if you looked at the types. Here's some example expressions with their types explicitly annotated:
plus :: Int -> Int -> Int
plus 2 :: Int -> Int
plus 2 3 :: Int
If something is a function and you apply it to an argument, then to get the type of the result of that application all you need to do is remove everything up to the first arrow from the function's type. If that leaves a type that still has arrows, then you still have a function! As you add arguments to the right of an expression, you remove parameter types from the left of its type. The type makes it immediately clear what the types of all the intermediate results are, and why plus 2 is a function which can be further applied (its type has an arrow) and plus 2 3 is not (its type doesn't have an arrow).
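A small GHC-checkable illustration of that rule, using the specialised plus from above (addTwo, five and bumped are illustrative names):
plus :: Int -> Int -> Int
plus x y = x + y

addTwo :: Int -> Int          -- plus applied to one argument: an arrow remains, so it's still a function
addTwo = plus 2

five :: Int                   -- applying the remaining argument removes the last arrow
five = addTwo 3

bumped :: [Int]
bumped = map (plus 1) [1, 4, 6]   -- [2,5,7]: partial application in everyday use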
"Currying" is the process of turning a function of two arguments into a function of one argument that returns a function of another argument that returns whatever the original function returned. It's also used to refer to the property of languages like Haskell that automatically have all functions work this way; people will say that Haskell "is a curried language" or "has currying", or "has curried functions".
Note that this works particularly elegantly because Haskell's syntax for function application is simple token adjacency. You are free to read plus 2 3 as the application of plus to 2 arguments, or the application of plus to 2 and then the application of the result to 3; you can mentally model it whichever way most fits what you're doing at the time.
In languages with C-like function application by parenthesised argument list, this breaks down a bit. plus(2, 3) is very different from plus(2)(3), and in languages with this syntax the two versions of plus involved would probably have different types. So languages with that kind of syntax tend not to have all functions be curried all the time, or even to have automatic currying of any function you like. But such languages have historically also tended not to have functions as first class values, which makes the lack of currying a moot point.
In Haskell, all functions take exactly 1 input, and produce exactly 1 output. Sometimes, the input to or output of a function can be another function. The input to or output of a function can also be a tuple. You can simulate a function with multiple inputs in one of two ways:
Use a tuple as input
(in1, in2) -> out
Use a function as output*
in1 -> (in2 -> out)
Likewise, you can simulate a function with multiple outputs in one of two ways:
Use a tuple as output*
in -> (out1, out2)
Use a function as a "second input" (a la function-as-output)
in -> ((out1 -> (out2 -> a)) -> a)
*this way is typically favored by Haskellers (a sketch of all four shapes follows)
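Here is a small Haskell sketch of all four shapes (addT, addC, divMod' and divModK are illustrative names):
-- two inputs, both ways
addT :: (Int, Int) -> Int            -- tuple as input
addT (x, y) = x + y

addC :: Int -> (Int -> Int)          -- function as output (the usual, curried style)
addC x y = x + y

-- two outputs, both ways
divMod' :: Int -> Int -> (Int, Int)                  -- tuple as output
divMod' n d = (n `div` d, n `mod` d)

divModK :: Int -> Int -> ((Int -> Int -> r) -> r)    -- function as a "second input"
divModK n d k = k (n `div` d) (n `mod` d)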
The (+) function simulates taking 2 inputs in the typical Haskell way of producing a function as output. (Specializing to Int for ease of communication:)
(+) :: Int -> (Int -> Int)
For the sake of convenience, -> is right-associative, so the type signature for (+) can also be written
(+) :: Int -> Int -> Int
(+) is a function that takes in a number, and produces another function from number to number.
(+) 5 is the result of applying (+) to the argument 5, therefore, it is a function from number to number.
(5 +) is another way to write (+) 5
2 + 3 is another way of writing (+) 2 3. Function application is left-associative, so this is another way of writing (((+) 2) 3). In other words: Apply the function (+) to the input 2. The result will be a function. Take that function, and apply it to the input 3. The result of that is a number.
Therefore, (+) is a function, (5 +) is a function, and (+) 2 3 is a number.
In Haskell, you can take a function of two arguments, apply it to one argument, and get a function of one argument. In fact, strictly speaking, + isn't a function of two arguments, it's a function of one argument that returns a function of one argument.
In layman's terms, + is the actual function and it is waiting to receive its parameters (in this case 2) before it returns a result. If you don't give it both parameters, it will remain a function waiting for the remaining parameter.
It's called Currying
Lots of functional languages (Scala, Scheme, etc.)
Most functional languages are strongly typed, but this is good in the end because it reduces errors, which works well in enterprise or critical systems.
As a side note, the language Haskell is named after Haskell Curry, who re-discovered the phenomenon of Functional Currying while working on combinatory logic.
Languages like Haskell or OCaml have a syntax that lends itself particularly to currying, but you can do it in other languages with dynamic typing, like currying in Scheme.
I have a question that I cannot answer myself, but it seems like a fundamentally good question to clear up:
Why do some languages restrict the data returned from a function to a single item?
Is this serving some benefit? Or is it a practice brought over from Maths?
An example being (in Scala):
def login(username: String, password: String): User
If I want to return multiple items, I cannot express it in the same manner as I just did for the input arguments (now entering imaginary Scala land):
def login(username: String, password: String): (User, Context, String)
Or even with named data returned:
def login(username: String, password: String): (user: User, context: Context, serverMessage: String)
There is no relationship: as observed, an arbitrary number of values can be returned, even if they must be "packaged" into a single value.
Imagine a language that can only accept a single tuple and can only return a single tuple from a function (the tuples can be any size). These functions then resemble mathematical functions transforming a vector from one space to another.
However, some reasons why it might be so:
Most functions only return one value, which may be a collection of values (object, sequence, etc.). Decomposition of the single value is supported in a number of languages, even though "only one value is returned".
The calling conventions and signatures are simpler: there is no special case/overhead to signal that n-values are being returned: there is no need to use part of the stack to return multiple values, a single register will do.
The need to fit in with the target architecture: earlier, especially lower-level languages, were heavily influenced by the computer architecture. In the case of Scala, for instance, it must work on the JVM.
It's just how the language was designed. Many (most?) languages borrow heavily -- syntax and/or methodologies -- from existing languages. Sometimes this is good, sometimes it is not so good. C# appeased Java appeased C++ appeased C, for instance: it's all about the market share.
It Just Works.
Even while "returning only one value", programming languages already have different ways of dealing with it. As noted in the post, some languages allow decomposition (the tuple returned as "decomposed" into it's two values during an assignment):
def multiMath(i):
    return (i + i, i * i)

doubled, squared = multiMath(4)
# doubled is 8
# squared is 16
Additionally, other languages like C#, which lacks decomposition, allow pass-by-reference (or emulate it with mutation of an object):
void multiMath (int a, out int doubled, out int squared) {
    doubled = a + a;
    squared = a * a;
}

int d, s;
multiMath(4, out d, out s);
// d is now 8
// s is now 16
And, of course... ;-)
class ANewClassForThisFunctionsReturn {
...
}
There are likely more methods I am not aware of.
Happy coding.
Because typically the returned data is assigned to a variable, and there are only a few languages that can assign two variables in a single statement.
A = sum(1,2)
B,C = dateTime()
Technically there is no problem returning more than one value, because parameters are passed on the stack; the issue is with assignment. Here is a sample of where this would be needed:
/* div example */
#include <stdio.h>
#include <stdlib.h>
int main ()
{
    div_t divresult;
    divresult = div (38,5);
    printf ("38 div 5 => %d, remainder %d.\n", divresult.quot, divresult.rem);
    return 0;
}
vs.
long quot, rem;
quot, rem = div(38,5)
The standard C library declares:
int fputc(int c, FILE *stream);
And such behavior occurs many times, e.g.:
int putc(int c, FILE *stream);
int putchar(int c);
Why not use char, as the argument really is a character?
If using int is necessary, when should I use int instead of char?
Most likely (in my opinion, since much of the rationale behind early C is lost in the depths of time), it was simply to mirror the types used in the fgetc family of functions, which must be able to return any real character plus the EOF special value. The fgetc function gets the next character converted to an int, and uses a special marker value EOF to indicate the end of the stream.
To do that, they needed the wider int type since a char isn't quite large enough to hold all possible characters plus one more thing.
And, since the developers of C seemed to prefer a rather minimalist approach to code, it makes sense that they would use the same type, to allow for code such as:
filecopy(ifp, ofp)
FILE *ifp;
FILE *ofp;
{
    int c;

    while ((c = fgetc(ifp)) != EOF)
        fputc(c, ofp);
}
No char parameters in K&R C
One reason is that in early versions¹ of C there were no char parameters.
Yes, you could declare a parameter as char or float, but it was treated as int or double. Therefore it would, back then, have been somewhat misleading to document an interface as taking a char argument.
I believe this is still true today for functions declared without prototypes, in order for it to be possible to interoperate with older code.
¹ Early, but still widespread. C was a quick success and became the first (and still, mostly, the only) widely successful systems programming language.