I read through the Wikipedia article Existential types. I gathered that they're called existential types because of the existential operator (∃). I'm not sure what the point of it is, though. What's the difference between
T = ∃X { X a; int f(X); }
and
T = ∀X { X a; int f(X); }
?
When someone defines a universal type ∀X they're saying: You can plug in whatever type you want, I don't need to know anything about the type to do my job, I'll only refer to it opaquely as X.
When someone defines an existential type ∃X they're saying: I'll use whatever type I want here; you won't know anything about the type, so you can only refer to it opaquely as X.
Universal types let you write things like:
void copy<T>(List<T> source, List<T> dest) {
...
}
The copy function has no idea what T will actually be, but it doesn't need to know.
Existential types would let you write things like:
interface VirtualMachine<B> {
B compile(String source);
void run(B bytecode);
}
// Now, if you had a list of VMs you wanted to run on the same input:
void runAllCompilers(List<∃B:VirtualMachine<B>> vms, String source) {
for (∃B:VirtualMachine<B> vm : vms) {
B bytecode = vm.compile(source);
vm.run(bytecode);
}
}
Each virtual machine implementation in the list can have a different bytecode type. The runAllCompilers function has no idea what the bytecode type is, but it doesn't need to; all it does is relay the bytecode from VirtualMachine.compile to VirtualMachine.run.
Java type wildcards (ex: List<?>) are a very limited form of existential types.
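For this particular example, Java's wildcards plus a small generic helper method happen to be enough. The following is only a sketch under assumptions: EchoVM, LengthVM and compileAndRun are hypothetical names invented for illustration, and the log list exists just so the effect can be observed.

```java
import java.util.List;

interface VirtualMachine<B> {
    B compile(String source);
    void run(B bytecode);
}

// Hypothetical VM whose "bytecode" is simply the source string.
final class EchoVM implements VirtualMachine<String> {
    private final List<String> log;
    EchoVM(List<String> log) { this.log = log; }
    public String compile(String source) { return source; }
    public void run(String bytecode) { log.add("echo:" + bytecode); }
}

// Hypothetical VM whose "bytecode" is the length of the source.
final class LengthVM implements VirtualMachine<Integer> {
    private final List<String> log;
    LengthVM(List<String> log) { this.log = log; }
    public Integer compile(String source) { return source.length(); }
    public void run(Integer bytecode) { log.add("length:" + bytecode); }
}

final class WildcardCaptureDemo {
    // The helper gives the hidden bytecode type a name (B) for one VM at a time.
    static <B> void compileAndRun(VirtualMachine<B> vm, String source) {
        B bytecode = vm.compile(source);
        vm.run(bytecode);
    }

    // Each element may have a different, unknown bytecode type.
    static void runAllCompilers(List<VirtualMachine<?>> vms, String source) {
        for (VirtualMachine<?> vm : vms) {
            compileAndRun(vm, source); // wildcard capture binds B here
        }
    }
}
```

The loop body cannot declare a variable of the bytecode type itself, which is why the work is delegated to the generic helper.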
Update: Forgot to mention that you can sort of simulate existential types with universal types. First, wrap your universal type to hide the type parameter. Second, invert control (this effectively swaps the "you" and "I" part in the definitions above, which is the primary difference between existentials and universals).
// A wrapper that hides the type parameter 'B'
interface VMWrapper {
void unwrap(VMHandler handler);
}
// A callback (control inversion)
interface VMHandler {
<B> void handle(VirtualMachine<B> vm);
}
Now we can have the VMWrapper call our own VMHandler, which has a universally typed handle function. The net effect is the same: our code has to treat B as opaque.
void runWithAll(List<VMWrapper> vms, final String input)
{
for (VMWrapper vm : vms) {
vm.unwrap(new VMHandler() {
public <B> void handle(VirtualMachine<B> vm) {
B bytecode = vm.compile(input);
vm.run(bytecode);
}
});
}
}
An example VM implementation:
class MyVM implements VirtualMachine<byte[]>, VMWrapper {
public byte[] compile(String input) {
return null; // TODO: somehow compile the input
}
public void run(byte[] bytecode) {
// TODO: Somehow evaluate 'bytecode'
}
public void unwrap(VMHandler handler) {
handler.handle(this);
}
}
A value of an existential type like ∃x. F(x) is a pair containing some type x and a value of the type F(x). Whereas a value of a polymorphic type like ∀x. F(x) is a function that takes some type x and produces a value of type F(x). In both cases, the type closes over some type constructor F.
Note that this view mixes types and values. The existential proof is one type and one value. The universal proof is an entire family of values indexed by type (or a mapping from types to values).
So the difference between the two types you specified is as follows:
T = ∃X { X a; int f(X); }
This means: A value of type T contains a type called X, a value a:X, and a function f:X->int. A producer of values of type T gets to choose any type for X, and a consumer can't know anything about X except that there's one example of it called a and that this value can be turned into an int by giving it to f. In other words, a value of type T knows how to produce an int somehow. So we could eliminate the intermediate type X and just say:
T = int
The universally quantified one is a little different.
T = ∀X { X a; int f(X); }
This means: A value of type T can be given any type X, and it will produce a value a:X, and a function f:X->int no matter what X is. In other words: a consumer of values of type T can choose any type for X. And a producer of values of type T can't know anything at all about X, but it has to be able to produce a value a for any choice of X, and be able to turn such a value into an int.
Obviously, implementing this type is impossible, because there is no program that can produce a value of every imaginable type, unless you allow absurdities like null or bottoms.
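A small Java sketch of that last point (ForAllDemo and its method names are invented for illustration): a method that promises a value of whatever type X the caller picks can only return null or never return normally.

```java
final class ForAllDemo {
    // Claims to produce a value of ANY type X the caller chooses.
    // Without knowing X, null is the only value that fits every X.
    static <X> X anyValue() {
        return null;
    }

    // The other way out: never return normally ("bottom").
    static <X> X bottom() {
        throw new UnsupportedOperationException("no value for arbitrary X");
    }

    // f cannot assume anything specific about X, so here it just returns a constant.
    static <X> int f(X x) {
        return 0;
    }
}
```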
Since an existential is a pair, an existential argument can be converted to a universal one via currying.
(∃b. F(b)) -> Int
is the same as:
∀b. (F(b) -> Int)
The former is a rank-2 existential. This leads to the following useful property:
Every existentially quantified type of rank n+1 is a universally quantified type of rank n.
There is a standard algorithm for turning existentials into universals, called Skolemization.
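The currying equivalence can be illustrated loosely in Java by taking F to be List. This is a sketch, not a proof; the method names are made up, and Java's wildcard stands in for the existential.

```java
import java.util.List;

final class CurryingDemo {
    // "Universal" form: for every B, a function from List<B> to int.
    static <B> int countForAll(List<B> xs) {
        return xs.size();
    }

    // "Existential argument" form: a list of some unknown element type.
    static int countExists(List<?> xs) {
        return countForAll(xs); // capture conversion picks B
    }

    // And back again: any List<B> can be passed where List<?> is expected.
    static <B> int countForAllAgain(List<B> xs) {
        return countExists(xs);
    }
}
```

Each form can be defined in terms of the other, which is the sense in which the two signatures are interchangeable.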
I think it makes sense to explain existential types together with universal types, since the two concepts are complementary, i.e. one is the "opposite" of the other.
I cannot answer every detail about existential types (such as giving an exact definition, listing all possible uses, their relation to abstract data types, etc.) because I'm simply not knowledgeable enough for that. I'll demonstrate only (using Java) what this HaskellWiki article states to be the principal effect of existential types:
Existential types can be used for several different purposes. But what they do is to 'hide' a type variable on the right-hand side. Normally, any type variable appearing on the right must also appear on the left […]
Example set-up:
The following pseudo-code is not quite valid Java, even though it would be easy enough to fix that. In fact, that's exactly what I'm going to do in this answer!
class Tree<α>
{
α value;
Tree<α> left;
Tree<α> right;
}
int height(Tree<α> t)
{
return (t != null) ? 1 + max( height(t.left), height(t.right) )
: 0;
}
Let me briefly spell this out for you. We are defining…
a recursive type Tree<α> which represents a node in a binary tree. Each node stores a value of some type α and has references to optional left and right subtrees of the same type.
a function height which returns the furthest distance from any leaf node to the root node t.
Now, let's turn the above pseudo-code for height into proper Java syntax! (I'll keep on omitting some boilerplate for brevity's sake, such as object-orientation and accessibility modifiers.) I'm going to show two possible solutions.
1. Universal type solution:
The most obvious fix is to simply make height generic by introducing the type parameter α into its signature:
<α> int height(Tree<α> t)
{
return (t != null) ? 1 + max( height(t.left), height(t.right) )
: 0;
}
This would allow you to declare variables and create expressions of type α inside that function, if you wanted to. But...
2. Existential type solution:
If you look at our method's body, you will notice that we're not actually accessing, or working with, anything of type α! There are no expressions having that type, nor any variables declared with that type... so, why do we have to make height generic at all? Why can't we simply forget about α? As it turns out, we can:
int height(Tree<?> t)
{
return (t != null) ? 1 + max( height(t.left), height(t.right) )
: 0;
}
As I wrote at the very beginning of this answer, existential and universal types are complementary / dual in nature. Thus, if the universal type solution was to make height more generic, then we should expect that existential types have the opposite effect: making it less generic, namely by hiding/removing the type parameter α.
As a consequence, you can no longer refer to the type of t.value in this method nor manipulate any expressions of that type, because no identifier has been bound to it. (The ? wildcard is a special token, not an identifier that "captures" a type.) t.value has effectively become opaque; perhaps the only thing you can still do with it is type-cast it to Object.
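For reference, here is a compilable version of both solutions side by side. It is a minimal sketch: the type parameter is written A instead of α, the constructor is added for convenience, and the generic variant is renamed heightGeneric so that both can live in one class.

```java
final class Tree<A> {
    final A value;
    final Tree<A> left;
    final Tree<A> right;

    Tree(A value, Tree<A> left, Tree<A> right) {
        this.value = value;
        this.left = left;
        this.right = right;
    }
}

final class TreeHeightDemo {
    // Universal solution: A is named and could be used in the body.
    static <A> int heightGeneric(Tree<A> t) {
        return (t != null) ? 1 + Math.max(heightGeneric(t.left), heightGeneric(t.right))
                           : 0;
    }

    // Existential solution: the element type is hidden behind the wildcard.
    static int height(Tree<?> t) {
        if (t == null) return 0;
        Object opaque = t.value; // Object is the only static type available here
        return 1 + Math.max(height(t.left), height(t.right));
    }
}
```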
Summary:
===========================================================
                     |   universally       existentially
                     | quantified type    quantified type
---------------------+-------------------------------------
 calling method      |
 needs to know       |       yes                no
 the type argument   |
---------------------+-------------------------------------
 called method       |
 can use / refer to  |       yes                no
 the type argument   |
=====================+=====================================
These are all good examples, but I choose to answer it a little bit differently. Recall from math, that ∀x. P(x) means "for all x's, I can prove that P(x)". In other words, it is a kind of function, you give me an x and I have a method to prove it for you.
In type theory, we are not talking about proofs, but of types. So in this space we mean "for any type X you give me, I will give you a specific type P". Now, since we don't give P much information about X besides the fact that it is a type, P can't do much with it, but there are some examples. P can create the type of "all pairs of the same type": P<X> = Pair<X, X> = (X, X). Or we can create the option type: P<X> = Option<X> = X | Nil, where Nil is the type of the null pointers. We can make a list out of it: List<X> = (X, List<X>) | Nil. Notice that the last one is recursive, values of List<X> are either pairs where the first element is an X and the second element is a List<X> or else it is a null pointer.
Now, in math ∃x. P(x) means "I can prove that there is a particular x such that P(x) is true". There may be many such x's, but to prove it, one is enough. Another way to think of it is that there must exist a non-empty set of evidence-and-proof pairs {(x, P(x))}.
Translated to type theory: A type in the family ∃X.P<X> is a type X and a corresponding type P<X>. Notice that before, we gave X to P (so that we knew everything about X, but P knew very little); now the opposite is true. P<X> doesn't promise to give any information about X, just that there is one, and that it is indeed a type.
How is this useful? Well, P could be a type that has a way of exposing its internal type X. An example would be an object which hides the internal representation of its state X. Though we have no way of directly manipulating it, we can observe its effect by poking at P. There could be many implementations of this type, but you could use all of these types no matter which particular one was chosen.
To directly answer your question:
With the universal type, uses of T must include the type parameter X. For example T<String> or T<Integer>. For the existential type uses of T do not include that type parameter because it is unknown or irrelevant - just use T (or in Java T<?>).
Further information:
Universal/abstract types and existential types are a duality of perspective between the consumer/client of an object/function and the producer/implementation of it. When one side sees a universal type the other sees an existential type.
In Java you can define a generic class:
public class MyClass<T> {
    // T is existential in here
    T whatever;
    public MyClass(T w) { this.whatever = w; }
    // The diamond avoids a raw-type warning; callers only see MyClass<?>
    public static MyClass<?> secretMessage() { return new MyClass<>("bazzlebleeb"); }
}
// T is universal from out here
MyClass<String> mc1 = new MyClass<>("foo");
MyClass<Integer> mc2 = new MyClass<>(123);
MyClass<?> mc3 = MyClass.secretMessage();
From the perspective of a client of MyClass, T is universal because you can substitute any type for T when you use that class, and you must know the actual type of T whenever you use an instance of MyClass.
From the perspective of instance methods in MyClass itself, T is existential because the class doesn't know the real type of T.
In Java, ? represents the existential type - thus when you are inside the class, T is basically ?. If you want to handle an instance of MyClass with T existential, you can declare MyClass<?> as in the secretMessage() example above.
Existential types are sometimes used to hide the implementation details of something, as discussed elsewhere. A Java version of this might look like:
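public class ToDraw<T> {
    T obj;
    Function<Pair<T,Graphics>, Void> draw;
    ToDraw(T obj, Function<Pair<T,Graphics>, Void> draw) { this.obj = obj; this.draw = draw; }
    // Generic helper: capturing the wildcard keeps obj and draw tied to the same T
    static <T> void draw(ToDraw<T> d, Graphics g) { d.draw.apply(new Pair<>(d.obj, g)); }
}
// Now you can put these in a list and draw them like so
// (g is whatever Graphics instance you are drawing to):
List<ToDraw<?>> drawList = ... ;
for (ToDraw<?> td : drawList) ToDraw.draw(td, g);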
It's a bit tricky to capture this properly because I'm pretending to be in some sort of functional programming language, which Java isn't. But the point here is that you are capturing some sort of state plus a list of functions that operate on that state and you don't know the real type of the state part, but the functions do since they were matched up with that type already.
Now, in Java all non-final non-primitive types are partly existential. This may sound strange, but because a variable declared as Object could potentially be a subclass of Object instead, you cannot declare the specific type, only "this type or a subclass". And so, objects are represented as a bit of state plus a list of functions that operate on that state - exactly which function to call is determined at runtime by lookup. This is very much like the use of existential types above where you have an existential state part and a function that operates on that state.
In statically typed programming languages without subtyping and casts, existential types allow one to manage lists of differently typed objects. A list of T<Int> cannot contain a T<Long>. However, a list of T<?> can contain any variation of T, allowing one to put many different types of data into the list and convert them all to an int (or do whatever operations are provided inside the data structure) on demand.
One can pretty much always convert a record with an existential type into a record without one by using closures. A closure is existentially typed too, in that the free variables it is closed over are hidden from the caller. Thus a language that supports closures but not existential types still lets you make closures that share the same hidden state that you would have put into the existential part of an object.
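A rough Java sketch of that conversion, with all names (ClosureDemo, Packed, close) invented for illustration: the record keeps its state type as a parameter, while the closure captures state and function together so that no type parameter is left over.

```java
import java.util.function.IntSupplier;
import java.util.function.ToIntFunction;

final class ClosureDemo {
    // Existential-style record: some state of type S plus a function over it.
    static final class Packed<S> {
        final S state;
        final ToIntFunction<S> toInt;

        Packed(S state, ToIntFunction<S> toInt) {
            this.state = state;
            this.toInt = toInt;
        }
    }

    // The same thing as a closure: S no longer appears in the result type.
    static <S> IntSupplier close(Packed<S> p) {
        return () -> p.toInt.applyAsInt(p.state);
    }
}
```

A caller holding only the IntSupplier can trigger the computation but has no way to name or inspect the captured state.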
An existential type is an opaque type.
Think of a file handle in Unix. You know its type is int, so you can easily forge it. You can, for instance, try to read from handle 43. If it so happens that the program has a file open with this particular handle, you'll read from it. Your code doesn't have to be malicious, just sloppy (e.g., the handle could be an uninitialized variable).
An existential type is hidden from your program. If fopen returned an existential type, all you could do with it is to use it with some library functions that accept this existential type. For instance, the following pseudo-code would compile:
let exfile = fopen("foo.txt"); // No type for exfile!
read(exfile, buf, size);
The interface "read" is declared as:
There exists a type T such that:
size_t read(T exfile, char* buf, size_t size);
The variable exfile is not an int, not a char*, not a struct File—nothing you can express in the type system. You can't declare a variable whose type is unknown and you cannot cast, say, a pointer into that unknown type. The language won't let you.
It seems I'm coming a bit late, but anyway, this document adds another view of what existential types are. Although it is not specifically language-agnostic, it should make existential types fairly easy to understand: http://www.cs.uu.nl/groups/ST/Projects/ehc/ehc-book.pdf (chapter 8)
The difference between a universally and existentially quantified type can be characterized by the following observation:
The use of a value with a ∀ quantified type determines the type to choose for the instantiation of the quantified type variable. For example, the caller of the identity function “id :: ∀a.a → a” determines the type to choose for the type variable a for this particular application of id. For the function application “id 3” this type equals Int.
The creation of a value with a ∃ quantified type determines, and hides, the type of the quantified type variable. For example, a creator of a “∃a.(a, a → Int)” may have constructed a value of that type from “(3, λx → x)”; another creator has constructed a value with the same type from “(’x’, λx → ord x)”. From a users point of view both values have the same type and are thus interchangeable. The value has a specific type chosen for type variable a, but we do not know which type, so this information can no longer be exploited. This value specific type information has been ‘forgotten’; we only know it exists.
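The two creators from the quoted passage can be imitated in Java roughly as follows. This is a sketch under assumptions: IntPackage and ForgottenTypeDemo are hypothetical names, and the character's code point stands in for ord.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.ToIntFunction;

// Plays the role of ∃a.(a, a → Int): a value plus a function that consumes it.
final class IntPackage<A> {
    private final A value;
    private final ToIntFunction<A> toInt;

    IntPackage(A value, ToIntFunction<A> toInt) {
        this.value = value;
        this.toInt = toInt;
    }

    // The only thing a user can do with the hidden value.
    int observe() {
        return toInt.applyAsInt(value);
    }
}

final class ForgottenTypeDemo {
    static List<IntPackage<?>> examples() {
        IntPackage<Integer> fromInt = new IntPackage<>(3, x -> x);
        IntPackage<Character> fromChar = new IntPackage<>('x', c -> (int) c.charValue());
        // Both are stored at the same static type; Integer vs Character is forgotten.
        return Arrays.asList(fromInt, fromChar);
    }

    static int sum(List<IntPackage<?>> packages) {
        int total = 0;
        for (IntPackage<?> p : packages) total += p.observe();
        return total;
    }
}
```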
A universal type exists for all values of the type parameter(s). An existential type exists only for values of the type parameter(s) that satisfy the constraints of the existential type.
For example in Scala one way to express an existential type is an abstract type which is constrained to some upper or lower bounds.
trait Existential {
type Parameter <: Interface
}
Equivalently a constrained universal type is an existential type as in the following example.
trait Existential[Parameter <: Interface]
Any use site can employ the Interface because any instantiable subtypes of Existential must define the type Parameter which must implement the Interface.
A degenerate case of an existential type in Scala is an abstract type which is never referred to and thus need not be defined by any subtype. This effectively has a shorthand notation of List[_] in Scala and List<?> in Java.
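On the Java side, the bounded and the degenerate case might look like this (a sketch; the class and method names are invented):

```java
import java.util.List;

final class BoundedWildcardDemo {
    // "There is some N <: Number such that xs is a List<N>."
    // The bound is all the use site may rely on.
    static double sum(List<? extends Number> xs) {
        double total = 0;
        for (Number n : xs) total += n.doubleValue();
        return total;
    }

    // Degenerate case: nothing is known about the element type at all.
    static int count(List<?> xs) {
        return xs.size();
    }
}
```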
My answer was inspired by Martin Odersky's proposal to unify abstract and existential types. The accompanying slide aids understanding.
Research into abstract datatypes and information hiding brought existential types into programming languages. Making a datatype abstract hides info about that type, so a client of that type cannot abuse it. Say you've got a reference to an object... some languages allow you to cast that reference to a reference to bytes and do anything you want to that piece of memory. For purposes of guaranteeing behavior of a program, it's useful for a language to enforce that you only act on the reference to the object via the methods the designer of the object provides. You know the type exists, but nothing more.
See:
Abstract Types Have Existential Type, Mitchell & Plotkin
http://theory.stanford.edu/~jcm/papers/mitch-plotkin-88.pdf
I created this diagram. I don't know if it's rigorous. But if it helps, I'm glad.
As I understand it, it's a mathematical way to describe interfaces/abstract classes.
As for T = ∃X { X a; int f(X); }
For C# it would translate to a generic abstract type:
abstract class MyType<T>{
private T a;
public abstract int f(T x);
}
"Existential" just means that there is some type that obeys the rules defined here.
I am writing a language where functions are not typed. Which means I need to infer the return type of a function call in order to do type checking. However when somebody writes a recursive function the type checker goes into an infinite recursion trying to infer the type of the function call inside the function body.
The type checker does something like this:
Infer the types of the function call actual arguments.
Create a mapping of the actual argument types to the formal arguments.
Use the mapping to annotate types on the arguments used inside the function body.
Infer and return the return type of the function body.
Step 4 tries to then infer the type of the function call inside the function body, which calls the same type checker function again, causing an infinite recursion.
An example of a recursive function that gives me this problem:
function factorial(n) = n<1 ? 1 : n*factorial(n-1); // Function definition.
...
assert 24 == factorial(4); // Function call expression usage example.
How can I solve this problem without going into an infinite recursion loop? Is there a way to infer the type of the recursive function call without having to go into the body again? Or some clean way to infer the type from context?
I know the easy solution might be to add types annotations to functions, this way the problem is trivial, but before doing that I want to know if there is a way to solve this without resorting to that.
I'd also like for the solution to work for mutual recursion.
Type inference can vary a lot depending on the language's type system and on what properties you want to have in terms of when annotations are needed. But whatever your language looks like, I think there's one seminal case you really should read about, which is ML. ML's type inference holds a nice sweet spot where it all fits together in a relatively simple paradigm. No type annotations are needed, and any expression has a single most general type (this property is called principality of typing).
ML's type system is the Hindley-Milner type system, which has parametric polymorphism. The type of an expression is either a specific type, or “any”. More precisely, the type constructor of an expression is either a specific type constructor or “any”, and type constructors can have arguments which themselves either have a specific type constructor or “any”. For example, the empty list has the type “list of any”. Two expressions that can have “any” type in isolation may be constrained to have the same type, whatever it is, so “any” is expressed with variables. For example, function list_of_two(x, y) = [x, y] (in a notation like your language) constrains x and y to have the same type, because they're inserted in the same list, but that type can be any type, so the type of this function is “take any two parameters of the same type α, and return a value of type list of α”.
The basic type inference algorithm for Hindley-Milner is algorithm W. At its core, it works by giving each subexpression a type that's a variable: α₁, α₂, α₃, … Programming language constructions then impose constraints on those variables. For example, if a list contains two elements of types α₁ and α₂ and the list itself has the type α₃, this constrains α₁ = α₂ and α₃ = list of α₁. Putting all these constraints together is a unification problem.
The constraints are based on a purely syntactic reading of the program. If there's a recursive call, you don't need to know the type of the function: it just means that there's a constraint that the variable for the return type of the function is the same as the type at its point of use. That's just one more equation to add to the set of constraints.
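To make this concrete, here is a deliberately tiny unifier in Java applied to the factorial example from the question. It is a sketch under heavy assumptions, not algorithm W: the constraints are written out by hand rather than generated from a syntax tree, there is no occurs check and no generalization, and every name (UnifyDemo, tvar, tcon, inferFactorial) is made up.

```java
import java.util.HashMap;
import java.util.Map;

final class UnifyDemo {
    // A type is a variable ("a1") or a constructor such as int or fun(arg, result).
    static final class Type {
        final String name;
        final boolean isVar;
        final Type[] args;
        Type(String name, boolean isVar, Type... args) {
            this.name = name; this.isVar = isVar; this.args = args;
        }
    }
    static Type tvar(String name) { return new Type(name, true); }
    static Type tcon(String name, Type... args) { return new Type(name, false, args); }

    private final Map<String, Type> subst = new HashMap<>();

    // Follow bindings until reaching an unbound variable or a constructor.
    Type resolve(Type t) {
        while (t.isVar && subst.containsKey(t.name)) t = subst.get(t.name);
        return t;
    }

    // Record the equation a = b (no occurs check in this sketch).
    void unify(Type a, Type b) {
        a = resolve(a);
        b = resolve(b);
        if (a.isVar && b.isVar && a.name.equals(b.name)) return;
        if (a.isVar) { subst.put(a.name, b); return; }
        if (b.isVar) { subst.put(b.name, a); return; }
        if (!a.name.equals(b.name) || a.args.length != b.args.length)
            throw new IllegalStateException("type mismatch: " + a.name + " vs " + b.name);
        for (int i = 0; i < a.args.length; i++) unify(a.args[i], b.args[i]);
    }

    String show(Type t) {
        t = resolve(t);
        if (t.args.length == 0) return t.name;
        StringBuilder sb = new StringBuilder(t.name).append('(');
        for (int i = 0; i < t.args.length; i++) {
            if (i > 0) sb.append(", ");
            sb.append(show(t.args[i]));
        }
        return sb.append(')').toString();
    }

    // Constraints for: function factorial(n) = n<1 ? 1 : n*factorial(n-1);
    static String inferFactorial() {
        UnifyDemo u = new UnifyDemo();
        Type intT = tcon("int");
        Type n = tvar("a1");                     // type of the parameter n
        Type ret = tvar("a2");                   // return type of factorial
        Type fn = tcon("fun", n, ret);           // factorial : a1 -> a2
        u.unify(n, intT);                        // n < 1 compares n with an int
        u.unify(ret, intT);                      // then-branch: the literal 1
        Type callResult = tvar("a3");
        u.unify(fn, tcon("fun", intT, callResult)); // recursive call: one more equation
        u.unify(callResult, intT);               // n * <call result> multiplies ints
        u.unify(ret, intT);                      // else-branch: the product
        return u.show(fn);
    }
}
```

The recursive call never re-enters the function body; it only relates the call site to the variables already assigned to factorial. The same idea extends to mutual recursion by creating variables for all functions in the group before collecting constraints.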
I left out an important aspect of ML which is that an expression's type can be generalized: an expression can be used with different types at different places. This is what allows polymorphism. For example,
let empty_list = [] in
(empty_list # [3]), (empty_list # ["hello"])
is a valid program where empty_list is used once with the type “list of integers” and once with the type “list of strings”. The type of empty_list is “for any α, list of α”: that's parametric polymorphism. Generalization adds some complexity to the algorithm, but it also removes complexity elsewhere, because that's what allows principality. Without it, let empty_list = [] in … would be ambiguous: empty_list would have to have some type, but there's no way to know what type without analyzing …, and then when you do analyze the … above you'd need to make a choice between integer and string.
Depending on your language's type system, ML and algorithm W may be directly reusable or may just provide some vague inspiration. But the principle of using variables during the inference, and progressively constraining these variables, is very general.
This might seem like a silly question, but I want to make a struct with a collection of functions, but the functions bind to the struct. I can sorta see that this is a cycle, but humor me with this example:
type FuncType func() error
type FuncSet struct {
TokenVariable int
FuncTyper FuncType
}
and I want to be able to create a function bound to the FuncSet type so it can operate on TokenVariable, thusly:
func (f *FuncSet) FuncType() error {
f.TokenVariable = 100
return nil
}
However, this changes the signature of the type (I can't find any information about type bindings as part of function type specifications) such that assigning this function to the struct element tells me this function/variable is not found.
I can see an easy work-around for this, by prefixing the parameters with a pointer to the struct type, it's just a bit ugly.
I looked around a little further and discovered that what I'm kinda looking for is like a closure in that it can be passed a variable from the immediate outer scope but... well, I'll be glad to be corrected about this absence of type binding in function types, but for now passing the pointer to the type looks like the way to go.
I think I found the solution:
type nullTester func(*Bast, uint32) bool
type Bast struct {
...
isNull nullTester
...
}
func isNull(b *Bast, d uint32) bool {
return d == 0
}
and then I can bind it to the type like this:
func NewBast() (b *Bast) {
...
b.isNull = isNull
...
}
// IsNull - tests if a value in the tree is null
func (b *Bast) IsNull(d uint32) bool {
return b.isNull(b, d)
}
It seems a bit hackish and I'm not sure what's going to happen in a second library that I will write that sets a different type for the uint32 parameter, but go vet is happy so maybe this is the correct way to do it.
It does seem to me that func types should really have a field in the grammar to specify a binding type, but maybe I just found a hack that sorta lets me do polymorphism. In calling programs all they will see is the nice exported function that binds to the type as planned and I get my readability as well as being able to retarget the base library to store a different type of data.
I think this is the proper solution. I just can't find anything that confirms or denies whether in a type Name func specification there is any way of asserting the type. It really should not match up, since the binding is part of the signature, but the syntax for type with functions does not appear to have this type binding.
My actual code is here, and you can see by looking at it what I am aiming to do:
https://github.com/calibrae-project/bast/blob/master/pkg/bast/bast.go
The differences between the type of data the tree stores is entirely superficial, because it is intended to be primarily used for sorting unsigned integers of various lengths, and one important thing it needs to have is to be able to work from a, for example, 64 bit integer but sort only by the first or last half (as I have a bigger project that treats these hash values as coordinates in an adjacency list). In theory it could be used instead of a hash table lookup as well, with a low variance in time to find elements because of the binary tree structure.
It's not a conventional, reference-vector based tree, and the store itself is an array with an unconventional power of two mapping, a 'dense' tree, and the purpose above all, for implementing this way, is that when the tree is walked, as well as rotated, much of the time it is sequential blocks of memory being accessed which should make for a lot less cache misses than a conventional binary tree (and for which reason generally this type of application just uses some kind of sort like a bucket sort).
You could use an anonymous field with an interface that defines the method set that you want to use (that might change).
Go playground here
You'd define your interface
type validator interface {
IsRightOf(a, b interface{}) bool
... // other methods
}
and your type:
type Bast struct {
validator // anonymous interface field
... // some fields
}
Then you can access the methods of validator from the Bast type
b := bast.New()
b.IsRightOf(c, d) // this is valid, you do not need to do b.validator.IsRightOf(...)
Because validator is an interface, you can change those methods however you like.
A lot of statically typed languages, like C++ and C#, have local variable type inference (with the keywords auto and var respectively, I think).
However, I haven't seen many C-derived languages (apart from those mentioned in the comments) implementing compile-time return type inference. I'll describe what I mean by "return type inference" before I ask the question. (I definitely don't mean overloading by return type.)
Consider this code in a hypothetical C#-like language:
private auto SomeMethod(int x)
{
return 3 * x;
}
It's more than obvious (to humans and to the compiler) that the return type is int (and the compilers can verify it).
The same goes for multiple paths:
private auto SomeOtherMethod(int x)
{
if(x == 0) return 1;
else return 3 * x;
}
It's still not ambiguous at all, because there is already an algorithm in said languages to resolve whether two expressions have compatible types:
private auto YetAnotherMethod(int x)
{
var r = (x == 0) ? 1 : 3 * x;
return r;
}
Since the algorithm exists and it is already implemented in some form, it's probably not a technical problem in this regard. But still, I haven't seen it anywhere in statically typed languages, which got me thinking about whether there's something bad about it.
My question:
Does return type inference, as a concept, have any disadvantage or subtle pitfall that I'm not seeing? (Apart from readability - I already understand that.)
Is there some corner case where it would introduce problems or ambiguity to a statically typed language? (By "introduce", I'm referring to issues that local variable type inference doesn't already have.)
Yes, there are disadvantages. One you already mentioned: readability. Second, the type has to be computed, which takes time (in Turing-complete type systems, inference may never terminate). But there is also something else: the theory of such type systems is much more complicated.
Let's write a function that takes a list and returns its head. What's its type? Or a function that takes a function and a parameter, applies the function to the parameter, and returns the result. In many languages you can't declare it. To support this kind of thing, Java introduced generics, and it failed miserably. Currently it's one of the most hated features of the language because of consistency problems.
Another thing: the returned type may depend not only on the body of the function but also on the context of the invocation. Let's look at Haskell (which has the best type system I've ever seen): http://learnyouahaskell.com/types-and-typeclasses
There is a function called read that takes a string, parses it, and returns... whatever you need: an int, an array.
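Java has a much weaker cousin of this in generic methods whose type parameter occurs only in the return type. The sketch below uses invented names (TargetTypingDemo, none) and is only a loose analogy: unlike Haskell's read, the method body cannot behave differently depending on the type the caller asks for.

```java
import java.util.Collections;
import java.util.List;

final class TargetTypingDemo {
    // T appears only in the return type, so each call site decides it.
    static <T> List<T> none() {
        return Collections.emptyList();
    }

    static int demo() {
        List<String> names = none();    // T inferred as String from the context
        List<Integer> numbers = none(); // T inferred as Integer from the context
        return names.size() + numbers.size();
    }
}
```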
So each time a type system is designed, the designer has to choose at which level she wants to stop. Dynamic languages decided not to infer types at all, Scala decided to do some local inference but not, for example, for overloaded or recursive functions, and C++ (before C++14 added auto return type deduction) decided not to infer the result.
What makes a type different from class and vice versa?
(In the general language-agnostic sense)
The following answer is from the GoF book (Design Patterns):
An object's class defines how the object is implemented. The class defines the object's internal state and the implementation of its operations. In contrast, an object's type only refers to its interface - a set of requests to which it can respond. An object can have many types, and objects of different classes can have the same type.
//example in c++
template<typename T>
const T & max(T const &a,T const &b)
{
return a>b?a:b; //> operator of the type is used for comparison
}
The max function requires a type whose interface includes the > operator on two values of that same type. Any class that satisfies this requirement can be used to instantiate a specific max<particular class/primitive type> function for that class.
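In Java the same requirement has to be spelled out as a bound on the type parameter. A minimal sketch, with MaxDemo as an invented holder class:

```java
final class MaxDemo {
    // T must be comparable with itself: that is the "type" the method demands,
    // regardless of which class provides it.
    static <T extends Comparable<T>> T max(T a, T b) {
        return a.compareTo(b) > 0 ? a : b;
    }
}
```

Integer and String are unrelated classes, yet both can be passed because both have the type Comparable of themselves.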
Inspired by Wikipedia...
In type theory terms;
A type is an abstract interface.
Types generally represent nouns, such as a person, place or thing, or something nominalized,
A class represents an implementation of the type.
It is a concrete data structure and collection of subroutines
Different concrete classes can produce objects of the same abstract type (depending on type system).
*For example, one might implement the type Stack with two classes: SmallStack (fast for small stacks, but scales poorly) and ScalableStack (scales well but high overhead for small stacks).*
Similarly, a given class may have several different constructors.
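A rough Python sketch of the Stack example, assuming a tuple-backed and a list-backed implementation as stand-ins for SmallStack and ScalableStack (the backing structures and the reverse helper are my assumptions, chosen only to make the two classes differ internally):

```python
from typing import Protocol


class Stack(Protocol):
    """The abstract type: it names the operations, not how they work."""

    def push(self, item: int) -> None: ...

    def pop(self) -> int: ...


class SmallStack:
    """One implementation: an immutable tuple that is copied on every push."""

    def __init__(self) -> None:
        self._items: tuple = ()

    def push(self, item: int) -> None:
        self._items = self._items + (item,)

    def pop(self) -> int:
        item = self._items[-1]
        self._items = self._items[:-1]
        return item


class ScalableStack:
    """Another implementation: a growable list."""

    def __init__(self) -> None:
        self._items: list = []

    def push(self, item: int) -> None:
        self._items.append(item)

    def pop(self) -> int:
        return self._items.pop()


def reverse(values: list, stack: Stack) -> list:
    # Written against the Stack type only; either class can be passed in.
    for value in values:
        stack.push(value)
    return [stack.pop() for _ in values]


print(reverse([1, 2, 3], SmallStack()))
print(reverse([1, 2, 3], ScalableStack()))
```

Neither class inherits from Stack; they count as the same type only because they offer the same operations.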
The banana example.
A Banana type would represent the properties and functionality of bananas in general.
The ABCBanana and XYZBanana classes would represent ways of producing bananas.
(Different banana suppliers in real life, or different data structures and functions to represent and draw bananas in a video game).
The ABCBanana class could then produce particular bananas which are
instances of the ABCBanana class, they would be objects of type Banana.
It is not rare for the programmer to provide one and only one implementation for a type. In this case the class name is often identical to the type name. But there is still a type (which could be extracted into an interface if required) and an implementation (which would implement the separate interface) which builds instances (objects) of the class.
I always think of a 'type' as an umbrella term for 'classes' and 'primitives'.
int foo; // Type is int, class is nonexistent.
MyClass foo; // Type is MyClass, class is MyClass
Type is the umbrella term for all the available object templates or concepts. A class is one such object template. So is the structure type, the Integer type, the Interface type, etc. These are all types.
If you want, you can look at it this way: a type is the parent concept. All the other concepts (Class, Interface, Structure, Integer, etc.) inherit from this concept, i.e. they are types.
Taken from the GoF citation below:
An object's class defines how the
object is implemented. The class
defines the object's internal state and
the implementation of its
operations.
In contrast, an object's
type only refers to its interface - the
set of requests to which it can
respond.
I want to provide an example using Java:
public interface IType {
}
public class A implements IType {
    public A() {}
}
public class B implements IType {
    public B() {}
}
Both classes A and B implement the interface and thus are of the type IType. Additionally, in Java, each class also defines its own type (named after the class). Thus an object of class A is of type A and IType, and an object of class B is of type B and IType, satisfying:
An object can have many types,
and objects of different classes can
have the same type.
The difference between subtypes and subclass probably helps to understand that issue as well:
https://www.cs.princeton.edu/courses/archive/fall98/cs441/mainus/node12.html
In the general language-agnostic sense, a Class is a realization of a Type.
Often when this is the only realization of that type, you can use both terms to reference it in some context.
On the contrary, for example, in C# context - Class is just one of the many more implementations of a Type concept like primitives, structs, pointers etc.
Type contains description of the data (i.e. properties, operations, etc),
Class is a specific type - it is a template to create instances of objects.
Strictly speaking class is a special concept, it can be seen as a package containing subset of metadata describing some aspects of an object.
For example in C# you can find interfaces and classes. Both of them are types, but interface can only define some contract and can not be instantiated unlike classes.
Simply speaking class is a specialized type used to encapsulate properties and behavior of an object.
Wikipedia can give you a more complete answer:
Definition of class
Definition of data type
Type is conceptually a superset of class. In the broader sense, a class is one form of type.
Closely related to classes are interfaces, which can be seen as a very special kind of class - a purely abstract one. These too are types.
So "type" encompasses classes, interfaces and in most languages primitives too. Also platforms like the dot-net CLR have structure types too.
To illustrate it the fastest way:
A Struct is a Type, but a Struct is not a Class.
As you can see, a Type is an "abstract" term for not only definitions of classes, but also structs and primitive data types like float, int, bool.
I think of a type as being the set of things you can do with a particular value. For instance, if you have an integer value, you can add it to other integers (or perform other arithmetic operations), or pass it to functions which accept an integer argument. If you have an object value, you can call methods on it that are defined by its class.
Because a class defines what you can do with objects of that class, a class defines a type. A class is more than that though, since it also provides a description of how the methods are implemented (something not implied by the type) and how the fields of the object are laid out.
Note also that an object value can only have one class, but it may have multiple types, since every superclass provides a subset of the functionality available in the object's class.
So although objects and types are closely related, they are really not the same thing.
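To see the "one class, several types" point in running code, here is a small Python sketch. Animal and Dog are made-up names for illustration only.

```python
class Animal:
    def name(self):
        return "animal"


class Dog(Animal):
    def bark(self):
        return "woof"


d = Dog()

# Exactly one class created this object.
print(type(d) is Dog)

# But it can be used as any type along its inheritance chain.
print([cls.__name__ for cls in type(d).__mro__])
print(isinstance(d, Dog), isinstance(d, Animal), isinstance(d, object))
```

Each superclass contributes a subset of what you can do with d, which is why the object passes all three isinstance checks while type(d) names only Dog.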
To add another example of distinction: in C++ you have pointer and reference types which can refer to classes, but are not classes in and of themselves.
Bar b; // b is of type "class Bar"
Bar *b2 = &b; // b2 is of type "pointer to Class Bar"
Bar &b3 = b; // b3 is of type "reference to Class Bar"
Bar *b4[7]; // b4 is of type "7-element array of pointers to Class Bar"
Bar ***b5; //b5 is of type "pointer to a pointer to a pointer to Class Bar"
Note that only one class is involved, but a near infinite number of types can be used. In some languages, functions are considered "first-class objects", in which case the type of a function is a class. In others, the type of a function is merely a pointer. Classes generally have the concepts of being able to hold data, as well as operations on that data.
My thoughts are pretty much in line with aku's answer.
I see classes as a template for building objects, while types are a way to classify those objects, and provide us with an interface to them.
Python also adds metaclasses, that are just a mechanism to build classes, in the same way as classes build objects (and well, classes and metaclasses are both objects).
This response to the same question on Lambda the Ultimate seems to me like a perfect explanation.
Types in C, like int, float, char, etc., define data that can be acted on with specific methods that can operate on them. It's no more complicated than that. For int, I can add, subtract, multiply and maybe divide. Those are my methods (or operations) for int. A class is simply a definition of a new type. I first define what the data looks like. Maybe it's a single bit. Maybe it's two words, like a complex number with a real and an imaginary part. Or maybe it's this complex thingy with 309734325 bytes representing the atomic makeup of a weird particle on Jupiter. I don't care. Just like an integer, I get to make up the operations I can do with this new data type. In the case of the integer I had add, subtract, etc. With this new data type I can define whatever operations I think make sense. They might be add, subtract, etc., but they may include other things. These are whatever methods I decide to add to my class.
The bottom line is that with a type in C, you have a definition of what the data is, i.e. a byte, word, float, char, etc. But any of these also implies what operations are legal and will produce reliable results.
A class is no different except it is up to you to define the interface and acceptable operations. The class defines these things and when you instantiate it in an Object it defines the behavior of the object just like a type definition defines the behavior of an integer when you operate on it.
Classes just give you the flexibility to define new types and everything about how they operate.
Once this is defined, every time I instantiate an object of class "thingy", it has the data structure I defined and the operations (methods) that I said you can do with it. The class "thingy" is clearly nothing more or less than a new type that C++ lets me define.
Type generally refers to the classification of primitive values - integers, strings, arrays, booleans, null, etc. Usually, you can't create any new types.
Class refers to the named set of properties and methods which an object is associated with when it is created. You can usually define as many new classes as you want, although in some languages you have to create a new object and then attach methods to it.
This definition is mostly true, but some languages have attempted to combine types and classes in various ways, with various beneficial results.
Types and classes are related but not identical. My take is that classes are used for implementation inheritance, whereas types are used for runtime substitution.
Here is a link explaining the substitution principle and why subclasses and subtypes are not always the same thing (in Java for example). The wikipedia page on covariance and contravariance has more information on this distinction.
In languages like Haskell, the concept of Class doesn't exist. It only has Types. (And Type Classes. Not to be confused with Class, a Type Class is more of an abstracted version of Type.)
Monad is a Type Class.
class Monad m where
    (>>=)  :: m a -> (a -> m b) -> m b
    (>>)   :: m a -> m b -> m b
    return :: a -> m a
    fail   :: String -> m a
From a (pure) functional programming perspective, Type is more fundamental than Class, as one can trace its roots to Type Theory (e.g. from a PTL perspective, lambda calculus with types and without types behave quite differently), while Class is really just a construct to enable OO.
In languages that only support Type and don't support Class, functions are often treated as first-class citizens.
Meanwhile, when a language makes a distinction between Type and Class, functions are more like second-class citizens that can be attached to Objects, etc. And yup, often you can attach a function to a Class itself (aka a static function).
Interesting question. I think aku's answer is spot on. Take the java ArrayList class as an example
public class ArrayList<E> extends AbstractList<E>
implements List<E>, RandomAccess, Cloneable, java.io.Serializable
An instance of the ArrayList class is said to be of the type of every superclass it extends and every interface it implements. Therefore, an instance of the ArrayList class has the types ArrayList, RandomAccess, Cloneable, and so forth. In other words, values (or instances) belong to one or more types, and classes define what these types are.
Different classes may describe the same type.
Type consists of these parts:
Operations = syntax
Description of operations = semantics
Class consists of these parts:
Operations = syntax
Implementation (= various implementations describe same semantics)
Some notes:
Interface (as in Java) is not type, because it does not describe semantics (describes only syntax)
Subclass is not subtype, because subclass may change semantics defined in superclass, subtype cannot change supertype semantics (see Liskov Substitution Principle, e.g. this LSP example).
Obviously, as there are languages with type system that are not OO programming languages, type must be a broader concept than class
Even in languages like Java, int is a (primitive) type, but not a class.
Hence: every class is a type, but not every type is a class.
If we think about this question in the C# context, we reach the answer below.
The C# type system is divided into the following categories:
Value types:
Simple types: like int, long, float, etc.
Enum types
Struct types
Nullable types
Reference types:
Class types
Interface types
Array types
Delegate types
As you can see, there are many types in C#, of which Class is only one.
There is just one important note:
C#’s type system is unified such that a value of any type can be treated as an object. Every type in C# directly or indirectly derives from the object class type, and object is the ultimate base class of all types. Values of reference types are treated as objects simply by viewing the values as type object. Values of value types are treated as objects by performing boxing and unboxing operations.
So as I see it, type is an umbrella over many items, of which class is one.
Reference: C# Language Specification doc, page 4
This was a good question for me, which made me think hard. I would dare to say that a class is a compile-time thingy and a type is a runtime thingy. I say this because you write classes, not types. The compiler then creates types from classes, and the runtime uses types to create instances of objects.
Types are programming constructs that help the compiler perform type checking and ensure that variables have the right properties for an operation.
Classes are user-defined types that objects, or variables referencing them, can have. These are also subject to type checking.
What does "type-safe" mean?
Type safety means that the compiler will validate types while compiling, and throw an error if you try to assign the wrong type to a variable.
Some simple examples:
// Fails, Trying to put an integer in a string
String one = 1;
// Also fails.
int foo = "bar";
This also applies to method arguments, since you are passing explicit types to them:
int AddTwoNumbers(int a, int b)
{
return a + b;
}
If I tried to call that using:
int Sum = AddTwoNumbers(5, "5");
The compiler would throw an error, because I am passing a string ("5"), and it is expecting an integer.
In a loosely typed language, such as javascript, I can do the following:
function AddTwoNumbers(a, b)
{
return a + b;
}
if I call it like this:
Sum = AddTwoNumbers(5, "5");
Javascript automatically converts the 5 to a string, and returns "55". This is due to javascript using the + sign for string concatenation. To make it type-aware, you would need to do something like:
function AddTwoNumbers(a, b)
{
return Number(a) + Number(b);
}
Or, possibly:
function AddOnlyTwoNumbers(a, b)
{
if (isNaN(a) || isNaN(b))
return false;
return Number(a) + Number(b);
}
If I call the first version (the one without the Number() conversion) like this:
Sum = AddTwoNumbers(5, " dogs");
Javascript automatically converts the 5 to a string, and appends them, to return "5 dogs".
Not all dynamic languages are as forgiving as javascript (in fact, a dynamic language does not imply a loosely typed language; see Python); some of them will actually give you a runtime error on an invalid type cast.
While it's convenient, it opens you up to a lot of errors that can be easily missed, and only identified by testing the running program. Personally, I prefer to have my compiler tell me if I made that mistake.
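For comparison, here is roughly what the same function does in Python, a dynamic language that does not convert between numbers and strings implicitly. This is a minimal sketch; add_two_numbers is just the JavaScript example transliterated.

```python
def add_two_numbers(a, b):
    return a + b


# Same-typed arguments work as expected.
print(add_two_numbers(5, 5))
print(add_two_numbers("5", "5"))

# Mixing an int and a str is rejected at run time instead of guessing.
try:
    add_two_numbers(5, "5")
except TypeError as err:
    print("TypeError:", err)
```

The mistake is still only found when the line actually runs, so it does not replace a compiler check, but at least nothing is silently turned into "55".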
Now, back to C#...
C# supports subtype polymorphism (loosely referred to here as covariance). This basically means that you can use a child type wherever a base type is expected and not cause an error, for example:
public class Foo : Bar
{
}
Here, I created a new class (Foo) that subclasses Bar. I can now create a method:
void DoSomething(Bar myBar)
And call it using either a Foo, or a Bar as an argument, both will work without causing an error. This works because C# knows that any child class of Bar will implement the interface of Bar.
However, you cannot do the inverse:
void DoSomething(Foo myFoo)
In this situation, I cannot pass Bar to this method, because the compiler does not know that Bar implements Foo's interface. This is because a child class can (and usually will) be much different than the parent class.
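The same Foo/Bar relationship can be sketched in Python. Note the assumption: Python does not enforce the annotations itself, so the wrong call below only fails at run time (a static checker such as mypy would be the thing to flag it earlier). The methods describe and extra are hypothetical, added just so the classes have something to call.

```python
class Bar:
    def describe(self):
        return "bar behaviour"


class Foo(Bar):
    def extra(self):
        return "only Foo has this"


def do_something(my_bar: Bar) -> str:
    # Uses only what Bar promises, so any subclass of Bar is fine.
    return my_bar.describe()


def do_something_with_foo(my_foo: Foo) -> str:
    # Relies on something Bar does not provide.
    return my_foo.extra()


print(do_something(Bar()))
print(do_something(Foo()))

try:
    do_something_with_foo(Bar())
except AttributeError as err:
    print("AttributeError:", err)
```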
Of course, now I've gone way off the deep end and beyond the scope of the original question, but it's all good stuff to know :)
Type-safety should not be confused with static / dynamic typing or strong / weak typing.
A type-safe language is one where the only operations that one can execute on data are the ones that are condoned by the data's type. That is, if your data is of type X and X doesn't support operation y, then the language will not allow you to execute y(X).
This definition doesn't set rules on when this is checked. It can be at compile time (static typing) or at runtime (dynamic typing), typically through exceptions. It can be a bit of both: some statically typed languages allow you to cast data from one type to another, and the validity of casts must be checked at runtime (imagine that you're trying to cast an Object to a Consumer - the compiler has no way of knowing whether it's acceptable or not).
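A cast whose validity can only be decided at run time might look like the following sketch. checked_cast is a hypothetical helper written for this example (it is not a standard library function), and Consumer is a stand-in class.

```python
def checked_cast(expected_type, value):
    """Return value unchanged if it is an instance of expected_type, else raise."""
    if not isinstance(value, expected_type):
        raise TypeError(
            f"cannot cast {type(value).__name__} to {expected_type.__name__}"
        )
    return value


class Consumer:
    def consume(self, item):
        return f"consumed {item}"


# Statically all we know is "some object"; the check happens when the code runs.
thing: object = Consumer()
print(checked_cast(Consumer, thing).consume("apple"))

try:
    checked_cast(Consumer, "not a consumer")
except TypeError as err:
    print("TypeError:", err)
```

The failed cast surfaces as an exception rather than as undefined behaviour, which is the property the definition above cares about.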
Type-safety does not necessarily mean strongly typed, either - some languages are notoriously weakly typed, but still arguably type safe. Take Javascript, for example: its type system is as weak as they come, but still strictly defined. It allows automatic casting of data (say, strings to ints), but within well defined rules. There is to my knowledge no case where a Javascript program will behave in an undefined fashion, and if you're clever enough (I'm not), you should be able to predict what will happen when reading Javascript code.
An example of a type-unsafe programming language is C: reading / writing an array value outside of the array's bounds has an undefined behaviour by specification. It's impossible to predict what will happen. C is a language that has a type system, but is not type safe.
Type safety is not just a compile time constraint, but a run time constraint. I feel even after all this time, we can add further clarity to this.
There are 2 main issues related to type safety. Memory** and data type (with its corresponding operations).
Memory**
A char typically requires 1 byte per character, or 8 bits (depends on language, Java and C# store unicode chars which require 16 bits).
An int requires 4 bytes, or 32 bits (usually).
Visually:
char: |-|-|-|-|-|-|-|-|
int : |-|-|-|-|-|-|-|-| |-|-|-|-|-|-|-|-| |-|-|-|-|-|-|-|-| |-|-|-|-|-|-|-|-|
A type safe language does not allow an int to be inserted into a char at run-time (this should throw some kind of class cast or out of memory exception). However, in a type unsafe language, you would overwrite existing data in 3 more adjacent bytes of memory.
int >> char:
|-|-|-|-|-|-|-|-| |?|?|?|?|?|?|?|?| |?|?|?|?|?|?|?|?| |?|?|?|?|?|?|?|?|
In the above case, the 3 bytes to the right are overwritten, so any pointers to that memory (say 3 consecutive chars) which expect to get a predictable char value will now have garbage. This causes undefined behavior in your program (or worse, possibly in other programs depending on how the OS allocates memory - very unlikely these days).
** While this first issue is not technically about data type, type safe languages address it inherently and it visually describes the issue to those unaware of how memory allocation "looks".
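The overwrite can be simulated in Python with a raw byte buffer. This is only a sketch of the idea: Python itself is memory-safe, so the "unsafe" write is done deliberately through the struct module on a buffer standing in for four adjacent chars.

```python
import struct

# Four adjacent one-byte "chars" laid out in a row, as in the diagram.
memory = bytearray(b"ABCD")

# Type-safe behaviour: a single byte slot refuses a value that does not fit.
try:
    memory[0] = 300
except ValueError as err:
    print("rejected:", err)

# Simulated unsafe behaviour: write a 4-byte int starting at the first char.
struct.pack_into("<i", memory, 0, 1)

# The neighbouring chars "B", "C" and "D" have been overwritten.
print(memory)
```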
Data Type
The more subtle and direct type issue is where two data types use the same memory allocation. Take an int vs an unsigned int. Both are 32 bits. (It could just as easily be a char[4] and an int, but the more common issue is uint vs. int.)
|-|-|-|-|-|-|-|-| |-|-|-|-|-|-|-|-| |-|-|-|-|-|-|-|-| |-|-|-|-|-|-|-|-|
|-|-|-|-|-|-|-|-| |-|-|-|-|-|-|-|-| |-|-|-|-|-|-|-|-| |-|-|-|-|-|-|-|-|
A type unsafe language allows the programmer to reference a properly allocated span of 32 bits, but when the value of an unsigned int is read into the space of an int (or vice versa), we again have undefined behavior. Imagine the problems this could cause in a banking program:
"Dude! I overdrafted $30 and now I have $65,506 left!!"
...'course, banking programs use much larger data types. ;) LOL!
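The $65,506 figure corresponds to a 16-bit value. A quick way to check it, sketched with Python's struct module (the variable names are made up):

```python
import struct

balance = -30  # overdrawn by $30, stored as a signed 16-bit integer

# Keep the same two bytes, but read them back as an unsigned 16-bit integer.
raw = struct.pack("<h", balance)
as_unsigned = struct.unpack("<H", raw)[0]

print(as_unsigned)  # 65506, i.e. 65536 - 30
```

The bits never change; only the type used to interpret them does.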
As others have already pointed out, the next issue is computational operations on types. That has already been sufficiently covered.
Speed vs Safety
Most programmers today never need to worry about such things unless they are using something like C or C++. Both of these languages allow programmers to easily violate type safety at run time (direct memory referencing) despite the compilers' best efforts to minimize the risk. HOWEVER, this is not all bad.
One reason these languages are so computationally fast is they are not burdened by verifying type compatibility during run time operations like, for example, Java. They assume the developer is a good rational being who won't add a string and an int together and for that, the developer is rewarded with speed/efficiency.
Many answers here conflate type-safety with static-typing and dynamic-typing. A dynamically typed language (like smalltalk) can be type-safe as well.
A short answer: a language is considered type-safe if no operation leads to undefined behavior. Many consider the requirement of explicit type conversions necessary for a language to be strictly typed, as automatic conversions can sometimes lead to well-defined but unexpected/unintuitive behaviors.
A programming language that is 'type-safe' means the following things:
You can't read from uninitialized variables
You can't index arrays beyond their bounds
You can't perform unchecked type casts
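The first two points are easy to observe in a language that enforces them at run time. A small Python sketch (the function names are made up, and rejected_with is just a helper for printing the result):

```python
def read_uninitialized():
    if False:
        value = 1  # never runs, so 'value' is never given a value
    return value


def index_past_end():
    numbers = [1, 2, 3]
    return numbers[10]


def rejected_with(func):
    """Return the name of the exception func raises, or None if it succeeds."""
    try:
        func()
    except Exception as err:
        return type(err).__name__
    return None


print(rejected_with(read_uninitialized))
print(rejected_with(index_past_end))
```

In both cases the language stops the operation with a well-defined error instead of handing back whatever happens to be in memory.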
An explanation from a liberal arts major, not a comp sci major:
When people say that a language or language feature is type safe, they mean that the language will help prevent you from, for example, passing something that isn't an integer to some logic that expects an integer.
For example, in C#, I define a function as:
void foo(int arg)
The compiler will then stop me from doing this:
// call foo
foo("hello world")
In other languages, the compiler would not stop me (or there is no compiler...), so the string would be passed to the logic and then probably something bad will happen.
Type safe languages try to catch more at "compile time".
On the down side, with type safe languages, when you have a string like "123" and you want to operate on it like an int, you have to write more code to convert the string to an int, or when you have an int like 123 and want to use it in a message like, "The answer is 123", you have to write more code to convert/cast it to a string.
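That extra conversion code is usually small. A sketch in Python, which is similarly strict about mixing strings and numbers:

```python
text = "123"

# Operating on the string as a number needs an explicit conversion.
number = int(text) + 1
print(number)

# Using a number inside a message needs a conversion the other way.
message = "The answer is " + str(123)
print(message)

# Without the conversion, the mix-up is reported rather than guessed at.
try:
    "The answer is " + 123
except TypeError as err:
    print("TypeError:", err)
```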
To get a better understanding do watch the below video which demonstrates code in type safe language (C#) and NOT type safe language ( javascript).
http://www.youtube.com/watch?v=Rlw_njQhkxw
Now for the long text.
Type safety means preventing type errors. A type error occurs when data of one type is UNKNOWINGLY assigned to another type and we get undesirable results.
For instance, JavaScript is NOT a type safe language. In the code below, “num” is a numeric variable and “str” is a string. JavaScript allows me to do “num + str”; now GUESS whether it will do arithmetic or concatenation.
For the code below the result is “55”, but the important point is the confusion about what kind of operation it will perform.
This happens because JavaScript is not a type safe language. It allows one type of data to be set to another type without restrictions.
<script>
var num = 5; // numeric
var str = "5"; // string
var z = num + str; // arithmetic or concat ????
alert(z); // displays “55”
</script>
C# is a type safe language. It does not allow one data type to be assigned to another data type. The below code does not allow the “+” operator on different data types.
Concept:
To put it very simply, type safe means what it says: it makes sure that the type of a variable is safe, for example:
no wrong data type, e.g. you can't save to or initialize a variable of string type with an integer
out-of-bound indexes are not accessible
only the specific memory location is allowed
So it is all about the safety of the types of your storage in terms of variables.
Type-safe means that programmatically, the type of data for a variable, return value, or argument must fit within a certain criteria.
In practice, this means that 7 (an integer type) is different from "7" (a quoted character of string type).
PHP, Javascript and other dynamic scripting languages are usually weakly-typed, in that they will convert a (string) "7" to an (integer) 7 if you try to add "7" + 3, although sometimes you have to do this explicitly (and Javascript uses the "+" character for concatenation).
C/C++/Java will not understand that, or will concatenate the result into "73" instead. Type-safety prevents these types of bugs in code by making the type requirement explicit.
Type-safety is very useful. The solution to the above "7" + 3 would be to type cast (int) "7" + 3 (equals 10).
Try this explanation on...
TypeSafe means that variables are statically checked for appropriate assignment at compile time. For example, consider a string or an integer. These two different data types cannot be cross-assigned (i.e., you can't assign an integer to a string, nor can you assign a string to an integer).
For non-typesafe behavior, consider this:
object x = 89;
int y;
if you attempt to do this:
y = x;
the compiler throws an error that says it can't convert a System.Object to an Integer. You need to do that explicitly. One way would be:
y = Convert.ToInt32( x );
The assignment above is not typesafe. A typesafe assignment is one where the types can be assigned to each other directly.
Non typesafe collections abound in ASP.NET (eg, the application, session, and viewstate collections). The good news about these collections is that (minimizing multiple server state management considerations) you can put pretty much any data type in any of the three collections. The bad news: because these collections aren't typesafe, you'll need to cast the values appropriately when you fetch them back out.
For example:
Session[ "x" ] = 34;
works fine. But to assign the integer value back, you'll need to:
int i = Convert.ToInt32( Session[ "x" ] );
Read about generics for ways that facility helps you easily implement typesafe collections.
C# is a typesafe language but watch for articles about C# 4.0; interesting dynamic possibilities loom (is it a good thing that C# is essentially getting Option Strict: Off... we'll see).
Type-Safe is code that accesses only the memory locations it is authorized to access, and only in well-defined, allowable ways.
Type-safe code cannot perform an operation on an object that is invalid for that object. The C# and VB.NET language compilers always produce type-safe code, which is verified to be type-safe during JIT compilation.
Type-safe means that the set of values that may be assigned to a program variable must fit well-defined and testable criteria. Type-safe variables lead to more robust programs because the algorithms that manipulate the variables can trust that the variable will only take one of a well-defined set of values. Keeping this trust ensures the integrity and quality of the data and the program.
For many variables, the set of values that may be assigned to a variable is defined at the time the program is written. For example, a variable called "colour" may be allowed to take on the values "red", "green", or "blue" and never any other values. For other variables those criteria may change at run-time. For example, a variable called "colour" may only be allowed to take on values in the "name" column of a "Colours" table in a relational database, where "red, "green", and "blue", are three values for "name" in the "Colours" table, but some other part of the computer program may be able to add to that list while the program is running, and the variable can take on the new values after they are added to the Colours table.
Many type-safe languages give the illusion of "type-safety" by insisting on strictly defining types for variables and only allowing a variable to be assigned values of the same "type". There are a couple of problems with this approach. For example, a program may have a variable "yearOfBirth" which is the year a person was born, and it is tempting to type-cast it as a short integer. However, it is not a short integer. This year, it is a number that is less than 2009 and greater than -10000. However, this set grows by 1 every year as the program runs. Making this a "short int" is not adequate. What is needed to make this variable type-safe is a run-time validation function that ensures that the number is always greater than -10000 and less than the next calendar year. There is no compiler that can enforce such criteria because these criteria are always unique characteristics of the problem domain.
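A run-time validation function of the kind described might look like this sketch. The function name and the optional today parameter are my own additions (the parameter is there only so the check can be exercised against a fixed date); the bounds are the ones given above.

```python
from datetime import date


def validate_year_of_birth(year, today=None):
    """Return year if it is greater than -10000 and less than next calendar year."""
    if today is None:
        today = date.today()
    next_year = today.year + 1
    if not isinstance(year, int):
        raise TypeError("year of birth must be an integer")
    if not (-10000 < year < next_year):
        raise ValueError(
            f"year of birth {year} is outside the range (-10000, {next_year})"
        )
    return year


year_of_birth = validate_year_of_birth(1984)
print(year_of_birth)
```

Because the upper bound is computed from the current date, the set of accepted values grows by one each year without anyone touching the declaration.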
Languages that use dynamic typing (or duck-typing, or manifest typing) such as Perl, Python, Ruby, SQLite, and Lua don't have the notion of typed variables. This forces the programmer to write a run-time validation routine for every variable to ensure that it is correct, or endure the consequences of unexplained run-time exceptions. In my experience, programmers in statically typed languages such as C, C++, Java, and C# are often lulled into thinking that statically defined types are all they need to get the benefits of type-safety. This is simply not true for many useful computer programs, and it is hard to predict if it is true for any particular computer program.
The long & the short.... Do you want type-safety? If so, then write run-time functions to ensure that when a variable is assigned a value, it conforms to well-defined criteria. The down-side is that it makes domain analysis really difficult for most computer programs because you have to explicitly define the criteria for each program variable.
Type Safety
In modern C++, type safety is very important. Type safety means that you use the types correctly and, therefore, avoid unsafe casts and unions. Every object in C++ is used according to its type and an object needs to be initialized before its use.
Safe Initialization: {}
The compiler protects from information loss during type conversion. For example,
int a{7};   // The initialization is OK
int b{7.5}; // Compiler shows an ERROR because of information loss (narrowing conversion)
Unsafe Initialization: = or ()
The compiler doesn't protect from information loss during type conversion.
int a = 7;   // The initialization is OK
int b = 7.5; // The initialization is OK, but information loss occurs. The actual value of b will become 7
int c(7);    // The initialization is OK
int d(7.5);  // The initialization is OK, but information loss occurs. The actual value of d will become 7