Why are Super-class and Sub-class reversed? - language-agnostic

In set theory, a set is a superset of another set if it contains everything in that set and possibly more. A subset, however, does not necessarily contain everything in the original set.
With that in mind, in most object-oriented programming languages (I'm thinking Objective-C but I know the same is true for Java and others), the parent class is called the super class, and any class which inherits from the super is called a subclass.
Isn't this backwards? A subclass inherits things like all instance variables and methods from its superclass, thus it "contains" everything from the parent, plus whatever is added in the subclass. Is this just a naming mistake or was this intentional, and if so why?

A superclass defines a class that has a larger set of possible values as members. A subclass restricts the items that can be part of its class, so it defines a smaller set of possible members.
The set of possible members of a superclass is a superset of the set of possible members of a subclass of that superclass.

Greg is correct. Two things to consider that might make it more clear:
the properties and methods are not relevant to the sub/super relationship in terms of set theory:
the properties and methods defined by a subclass may extend beyond those provided by its superclass (and in fact, they often do), but instances of the subclass are still members of the set of instances of the superclass
in other words, the sub/super relationship is not defined by the properties and methods, but by the instance-level semantics intended by the naming of the classes
Taxonomy example:
the set of all People is larger than the set of all Programmers
the set People is, in fact, a superset of the set Programmers
the set Programmers is a subset of the set People
so in OOP terms, People would be a superclass and Programmer would be a subclass. Every Programmer is a person, but not every person is a Programmer. Hence superclass and subclass. The fact that the Programmer class may have super powers beyond the ken of mortal men does not change the class-relationship (is-a) semantics.

Greg's answer is correct. Here's an explanation by example:
You have a base class Base. You have two derived classes DerivedA and DerivedB. Every instance of DerivedA is also an instance of Base. Likewise, every DerivedB is also a Base. But, a DerivedA is not a DerivedB and vice versa. So, if you were to draw a Venn diagram of the universe of all possible objects, you'd get:
     ________________________
    /                        \
   /          Base            \
  /    ______        ______    \
 |    /      \      /      \    |
 |   /        \    /        \   |
 |  | DerivedA |  | DerivedB |  |
 |   \        /    \        /   |
 |    \______/      \______/    |
  \                            /
   \                          /
    \________________________/
In other words, every object in the set of DerivedA objects is also in the set of Base objects. Likewise for DerivedB. So Base is indeed the superset of both DerivedA and DerivedB. Hence, it is the "superclass".

Probably for the same reason that stacks grow down (bottom at the top), trees grow down (root at the top) and 2D graphics systems are almost always quadrant IV (0,0 in upper left).

The subclass has all the [members] of its superclass [and more]. Isn't this backwards?
This issue crops up all over programming languages, and it always makes my head hurt. (Subtyping especially.)
Here are the rules:
When you are considering objects, the subclass/child/subtype has more methods and members. It can be used in more contexts. This seems counterintuitive.
When you are considering contexts, or interfaces, or arguments, roles are reversed. For example, a method expecting an argument of the supertype/parent/superclass can accept more arguments than a method expecting an argument of the subtype.
Which one is on top depends entirely on whether you think objects are primary or whether you think contexts expecting objects are primary. I have studied this subject for almost 15 years and still my intuition betrays me.
If a class declaration is considered as a specification, then the superclass specification is satisfied by more objects, and the subclass specification is satisfied by fewer objects. I believe this is the reason for the nomenclature. (It is a little clearer if you talk about subtypes and supertypes—a subtype is inhabited by fewer values than its supertype, because every value of the subtype is also a value of the supertype, and the supertype is likely inhabited by additional values that are not members of the subtype.)
Did I mention that the whole topic makes my head hurt?

I sidestep the whole super/sub class issue and refer to them as "derived" and "parent" class.

Yes, but if you think of your diagram as a topographic map, the subclasses have higher altitudes than the superclass. Hence the confusion.
Another way of looking at this is that the superclass is akin to the leading digit in a number (to make this a programming language friendly analogy, we'll say a floating point number). As the number acquires more digits, each new digit "inherits" all the digits that precede it. Similarly, as the subclass gains more methods, it inherits the list of superclasses, in the order in which they were named, that precede it.
Hope this helps.

Related

Does the term "monad" apply to values of types like Maybe or List, or does it instead apply only to the types themselves?

I've noticed that the word "monad" seems to be used in a somewhat inconsistent way. I've come to believe that this is because many (if not most) of the monad tutorials out there are written by folks who have only just started to figure monads out themselves (eg: nuclear waste spacesuit burritos), and so the term ends up getting kind of overloaded/corrupted.
In particular, I'm wondering whether the term "monad" can be applied to individual values of types like Maybe, List or IO, or if the term "monad" should really only be applied to the types themselves.
This is a subtle distinction, so perhaps an analogy might make it more clear. In mathematics we have rings, fields, groups, etc. These terms apply to an entire set of values along with the operations that can be performed on them, rather than to individual elements. For example, integers (along with the operations of addition, negation and multiplication) form a ring. You could say "Integer is a ring", but you would never say "5 is a ring".
So, can you say "Just 5 is a monad", or would that be as wrong as saying "5 is a ring"? I don't know category theory, but I'm under the impression that it really only makes sense to say "Maybe is a monad" and not "Just 5 is a monad".
"Monad" (and "Functor") are popularly misused as describing values.
No value is a monad, functor, monoid, applicative functor, etc.
Only types & type constructors (higher-kinded types) can be.
When you hear (and you will) that "lists are monoids" or "functions are monads", etc, or "this function takes a monad as an argument", don't believe it.
Ask the speaker "How can any value be a monoid (or monad or ...), considering that Haskell's classes classify types (including higher-order ones) rather than values?"
Lists are not monoids (etc). List a is.
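To make that point concrete in Haskell: the standard library declares its Monoid instance for the type [a] ("List a"), never for a particular list value (instance heads shown roughly as they appear in base):
-- instance Semigroup [a] where (<>)   = (++)
-- instance Monoid    [a] where mempty = []
example :: [Int]
example = mempty <> [1, 2] <> [3]   -- evaluates to [1,2,3]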
My guess is that this popular misuse stems from mainstream languages having value classes and not type classes, so that habitual, unconscious value-class thinking sneaks in.
Why does it matter whether we use language precisely?
Because we think in language and we build & convey understandings via language.
So in order to have clear thoughts, it helps to have clear language (or be able to at any time).
"The slovenliness of our language makes it easier for us to have foolish thoughts. The point is that the process is reversible." - George Orwell, Politics and the English Language
Edit: These remarks apply to Haskell, not to the more general setting of category theory.
List is a monad, List a is a type, and [] is a List a (an element of a type).
Technically, a monad is a functor with extra structure; and in Haskell we only use functors from the category of Haskell types to itself.
It is thus in particular a "function" which takes a type and returns another type (it has kind * -> *).
List, State s, Maybe, etc are monads. State is not a monad, since it has kind * -> * -> *.
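A quick GHCi session (assuming State from Control.Monad.State) confirms the kinds being discussed:
-- ghci> :kind Maybe
-- Maybe :: * -> *
-- ghci> :kind State
-- State :: * -> * -> *
-- ghci> :kind State Int
-- State Int :: * -> *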
(aside: to confuse matters, Monads are just functors, and if I give myself a partially ordered set A, then it forms a category, with Hom(a, b) = { 1 element } if a <= b and Hom(a, b) = empty otherwise. Now any increasing function f : A -> A forms a functor, and monads are those functions which satisfy x <= f(x) and f(f(x)) <= f(x), hence f(f(x)) = f(x) -- monads here are technically "elements of A -> A". See also closure operators.)
(aside 2: since you appear to know some mathematics, I encourage you to read about category theory. You'll see among others that algebraic structures can be seen as arising from monads. See this excellent blog entry from the excellent blog by Dan Piponi for a teaser.)
To be exact, monads are structures from category theory. They don't have a direct code counterpart. For simplicity let's talk about general functors instead of monads. In the case of Haskell roughly speaking a functor is a mapping from a class of types to a class of types that also maps functions in the first class to functions in the second. The Functor instance gives you access to the mapping function, but doesn't directly capture the concept of functors.
It is however fair to say that the type constructor as mentioned in the Functor instance is the actual functor:
instance Functor Tree
In this case Tree is the functor. However, because Tree is a type constructor, it cannot stand for both of the mappings that make up a functor at the same time. The function that maps functions is called fmap. So if you want to be precise, you have to say that the tuple (Tree, fmap) is the functor, where fmap is the particular fmap from Tree's Functor instance. For convenience, again, we say that Tree is the functor, because the corresponding fmap follows from its Functor instance.
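For concreteness, here is one possible complete definition behind that instance head (the Tree type itself is a hypothetical example); the fmap equations are the function-mapping half of the (Tree, fmap) pair:
-- A binary tree, and the Functor instance mapping functions over it.
data Tree a = Leaf a | Node (Tree a) (Tree a)

instance Functor Tree where
  fmap f (Leaf x)   = Leaf (f x)
  fmap f (Node l r) = Node (fmap f l) (fmap f r)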
Note that functors are always types of kind * -> *. So Maybe Int is not a functor – the functor is Maybe. Also people often talk about "the state monad", which is also imprecise. State is a whole family of infinitely many state monads, as you can see in the instance:
instance Monad (State s)
For every type s the type constructor State s (of kind * -> *) is a state monad, one of many.
So, can you say "Just 5 is a monad", or would that be as wrong as saying "5 is a ring"?
Your intuition is exactly right. Int is to Ring (or AbelianGroup or whatever) as Maybe is to Monad (or Functor or whatever). Values (5, Just 5, etc.) are unimportant.
In algebra, we say the set of integers form a ring; in Haskell we would say (informally) that Int is a member of the Ring typeclass, or (slightly more formally) that there exists a Ring instance for Int. You might find this proposal fun and/or useful. Anyway, same deal with monads.
I don't know category theory, but ...
Whatever, if you know a thing or two about abstract algebra, you're golden.
I would say "Just 5 is of a type that is an instance of a Monad" like i would say "5 is a number that has type (Integer) is a ring".
I use the term instance because is how in Haskell you declare an implementation of a typeclass, and Monad is one of them.

In which languages is function abstraction not primitive

In Haskell, the function type (->) is given: it is not an algebraic data type constructor, and one cannot re-implement it to be identical to (->).
So I wonder, which languages would allow me to write my own version of (->)? What is this property called?
UPD: reformulations of the question, thanks to the discussion:
Which languages don't have -> as a primitive type?
Why is -> necessarily primitive?
I can't think of any languages that have arrows as a user defined type. The reason is that arrows -- types for functions -- are baked in to the type system, all the way down to the simply typed lambda calculus. That the arrow type must be fundamental to the language comes directly from the fact that the way you form functions in the lambda calculus is via lambda abstraction (which, at the type level, introduces arrows).
Although Marcin aptly notes that you can program in a point free style, this doesn't change the essence of what you're doing. Having a language without arrow types as primitives goes against the most fundamental building blocks of Haskell. (The language you reference in the question.)
Having the arrow as a primitive type also shares some important ties to constructive logic: you can read the function arrow type as implication from intuitionistic logic, and programs having that type as "proofs." (Namely, if you have something of type A -> B, you have a proof that takes some premise of type A, and produces a proof for B.)
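A minimal Haskell illustration of that reading: a program inhabiting a function type is a proof of the corresponding implication, and applying it is modus ponens (the name here is mine, purely illustrative):
-- Given a proof that a implies b and a proof of a, produce a proof of b.
modusPonens :: (a -> b) -> a -> b
modusPonens proof premise = proof premise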
The fact that you're perturbed by having arrows baked into the language might imply that you're not fundamentally grasping why they're so tied to the design of the language; perhaps it's time to read a few chapters from Ben Pierce's "Types and Programming Languages".
Edit: You can always look at languages which don't have a strong notion of functions and have their semantics defined some other way -- such as Forth or PostScript -- but in these languages you don't define inductive data types in the same way as in functional languages like Haskell, ML, or Coq. To put it another way, in any language in which you define constructors for datatypes, arrows arise naturally from the constructors for these types. But in languages where you don't define inductive datatypes in the typical way, you don't get arrow types as naturally, because the language just doesn't work that way.
Another edit: I will stick in one more comment, since I thought of it last night. Function types (and function abstraction) form the basis of pretty much all programming languages -- at least at some level, even if it's "under the hood." However, there are languages designed to define the semantics of other languages. While this doesn't strictly match what you're talking about, PLT Redex is one such system, and is used for specifying and debugging the semantics of programming languages. It's not super useful from a practitioner's perspective (unless your goal is to design new languages, in which case it is fairly useful), but maybe that fits what you want.
Do you mean meta-circular evaluators like in SICP? Being able to write your own DSL? If you create your own "function type", you'll have to take care of "applying" it, yourself.
Just as an example, you could create your own "function" in C for instance, with a look-up table holding function pointers, and use integers as functions. You'd have to provide your own "call" function for such "functions", of course:
/* table of function pointers, indexed by the integers we use as "functions" */
void (*lookup_table[])(int) = { /* ... */ };

void call(unsigned int function, int data) {
    lookup_table[function](data);
}
You'd also probably want some means of creating more complex functions from primitive ones, for instance using arrays of ints to signify sequential execution of your "primitive functions" 1, 2, 3, ... and end up inventing whole new language for yourself.
I think early assemblers had no ability to create callable "macros" and had to use GOTO.
You could use trampolining to simulate function calls. You could have only a global variable store, with shallow binding perhaps. In such a language "functions" would be definable, though not a primitive type.
So having functions in a language is not necessary, though it is convenient.
In Common Lisp defun is nothing but a macro associating a name and a callable object (though lambda is still a built-in). In AutoLisp originally there was no special function type at all, and functions were represented directly by quoted lists of s-expressions, with the first element being an argument list. You can construct your function through the use of the cons and list functions, from symbols, directly, in AutoLisp:
(setq a (list (cons 'x NIL) '(+ 1 x)))
(a 5)
==> 6
Some languages (like Python) support more than one primitive function type, each with its own calling protocol - namely, generators support multiple re-entry and returns (even if syntactically through the use of the same def keyword). You can easily imagine a language which would let you define your own calling protocol, thus creating new function types.
Edit: as an example, consider dealing with multiple arguments in a function call, the choice between automatic currying or automatic optional args etc. In Common Lisp, say, you could easily create yourself two different call macros to directly represent the two calling protocols. Consider functions returning multiple values not through a kludge of aggregates (tuples, in Haskell), but directly into designated recipient vars/slots. All are different types of functions.
Function definition is usually primitive because (a) functions are how programmes get things done; and (b) this sort of lambda-abstraction is necessary to be able to programme in a pointful style (i.e. with explicit arguments).
Probably the closest you will come to a language that meets your criteria is one based on a purely pointfree model which allows you to create your own lambda operator. You might like to explore pointfree languages in general, and ones based on SKI calculus in particular: http://en.wikipedia.org/wiki/SKI_combinator_calculus
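For a taste of that style, here is a small Haskell sketch of the S and K combinators; the identity combinator I need not be primitive, since it is definable as S K K:
s :: (a -> b -> c) -> (a -> b) -> a -> c
s f g x = f x (g x)

k :: a -> b -> a
k x _ = x

i :: a -> a
i = s k k   -- s k k x = k x (k x) = x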
In such a case, you still have primitive function types, and you always will, because it is a fundamental element of the type system. If you want to get away from that at all, probably the best you could do would be some kind of type system based on a category-theoretic generalisation of functions, such that functions would be a special case of another type. See http://en.wikipedia.org/wiki/Category_theory.
Which languages don't have -> as a primitive type?
Well, if you mean a type that can be named, then there are many languages that don't have them. All languages where functions are not first class citizens don't have -> as a type you could mention somewhere.
But, as @Kristopher eloquently and excellently explained, functions are (or can, at least, be perceived as) the very basic building blocks of all computation. Hence even in Java, say, there are functions, but they are carefully hidden from you.
And, as someone mentioned assembler - one could maintain that the machine language (of most contemporary computers) is an approximation of the model of the register machine. But how is it done? With millions and billions of logical circuits, each of them being a materialization of quite primitive pure functions like NOT or NAND, arranged in a certain physical order (which is, obviously, the way hardware engineers implement function composition).
Hence, while you may not see functions in machine code, they're still the basis.
In Martin-Löf type theory, function types are defined via indexed product types (so-called Π-types).
Basically, the type of functions from A to B can be interpreted as a (possibly infinite) record, where all the fields are of the same type B, and the field names are exactly all the elements of A. When you need to apply a function f to an argument x, you look up the field in f corresponding to x.
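A tiny Haskell sketch of that record view, for the two-element domain Bool (all names are mine, purely illustrative): a function Bool -> b carries exactly the same data as a record with one field per element of Bool, and application is field lookup:
-- A "function type" Bool -> b encoded as a two-field record.
data BoolFun b = BoolFun { atFalse :: b, atTrue :: b }

apply :: BoolFun b -> Bool -> b
apply f False = atFalse f
apply f True  = atTrue f

notFun :: BoolFun Bool
notFun = BoolFun { atFalse = True, atTrue = False }
-- apply notFun True  ==> False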
The wikipedia article lists some programming languages that are based on Martin-Löf type theory. I am not familiar with them, but I assume that they are a possible answer to your question.
Philip Wadler's paper Call-by-value is dual to call-by-name presents a calculus in which variable abstraction and covariable abstraction are more primitive than function abstraction. Two definitions of function types in terms of those primitives are provided: one implements call-by-value, and the other call-by-name.
Inspired by Wadler's paper, I implemented a language (Ambidexter) which provides two function type constructors that are synonyms for types constructed from the primitives. One is for call-by-value and one for call-by-name. Neither Wadler's dual calculus nor Ambidexter provides user-defined type constructors. However, these examples show that function types are not necessarily primitive, and that a language in which you can define your own (->) is conceivable.
In Scala you can mixin one of the Function traits, e.g. a Set[A] can be used as A => Boolean because it implements the Function1[A,Boolean] trait. Another example is PartialFunction[A,B], which extends usual functions by providing a "range-check" method isDefinedAt.
However, in Scala methods and functions are different, and there is no way to change how methods work. Usually you don't notice the difference, as methods are automatically lifted to functions.
So you have a lot of control over how you implement and extend functions in Scala, but I think you have a real "replacement" in mind. I'm not sure that even makes sense.
Or maybe you are looking for languages with some kind of generalization of functions? Then Haskell with Arrow syntax would qualify: http://www.haskell.org/arrows/syntax.html
I suppose the dumb answer to your question is assembly code. This provides you with primitives even "lower" level than functions. You can create functions as macros that make use of register and jump primitives.
Most sane programming languages will give you a way to create functions as a baked-in language feature, because functions (or "subroutines") are the essence of good programming: code reuse.

What is ADT? (Abstract Data Type)

I am currently studying Abstract Data Types (ADTs) but I don't get the concept at all. Can someone please explain to me what this actually is? Also, what are the Collection, Bag, and List ADTs, in simple terms?
An Abstract Data Type (ADT) is a data type where only behavior is defined, but not implementation.
The opposite of an ADT is a Concrete Data Type (CDT), which contains an implementation of the ADT.
Examples:
Array, List, Map, Queue, Set, Stack, Table, Tree, and Vector are ADTs. Each of these ADTs has many implementations, i.e. CDTs. A container is a high-level ADT encompassing all of the above.
Real life example:
A book is abstract (a telephone book is an implementation).
The Abstract data type Wikipedia article has a lot to say.
In computer science, an abstract data type (ADT) is a mathematical model for a certain class of data structures that have similar behavior; or for certain data types of one or more programming languages that have similar semantics. An abstract data type is defined indirectly, only by the operations that may be performed on it and by mathematical constraints on the effects (and possibly cost) of those operations.
In slightly more concrete terms, you can take Java's List interface as an example. The interface doesn't explicitly define any behavior at all because there is no concrete List class. The interface only defines a set of methods that other classes (e.g. ArrayList and LinkedList) must implement in order to be considered a List.
A collection is another abstract data type. In the case of Java's Collection interface, it's even more abstract than List, since
The List interface places additional stipulations, beyond those specified in the Collection interface, on the contracts of the iterator, add, remove, equals, and hashCode methods.
A bag is also known as a multiset.
In mathematics, the notion of multiset (or bag) is a generalization of the notion of set in which members are allowed to appear more than once. For example, there is a unique set that contains the elements a and b and no others, but there are many multisets with this property, such as the multiset that contains two copies of a and one of b or the multiset that contains three copies of both a and b.
In Java, a Bag would be a collection that implements a very simple interface. You only need to be able to add items to a bag, check its size, and iterate over the items it contains. See Bag.java for an example implementation (from Sedgewick & Wayne's Algorithms 4th edition).
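A rough Haskell analogue of that minimal Bag interface (add, size, iterate), sketched as a multiset of item counts; the names are mine, not from any library:
import qualified Data.Map.Strict as Map

-- A bag (multiset): each item mapped to its multiplicity.
type Bag a = Map.Map a Int

add :: Ord a => a -> Bag a -> Bag a
add x = Map.insertWith (+) x 1

size :: Bag a -> Int
size = sum . Map.elems

toList :: Bag a -> [a]   -- iterate over the items, with repetition
toList = concatMap (\(x, n) -> replicate n x) . Map.toList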
A truly abstract data type describes the properties of its instances without commitment to their representation or particular operations. For example the abstract (mathematical) type Integer is a discrete, unlimited, linearly ordered set of instances. A concrete type gives a specific representation for instances and implements a specific set of operations.
Notation of Abstract Data Type (ADT)
An abstract data type could be defined as a mathematical model with a
collection of operations defined on it. A simple example is the set of
integers together with the operations of union, intersection defined
on the set.
ADTs are generalizations of primitive data types (integer, char,
etc.) and they encapsulate a data type in the sense that the definition
of the type and all operations on that type are localized to one section
of the program. They are treated as a primitive data type outside the
section in which the ADT and its operations are defined.
An implementation of an ADT is the translation into statements of
a programming language of the declaration that defines a variable to
be of that ADT, plus a procedure in that language for each
operation of that ADT. The implementation of the ADT chooses a
data structure to represent the ADT.
A useful tool for specifying the logical properties of a data type is
the abstract data type. Fundamentally, a data type is a collection of
values and a set of operations on those values. That collection and
those operations form a mathematical construct that may be implemented
using a particular hardware and software data structure. The term
"abstract data type" refers to the basic mathematical concept that defines the data type.
In defining an abstract data type as a mathematical concept, we are not
concerned with space or time efficiency. Those are implementation
issues. In fact, the definition of an ADT is not concerned with
implementation details at all. It may not even be possible to implement
a particular ADT on a particular piece of hardware or using a
particular software system. For example, we have already seen that an
ADT integer is not universally implementable.
To illustrate the concept of an ADT and my specification method,
consider the ADT RATIONAL which corresponds to the mathematical
concept of a rational number. A rational number is a number that can
be expressed as the quotient of two integers. The operations on
rational numbers that we define are the creation of a rational number
from two integers, addition, multiplication and testing for equality.
The following is an initial specification of this ADT.
/* Value definition */
abstract typedef <integer, integer> RATIONAL;
condition RATIONAL[1] != 0;

/* Operator definition */
abstract RATIONAL makerational(a, b)
int a, b;
precondition b != 0;
postcondition makerational[0] == a;
              makerational[1] == b;

abstract RATIONAL add(a, b)
RATIONAL a, b;
postcondition add[1] == a[1] * b[1];
              add[0] == a[0] * b[1] + b[0] * a[1];

abstract RATIONAL mult(a, b)
RATIONAL a, b;
postcondition mult[0] == a[0] * b[0];
              mult[1] == a[1] * b[1];

abstract equal(a, b)
RATIONAL a, b;
postcondition equal == (a[0] * b[1] == b[0] * a[1]);
An ADT consists of two parts:
1) Value definition
2) Operation definition
1) Value Definition:
The value definition defines the collection of values for the ADT and
consists of two parts:
1) Definition Clause
2) Condition Clause
For example, the value definition for the ADT RATIONAL states that
a RATIONAL value consists of two integers, the second of which does
not equal to 0.
The keyword abstract typedef introduces a value definition, and the
keyword condition is used to specify any conditions on the newly
defined data type. In this definition the condition specifies that the
denominator may not be 0. The definition clause is required, but the
condition may not be necessary for every ADT.
2) Operator Definition:
Each operator is defined as an abstract function with three parts.
1)Header
2)Optional Preconditions
3)Optional Postconditions
For example the operator definition of the ADT RATIONAL includes the
operations of creation (makerational), addition (add) and
multiplication (mult) as well as a test for equality (equal). Let us
consider the specification for multiplication first, since, it is the
simplest. It contains a header and post-conditions, but no
pre-conditions.
abstract RATIONAL mult(a, b)
RATIONAL a, b;
postcondition mult[0] == a[0] * b[0];
              mult[1] == a[1] * b[1];
The header of this definition is the first two lines, which are just
like a C function header. The keyword abstract indicates that it is
not a C function but an ADT operator definition.
The post-condition specifies what the operation does. In a
post-condition, the name of the function (in this case, mult) is used
to denote the result of the operation. Thus, mult[0] represents the
numerator of the result and mult[1] represents the denominator of the
result. That is, it specifies what conditions become true after the
operation is executed. In this example the post-condition specifies
that the numerator of the result of a rational multiplication equals
the integer product of the numerators of the two inputs, and the
denominator equals the integer product of the two denominators.
List
In computer science, a list or sequence is an abstract data type that
represents a countable number of ordered values, where the same value
may occur more than once. An instance of a list is a computer
representation of the mathematical concept of a finite sequence; the
(potentially) infinite analog of a list is a stream. Lists are a basic
example of containers, as they contain other values. If the same value
occurs multiple times, each occurrence is considered a distinct item.
The name list is also used for several concrete data structures that
can be used to implement abstract lists, especially linked lists.
Bag
A bag is a collection of objects, where you can keep adding objects to
the bag, but you cannot remove them once added to the bag. So with a
bag data structure, you can collect all the objects, and then iterate
through them. You will use bags routinely when you program in Java.
Collection
A collection in the Java sense refers to any class that implements the
Collection interface. A collection in a generic sense is just a group
of objects.
Actually, an Abstract Data Type is:
A concept or theoretical model that defines a data type logically
Specifies a set of data and a set of operations that can be performed on that data
Does not mention anything about how the operations will be implemented
"Existing as an idea but not having a physical existence"
For example, let's see the specifications of some Abstract Data Types:
List Abstract Data Type: initialize(), get(), insert(), remove(), etc.
Stack Abstract Data Type: push(), pop(), peek(), isEmpty(), isNull(), etc.
Queue Abstract Data Type: enqueue(), dequeue(), size(), peek(), etc.
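To make the spec-versus-implementation split concrete, here is a minimal sketch in Haskell (one language among many that could express it), where a type class plays the role of the ADT's interface and a list-backed type is one possible concrete implementation; all names are illustrative:
-- The Stack ADT: operations only, no implementation.
class Stack s where
  empty   :: s a
  push    :: a -> s a -> s a
  pop     :: s a -> Maybe (a, s a)
  isEmpty :: s a -> Bool

-- One concrete implementation, backed by a list.
newtype ListStack a = ListStack [a]

instance Stack ListStack where
  empty                  = ListStack []
  push x (ListStack xs)  = ListStack (x : xs)
  pop (ListStack [])     = Nothing
  pop (ListStack (x:xs)) = Just (x, ListStack xs)
  isEmpty (ListStack xs) = null xs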
One of the simplest explanations, given on Brilliant's wiki:
Abstract data types, commonly abbreviated ADTs, are a way of
classifying data structures based on how they are used and the
behaviors they provide. They do not specify how the data structure
must be implemented or laid out in memory, but simply provide a
minimal expected interface and set of behaviors. For example, a stack
is an abstract data type that specifies a linear data structure with
LIFO (last in, first out) behavior. Stacks are commonly implemented
using arrays or linked lists, but a needlessly complicated
implementation using a binary search tree is still a valid
implementation. To be clear, it is incorrect to say that stacks are
arrays or vice versa. An array can be used as a stack. Likewise, a
stack can be implemented using an array.
Since abstract data types don't specify an implementation, this means
it's also incorrect to talk about the time complexity of a given
abstract data type. An associative array may or may not have O(1)
average search times. An associative array that is implemented by a
hash table does have O(1) average search times.
Examples of ADTs: List (can be implemented using an array or a linked list), Queue, Deque, Stack, Associative array, Set.
https://brilliant.org/wiki/abstract-data-types/?subtopic=types-and-data-structures&chapter=abstract-data-types
ADTs are a set of data values and associated operations that are precisely specified independent of any particular implementation. The strength of an ADT is that the implementation is hidden from the user; only the interface is declared. This means that the ADT can be implemented in various ways.
An Abstract Data Type is a mathematical model that includes data with various operations. Implementation details are hidden, and that's why it is called abstract. Abstraction allows you to organise the complexity of the task by focusing on the logical properties of data and actions.
In programming languages, a type is some data and the associated operations. An ADT is a user-defined data aggregate together with the operations over these data. It is characterized by encapsulation (the data and operations are represented, or at least declared, in a single syntactic unit) and information hiding (only the relevant operations are visible to the user of the ADT, through the ADT interface, in the same way as a normal data type in the programming language). It's an abstraction because the internal representation of the data and the implementation of the operations are of no concern to the ADT user.
Before defining abstract data types, let us consider the different
views of system-defined data types. We all know that by default all
primitive data types (int, float, etc.) support basic operations such
as addition and subtraction. The system provides the implementations
for the primitive data types. For user-defined data types, we also
need to define operations. The implementation for these operations can
be done when we want to actually use them. That means in general,
user-defined data types are defined along with their operations.
To simplify the process of solving problems, we combine the data
structures with their operations and we call this "Abstract Data
Type". (ADT's).
Commonly used ADT'S include: Linked List, Stacks, Queues, Binary Tree,
Dictionaries, Disjoint Sets (Union and find), Hash Tables and many
others.
ADTs consist of two parts:
1. Declaration of data.
2. Declaration of operation.
Simply put, an Abstract Data Type is nothing but a set of operations and a set of data, used for storing some other data efficiently in the machine.
There is no need for any particular type declaration.
It just requires an implementation of the ADT.
To solve problems we combine the data structure with their operations. An ADT consists of two parts:
Declaration of Data.
Declaration of Operation.
Commonly used ADTs are Linked Lists, Stacks, Queues, Priority Queues, Trees etc. While defining ADTs we don't need to worry about implementation details. They come into the picture only when we want to use them.
Abstract data types are like user-defined data types on which we can perform operations without knowing what is inside the data type or how the operations are performed on it. As the information is not exposed, it is abstracted. E.g. List, Array, Stack, Queue. On a Stack we can perform operations like push and pop, but we are not sure how they are implemented behind the curtains.
An ADT is a set of objects and operations; nowhere in an ADT's definition is there any mention of how the set of operations is implemented. Programmers who use collections only need to know how to instantiate and access data in some pre-determined manner, without concern for the details of the collection's implementation. In other words, from a user's perspective, a collection is an abstraction, and for this reason, in computer science, some collections are referred to as abstract data types (ADTs). The user is only concerned with learning its interface, or the set of operations it performs.
In a simple word: an abstract data type is a collection of data and operations that work on that data. The operations both describe the data to the rest of the program and allow the rest of the program to change the data. The word "data" in "abstract data type" is used loosely. An ADT might be a graphics window with all the operations that affect it, a file and file operations, an insurance-rates table and the operations on it, or something else.
(from the book Code Complete 2)
Abstract data type is the collection of values and any kind of operation on these values. For example, since String is not a primitive data type, we can include it in abstract data types.
An ADT is a data type in which a collection of data and the operations that work on that data are combined. It focuses more on the concept than the implementation.
It's up to you which language you use to make it visible on the earth.
Example:
A Stack is an ADT while an Array is not.
A Stack is an ADT because we can implement it in many languages
(Python, C, C++, Java and many more), while an Array is a built-in data type.
An abstract data type, sometimes abbreviated ADT, is a logical description of how we view the data and the operations that are allowed without regard to how they will be implemented. This means that we are concerned only with what the data is representing and not with how it will eventually be constructed.
https://runestone.academy/runestone/books/published/pythonds/Introduction/WhyStudyDataStructuresandAbstractDataTypes.html
Abstractions give you only information (service information), not the implementation.
For example: when you go to withdraw money from an ATM, you just know one thing: put your ATM card into the machine, click the withdraw option, enter the amount, and your money comes out if there is money.
This is all you know about ATMs. But do you know how you are receiving the money? What business logic is going on behind the scenes? Which database is being called? Which server at which location is being invoked? No, you only know the service information, i.e. that you can withdraw money. This is an abstraction.
Similarly, an ADT gives you an overview of a data type: what can be stored and what operations you can perform on that data type. But it doesn't say how to implement them. This is an ADT. It only defines the logical form of your data types.
Another analogy is:
In a car or bike, you only know that when you press the brake your vehicle will stop. But do you know how the bike stops when you press the brake? No; the implementation detail is hidden. You only know what the brake does when you press it, not how it does it.
An abstract data type (ADT) is an abstraction of a data structure that provides only the interface to which the data structure must adhere. The interface does not give any specific details about how something should be implemented or in what programming language.
The term data type refers to the type of data which a particular variable can hold - it may be an integer, a character, a float, or any simple data-storage representation. However, when we build an object-oriented system, we use other data types, known as abstract data types, which represent more realistic entities.
E.g.: we might be interested in representing a 'bank account' data type, which describes how all bank accounts are handled in a program. Abstraction is about reducing complexity, ignoring unnecessary details.

Purity vs Referential transparency

The terms do appear to be defined differently, but I've always thought of one implying the other; I can't think of any case when an expression is referentially transparent but not pure, or vice-versa.
Wikipedia maintains separate articles for these concepts and says:
From Referential transparency:
If all functions involved in the
expression are pure functions, then
the expression is referentially
transparent. Also, some impure
functions can be included in the
expression if their values are
discarded and their side effects are
insignificant.
From Pure expressions:
Pure functions are required to
construct pure expressions. [...] Pure
expressions are often referred to as
being referentially transparent.
I find these statements confusing. If the side effects from a so-called "impure function" are insignificant enough to allow not performing them (i.e. replace a call to such a function with its value) without materially changing the program, it's the same as if it were pure in the first place, isn't it?
Is there a simpler way to understand the differences between a pure expression and a referentially transparent one, if any? If there is a difference, an example expression that clearly demonstrates it would be appreciated.
If I gather in one place any three theorists of my acquaintance, at least two of them disagree on the meaning of the term "referential transparency." And when I was a young student, a mentor of mine gave me a paper explaining that even if you consider only the professional literature, the phrase "referentially transparent" is used to mean at least three different things. (Unfortunately that paper is somewhere in a box of reprints that have yet to be scanned. I searched Google Scholar for it but I had no success.)
I cannot inform you, but I can advise you to give up: Because even the tiny cadre of pointy-headed language theorists can't agree on what it means, the term "referentially transparent" is not useful. So don't use it.
P.S. On any topic to do with the semantics of programming languages, Wikipedia is unreliable. I have given up trying to fix it; the Wikipedian process seems to value change and popular voting over stability and accuracy.
All pure functions are necessarily referentially transparent. Since, by definition, they cannot access anything other than what they are passed, their result must be fully determined by their arguments.
However, it is possible to have referentially transparent functions which are not pure. I can write a function which is given an int i, then generates a random number r, subtracts r from itself and places the result in s (so s is always zero), then returns i - s (which is always i). Clearly this function is impure, because it is generating random numbers. However, it is referentially transparent. In this case, the example is silly and contrived. However, in, e.g., Haskell, the id function is of type a -> a whereas my stupidId function would be of type a -> IO a, indicating that it makes use of side effects. When a programmer can guarantee through means of an external proof that their function is actually referentially transparent, then they can use unsafePerformIO to strip the IO back away from the type.
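A sketch of that stupidId in Haskell (assuming randomIO from System.Random in the random package): the type advertises side effects, yet the result is always the argument:
import System.Random (randomIO)

-- Impure by type (it consults a random-number generator), yet
-- referentially transparent in behaviour: the result is always i.
stupidId :: Int -> IO Int
stupidId i = do
  r <- randomIO :: IO Int
  let s = r - r        -- always 0
  return (i - s)       -- always i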
I'm somewhat unsure of the answer I give here, but surely somebody will point us in some direction. :-)
"Purity" is generally considered to mean "lack of side-effects". An expression is said to be pure if its evaluation lacks side-effects. What's a side-effect then? In a purely functional language, side-effect is anything that doesn't go by the simple beta-rule (the rule that to evaluate function application is the same as to substitute actual parameter for all free occurrences of the formal parameter).
For example, in a functional language with linear (or uniqueness; the distinction needn't bother us at the moment) types, some (controlled) mutation is allowed.
So I guess we have sorted out what "purity" and "side-effects" might be.
Referential transparency (according to the Wikipedia article you cited) means that a variable can be replaced by the expression it denotes (abbreviates, stands for) without changing the meaning of the program at hand (btw, this is also a hard question to tackle, and I won't attempt to do so here). So, "purity" and "referential transparency" are indeed different things: "purity" is a property of some expression that roughly means "doesn't produce side-effects when executed", whereas "referential transparency" is a property relating a variable to the expression it stands for, and means "the variable can be replaced with what it denotes".
Hopefully this helps.
These slides from one ACCU2015 talk have a great summary on the topic of referential transparency.
From one of the slides:
A language is referentially transparent if (a)
every subexpression can be replaced by any other
that’s equal to it in value and (b) all occurrences of
an expression within a given context yield the
same value.
You can have, for instance, a function that logs its computation to the program's standard output (so it won't be a pure function), but you can replace calls to this function with calls to a similar function that doesn't log its computation. Therefore, this function has the referential transparency property. But... the above definition is about languages, not expressions, as the slides emphasize.
[...] it's the same as if it were pure in the first place, isn't it?
From the definitions we have, no, it is not.
Is there a simpler way to understand the differences between a pure expression and a referentially transparent one, if any?
Try the slides I mentioned above.
The nice thing about standards is that there are so many of them to choose
from.
Andrew S. Tanenbaum.
...along with definitions of referential transparency:
from page 176 of Functional programming with Miranda by Ian Holyer:
8.1 Values and Behaviours
The most important property of the semantics of a pure functional language is that the declarative and operational views of the language coincide exactly, in the following way:
Every expression denotes a value, and there are values corresponding to all possible program behaviours. The behaviour produced by an expression in any context is completely determined by its value, and vice versa.
This principle is usually rather opaquely called referential transparency.
and from Nondeterminism with Referential Transparency in Functional Programming Languages by F. Warren Burton:
[...] the property that an expression always has the same value in the same environment [...]
...for various other definitions, see Referential Transparency, Definiteness and Unfoldability by Harald Søndergaard and Peter Sestoft.
Instead, we'll begin with the concept of "purity". For the three of you who didn't know it already, the computer or device you're reading this on is a solid-state Turing machine, a model of computing intrinsically connected with effects. So every program, functional or otherwise, needs to use those effects To Get Things Done™.
What does this mean for purity? At the assembly-language level, which is the domain of the CPU, all programs are impure. If you're writing a program in assembly language, you're the one who is micro-managing the interplay between all those effects - and it's really tedious!
Most of the time, you're just instructing the CPU to move data around in the computer's memory, which only changes the contents of individual memory locations - nothing to see there! It's only when your instructions direct the CPU to e.g. write to video memory, that you observe a visible change (text appearing on the screen).
For our purposes here, we'll split effects into two coarse categories:
those involving I/O devices like screens, speakers, printers, VR-headsets, keyboards, mice, etc; commonly known as observable effects.
and the rest, which only ever change the contents of memory.
In this situation, purity just means the absence of those observable effects, the ones which cause a visible change to the environment of the running program, maybe even its host computer. It is definitely not the absence of all effects, otherwise we would have to replace our solid-state Turing machines!
Now for the question of life, the Universe and everything: what exactly is meant by the term "referential transparency"? Instead of herding cats trying to bring theorists into agreement, let's just try to find the original meaning given to the term. Fortunately for us, the term frequently appears in the context of I/O in Haskell - we only need a relevant article... here's one: from the first page of Owen Stephens's Approaches to Functional I/O:
Referential transparency refers to the ability to replace a sub-expression with one of equal value, without changing the value of the outer expression. Originating from Quine the term was introduced to Computer Science by Strachey.
Following the references:
From page 9 of 39 in Christopher Strachey's Fundamental Concepts in Programming Languages:
One of the most useful properties of expressions is that called by Quine referential transparency. In essence this means that if we wish to find the value of an expression which contains a sub-expression, the only thing we need to know about the sub-expression is its value. Any other features of the sub-expression, such as its internal structure, the number and nature of its components, the order in which they are evaluated or the colour of the ink in which they are written, are irrelevant to the value of the main expression.
From page 163 of 314 in Willard Van Ormond Quine's Word and Object:
[...] Quotation, which thus interrupts the referential force of a term, may be said to fail of referential transparency2. [...] I call a mode of containment Φ referentially transparent if, whenever an occurrence of a singular term t is purely referential in a term or sentence ψ(t), it is purely referential also in the containing term or sentence Φ(ψ(t)).
with the footnote:
2 The term is from Whitehead and Russell, 2d ed., vol. 1, p. 665.
Following that reference:
From page 709 of 719 in Principia Mathematica by Alfred North Whitehead and Bertrand Russell:
When an assertion occurs, it is made by means of a particular fact, which is an instance of the proposition asserted. But this particular fact is, so to speak, "transparent"; nothing is said about it, but by means of it something is said about something else. It is the "transparent" quality which belongs to propositions as they occur in truth-functions.
Let's try to bring all that together:
Whitehead and Russell introduce the term "transparent";
Quine then defines the qualified term "referential transparency";
Strachey then adapts Quine's definition in defining the basics of programming languages.
So it's a choice between Quine's original or Strachey's adapted definition. You can try translating Quine's definition for yourself if you like - everyone who's ever contested the definition of "purely functional" might even enjoy the chance to debate something different like what "mode of containment" and "purely referential" really means...have fun! The rest of us will just accept that Strachey's definition is a little vague ("In essence [...]") and continue on:
One useful property of expressions is referential transparency. In essence this means that if we wish to find the value of an expression which contains a sub-expression, *the only thing we need to know about the sub-expression is its value*. Any other features of the sub-expression, such as its internal structure, the number and nature of its components, the order in which they are evaluated or the colour of the ink in which they are written, are irrelevant to the value of the main expression.
(emphasis by me.)
Regarding that description ("that if we wish to find the value of [...]"), a similar, but more concise statement is given by Peter Landin in The Next 700 Programming Languages:
the thing an expression denotes, i.e., its "value", depends only on the values of its sub-expressions, not on other properties of them.
Thus:
One useful property of expressions is referential transparency. In essence this means the thing an expression denotes, i.e., its "value", depends only on the values of its sub-expressions, not on other properties of them.
Strachey provides some examples:
(page 12 of 39)
We tend to assume automatically that the symbol x in an expression such as 3x² + 2x + 17 stands for the same thing (or has the same value) on each occasion it occurs. This is the most important consequence of referential transparency and it is only in virtue of this property that we can use the where-clauses or λ-expressions described in the last section.
(and on page 16)
When the function is used (or called or applied) we write f[ε] where ε can be an expression. If we are using a referentially transparent language all we require to know about the expression ε in order to evaluate f[ε] is its value.
So referential transparency, by Strachey's original definition, implies purity - in the absence of an order of evaluation, observable and other effects are practically useless...
I'll quote John Mitchell's Concepts in Programming Languages. He says a pure functional language must pass the declarative language test, i.e. be free from side effects:
"Within the scope of specific declarations of x1,...,xn, all occurrences of an expression e containing only the variables x1,...,xn have the same value."
In linguistics, a name or noun phrase is considered referentially transparent if it may be replaced with another noun phrase with the same referent without changing the meaning of the sentence containing it.
This holds in the first case below, but gets weird in the second.
Case 1:
"I saw Walter get into his new car."
And if Walter owns a Centro, then we could replace that in the given sentence as:
"I saw Walter get into his Centro"
Contrary to the first:
Case #2: He was called William Rufus because of his red beard.
Rufus means somewhat red, and the reference was to William II of England.
"He was called William II because of his red beard." looks too awkward.
The traditional way to say it is: a language is referentially transparent if we may replace one expression with another of equal value anywhere in the program without changing the meaning of the program.
So, referential transparency is a property of pure functional languages.
And if your program is free from side effects, then this property will hold.
So "give it up" is awesome advice, but "get it on" might also look good in this context.
Pure functions are those that return the same value on every call, and do not have side effects.
Referential transparency means that you can replace a bound variable with its value and still receive the same output.
Both pure and referentially transparent:
def f1(x):
    t1 = 3 * x
    t2 = 6
    return t1 + t2
Why is this pure?
Because it is a function of only the input x and has no side-effects.
Why is this referentially transparent?
You could replace t1 and t2 in f1 with their respective right hand sides in the return statement, as follows
def f2(x):
    return 3 * x + 6
and f2 will still always return the same result as f1 in every case.
Pure, but not referentially transparent:
Let's modify f1 as follows:
def f3(x):
    t1 = 3 * x
    t2 = 6
    x = 10
    return t1 + t2
Let us try the same trick again by replacing t1 and t2 with their right hand sides, and see if it is an equivalent definition of f3.
def f4(x):
    x = 10
    return 3 * x + 6
We can easily observe that f3 and f4 are not equivalent on replacing variables with their right hand sides / values. f3(1) would return 9 and f4(1) would return 36.
Referentially transparent, but not pure:
Simply modifying f1 to receive a non-local value of x, as follows:
x = 10  # assumed: some global value for x, needed for f5 to run

def f5():
    global x
    t1 = 3 * x
    t2 = 6
    return t1 + t2
Performing the same replacement exercise from before shows that f5 is still referentially transparent. However, it is not pure because it is not a function of only the arguments passed to it.
Observing carefully, the reason we lose referential transparency moving from f3 to f4 is that x is modified. In the general case, making a variable final (or, for those familiar with Scala, using vals instead of vars) and using immutable objects can help keep a function referentially transparent. This makes them more like variables in the algebraic or mathematical sense, thus lending themselves better to formal verification.

Should Tuples Subclass Each Other?

Given a set of tuple classes in an OOP language: Pair, Triple and Quad, should Triple subclass Pair, and Quad subclass Triple?
The issue, as I see it, is whether a Triple should be substitutable as a Pair, and likewise Quad for Triple or Pair. Whether Triple is also a Pair and Quad is also a Triple and a Pair.
In one context, such a relationship might be valuable for extensibility - today this thing returns a Pair of things, tomorrow I need it to return a Triple without breaking existing callers, who are only using the first two of the three.
On the other hand, should they each be distinct types? I can see benefit in stronger type checking - where you can't pass a Triple to a method that expects a Pair.
I am leaning towards using inheritance, but would really appreciate input from others.
PS: In case it matters, the classes will (of course) be generic.
PPS: On a way more subjective side, should the names be Tuple2, Tuple3 and Tuple4?
Edit: I am thinking of these more as loosely coupled groups; not specifically for things like x/y or x/y/z coordinates, though they may be used for such. It would be things like needing a general solution for multiple return values from a method, but in a form with very simple semantics.
That said, I am interested in all the ways others have actually used tuples.
A different length of tuple is a different type. (Well, in many type systems, anyway.) In a strongly typed language, I wouldn't think that they should be a collection.
This is a good thing as it ensures more safety. Places where you return tuples usually have somewhat coupled information along with it, the implicit knowledge of what each component is. It's worse if you pass in more values in a tuple than expected -- what's that supposed to mean? It doesn't fit inheritance.
Another potential issue is if you decide to use overloading. If tuples inherit from each other, then overload resolution will fail where it should not. But this is probably a better argument against overloading.
Of course, none of this matters if you have a specific use case and find that certain behaviours will help you.
Edit: If you want general information, try perusing a bit of Haskell or ML family (OCaml/F#) to see how they're used and then form your own decisions.
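As a small illustration of how this plays out in Haskell: a pair and a triple are entirely unrelated types, so the stronger type checking mentioned in the question is the default behaviour:
-- (a, b) and (a, b, c) are distinct, unrelated types.
firstOfPair :: (a, b) -> a
firstOfPair (x, _) = x

-- firstOfPair (1, 2, 3)  -- rejected by the type checker:
--                        -- a triple is not a pair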
It seems to me that you should make a generic Tuple interface (or use something like the Collection mentioned above), and have your pair and 3-tuple classes implement that interface. That way, you can take advantage of polymorphism but also allow a pair to use a simpler implementation than an arbitrary-sized tuple. You'd probably want to make your Tuple interface include .x and .y accessors as shorthand for the first two elements, and larger tuples can implement their own shorthands as appropriate for items with higher indices.
Like most design related questions, the answer is - It depends.
If you are looking for conventional Tuple design, Tuple2, Tuple3 etc. is the way to go. The problem with inheritance is that, first of all, a Triple is not a kind of Pair. How would you implement the equals method for it? Is a Triple equal to a Pair with the same first two items? If you have a collection of Pairs, can you add a Triple to it, or vice versa? If in your domain this is fine, you can go with inheritance.
In any case, it pays to have an interface/abstract class (maybe Tuple) which all of these implement.
it depends on the semantics that you need -
a pair of opposites is not semantically compatible with a 3-tuple of similar objects
a pair of coordinates in polar space is not semantically compatible with a 3-tuple of coordinates in Euclidean space
if your semantics are simple compositions, then a generic class Tuple<N> would make more sense
I'd go with 0,1,2 or infinity. e.g. null, 1 object, your Pair class, or then a collection of some sort.
Your Pair could even implement a Collection interface.
If there's a specific relationship between Three or Four items, it should probably be named.
[Perhaps I'm missing the problem, but I just can't think of a case where I want to specifically link 3 things in a generic way]
Gilad Bracha blogged about tuples, which I found interesting reading.
One point he made (whether correctly or not I can't yet judge) was:
Literal tuples are best defined as read only. One reason for this is that readonly tuples are more polymorphic. Long tuples are subtypes of short ones:
{S. T. U. V} <= {S. T. U} <= {S. T} <= {S}
[and] read only tuples are covariant:
T1 <= T2, S1 <= S2 ==> {S1. T1} <= {S2. T2}
That would seem to suggest my inclination to using inheritance may be correct, and would contradict amit.dev when he says that a Triple is not a Pair.