ANY , NONE and Unit in Nim - any

i couldn't find any specific information in the manual.
can anyone clarify how does ANY , NONE and type unit are reflected in Nim?
short definitions -
a unit type is a type that allows only one value (and thus can hold no information). The carrier (underlying set) associated with a unit type can be any singleton set. There is an isomorphism between any two such sets, so it is customary to talk about the unit type and ignore the details of its value. One may also regard the unit type as the type of 0-tuples, i.e. the product of no types.
ANY -
type ANY also known as ALL or Top , is the universal set. (all possible values).
NONE- the "empty set"
thank you!

Your question seems to be about sets. Let's have a look:
let emptySet: set[int8] = {}
This is an empty set of type int8. The {} literal for the empty set is implicitly casted to any actual set type.
let singletonSet = {1'i8}
This is a set containing exactly one value (a unit type if I understand it correctly). The type of the set can now be automatically deduced from the type of the single value in it.
let completeSet = {low(int8) .. high(int8)}
This set holds all possible int8 values.
The builtin set type is implemented as bitvector and thus can only be used for value types which can hold only a small set of possible values (for int8, the bitvector is already 256 bits long). Besides int8, it is usually used for char and enumeration types.
Then there is HashSet from the module sets which can hold larger types. However, if you construct a HashSet containing all possible values, memory consumption will probably be enormous.

Nim is not a functional language, and never claims to be one. There is no equivalent of these types, and the solution is more like the road that c++ takes.
There is void, wich is closest to what Unit is. The Any type does not exist, but there is the untyped pointer. But that type does not hold any type information in it, so you need to know what you can cast it to. And for NONE, or Nothing how I know it from scala, you have to use void, too. But here you can add the noReturn pragma.

Related

When comparing Ada Scalar type With a subtype do I need to range check?

When comparing Ada Scalar type with a constrained subtype do I need to range check?
I know that in an assignment to the constrained subtype from the base type I am at risk to a Range_Check exception at runtime if I don't ensure that the value is within range prior to assignment.
But is this also true when performing a comparison? I would think that since the only knowledge the user wants back is a Boolean result, that no implicit conversion to the base type or range checking would be needed.
Please Note: I am looking for an answer that references the Ada95 LRM.
Example
declare
type Day_Type is (SUN, MON, TUE, WED, THU, FRI, SAT);
subtype Workday_Type is MON .. FRI;
Payday : constant Workday_Type := FRI;
...
function Is_Payday (Day : Day_Type)
return Boolean is
begin
return (Day = Payday);
end Is_Payday;
begin
-- Will this raise a RANGE_CHECK error in Is_Payday()?
if Is_Payday(Day => SAT) then
...
elsif Is_Payday(Day => FRI) then
...
end if;
end;
What I've Found So Far...
I have not found the full answer. But I did find several interesting bits of data.
Operations of Discrete Types / Chapter 3.5.5 para 12 it says:
3.2 Types and Subtypes -- This chapter only discusses how types and subtypes are defined.
3.5.5 Operations of Discrete Types -- This chapter had the interesting note in it but I'm not sure it or anything else in this chapter are relevant.
(31) For a subtype of a discrete type, the result delivered by the
attribute Val might not belong to the subtype; similarly, the actual
parameter of the attribute Pos need not belong to the subtype. The
following relations are satisfied (in the absence of an exception) by
these attributes:
S'Val(S'Pos(X)) = X
S'Pos(S'Val(N)) = N
4.5.1 Logical Operators and Short Circuit Control Forms -- Nothing here about types vs subtypes.
4.6 Type Conversions -- Now, this is where I think the golden answer lies. But its not jumping out at me. I see a small clue in note (20).
(20) In addition to explicit type_conversions, type conversions are performed implicitly in situations where the expected type and the actual type of a construct differ, as is permitted by the type resolution rules (see 8.6). For example, an integer literal is of the type universal_integer, and is implicitly converted when assigned to a target of some specific integer type. Similarly, an actual parameter of a specific tagged type is implicitly converted when the corresponding formal parameter is of a class-wide type.
Even when the expected and actual types are the same, implicit subtype conversions are performed to adjust the array bounds (if any) of an operand to match the desired target subtype, or to raise Constraint_Error if the (possibly adjusted) value does not satisfy the constraints of the target subtype.
This seems to suggest that implicit type conversions will always be performed down to the subtype (the language is difficult for me to read). But I don't see anywhere that states a comparison performed between an subtype and a base type required.
I am also confused by the statement:
implicit subtype conversions are performed to adjust the array bounds (if any) of an operand to match the desired target subtype
Does Array Bounds reference array types? or Is it referencing the range of the constrained subtype?
Is there a Target Subtype if there is no subtype?
Ada 95 Rationale...
Ada95 Rationale - Part I Overview -- This is Too High Level. Nothing helpful here.
Ada95 Rationale - Part II - 3.1 Types, Classes, Objects and Views -- This section had lots of insightful information about types, but still none relevant to comparing Types and Subtypes
I have probably just missed the answer in my search. But I'm looking for definitive proof one way or another.
Sorry for the long read.
You may find the Annotated ARM version (AARM95 3.2) helpful (note, your references were to the unmaintained AdaHome web site; prefer http://www.adaic.org).
A subtype has a type and possibly constraints:
type T is ...;
subtype S1 is T; -- effectively a renaming of T
subtype S2 is T range ...; -- (added) constraints
and the operations of the subtype are those of its type, which is why you can write the comparison Day = Payday without conversions. We know that a Workday_Type is a Day_Type, so we can just do the comparison directly.
It is true that Workday_Type (Day) = Payday runs the risk of CE, but there is no need for you or the compiler to do that.
Just like Simon Wright has explained, operations are those of a type, whereas subtype forms subsets of values of a type, not a different type with different operations.
For illustration, consider "+" in this:
type N is range 0 .. 10;
X : constant N := N'Base'(-1) + N'(2);
A number of Ada rules apply, and while IANALL, I remember LLawyers emphasizing that the rules are made so that producing the mathematically (logically) correct result is possible without raising exceptions even when, seemingly, some operation involves values exceeding a range along the way. There's a hint in LRM 4.5. So, I'd expect relational_operator rules, "="'s in particular, to be supportive, too.

What is ADT? (Abstract Data Type)

I am currently studying about Abstract Data Types (ADT's) but I don't get the concept at all. Can someone please explain to me what this actually is? Also what is collection, bag, and List ADT? in simple terms?
Abstract Data Type(ADT) is a data type, where only behavior is defined but not implementation.
Opposite of ADT is Concrete Data Type (CDT), where it contains an implementation of ADT.
Examples:
Array, List, Map, Queue, Set, Stack, Table, Tree, and Vector are ADTs. Each of these ADTs has many implementations i.e. CDT. The container is a high-level ADT of above all ADTs.
Real life example:
book is Abstract (Telephone Book is an implementation)
The Abstact data type Wikipedia article has a lot to say.
In computer science, an abstract data type (ADT) is a mathematical model for a certain class of data structures that have similar behavior; or for certain data types of one or more programming languages that have similar semantics. An abstract data type is defined indirectly, only by the operations that may be performed on it and by mathematical constraints on the effects (and possibly cost) of those operations.
In slightly more concrete terms, you can take Java's List interface as an example. The interface doesn't explicitly define any behavior at all because there is no concrete List class. The interface only defines a set of methods that other classes (e.g. ArrayList and LinkedList) must implement in order to be considered a List.
A collection is another abstract data type. In the case of Java's Collection interface, it's even more abstract than List, since
The List interface places additional stipulations, beyond those specified in the Collection interface, on the contracts of the iterator, add, remove, equals, and hashCode methods.
A bag is also known as a multiset.
In mathematics, the notion of multiset (or bag) is a generalization of the notion of set in which members are allowed to appear more than once. For example, there is a unique set that contains the elements a and b and no others, but there are many multisets with this property, such as the multiset that contains two copies of a and one of b or the multiset that contains three copies of both a and b.
In Java, a Bag would be a collection that implements a very simple interface. You only need to be able to add items to a bag, check its size, and iterate over the items it contains. See Bag.java for an example implementation (from Sedgewick & Wayne's Algorithms 4th edition).
A truly abstract data type describes the properties of its instances without commitment to their representation or particular operations. For example the abstract (mathematical) type Integer is a discrete, unlimited, linearly ordered set of instances. A concrete type gives a specific representation for instances and implements a specific set of operations.
Notation of Abstract Data Type(ADT)
An abstract data type could be defined as a mathematical model with a
collection of operations defined on it. A simple example is the set of
integers together with the operations of union, intersection defined
on the set.
The ADT's are generalizations of primitive data type(integer, char
etc) and they encapsulate a data type in the sense that the definition
of the type and all operations on that type localized to one section
of the program. They are treated as a primitive data type outside the
section in which the ADT and its operations are defined.
An implementation of an ADT is the translation into statements of
a programming language of the declaration that defines a variable to
be of that ADT, plus a procedure in that language for each
operation of that ADT. The implementation of the ADT chooses a
data structure to represent the ADT.
A useful tool for specifying the logical properties of data type is
the abstract data type. Fundamentally, a data type is a collection of
values and a set of operations on those values. That collection and
those operations form a mathematical construct that may be implemented
using a particular hardware and software data structure. The term
"abstract data type" refers to the basic mathematical concept that defines the data type.
In defining an abstract data type as mathamatical concept, we are not
concerned with space or time efficinecy. Those are implementation
issue. Infact, the defination of ADT is not concerned with
implementaion detail at all. It may not even be possible to implement
a particular ADT on a particular piece of hardware or using a
particular software system. For example, we have already seen that an
ADT integer is not universally implementable.
To illustrate the concept of an ADT and my specification method,
consider the ADT RATIONAL which corresponds to the mathematical
concept of a rational number. A rational number is a number that can
be expressed as the quotient of two integers. The operations on
rational numbers that, we define are the creation of a rational number
from two integers, addition, multiplication and testing for equality.
The following is an initial specification of this ADT.
/* Value defination */
abstract typedef <integer, integer> RATIONAL;
condition RATIONAL [1]!=0;
/*Operator defination*/
abstract RATIONAL makerational (a,b)
int a,b;
preconditon b!=0;
postcondition makerational [0] =a;
makerational [1] =b;
abstract RATIONAL add [a,b]
RATIONAL a,b;
postcondition add[1] = = a[1] * b[1]
add[0] = a[0]*b[1]+b[0]*a[1]
abstract RATIONAL mult [a, b]
RATIONAL a,b;
postcondition mult[0] = = a[0]*b[a]
mult[1] = = a[1]*b[1]
abstract equal (a,b)
RATIONAL a,b;
postcondition equal = = |a[0] * b[1] = = b[0] * a[1];
An ADT consists of two parts:-
1) Value definition
2) Operation definition
1) Value Definition:-
The value definition defines the collection of values for the ADT and
consists of two parts:
1) Definition Clause
2) Condition Clause
For example, the value definition for the ADT RATIONAL states that
a RATIONAL value consists of two integers, the second of which does
not equal to 0.
The keyword abstract typedef introduces a value definitions and the
keyword condition is used to specify any conditions on the newly
defined data type. In this definition the condition specifies that the
denominator may not be 0. The definition clause is required, but the
condition may not be necessary for every ADT.
2) Operator Definition:-
Each operator is defined as an abstract junction with three parts.
1)Header
2)Optional Preconditions
3)Optional Postconditions
For example the operator definition of the ADT RATIONAL includes the
operations of creation (makerational), addition (add) and
multiplication (mult) as well as a test for equality (equal). Let us
consider the specification for multiplication first, since, it is the
simplest. It contains a header and post-conditions, but no
pre-conditions.
abstract RATIONAL mult [a,b]
RATIONAL a,b;
postcondition mult[0] = a[0]*b[0]
mult[1] = a[1]*b[1]
The header of this definition is the first two lines, which are just
like a C function header. The keyword abstract indicates that it is
not a C function but an ADT operator definition.
The post-condition specifies, what the operation does. In a
post-condition, the name of the function (in this case, mult) is used
to denote the result of an operation. Thus, mult [0] represents
numerator of result and mult 1 represents the denominator of the
result. That is it specifies, what conditions become true after the
operation is executed. In this example the post-condition specifies
that the neumerator of the result of a rational multiplication equals
integer product of numerators of the two inputs and the denominator
equals th einteger product of two denominators.
List
In computer science, a list or sequence is an abstract data type that
represents a countable number of ordered values, where the same value
may occur more than once. An instance of a list is a computer
representation of the mathematical concept of a finite sequence; the
(potentially) infinite analog of a list is a stream. Lists are a basic
example of containers, as they contain other values. If the same value
occurs multiple times, each occurrence is considered a distinct item
The name list is also used for several concrete data structures that
can be used to implement abstract lists, especially linked lists.
Image of a List
Bag
A bag is a collection of objects, where you can keep adding objects to
the bag, but you cannot remove them once added to the bag. So with a
bag data structure, you can collect all the objects, and then iterate
through them. You will bags normally when you program in Java.
Image of a Bag
Collection
A collection in the Java sense refers to any class that implements the
Collection interface. A collection in a generic sense is just a group
of objects.
Image of collections
Actually Abstract Data Types is:
Concepts or theoretical model that defines a data type logically
Specifies set of data and set of operations that can be performed on that data
Does not mention anything about how operations will be implemented
"Existing as an idea but not having a physical idea"
For example, lets see specifications of some Abstract Data Types,
List Abstract Data Type: initialize(), get(), insert(), remove(), etc.
Stack Abstract Data Type: push(), pop(), peek(), isEmpty(), isNull(), etc.
Queue Abstract Data Type: enqueue(), dequeue(), size(), peek(), etc.
One of the simplest explanation given on Brilliant's wiki:
Abstract data types, commonly abbreviated ADTs, are a way of
classifying data structures based on how they are used and the
behaviors they provide. They do not specify how the data structure
must be implemented or laid out in memory, but simply provide a
minimal expected interface and set of behaviors. For example, a stack
is an abstract data type that specifies a linear data structure with
LIFO (last in, first out) behavior. Stacks are commonly implemented
using arrays or linked lists, but a needlessly complicated
implementation using a binary search tree is still a valid
implementation. To be clear, it is incorrect to say that stacks are
arrays or vice versa. An array can be used as a stack. Likewise, a
stack can be implemented using an array.
Since abstract data types don't specify an implementation, this means
it's also incorrect to talk about the time complexity of a given
abstract data type. An associative array may or may not have O(1)
average search times. An associative array that is implemented by a
hash table does have O(1) average search times.
Example for ADT: List - can be implemented using Array and LinkedList, Queue, Deque, Stack, Associative array, Set.
https://brilliant.org/wiki/abstract-data-types/?subtopic=types-and-data-structures&chapter=abstract-data-types
ADT are a set of data values and associated operations that are precisely independent of any paticular implementaition. The strength of an ADT is implementaion is hidden from the user.only interface is declared .This means that the ADT is various ways
Abstract Data type is a mathematical module that includes data with various operations. Implementation details are hidden and that's why it is called abstract. Abstraction allowed you to organise the complexity of the task by focusing on logical properties of data and actions.
In programming languages, a type is some data and the associated operations. An ADT is a user defined data aggregate and the operations over these data and is characterized by encapsulation, the data and operations are represented, or at list declared, in a single syntactic unit, and information hiding, only the relevant operations are visible for the user of the ADT, the ADT interface, in the same way that a normal data type in the programming language. It's an abstraction because the internal representation of the data and implementation of the operations are of no concern to the ADT user.
Before defining abstract data types, let us considers the different
view of system-defined data types. We all know that by default all
primitive data types (int, float, etc.) support basic operations such
as addition and subtraction. The system provides the implementations
for the primitive data types. For user-defined data types, we also
need to define operations. The implementation for these operations can
be done when we want to actually use them. That means in general,
user-defined data types are defined along with their operations.
To simplify the process of solving problems, we combine the data
structures with their operations and we call this "Abstract Data
Type". (ADT's).
Commonly used ADT'S include: Linked List, Stacks, Queues, Binary Tree,
Dictionaries, Disjoint Sets (Union and find), Hash Tables and many
others.
ADT's consist of two types:
1. Declaration of data.
2. Declaration of operation.
Simply Abstract Data Type is nothing but a set of operation and set of data is used for storing some other data efficiently in the machine.
There is no need of any perticular type declaration.
It just require a implementation of ADT.
To solve problems we combine the data structure with their operations. An ADT consists of two parts:
Declaration of Data.
Declaration of Operation.
Commonly used ADT's are Linked Lists, Stacks, Queues, Priority Queues, Trees etc. While defining ADTs we don't need to worry about implementation detals. They come into picture only when we want to use them.
Abstract data type are like user defined data type on which we can perform functions without knowing what is there inside the datatype and how the operations are performed on them . As the information is not exposed its abstracted. eg. List,Array, Stack, Queue. On Stack we can perform functions like Push, Pop but we are not sure how its being implemented behind the curtains.
ADT is a set of objects and operations, no where in an ADT’s definitions is there any mention of how the set of operations is implemented. Programmers who use collections only need to know how to instantiate and access data in some pre-determined manner, without concerns for the details of the collections implementations. In other words, from a user’s perspective, a collection is an abstraction, and for this reason, in computer science, some collections are referred to as abstract data types (ADTs). The user is only concern with learning its interface, or the set of operations its performs...more
in a simple word: An abstract data type is a collection of data and operations that work on that data. The operations both describe the data to the rest of the program and allow the rest of the program to change the data. The word “data” in “abstract data type” is used loosely. An ADT might be a graphics window with all the operations that affect it, a file and file operations, an insurance-rates table and the operations on it, or something else.
from code complete 2 book
Abstract data type is the collection of values and any kind of operation on these values. For example, since String is not a primitive data type, we can include it in abstract data types.
ADT is a data type in which collection of data and operation works on that data . It focuses on more the concept than implementation..
It's up to you which language you use to make it visible on the earth
Example:
Stack is an ADT while the Array is not
Stack is ADT because we can implement it by many languages,
Python c c++ java and many more , while Array is built in data type
An abstract data type, sometimes abbreviated ADT, is a logical description of how we view the data and the operations that are allowed without regard to how they will be implemented. This means that we are concerned only with what the data is representing and not with how it will eventually be constructed.
https://runestone.academy/runestone/books/published/pythonds/Introduction/WhyStudyDataStructuresandAbstractDataTypes.html
Abstractions give you only information(service information) but not implementation.
For eg: When you go to withdraw money from an ATM machine, you just know one thing i.e put your ATM card to the machine, click the withdraw option, enter the amount and your money is out if there is money.
This is only what you know about ATM machines. But do you know how you are receiving money?? What business logic is going on behind? Which database is being called? Which server at which location is being invoked?? No, you only know is service information i.e you can withdraw money. This is an abstraction.
Similarly, ADT gives you an overview of data types: what they are / can be stored and what operations you can perform on those data types. But it doesn’t provide how to implement that. This is ADT. It only defines the logical form of your data types.
Another analogy is :
In a car or bike, you only know when you press the brake your vehicle will stop. But do you know how the bike stops when you press the brake??? No, means implementation detail is being hidden. You only know what brake does when you press but don’t know how it does.
An abstract data type(ADT) is an abstraction of a data structure that provides only the interface to which the data structure must adhere. The interface does not give any specific details about how something should be implemented or in what programming language.
The term data type is as the type of data which a particular variable can hold - it may be an integer, a character, a float, or any range of simple data storage representation. However, when we build an object oriented system, we use other data types, known as abstract data type, which represents more realistic entities.
E.g.: We might be interested in representing a 'bank account' data type, which describe how all bank account are handled in a program. Abstraction is about reducing complexity, ignoring unnecessary details.

What exactly is a datatype?

I understand what a datatype is (intuitively). But I need the formal definition. I don't understand if it is a set or it's the names 'int' 'float' etc. The formal definition found on wikipedia is confusing.
In computer programming, a data type is a classification identifying one of various types of data, such as floating-point, integer, or Boolean, that determines the possible values for that type; the operations that can be done on values of that type; the meaning of the data; and the way values of that type can be stored.
Can anyone help me with that?
Yep. What that's saying is that a data type has three pieces:
The various possible values. So, for example, an eight bit signed integer might have -127..128. This of that as a set of values V.
The operations: so an 8-bit signed integer might have +, -, * (multiply), and / (divide). The full definition would define those as functions from V into V, or possible as a function from V into float for division.
The way it's stored -- I sort of gave it away when I said "eight bit signed integer". The other detail is that I'm assuming a specific representation by the way I showed the range of values.
You might, if you're into object oriented programming, notice that this is very much like the definition of a class, which is defined by the storage used by each object, adn the methods of the class. Providing those parts for some arbitrary thing, but not inheritance rules, gives you what's called an abstract data type.
Update
#Appy, there's some room for differences in the formalities. I was a little subtle because it was late and I was suddenly uncertain if I'd assumed one's complement or two's complement -- of course it's two's complement. So interpretation is included in my description. Abstractly, though, you'd say it is a algebraic structure T=(V,O) where V is a set of values, O a set of functions from V into some arbitrary type -- remember '==' for example will be a function eq:V × V → {0,1} so you can't expect every operation to be into V.
I can define it as a classification of a particular type of information. It is easy for humans to distinguish between different types of data. We can usually tell at a glance whether a number is a percentage, a time, or an amount of money. We do this through special symbols %, :, and $.
Basically it's the concept that I am sure you grock. For computers however a data type is defined and has various associated attributes, like size, like a definition keywork (sometimes), the values it can take (numbers or characters for example) and operations that can be done on it like add subtract for numbers and append on string or compare on a character, etc. These differ from language to language and even from environment to env. (16 - 32 bit ints/ 32 - 64 envs./ etc).
If there is anything I am missing or needs refining please ask as this is fairly open ended.

What is a 'value' in the context of programming?

Can you suggest a precise definition for a 'value' within the context of programming without reference to specific encoding techniques or particular languages or architectures?
[Previous question text, for discussion reference: "What is value in programming? How to define this word precisely?"]
I just happened to be glancing through Pierce's "Types and Programming Languages" - he slips a reasonably precise definition of "value" in a programming context into the text:
[...] defines a subset of terms, called values, that are possible final results of evaluation
This seems like a rather tidy definition - i.e., we take the set of all possible terms, and the ones that can possibly be left over after all evaluation has taken place are values.
Based on the ongoing comments about "bits" being an unacceptable definition, I think this one is a little better (although possibly still flawed):
A value is anything representable on a piece of possibly-infinite Turing machine tape.
Edit: I'm refining this some more.
A value is a member of the set of possible interpretations of any possibly-infinite sequence of symbols.
That is equivalent to the earlier definition based on Turing machine tape, but it actually generalises better.
Here, I'll take a shot: A value is a piece of stored information (in the information-theoretical sense) that can be manipulated by the computer.
(I won't say that a value has meaning; a random number in a register may have no meaning, but it's still a value.)
In short, a value is some assigned meaning to a variable (the object containing the value)
For example type=boolean; name=help; variable=a storage location; value=what is stored in that location;
Further break down:
X = 2; where X is a variable while 2 is the value stored in X.
Have you checked the article in wikipedia?
In computer science, a value is a sequence of bits that is interpreted according to some data type. It is possible for the same sequence of bits to have different values, depending on the type used to interpret its meaning. For instance, the value could be an integer or floating point value, or a string.
Read the Wiki
Value = Value is what we call the "contents" that was stored in the variable
Variables = containers for storing data values
Example: Think of a folder named "Movies"(Variables) and inside of it are it contents which are namely; Pirates of the Carribean, Fantastic Beast, and Lala land, (this in turn is what we now call it's Values )

What is boxing and unboxing and what are the trade offs?

I'm looking for a clear, concise and accurate answer.
Ideally as the actual answer, although links to good explanations welcome.
Boxed values are data structures that are minimal wrappers around primitive types*. Boxed values are typically stored as pointers to objects on the heap.
Thus, boxed values use more memory and take at minimum two memory lookups to access: once to get the pointer, and another to follow that pointer to the primitive. Obviously this isn't the kind of thing you want in your inner loops. On the other hand, boxed values typically play better with other types in the system. Since they are first-class data structures in the language, they have the expected metadata and structure that other data structures have.
In Java and Haskell generic collections can't contain unboxed values. Generic collections in .NET can hold unboxed values with no penalties. Where Java's generics are only used for compile-time type checking, .NET will generate specific classes for each generic type instantiated at run time.
Java and Haskell have unboxed arrays, but they're distinctly less convenient than the other collections. However, when peak performance is needed it's worth a little inconvenience to avoid the overhead of boxing and unboxing.
* For this discussion, a primitive value is any that can be stored on the call stack, rather than stored as a pointer to a value on the heap. Frequently that's just the machine types (ints, floats, etc), structs, and sometimes static sized arrays. .NET-land calls them value types (as opposed to reference types). Java folks call them primitive types. Haskellions just call them unboxed.
** I'm also focusing on Java, Haskell, and C# in this answer, because that's what I know. For what it's worth, Python, Ruby, and Javascript all have exclusively boxed values. This is also known as the "Everything is an object" approach***.
*** Caveat: A sufficiently advanced compiler / JIT can in some cases actually detect that a value which is semantically boxed when looking at the source, can safely be an unboxed value at runtime. In essence, thanks to brilliant language implementors your boxes are sometimes free.
from C# 3.0 In a Nutshell:
Boxing is the act of casting a value
type into a reference type:
int x = 9;
object o = x; // boxing the int
unboxing is... the reverse:
// unboxing o
object o = 9;
int x = (int)o;
Boxing & unboxing is the process of converting a primitive value into an object oriented wrapper class (boxing), or converting a value from an object oriented wrapper class back to the primitive value (unboxing).
For example, in java, you may need to convert an int value into an Integer (boxing) if you want to store it in a Collection because primitives can't be stored in a Collection, only objects. But when you want to get it back out of the Collection you may want to get the value as an int and not an Integer so you would unbox it.
Boxing and unboxing is not inherently bad, but it is a tradeoff. Depending on the language implementation, it can be slower and more memory intensive than just using primitives. However, it may also allow you to use higher level data structures and achieve greater flexibility in your code.
These days, it is most commonly discussed in the context of Java's (and other language's) "autoboxing/autounboxing" feature. Here is a java centric explanation of autoboxing.
In .Net:
Often you can't rely on what the type of variable a function will consume, so you need to use an object variable which extends from the lowest common denominator - in .Net this is object.
However object is a class and stores its contents as a reference.
List<int> notBoxed = new List<int> { 1, 2, 3 };
int i = notBoxed[1]; // this is the actual value
List<object> boxed = new List<object> { 1, 2, 3 };
int j = (int) boxed[1]; // this is an object that can be 'unboxed' to an int
While both these hold the same information the second list is larger and slower. Each value in the second list is actually a reference to an object that holds the int.
This is called boxed because the int is wrapped by the object. When its cast back the int is unboxed - converted back to it's value.
For value types (i.e. all structs) this is slow, and potentially uses a lot more space.
For reference types (i.e. all classes) this is far less of a problem, as they are stored as a reference anyway.
A further problem with a boxed value type is that it's not obvious that you're dealing with the box, rather than the value. When you compare two structs then you're comparing values, but when you compare two classes then (by default) you're comparing the reference - i.e. are these the same instance?
This can be confusing when dealing with boxed value types:
int a = 7;
int b = 7;
if(a == b) // Evaluates to true, because a and b have the same value
object c = (object) 7;
object d = (object) 7;
if(c == d) // Evaluates to false, because c and d are different instances
It's easy to work around:
if(c.Equals(d)) // Evaluates to true because it calls the underlying int's equals
if(((int) c) == ((int) d)) // Evaluates to true once the values are cast
However it is another thing to be careful of when dealing with boxed values.
Boxing is the process of conversion of a value type into a reference type. Whereas Unboxing is the conversion of a reference type into a value type.
EX: int i = 123;
object o = i;// Boxing
int j = (int)o;// UnBoxing
Value Types are: int, char and structures, enumerations.
Reference Types are:
Classes,interfaces,arrays,strings and objects
The language-agnostic meaning of a box is just "an object contains some other value".
Literally, boxing is an operation to put some value into the box. More specifically, it is an operation to create a new box containing the value. After boxing, the boxed value can be accessed from the box object, by unboxing.
Note that objects (not OOP-specific) in many programming languages are about identities, but values are not. Two objects are same iff. they have identities not distinguishable in the program semantics. Values can also be the same (usually under some equality operators), but we do not distinguish them as "one" or "two" unique values.
Providing boxes is mainly about the effort to distinguish side effects (typically, mutation) from the states on the objects otherwise probably invisible to the users.
A language may limit the allowed ways to access an object and hide the identity of the object by default. For example, typical Lisp dialects has no explicit distinctions between objects and values. As a result, the implementation has the freedom to share the underlying storage of the objects until some mutation operations occurs on the object (so the object must be "detached" after the operation from the shared instance to make the effect visible, i.e. the mutated value stored in the object could be different than the other objects having the old value). This technique is sometimes called object interning.
Interning makes the program more memory efficient at runtime if the objects are shared without frequent needs of mutation, at the cost that:
The users cannot distinguish the identity of the objects.
There are no way to identify an object and to ensure it has states explicitly independent to other objects in the program before some side effects have actually occur (and the implementation does not aggressively to do the interning concurrently; this should be the rare case, though).
There may be more problems on interoperations which require to identify different objects for different operations.
There are risks that such assumptions can be false, so the performance is actually made worse by applying the interning.
This depends on the programming paradigm. Imperative programming which mutates objects frequently certainly would not work well with interning.
Implementations depending on COW (copy-on-write) to ensure interning can incur serious performance degradation in concurrent environments.
Even local sharing specifically for a few internal data structures can be bad. For example, ISO C++ 11 did not allow sharing of the internal elements of std::basic_string for this reason exactly, even at the cost of breaking the ABI on at least one mainstream implementation (libstdc++).
Boxing and unboxing incur performance penalties. This is obvious especially when these operations can be naively avoided by hand but actually not easy for the optimizer. The concrete measurement of the cost depends (on per-implementation or even per-program basis), though.
Mutable cells, i.e. boxes, are well-established facilities exactly to resolve the problems of the 1st and 2nd bullets listed above. Additionally, there can be immutable boxes for implementation of assignment in a functional language. See SRFI-111 for a practical instance.
Using mutable cells as function arguments with call-by-value strategy implements the visible effects of mutation being shared between the caller and the callee. The object contained by an box is effectively "called by shared" in this sense.
Sometimes, the boxes are referred as references (which is technically false), so the shared semantics are named "reference semantics". This is not correct, because not all references can propagate the visible side effects (e.g. immutable references). References are more about exposing the access by indirection, while boxes are the efforts to expose minimal details of the accesses like whether indirection or not (which is uninterested and better avoided by the implementation).
Moreover, "value semantic" is irrelevant here. Values are not against to references, nor to boxes. All the discussions above are based on call-by-value strategy. For others (like call-by-name or call-by-need), no boxes are needed to shared object contents in this way.
Java is probably the first programming language to make these features popular in the industry. Unfortunately, there seem many bad consequences concerned in this topic:
The overall programming paradigm does not fit the design.
Practically, the interning are limited to specific objects like immutable strings, and the cost of (auto-)boxing and unboxing are often blamed.
Fundamental PL knowledge like the definition of the term "object" (as "instance of a class") in the language specification, as well as the descriptions of parameter passing, are biased compared to the the original, well-known meaning, during the adoption of Java by programmers.
At least CLR languages are following the similar parlance.
Some more tips on implementations (and comments to this answer):
Whether to put the objects on the call stacks or the heap is an implementation details, and irrelevant to the implementation of boxes.
Some language implementations do not maintain a contiguous storage as the call stack.
Some language implementations do not even make the (per thread) activation records a linear stack.
Some language implementations do allocate stacks on the free store ("the heap") and transfer slices of frames between the stacks and the heap back and forth.
These strategies has nothing to do boxes. For instance, many Scheme implementations have boxes, with different activation records layouts, including all the ways listed above.
Besides the technical inaccuracy, the statement "everything is an object" is irrelevant to boxing.
Python, Ruby, and JavaScript all use latent typing (by default), so all identifiers referring to some objects will evaluate to values having the same static type. So does Scheme.
Some JavaScript and Ruby implementations use the so-called NaN-boxing to allow inlining allocation of some objects. Some others (including CPython) do not. With NaN boxing, a normal double object needs no unboxing to access its value, while a value of some other types can be boxed in a host double object, and there is no reference for double or the boxed value. With the naive pointer approach, a value of host object pointer like PyObject* is an object reference holding a box whose boxed value is stored in the dynamically allocated space.
At least in Python, objects are not "everything". They are also not known as "boxed values" unless you are talking about interoperability with specific implementations.
The .NET FCL generic collections:
List<T>
Dictionary<TKey, UValue>
SortedDictionary<TKey, UValue>
Stack<T>
Queue<T>
LinkedList<T>
were all designed to overcome the performance issues of boxing and unboxing in previous collection implementations.
For more, see chapter 16, CLR via C# (2nd Edition).
Boxing and unboxing facilitates value types to be treated as objects. Boxing means converting a value to an instance of the object reference type. For example, Int is a class and int is a data type. Converting int to Int is an exemplification of boxing, whereas converting Int to int is unboxing. The concept helps in garbage collection, Unboxing, on the other hand, converts object type to value type.
int i=123;
object o=(object)i; //Boxing
o=123;
i=(int)o; //Unboxing.
Like anything else, autoboxing can be problematic if not used carefully. The classic is to end up with a NullPointerException and not be able to track it down. Even with a debugger. Try this:
public class TestAutoboxNPE
{
public static void main(String[] args)
{
Integer i = null;
// .. do some other stuff and forget to initialise i
i = addOne(i); // Whoa! NPE!
}
public static int addOne(int i)
{
return i + 1;
}
}