Value receiver vs. pointer receiver - function

It is very unclear for me in which case I would want to use a value receiver instead of always using a pointer receiver.
To recap from the docs:
type T struct {
a int
}
func (tv T) Mv(a int) int { return 0 } // value receiver
func (tp *T) Mp(f float32) float32 { return 1 } // pointer receiver
The docs also say "For types such as basic types, slices, and small structs, a value receiver is very cheap so unless the semantics of the method requires a pointer, a value receiver is efficient and clear."
First point they docs say a value receiver is "very cheap", but the question is whether it is cheaper than a pointer receiver. So I made a small benchmark (code on gist) which showed me, that pointer receiver is faster even for a struct that has only one string field. These are the results:
// Struct one empty string property
BenchmarkChangePointerReceiver 2000000000 0.36 ns/op
BenchmarkChangeItValueReceiver 500000000 3.62 ns/op
// Struct one zero int property
BenchmarkChangePointerReceiver 2000000000 0.36 ns/op
BenchmarkChangeItValueReceiver 2000000000 0.36 ns/op
(Edit: Please note that second point became invalid in newer go versions, see comments.)
Second point the docs say that a value receiver it is "efficient and clear" which is more a matter of taste, isn't it? Personally I prefer consistency by using the same thing everywhere. Efficiency in what sense? Performance wise it seems pointer are almost always more efficient. Few test-runs with one int property showed minimal advantage of Value receiver (range of 0.01-0.1 ns/op)
Can someone tell me a case where a value receiver clearly makes more sense than a pointer receiver? Or am I doing something wrong in the benchmark? Did I overlook other factors?

Note that the FAQ does mention consistency
Next is consistency. If some of the methods of the type must have pointer receivers, the rest should too, so the method set is consistent regardless of how the type is used. See the section on method sets for details.
As mentioned in this thread:
The rule about pointers vs. values for receivers is that value methods can
be invoked on pointers and values, but pointer methods can only be invoked
on pointers
Which is not true, as commented by Sart Simha
Both value receiver and pointer receiver methods can be invoked on a correctly-typed pointer or non-pointer.
Regardless of what the method is called on, within the method body the identifier of the receiver refers to a by-copy value when a value receiver is used, and a pointer when a pointer receiver is used: example.
Now:
Can someone tell me a case where a value receiver clearly makes more sense then a pointer receiver?
The Code Review comment can help:
If the receiver is a map, func or chan, don't use a pointer to it.
If the receiver is a slice and the method doesn't reslice or reallocate the slice, don't use a pointer to it.
If the method needs to mutate the receiver, the receiver must be a pointer.
If the receiver is a struct that contains a sync.Mutex or similar synchronizing field, the receiver must be a pointer to avoid copying.
If the receiver is a large struct or array, a pointer receiver is more efficient. How large is large? Assume it's equivalent to passing all its elements as arguments to the method. If that feels too large, it's also too large for the receiver.
Can function or methods, either concurrently or when called from this method, be mutating the receiver? A value type creates a copy of the receiver when the method is invoked, so outside updates will not be applied to this receiver. If changes must be visible in the original receiver, the receiver must be a pointer.
If the receiver is a struct, array or slice and any of its elements is a pointer to something that might be mutating, prefer a pointer receiver, as it will make the intention more clear to the reader.
If the receiver is a small array or struct that is naturally a value type (for instance, something like the time.Time type), with no mutable fields and no pointers, or is just a simple basic type such as int or string, a value receiver makes sense.
A value receiver can reduce the amount of garbage that can be generated; if a value is passed to a value method, an on-stack copy can be used instead of allocating on the heap. (The compiler tries to be smart about avoiding this allocation, but it can't always succeed.) Don't choose a value receiver type for this reason without profiling first.
Finally, when in doubt, use a pointer receiver.
The part in bold is found for instance in net/http/server.go#Write():
// Write writes the headers described in h to w.
//
// This method has a value receiver, despite the somewhat large size
// of h, because it prevents an allocation. The escape analysis isn't
// smart enough to realize this function doesn't mutate h.
func (h extraHeader) Write(w *bufio.Writer) {
...
}
Note: irbull points out in the comments a warning about interface methods:
Following the advice that the receiver type should be consistent, if you have a pointer receiver, then your (p *type) String() string method should also use a pointer receiver.
But this does not implement the Stringer interface, unless the caller of your API also uses pointer to your type, which might be a usability problem of your API.
I don't know if consistency beats usability here.
points out to:
"Method Sets (Pointer vs Value Receiver)"
you can mix and match methods with value receivers and methods with pointer receivers, and use them with variables containing values and pointers, without worrying about which is which.
Both will work, and the syntax is the same.
However, if methods with pointer receivers are needed to satisfy an interface, then only a pointer will be assignable to the interface — a value won't be valid.
"Go interfaces and automatically generated functions" from Chris Siebenmann (June 2017)
Calling value receiver methods through interfaces always creates extra copies of your values.
Interface values are fundamentally pointers, while your value receiver methods require values; ergo every call requires Go to create a new copy of the value, call your method with it, and then throw the value away.
There is no way to avoid this as long as you use value receiver methods and call them through interface values; it's a fundamental requirement of Go.
"Learning about Go's unaddressable values and slicing" (still from Chris (Sept. 2018))
Concept of unaddressable values, which are the opposite of addressable values. The careful technical version is in the Go specification in Address operators, but the hand waving summary version is that most anonymous values are not addressable (one big exception is composite literals)

To add additionally to #VonC great, informative answer.
I am surprised no one really mentioned the maintainance cost once the project gets larger, old devs leave and new one comes. Go surely is a young language.
Generally speaking, I try to avoid pointers when I can but they do have their place and beauty.
I use pointers when:
working with large datasets
have a struct maintaining state, e.g. TokenCache,
I make sure ALL fields are PRIVATE, interaction is possible only via defined method receivers
I don't pass this function to any goroutine
E.g:
type TokenCache struct {
cache map[string]map[string]bool
}
func (c *TokenCache) Add(contract string, token string, authorized bool) {
tokens := c.cache[contract]
if tokens == nil {
tokens = make(map[string]bool)
}
tokens[token] = authorized
c.cache[contract] = tokens
}
Reasons why I avoid pointers:
pointers are not concurrently safe (the whole point of GoLang)
once pointer receiver, always pointer receiver (for all Struct's methods for consistency)
mutexes are surely more expensive, slower and harder to maintain comparing to the "value copy cost"
speaking of "value copy cost", is that really an issue? Premature optimization is root to all evil, you can always add pointers later
it directly, conciously forces me to design small Structs
pointers can be mostly avoided by designing pure functions with clear intention and obvious I/O
garbage collection is harder with pointers I believe
easier to argue about encapsulation, responsibilities
keep it simple, stupid (yes, pointers can be tricky because you never know the next project's dev)
unit testing is like walking through pink garden (slovak only expression?), means easy
no NIL if conditions (NIL can be passed where a pointer was expected)
My rule of thumb, write as many encapsulated methods as possible such as:
package rsa
// EncryptPKCS1v15 encrypts the given message with RSA and the padding scheme from PKCS#1 v1.5.
func EncryptPKCS1v15(rand io.Reader, pub *PublicKey, msg []byte) ([]byte, error) {
return []byte("secret text"), nil
}
cipherText, err := rsa.EncryptPKCS1v15(rand, pub, keyBlock)
UPDATE:
This question inspired me to research the topic more and write a blog post about it https://medium.com/gophersland/gopher-vs-object-oriented-golang-4fa62b88c701

It is a question of semantics. Imagine you write a function taking two numbers as arguments. You don't want to suddenly find out that either of these numbers got mutated by the calling function. If you pass them as pointers that is possible. Lots of things should act just like numbers. Things like points, 2D vectors, dates, rectangles, circles etc. These things don't have identity. Two circle at the same position and with the same radius should not be distinguished from each other. They are value types.
But something like a database connection or a file handle, a button in the GUI is something where identity matters. In these cases you want a pointer to the object.
When something is inherently a value type such as a rectangle or point, it is really preferable to be able to pass them without using pointers. Why? Because it means you are certain to avoid mutating the object. It clarifies semantics and intent to reader of your code. It is clear that the function receiving the object cannot and will not mutate the object.

Related

What is the memory allocation when you return a function in Golang?

Here is a simplified code
func MyHandler(a int) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteCode(a)
})
}
Whenever a http request comes MyHandler will be called, and it will return a function which will be used to handle the request. So whenever a http request comes a new function object will be created. Function is taken as the first class in Go. I'm trying to understand what actually happened when you return a function from memory's perspective. When you return a value, for example an integer, it will occupy 4 bytes in stack. So how about return a function and lots of things inside the function body? Is it an efficient way to do so? What's shortcomings?
If you're not used to closures, they may seem a bit magic. They are, however, easy to implement in compilers:
The compiler finds any variables that must be captured by the closure. It puts them into a working area that will be allocated and remain allocated as long as the closure itself exists.
The compiler then generates the inner function with a secret extra parameter, or some other runtime trickery,1 such that calling the function activates the closure.
Because the returned function accesses its closure variables through the compile-time arrangement, there's nothing special needed. And since Go is a garbage-collected language, there's nothing else needed either: the pointer to the closure keeps the closure data alive until the pointer is gone because the function cannot be called any more, at which point the closure data evaporates (well, at the next GC).
1GCC sometimes uses trampolines to do this for C, where trampolines are executable code generated at runtime. The executable code may set a register or pass an extra parameter or some such. This can be expensive since something treated as data at runtime (generated code) must be turned into executable code at runtime (possibly requiring a system call and potentially requiring that some supervisory code "vet" the resulting runtime code).
Go does not need any of this because the language was defined with closures in mind, so implementors don't, er, "close off" any easy ways to make this all work. Some runtime ABIs are defined with closures in mind as well, e.g., register r1 is reserved as the closure-variables pointer in all pointer-to-function types, or some such.
Actual function size is irrelevant. When you return a function like this, memory will be allocated for the closure, that is, any variables in the scope that the function uses. In this case, a pointer will be returned containing the address of the function and a pointer to the closure, which will contain a reference to the variable a.

Term for function parameters defined for generic use, not specific

Is there a term for this technique? One prominent example is the WinAPI: SendMessage( hwnd, msg, info1, info2 ) where parameters #3 and #4 only make sense per msg (which also means there are cases when only one or none of those two parameters are needed). See MSDN.
Rephrased: having an all-purpose function that always accepts multiple arguments, but interpreting them depends on a previous argument. I don't want to talk about open arrays, open arguments, typeless arguments... I know all that. That's not what I'm asking - I want to have the term for this type of functions (any maybe also how unspecific parameters are called).
This is not about casts or passing by reference - the parameter types are always the same. Other example: calculate( char operation, int a, int b ) which is then used as
calculate( '+', 2, 5 ) (parameters #2 and #3 are summands)
calculate( '/', 4, 2 ) (parameter #2 is the divident and parameter #3 is the divisor)
calculate( '!', 3, 0 ) (parameter #2 is the factorial and parameter #3 is unused)
In all these cases the data type is always the same and never casted. But the meaning of parameters #2 and #3 differ per parameter #1. And since this is the case it is difficult to give those parameters a meaningful name. Of course the function itself most likely uses a switch(), but that is not subject to my question. How are parameters #2 and #3 called, where a distinct name cannot be found, but data types are always the same?
The fact the msg argument "changes" the parameters is through a simple switch statement. Each "msg" in the switch knows the parameters(with type) needed and casts them appropriately.
This "technique" is called passing by reference, or passing by address. The latter is usually used for method pointers.
There is no Special name if that is what you are asking. It is a regular function, method or procedure.
The referenced Function is a Win32 API Function, which may be referred to as a "Windows function call."
This is an example of a static Parameter and multiple Dynamic parameters.
The static is the "msg" and the dynamic is described as the following:
These parameters are generic pointers. Passed by reference. They can point to any data type or no value, ie null pointer. It is up to the sender to lock the memory in place, and the receiving method to interpret the pointer correctly (through pointer casts).
This is an example of typeless argument passing. The only thing passed is a memory address. It is dangerous since the types passed must be agreed upon ahead of time(by convention and not contract as with a typed language construct) and must match on both sides of the call.
This was common before C++, in the C days, we only had C structs to pass around. Leading to many General Fault Protection errors. Since then, typed interfaces mostly have replaced the generic equivalents through libraries. But the underlying Win32 methods remain the same. The main substantial change since its' inception is the acceptance of 64-bit pointers.
Although not widely supported, what you are referring to would be a dependently typed function (or dependently typed parameters).
To quote wikipedia on dependent types
A "pair of integers" is a type. A "pair of integers where the second is greater than the first" is a dependent type because of the dependence on the value.
The parameters could have a type that depends on a value. The type of info1 depends on the value msg as does info2.
In order to make this approach work in a language without dependent types, the dependent parameters are given a very generic type that is only refined later on when more information is available. When the type of msg becomes known (at runtime) only then is are the types of info1 and info2 assumed. Even though the language doesn't allow you to express this dependency, I would still call the approach a dependently type one.

Golang Function Types with a binding to a struct?

This might seem like a silly question, but I want to make a struct with a collection of functions, but the functions bind to the struct. I can sorta see that this is a cycle, but humor me with this example:
type FuncType func() error
type FuncSet struct {
TokenVariable int
FuncTyper FuncType
}
and I want to be able to create a function bound to the FuncSet type so it can operate on TokenVariable, thusly:
func (f *FuncSet) FuncType() error {
f.TokenVariable = 100
return nil
}
However, this changes the signature of the type (I can't find any information about type bindings as part of function type specifications) such that assigning this function to the struct element tells me this function/variable is not found.
I can see an easy work-around for this, by prefixing the parameters with a pointer to the struct type, it's just a bit ugly.
I looked around a little further and discovered that what I'm kinda looking for is like a closure in that it can be passed a variable from the immediate outer scope but... well, I'll be glad to be corrected about this absence of type binding in function types, but for now passing the pointer to the type looks like the way to go.
I think I found the solution:
type nullTester func(*Bast, uint32) bool
type Bast struct {
...
isNull nullTester
...
}
func isNull(b *Bast, d uint32) bool {
return d == 0
}
and then I can bind it to the type like this:
func NewBast() (b *Bast) {
...
b.isNull = isNull
...
}
// IsNull - tests if a value in the tree is null
func (b *Bast) IsNull(d uint32) bool {
return b.isNull(b, d)
}
It seems a bit hackish and I'm not sure what's going to happen in a second library that I will write that sets a different type for the uint32 parameter, but go vet is happy so maybe this is the correct way to do it.
It does seem to me that func types should really have a field in the grammar to specify a binding type, but maybe I just found a hack that sorta lets me do polymorphism. In calling programs all they will see is the nice exported function that binds to the type as planned and I get my readability as well as being able to retarget the base library to store a different type of data.
I think this is the proper solution. I just can't find anything that confirms or denies whether in a type Name func specification there is any way of asserting the type. It really should not match up, since the binding is part of the signature, but the syntax for type with functions does not appear to have this type binding.
My actual code is here, and you can see by looking at it what I am aiming to do:
https://github.com/calibrae-project/bast/blob/master/pkg/bast/bast.go
The differences between the type of data the tree stores is entirely superficial, because it is intended to be primarily used for sorting unsigned integers of various lengths, and one important thing it needs to have is to be able to work from a, for example, 64 bit integer but sort only by the first or last half (as I have a bigger project that treats these hash values as coordinates in an adjacency list). In theory it could be used instead of a hash table lookup as well, with a low variance in time to find elements because of the binary tree structure.
It's not a conventional, reference-vector based tree, and the store itself is an array with an unconventional power of two mapping, a 'dense' tree, and the purpose above all, for implementing this way, is that when the tree is walked, as well as rotated, much of the time it is sequential blocks of memory being accessed which should make for a lot less cache misses than a conventional binary tree (and for which reason generally this type of application just uses some kind of sort like a bucket sort).
You could use an anonymous field with an interface that defines the method set that you want to use (that might change).
Go playground here
You'd define your interface
type validator interface {
IsRightOf(a, b interface{}) bool
... // other methods
}
and your type:
type Bast struct {
validator // anonymous interface field
... // some fields
}
Then you can access the methods of validator from the Bast type
b := bast.New()
b.IsRightOf(c, d) // this is valid, you do not need to do b.validator.IsRightOf(...)
because validator is an interface you can change those methods how you like.

Is currying just a way to avoid inheritance?

So my understanding of currying (based on SO questions) is that it lets you partially set parameters of a function and return a "truncated" function as a result.
If you have a big hairy function takes 10 parameters and looks like
function (location, type, gender, jumpShot%, SSN, vegetarian, salary) {
//weird stuff
}
and you want a "subset" function that will let you deal with presets for all but the jumpShot%, shouldn't you just break out a class that inherits from the original function?
I suppose what I'm looking for is a use case for this pattern. Thanks!
Currying has many uses. From simply specifying default parameters for functions you use often to returning specialized functions that serve a specific purpose.
But let me give you this example:
function log_message(log_level, message){}
log_error = curry(log_message, ERROR)
log_warning = curry(log_message, WARNING)
log_message(WARNING, 'This would be a warning')
log_warning('This would also be a warning')
In javascript I do currying on callback functions (because they cannot be passed any parameters after they are called (from the caller)
So something like:
...
var test = "something specifically set in this function";
onSuccess: this.returnCallback.curry(test).bind(this),
// This will fail (because this would pass the var and possibly change it should the function be run elsewhere
onSuccess: this.returnCallback.bind(this,test),
...
// this has 2 params, but in the ajax callback, only the 'ajaxResponse' is passed, so I have to use curry
returnCallback: function(thePassedVar, ajaxResponse){
// now in here i can have 'thePassedVar', if
}
I'm not sure if that was detailed or coherent enough... but currying basically lets you 'prefill' the parameters and return a bare function call that already has data filled (instead of requiring you to fill that info at some other point)
When programming in a functional style, you often bind arguments to generate new functions (in this example, predicates) from old. Pseudo-code:
filter(bind_second(greater_than, 5), some_list)
might be equivalent to:
filter({x : x > 5}, some_list)
where {x : x > 5} is an anonymous function definition. That is, it constructs a list of all values from some_list which are greater than 5.
In many cases, the parameters to be omitted will not be known at compile time, but rather at run time. Further, there's no limit to the number of curried delegates that may exist for a given function. The following is adapted from a real-world program.
I have a system in which I send out command packets to a remote machine and receive back response packets. Every command packet has an index number, and each reply bears the index number of the command to which it is a response. A typical command, translated into English, might be "give me 128 bytes starting at address 0x12300". A typical response might be "Successful." along with 128 bytes of data.
To handle communication, I have a routine which accepts a number of command packets, each with a delegate. As each response is received, the corresponding delegate will be run on the received data. The delegate associated with the command above would be something like "Confirm that I got a 'success' with 128 bytes of data, and if so, store them into my buffer at address 0x12300". Note that multiple packets may be outstanding at any given time; the curried address parameter is necessary for the routine to know where the incoming data should go. Even if I wanted to write a "store data to buffer" routine which didn't require an address parameter, it would have no way of knowing where the incoming data should go.

What is Type-safe?

What does "type-safe" mean?
Type safety means that the compiler will validate types while compiling, and throw an error if you try to assign the wrong type to a variable.
Some simple examples:
// Fails, Trying to put an integer in a string
String one = 1;
// Also fails.
int foo = "bar";
This also applies to method arguments, since you are passing explicit types to them:
int AddTwoNumbers(int a, int b)
{
return a + b;
}
If I tried to call that using:
int Sum = AddTwoNumbers(5, "5");
The compiler would throw an error, because I am passing a string ("5"), and it is expecting an integer.
In a loosely typed language, such as javascript, I can do the following:
function AddTwoNumbers(a, b)
{
return a + b;
}
if I call it like this:
Sum = AddTwoNumbers(5, "5");
Javascript automaticly converts the 5 to a string, and returns "55". This is due to javascript using the + sign for string concatenation. To make it type-aware, you would need to do something like:
function AddTwoNumbers(a, b)
{
return Number(a) + Number(b);
}
Or, possibly:
function AddOnlyTwoNumbers(a, b)
{
if (isNaN(a) || isNaN(b))
return false;
return Number(a) + Number(b);
}
if I call it like this:
Sum = AddTwoNumbers(5, " dogs");
Javascript automatically converts the 5 to a string, and appends them, to return "5 dogs".
Not all dynamic languages are as forgiving as javascript (In fact a dynamic language does not implicity imply a loose typed language (see Python)), some of them will actually give you a runtime error on invalid type casting.
While its convenient, it opens you up to a lot of errors that can be easily missed, and only identified by testing the running program. Personally, I prefer to have my compiler tell me if I made that mistake.
Now, back to C#...
C# supports a language feature called covariance, this basically means that you can substitute a base type for a child type and not cause an error, for example:
public class Foo : Bar
{
}
Here, I created a new class (Foo) that subclasses Bar. I can now create a method:
void DoSomething(Bar myBar)
And call it using either a Foo, or a Bar as an argument, both will work without causing an error. This works because C# knows that any child class of Bar will implement the interface of Bar.
However, you cannot do the inverse:
void DoSomething(Foo myFoo)
In this situation, I cannot pass Bar to this method, because the compiler does not know that Bar implements Foo's interface. This is because a child class can (and usually will) be much different than the parent class.
Of course, now I've gone way off the deep end and beyond the scope of the original question, but its all good stuff to know :)
Type-safety should not be confused with static / dynamic typing or strong / weak typing.
A type-safe language is one where the only operations that one can execute on data are the ones that are condoned by the data's type. That is, if your data is of type X and X doesn't support operation y, then the language will not allow you to to execute y(X).
This definition doesn't set rules on when this is checked. It can be at compile time (static typing) or at runtime (dynamic typing), typically through exceptions. It can be a bit of both: some statically typed languages allow you to cast data from one type to another, and the validity of casts must be checked at runtime (imagine that you're trying to cast an Object to a Consumer - the compiler has no way of knowing whether it's acceptable or not).
Type-safety does not necessarily mean strongly typed, either - some languages are notoriously weakly typed, but still arguably type safe. Take Javascript, for example: its type system is as weak as they come, but still strictly defined. It allows automatic casting of data (say, strings to ints), but within well defined rules. There is to my knowledge no case where a Javascript program will behave in an undefined fashion, and if you're clever enough (I'm not), you should be able to predict what will happen when reading Javascript code.
An example of a type-unsafe programming language is C: reading / writing an array value outside of the array's bounds has an undefined behaviour by specification. It's impossible to predict what will happen. C is a language that has a type system, but is not type safe.
Type safety is not just a compile time constraint, but a run time constraint. I feel even after all this time, we can add further clarity to this.
There are 2 main issues related to type safety. Memory** and data type (with its corresponding operations).
Memory**
A char typically requires 1 byte per character, or 8 bits (depends on language, Java and C# store unicode chars which require 16 bits).
An int requires 4 bytes, or 32 bits (usually).
Visually:
char: |-|-|-|-|-|-|-|-|
int : |-|-|-|-|-|-|-|-| |-|-|-|-|-|-|-|-| |-|-|-|-|-|-|-|-| |-|-|-|-|-|-|-|-|
A type safe language does not allow an int to be inserted into a char at run-time (this should throw some kind of class cast or out of memory exception). However, in a type unsafe language, you would overwrite existing data in 3 more adjacent bytes of memory.
int >> char:
|-|-|-|-|-|-|-|-| |?|?|?|?|?|?|?|?| |?|?|?|?|?|?|?|?| |?|?|?|?|?|?|?|?|
In the above case, the 3 bytes to the right are overwritten, so any pointers to that memory (say 3 consecutive chars) which expect to get a predictable char value will now have garbage. This causes undefined behavior in your program (or worse, possibly in other programs depending on how the OS allocates memory - very unlikely these days).
** While this first issue is not technically about data type, type safe languages address it inherently and it visually describes the issue to those unaware of how memory allocation "looks".
Data Type
The more subtle and direct type issue is where two data types use the same memory allocation. Take a int vs an unsigned int. Both are 32 bits. (Just as easily could be a char[4] and an int, but the more common issue is uint vs. int).
|-|-|-|-|-|-|-|-| |-|-|-|-|-|-|-|-| |-|-|-|-|-|-|-|-| |-|-|-|-|-|-|-|-|
|-|-|-|-|-|-|-|-| |-|-|-|-|-|-|-|-| |-|-|-|-|-|-|-|-| |-|-|-|-|-|-|-|-|
A type unsafe language allows the programmer to reference a properly allocated span of 32 bits, but when the value of a unsigned int is read into the space of an int (or vice versa), we again have undefined behavior. Imagine the problems this could cause in a banking program:
"Dude! I overdrafted $30 and now I have $65,506 left!!"
...'course, banking programs use much larger data types. ;) LOL!
As others have already pointed out, the next issue is computational operations on types. That has already been sufficiently covered.
Speed vs Safety
Most programmers today never need to worry about such things unless they are using something like C or C++. Both of these languages allow programmers to easily violate type safety at run time (direct memory referencing) despite the compilers' best efforts to minimize the risk. HOWEVER, this is not all bad.
One reason these languages are so computationally fast is they are not burdened by verifying type compatibility during run time operations like, for example, Java. They assume the developer is a good rational being who won't add a string and an int together and for that, the developer is rewarded with speed/efficiency.
Many answers here conflate type-safety with static-typing and dynamic-typing. A dynamically typed language (like smalltalk) can be type-safe as well.
A short answer: a language is considered type-safe if no operation leads to undefined behavior. Many consider the requirement of explicit type conversions necessary for a language to be strictly typed, as automatic conversions can sometimes leads to well defined but unexpected/unintuitive behaviors.
A programming language that is 'type-safe' means following things:
You can't read from uninitialized variables
You can't index arrays beyond their bounds
You can't perform unchecked type casts
An explanation from a liberal arts major, not a comp sci major:
When people say that a language or language feature is type safe, they mean that the language will help prevent you from, for example, passing something that isn't an integer to some logic that expects an integer.
For example, in C#, I define a function as:
void foo(int arg)
The compiler will then stop me from doing this:
// call foo
foo("hello world")
In other languages, the compiler would not stop me (or there is no compiler...), so the string would be passed to the logic and then probably something bad will happen.
Type safe languages try to catch more at "compile time".
On the down side, with type safe languages, when you have a string like "123" and you want to operate on it like an int, you have to write more code to convert the string to an int, or when you have an int like 123 and want to use it in a message like, "The answer is 123", you have to write more code to convert/cast it to a string.
To get a better understanding do watch the below video which demonstrates code in type safe language (C#) and NOT type safe language ( javascript).
http://www.youtube.com/watch?v=Rlw_njQhkxw
Now for the long text.
Type safety means preventing type errors. Type error occurs when data type of one type is assigned to other type UNKNOWINGLY and we get undesirable results.
For instance JavaScript is a NOT a type safe language. In the below code “num” is a numeric variable and “str” is string. Javascript allows me to do “num + str” , now GUESS will it do arithmetic or concatenation .
Now for the below code the results are “55” but the important point is the confusion created what kind of operation it will do.
This is happening because javascript is not a type safe language. Its allowing to set one type of data to the other type without restrictions.
<script>
var num = 5; // numeric
var str = "5"; // string
var z = num + str; // arthimetic or concat ????
alert(z); // displays “55”
</script>
C# is a type safe language. It does not allow one data type to be assigned to other data type. The below code does not allow “+” operator on different data types.
Concept:
To be very simple Type Safe like the meanings, it makes sure that type of the variable should be safe like
no wrong data type e.g. can't save or initialized a variable of string type with integer
Out of bound indexes are not accessible
Allow only the specific memory location
so it is all about the safety of the types of your storage in terms of variables.
Type-safe means that programmatically, the type of data for a variable, return value, or argument must fit within a certain criteria.
In practice, this means that 7 (an integer type) is different from "7" (a quoted character of string type).
PHP, Javascript and other dynamic scripting languages are usually weakly-typed, in that they will convert a (string) "7" to an (integer) 7 if you try to add "7" + 3, although sometimes you have to do this explicitly (and Javascript uses the "+" character for concatenation).
C/C++/Java will not understand that, or will concatenate the result into "73" instead. Type-safety prevents these types of bugs in code by making the type requirement explicit.
Type-safety is very useful. The solution to the above "7" + 3 would be to type cast (int) "7" + 3 (equals 10).
Try this explanation on...
TypeSafe means that variables are statically checked for appropriate assignment at compile time. For example, consder a string or an integer. These two different data types cannot be cross-assigned (ie, you can't assign an integer to a string nor can you assign a string to an integer).
For non-typesafe behavior, consider this:
object x = 89;
int y;
if you attempt to do this:
y = x;
the compiler throws an error that says it can't convert a System.Object to an Integer. You need to do that explicitly. One way would be:
y = Convert.ToInt32( x );
The assignment above is not typesafe. A typesafe assignement is where the types can directly be assigned to each other.
Non typesafe collections abound in ASP.NET (eg, the application, session, and viewstate collections). The good news about these collections is that (minimizing multiple server state management considerations) you can put pretty much any data type in any of the three collections. The bad news: because these collections aren't typesafe, you'll need to cast the values appropriately when you fetch them back out.
For example:
Session[ "x" ] = 34;
works fine. But to assign the integer value back, you'll need to:
int i = Convert.ToInt32( Session[ "x" ] );
Read about generics for ways that facility helps you easily implement typesafe collections.
C# is a typesafe language but watch for articles about C# 4.0; interesting dynamic possibilities loom (is it a good thing that C# is essentially getting Option Strict: Off... we'll see).
Type-Safe is code that accesses only the memory locations it is authorized to access, and only in well-defined, allowable ways.
Type-safe code cannot perform an operation on an object that is invalid for that object. The C# and VB.NET language compilers always produce type-safe code, which is verified to be type-safe during JIT compilation.
Type-safe means that the set of values that may be assigned to a program variable must fit well-defined and testable criteria. Type-safe variables lead to more robust programs because the algorithms that manipulate the variables can trust that the variable will only take one of a well-defined set of values. Keeping this trust ensures the integrity and quality of the data and the program.
For many variables, the set of values that may be assigned to a variable is defined at the time the program is written. For example, a variable called "colour" may be allowed to take on the values "red", "green", or "blue" and never any other values. For other variables those criteria may change at run-time. For example, a variable called "colour" may only be allowed to take on values in the "name" column of a "Colours" table in a relational database, where "red, "green", and "blue", are three values for "name" in the "Colours" table, but some other part of the computer program may be able to add to that list while the program is running, and the variable can take on the new values after they are added to the Colours table.
Many type-safe languages give the illusion of "type-safety" by insisting on strictly defining types for variables and only allowing a variable to be assigned values of the same "type". There are a couple of problems with this approach. For example, a program may have a variable "yearOfBirth" which is the year a person was born, and it is tempting to type-cast it as a short integer. However, it is not a short integer. This year, it is a number that is less than 2009 and greater than -10000. However, this set grows by 1 every year as the program runs. Making this a "short int" is not adequate. What is needed to make this variable type-safe is a run-time validation function that ensures that the number is always greater than -10000 and less than the next calendar year. There is no compiler that can enforce such criteria because these criteria are always unique characteristics of the problem domain.
Languages that use dynamic typing (or duck-typing, or manifest typing) such as Perl, Python, Ruby, SQLite, and Lua don't have the notion of typed variables. This forces the programmer to write a run-time validation routine for every variable to ensure that it is correct, or endure the consequences of unexplained run-time exceptions. In my experience, programmers in statically typed languages such as C, C++, Java, and C# are often lulled into thinking that statically defined types is all they need to do to get the benefits of type-safety. This is simply not true for many useful computer programs, and it is hard to predict if it is true for any particular computer program.
The long & the short.... Do you want type-safety? If so, then write run-time functions to ensure that when a variable is assigned a value, it conforms to well-defined criteria. The down-side is that it makes domain analysis really difficult for most computer programs because you have to explicitly define the criteria for each program variable.
Type Safety
In modern C++, type safety is very important. Type safety means that you use the types correctly and, therefore, avoid unsafe casts and unions. Every object in C++ is used according to its type and an object needs to be initialized before its use.
Safe Initialization: {}
The compiler protects from information loss during type conversion. For example,
int a{7}; The initialization is OK
int b{7.5} Compiler shows ERROR because of information loss.\
Unsafe Initialization: = or ()
The compiler doesn't protect from information loss during type conversion.
int a = 7 The initialization is OK
int a = 7.5 The initialization is OK, but information loss occurs. The actual value of a will become 7.0
int c(7) The initialization is OK
int c(7.5) The initialization is OK, but information loss occurs. The actual value of a will become 7.0