Possible way of overloading functions or prototypes with different return data types RPGLE - function

I have a procedure that returns a data type of char/packed/date based on the input parameters. I was thinking of possible ways to use overload, but IBM doesn't allowed overloading of prototypes that return different types of variables.
One way I have gotten around was by returning a data structure with all 3 data types starting at position 1 and just picking the one I need. So the code would look something like this.
Copy Source
dcl-ds myDs qualified;
charData char(100) pos(1);
packedData packed(10:3) pos(1);
dateData date pos(1);
end-ds;
dcl-proc someProc export;
dcl-pi someProc likeDS(myDS);
x1;
x2;
x3;
end-pi;
doSomething;
end-proc;
Which requires to be used as:
dcl-s localChar char(100);
dcl-s localPacked packed(10:3);
myDs = someProc(par1:par2:par3);
localPacked = myDS.packedData;
I was wondering if there was a way to just skip the coding of myDS = someProc() and just code directly as localPacked = someProc();
if it makes a difference, the caller would know what type of data to get back as well as the result would be included in the calling parms.
I was reading up on pointers and was wondering if using pointers would solve my problem if I used *caller as actgrp. I am still new to RPG and still learning, so not too familiar with the pointers usage. Just learned procedures and prototypes, and their usage and how fun to use, so that's that's all basically I am coding now on free time.

Welcome to RPG coding! :-)
I think your goal should be to make things as easy as possible for people who want to call your procedure. So I think you should create three procedures.
If the caller has to know the type of data returned, it's easier for them to just call the specific procedure. That way, the called procedure also knows what kind of data is wanted.
localPacked = someProcPacked(par1:par2:par3);
If all three procedures have a lot of code that is the same, you could put that code into a fourth procedure with the DS passed as a parameter. The real callers would not need to know about this procedure, so you would not export someProc().
dcl-proc someProcPacked export;
dcl-pi *n Packed(10:3);
x1;
x2;
x3;
end-pi;
dcl-ds ds likeds(myDs);
someProc(ds : x1 : x2 : x3);
// should you also tell someProc to give packed data?
return ds.packedData;
end-proc;

The first question that arose personally for me: as a consumer calling your function. must understand what type of result it will return to him?
Given that the possible types in your example are absolutely incompatible with each other.
One could understand if these were potentially compatible types, for example
char(26)
and
timestamp(6)
or
char(7)
and
zoned(7:0)
but
char(100)
packed(10:3)
date(1)
are not compatible with each other. If the function returns
packed(10:3)
and you try to interpret it as
char(100)
You will get complete nonsense. Otherwise, you risk getting an error.
In your case, one more field suggests itself in the resulting structure - the type of the result.
But in general, this is not a very good approach.
Overload is designed for when you have two or more functions that perform basically the same operation and return the same result. For example, the function of reading a message from the Data Queue is a possible option to unconditionally read the first available message, or (if it is a KEYED queue) read the message with the corresponding key value. In the first case, you only need to pass the name of the queue and the buffer for the message, in the second, you will also need to pass the key value.
dcl-pr ReadNoKey int(10);
dtaqName char(10) const;
dtaqLib char(10) const;
msgBuff char(65536) options(*varsize);
msgBuffLen int(10) value;
end-pr;
dcl-pr ReadKey int(10);
dtaqName char(10) const;
dtaqLib char(10) const;
msgBuff char(65536) options(*varsize);
msgBuffLen int(10) value;
msgKey char(256) optopns(*varsize);
msgKeyLen int(10) value;
end-pr;
dcl-pr Read int(10) overload(ReadNoKey: ReadKey);
In this case, you always call one Read function, and according to the set of parameters (passed the key or not), it is already clear what you want and what the program should do.
So it works

Related

Sub functions ml

So I'm trying to do an assignment for an ml course, the issue is that the function requires a set type: int * int -> int for example, and the way that I see to solve the problem is to use another function (say for iteration) to solve the problem.
I believe that lisp has some kind of way of having a function be in scope for only one other function.
I think that this could be done:
fun a (x, y) =
let
fun b (i,j) = ...;
in
...;
[Not sure of exact syntax for this but I remember reading something like this only it was for temporary variables (which could be functions?]
but please correct me if this is wrong.
In ML, functions are first class citizens (i.e. values). You can bind them via let just like any other value.
Therefore, your idea is correct. It is especially a good design for functions passed as "iterators" (i.e. to map/fold/iter). Your question is too vague however for any further advise.

Mysql - transforming non-null values part 2

In this question I asked if Mysql have a function which receives two arguments and returns null if the first is null or the second argument otherwise. Somebody said in the comment section that such function doesn't exist. How can I define this function in Mysql, considering that it may receive arguments of any type and the return value can be null or of the same type of the second parameter? Is it even possible?
This is not possible, given your or requirements.
The arguments and return values of stored functions (written in SQL) and user-defined functions (written in C) are all statically typed.
Granted, MySQL is pretty flexible with implicit casting, but the values are typed nonetheless. Even if you were okay with implicit casting, a scenario where that would be preferable to the seemingly-obvious solution is difficult to imagine.
IF(foo IS NULL,foo,bar) would suffice for the purpose and preserve the underlying types correctly, and IF(foo IS NULL,NULL,bar) would be almost the same thing, though the type of foo would be lost (e.g a "it's a NULL DATETIME").
You have previously rejected those, for reasons that are not intuitively obvious. When a purpose can be accomplished with built-ins, the motivation to reinvent the wheel is hard to understand.
try this:
Select IF(ISNULL(arg1), null, arg2)
Read more about Comparison Functions and Operators and Control Flow Functions.

Is it feasible to create a NullObject for every class? ( with a tool of course )

The NullObjectPattern is intended to be a "safe" ( neutral ) behavior.
The idea is create an object that don't do anything ( but doesn't throw NullPointerException either )
For instance the class defined as:
class Employee {
private String name;
private int age;
public String getName(){ return name; }
public int getAge() { return age; }
}
Would cause a NullPointerException in this code:
class Other {
Employee oscar;
String theName = oscar.getName(); // NPE
}
What the NOP says, is you can have an object like this:
class NullEmployee extends Employee {
public static final Employee instance = new NullEmployee();
public String getName(){ return ""; }
public int getAge() { return 0; }
}
And then use it as the default value.
Employee oscar = NullEmployee.instance;
The problem comes, when you need to repeat the code for every class you create, then a solution would be to have a tool to created it.
Would it be feasible/reasonable/useful to create such a tool or to use it ( if existed )?
Perhaps using AOP or DI the default value could be used automagically.
To me, the Null Object pattern feels like a placebo. A real object and a null object may have completely different meanings, but act very similar. Just like a placebo, the null object will trick you into believing there's nothing wrong, but something could be very wrong.
I think it's a good practice to fail early and fail often. In the end, you'll want to distinguish between a real object and a null object somewhere, and at that point it would be no different from checking against a null pointer.
From the Wikipedia article:
The advantage of this approach over a working default implementation is that a Null Object is very predictable and has no side effects: it does nothing.
It won't point out any problems either. Think of what will happen when a null object travels all the way through your application. Then, at some point, your application expects certain behavior from the object, which the null object implementation fails to deliver. At that point your application may crash or enter an invalid state. You'll have a very hard time tracing the origin of the null object. A null pointer would have thrown an exception right at the beginning, drawing your attention directly to the source of the problem.
The only example the Wikipedia article gives, is that of an empty collection instead of null. This is a very good practice, but a lousy example of the null object pattern, because it's dealing with a collection of objects, instead of a single instance.
In short, I'm sure it's feasible to create null object implementations for all your classes, but I strongly recommend against it.
I am not sure this is a good idea.
A "nullObject" may be useful and make perfect sense in some cases, but having this for every class is an overkill. Especially because this could potentially make some bug (or gaps in analysis) very hard to diagnose.
Also, what about 3rd part libraries that return new objects? Would you put some kind of "interface" in front of these so that in case they return null you will substitute an appopriate flavour of nullObject?
You mention that you are trying to automate this - wouldn't some cases require an actual design decision to return an appropriate value?
I.e., suppose you have an Image object, and it returns a "ImageType" (an Enumeration for .gif, .jpg etc.)
In this case, the ImageNullObject should return... "IMAGETYPE.GIF"? "IMAGETYPE.UNKNOWN"? Null?
I think you are just begging to push the error conditions down a layer.
When an application asks for an object, and it is null, then you need to deal with that condition, not just hope it is ok. If the object is null, and you try to use it, you will get errors, so why force all that error checking on the rest of the code?
I'd think it would be tricky to autogenerate functional null objects. (You could of course create shells that you then go and fill in the implementation itself).
As a quick example - what is the null functionality for a method that returns an int? Should it return -1, 0, or perhaps some other value? How about something that returns a String - again, is null or "" the correct choice? For methods returning some other class - if it's one of your own classes then presumably you could use the Null Object for that class, but how would you access it? Perhaps you could make every class you write implement some interface with an accessor for the null object, but is it worth it?
Even with methods that return void, it's not always the case that the null functionality is to simply return. Some classes, for example, might need to call a method on a parent/delegate object when they've done their own thing (which in this case would be a no-op) in order to preserve the invariants of the class.
So basically - even to implement "null" behaviour, you have to be able to infer what this behaviour is, and I expect this is far too advanced for a tool to do itself. Perhaps you could annotate all your methods with something like #NullReturns(-1), and have a tool pick these up and form the null object implementation, but even that won't cover every situation and you may as well write the object directly instead of writing all its functionality in annotations.
What I'd like to see would be for a compiler to allow one to define the behavior of a null object with a statically-defined type. For example, if 'Foo' is of class 'bar':
Class Bar
Dim Value as Integer
Function Boz() As Integer
Return Value
End Sub
Sub Nothing.Bar() As Integer
Return -9
End Sub
End Class
attempting to evaluate Foo.Bar() would yield Foo.Value if Foo is non-null, or else -9. One could achieve some such functionality using extension methods, but that seems a little icky.
Except for testing, I generally call this type of concept coding for Slop. It's the opposite of fail-fast, you are postponing any chance of locating a state you did not code for.
The main point here is why is it possible for your object to return a value that's not in the method's contract? When you designed and coded the method's contract, you either said it could or couldn't return null, right? So if you can return null, it's important. If you can't return null, it's important.
Also, If the object reference you are using can be null then that's a significant value and you shouldn't EVER be dipping into code that might access that object's methods.
On top of that, use final/constant/invariant variables every chance you get and (at least with OO languages, isolate your data behind a constructor that can insure correct state). If an immutable object correctly sets up it's variables, it's impossible for those variables to return invalid values.
Except for emergency patches and testing I honestly can't see any excuse for this kind of coding except "I don't understand the code and I think this isn't working right but I don't want to try to understand it"--which is not, IMO, a valid excuse for an engineer.

How do you return two values from a single method?

When your in a situation where you need to return two things in a single method, what is the best approach?
I understand the philosophy that a method should do one thing only, but say you have a method that runs a database select and you need to pull two columns. I'm assuming you only want to traverse through the database result set once, but you want to return two columns worth of data.
The options I have come up with:
Use global variables to hold returns. I personally try and avoid globals where I can.
Pass in two empty variables as parameters then assign the variables inside the method, which now is a void. I don't like the idea of methods that have a side effects.
Return a collection that contains two variables. This can lead to confusing code.
Build a container class to hold the double return. This is more self-documenting then a collection containing other collections, but it seems like it might be confusing to create a class just for the purpose of a return.
This is not entirely language-agnostic: in Lisp, you can actually return any number of values from a function, including (but not limited to) none, one, two, ...
(defun returns-two-values ()
(values 1 2))
The same thing holds for Scheme and Dylan. In Python, I would actually use a tuple containing 2 values like
def returns_two_values():
return (1, 2)
As others have pointed out, you can return multiple values using the out parameters in C#. In C++, you would use references.
void
returns_two_values(int& v1, int& v2)
{
v1 = 1; v2 = 2;
}
In C, your method would take pointers to locations, where your function should store the result values.
void
returns_two_values(int* v1, int* v2)
{
*v1 = 1; *v2 = 2;
}
For Java, I usually use either a dedicated class, or a pretty generic little helper (currently, there are two in my private "commons" library: Pair<F,S> and Triple<F,S,T>, both nothing more than simple immutable containers for 2 resp. 3 values)
I would create data transfer objects. If it is a group of information (first and last name) I would make a Name class and return that. #4 is the way to go. It seems like more work up front (which it is), but makes it up in clarity later.
If it is a list of records (rows in a database) I would return a Collection of some sort.
I would never use globals unless the app is trivial.
Not my own thoughts (Uncle Bob's):
If there's cohesion between those two variables - I've heard him say, you're missing a class where those two are fields. (He said the same thing about functions with long parameter lists.)
On the other hand, if there is no cohesion, then the function does more than one thing.
I think the most preferred approach is to build a container (may it be a class or a struct - if you don't want to create a separate class for this, struct is the way to go) that will hold all the parameters to be returned.
In the C/C++ world it would actually be quite common to pass two variables by reference (an example, your no. 2).
I think it all depends on the scenario.
Thinking from a C# mentality:
1: I would avoid globals as a solution to this problem, as it is accepted as bad practice.
4: If the two return values are uniquely tied together in some way or form that it could exist as its own object, then you can return a single object that holds the two values. If this object is only being designed and used for this method's return type, then it likely isn't the best solution.
3: A collection is a great option if the returned values are the same type and can be thought of as a collection. However, if the specific example needs 2 items, and each item is it's 'own' thing -> maybe one represents the beginning of something, and the other represents the end, and the returned items are not being used interchangably, then this may not be the best option.
2: I like this option the best, if 4, and 3 do not make sense for your scenario. As stated in 3, if you wanted to get two objects that represent the beginning and end items of something. Then I would use parameters by reference (or out parameters, again, depending on how it's all being used). This way your parameters can explicitly define their purpose: MethodCall(ref object StartObject, ref object EndObject)
Personally I try to use languages that allow functions to return something more than a simple integer value.
First, you should distinguish what you want: an arbitrary-length return or fixed-length return.
If you want your method to return an arbitrary number of arguments, you should stick to collection returns. Because the collections--whatever your language is--are specifically tied to fulfill such a task.
But sometimes you just need to return two values. How does returning two values--when you're sure it's always two values--differ from returning one value? No way it differs, I say! And modern languages, including perl, ruby, C++, python, ocaml etc allow function to return tuples, either built-in or as a third-party syntactic sugar (yes, I'm talking about boost::tuple). It looks like that:
tuple<int, int, double> add_multiply_divide(int a, int b) {
return make_tuple(a+b, a*b, double(a)/double(b));
}
Specifying an "out parameter", in my opinion, is overused due to the limitations of older languages and paradigms learned those days. But there still are many cases when it's usable (if your method needs to modify an object passed as parameter, that object being not the class that contains a method).
The conclusion is that there's no generic answer--each situation has its own solution. But one common thing there is: it's not violation of any paradigm that function returns several items. That's a language limitation later somehow transferred to human mind.
Python (like Lisp) also allows you to return any number of
values from a function, including (but not limited to)
none, one, two
def quadcube (x):
return x**2, x**3
a, b = quadcube(3)
Some languages make doing #3 native and easy. Example: Perl. "return ($a, $b);". Ditto Lisp.
Barring that, check if your language has a collection suited to the task, ala pair/tuple in C++
Barring that, create a pair/tuple class and/or collection and re-use it, especially if your language supports templating.
If your function has return value(s), it's presumably returning it/them for assignment to either a variable or an implied variable (to perform operations on, for instance.) Anything you can usefully express as a variable (or a testable value) should be fair game, and should dictate what you return.
Your example mentions a row or a set of rows from a SQL query. Then you reasonably should be ready to deal with those as objects or arrays, which suggests an appropriate answer to your question.
When your in a situation where you
need to return two things in a single
method, what is the best approach?
It depends on WHY you are returning two things.
Basically, as everyone here seems to agree, #2 and #4 are the two best answers...
I understand the philosophy that a
method should do one thing only, but
say you have a method that runs a
database select and you need to pull
two columns. I'm assuming you only
want to traverse through the database
result set once, but you want to
return two columns worth of data.
If the two pieces of data from the database are related, such as a customer's First Name and Last Name, I would indeed still consider this to be doing "one thing."
On the other hand, suppose you have come up with a strange SELECT statement that returns your company's gross sales total for a given date, and also reads the name of the customer that placed the first sale for today's date. Here you're doing two unrelated things!
If it's really true that performance of this strange SELECT statement is much better than doing two SELECT statements for the two different pieces of data, and both pieces of data really are needed on a frequent basis (so that the entire application would be slower if you didn't do it that way), then using this strange SELECT might be a good idea - but you better be prepared to demonstrate why your way really makes a difference in perceived response time.
The options I have come up with:
1 Use global variables to hold returns. I personally try and avoid
globals where I can.
There are some situations where creating a global is the right thing to do. But "returning two things from a function" is not one of those situations. Doing it for this purpose is just a Bad Idea.
2 Pass in two empty variables as parameters then assign the variables
inside the method, which now is a
void.
Yes, that's usually the best idea. This is exactly why "by reference" (or "output", depending on which language you're using) parameters exist.
I don't like the idea of methods that have a side effects.
Good theory, but you can take it too far. What would be the point of calling SaveCustomer() if that method didn't have a side-effect of saving the customer's data?
By Reference parameters are understood to be parameters that contain returned data.
3 Return a collection that contains two variables. This can lead to confusing code.
True. It wouldn't make sense, for instance, to return an array where element 0 was the first name and element 1 was the last name. This would be a Bad Idea.
4 Build a container class to hold the double return. This is more self-documenting then a collection containing other collections, but it seems like it might be confusing to create a class just for the purpose of a return.
Yes and no. As you say, I wouldn't want to create an object called FirstAndLastNames just to be used by one method. But if there was already an object which had basically this information, then it would make perfect sense to use it here.
If I was returning two of the exact same thing, a collection might be appropriate, but in general I would usually build a specialized class to hold exactly what I needed.
And if if you are returning two things today from those two columns, tomorrow you might want a third. Maintaining a custom object is going to be a lot easier than any of the other options.
Use var/out parameters or pass variables by reference, not by value. In Delphi:
function ReturnTwoValues(out Param1: Integer):Integer;
begin
Param1 := 10;
Result := 20;
end;
If you use var instead of out, you can pre-initialize the parameter.
With databases, you could have an out parameter per column and the result of the function would be a boolean indicating if the record is retrieved correctly or not. (Although I would use a single record class to hold the column values.)
As much as it pains me to do it, I find the most readable way to return multiple values in PHP (which is what I work with, mostly) is using a (multi-dimensional) array, like this:
function doStuff($someThing)
{
// do stuff
$status = 1;
$message = 'it worked, good job';
return array('status' => $status, 'message' => $message);
}
Not pretty, but it works and it's not terribly difficult to figure out what's going on.
I generally use tuples. I mainly work in C# and its very easy to design generic tuple constructs. I assume it would be very similar for most languages which have generics. As an aside, 1 is a terrible idea, and 3 only works when you are getting two returns that are the same type unless you work in a language where everything derives from the same basic type (i.e. object). 2 and 4 are also good choices. 2 doesn't introduce any side effects a priori, its just unwieldy.
Use std::vector, QList, or some managed library container to hold however many X you want to return:
QList<X> getMultipleItems()
{
QList<X> returnValue;
for (int i = 0; i < countOfItems; ++i)
{
returnValue.push_back(<your data here>);
}
return returnValue;
}
For the situation you described, pulling two fields from a single table, the appropriate answer is #4 given that two properties (fields) of the same entity (table) will exhibit strong cohesion.
Your concern that "it might be confusing to create a class just for the purpose of a return" is probably not that realistic. If your application is non-trivial you are likely going to need to re-use that class/object elsewhere anyway.
You should also consider whether the design of your method is primarily returning a single value, and you are getting another value for reference along with it, or if you really have a single returnable thing like first name - last name.
For instance, you might have an inventory module that queries the number of widgets you have in inventory. The return value you want to give is the actual number of widgets.. However, you may also want to record how often someone is querying inventory and return the number of queries so far. In that case it can be tempting to return both values together. However, remember that you have class vars availabe for storing data, so you can store an internal query count, and not return it every time, then use a second method call to retrieve the related value. Only group the two values together if they are truly related. If they are not, use separate methods to retrieve them separately.
Haskell also allows multiple return values using built in tuples:
sumAndDifference :: Int -> Int -> (Int, Int)
sumAndDifference x y = (x + y, x - y)
> let (s, d) = sumAndDifference 3 5 in s * d
-16
Being a pure language, options 1 and 2 are not allowed.
Even using a state monad, the return value contains (at least conceptually) a bag of all relevant state, including any changes the function just made. It's just a fancy convention for passing that state through a sequence of operations.
I will usually opt for approach #4 as I prefer the clarity of knowing what the function produces or calculate is it's return value (rather than byref parameters). Also, it lends to a rather "functional" style in program flow.
The disadvantage of option #4 with generic tuple classes is it isn't much better than returning a collection (the only gain is type safety).
public IList CalculateStuffCollection(int arg1, int arg2)
public Tuple<int, int> CalculateStuffType(int arg1, int arg2)
var resultCollection = CalculateStuffCollection(1,2);
var resultTuple = CalculateStuffTuple(1,2);
resultCollection[0] // Was it index 0 or 1 I wanted?
resultTuple.A // Was it A or B I wanted?
I would like a language that allowed me to return an immutable tuple of named variables (similar to a dictionary, but immutable, typesafe and statically checked). But, sadly, such an option isn't available to me in the world of VB.NET, it may be elsewhere.
I dislike option #2 because it breaks that "functional" style and forces you back into a procedural world (when often I don't want to do that just to call a simple method like TryParse).
I have sometimes used continuation-passing style to work around this, passing a function value as an argument, and returning that function call passing the multiple values.
Objects in place of function values in languages without first-class functions.
My choice is #4. Define a reference parameter in your function. That pointer references to a Value Object.
In PHP:
class TwoValuesVO {
public $expectedOne;
public $expectedTwo;
}
/* parameter $_vo references to a TwoValuesVO instance */
function twoValues( & $_vo ) {
$vo->expectedOne = 1;
$vo->expectedTwo = 2;
}
In Java:
class TwoValuesVO {
public int expectedOne;
public int expectedTwo;
}
class TwoValuesTest {
void twoValues( TwoValuesVO vo ) {
vo.expectedOne = 1;
vo.expectedTwo = 2;
}
}

What is Type-safe?

What does "type-safe" mean?
Type safety means that the compiler will validate types while compiling, and throw an error if you try to assign the wrong type to a variable.
Some simple examples:
// Fails, Trying to put an integer in a string
String one = 1;
// Also fails.
int foo = "bar";
This also applies to method arguments, since you are passing explicit types to them:
int AddTwoNumbers(int a, int b)
{
return a + b;
}
If I tried to call that using:
int Sum = AddTwoNumbers(5, "5");
The compiler would throw an error, because I am passing a string ("5"), and it is expecting an integer.
In a loosely typed language, such as javascript, I can do the following:
function AddTwoNumbers(a, b)
{
return a + b;
}
if I call it like this:
Sum = AddTwoNumbers(5, "5");
Javascript automaticly converts the 5 to a string, and returns "55". This is due to javascript using the + sign for string concatenation. To make it type-aware, you would need to do something like:
function AddTwoNumbers(a, b)
{
return Number(a) + Number(b);
}
Or, possibly:
function AddOnlyTwoNumbers(a, b)
{
if (isNaN(a) || isNaN(b))
return false;
return Number(a) + Number(b);
}
if I call it like this:
Sum = AddTwoNumbers(5, " dogs");
Javascript automatically converts the 5 to a string, and appends them, to return "5 dogs".
Not all dynamic languages are as forgiving as javascript (In fact a dynamic language does not implicity imply a loose typed language (see Python)), some of them will actually give you a runtime error on invalid type casting.
While its convenient, it opens you up to a lot of errors that can be easily missed, and only identified by testing the running program. Personally, I prefer to have my compiler tell me if I made that mistake.
Now, back to C#...
C# supports a language feature called covariance, this basically means that you can substitute a base type for a child type and not cause an error, for example:
public class Foo : Bar
{
}
Here, I created a new class (Foo) that subclasses Bar. I can now create a method:
void DoSomething(Bar myBar)
And call it using either a Foo, or a Bar as an argument, both will work without causing an error. This works because C# knows that any child class of Bar will implement the interface of Bar.
However, you cannot do the inverse:
void DoSomething(Foo myFoo)
In this situation, I cannot pass Bar to this method, because the compiler does not know that Bar implements Foo's interface. This is because a child class can (and usually will) be much different than the parent class.
Of course, now I've gone way off the deep end and beyond the scope of the original question, but its all good stuff to know :)
Type-safety should not be confused with static / dynamic typing or strong / weak typing.
A type-safe language is one where the only operations that one can execute on data are the ones that are condoned by the data's type. That is, if your data is of type X and X doesn't support operation y, then the language will not allow you to to execute y(X).
This definition doesn't set rules on when this is checked. It can be at compile time (static typing) or at runtime (dynamic typing), typically through exceptions. It can be a bit of both: some statically typed languages allow you to cast data from one type to another, and the validity of casts must be checked at runtime (imagine that you're trying to cast an Object to a Consumer - the compiler has no way of knowing whether it's acceptable or not).
Type-safety does not necessarily mean strongly typed, either - some languages are notoriously weakly typed, but still arguably type safe. Take Javascript, for example: its type system is as weak as they come, but still strictly defined. It allows automatic casting of data (say, strings to ints), but within well defined rules. There is to my knowledge no case where a Javascript program will behave in an undefined fashion, and if you're clever enough (I'm not), you should be able to predict what will happen when reading Javascript code.
An example of a type-unsafe programming language is C: reading / writing an array value outside of the array's bounds has an undefined behaviour by specification. It's impossible to predict what will happen. C is a language that has a type system, but is not type safe.
Type safety is not just a compile time constraint, but a run time constraint. I feel even after all this time, we can add further clarity to this.
There are 2 main issues related to type safety. Memory** and data type (with its corresponding operations).
Memory**
A char typically requires 1 byte per character, or 8 bits (depends on language, Java and C# store unicode chars which require 16 bits).
An int requires 4 bytes, or 32 bits (usually).
Visually:
char: |-|-|-|-|-|-|-|-|
int : |-|-|-|-|-|-|-|-| |-|-|-|-|-|-|-|-| |-|-|-|-|-|-|-|-| |-|-|-|-|-|-|-|-|
A type safe language does not allow an int to be inserted into a char at run-time (this should throw some kind of class cast or out of memory exception). However, in a type unsafe language, you would overwrite existing data in 3 more adjacent bytes of memory.
int >> char:
|-|-|-|-|-|-|-|-| |?|?|?|?|?|?|?|?| |?|?|?|?|?|?|?|?| |?|?|?|?|?|?|?|?|
In the above case, the 3 bytes to the right are overwritten, so any pointers to that memory (say 3 consecutive chars) which expect to get a predictable char value will now have garbage. This causes undefined behavior in your program (or worse, possibly in other programs depending on how the OS allocates memory - very unlikely these days).
** While this first issue is not technically about data type, type safe languages address it inherently and it visually describes the issue to those unaware of how memory allocation "looks".
Data Type
The more subtle and direct type issue is where two data types use the same memory allocation. Take a int vs an unsigned int. Both are 32 bits. (Just as easily could be a char[4] and an int, but the more common issue is uint vs. int).
|-|-|-|-|-|-|-|-| |-|-|-|-|-|-|-|-| |-|-|-|-|-|-|-|-| |-|-|-|-|-|-|-|-|
|-|-|-|-|-|-|-|-| |-|-|-|-|-|-|-|-| |-|-|-|-|-|-|-|-| |-|-|-|-|-|-|-|-|
A type unsafe language allows the programmer to reference a properly allocated span of 32 bits, but when the value of a unsigned int is read into the space of an int (or vice versa), we again have undefined behavior. Imagine the problems this could cause in a banking program:
"Dude! I overdrafted $30 and now I have $65,506 left!!"
...'course, banking programs use much larger data types. ;) LOL!
As others have already pointed out, the next issue is computational operations on types. That has already been sufficiently covered.
Speed vs Safety
Most programmers today never need to worry about such things unless they are using something like C or C++. Both of these languages allow programmers to easily violate type safety at run time (direct memory referencing) despite the compilers' best efforts to minimize the risk. HOWEVER, this is not all bad.
One reason these languages are so computationally fast is they are not burdened by verifying type compatibility during run time operations like, for example, Java. They assume the developer is a good rational being who won't add a string and an int together and for that, the developer is rewarded with speed/efficiency.
Many answers here conflate type-safety with static-typing and dynamic-typing. A dynamically typed language (like smalltalk) can be type-safe as well.
A short answer: a language is considered type-safe if no operation leads to undefined behavior. Many consider the requirement of explicit type conversions necessary for a language to be strictly typed, as automatic conversions can sometimes leads to well defined but unexpected/unintuitive behaviors.
A programming language that is 'type-safe' means following things:
You can't read from uninitialized variables
You can't index arrays beyond their bounds
You can't perform unchecked type casts
An explanation from a liberal arts major, not a comp sci major:
When people say that a language or language feature is type safe, they mean that the language will help prevent you from, for example, passing something that isn't an integer to some logic that expects an integer.
For example, in C#, I define a function as:
void foo(int arg)
The compiler will then stop me from doing this:
// call foo
foo("hello world")
In other languages, the compiler would not stop me (or there is no compiler...), so the string would be passed to the logic and then probably something bad will happen.
Type safe languages try to catch more at "compile time".
On the down side, with type safe languages, when you have a string like "123" and you want to operate on it like an int, you have to write more code to convert the string to an int, or when you have an int like 123 and want to use it in a message like, "The answer is 123", you have to write more code to convert/cast it to a string.
To get a better understanding do watch the below video which demonstrates code in type safe language (C#) and NOT type safe language ( javascript).
http://www.youtube.com/watch?v=Rlw_njQhkxw
Now for the long text.
Type safety means preventing type errors. Type error occurs when data type of one type is assigned to other type UNKNOWINGLY and we get undesirable results.
For instance JavaScript is a NOT a type safe language. In the below code “num” is a numeric variable and “str” is string. Javascript allows me to do “num + str” , now GUESS will it do arithmetic or concatenation .
Now for the below code the results are “55” but the important point is the confusion created what kind of operation it will do.
This is happening because javascript is not a type safe language. Its allowing to set one type of data to the other type without restrictions.
<script>
var num = 5; // numeric
var str = "5"; // string
var z = num + str; // arthimetic or concat ????
alert(z); // displays “55”
</script>
C# is a type safe language. It does not allow one data type to be assigned to other data type. The below code does not allow “+” operator on different data types.
Concept:
To be very simple Type Safe like the meanings, it makes sure that type of the variable should be safe like
no wrong data type e.g. can't save or initialized a variable of string type with integer
Out of bound indexes are not accessible
Allow only the specific memory location
so it is all about the safety of the types of your storage in terms of variables.
Type-safe means that programmatically, the type of data for a variable, return value, or argument must fit within a certain criteria.
In practice, this means that 7 (an integer type) is different from "7" (a quoted character of string type).
PHP, Javascript and other dynamic scripting languages are usually weakly-typed, in that they will convert a (string) "7" to an (integer) 7 if you try to add "7" + 3, although sometimes you have to do this explicitly (and Javascript uses the "+" character for concatenation).
C/C++/Java will not understand that, or will concatenate the result into "73" instead. Type-safety prevents these types of bugs in code by making the type requirement explicit.
Type-safety is very useful. The solution to the above "7" + 3 would be to type cast (int) "7" + 3 (equals 10).
Try this explanation on...
TypeSafe means that variables are statically checked for appropriate assignment at compile time. For example, consder a string or an integer. These two different data types cannot be cross-assigned (ie, you can't assign an integer to a string nor can you assign a string to an integer).
For non-typesafe behavior, consider this:
object x = 89;
int y;
if you attempt to do this:
y = x;
the compiler throws an error that says it can't convert a System.Object to an Integer. You need to do that explicitly. One way would be:
y = Convert.ToInt32( x );
The assignment above is not typesafe. A typesafe assignement is where the types can directly be assigned to each other.
Non typesafe collections abound in ASP.NET (eg, the application, session, and viewstate collections). The good news about these collections is that (minimizing multiple server state management considerations) you can put pretty much any data type in any of the three collections. The bad news: because these collections aren't typesafe, you'll need to cast the values appropriately when you fetch them back out.
For example:
Session[ "x" ] = 34;
works fine. But to assign the integer value back, you'll need to:
int i = Convert.ToInt32( Session[ "x" ] );
Read about generics for ways that facility helps you easily implement typesafe collections.
C# is a typesafe language but watch for articles about C# 4.0; interesting dynamic possibilities loom (is it a good thing that C# is essentially getting Option Strict: Off... we'll see).
Type-Safe is code that accesses only the memory locations it is authorized to access, and only in well-defined, allowable ways.
Type-safe code cannot perform an operation on an object that is invalid for that object. The C# and VB.NET language compilers always produce type-safe code, which is verified to be type-safe during JIT compilation.
Type-safe means that the set of values that may be assigned to a program variable must fit well-defined and testable criteria. Type-safe variables lead to more robust programs because the algorithms that manipulate the variables can trust that the variable will only take one of a well-defined set of values. Keeping this trust ensures the integrity and quality of the data and the program.
For many variables, the set of values that may be assigned to a variable is defined at the time the program is written. For example, a variable called "colour" may be allowed to take on the values "red", "green", or "blue" and never any other values. For other variables those criteria may change at run-time. For example, a variable called "colour" may only be allowed to take on values in the "name" column of a "Colours" table in a relational database, where "red, "green", and "blue", are three values for "name" in the "Colours" table, but some other part of the computer program may be able to add to that list while the program is running, and the variable can take on the new values after they are added to the Colours table.
Many type-safe languages give the illusion of "type-safety" by insisting on strictly defining types for variables and only allowing a variable to be assigned values of the same "type". There are a couple of problems with this approach. For example, a program may have a variable "yearOfBirth" which is the year a person was born, and it is tempting to type-cast it as a short integer. However, it is not a short integer. This year, it is a number that is less than 2009 and greater than -10000. However, this set grows by 1 every year as the program runs. Making this a "short int" is not adequate. What is needed to make this variable type-safe is a run-time validation function that ensures that the number is always greater than -10000 and less than the next calendar year. There is no compiler that can enforce such criteria because these criteria are always unique characteristics of the problem domain.
Languages that use dynamic typing (or duck-typing, or manifest typing) such as Perl, Python, Ruby, SQLite, and Lua don't have the notion of typed variables. This forces the programmer to write a run-time validation routine for every variable to ensure that it is correct, or endure the consequences of unexplained run-time exceptions. In my experience, programmers in statically typed languages such as C, C++, Java, and C# are often lulled into thinking that statically defined types is all they need to do to get the benefits of type-safety. This is simply not true for many useful computer programs, and it is hard to predict if it is true for any particular computer program.
The long & the short.... Do you want type-safety? If so, then write run-time functions to ensure that when a variable is assigned a value, it conforms to well-defined criteria. The down-side is that it makes domain analysis really difficult for most computer programs because you have to explicitly define the criteria for each program variable.
Type Safety
In modern C++, type safety is very important. Type safety means that you use the types correctly and, therefore, avoid unsafe casts and unions. Every object in C++ is used according to its type and an object needs to be initialized before its use.
Safe Initialization: {}
The compiler protects from information loss during type conversion. For example,
int a{7}; The initialization is OK
int b{7.5} Compiler shows ERROR because of information loss.\
Unsafe Initialization: = or ()
The compiler doesn't protect from information loss during type conversion.
int a = 7 The initialization is OK
int a = 7.5 The initialization is OK, but information loss occurs. The actual value of a will become 7.0
int c(7) The initialization is OK
int c(7.5) The initialization is OK, but information loss occurs. The actual value of a will become 7.0