Syntactic sugar vs. feature - language-agnostic

In C# (and Java) a string is little more than a char array with a stored length and a few methods tacked on. Likewise, (reference vs. value stuff aside) objects are little more than glorified structs with inheritance and interfaces added.
On one level, these additions feel like clear features and enhancements unto themselves. On another level, they feel like a marginal upgrade from the status of "syntactic sugar."
To take this idea further, consider (I may have some details wrong, but the point remains):
transistor
elementary logic gate
compound gate
| |
ALU flip-flop
| | |
| register RAM
| |
CPU
microcode
assembly
C
C++
| |
MSIL JavaScript
C# jQuery
Many times, any single layer of abstraction looks a lot like syntactic sugar but multiple layers of separation feel very removed from each other.
How do you know when something has stopped being syntactic sugar and started being a bona fide feature?

It turns out to be a feature instead of syntactic sugar when it implies a different way of thinking.
You are right when you say objects are in fact glorified structs with methods and inheritance. That, however, is just the implementation detail. What objects allow is to think in a different way. You can relate more easily to real world entities when thinking about objects. The same thing happened when even further back in time, we jumped from using go-to's to procedural programming. Under the hood, the processor still keeps on jmp'ing from OP to OP, but we could think in a different, more black-box, way.
Having said that, in extreme, you can say everything is syntactic sugar, but some of that sugar is a feature when it allows you to think differently.

Syntactic sugar is a feature.

All of software is a giant stack of abstractions built on top of other abstractions. A string may be nothing more than an array of characters, but there are many operations that feel natural on strings, but awkward on character arrays. The goal of all of these abstractions is the same: remove irrelevant details so that the developer can focus on the important parts of the problem.
As you point out, all modern programming languages could be eliminated, and we could go back to working in assembly language. But our productivity would plummet.
I guess people call something syntactic sugar when they feel they get little benefit from it, and a feature when they feel the get a large benefit from it. That makes the distinction very fuzzy, and quite subjective.

When the change provides value? I have coded in assembler. I switched to C and looked at the output from the compiler. It's code was 95+% as good as my hand crafted assembler and it was much easier to write. For me that provided value so I'd say it wasn't sugar.
C++ helps me translate my object oriented thoughts into code. As long as the overhead isn't terribly high then I think it's a feature.
I'm a practical sort. "If I can see it's valuable" is my answer

"Syntactic sugar" is a feature you don't like

It seems that syntactical surgar is a syntax that changes nothing about the abilities of the language, and using a different construct accomplishes exactly the same thing. A String (thinking in Java) is not just syntatical sugar over a char array. A char array is mutable (in content if not in length). You could not make a char array immutable with an existing language feature without a String array.
On the other hand, the plus operator working on Strings is indeed syntatical sugar for using a StringBuilder and calling append.

I would have to say when the same result is cannot be achieved simply by writing different code, with the same type of "time-constraint" as using the syntactical sugar.
My Example would be a Lambda expression, writing a foreach loop doesn't take a lot of effort, but using .Foreach() sure is nice too; versus rewriting the whole HttpRequest class on your own. One is syntactical, one is a feature. Both save time, one in a much bigger way than the other.

Generally the term "syntactic sugar" refers to language features which never allowed a programmer to do something which could not be done before, but rather provided a nice means of expressing something that could already expressed in the language, even if somewhat more awkwardly.
Certain constructs may be unambiguously regarded as syntactic sugar. For example, in VB.NET, code to test for whether two references weren't equal used to require If Not (ref1 Is Ref2) but newer versions of the language allow If ref1 IsNot Ref2. Nothing can be expressed in the new syntax that couldn't be expressed in the old, but the new syntax is cleaner, introduces no ambiguities, and the only reason not to use it would be if code had to be back-compatible with old versions of the language.
Some constructs may be a bit harder to define as sugar. In particular, if a language adds constructs which will work identically to existing construct when used with other types, but will fail compilation with others, such constructs may provide a means of compile-time type verification which did not exist previously. Java generics may generally be viewed in this light. One can add a Cat to an ArrayList<Cat> just as easily as to an ArrayList; what the ArrayList<Cat> adds is a guard to reject Dogs at compile time. Since compile-time constraints don't allow one to write any program that couldn't be written without them, some people may view them as syntactic sugar. On the other hand, even though type verification is performed at compile-time rather than run-time, it might still be viewed as one of the jobs of a program.

Syntactic sugar and language feature are basically describing the same thing, even if syntactic sugar is sometimes used in a pejorative way whereas feature is often associated with deeper changes in the language architecture (introducing lambdas etc.).
But this distinction is very dependent on a individual point of view (and its subjectively felt usefulness).
Regarding language-design aspects and your example with strings and char-arrays, I would say that this should be neither a feature nor sugar, but simply expressible in the languages basic syntax (LOP - language-oriented programming). Generic concepts (typeclasses, metaprogramming etc.) allow you to express many new and useful constructs by yourself without waiting for the language to get a new feature. Just look at Haskell or C++'s metaprogramming capabilities.

Related

Beyond type theory

There has been much fuss about dynamically vs. statically typed languages. To my eye, however, while statically typed languages enable the compiler (or interpreter) to know a bit more about your intentions, they only barely scratch the surface of what could be conveyed. Indeed, some languages have an orthogonal mechanism for providing a bit more information in annotations.
I am aware of strongly typed languages like Agda and Coq that are very persnickety about what they allow you to do; I'm not terribly interested in those. Rather, I'm wondering what languages or theory exist that expand the richness of what you can explain to the compiler about what it is that you intend. For example, if you have a mutable vector and you turn it into a unit vector, why couldn't your compiler select a unit-vector form of vector projection instead of the more computationally expensive general form? The type has not changed--and the work required to build all the requisite types would be off-putting even in a language with amazingly easy typing such as Haskell--and yet it seems that the compiler could be empowered to know a great deal about the situation.
Does some language already enable things like this, either outside of standard type-theory or within one of its more advanced branches?
there are languages with turing-complete system type. which means that your types can express any computable property. for example list of length 6 or valid credit card number. however most mainstream languages uses simpler system types. haskell is considered to have very powerful system type

What's the use of abstract syntax trees?

I am learning on my own about writing an interpreter for a programming language, and I have read about Abstract Syntax Trees. I have an idea of what they are, but I do not see their use.
Why are ASTs useful?
They represent the logic/syntax of the code, which is naturally a tree rather than a list of lines, without getting bogged down in concrete syntax issues such as where you place your asterisk.
The logic can then be manipulated in a manner more consistent and convenient from the backend's POV, which can be (and is, for everything but Lisps ;) very different from how we write the concrete syntax.
The main benefit os using an AST is that you separate the parsing and validation logic from the implementation piece. Interpreters implemented as ASTs really are easier to understand and maintain. If you have a problem parsing some strange syntax you look at the AST parser , if a pices of code is not producing the expected results than you look at the code that interprets the AST.
The other great advantage is when you syntax requires "lookahead" e.g. if your syntax allows a subroutine to be used before it is defined it is trivial to validate the existence of a subroutine when you are using an AST - its much more difficult with an "on the fly" parser.
You need "syntax trees" to represent the structure of most programming langauges, in order to carry out analysis or transformation on documents that contain programming language text. (You can see some fancy examples of this via my bio).
Whether that tree is abstract (AST) or concrete (CST) is a matter of taste, convenience, and engineering sweat. The term CST is specially used to describe the parse derivation tree when a grammar is used to deconstruct source code; it usually contains tree elements for lots of concrete syntax such as statement terminator semicolons. AST is used to mean "something simpler than the CST", e.g., leaving out semicolon tree nodes because they don't affect program analysis much, and thus writing analyzers that process ASTs is less conceptual and engineering effort than writing the same analyzer on a CST. A better way to understand this is to realize that the AST is usually as isomorphic equivalent of the CST, that is, you should be able to regenerate the CST from it. If you want to transform the source text and regenerate it, then the CST is often a better choice as it loses less information from the original program (and my fancy example uses this approach).
I think you will find the SO discussion on abstract vs. concrete syntax trees pretty helpful.
In general you are going to parse you code into some form of AST, it may be more or less of a formal model. So I think what Kirk Woll was getting at by his comment above is that when you parse the language, you very often use the parser to create some sort of data model of the raw content of what you are reading, generally organized in a tree fashion. So by that definition an AST is hard to avoid unless you are doing a very simple translator.
I use ANTLR often for parsing complex languages and in that context there is a slightly more specific meaning of an AST. ANTLR has a handy way of generating an AST in the parser grammar using pretty simple actions. You then write a generally much simpler parser for this AST which you can operate on like a much simpler version the language you are processing. Whether the extra work of building two parsers is a net gain is a function of the language complexity and what you are planning on doing with with it once you parsed it.
A good book on the subject that you may want to take a look at is "Language Implementation Patterns" by Terrence Parr the ANTLR author. He addresses this topic pretty thoroughly. That said, I didn't really get ASTs until I started using them, so that (as usual) is the best way to understand them.
Late to the question but I thought I'd add something. You don't actually have to build an AST. It is possible to emit instructions directly as you parse the source code. In this case, the AST is implied in the parsing grammar. For simple languages, especially dynamically typed languages, this is a perfectly ok strategy. For more complex languages or where you need to further analyze the source code, an AST can be very useful. For example, if your language is statically typed, ie your variables are declared with fixed types then the AST can be used to check that you're not assigning the wrong type to a variable. eg assigning a string to a variable that is declared to hold an integer would be wrong and this can be caught more conveniently with the AST.
Also, as others have mentioned, the AST offers a clean separation between syntax analysis and code generation and makes the code much more modular.

Idiom vs. pattern

In the context of programming, how do idioms differ from patterns?
I use the terms interchangeably and normally follow the most popular way I've heard something called, or the way it was called most recently in the current conversation, e.g. "the copy-swap idiom" and "singleton pattern".
The best difference I can come up with is code which is meant to be copied almost literally is more often called pattern while code meant to be taken less literally is more often called idiom, but such isn't even always true. This doesn't seem to be more than a stylistic or buzzword difference. Does that match your perception of how the terms are used? Is there a semantic difference?
Idioms are language-specific.
Patterns are language-independent design principles, usually written in a "pattern language" (a uniform template) describing things such as the motivating circumstances, pros & cons, related patterns, etc.
When people observing program development from On High (Analysts, consultants, academics, methodology gurus, etc) see developers doing the same thing over and over again in various situations and environments, then the intelligence gained from that observation can be distilled into a Pattern. A pattern is a way of "doing things" with the software tools at hand that represent a common abstraction.
Some examples:
OO programming took global variables away from developers. For those cases where they really still need global variables but need a way to make their use look clean and object oriented, there's the Singleton Pattern.
Sometimes you need to create a new object having one of a variety of possible different types, depending on some circumstances. An ugly way might involve an ever-expanding case statement. The accepted "elegant" way to achieve this in an OO-clean way is via the "Factory" or "Factory Method" pattern.
Sometimes, a lot of developers do things in a certain way but it's a bad way that should be disrecommended. This can be formalized in an antipattern.
Patterns are a high-level way of doing things, and most are language independent. Whether you create your objects with new Object or Object.new is immaterial to the pattern.
Since patterns are something a bit theoretical and formal, there is usually a formal pattern (heh - word overload! let's say "template") for their description. Such a template may include:
Name
Effect achieved
Rationale
Restrictions and Limitations
How to do it
Idioms are something much lower-level, and usually operate at the language level. Example:
*dst++ = *src++
in C copies a data element from src to dst while incrementing the pointers to both; it's usually done in a loop. Obviously, you won't see this idiom in Java or Object Pascal.
while <INFILE> { print chomp; }
is (roughly quoted from memory) a Perl idiom for looping over an input file and printing out all lines in the file. There's a lot of implicit variable use in that statement. Again, you won't see this particular syntax anywhere but in Perl; but an old Perl hacker will take a quick look at the statement and immediately recognize what you're doing.
Contrary to the idea that patterns are language agnostic, both Paul Graham and Peter Norvig have suggested that the need to use a pattern is a sign that your language is missing a feature. (Visitor Pattern is often singled out as the most glaring example of this.)
I generally think the main difference between "patterns" and "idioms" to be one of size. An idiom is something small, like "use an interface for the type of a variable that holds a collection" while Patterns tend to be larger. I think the smallness of idioms does mean that they're more often language specific (the example I just gave was a Java idiom), but I don't think of that as their defining characteristic.
Since if you put 5 programmers in a room they will probably not even agree on what things are patterns, there's no real "right answer" to this.
One opinion that I've heard once and really liked (though can't for the life of me recall the source), is that idioms are things that should probably be in your language or there is some language that has them. Conversely, they are tricks that we use because our language doesn't offer a direct primitive for them. For instance, there's no singleton in Java, but we can mimic it by hiding the constructor and offering a getInstance method.
Patterns, on the other hand, are more language agnostic (though they often refer to a specific paradigm). You may have some infrastructure to support them (e.g., Spring for MVC), but they're not and not going to be language constructs, and yet you could need them in any language from that paradigm.

What general purpose language should I learn next?

I'm currently participating in a programming contest (http://contest.github.com), which has as goal, to create a recommendation engine. I started coding in ruby, but soon realised it wasn't fast enough for the algorithms I had in mind. So I switched to C, which is the only non-scripting language I know. It was fast, of course, but I cringed every time I had to write a for loop, to go through the elements of an array (which was very often).
That's when it dawned: I wish I knew a fast, yet high-level language, to program all these intensive computations with ease!
So I looked at my options, but there are a lot of options these days! Here the best candidates I've found over the months, with something which bothers me about each of them (that hopefully you can clear up):
Clojure: I'm not sure I want to get into the whole lisp thing, I like my syntax and cruft. I could be convinced, though.
Haskell: Too academic? I don't really care for pure functional, I just want something which works. But it has nice syntax, and I don't mind static typing.
Scala: Weird language. I tried it out but it feels messy/inconsistent to me.
OCaml: Also wondering if this is too academic? The poor concurrency support also bothers me.
Arc: Paul Graham's lisp, too obscure, and again, I'm not sure I want to learn a lisp. But I trust this man!
Any advice? I really like the functional languages, for their ability to manipulate lists with ease, but I'm open to other options too. I'd like something about as fast as Java..
The kind of things I want to be able to do with lists are like (ruby):
([1, 2, 3, 4] - [2, 3]).map {|i| i * 2 } # which results in [2, 8]
I would also prefer an open-source language.
Thanks
Out of the languages that you've listed, neither Haskell nor Arc match your "fast" requirement - both are slower than Java. Your idea that Haskell is faster than Java and approaches C is most likely coming from one well-known flawed test that tried to measure performance by implementing sort. One thing that they've missed is that Haskell is lazy, and thus you need to use the results of the sort for it to actually perform that; and they measured performance simply by remembering current time, "calling" the sort function, and checking the time delta. C version of the test faithfully performed the sort, Haskell version simply returned a thunk for lazy evaluation which was never called.
In practice, there are a number of reasons why Haskell cannot be that fast even in theory. First, because of pervasive lazy evaluation, it often cannot pass around raw values, and has to generate thunks for expressions - the optimizer can trim down on those in trivial cases, but not for more complicated ones. Second, polymorphic Haskell functions are implemented as runtime-polymorphic, and not like C++ templates where every new type parameter instantiates a new version of code that is optimally compiled. Obviously, this necessitates extra boxing/unboxing. In the end, Haskell will struggle to beat any decent VM (such as HotSpot JVM, or CLR in .NET 2.0+), much less C/C++.
Now that's settled in, let's move on to the rest. Scala uses JVM as a backend, and thus is not going to be any faster than Java - and if you use higher-level abstractions, it will most likely be slower somewhat, but probably in the same ballpark. Clojure also runs on JVM, but it's also dynamically typed, and that carries an unavoidable performance penalty (I heard it does clever tricks to mitigate that to some extent, but some of it really is unavoidable no matter what).
That leaves OCaml, and out of your list, it is the only language that had actually been conclusively shown to reach the performance of C/C++ compilers on valid tests. It should be noted however that this would not be typical of idiomatic OCaml code - for example, its polymorphism is also runtime, similar to Haskell, and that carries the appropriate penalty; also, its OOP system is structural rather than nominal, which precludes an optimal vtable-based implementation; so that is going to be slower than C++, too (I'd expect perf penalty close to that of Objective-C dispatch compared to C++ dispatch, but I don't have any numbers to back that up). So you can beat C++ in OCaml if you steer away from certain language features, but unfortunately, it's those features that make OCaml so attractive in the first place.
My advice would be this: if you really need speed, go with C++. It can be fairly high-level if you use high-level libraries such as STL and Boost. It doesn't have some high-level language abstractions you might be used to, but libraries can compensate for that - sometimes fully, sometimes in part. For example, you don't have to write a for-loop to iterate over an array - you can use std::for_each, std::copy_if, std::transform, std::accumulate and similar algorithms (which are mostly analogous to map, filter, fold, and similar traditional FP primitives), and also Boost.Lambda to cut down on boilerplace.
Why not simple Java or C#? Should be faster then Ruby, more high level then C and have a huge userbase.
Your criticism of pretty much everything seems to be that it's "weird" or "too academic." But what does that mean? It's the sort of vague criticism that you can throw at any unfamiliar language that isn't totally mainstream (i.e., not C, C++, Objective-C, Java, Ruby, Python or PHP). There's nothing about all those languages that's inherently good for academia and bad for anything else. Try to break down your analysis a little further: Specifically, what is it that troubles you about those languages? You might find that your brain is just instinctively pushing away something unfamiliar. If that's the case, learning one of those languages might be a good way to expand your mind.
Alternatively: It sounds like you're looking for a functional language, so you might look at F#. It's a first-class CLR language created by Microsoft, so it doesn't carry any "academic" mental baggage, and it's very similar to OCaml.
newLISP is fast, small, integrates extremely easily with C, and it has quite a few statistical functions built-in.
Haskell is my current preference as a performant, high-level language. I've also heard very good things about OCaml, but haven't personally used it much.
Scala and Clojure will have similar performance to Java -- slow, slow, slow! Sure, they'll be faster than Ruby, but what isn't?
Arc is a set of macros for MzScheme, and is not particularly fast. If you want a performant LISP, try Common LISP -- it can be compiled to machine code.
How about Delphi / FreePascal? They're native code & fast. I do a lot of real-time graphics & processing with them. They dont require that you work 'low level', but you can if you need to. Plus you can embed assembler if needed for extra performance. FreePascal is cross platform if you want to stay off Windows.
D might fit the bill? Compiles to machine code but allows for programming using higher-level concepts.
Python can be made to run fast, especially using the NumPy package. Relevant links below:
http://www.scipy.org/PerformancePython
Cython and numpy speed
You seem uncomfortable with any language that doesn't look like one you already use. That's going to limit you, so I'd suggest one you won't be comfortable with if you're interested in expanding your horizons. I'm not saying you'll want to continue with any particular language (I have a definite preference never to touch Tcl again), but you should try it sometime.
There are nice fast implementations of Common Lisp, and that's an easy language to write functional programs in. Besides, if you can get along with it, you'll find a lot of neat things you can do with it.
Computation? Fortran. Beats the pants off of anything else.
If you don't mind .NET...
F# - based on O'Caml, multiparadigm language with full access to .NET Framework. Included officially in .NET FW 4.0
Nemerle - see F# and add to that a POWERFUL metaprogramming capabilities.
After your update:
If you want to manipulate lists easily you should go with Common Lisp. It is only 2 times slower that C in average (and actually faster in some things), it is great for list processing and it is multi-paradigm (imperative, functional and OO) - so you don't have to stick to functional-only programming. SBCL is a good Common Lisp to try first, IMO.
And don't get bothered by strange "lispy" things like parentheses. It is not only quite stupid to judge language by its syntax, rather than semantics, but also parentheses are one of the greatest strengths of LISP, because they eliminate differences between data and expressions and you can manipulate language itself to make it fit your needs.
Don't listen to people who advice C++/C#/Java. Java functional part is non-existant. C++ functional part is terrible. C# delegates makes me sick because of their complexity. They are not REAL multi-paradigm imperative/functional languages, they are imperative/OO languages that have some small functional bits, you can't do real functional programming in them.
C++ or alternatively C# and mono.
Honestly, to accomplish much in the world of software engineering, you will likely have to wrap your head around these languages you find distasteful. Java, C, C++, C#, etc. are likely to come up in a career that involves programming.
Looks like you've done some interesting work. I encourage you to push your technical skills harder. It will be worth the effort.
Alternatively, Python might be good, given your interests. You might find Smalltalk interesting, or even ATS.
For some ideas, look at the Language Shootout and analysis by Oscar Boykin. You have already discovered this, but comparing Ruby to C we see that Ruby is between 14 and 600 times slower (several tests are more than 100 times slower). He also points out that Python is faster than Ruby. The benchmarks for all languages is interesting.
Also interesting are benchmarks from Dan Corlan.
You might consider python; it supports writing modules in C or C++, so you can get it working in a high-level language, profile it, rework the algorithms, and if it still isn't fast enough, translate the hotspots to C or C++ for speed.
Consider Tcl, combined with C. Do the really compute-intensive stuff in C since that's what you know how to do, then use Tcl as the glue to combine the high level code with your C-based code.
I make this recommendation not because Tcl is necessarily the best language for the job (there really is no "best" for something like this) but because you'll learn a lot about the concept of combining the strengths of two different languages. It's an important technique that could serve you well in your career whether it's Tcl/C, Lua/C, Groovy/Java, Python/C, etc.
Python with pyrex or psyco may be a better fit? Probably not as fast as C, but you can see some significant speedups from regular Python.
If you want something that's "about as fast as Java," the obvious solution is JRuby.
If you install Netbeans (use the download button under the Ruby column), JRuby is the default interpreter. It doesn't get much easier!
If your problem is C's clunky loops, I'd suggest looking at Ada. It allows you to loop through a whole array with a simple statement like so:
for I in array_name'range loop
--'// Code goes here
end loop;
For AI projects, I'd also suggest you look into using Clips, which is a freely-available inference engine.
Rather than OCAML, you might consider F# -- it's source compatible with OCAML (or you can use a lighter weight syntax) and it supports actor-style concurrency with what it calls asynchronous workflows (which are really an almost-monad for applying asynchronous execution).
Not that -- as Scala shows -- you need to have actor style concurrency baked into the language, if you build it into a library. The rest is just syntactic sugar.
Learn C++ and familiarize yourself with its standard library. It won't be that hard to learn as you already 'speak' C, but keep in mind that C++ is not just a better C, it's another language with its own concepts and methods.
Why not Erlang?
It's not too much like the languages you already know, so you can learn new concepts
It has some interesting capabilities for multiprocessing
It's not out of academia. Erlang was a commercial language first.
There are at least two significant open source applications written in it: CouchDB and Wings3d
I believe going through C C++ and Java or .net then moving on from here to any one way java or .net , because c is more machine oriented and C++ and java will give you hands on with object oriented learning, then later on switching to python (to really appreciate the clean code than in C C++ and JAVA ).

What makes a language Object-Oriented?

Since debate without meaningful terms is meaningless, I figured I would point at the elephant in the room and ask: What exactly makes a language "object-oriented"? I'm not looking for a textbook answer here, but one based on your experiences with OO languages that work well in your domain, whatever it may be.
A related question that might help to answer first is: What is the archetype of object-oriented languages and why?
Definitions for Object-Orientation are of course a huge can of worms, but here are my 2 cents:
To me, Object-Orientation is all about objects that collaborate by sending messages. That is, to me, the single most important trait of an object-oriented language.
If I had to put up an ordered list of all the features that an object-oriented language must have, it would look like this:
Objects sending messages to other objects
Everything is an Object
Late Binding
Subtype Polymorphism
Inheritance or something similarly expressive, like Delegation
Encapsulation
Information Hiding
Abstraction
Obviously, this list is very controversial, since it excludes a great variety of languages that are widely regarded as object-oriented, such as Java, C# and C++, all of which violate points 1, 2 and 3. However, there is no doubt that those languages allow for object-oriented programming (but so does C) and even facilitate it (which C doesn't). So, I have come to call languages that satisfy those requirements "purely object-oriented".
As archetypical object-oriented languages I would name Self and Newspeak.
Both satisfy the above-mentioned requirements. Both are inspired by and successors to Smalltalk, and both actually manage to be "more OO" in some sense. The things that I like about Self and Newspeak are that both take the message sending paradigm to the extreme (Newspeak even more so than Self).
In Newspeak, everything is a message send. There are no instance variables, no fields, no attributes, no constants, no class names. They are all emulated by using getters and setters.
In Self, there are no classes, only objects. This emphasizes, what OO is really about: objects, not classes.
According to Booch, the following elements:
Major:
Abstraction
Encapsulation
Modularity
Hierarchy (Inheritance)
Minor:
Typing
Concurrency
Persistence
Basically Object Oriented really boils down to "message passing"
In a procedural language, I call a function like this :
f(x)
And the name f is probably bound to a particular block of code at compile time. (Unless this is a procedural language with higher order functions or pointers to functions, but lets ignore that possibility for a second.) So this line of code can only mean one unambiguous thing.
In an object oriented language I pass a message to an object, perhaps like this :
o.m(x)
In this case. m is not the name of a block of code, but a "method selector" and which block of code gets called actually depends on the object o in some way. This line of code is more ambiguous or general because it can mean different things in different situations, depending on o.
In the majority of OO languages, the object o has a "class", and the class determines which block of code is called. In a couple of OO languages (most famously, Javascript) o doesn't have a class, but has methods directly attached to it at runtime, or has inherited them from a prototype.
My demarcation is that neither classes nor inheritance are necessary for a language to be OO. But this polymorphic handling of messages is essential.
Although you can fake this with function pointers in say C, that's not sufficient for C to be called an OO language, because you're going to have to implement your own infrastructure. You can do that, and a OO style is possible, but the language hasn't given it to you.
It's not really the languages that are OO, it's the code.
It is possible to write object-oriented C code (with structs and even function pointer members, if you wish) and I have seen some pretty good examples of it. (Quake 2/3 SDK comes to mind.) It is also definitely possible to write procedural (i.e. non-OO) code in C++.
Given that, I'd say it's the language's support for writing good OO code that makes it an "Object Oriented Language." I would never bother with using function pointer members in structs in C, for example, for what would be ordinary member functions; therefore I will say that C is not an OO language.
(Expanding on this, one could say that Python is not object oriented, either, with the mandatory "self" reference on every step and constructors called init, whatnot; but that's a Religious Discussion.)
Smalltalk is usually considered the archetypal OO language, although Simula is often cited as the first OO language.
Current OO languages can be loosely categorized by which language they borrow the most concepts from:
Smalltalk-like: Ruby, Objective-C
Simula-like: C++, Object Pascal, Java, C#
I am happy to share this with you guys, it was quite interesting and helpful to me. This is an extract from a 1994 Rolling Stone interview where Steve (not a programmer) explains OOP in simple terms.
Jeff Goodell: Would you explain, in simple terms, exactly what object-oriented software is?
Steve Jobs: Objects are like people. They’re living, breathing things that have knowledge inside them about how to do things and have memory inside them so they can remember things. And rather than interacting with them at a very low level, you interact with them at a very high level of abstraction, like we’re doing right here.
Here’s an example: If I’m your laundry object, you can give me your dirty clothes and send me a message that says, “Can you get my clothes laundered, please.” I happen to know where the best laundry place in San Francisco is. And I speak English, and I have dollars in my pockets. So I go out and hail a taxicab and tell the driver to take me to this place in San Francisco. I go get your clothes laundered, I jump back in the cab, I get back here. I give you your clean clothes and say, “Here are your clean clothes.”
You have no idea how I did that. You have no knowledge of the laundry place. Maybe you speak French, and you can’t even hail a taxi. You can’t pay for one, you don’t have dollars in your pocket. Yet, I knew how to do all of that. And you didn’t have to know any of it. All that complexity was hidden inside of me, and we were able to interact at a very high level of abstraction. That’s what objects are. They encapsulate complexity, and the interfaces to that complexity are high level.
As far as I can tell, the main view of what makes a language "Object Oriented" is supporting the idea of grouping data, and methods that work on that data, which is generally achieved through classes, modules, inheritance, polymorphism, etc.
See this discussion for an overview of what people think (thought?) Object-Orientation means.
As for the "archetypal" OO language - that is indeed Smalltalk, as Kristopher pointed out.
Supports classes, methods, attributes, encapsulation, data hiding, inheritance, polymorphism, abstraction...?
Disregarding the theoretical implications, it seems to be
"Any language that has a keyword called 'class'" :-P
To further what aib said, I would say that a language isn't really object oriented unless the standard libraries that are available are object oriented. The biggest example of this is PHP. Although it supports all the standard object oriented concepts, the fact that such a large percentage of the standard libraries aren't object oriented means that it's almost impossible to write your code in an object oriented way.
It doesn't matter that they are introducing namespaces if all the standard libraries still require you to prefix all your function calls with stuff like mysql_ and pgsql_, when in a language that supported namespaces in the actual API, you could get rid of functions with mysql_ and have just a simple "include system.db.mysql.*" at the top of your file so that it would know where those things came from.
when you can make classes, it is object-oriented
for example : java is object-oriented, javascript is not, and c++ looks like some kind of "object-curious" language
In my experience, languages are not object-oriented, code is.
A few years ago I was writing a suite of programs in AppleScript, which doesn't really enforce any object-oriented features, when I started to grok OO. It's clumsy to write Objects in AppleScript, although it is possible to create classes, constructors, and so forth if you take the time to figure out how.
The language was the correct language for the domain: getting different programs on the Macintosh to work together to accomplish some automatic tasks based on input files. Taking the trouble to self-enforce an object-oriented style was the correct programming choice because it resulted in code that was easier to trouble-shoot, test, and understand.
The feature that I noticed the most in changing that code over from procedural to OO was encapsulation: both of properties and method calls.
Simples:(compare insurance character)
1-Polymorphism
2-Inheritance
3-Encapsulation
4-Re-use.
:)
Object: An object is a repository of data. For example, if MyList is a ShoppingList object, MyList might record your shopping list.
Class: A class is a type of object. Many objects of the same class might exist; for instance, MyList and YourList may both be ShoppingList objects.
Method: A procedure or function that operates on an object or a class. A method is associated with a particular class. For instance, addItem might be a method that adds an item to any ShoppingList object. Sometimes a method is associated with a family of classes. For instance, addItem might operate on any List, of which a ShoppingList is just one type.
Inheritance: A class may inherit properties from a more general class. For example, the ShoppingList class inherits from the List class the property of storing a sequence of items.
Polymorphism: The ability to have one method call work on several different classes of objects, even if those classes need different implementations of the method call. For example, one line of code might be able to call the "addItem" method on every kind of List, even though adding an item to a ShoppingList is completely different from adding an item to a ShoppingCart.
Object-Oriented: Each object knows its own class and which methods manipulate objects in that class. Each ShoppingList and each ShoppingCart knows which implementation of addItem applies to it.
In this list, the one thing that truly distinguishes object-oriented languages from procedural languages (C, Fortran, Basic, Pascal) is polymorphism.
Source: https://www.youtube.com/watch?v=mFPmKGIrQs4&list=PL-XXv-cvA_iAlnI-BQr9hjqADPBtujFJd
If a language is designed with the facilities specifically to support object-oriented programming(4 features) then it is an Object-oriented programming language.
You can program in an object-orientated style in more or less any language.It’s the code that is object-oriented not the language.
Examples of real object-oriented languages are Java, c#, Python, Ruby, C++.
Also, it's possible to have extensions to provide Object-Oriented features like PHP, Perl etc.
You can write an object-oriented code with C but it is not object-oriented prog. lang. It is not designed for that (that was the whole point of c++)
Archetype
The ability to express real-world scenarios in code.
foreach(House house in location.Houses)
{
foreach(Deliverable mail in new Mailbag(new Deliverable[]
{
GetLetters(),
GetPackages(),
GetAdvertisingJunk()
})
{
if(mail.AddressedTo(house))
{
house.Deliver(mail);
}
}
}
-
foreach(Deliverable myMail in GetMail())
{
IReadable readable = myMail as IReadable;
if ( readable != null )
{
Console.WriteLine(readable.Text);
}
}
Why?
To help us understand this more easily. It makes better sense in our heads and if implemented correctly makes the code more efficient, re-usable and reduces repetition.
To achieve this you need:
Pointers/References to ensure that this == this and this != that.
Classes to point to (e.g. Arm) that store data (int hairyness) and operations (Throw(IThrowable))
Polymorphism (Inheritance and/or Interfaces) to treat specific objects in a generic fashion so you can read books as well as graffiti on the wall (both implement IReadable)
Encapsulation because an apple doesn't expose an Atoms[] property