Is a Treeset a language agnostic data structure?

I'm just trying to research lesser known data structures, and when I search for Treeset only Java related resources come up.
Should non-Java developers be expected to know what it is? Is it an actual data structure?

All data structures are language agnostic. They might happen to be implemented in one or more languages, but that's irrelevant.
It's kind of like asking "Is the Eiffel Tower a material-agnostic building?". Well, the Eiffel Tower is wrought iron, but nothing stops you from building one out of popsicle sticks.
java.util.TreeSet<E> is specific to Java (or JVM languages that inter-operate with Java), yes, but nothing stops you from implementing the same data structure in other languages.
Though different languages might have different names for the same thing.
E.g. what Java calls an ArrayList is an Array in Swift, a list in Python, a std::vector in C++, or a Vec (std::vec::Vec) in Rust.
Should non-Java developers be expected to know what it is?
Yeah, probably. Tree-based alternatives to HashMaps and HashSets have their niche uses, and developers would be expected to at least be aware of their general pros and cons and approximately how they work.
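To make the "same data structure, different language" point concrete, here is a minimal sketch of a sorted set in Python. The class name SortedSet and its methods are made up for this illustration. Java's TreeSet is backed by a red-black tree; this sketch swaps in a sorted list plus binary search, so insertion is O(n) rather than O(log n), but the observable behaviour (unique elements, ordered iteration, nearest-element queries) is the same idea.

```python
import bisect


class SortedSet:
    """Hypothetical sorted set: unique elements kept in ascending order."""

    def __init__(self, items=()):
        self._items = []
        for item in items:
            self.add(item)

    def add(self, item):
        # Binary search for the insertion point; skip duplicates.
        i = bisect.bisect_left(self._items, item)
        if i == len(self._items) or self._items[i] != item:
            self._items.insert(i, item)

    def __contains__(self, item):
        i = bisect.bisect_left(self._items, item)
        return i < len(self._items) and self._items[i] == item

    def __iter__(self):
        # Iteration is always in sorted order, unlike a hash set.
        return iter(self._items)

    def ceiling(self, item):
        """Smallest element >= item, or None if there is none."""
        i = bisect.bisect_left(self._items, item)
        return self._items[i] if i < len(self._items) else None

    def floor(self, item):
        """Largest element <= item, or None if there is none."""
        i = bisect.bisect_right(self._items, item)
        return self._items[i - 1] if i > 0 else None


s = SortedSet([5, 1, 9, 3, 5])
print(list(s))       # [1, 3, 5, 9]
print(s.ceiling(4))  # 5
print(s.floor(4))    # 3
```

The ordered iteration and the floor/ceiling queries are the niche where a tree-based set beats a hash-based one.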

Related

What are important languages to learn to understand different approaches and concepts? [closed]

Closed 11 years ago.
Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
When all you have is a pair of bolt cutters and a bottle of vodka, everything looks like the lock on the door of Wolf Blitzer's boathouse. (Replace that with a hammer and a nail if you don't read xkcd)
I currently program Clojure, Python, Java and PHP, so I am familiar with the C and LISP syntax as well as the whitespace thing. I know imperative, functional, immutable, OOP and a couple type systems and other things. Now I want more!
What are languages that take a different approach and would be useful for either practical tool choosing or theoretical understanding?
I don't feel like learning another functional language (Haskell) or another imperative OOP language (Ruby), nor do I want to practice impractical fun languages like Brainfuck.
One very interesting thing I found myself is homoiconic stack-based languages like Factor.
Only when I feel I understand most concepts and have answers to all my questions, I want to start thinking about my own toy language to contain all my personal preferences.
Matters of practicality are highly subjective, so I will simply say that learning different language paradigms will only serve to make you a better programmer. What is more practical than that?
Functional, Haskell - I know you said that you didn't want to, but you should really really reconsider. You've gotten some functional exposure with Clojure and even Python, but you've not experienced it to its fullest without Haskell. If you're really against Haskell then good compromises are either ML or OCaml.
Declarative, Datalog - Many people would recommend Prolog in this slot, but I think Datalog is a cleaner example of a declarative language.
Array, J - I've only just discovered J, but I find it to be a stunning language. It will twist your mind into a pretzel. You will thank J for that.
Stack, Factor/Forth - Factor is very powerful and I plan to dig into it ASAP. Forth is the grand-daddy of the Stack languages, and as an added bonus it's simple to implement yourself. There is something to be said about learning through implementation.
Dataflow, Oz - I think the influence of Oz is on the upswing and will only continue to grow in the future.
Prototype-based, JavaScript / Io / Self - Self is the grand-daddy and highly influential on every prototype-based language. This is not the same as class-based OOP and shouldn't be treated as such. Many people come to a prototype language and create an ad-hoc class system, but if your goal is to expand your mind, then I think that is a mistake. Use the language to its full capacity. Read Organizing Programs without Classes for ideas.
Expert System, CLIPS - I always recommend this. If you know Prolog then you will likely have the upper-hand in getting up to speed, but it's a very different language.
Frink - Frink is a general purpose language, but it's famous for its system of unit conversions. I find this language to be very inspiring in its unrelenting drive to be the best at what it does. Plus... it's really fun!
Functional+Optional Types, Qi - You say you have experience with some type systems, but do you have experience with "skinnable" type systems? No one has... but they should. Qi is like Lisp in many ways, but its type system will blow your mind.
Actors+Fault-tolerance, Erlang - Erlang's process model gets a lot of the buzz, but its fault-tolerance and hot-code-swapping mechanisms are game-changing. You will not learn much about FP that you wouldn't learn with Clojure, but its FT features will make you wonder why more languages can't seem to get this right.
Enjoy!
What about Prolog (for unification/backtracking etc), Smalltalk (for "everything's a message"), Forth (reverse polish, threaded interpreters etc), Scheme (continuations)?
Not a language, but the Art of the Metaobject Protocol is mind-bending stuff
I second Haskell. Don't think "I know a Lisp, so I know functional programming". Ever heard of type classes? Algebraic data types? Monads? "Modern" (more or less - at least not 50 years old ;) ) functional languages, especially Haskell, have explored a plethora of very powerful, useful new concepts. Type classes add ad-hoc polymorphism, yet type inference (another thing the languages you already know don't have) still works like a charm. Algebraic data types are simply awesome, especially for modelling tree-like data structures, but they work fine for enums or simple records, too. And monads... well, let's just say people use them to make exceptions, I/O, parsers, list comprehensions and much more - in purely functional ways!
Also, the whole topic is deep enough to keep one busy for years ;)
I currently program Clojure, Python, Java and PHP [...] What are languages that take a different approach and would be useful for either practical tool choosing or theoretical understanding?
C
There's a lot of C code lying around---it's definitely practical. If you learn C++ too, there's a whole lot more code around (and the leap is short once you know C and Java).
It also gives you (or forces you to have) a great understanding of some theoretical issues; for instance, each running program lives in one big byte array, in some sense (4 GB on a 32-bit system). Pointers in C are really just indices into this array---they're just a different kind of integer. It's no different in Java, Python, or PHP, except hidden beneath a surface layer.
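The "pointers are just indices into a byte array" idea can be sketched even without C. The following is only an illustration under my own assumptions: a 64-byte bytearray stands in for the address space, and the helper names store_int and load_int are invented here.

```python
import struct

# Stand-in for the process address space (a real one is vastly larger).
memory = bytearray(64)


def store_int(addr, value):
    """Write a 4-byte little-endian int at index addr, like `*p = value` in C."""
    struct.pack_into("<i", memory, addr, value)


def load_int(addr):
    """Read a 4-byte little-endian int from index addr, like `*p` in C."""
    return struct.unpack_from("<i", memory, addr)[0]


p = 16            # a "pointer" is nothing more than an integer index
store_int(p, 42)

q = p + 4         # "pointer arithmetic": the next int lives 4 bytes further on
store_int(q, 7)

print(load_int(p), load_int(q))  # 42 7
print(memory[p])                 # 42 (lowest byte of the little-endian int)
```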
Also, you can write object-oriented code in C, you just have to be a bit manual about vtables and such. Simon Tatham's Portable Puzzle Collection is a great example of fairly accessible object-oriented C code; it's also fairly well designed and well worth a read to a beginner/intermediate C programmer. This is what happens in Haskell too---type classes are in some sense "just another vtable".
Another great thing about C: engaging in Q&A with skilled C programmers will get you a lot of answers that explain C in terms of lower-level constructs, which builds your closer-to-the-iron knowledge base.
I may be missing OP's point---I think I am, judging by the other answers---but I think it might be a useful answer to other people who have a similar question and read this thread.
From Peter Norvig's site:
"Learn at least a half dozen programming languages. Include one language that supports class abstractions (like Java or C++), one that supports functional abstraction (like Lisp or ML), one that supports syntactic abstraction (like Lisp), one that supports declarative specifications (like Prolog or C++ templates), one that supports coroutines (like Icon or Scheme), and one that supports parallelism (like Sisal). "
http://norvig.com/21-days.html
I'm amazed that after 6 months and hundreds of votes, no one has mentioned SQL ...
In the types as theorems / advanced type systems: Coq ( I think Agda comes in this category too).
Coq is a proof assistant embedded into a functional programming language.
You can write mathematical proofs and Coq helps to build a solution.
You can write functions and prove properties about them.
It has dependent types, that alone blew my mind. A simple example:
concatenate: forall (A:Set)(n m:nat), (array A m)->(array A n)->(array A (n+m))
is the signature of a function that concatenates two arrays of size n and m of elements of A and returns an array of size (n+m). It won't compile if the function doesn't return that!
It is based on the calculus of inductive constructions, and it has a solid theory behind it.
I'm not smart enough to understand it all, but I think it is worth taking a look, especially if you lean towards type theory.
EDIT: I need to mention: you write a function in Coq and then you can PROVE it is correct for any input, that is amazing!
One language I am interested in because it has a very different point of view (including a new vocabulary to define the language elements and a radically different syntax) is J. Haskell would be the obvious choice for me, although it is a functional language, because its type system and other unique features open your mind and make you rethink your previous knowledge of (functional) programming.
Just as fogus suggested in his list, I also advise you to look at the language Oz/Mozart.
It supports many paradigms and is mainly targeted at concurrency/multi-agent programming.
Concerning concurrency and distributed computation, the equivalent of the lambda calculus (which is behind functional programming) is called the Pi calculus.
I have only just begun to look at some implementations of the Pi calculus, but they have already broadened my conception of computing.
Pict
Nomadic Pict
FunLoft. (this one is pretty recent, conceived at INRIA)
Dataflow programming, aka flow-based programming, is a good step ahead on the road. Some buzzwords: parallel processing, rapid prototyping, visual programming (not as bad as it first sounds).
Wikipedia's articles are good:
In computer science, flow-based programming (FBP) is a programming paradigm that defines applications as networks of "black box" processes, which exchange data across predefined connections by message passing, where the connections are specified externally to the processes. These black box processes can be reconnected endlessly to form different applications without having to be changed internally. FBP is thus naturally component-oriented.
http://en.wikipedia.org/wiki/Flow-based_programming
http://en.wikipedia.org/wiki/Dataflow_programming
http://en.wikipedia.org/wiki/Actor_model
Read JPM's book: http://jpaulmorrison.com/fbp/
(We've written a simple implementation in C++ for home automation purposes, and we're very happy with it. Documentation is under construction.)
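As a rough sketch of the "black box processes wired together externally" idea from the quoted definition, Python generators can play the role of processes. This is not the C++ implementation mentioned above and not a real FBP runtime (those typically have bounded buffers and concurrently running processes); the names source, upper, keep_long and connect are invented for the illustration.

```python
def source(items):
    """Process with no input port: emits each item downstream."""
    for item in items:
        yield item


def upper(stream):
    """Black box: upper-cases every packet it receives."""
    for item in stream:
        yield item.upper()


def keep_long(stream, min_len=4):
    """Black box: passes on only packets of at least min_len characters."""
    for item in stream:
        if len(item) >= min_len:
            yield item


def connect(stream, *processes):
    """The wiring lives here, outside the processes themselves."""
    for process in processes:
        stream = process(stream)
    return stream


words = ["oz", "forth", "j", "erlang"]

# Two different applications built from the same unchanged components.
net_a = list(connect(source(words), upper, keep_long))
net_b = list(connect(source(words), upper))

print(net_a)  # ['FORTH', 'ERLANG']
print(net_b)  # ['OZ', 'FORTH', 'J', 'ERLANG']
```

None of the process functions know what they are connected to; only connect does, which is the component-oriented property the definition describes.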
You've learned a lot of languages. Now is the time to focus on one language, and master it.
perhaps you might want to try LabView for its visual programming, although it's for engineering purposes.
nevertheless, you seem pretty interested in all that's out there, hence the suggestion
also, you could try the android appinventor for visually building stuff
Bruce A. Tate, taking a page from The Pragmatic Programmer, wrote a book on exactly that:
Seven Languages in Seven Weeks: A Pragmatic Guide to Learning Programming Languages
In the book, he covers Clojure, Haskell, Io, Prolog, Scala, Erlang, and Ruby.
Mercury: http://www.mercury.csse.unimelb.edu.au/
It's a typed Prolog, with uniqueness types and modes (i.e. specifying that the predicate append(X,Y,Z) meaning X appended to Y is Z yields one Z given an X and Y, but can yield multiple X/Ys for a given Z). Also, no cut or other extra-logical predicates.
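To illustrate what "modes" mean for append, here is a sketch in Python rather than Mercury. In a logic language a single predicate covers both directions; in Python the two directions have to be written as two separate functions, whose names are made up here.

```python
def append_forward(xs, ys):
    """Mode (in, in, out): given X and Y there is exactly one Z."""
    return xs + ys


def append_backward(zs):
    """Mode (out, out, in): given Z, yield every (X, Y) with X + Y == Z."""
    for i in range(len(zs) + 1):
        yield zs[:i], zs[i:]


print(append_forward([1, 2], [3]))
# [1, 2, 3]

print(list(append_backward([1, 2])))
# [([], [1, 2]), ([1], [2]), ([1, 2], [])]
```

The mode declaration is what tells the compiler which of these behaviours to generate and how many answers to expect from each.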
If you will, it's to Prolog as Haskell is to Lisp.
Programming does not cover the task of programmers.
New things are always interesting, but there is some very cool old stuff.
The first database system for me was dBaseIII; I spent about a month writing small examples (dBase/FoxPro/Clipper is a table-based db with indexes). Then, at my first workplace, I met MUMPS, and I got a headache. I was young and fresh-brained, but it took 2 weeks to understand the MUMPS database model. There was a moment, like in comics: after 2 weeks, a switch was flipped and the bulb lit up in my mind. MUMPS is natural, low level, and very, very fast. (It's an unbalanced, unformalized B-tree without types.) Today's trends show the way back to it: NoSQL, key-value db, multidimensional db - so there are only some steps left, and we reach MUMPS.
Here's a presentation about MUMPS's advantages: http://www.slideshare.net/george.james/mumps-the-internet-scale-database-presentation
A short doc on hierarchical db: http://www.cs.pitt.edu/~chang/156/14hier.html
An introduction to MUMPS globals (in MUMPS, local variables, or locals for short, are the memory variables, and global variables, or globals for short, are the "db variables"; setting a global variable goes to disk immediately):
http://gradvs1.mgateway.com/download/extreme1.pdf (PDF)
Say you want to write a love poem...
Instead of using a hammer just because there's one already in your hand, learn the proper tools for the task: learn to speak French.
Once you've reached near-native speaking level, you're ready to start your poem.
While learning new languages on an academic level is an interesting hobby, IMHO you can't really learn to use one until you try to apply it to a real-world problem. So, rather than looking for a new language to learn, in your place I'd first look for new things to build, and only then look for the right language to use for that one specific project. First pick the problem, then the tool, not the other way around.
For anyone who hasn't been around since the mid 80's, I'd suggest learning 8-bit BASIC. It's very low-level, very primitive and it's an interesting exercise to program around its holes.
On the same line, I'd pick an HP-41C series calculator (or emulator, although nothing beats real hardware). It's hard to wrap your brain around it, but well worth it. A TI-57 will do, but will be a completely different experience. If you manage to solve second degree equations on a TI-55, you'll be considered a master (it had no conditionals and no branches except a RST, that jumped the program back to step 0).
And last, I'd pick FORTH (it was mentioned before). It has a nice "build your language" Lisp-ish thing, but is much more bare metal. It will teach you why Rails is interesting and when DSLs make sense, and you'll get a glimpse of what your non-RPN calculator is thinking while you type.
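For readers who have never used an RPN calculator or a stack language, a toy evaluator gives the flavour. This is only a sketch: it handles four arithmetic operators and a dup word, and is nowhere near a real Forth (no dictionary, no user-defined words). The name eval_rpn is made up.

```python
import operator

# The only operators this toy understands.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def eval_rpn(source):
    """Evaluate a whitespace-separated RPN program and return the final stack."""
    stack = []
    for token in source.split():
        if token in OPS:
            b = stack.pop()          # top of stack is the right operand
            a = stack.pop()
            stack.append(OPS[token](a, b))
        elif token == "dup":
            stack.append(stack[-1])  # duplicate the top of the stack
        else:
            stack.append(float(token))
    return stack


print(eval_rpn("3 4 + 2 *"))  # [14.0]
print(eval_rpn("5 dup *"))    # [25.0]
```

There are no parentheses and no precedence rules: the order of the tokens is the order of evaluation.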
PostScript. It is a rather interesting language as it's stack based, and it's quite practical once you want to put things on paper and you want either to get it done or to troubleshoot why it isn't getting done.
Erlang. The intrinsic parallelism gives it a rather unusual feel and you can again learn useful things from that. I'm not so sure about practicality, but it can be useful for some fast prototyping tasks and highly redundant systems.
Try programming GPUs - either CUDA or OpenCL. It's just C/C++ extensions, but the mental model of the architecture is again completely different from the classic approach, and it definitely gets practical once you need to get some real number crunching done.
Erlang, Forth and some embedded work with assembly language. Really; buy an Arduino kit or something similar, and create a polyphonic beep in assembly. You'll really learn something.
There's also anic:
https://code.google.com/p/anic/
From its site:
Faster than C, Safer than Java, Simpler than *sh
anic is the reference implementation compiler for the experimental, high-performance, implicitly parallel, deadlock-free general-purpose dataflow programming language ANI.
It doesn't seem to be under active development anymore, but it seems to have some interesting concepts (and that, after all, is what you seem to be after).
While not meeting your requirement of "different" - I'd wager that Fantom is a language that a professional programmer should look at. By their own admission, the authors of Fantom call it a boring language. It merely shores up the most common use cases of Java and C#, with some borrowed closure syntax from Ruby and similar newer languages.
And yet it manages to have its own bootstrapped compiler, provide a platform that has a drop in install with no external dependencies, gets packages right - and works on Java, C# and now the Web (via js).
It may not widen your horizons in terms of new ways of programming, but it will certainly show you better ways of programming.
One thing that I see missing from the other answers: languages based on term-rewriting.
You could take a look at Pure - http://code.google.com/p/pure-lang/ .
Mathematica is also rewriting based, although it's not so easy to figure out what's going on, as it's rather closed.
APL, Forth and Assembly.
Have some fun. Pick up a Lego Mindstorm robot kit and CMU's RobotC and write some robotics code. Things happen when you write code that has to "get dirty" and interact with the real world that you cannot possibly learn in any other way. Yes, same language, but a very different perspective.

Is there a universal model for languages?

Many programming languages share generic and even fairly universal features. For example, if you compared Java, VB6, .NET, PHP, Python, then you would find common functions such as control structures, numeric and string manipulation, etc.
What has been done to define these features at a meta-language (or language-agnostic) level?
UML offers a descriptive reference of software in every aspect, but the real-world focus seems to be data processes. Is UML relevant?
I'm not asking "Why we don't have a single language that replaces the current plethora." We need many different tools (at least in this eon).
I'm not asking that all languages fit a template -- assembly vs. compiled languages are different enough to make that unfeasible (and some folks call HTML a language, though I wouldn't). Any attempt would start with a properly narrow scope. In line with this, I wouldn't expect the model to cover even a small selection with full validity.
I would expect, however, that such a model could be used to transpose from one language to another (with limited goals -- think gist translation).
There have been many attempts at this, but none have been very successful. The earliest I'm aware of is UNCOL more than 50 years ago.
You've given a list of languages that have a lot in common because they're pretty similar -- they're all procedural languages with common roots and some OO extensions thrown in, so that's not too surprising. If you start looking at different languages like LISP, Haskell, Erlang, Prolog, or even SQL, you start seeing very different things.
What you're describing sounds like the formal semantics of programming languages. There are a variety of approaches and each will give a way to formally specify the meaning of a program in some programming language. In some cases, this specification is essentially a translation into another language such as lambda calculus, or compilation for a formally specified abstract machine such as SECD.
There is so much work here it's hard to pick a specific reference. But I hope I've given you some useful keywords to continue your search.
UML is typically used to define algorithms/code in simpler terms before moving on to real code.
To answer what I am guessing to be your question: there is already a defined set of required parts of languages (while, for, if, else, ...). Will this ever be set as a standard, or made into a base library that is used by all languages? No, because the different developers of languages like to do it themselves.
I think the closest you can get to this without loss of generality is a Turing machine, which is not very useful for practical purposes. But if you allow Turing machine languages to be "labeled" and reused, you could build up the concepts you need, working from low- to high-level.
I think that MOF is the universal language.
You can, for example, create UML diagrams from MOF via a UML metamodel. If you save this metamodel information into XMI, then you can save whatever information you need, and even more than in any language. XMI's semantics are so rich that there is no limit to its use. If you map UML to XMI on top of a metamodel that is synchronized live with MOF, then this is, for me, the universal language.
The author of Pattern Calculus seems to propose such a universal model. I expect that it will turn out to be just as useful as previous attempts to define a universal model, that is to say, good in parts but not the last word.

Performance advantages of using methods inside of classes versus data structures with libraries of functions?

Basically, is the only advantage of object oriented languages the improved understanding of a program's purpose?
Do the compilers of object oriented languages break apart the objects into structures and function libraries?
Basically, yes. The only advantage is improved understanding of code.
For some languages the OO version is the same as the non-OO version after compilation. Perl for example. For the majority of cases the OO version is much slower than the non-OO version. With very rare exceptions, non-OO languages are always faster than OO languages.
But in general, most experienced programmers will tell you not to worry about the performance differences between OO and non-OO languages (or Lispers will tell you not to worry about the performance difference between procedural and functional languages). This is because you should never, ever, ever, underestimate the importance of understanding code.
These days we rarely talk about it anymore because we've gotten used to using very high level languages - be it OO or functional or multi-paradigm or metaprogramming. But back in the 80s and 90s there was what was then known as the software crisis. What was the software crisis? It's basically the fact that most software projects were never completed!
The software crisis affected all sectors of the industry: from military radar systems, to games to commercial operating systems. The consumers called them vaporware. They were projects that were too ambitious.
But these days there are lots of very ambitious and impressive projects that manage to reach at least beta versions (and for web2.0 beta is good enough for public consumption). Part of the reason is that we now understand requirements engineering better and we also understand the process of software development better. But part of it is also because we have better tools to actually understand what we're doing. And OO is part of that toolset.
Yes, method code is central to the class definition and each instance method accepts an implicit this pointer to the data as its first argument. If you disassemble an instance method call you will see this.
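Python happens to make the implicit receiver visible, so it works as a small illustration of the point above. The Account class and the free-standing deposit function are invented for this sketch; it shows the calling convention only and says nothing about relative performance.

```python
class Account:
    def __init__(self, balance):
        self.balance = balance

    def deposit(self, amount):
        # `self` plays the role of the implicit `this` pointer.
        self.balance += amount
        return self.balance


def deposit(account, amount):
    """The 'data structure plus library function' version of the same thing."""
    account["balance"] += amount
    return account["balance"]


acct = Account(100)
acct.deposit(50)           # method-call syntax
Account.deposit(acct, 50)  # the same function, receiver passed explicitly
print(acct.balance)        # 200

record = {"balance": 100}
deposit(record, 50)
deposit(record, 50)
print(record["balance"])   # 200
```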
Here are a couple of links to compare speed, first is comparing C/C++, please read the entire article:
http://unthought.net/c++/c_vs_c++.html
To compare Python, Java, C++, PHP and other languages:
http://blog.dhananjaynene.com/2008/07/performance-comparison-c-java-python-ruby-jython-jruby-groovy/
But, to answer your question, the main advantage of OO is that for many problems it is the best way to model the solution, as the model naturally fits into objects. But if you try to force it to work where it is not a good fit, you will end up with code that is harder to understand.
There are various language paradigms as there are many different types of problems, and you should pick the language type that best models the solution. For example, I would not want to write an OS in C++ as it doesn't seem to fit well in OO methodologies, but I would also not want to write a car racing game in C, as it would make more sense to have objects.
Depending on the language and compiler, you may see the application translated to C first, but most are not, and some languages are interpreted rather than compiled.
For example, the original C++ compiler (Cfront) translated C++ to C, though modern C++ compilers generate machine code directly; Java does not compile to C, and neither do the .NET languages. PHP is generally interpreted, though it is possible to compile it (though I have never tried it). One compiler is:
http://www.phpcompiler.org/

Handling amorphous subsystems in formal software design

People like Alexander Stepanov and Sean Parent advocate a formal and abstract approach to software design.
The idea is to break complex systems down into a directed acyclic graph and hide cyclic behaviour in nodes representing that behaviour.
Parent gave presentations at boost-con and google (sheets from boost-con, p.24 introduces the approach, there is also a video of the google talk).
While I like the approach and think it's a necessary development, I have a problem imagining how to handle subsystems with amorphous behaviour.
Imagine for example a common pattern for state-machines: using an interface which all states support and having different behaviour in concrete implementations for the states.
How would one solve that?
Note that I am just looking for an abstract approach.
I can think of hiding that behaviour behind a node and defining different sub-DAGs for the states, but that complicates the design considerably if you want to influence the behaviour of the main DAG from a sub-DAG.
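For reference, the state-machine pattern mentioned in the question (one interface, a different concrete implementation per state) can be sketched as follows. The class names and events are hypothetical, and this only pins down what "amorphous behaviour" refers to; it is not a proposed answer to the DAG question.

```python
class State:
    """Interface that every state supports."""

    def on_event(self, event):
        raise NotImplementedError


class Idle(State):
    def on_event(self, event):
        # Only "start" leaves the idle state; everything else is ignored.
        return Running() if event == "start" else self


class Running(State):
    def on_event(self, event):
        # Only "stop" leaves the running state.
        return Idle() if event == "stop" else self


class Machine:
    """The node whose behaviour changes shape depending on its current state."""

    def __init__(self):
        self.state = Idle()

    def dispatch(self, event):
        self.state = self.state.on_event(event)
        return type(self.state).__name__


m = Machine()
trace = [m.dispatch(e) for e in ["start", "start", "stop"]]
print(trace)  # ['Running', 'Running', 'Idle']
```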
Your question is not clear. Define amorphous subsystems.
You are "just looking for an abstract approach" but then you seem to want details about an implementation in a conventional programming language ("common pattern for state-machines"). So, what are you asking for? How to implement nested finite state-machines?
Some more detail will help the conversation.
For a real abstract approach, look at something like Stream X-Machines:
... The X-machine model is structurally the same as the finite state machine, except that the symbols used to label the machine's transitions denote relations of type X→X. ...
The Stream X-Machine differs from Eilenberg's model, in that the fundamental data type X = Out* × Mem × In*, where In* is an input sequence, Out* is an output sequence, and Mem is the (rest of the) memory.
The advantage of this model is that it allows a system to be driven, one step at a time, through its states and transitions, while observing the outputs at each step. These are witness values, that guarantee that particular functions were executed on each step. As a result, complex software systems may be decomposed into a hierarchy of Stream X-Machines, designed in a top-down way and tested in a bottom-up way. This divide-and-conquer approach to design and testing is backed by Florentin Ipate's proof of correct integration, which proves how testing the layered machines independently is equivalent to testing the composed system. ...
But I don't see how the presentation is related to this. He seems to speak about a quite mainstream approach to programming, nothing similar to X-Machines. Anyway, the presentation is quite confusing and I have no time to see the video right now.
First impression of the talk, reading the slides only
The author touches haphazardly on numerous fields/problems/solutions, apparently without recognizing it: from Peopleware (for example Psychology of programming), to Software Engineering (for example software product lines), to various programming techniques.
How the various parts are linked and what exactly he is advocating is not clear at all (I'm accustomed to just reading slides and they are usually consequential):
Dataflow programming?
Constraints solving for User Interfaces? For practical implementations, see Garnet for Common Lisp, Amulet/OpenAmulet for C++.
What advantages does this "new" concept-based generic programming give us with respect to well-known approaches (for example, tools based on Hoare logic pre/post conditions and invariants or, better, Hoare's Communicating Sequential Processes (CSP) or Hehner's Practical Theory of Programming or some programming language with a sophisticated type system like ATS, Qi or Epigram and so on)? It seems to me that introducing "concepts" - which, as-is, are specific to C++ - is no simpler than using the alternatives. Is it just about jargon and "politics"? (Finally formal methods... but disguised).
Why organize program modules as a DAG and not as a tree, like David Parnas advocated decades ago in Designing software for ease of extension and contraction? (here a directly accessible .pdf and here slides from a lecture). The work on X-Machines probably is an answer to this question (going even beyond DAGs), but, again, the author seems to speak about a quite conventional program development regime in which Parnas' approach is the only sensible one.
If/when I watch the video, I will update this answer.

What's the point of OOP?

Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
As far as I can tell, in spite of the countless millions or billions spent on OOP education, languages, and tools, OOP has not improved developer productivity or software reliability, nor has it reduced development costs. Few people use OOP in any rigorous sense (few people adhere to or understand principles such as LSP); there seems to be little uniformity or consistency to the approaches that people take to modelling problem domains. All too often, the class is used simply for its syntactic sugar; it puts the functions for a record type into their own little namespace.
I've written a large amount of code for a wide variety of applications. Although there have been places where true substitutable subtyping played a valuable role in the application, these have been pretty exceptional. In general, though much lip service is given to talk of "re-use" the reality is that unless a piece of code does exactly what you want it to do, there's very little cost-effective "re-use". It's extremely hard to design classes to be extensible in the right way, and so the cost of extension is normally so great that "re-use" simply isn't worthwhile.
In many regards, this doesn't surprise me. The real world isn't "OO", and the idea implicit in OO--that we can model things with some class taxonomy--seems to me very fundamentally flawed (I can sit on a table, a tree stump, a car bonnet, someone's lap--but not one of those is-a chair). Even if we move to more abstract domains, OO modelling is often difficult, counterintuitive, and ultimately unhelpful (consider the classic examples of circles/ellipses or squares/rectangles).
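The squares/rectangles example mentioned above can be made concrete with a short sketch. The classes and the widen function are hypothetical; the point is only that a subclass which looks like a natural is-a relationship can break code written against the base class.

```python
class Rectangle:
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def set_width(self, width):
        self.width = width

    def area(self):
        return self.width * self.height


class Square(Rectangle):
    """Every square is-a rectangle, mathematically speaking."""

    def __init__(self, side):
        super().__init__(side, side)

    def set_width(self, width):
        # Keeping the square a square silently changes the height too.
        self.width = width
        self.height = width


def widen(rect):
    """Written against Rectangle: assumes set_width leaves the height alone."""
    old_height = rect.height
    rect.set_width(10)
    return rect.area() == 10 * old_height


print(widen(Rectangle(2, 3)))  # True
print(widen(Square(3)))        # False
```

widen is correct for every Rectangle and wrong for a Square, so Square is not substitutable for Rectangle even though the taxonomy says it should be.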
So what am I missing here? Where's the value of OOP, and why has all the time and money failed to make software any better?
The real world isn't "OO", and the idea implicit in OO--that we can model things with some class taxonomy--seems to me very fundamentally flawed
While this is true and has been observed by other people (take Stepanov, inventor of the STL), the rest is nonsense. OOP may be flawed and it certainly is no silver bullet but it makes large-scale applications much simpler because it's a great way to reduce dependencies. Of course, this is only true for “good” OOP design. Sloppy design won't give any advantage. But good, decoupled design can be modelled very well using OOP and not well using other techniques.
There are much better, more universal models (Haskell's type model comes to mind) but these are also often more complicated and/or difficult to implement efficiently. OOP is a good trade-off between extremes.
OOP isn't about creating re-usable classes, it's about creating Usable classes.
All too often, the class is used simply for its syntactic sugar; it puts the functions for a record type into their own little namespace.
Yes, I find this to be too prevalent as well. This is not Object Oriented Programming. It's Object Based Programming and data-centric programming. In my 10 years of working with OO languages, I see people mostly doing Object Based Programming. OBP breaks down very quickly IMHO since you are essentially getting the worst of both worlds: 1) procedural programming without adhering to proven structured programming methodology and 2) OOP without adhering to proven OOP methodology.
OOP done right is a beautiful thing. It makes very difficult problems easy to solve, and to the uninitiated (not trying to sound pompous there), it can almost seem like magic. That being said, OOP is just one tool in the toolbox of programming methodologies. It is not the be all end all methodology. It just happens to suit large business applications well.
Most developers who work in OOP languages are utilizing examples of OOP done right in the frameworks and types that they use day-to-day, but they just aren't aware of it. Here are some very simple examples: ADO.NET, Hibernate/NHibernate, Logging Frameworks, various language collection types, the ASP.NET stack, The JSP stack etc... These are all things that heavily rely on OOP in their codebases.
Reuse shouldn't be a goal of OOP - or any other paradigm for that matter.
Reuse is a side-effect of a good design and the proper level of abstraction. Code achieves reuse by doing something useful, but not doing so much as to make it inflexible. It does not matter whether the code is OO or not - we reuse what works and is not trivial to do ourselves. That's pragmatism.
The thought of OO as a new way to get to reuse through inheritance is fundamentally flawed. As you note the LSP violations abound. Instead, OO is properly thought of as a method of managing the complexity of a problem domain. The goal is maintainability of a system over time. The primary tool for achieving this is the separation of public interface from a private implementation. This allows us to have rules like "This should only be modified using ..." enforced by the compiler, rather than code review.
Using this, I'm sure you will agree, allows us to create and maintain hugely complex systems. There is lots of value in that, and it is not easy to do in other paradigms.
Verging on religious but I would say that you're painting an overly grim picture of the state of modern OOP. I would argue that it actually has reduced costs, made large software projects manageable, and so forth. That doesn't mean it's solved the fundamental problem of software messiness, and it doesn't mean the average developer is an OOP expert. But the modularization of function into object-components has certainly reduced the amount of spaghetti code out there in the world.
I can think of dozens of libraries off the top of my head which are beautifully reusable and which have saved time and money that can never be calculated.
But to the extent that OOP has been a waste of time, I'd say it's because of lack of programmer training, compounded by the steep learning curve of learning a language specific OOP mapping. Some people "get" OOP and others never will.
There's no empirical evidence that suggests that object orientation is a more natural way for people to think about the world. There's some work in the field of psychology of programming that shows that OO is not somehow more fitting than other approaches.
Object-oriented representations do not appear to be universally more usable or less usable.
It is not enough to simply adopt OO methods and require developers to use such methods, because that might have a negative impact on developer productivity, as well as the quality of systems developed.
Which is from "On the Usability of OO Representations" from Communications of the ACM Oct. 2000. The article mainly compares OO against the process-oriented approach. There are lots of studies of how people who work with the OO method "think" (Int. J. of Human-Computer Studies 2001, issue 54, or Human-Computer Interaction 1995, vol. 10 has a whole theme on OO studies), and from what I read, there's nothing to indicate some kind of naturalness to the OO approach that makes it better suited than a more traditional procedural approach.
I think the use of opaque context objects (HANDLEs in Win32, FILE*s in C, to name two well-known examples--hell, HANDLEs live on the other side of the kernel-mode barrier, and it really doesn't get much more encapsulated than that) is found in procedural code too; I'm struggling to see how this is something particular to OOP.
HANDLEs (and the rest of the WinAPI) is OOP! C doesn't support OOP very well so there's no special syntax but that doesn't mean it doesn't use the same concepts. WinAPI is in every sense of the word an object-oriented framework.
See, this is the trouble with every single discussion involving OOP or alternative techniques: nobody is clear about the definition, everyone is talking about something else and thus no consensus can be reached. Seems like a waste of time to me.
It's a programming paradigm, designed to make it easier for us mere mortals to break down a problem into smaller, workable pieces.
If you don't find it useful, don't use it, don't pay for training, and be happy.
I on the other hand do find it useful, so I will :)
Relative to straight procedural programming, the first fundamental tenet of OOP is the notion of information hiding and encapsulation. This idea leads to the notion of the class that separates the interface from implementation. These are hugely important concepts and the basis for putting a framework in place to think about program design in a different and (I think) better way. You can't really argue against those properties - there is no trade-off made and it is always a cleaner way to modularize things.
Other aspects of OOP, including inheritance and polymorphism, are important too, but as others have alluded to, those are commonly overused, i.e. sometimes people use inheritance and/or polymorphism because they can, not because they should. They are powerful concepts and very useful, but they need to be used wisely and are not automatic winning advantages of OOP.
Relative to re-use. I agree re-use is over sold for OOP. It is a possible side effect of well defined objects, typically of more primitive/generic classes and is a direct result of the encapsulation and information hiding concepts. It is potentially easier to be re-used because the interfaces of well defined classes are just simply clearer and somewhat self documenting.
The problem with OOP is that it was oversold.
As Alan Kay originally conceived it, it was a great alternative to the prior practice of having raw data and all-global routines.
Then some management-consultant types latched onto it and sold it as the messiah of software, and lemming-like, academia and industry tumbled along after it.
Now they are lemming-like tumbling after other good ideas being oversold, such as functional programming.
So what would I do differently? Plenty, and I wrote a book on this. (It's out of print - I don't get a cent, but you can still get copies on Amazon.)
My constructive answer is to look at programming not as a way of modeling things in the real world, but as a way of encoding requirements.
That is very different, and is based on information theory (at a level that anyone can understand). It says that programming can be looked at as a process of defining languages, and skill in doing so is essential for good programming.
It elevates the concept of domain-specific-languages (DSLs). It agrees emphatically with DRY (don't repeat yourself). It gives a big thumbs-up to code generation. It results in software with massively less data structure than is typical for modern applications.
It seeks to re-invigorate the idea that the way forward lies in inventiveness, and that even well-accepted ideas should be questioned.
HANDLEs (and the rest of the WinAPI) is OOP!
Are they, though? They're not inheritable, they're certainly not substitutable, they lack well-defined classes... I think they fall a long way short of "OOP".
Have you ever created a window using WinAPI? Then you should know that you define a class (RegisterClass), create an instance of it (CreateWindow), call virtual methods (WndProc) and base-class methods (DefWindowProc) and so on. WinAPI even takes the nomenclature from Smalltalk OOP, calling the methods “messages” (Window Messages).
Handles may not be inheritable but then, there's final in Java. They don't lack a class, they are a placeholder for the class: That's what the word “handle” means. Looking at architectures like MFC or .NET WinForms it's immediately obvious that except for the syntax, nothing much is different from the WinAPI.
Yes OOP did not solve all our problems, sorry about that. We are, however working on SOA which will solve all those problems.
OOP lends itself well to programming internal computer structures like GUI "widgets", where for example SelectList and TextBox may be subtypes of Item, which has common methods such as "move" and "resize".
The trouble is, 90% of us work in the world of business where we are working with business concepts such as Invoice, Employee, Job, Order. These do not lend themselves so well to OOP because the "objects" are more nebulous, subject to change according to business re-engineering and so on.
The worst case is where OO is enthusiastically applied to databases, including the egregious OO "enhancements" to SQL databases - which are rightly ignored except by database noobs who assume they must be the right way to do things because they are newer.
In my experience of reviewing code and design of projects I have been through, the value of OOP is not fully realised because a lot of developers have not properly conceptualised the object-oriented model in their minds. Thus they do not program with OO design, very often continuing to write top-down procedural code, making the classes a pretty flat design (if you can even call that "design" in the first place).
It is pretty scary to observe how little colleagues know about what an abstract class or interface is, let alone how to properly design an inheritance hierarchy to suit the business needs.
However, when good OO design is present, it is just sheer joy reading the code and seeing the code naturally fall into place into intuitive components/classes. I have always perceived system architecture and design like designing the various departments and staff jobs in a company - all are there to accomplish a certain piece of work in the grand scheme of things, emitting the synergy required to propel the organisation/system forward.
That, of course, is quite rare unfortunately. Like the ratio of beautifully-designed versus horrendously-designed physical objects in the world, the same can pretty much be said about software engineering and design. Having the good tools at one's disposal does not necessarily confer good practices and results.
Maybe a bonnet, lap or a tree is not a chair but they all are ISittable.
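One way to read the ISittable remark is as an interface that unrelated types satisfy without sharing a base class. A rough sketch in Python, assuming a hypothetical Sittable protocol (the names Sittable, TreeStump, CarBonnet, Cloud and take_a_seat are invented for illustration):

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Sittable(Protocol):
    """Anything offering sit_on() counts; no chair taxonomy is required."""

    def sit_on(self) -> str: ...


class TreeStump:
    def sit_on(self) -> str:
        return "sat on a tree stump"


class CarBonnet:
    def sit_on(self) -> str:
        return "sat on a car bonnet"


class Cloud:
    """Has no sit_on(), so it does not satisfy the protocol."""


def take_a_seat(thing: Sittable) -> str:
    # Depends only on the capability, not on what kind of thing it is.
    return thing.sit_on()
```

Neither TreeStump nor CarBonnet inherits from anything chair-like, yet both pass isinstance(x, Sittable), while Cloud does not.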
I think those real world things are objects
You do?
What methods does an invoice have? Oh, wait. It can't pay itself, it can't send itself, it can't compare itself with the items that the vendor actually delivered. It doesn't have any methods at all; it's totally inert and non-functional. It's a record type (a struct, if you prefer), not an object.
Likewise the other things you mention.
Just because something is real does not make it an object in the OO sense of the word. OO objects are a peculiar coupling of state and behaviour that can act of their own accord. That isn't something that's abundant in the real world.
I have been writing OO code for the last 9 years or so. Other than using messaging, it's hard for me to imagine another approach. The main benefit I see is totally in line with what CodingTheWheel said: modularisation. OO naturally leads me to construct my applications from modular components that have clean interfaces and clear responsibilities (i.e. loosely coupled, highly cohesive code with a clear separation of concerns).
I think where OO breaks down is when people create deeply nested class hierarchies. This can lead to complexity. However, factoring out common functionality into a base class, then reusing that in other descendant classes is a deeply elegant thing, IMHO!
In the first place, the observations are somewhat sloppy. I don't have any figures on software productivity, and have no good reason to believe it's not going up. Further, since there are many people who abuse OO, good use of OO would not necessarily cause a productivity improvement even if OO was the greatest thing since peanut butter. After all, an incompetent brain surgeon is likely to be worse than none at all, but a competent one can be invaluable.
That being said, OO is a different way of arranging things, attaching procedural code to data rather than having procedural code operate on data. This should be at least a small win by itself, since there are cases where the OO approach is more natural. There's nothing stopping anybody from writing a procedural API in C++, after all, and so the option of providing objects instead makes the language more versatile.
Further, there's something OO does very well: it allows old code to call new code automatically, with no changes. If I have code that manages things procedurally, and I add a new sort of thing that's similar but not identical to an earlier one, I have to change the procedural code. In an OO system, I inherit the functionality, change what I like, and the new code is automatically used due to polymorphism. This increases the locality of changes, and that is a Good Thing.
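The "old code calls new code" point can be sketched in a few lines. This is only an illustration in Python with made-up names (Document, Invoice and print_all are hypothetical):

```python
class Document:
    def render(self):
        return "plain document"


def print_all(documents):
    """Old code: written before Invoice existed and never edited since."""
    return [doc.render() for doc in documents]


class Invoice(Document):
    """New code: similar to Document but not identical."""

    def render(self):
        # Reuse the inherited behaviour and change only what differs.
        return "invoice: " + super().render()
```

print_all([Document(), Invoice()]) picks up the new render() through dynamic dispatch, so the change stays local to the new class.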
The downside is that good OO isn't free: it requires time and effort to learn it properly. Since it's a major buzzword, there's lots of people and products who do it badly, just for the sake of doing it. It's not easier to design a good class interface than a good procedural API, and there's all sorts of easy-to-make errors (like deep class hierarchies).
Think of it as a different sort of tool, not necessarily generally better. A hammer in addition to a screwdriver, say. Perhaps we will eventually get out of the practice of software engineering as knowing which wrench to use to hammer the screw in.
#Sean
However, factoring out common functionality into a base class, then reusing that in other descendant classes is a deeply elegant thing, IMHO!
But "procedural" developers have been doing that for decades anyway. The syntax and terminology might differ, but the effect is identical. There is more to OOP than "reusing common functionality in a base class", and I might even go so far as to say that that is hard to describe as OOP at all; calling the same function from different bits of code is a technique as old as the subprocedure itself.
#Konrad
OOP may be flawed and it certainly is no silver bullet but it makes large-scale applications much simpler because it's a great way to reduce dependencies
That is the dogma. I am not seeing what makes OOP significantly better in this regard than procedural programming of old. Whenever I make a procedure call I am isolating myself from the specifics of the implementation.
To me, there is a lot of value in the OOP syntax itself. Using objects that attempt to represent real things or data structures is often much more useful than trying to use a bunch of different flat (or "floating") functions to do the same thing with the same data. There is a certain natural "flow" to things with good OOP that just makes more sense to read, write, and maintain long term.
It doesn't necessarily matter that an Invoice isn't really an "object" with functions that it can perform itself - the object instance can exist just to perform functions on the data without having to know what type of data is actually there. The function "invoice.toJson()" can be called successfully without having to know what kind of data "invoice" is - the result will be JSON, no matter if it comes from a database, XML, CSV, or even another JSON object. With procedural functions, you all of a sudden have to know more about your data, and end up with functions like "xmlToJson()", "csvToJson()", "dbToJson()", etc. It eventually becomes a complete mess and a HUGE headache if you ever change the underlying data type.
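A small sketch of the toJson() idea, written in Python rather than whatever language the original code used. The classes XmlInvoice and CsvInvoice, the function export, and the sample field names are all assumptions made for illustration:

```python
import csv
import io
import json
import xml.etree.ElementTree as ET


class XmlInvoice:
    def __init__(self, text):
        self._root = ET.fromstring(text)

    def to_json(self):
        # Each child element becomes one key/value pair.
        fields = {child.tag: child.text for child in self._root}
        return json.dumps(fields, sort_keys=True)


class CsvInvoice:
    def __init__(self, text):
        # The first data row under the header line is the invoice.
        self._row = next(csv.DictReader(io.StringIO(text)))

    def to_json(self):
        return json.dumps(dict(self._row), sort_keys=True)


def export(invoice):
    """Caller code: does not know or care where the data came from."""
    return invoice.to_json()
```

Given "<invoice><id>7</id><total>9.50</total></invoice>" and "id,total\n7,9.50\n", export() returns the same JSON string for both, and adding a database-backed invoice would not touch export() at all.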
The point of OOP is to hide the actual implementation by abstracting it away. To achieve that goal, you must create a public interface. To make your job easier while creating that public interface and keep things DRY, you must use concepts like abstract classes, inheritance, polymorphism, and design patterns.
So to me, the real overriding goal of OOP is to make future code maintenance and changes easier. But even beyond that, it can really simplify things a lot when done correctly in ways that procedural code never could. It doesn't matter if it doesn't match the "real world" - programming with code is not interacting with real world objects anyways. OOP is just a tool that makes my job easier and faster - I'll go for that any day.
#CodingTheWheel
But to the extent that OOP has been a waste of time, I'd say it's because of lack of programmer training, compounded by the steep learning curve of learning a language specific OOP mapping. Some people "get" OOP and others never will.
I dunno if that's really surprising, though. I think that technically sound approaches (LSP being the obvious thing) make it hard to use, but if we don't use such approaches it makes the code brittle and inextensible anyway (because we can no longer reason about it). And I think the counterintuitive results that OOP leads us to make it unsurprising that people don't pick it up.
More significantly, since software is already fundamentally too hard for normal humans to write reliably and accurately, should we really be extolling a technique that is consistently taught poorly and appears hard to learn? If the benefits were clear-cut then it might be worth persevering in spite of the difficulty, but that doesn't seem to be the case.
#Jeff
Relative to straight procedural programming, the first fundamental tenet of OOP is the notion of information hiding and encapsulation. This idea leads to the notion of the class that separates the interface from implementation.
Which has the more hidden implementation: C++'s iostreams, or C's FILE*s?
I think the use of opaque context objects (HANDLEs in Win32, FILE*s in C, to name two well-known examples--hell, HANDLEs live on the other side of the kernel-mode barrier, and it really doesn't get much more encapsulated than that) is found in procedural code too; I'm struggling to see how this is something particular to OOP.
I suppose that may be a part of why I'm struggling to see the benefits: the parts that are obviously good are not specific to OOP, whereas the parts that are specific to OOP are not obviously good! (this is not to say that they are necessarily bad, but rather that I have not seen the evidence that they are widely-applicable and consistently beneficial).
In the only dev blog I read, by that Joel-On-Software-Founder-of-SO guy, I read a long time ago that OO does not lead to productivity increases. Automatic memory management does. Cool. Who can deny the data?
I still believe that OO is to non-OO what programming with functions is to programming everything inline. (And I should know, as I started with GWBasic.) When you refactor code to use functions, variable2654 becomes variable3 of the method you're in. Or, better yet, it's got a name that you can understand, and if the function is short, it's called value and that's sufficient for full comprehension.
When code with no functions becomes code with methods, you get to delete miles of code.
When you refactor code to be truly OO, b, c, q, and Z become this, this, this and this. And since I don't believe in using the this keyword, you get to delete miles of code. Actually, you get to do that even if you use this.
I do not think OO is a natural metaphor. I don't think language is a natural metaphor either, nor do I think that Fowler's "smells" are better than saying "this code tastes bad." That said, I think that OO is not about natural metaphors and people who think the objects just pop out at you are basically missing the point. You define the object universe, and better object universes result in code that is shorter, easier to understand, works better, or all of these (and some criteria I am forgetting). I think that people who use the customers/domain's natural objects as programming objects are missing the power to redefine the universe.
For instance, when you do an airline reservation system, what you call a reservation might not correspond to a legal/business reservation at all.
Some of the basic concepts are really cool tools. I think that most people exaggerate with that whole "when you have a hammer, they're all nails" thing. I think that the other side of the coin/mirror is just as true: when you have a gadget like polymorphism/inheritance, you begin to find uses where it fits like a glove/sock/contact-lens. The tools of OO are very powerful. Single-inheritance is, I think, absolutely necessary for people not to get carried away, my own multi-inheritance software notwithstanding.
What's the point of OOP? I think it's a great way to handle an absolutely massive code base. I think it lets you organize and reorganize your code and gives you a language to do that in (beyond the programming language you're working in), and modularizes code in a pretty natural and easy-to-understand way.
OOP is destined to be misunderstood by the majority of developers. This is because it's an eye-opening process like life: you understand OO more and more with experience, and start avoiding certain patterns and employing others as you get wiser. One of the best examples is that you stop using inheritance for classes that you do not control, and prefer the Facade pattern instead.
Regarding your mini-essay/question
I did want to mention that you're right. Reusability is a pipe-dream, for the most part. Here's a (brilliant) quote from Anders Hejlsberg about that topic:
If you ask beginning programmers to write a calendar control, they often think to themselves, "Oh, I'm going to write the world's best calendar control! It's going to be polymorphic with respect to the kind of calendar. It will have displayers, and mungers, and this, that, and the other." They need to ship a calendar application in two months. They put all this infrastructure into place in the control, and then spend two days writing a crappy calendar application on top of it. They'll think, "In the next version of the application, I'm going to do so much more."
Once they start thinking about how they're actually going to implement all of these other concretizations of their abstract design, however, it turns out that their design is completely wrong. And now they've painted themself into a corner, and they have to throw the whole thing out. I have seen that over and over. I'm a strong believer in being minimalistic. Unless you actually are going to solve the general problem, don't try and put in place a framework for solving a specific one, because you don't know what that framework should look like.
Have you ever created a window using WinAPI?
More times than I care to remember.
Then you should know that you define a class (RegisterClass), create an instance of it (CreateWindow), call virtual methods (WndProc) and base-class methods (DefWindowProc) and so on. WinAPI even takes the nomenclature from Smalltalk OOP, calling the methods “messages” (Window Messages).
Then you'll also know that it does no message dispatch of its own, which is a big gaping void. It also has crappy subclassing.
Handles may not be inheritable but then, there's final in Java. They don't lack a class, they are a placeholder for the class: That's what the word “handle” means. Looking at architectures like MFC or .NET WinForms it's immediately obvious that except for the syntax, nothing much is different from the WinAPI.
They're not inheritable either in interface or implementation, minimally substitutable, and they're not substantially different from what procedural coders have been doing since forever.
Is this really it? The best bits of OOP are just... traditional procedural code? That's the big deal?
I agree completely with InSciTek Jeff's answer, I'll just add the following refinements:
Information hiding and encapsulation: Critical for any maintainable code. Can be done by being careful in any programming language, doesn't require OO features, but doing it will make your code slightly OO-like.
Inheritance: There is one important application domain for which all those OO is-a-kind-of and contains-a relationships are a perfect fit: Graphical User Interfaces. If you try to build GUIs without OO language support, you will end up building OO-like features anyway, and it's harder and more error-prone without language support. Glade (recently) and X11 Xt (historically) for example.
Using OO features (especially deeply nested abstract hierarchies), when there is no point, is pointless. But for some application domains, there really is a point.
I believe the most beneficial quality of OOP is data hiding/managing. However, there are a LOT of examples where OOP is misused and I think this is where the confusion comes in.
Just because you can make something into an object does not mean you should. However, if doing so will make your code more organized/easier to read then you definitely should.
A great practical example where OOP is very helpful is with a "product" class and objects that I use on our website. Since every page is a product, and every product has references to other products, it can get very confusing as to which product the data you have refers to. Is this "strURL" variable the link to the current page, or to the home page, or to the statistics page? Sure, you could make all kinds of different variables that refer to the same information, but proCurrentPage->strURL is much easier to understand (for a developer).
In addition, attaching functions to those pages is much cleaner. I can do proCurrentPage->CleanCache(); followed by proDisplayItem->RenderPromo(); If I just called those functions and had them assume the current data was available, who knows what kind of evil would occur. Also, if I had to pass the correct variables into those functions, I am back to the problem of having all kinds of variables for the different products lying around.
Instead, using objects, all my product data and functions are nice and clean and easy to understand.
However. The big problem with OOP is when somebody believes that EVERYTHING should be OOP. This creates a lot of problems. I have 88 tables in my database. I only have about 6 classes, and maybe I should have about 10. I definitely don't need 88 classes. Most of the time directly accessing those tables is perfectly understandable in the circumstances I use it, and OOP would actually make it more difficult/tedious to get to the core functionality of what is occurring.
I believe a hybrid model of objects where useful and procedural where practical is the most effective method of coding. It's a shame we have all these religious wars where people advocate using one method at the expense of the others. They are both good, and they both have their place. Most of the time, there are uses for both methods in every larger project (In some smaller projects, a single object, or a few procedures may be all that you need).
I don't care for reuse as much as I do for readability. The latter means your code is easier to change. That alone is worth its weight in gold in the craft of building software.
And OO is a pretty damn effective way to make your programs readable. Reuse or no reuse.
"The real world isn't "OO","
Really? My world is full of objects. I'm using one now. I think that having software "objects" model the real objects might not be such a bad thing.
OO designs for conceptual things (like Windows, not real world windows, but the display panels on my computer monitor) often leave a lot to be desired. But for real world things like invoices, shipping orders, insurance claims and what-not, I think those real world things are objects. I have a stack on my desk, so they must be real.
The point of OOP is to give the programmer another means for describing and communicating a solution to a problem in code to machines and people. The most important part of that is the communication to people. OOP allows the programmer to declare what they mean in the code through rules that are enforced in the OO language.
Contrary to many arguments on this topic, OOP and OO concepts are pervasive throughout all code including code in non-OOP languages such as C. Many advanced non-OO programmers will approximate the features of objects even in non-OO languages.
Having OO built into the language merely gives the programmer another means of expression.
The biggest part of writing code is not communication with the machine; that part is easy. The biggest part is communication with human programmers.