I got a glimpse of Hoare Logic in college. What we did was really simple: mostly proving the correctness of simple programs consisting of while loops, if statements, and sequences of instructions, but nothing more. These methods seem very useful!
Are formal methods used in industry widely?
Are these methods used to prove mission-critical software?
Well, Sir Tony Hoare joined Microsoft Research about 10 years ago, and one of the things he started was the formal verification of the Windows NT kernel. Indeed, this was one of the reasons for the long delay of Windows Vista: starting with Vista, large parts of the kernel are actually formally verified with respect to certain properties, such as the absence of deadlocks and the absence of information leaks.
This is certainly not typical, but it is probably the single most important application of formal program verification, in terms of its impact (after all, almost every human being is in some way, shape or form affected by a computer running Windows).
This is a question close to my heart (I'm a researcher in Software Verification using formal logics), so you'll probably not be surprised when I say I think these techniques have a useful place, and are not yet used enough in the industry.
There are many levels of "formal methods", so I'll assume you mean those resting on a rigorous mathematical basis (as opposed to, say, following some 6-Sigma style process). Some types of formal methods have had great success - type systems being one example. Static analysis tools based on data flow analysis are also popular, model checking is almost ubiquitous in hardware design, and computational models like Pi-Calculus and CCS seem to be inspiring some real change in practical language design for concurrency. Termination analysis is one that's had a lot of press recently - the SDV project at Microsoft and work by Byron Cook are recent examples of research/practice crossover in formal methods.
Hoare Reasoning has not, so far, made great inroads in the industry - this is for more reasons than I can list, but I suspect it is mostly down to the complexity of writing and then proving specifications for real programs (they tend to get big, and fail to express properties of many real-world environments). Various sub-fields of this type of reasoning are now making big inroads into these problems - Separation Logic being one.
This is partially the nature of ongoing (hard) research. But I must confess that we, as theorists, have entirely failed to educate the industry on why our techniques are useful, to keep them relevant to industry needs, and to make them approachable to software developers. At some level, that's not our problem - we're researchers, often mathematicians, and practical usage is not foremost in our minds. Also, the techniques being developed are often too embryonic for use in large scale systems - we work on small programs, on simplified systems, get the math working, and move on. I don't much buy these excuses though - we should be more active in pushing our ideas, and getting a feedback loop between the industry and our work (one of the main reasons I went back to research).
It's probably a good idea for me to resurrect my weblog, and make some more posts on this stuff...
I cannot comment much on mission-critical software, although I know that the avionics industry uses a wide variety of techniques to validate software, including Hoare-style methods.
Formal methods have suffered because early advocates like Edsger Dijkstra insisted that they ought to be used everywhere. Neither the formalisms nor the software support were up to the job. More sensible advocates believe that these methods should be used on problems that are hard. They are not widely used in industry, but adoption is increasing. Probably the greatest inroads have been in the use of formal methods to check safety properties of software. Some of my favorite examples are the SPIN model checker and George Necula's proof-carrying code.
Moving away from practice and into research, Microsoft's Singularity operating-system project is about using formal methods to provide safety guarantees that ordinarily require hardware support. This in turn leads to faster performance and stronger guarantees. For example, in Singularity they have proved that if a third-party device driver is allowed into the system (which means basic verification conditions have been proved), then it cannot possibly bring down the whole OS; the worst it can do is hose its own device.
Formal methods are not yet widely used in industry, but they are more widely used than they were 20 years ago, and 20 years from now they will be more widely used still. So you are future-proofed :-)
Yes, they are used, but not widely in all areas. There are more methods than just Hoare logic; some are used more, some less, depending on their suitability for the given task. The common problem is that software is biiiiiiig and verifying that all of it is correct is still too hard a problem.
For example, the theorem prover ACL2 (software that aids humans in proving program correctness) has been used to prove that a certain floating-point processing unit does not have a certain type of bug. It was a big task, so this technique is not too common.
Model checking, another kind of formal verification, is used rather widely nowadays; for example, Microsoft provides a model checker in the driver development kit that can be used to check a driver for a set of common bugs. Model checkers are also often used in verifying hardware circuits.
Rigorous testing can also be thought of as a kind of formal verification - there are formal specifications of which paths of the program should be tested, and so on.
"Are formal methods used in industry?"
Yes.
The assert statement in many programming languages is related to formal methods for verifying a program.
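For instance, the pre- and postcondition of a Hoare triple can be dropped into a debug build as asserts, so the specification is checked (though of course not proved) on every run. A minimal sketch in C, with the function and its specification invented purely for illustration:

#include <assert.h>

/* {n >= 0}  r = isqrt(n)  {r*r <= n < (r+1)*(r+1)} */
int isqrt(int n)
{
    assert(n >= 0);                               /* precondition of the triple (n also assumed small enough to avoid overflow) */
    int r = 0;
    while ((r + 1) * (r + 1) <= n)                /* loop invariant: r*r <= n */
        r++;
    assert(r * r <= n && n < (r + 1) * (r + 1));  /* postcondition of the triple */
    return r;
}

A verifier would discharge these conditions statically for all inputs; assert only checks the executions you actually run, which is the sense in which it is related to, but weaker than, formal verification.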
"Are formal methods used in industry widely ?"
No.
"Are these methods used to prove mission-critical software ?"
Sometimes. More often, they're used to prove that the software is secure. More formally, they're used to prove certain security-related assertions about the software.
There are two different approaches to formal methods in the industry.
One approach is to change the development process completely. The Z notation and the B method that were mentioned are in this first category. B was applied to the development of the driverless subway line 14 in Paris (if you get a chance, climb in the front wagon. It's not often that you get a chance to see the rails in front of you).
Another, more incremental, approach is to preserve the existing development and verification processes and to replace only one of the verification tasks at a time by a new method. This is very attractive but it means developing static analysis tools for existing, widely used languages that are often not easy to analyse (because they were not designed to be).
If you go to (for instance)
http://dblp.uni-trier.de/db/indices/a-tree/d/Delmas:David.html
you will find instances of practical applications of formal methods to the verification of C programs (with static analyzers Astrée, Caveat, Fluctuat, Frama-C) and binary code (with tools from AbsInt GmbH).
By the way, since you mentioned Hoare Logic, in the above list of tools, only Caveat is based on Hoare logic (and Frama-C has a Hoare logic plug-in). The others rely on abstract interpretation, a different technique with a more automatic approach.
My area of expertise is the use of formal methods for static code analysis to show that software is free of run-time errors. This is implemented using a formal methods technique known as "abstract interpretation". The technique essentially enables you to prove certain attributes of a software program, e.g. that a+b will not overflow or that x/(x-y) will not result in a division by zero. An example of a static analysis tool that uses this technique is Polyspace.
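To make those two examples concrete, this is roughly the sort of code such an analyser is pointed at (the function and variable names here are mine, not from any particular tool):

int add(int a, int b)   { return a + b; }        /* flagged if a + b may exceed INT_MAX on some input   */
int ratio(int x, int y) { return x / (x - y); }  /* flagged unless the tool can prove x != y everywhere */

An abstract-interpretation-based tool computes an over-approximation of the values each variable can take; if the resulting ranges cannot rule out overflow or x == y, the operation is reported (possibly as a false alarm).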
With respect to your question: "Are formal methods used in industry widely?" and "Are these methods used to prove mission-critical software?"
The answer is yes. This opinion is based on my experience supporting the Polyspace tool for industries that rely on embedded software to control safety-critical systems, such as the electronic throttle in an automobile, the braking system for a train, a jet engine controller, or a drug delivery infusion pump. These industries do indeed use these types of formal methods tools.
I don't believe 100% of these industry segments are using these tools, but their use is increasing. My opinion is that the aerospace and automotive industries lead, with the medical device industry quickly ramping up.
Polyspace is a (hideously expensive, but very good) commercial product based on program verification. It's fairly pragmatic, in that it scales up from 'enhanced unit testing that will probably find some bugs' to 'the next three years of your life will be spent showing these 10 files have zero defects'.
It is based more on negative verification ('this program won't corrupt your stack') rather than positive verification ('this program will do precisely what these 50 pages of equations say it will').
To add to Jorg's answer, here's an interview with Tony Hoare. The tools Jorg's referring to, I think, are PREfast and PREfix. See here for more information.
Besides other, more procedural approaches, Hoare logic was at the basis of Design by Contract, introduced as an object-oriented technique by Bertrand Meyer in Eiffel (see Meyer's article of 1992, page 4). While Design by Contract is not the same as formal verification (for one thing, DbC doesn't prove anything until the software is executed), in my opinion it sees more practical use.
After competing in and following this year's Google Code Jam competition, I couldn't help but notice the incredible number of [successful] contestants that used C/C++ and Java. The distribution of languages used throughout the competition can be seen here.
After programming in C/C++ for several years, I recently fell in love with Python for its readable/straightforward nature. More recently, I learned functional languages like OCaml, Scheme, and even logic languages like Prolog. These languages certainly have their merits and, in my opinion, can be applied more easily than C++ and Java for certain situations. For example, Scheme's use of call/cc simplifies backtracking (a tool required to answer several problems) and Prolog's logic specification, although inefficient due to its brute-force nature, can drastically simplify (and even automatically solve) certain problems that are difficult to wrap one's brain around.
It is clear that a competition contestant should use the tools that are best suited for the challenge. Even x86 assembly is Turing complete - that doesn't justify solving problems with it. In this case, why are the contestants that use less common languages like Scheme/Lisp, Prolog, and even Python significantly less successful than contestants that use C/C++ and Java? Worded differently, why don't successful contestants use languages that, although perhaps less mainstream, are arguably better tools for the job?
There are several motivations for my question. Most importantly, I would like to become a better programmer - both in the practical aspect and the competition aspect. After being introduced to such beautiful paradigms like functional and logic programming, it is discouraging to see so many people discard them in favor of C/C++ and Java. It even makes me question my admiration for said paradigms, worrying that I cannot be successful as a Lisp/Scheme/Prolog programmer in a programming competition.
Great question! As someone who has dabbled in programming contests a bit myself, I may have something to say.
[Let's get the standard disclaimer out of the way: contest programming is only loosely related to "programming in the real world", and while it tests algorithmic and problem-solving skills and the ability to come up with fast bug-free working code under time pressure, it does not necessarily correlate with being able to build large software projects, write maintainable code, etc (beyond the fact that well-structured programs are easier to debug).]
Now for some answers:
C++/Java are more common than other languages in the real world as well, so you'd expect to see a higher proportion anywhere. (But it's even higher in the contest population.)
Many of these participants are students, or got into contests as students, and C++/Java are more common "first languages" that students learn. (Undergrad students these days may start with Scheme, Haskell, Python, etc., but high-schoolers (often self-taught) less often.) In fact, many of the Eastern European participants still use Pascal, and are more amazing with it than the rest of us will ever be with any language.
The school- and college-level contests usually use these languages. The International Olympiad in Informatics (IOI) allows only C, C++ and Pascal (or maybe it allows Java now; I haven't kept up), and the ACM International Collegiate Programming Contest (ACM ICPC) allows only C, C++ and Java. TopCoder allows C++, Java, C# and VB (really :p); and recently, Python. So you could say the "contest ecosystem" has more C++/Java programmers in it. Google Code Jam and IPSC are among the few contests that allow code in any language, actually.
Now the question is, in GCJ where the contestants are free to choose a language, why wouldn't they choose Python or Scheme? The most relevant factor is that these languages are slow. Sure, for most real-world programming they are easily fast enough, but for the tight loops that are often involved in getting a program to run under the n-second limit for all test cases, these languages don't cut it for any of the algorithmically more involved problems. (A problem designed to accept O(n log n) solutions but not Θ(n²) solutions for C/C++ frequently rules out even optimal O(n log n) solutions in slower languages. Even Java used to be given a handicap at USACO; I'm not sure this is still the case.)
Another factor is the libraries: C++ and Java have better libraries for frequently useful algorithms and data structures (e.g. red-black trees, C++'s next_permutation), while Python's libraries (good enough for the real world) are less useful here, and Prolog and Scheme... I don't know about their libraries. This is a relatively minor factor, because these programmers can write their own code when necessary. :-)
General-purpose multi-paradigm languages are more useful for just getting things done within the time constraints of the contest, than languages that force a philosophy or way of doing things on you. This is why Prolog will always remain unpopular, for instance. (General philosophy: some languages are "enabling" languages that let you do anything including shooting yourself in the foot, some are "directing" that force you to do things the right way.) This is also why C++ is three times more popular than Java in the general contest participants, and much more popular among the top contestants. Since code doesn't have to be read by anyone else, it's ok and even useful to have loop macros like FOR(i,n) (less code to type, and more importantly less chance of making a bug when in a hurry). Nothing against Java, there are a few top programmers who use Java too. :-)
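For readers who haven't seen such macros, they are usually nothing more exotic than this (shown in C; contest code is typically C++, but the idea is the same):

#include <stdio.h>

/* Typical contest shorthand: less typing, fewer off-by-one slips in a hurry. */
#define FOR(i, n) for (int i = 0; i < (n); ++i)

int main(void)
{
    FOR(j, 5) printf("%d ", j);   /* prints: 0 1 2 3 4 */
    return 0;
}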
Finally, although many of these top programmers may have C++/Java/Pascal as their "first language", they are not good because of their language, so you don't have to despair about that. Many of these same programmers have won contests like the ICFP contest even with intentionally using crazy languages like shell scripts, m4 (used in autoconf), and assembly (the team named "You Can't Spell Awesome Without ASM").
I liked Jerry Coffin's idea of plotting contestants of the Google AI contest, so I took all of the results and plotted them (calculated mean, standard deviation, and then graphed the normal distribution curves in Excel).
With Lua and JS included, I got this:
Without them (there were few contestants, so the results may be skewed):
It looks like Java participants did markedly worse than the rest, while Go, Common Lisp, and C are on the better end.
Why do we all speak English and not Esperanto? Well, it just happened that way, even though English is inconsistent and bloated and Esperanto was intentionally designed as a 'better tool'.
Thus, one reason is tradition. In most schools programming is still taught in C/C++, Java, Pascal or even Basic, and those contests are mostly entered by students, who choose the language they know best.
Also, you may notice that most algorithms books feature pseudocode in the style of Pascal or Ada, and very rarely Lisp. I don't know why; perhaps it is also tradition, or perhaps Lisp is just not as well suited to describing algorithms.
Another reason would be speed. Although it's not a problem for Google Code Jam, in almost all contests a 2x speed gap is the difference between 'Accepted' and 'Time Limit' verdicts.
In other words, if the optimal algorithm in C++ runs 10 times faster than in Ruby, a sub-optimal algorithm in C++ may still be faster than a good one in Ruby. And contest authors usually don't want to allow O(n^2) submissions if O(n log n) can be achieved.
First, I'd question your premise [edit: or what I take to be a premise -- that contestants using C++ and Java fare about equally well]. For example, here's what languages were used for the entries that came in the first 100 places and the last 100 places in Google's recent AI contest:
Contestants using C++ and Java did not seem to be anywhere close to equally successful in that contest. Contestants using Python didn't seem to fare particularly well either, though there were considerably fewer of them, weakening any conclusion in that regard.
Second, of course, an awful lot of the explanation (as others have pointed out) is undoubtedly just the number of people who are familiar with each language. There are probably more people taking a course in Java right now than the total number of people who've ever written any Lisp, Scheme or Prolog.
Edit: I think a third possibility is simply versatility. To pick an extreme example, Prolog is very well suited to a few problems, but equally poorly suited to many others. Few people can (or at least do) learn more than one or two languages well enough to use them in a contest, so most people who are interested in such things are likely to choose languages that can work reasonably well for almost anything, rather than attempting to learn a specialized language for every problem that might be chosen.
In nearly all Google Code Jam rounds, more of the higher-performing contestants code in C++.
Below are the language stats from Google Code Jam 2012 Round 1A, 1B, and 1C (listed top to bottom).
The number of contestants in each round are 3,686, 3,281, and 3,189 respectively.
Fun question; it probably should be community wiki.
Look at the number of finalists by country: http://www.go-hero.net/jam/10/regions. Notice the number of people from Eastern Europe and Russia; those places have very strong C++ communities, as well as Java, for a number of reasons.
Look at the language numbers in the qualifiers: http://www.go-hero.net/jam/10/languages/0 and in the finals: http://www.go-hero.net/jam/10/languages/6. C++ starts out at less than half and reaches 75 percent in the finals. Either good programmers prefer C++ or C++ makes the programmers. Probably by the time you master C++, other things become trivial.
You are free to draw your own conclusions though.
First of all, as you have pointed out, C++ and Java are mainstream languages. This automatically means that people who start doing programming competitions will be introduced to them first - by the way, who learns Lisp as a first language? :) I also participate regularly in such competitions - I use C++ to compete, although my favorite language is Java. It is just that I want to practice another language apart from Java - also, C++ is a little less verbose and runs faster, which is important for programming competitions.
Now to my point - people become experts first in mainstream languages. To participate in programming competitions you must have quite a good grasp of the language you are using. You don't have time to search the internet for trivial things, like a construct you have forgotten. It is just that speed is an important factor there. To use Lisp in a competition, you must be fond of it, and I don't think there are that many such people out there. Correct me if I am wrong. And honestly, regarding the pros you mentioned, like simplified backtracking: in whatever language, backtracking is easy - declare a method and just call it again for every possible outcome. It couldn't be simpler. I haven't yet felt that the language I am using is trying to trip me up in programming competitions.
OMG... people are all going through the stats and figures!!
Let's not forget the basics: these are (mostly) the only two languages taught to people in colleges and schools!
That might explain the heavy rush!
A vital reason might be that not every contest supports languages like Python or Prolog. In particular, the ACM ICPC World Finals support only C/C++ and Java, and TopCoder supports only C++, Java, C#, VB, and now Python. It is natural for contestants to choose one language that is available in every contest. Another reason might be execution speed. And yes, another reason is that these are the languages most people learn first.
Big libraries were a selling point for Java in ACM ICPC. It's handy to be able to realize you want some random data structure or algorithm and just pull it out of the standard libraries.
Keep in mind that not only is C++ the majority choice among all contestants, but as the rounds progress its percentage just keeps improving.
I'd say it is true that most of the participants are students (however, since it is an open tournament with the chance of a job interview with Google, you have to consider that many participants have already graduated). But the later rounds are only for people with a ton of experience; they are not just students who recently learned to code in C++ / Java.
Of course, the student argument also works against languages like Lisp, OCaml, or Prolog - that is, languages that are used a lot in AI, but which in the mainstream world are mostly learned and used by students.
Big contests other than Google's support few languages, but that still wouldn't explain why Pascal or .NET are nowhere near the level of Java (as they tend to be equally supported in the major contest events).
A lot of the best coders in these contests know many languages, but they still prefer to use C++ during the rounds; it must be for a bigger reason than having learned C++ first.
I would argue against the claim that languages other than C++ or Java are better tools for the job. If the data directly shows that the finalists are more likely to use C++ and Java, that is a direct contradiction of the claim.
The Google AI competition data does not actually contradict any premise regarding Code Jam. It does show that top coders are able to use languages like Common Lisp when one is truly the better tool for the job. If we want to use this data to conclude that CLISP is a great tool for AI competitions, then we should also conclude that C++ is a great tool for algorithm competitions like GCJ.
I am currently writing a dissertation about the implications and dangers that today's software development practices and teaching may have for programming in the long term.
Just to make it clear: I am not attacking the use of abstractions in programming. Every programmer knows that abstraction is the basis for modularity.
What I want to investigate with this dissertation are the positive and negative effects abstractions can have in software development. As regards the positive, I am sure that I can find many sources that can confirm this. But what about the negative effects of abstractions? Do you have any stories to share that talk about when certain abstractions failed on you?
The main concern is that many programmers today are programming against abstractions without having the faintest idea of what the abstraction is doing under the covers. This may very well lead to bugs and bad design. So, in your opinion, how important is it that programmers actually know what is going on below the abstractions?
Taking a simple example from Joel's Back to Basics, C's strcat:
void strcat( char* dest, char* src )
{
    while (*dest) dest++;        /* scan from the start of dest to find its null terminator */
    while (*dest++ = *src++);    /* copy src, including its null terminator */
}
The above function has the problem that, when you are doing repeated string concatenation, it always starts from the beginning of the dest pointer to find the null terminator. If you instead write the function as follows, it returns a pointer to the end of the concatenated string, which you can then pass back to the next concatenation call as the dest parameter:
char* mystrcat( char* dest, char* src )
{
    while (*dest) dest++;        /* find the current end of dest */
    while (*dest++ = *src++);    /* copy src, including its null terminator */
    return --dest;               /* return a pointer to the new null terminator */
}
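For illustration, a caller can then thread the returned pointer through successive calls so that nothing already written is ever rescanned (buffer size and strings chosen arbitrarily; bounds checking is omitted, as in the original example):

char buf[64] = "";
char* end = buf;                 /* always points at the current null terminator */
end = mystrcat(end, "Hello");
end = mystrcat(end, ", ");
end = mystrcat(end, "world!");   /* each call starts where the previous one stopped */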
Now, this is obviously a very simple example as regards abstractions, but it is the same concept I shall be investigating.
Finally, what do you think about the fact that schools prefer to teach Java instead of C and Lisp?
Can you please give your opinions and thoughts on this subject?
Thank you for your time and I appreciate every comment.
First of all, abstractions are inevitable because they help us to deal with the mind-blowing complexity of things.
Abstractions are also inevitable because individuals are increasingly required to undertake more tasks, or even complete projects, alone. To address the problem, one uses libraries that wrap lower-level concepts and expose more complex behavior.
Naturally, a developer has less and less time to know the internals of things. The latest concern I have heard about on SO is people starting to learn JavaScript with the jQuery library, ignoring raw JavaScript entirely.
The issue is about the balance between:
Knowing the tiniest details of some technology and being a master of it, but at the same time being unable to work with anything else.
Having superficial knowledge of a wide variety of technologies and tools, which nevertheless proves sufficient for common everyday tasks and allows an individual to perform in multiple areas, possibly covering all sides of some (moderately big) project.
Take your pick.
Some work requires the one, another position requires the other.
So, in your opinion, how important is it that programmers actually know what is going on below the abstractions?
It would be nice if people knew what is happening behind the scenes. This knowledge comes with time and practice, up to a certain degree. It depends on what kind of tasks you have. You certainly shouldn't blame people for not knowing everything. If you want a person to be able to perform in a variety of fields, it is inevitable that they won't have time to cover each one down to the last bit.
What is essential is knowledge of the basic building blocks: data structures, algorithms, complexity. That should provide a basis for everything else.
Knowing the tiniest details of some particular technology is good, but not essential. Anyway, you can't learn them all; there are too many and they keep coming.
Finally, what do you think about the fact that schools prefer to teach Java instead of C and Lisp?
Schools shouldn't be teaching programming languages at all. They are there to teach the basics of theoretical and practical CS, social skills, communication, and teamwork - to cover a wide variety of topics and problems and provide a wide-angle view for their graduates. This will help them find their way. Whatever they need to know in detail, they will learn on their own.
An example where abstraction has failed:
In this case, a piece of software was needed to communicate to many different third party data processors. The communication was done through various messaging protocols; the transport method/protocol is not important in this case. Just assume everyone communicated through messaging.
The idea was to abstract the features of each of these third parties into a single, unified message format. It seemed relatively straightforward because each of the third parties performed a similar service. The problem was that some third parties used different terms to explain similar features. It was also found that some third parties had additional features that other third parties did not have.
The designers of the abstraction did not see through the difference of third party terms nor did they think it was reasonable to limit the scope of the unified features to only support the common features of the third parties. Instead, a single, monolithic message schema was developed to support any and all features of the third parties considered at the time. In what was probably considered a future-proofing move, they added a means of also passing an infinite number of name/value pairs along with the monolithic message in case there were future data elements that the monolithic message could not handle.
Early on, it became clear that changing the monolithic message was going to be difficult because so many people were using it in mission-critical systems. The use of the name/value pairs increased. Each name that could be used was documented in a large spreadsheet, and developers were required to consult the spreadsheet to avoid duplicating the function of an existing name value. The list got so large, however, that there were frequently collisions in the purposes of name values.
The majority of the monolithic message's fields now have no purpose and are kept mainly for backwards compatibility. There are name values that can be used to replace fields in the monolithic message. The majority of the interfacing is now done through the name/value pairs. In cases where the client intends to communicate with more than one third party, each client needs to reconcile the name values available for each third party. It would almost be simpler to interface directly with the third parties themselves.
I believe this illustrates that, from the perspective of a consumer of the monolithic message, it is important that developers of the consuming code should not need to know what is happening under the covers. If the designers had considered that the consumers of the monolithic message should not have to understand the abstraction in great detail, the monolithic message and its associated name/value pairs might never have happened. Documenting the abstraction with assertions regarding input and expected output would make life so much simpler.
As for colleges not teaching C and Lisp... they are cheating the students. With C you get a better understanding of what is going on with the machine and the OS; with Lisp you get a somewhat different perspective on processing data and approaching problems. I have used some of the ideas I learned from Lisp in programs written in C, C++, .NET, and Java. Learning Java after knowing even just C is not very difficult. The OO part is really not language-specific, so perhaps using Java for that is acceptable.
An understanding of the fundamentals of algorithms (e.g. time complexity) and some knowledge of the metal are essential to designing and writing code that smells good.
I would suggest, though, that just as important is education in modern abstractions and profiling. I feel that modern abstractions make me so much more productive than I would be without them that they are at least as important as good fundamentals, if not more so.
An important element that was lacking in my education was the use of profilers. When used routinely and correctly, profilers can help mitigate problems with poor fundamentals.
Since you quote Joel Spolsky, I take it you're aware of his "Law of Leaky Abstractions"? I'll mention it for future readers: http://www.joelonsoftware.com/articles/LeakyAbstractions.html
Green & Blackwell's Ironies of Abstractions talks a bit about the effort of learning the abstraction. http://homepage.ntlworld.com/greenery/workStuff/Papers/index.html
The term "astronaut architecture" is a reaction to over-abstraction.
I know I certainly curse abstraction when I haven't touched Java or C# in a while and I want to write to a file, but have to instantiate a Stream...Writer...Adaptor...Handler...
Also, patterns, as in the Gang of Four. They seemed great when I first read about them in the mid-90s, but I can never remember factory, facade, interface, helper, worker, flyweight...
People like Alexander Stepanov and Sean Parent vote for a formal and abstract approach to software design.
The idea is to break complex systems down into a directed acyclic graph and hide cyclic behaviour in nodes representing that behaviour.
Parent gave presentations at BoostCon and Google (the slides from BoostCon, p. 24, introduce the approach; there is also a video of the Google talk).
While I like the approach and think it's a necessary development, I have a problem imagining how to handle subsystems with amorphous behaviour.
Imagine, for example, a common pattern for state machines: an interface which all states support, with different behaviour in the concrete implementations for the states.
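For concreteness, here is roughly what I mean, sketched in C (the names and event codes are made up; in an OO language the "interface" would be a base class, here it is just a shared function signature):

#include <stdio.h>

typedef enum { IDLE, RUNNING, N_STATES } StateId;

/* The "interface" every state supports: one handler with a fixed
   signature, returning the id of the next state. */
typedef StateId (*EventHandler)(int event);

/* Concrete implementations with different behaviour per state. */
static StateId idle_on_event(int event)    { return event == 1 ? RUNNING : IDLE; }
static StateId running_on_event(int event) { return event == 0 ? IDLE : RUNNING; }

static const EventHandler handlers[N_STATES] = { idle_on_event, running_on_event };
static const char* const  names[N_STATES]    = { "idle", "running" };

int main(void)
{
    StateId s = IDLE;
    int events[] = { 1, 1, 0 };              /* made-up event codes: 1 = start, 0 = stop */
    for (int i = 0; i < 3; ++i) {
        s = handlers[s](events[i]);          /* dispatch through the common "interface" */
        printf("after event %d: %s\n", events[i], names[s]);
    }
    return 0;
}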
How would one solve that?
Note that I am just looking for an abstract approach.
I can think of hiding that behaviour behind a node and defining different sub-DAGs for the states, but that complicates the design considerably if you want to influence the behaviour of the main DAG from a sub-DAG.
Your question is not clear. Define amorphous subsystems.
You are "just looking for an abstract approach" but then you seem to want details about an implementation in a conventional programming language ("common pattern for state-machines"). So, what are you asking for? How to implement nested finite state-machines?
Some more detail will help the conversation.
For a real abstract approach, look at something like Stream X-Machines:
... The X-machine model is structurally the same as the finite state machine, except that the symbols used to label the machine's transitions denote relations of type X→X. ... The Stream X-Machine differs from Eilenberg's model, in that the fundamental data type X = Out* × Mem × In*, where In* is an input sequence, Out* is an output sequence, and Mem is the (rest of the) memory.

The advantage of this model is that it allows a system to be driven, one step at a time, through its states and transitions, while observing the outputs at each step. These are witness values, that guarantee that particular functions were executed on each step. As a result, complex software systems may be decomposed into a hierarchy of Stream X-Machines, designed in a top-down way and tested in a bottom-up way. This divide-and-conquer approach to design and testing is backed by Florentin Ipate's proof of correct integration, which proves how testing the layered machines independently is equivalent to testing the composed system. ...
But I don't see how the presentation is related to this. He seems to speak about a quite mainstream approach to programming, nothing similar to X-Machines. Anyway, the presentation is quite confusing and I have no time to see the video right now.
First impression of the talk, reading the slides only
The author touches haphazardly on numerous fields/problems/solutions, apparently without recognizing it: from Peopleware (for example Psychology of programming), to Software Engineering (for example software product lines), to various programming techniques.
How the various parts are linked and what exactly he is advocating is not clear at all (I'm accustomed to just reading slides and they are usually consequential):
Dataflow programming?
Constraint solving for user interfaces? For practical implementations, see Garnet for Common Lisp and Amulet/OpenAmulet for C++.
What advantages does this "new" concept-based generic programming give us with respect to well-known approaches (for example, tools based on Hoare-logic pre/postconditions and invariants, or, better, Hoare's Communicating Sequential Processes (CSP), or Hehner's Practical Theory of Programming, or some programming language with a sophisticated type system like ATS, Qi or Epigram, and so on)? It seems to me that introducing "concepts" - which, as-is, are specific to C++ - is no simpler than using the alternatives. Is it just about jargon and "politics"? (Formal methods at last... but in disguise.)
Why organize program modules as a DAG and not as a tree, as David Parnas advocated decades ago in Designing software for ease of extension and contraction? (here a directly accessible .pdf and here slides from a lecture). The work on X-Machines is probably an answer to this question (going even beyond DAGs), but, again, the author seems to be talking about a quite conventional program development regime in which Parnas' approach is the only sensible one.
If/when I see the video, I will update this answer.
Over the years we have seen (well, I have :) a number of languages come and go. Some were more widely accepted, some a little less. So I was wondering: what do you think are the factors that most affect whether a language survives, and whether it will have a future for a number of years (by which I mean several decades or so)?
For example, Fortran and C have survived the test of time. They were popular, yes, but they also had very good corporate backing, financing, and standard specifications (ANSI and ISO).
Some of the modern languages I see today, although they are popular, have none of that (the current implementation is often considered the standard). That is all fine for the time being, but what about 10 or 20 years from now, when their authors may no longer be around? I very rarely see open source languages make the transition to corporate financing.
If you could put it in a few words: in your opinion, what are the most important factors for the survival of a language, and why?
Ruby is popular, although it has no corporate backing. And it has been here for 14 years already.
Perl already survived 22 years, and probably will survive a few more.
Python has no corporate backing (OK, I don't know if you'd count Google's involvement), yet it has made it into Fortune 500 companies.
On the other hand:
Pascal got corporate backing and died.
Ada has corporate backing and is practically reduced to a DSL for avionics.
I think the answer depends a lot on the time-frame in which you define survival. This is important because I think there are three factors that have changed over time, and are still changing:
Hardware performance (i.e. speed or memory)
Hardware complexity (i.e. single-core vs. multi-core)
Software complexity
I think the reason C has survived is because, until just the past few years, there was still a very real need for maximum performance in a lot of applications. Perhaps there will always be that kind of need, but I think it has been growing much less relevant in the past few years. I think it's always going to be around, but I'd be surprised if it was widely used 20 years from now; it's already started getting passed up in favor of C#/Java/etc in the past five years.
The recent (by which I mean past five years or so) rise of languages like Python are also a response to the fact that software has grown more complex, while performance has become less of an issue. Because consumers value the 'now', there's a huge incentive to develop quickly, and worry about speed later, if at all. That has a pretty big impact on which language you use for development.
I see clarity, maintainability and ease of use as the most important factor for survival, if you take the future out to 20+ years.
Every future language needs to make an existing problem easy
For example, concurrent programming is not easy in most languages today. This will be solved by a new language, as we cannot easily coax our existing paradigms into the parallel world. Just take a look at Java: it was built from the ground up with threads in mind, yet it has so many caveats if you even dare to do concurrent programming.
We'll need a system that makes it so easy to do concurrent programming that we won't even need to think about it. We'll need a memory model that protects us from having to think about these problems. For those who can't imagine such a world, you are just stuck in our current paradigm. We will need to change the way we develop software for this to work. Serious problems require change.
Another way for a language to survive is to attach it to an entire system. Just look at Objective-C: it is Apple's language for all Apple products. I think this is the way to go - design a system that is worthy of its own language.
There are many other examples, I've been thinking about this problem for a long time.
As far as I can recall, Fortran had no corporate backing until it was well established. C was backed by AT&T, but they really didn't care whether anyone else adopted it. And both were well established before they had ANSI standards (also, note that ANSI & ISO provide standard specifications, not implementations).
On the other hand, IBM heavily backed & promoted PL/I, and that never really caught on. And the US government tried to get all of us writing Ada, and that didn't work either.
So, what does work? Good question. Getting schools to teach it is good (Pascal pretty much disappeared when colleges switched to C++ & Java). Lately, having "buzz on the 'net" is good (cite: Java, Ruby).
In order for a language to survive it needs several things:
It needs to solve a problem better than other comparable options. This is the subjective aspect: developers feel it is better, and so they adopt it.
It needs to have good tooling. Without good tooling a language will never catch on with the masses.
It needs a strong community to be built around it. A community which provides assistance, help, components, etc etc...
I don't think corporate backing has a direct impact on these items. I think it can make things such as developing tooling more likely, but there are too many examples where it has helped or not helped adoption of a language.
The open source community has become rather like a huge corporation, hasn't it?
Languages survive while they are used, and while people are prepared to maintain them. People are often prepared to maintain the language while it is used. If a language is not used, it dies.
There can be all sorts of things that contribute to, or determine whether, a language dies. Corporate-sponsored languages die if the corporate sponsor ceases to see a benefit (profit) in the language, or they want people to use an alternative, and the corporate sponsor is unwilling to release the code to open source, and there is no open source alternative.
I don't see evidence that corporate backing or standardization is sufficient to determine whether a language survives. There are many corporate-backed languages that have failed to gain a strong foothold (Ada comes to mind). There are many standardized languages (Common Lisp) that also failed. On the other hand, there are plenty of non-standard, non-corporate languages that gain popularity (Perl, PHP, Ruby). There doesn't seem to be causality there.
The viability of a language is really determined by the community around it. There is a positive feedback loop. More users means more support and more libraries which in turn means more users. Popular languages can languish, but they don't totally die out. Not for a long time.
If I were looking for a language to use for something that had to last, the two biggest criteria in my mind would be:
Does it work well for my problem domain?
Is the community strong enough to be self-perpetuating?
If the answers to those two questions are true, use the language. If either answer is false, don't.
While other languages have been almost killed by their corporate backing - Delphi, for example.
I'm finding that only about 30% of my code actually solves problems; the rest is taken up by logging, tests, parameter checking, exceptions, error handling, and so on. Do you find that in your code, and is there an IDE/editor that allows you to hide code that's not interesting?
OTOH are there languages which make the support code more manageable and smaller in size?
Edit - I think we're all aware of the difference between business logic and other code. I'm not saying that the logging etc. is not important. The thing is, when I'm coding I'm either implementing business logic or I'm making sure things don't break. For me those are two different ways of thinking; do others develop like that, and is there an IDE that supports that way of developing?
Supporting code is just as important as the "real code". The quality of your product is determined as much by supporting code as anything else.
Consider an automobile. In terms of just getting from point A to point B, that requires nothing more than a go-cart: a frame, a seat, an engine, a few tires. But modern cars have a lot more than just the basics. Highly efficient engines using electronic engine timing. Automatic transmissions. Bucket seats. Heating and A/C. Rack and pinion steering. Power brakes. Anti-lock brakes. Quiet, comfortable cabins protected from the weather. Air bags. Crumple zones and other advanced safety features. Etc. Etc.
Details and execution are important, even in software. If you find that your "supporting code" tends to look more like kludges and hacks, then it's time to rethink your fundamental approach. But ultimately the fit and finish determines quality of the end product as much as anything else.
Edit: The questions you should ask yourself:
Is your "supporting code":
An umbrella duct taped to a pole or a metal and glass cabin frame?
A piece of pipe tied to the front of the car or an energy absorbing bumper integrated into a crumple zone?
A grappling hook on a rope tied to the frame or 4-wheel anti-lock power brakes?
A pair of goggles and a thick coat or a windshield and a heating system?
Answers to these questions will probably affect how much you care about your "supporting code".
Edit: Response to Dave Turvey's comment:
I'd encourage rereading the original question, one of the examples of "support code" listed is "error handling". Consider this for a moment. Imagine it in the context of, say, an automobile, a microwave oven, or even an operating system. Should error handling be relegated to second class citizenship because it serves a "support" function in some abstract sense? In an automobile the safety features are part of the fundamental design of the vehicle and comprise a substantial part of the value of the car. The safety features and "error handling" of a microwave oven (indeed, of the microwave oven's embedded software as well) are an important part of its value as well. A microwave oven that was improperly shielded could cook food just fine, under the right circumstances, but it would pose a hazard to the operator.
The implicit featureset of every tool (software or otherwise) includes this list:
Robustness
Usability
Performance
Everything anyone has ever built or used has had these features. Failure to understand this will translate to failure to execute well on these features which will make for a poor quality product of low value and low commercial interest. There is no such thing as "support code", there is only a misunderstanding of the nature of what it means for a feature to be complete. A "feature" that works in the abstract only under laboratory conditions is an experiment, not a part of a product.
The idea of pure, pristine features floating on a bog of dirty, ugly support code is the wrong image of software development. Instead, think of elegant, superbly-integrated machinery that is well-built, intuitive to use, and powerful.
The supporting code is important, but you don't want to be distracted by it when you don't have to be. There are two technologies that can help.
A language with first-class functions will help you modularize your code so that logging, timing, and so on can be implemented once and then combined with many other modules. It will also help you write unit tests. Some good ways to learn the techniques are to read the paper Why Functional Programming Matters and to learn about the QuickCheck tool. (No, I am not a shill for John Hughes, but he does do wonderful work.)
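C is not such a language, but even plain function pointers are enough to show the shape of the idea: the timing/logging concern is written once and wrapped around any operation that fits the signature (all names below are invented for the sketch):

#include <stdio.h>
#include <time.h>

/* The cross-cutting concern (timing + logging), written once. */
static int timed(const char* label, int (*op)(int), int arg)
{
    clock_t start = clock();
    int result = op(arg);
    double ms = 1000.0 * (double)(clock() - start) / CLOCKS_PER_SEC;
    fprintf(stderr, "%s(%d) = %d  [%.2f ms]\n", label, arg, result, ms);
    return result;
}

/* Any "business logic" with the right signature can be wrapped. */
static int slow_square(int x)
{
    for (volatile long i = 0; i < 10000000L; ++i) { }   /* burn a little time */
    return x * x;
}

int main(void)
{
    timed("slow_square", slow_square, 7);   /* the caller stays free of logging code */
    return 0;
}

In a language with real first-class functions and closures the same wrapper can be written far more generally, which is the point above.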
If you cannot use a programming language with powerful capabilities for modularization and reuse, or if you don't want to, Don Knuth's Literate Programming technique will help you organize your code so that you can split up parts the way you want and pay attention only to what you want, when you want. The Noweb literate-programming tool supports any language that can be written in ASCII, and also combinations of those languages.
If my IDE could hide "not interesting code" I would definitely turn the feature off. You wouldn't want that happening, I bet :)
There are certainly languages that minimise the amount of supporting code, but I don't think you could switch from Java to, let's say, JavaScript simply because in JavaScript you wouldn't have to declare every exception... I think it's quite necessary to have your supporting code where it is.
Oh, and you could have your program formally specified and mathematically proven, then you wouldn't need to support your code too much ;D
The real code you are referring to is usually called "Business Logic".
In a good unit testing system, your unit tests should be in their own classes (and probably their own assemblies) so that shouldn't be an issue.
The rest is language-based for the most part. The more advanced a language, the better its ability to avoid support code to some degree. Also, a well-targeted development system can help you avoid writing a lot of code (Visual Basic's screen builder, Ruby on Rails, ...), but these abstractions can break down and cause you to write just as much code as anything else if you use them to develop targets outside their intended types of apps. (VB & Ruby don't help all that much if you're calculating prime numbers.)
Beyond the language/platform, you have refactoring--the art of eliminating all the supporting code that you can (as well as redundancies in your business logic) to keep your code-base clean and small.
When practicing advanced refactoring, you'll probably end up writing tools for yourself.
Sometimes abstracting data out of your code and into a structured file of some sort can eliminate huge piles of support code and move the rest into "Business logic" because now parsing that data and setting it up is part of the "business" your program does.
This is a good trade-off because this type of business logic tends to be more readable and easier to factor. The other advantage of this kind of abstraction is that all your "configuration" is now done in data, which tends to make it somebody else's problem.
As an example of this type of tooling: Rails itself! It takes a lot of the boilerplate of web development and factors it out of the code and into libraries driven by data and simplistic code (Ruby blurs the line between code and data--their data files actually loop back to being specified in Ruby code!)
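A much humbler illustration of the same idea in C (the file name and key=value format are invented for the sketch): the settings live in data, and parsing them becomes part of the "business" the program does.

#include <stdio.h>

int main(void)
{
    /* "settings.cfg" holds lines such as:  timeout=30  */
    FILE* f = fopen("settings.cfg", "r");
    if (!f) { perror("settings.cfg"); return 1; }

    char line[256], key[128], value[128];
    while (fgets(line, sizeof line, f)) {
        if (sscanf(line, "%127[^=]=%127s", key, value) == 2)   /* minimal parsing, no whitespace trimming */
            printf("setting %s -> %s\n", key, value);          /* real code would act on the value here   */
    }
    fclose(f);
    return 0;
}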
It's like you want to take a trip to the top of Pike's Peak. You can take the Winnebago, you can take your SUV, or a motorcycle, or ride up on your bike.
Some ways are more or less expensive, faster, etc. Sometimes you end up taking along a lot of stuff that isn't strictly there for accomplishing vertical progress; it's nice to have a beer in the cooler. But it pays to remember that you're responsible for everything that goes with you to the top.
Aspect Oriented Programming partly addresses this. It allows you to inject code into existing source/bytecode. This way you can make a task such as logging appear in its own module instead of woven into the business logic.
Work expands to fill its container. This really sounds like an economics question: optimizing your outputs (features for users and features for the developer) against expensive inputs (time spent writing features, time spent writing plumbing code).
You have to include user-visible features or you don't have a viable product or job. Once that is partly done, your remaining budget of time will be split between activities with a visible return on effort and those with an invisible (but positive!) return on effort, like exception logging, memory management, etc.
Whatever language makes it cheaper to implement features will probably increase your feature-to-plumbing-code ratio. Likewise, whatever language makes it cheaper to implement plumbing code will probably also increase the feature-to-plumbing-code ratio, because you'll have freed up more time to write features.
As with all optimization problems, you'd have two effects - an increase in the size of the support code (because, say, you're using cheap code generation) and an increase in the size of feature-related code (because you have more time left over to write features) - so the final ratio might be hard to predict.
I do not begrudge the 90% of my code that is data-access plumbing, because it is all testable, code-generated, and very cheap compared to the 10% of handwritten, domain-specific code.
I don't try to make all routines foolproof, only those exposed to the outside world.
http://en.wikipedia.org/wiki/Folding_editor
Higher-level and more dynamic languages are usually less verbose. Weak typing also saves a lot of code. Of course, there are trade-offs.
I use the #region directive in Visual Studio to collapse blocks of code that are not the primary focus, e.g. unit tests. With log4net, logging statements are only ever one line. I haven't found any approach that reduces the parameter-checking code, although it sounds like C# 4 has some kind of contract framework that will help there.
I have some coworkers who once, while being chewed out by a client for an overdue and bug-ridden project, bragged to the customer that they had written 5 times as much test code as operational code.
The client was not happy, and by "not happy" I mean their skin turned green, they grew to 5 times their normal size, and their clothes popped off.
You could just make a static class in a utilities assembly that checks your parameters and such. For instance, the Spring Framework (which is where I got the idea) has an Assert class that makes it really quick to check that string params aren't empty or that object params aren't null. It cleans up validation code and reduces duplication, which is a win-win.
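A rough C analogue of the same idea, just for flavour (the helper and function names are invented for the sketch), so that validation collapses to one readable line per parameter:

#include <stdio.h>
#include <stdlib.h>

/* Centralised parameter checks, in the spirit of Spring's Assert class. */
static void check_not_null(const void* p, const char* what)
{
    if (p == NULL) { fprintf(stderr, "%s must not be null\n", what); abort(); }
}

static void check_not_empty(const char* s, const char* what)
{
    check_not_null(s, what);
    if (s[0] == '\0') { fprintf(stderr, "%s must not be empty\n", what); abort(); }
}

static void register_user(const char* name, const char* email)
{
    check_not_empty(name, "name");     /* one line per parameter instead of a block of ifs */
    check_not_empty(email, "email");
    printf("registered %s <%s>\n", name, email);
}

int main(void)
{
    register_user("Ada", "ada@example.com");
    return 0;
}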