I'm curious about how much design-by-contract is used in practice outside of the Eiffel community. Are there any active open-source projects that use design-by-contract?
Or, to recast the question into one that has a single answer: what's the most widely-used (non-Eiffel) open-source project that uses design-by-contract?
The "non-Eiffel" part of your question is interesting. Contracts only make full sense when the programming language supports them; otherwise they are just a nice syntax for comments.
That brings us to the languages that support contracts. I know of three besides Eiffel:
ESC/Java adds contracts to Java using a language named JML.
.NET contracts for all .NET languages (works at the bytecode level)
Frama-C adds contracts to C using the language ACSL
The first two have executable contracts. Advantages: can be used as run-time assertions. Disadvantages: lack the expressive power to completely specify what a function does in a contract. You can basically only write sanity checks.
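As a rough illustration of what "executable contracts used as run-time assertions" means, here is a minimal Python sketch. It is not JML or .NET Code Contracts; the `contract` decorator and the `largest` function are hypothetical names invented for this example. Note that the postcondition is only a sanity check: it does not state that the result is the maximum.

```python
import functools

def contract(pre=None, post=None):
    """Attach a precondition and a postcondition to a function as run-time checks."""
    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            if pre is not None:
                assert pre(*args, **kwargs), f"precondition of {fn.__name__} violated"
            result = fn(*args, **kwargs)
            if post is not None:
                assert post(result, *args, **kwargs), f"postcondition of {fn.__name__} violated"
            return result
        return wrapper
    return decorate

# Precondition: the list is not empty.
# Postcondition (sanity check only): the result is one of the inputs.
@contract(pre=lambda xs: len(xs) > 0,
          post=lambda result, xs: result in xs)
def largest(xs):
    best = xs[0]
    for x in xs[1:]:
        if x > best:
            best = x
    return best
```

Calling `largest([])` fails the precondition with an `AssertionError` before the body runs, which is the run-time-assertion behaviour described above.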
ACSL contracts, on the other hand, are more expressive and not executable. They make it possible to completely specify that a sort function should always terminate and leave the same elements as in the original array, in sorted order. ACSL contracts can be used for static analysis, especially Hoare-style weakest precondition computation.
And only being really familiar with the last one (disclaimer: I work on Frama-C, but the ACSL part is the work of a lot of people, some of whom have contributed much more than me), I can only mention "ACSL by example", an open source C library with ACSL contracts currently being developed by Fraunhofer FIRST. It's not released yet, but it will be as part of the Device-soft project. I am sure that you could get a preliminary version if you were interested. Feel free to contact the person mentioned as contact on that last web page.
People like Alexander Stepanov and Sean Parent advocate a formal and abstract approach to software design.
The idea is to break complex systems down into a directed acyclic graph and hide cyclic behaviour in nodes representing that behaviour.
Parent gave presentations at BoostCon and Google (slides from BoostCon, p.24 introduces the approach; there is also a video of the Google talk).
While I like the approach and think it's a necessary development, I have a problem imagining how to handle subsystems with amorphous behaviour.
Imagine for example a common pattern for state-machines: using an interface which all states support and having different behaviour in concrete implementations for the states.
How would one solve that?
Note that I am just looking for an abstract approach.
I can think of hiding that behaviour behind a node and defining different sub-DAGs for the states, but that complicates the design considerably if you want to influence the behaviour of the main DAG from a sub-DAG.
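For concreteness, the state-machine pattern referred to above (one interface, a different concrete implementation per state) might look like the following Python sketch. The class names and the "start"/"stop" events are made up for illustration and are not taken from the talk.

```python
from abc import ABC, abstractmethod

class State(ABC):
    """The interface that every state supports."""
    @abstractmethod
    def handle(self, event):
        """Consume one event and return the next state."""

class Idle(State):
    def handle(self, event):
        return Running() if event == "start" else self

class Running(State):
    def handle(self, event):
        return Idle() if event == "stop" else self

class Machine:
    """Holds the current state and delegates every event to it."""
    def __init__(self):
        self.state = Idle()

    def feed(self, event):
        self.state = self.state.handle(event)
```

The behaviour of `Machine.feed` depends entirely on which concrete state object is current, which is the "amorphous" part that is hard to express as a fixed DAG.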
Your question is not clear. Define amorphous subsystems.
You are "just looking for an abstract approach" but then you seem to want details about an implementation in a conventional programming language ("common pattern for state-machines"). So, what are you asking for? How to implement nested finite state-machines?
Some more detail will help the conversation.
For a real abstract approach, look at something like Stream X-Machines:
... The X-machine model is structurally the
same as the finite state machine, except
that the symbols used to label the machine's
transitions denote relations of type X→X. ...
The Stream X-Machine differs from Eilenberg's
model, in that the fundamental data type
X = Out* × Mem × In*,
where In* is an input sequence,
Out* is an output sequence, and Mem is the
(rest of the) memory.
The advantage of this model is that it
allows a system to be driven, one step
at a time, through its states and
transitions, while observing the
outputs at each step. These are
witness values, that guarantee that
particular functions were executed on
each step. As a result, complex
software systems may be decomposed
into a hierarchy of Stream
X-Machines, designed in a top-down
way and tested in a bottom-up way.
This divide-and-conquer approach to
design and testing is backed by
Florentin Ipate's proof of correct
integration, which proves how testing
the layered machines independently is
equivalent to testing the composed
system. ...
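To make the quoted definition more tangible, here is a small Python sketch of a Stream X-Machine under simplifying assumptions: each transition is labelled with a processing function that takes the memory and one input symbol and returns one output symbol plus the new memory (or None when it does not apply). The turnstile example, its state names and its function names are all invented for illustration.

```python
# Processing functions: (memory, input symbol) -> (output symbol, new memory) or None.
# Memory here is simply the number of coins collected so far.
def accept_coin(mem, sym):
    return ("unlock", mem + 1) if sym == "coin" else None

def refund_coin(mem, sym):
    return ("refund", mem) if sym == "coin" else None

def pass_through(mem, sym):
    return ("lock", mem) if sym == "push" else None

def block(mem, sym):
    return ("alarm", mem) if sym == "push" else None

# transitions[state] = list of (processing function, next state)
TRANSITIONS = {
    "locked": [(accept_coin, "unlocked"), (block, "locked")],
    "unlocked": [(refund_coin, "unlocked"), (pass_through, "locked")],
}

def run(inputs, state="locked", mem=0):
    """Drive the machine one input at a time, recording one output per step."""
    outputs = []
    for sym in inputs:
        for phi, next_state in TRANSITIONS[state]:
            step = phi(mem, sym)
            if step is not None:
                out, mem = step
                outputs.append(out)
                state = next_state
                break
        else:
            raise ValueError(f"no transition from {state!r} on {sym!r}")
    return outputs, mem, state
```

Because every step emits an output, the output sequence acts as the witness of which processing function fired at each step, which is the testing property the quote emphasises.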
But I don't see how the presentation is related to this. He seems to speak about a quite mainstream approach to programming, nothing similar to X-Machines. Anyway, the presentation is quite confusing and I have no time to see the video right now.
First impression of the talk, reading the slides only
The author touches haphazardly on numerous fields/problems/solutions, apparently without recognizing it: from Peopleware (for example Psychology of programming), to Software Engineering (for example software product lines), to various programming techniques.
How the various parts are linked and what exactly he is advocating is not clear at all (I'm accustomed to just reading slides and they are usually consequential):
Dataflow programming?
Constraints solving for User Interfaces? For practical implementations, see Garnet for Common Lisp, Amulet/OpenAmulet for C++.
What advantages does this "new" concept-based generic programming give us with respect to well-known approaches (for example, tools based on Hoare logic pre/post conditions and invariants or, better, Hoare's Communicating Sequential Processes (CSP) or Hehner's Practical Theory of Programming or some programming language with a sophisticated type-system like ATS, Qi or Epigram and so on)? It seems to me that introducing "concepts" - which, as-is, are specific to C++ - is not simpler than using the alternatives. Is it just about jargon and "politics"? (Finally formal methods... but disguised).
Why organize program modules as a DAG and not as a tree, like David Parnas advocated decades ago in Designing software for ease of extension and contraction? (here a directly accessible .pdf and here slides from a lecture). The work on X-Machines probably is an answer to this question (going even beyond DAGs), but, again, the author seems to speak about a quite conventional program development regime in which Parnas' approach is the only sensible one.
If/when I watch the video, I will update this answer.
I've heard from someone that they're using a business process automation tool (like Weblogic Integration) as a programming language (which sounds kind of stupid) to make things declarative. Then they put all the logic inside a process, every single if and while.
But isn't a process a step-by-step how-to for reaching a target?
For me it makes a process completely imperative. What do you think?
Orchestration languages are in fact imperative scripting languages with conditionals, looping and other traditionally imperative constructs, typically expressed through a flowchart-based user interface. They certainly do not (in my experience) implement tail-recursive functional programming, backward chaining or any other paradigm that might reasonably be described as declarative in the generally accepted sense.
MS Workflow Foundation is advertised as having a rules engine, but this is fairly simplistic and doesn't really do forward chaining, except in a somewhat roundabout way. ILOG actually makes an adaptor for their rules engine specifically to drop it into MS workflow foundation.
Other workflow tools have better rule engines and a proper forward chaining system that could be viewed as declarative. However, once you get into the workflows themselves with looping and conditional branches you are most definitely in the territory of imperative programming.
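For readers unfamiliar with the term, forward chaining can be sketched in a few lines of Python. This is a toy illustration, not how any of the products mentioned work; the rule and fact names are invented. The rules only declare what follows from what, and the engine decides when to fire them.

```python
def forward_chain(facts, rules):
    """Fire rules whose premises are all known until nothing new can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and premises <= known:
                known.add(conclusion)
                changed = True
    return known

# Each rule is (set of premises, conclusion). The order of the rules does not matter.
RULES = [
    (frozenset({"can_ship", "payment_cleared"}), "ship_order"),
    (frozenset({"order_received", "in_stock"}), "can_ship"),
]
```

Note that `ship_order` is derived even though its rule is listed before the rule that produces `can_ship`; that independence from ordering is what makes the rule set read as declarative, in contrast to a flowchart of ifs and loops.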
However, some systems also implement a petri-net or state change based markup system for workflow, which might reasonably be described as declarative, but they still have an imperative mode of interaction with the underlying system. They still update variables and have side-effects.
I have seen one or two applications (for example TOAD for data analysis) actually using MS Workflow Foundation as a scripting language. As such it allows you to add a scripting facility to the application that (at least for marketing purposes) doesn't require programming skill to use.
In practice, a tool designed for writing, editing and running SQL queries being fitted with a scripting framework for 'non-programmers' makes one wonder what audience it's really aimed at. As a scripting language, workflow modelling tools are fairly clumsy and offer very limited opportunities for abstraction; in practice a .Net based scripting language such as IronPython or Boo, particularly in conjunction with a decent templating mechanism, would be a very powerful addition to such a tool.
One point about graphical languages of this sort is that they do not scale well with complexity. A similar issue applies with ETL tools as well. I have seen a provisioning application (see below) that was done (ironically) with Crossworlds (now known as Websphere Integrator). Within a month of starting on the application it became obvious that the graphical workflow language was not going to scale with the complexity of the application and it was re-built, based on a custom rules engine written in Java and a fairly large body of bespoke java code.
This type of issue is not uncommon with EAI and Orchestration systems and is one of the reasons that SOA is hard to implement in practice. What you are doing is actually pushing business logic into a very clumsy programming environment that is not being officially acknowledged as such. This will work in a simple case but is hard to make work on a complex system - this is sort of a guilty secret in SOA circles.
Coda:
A provisioning application is a system that takes plans for telecommunication services contracts (in this case for a mobile phone network) and pushes configuration information based on rules out to various switches, billing applications and other applications. They tend to be fairly complex. When you buy a mobile phone plan with so many minutes and so many texts per month, a provisioning application is pushing out configuration information to the rest of the system about your access and billing rules.
It is definitely not what people usually mean when they talk about declarative programming, even if in some sense it can be called declarative.
I'm working on improving our group's development process, and I'm considering how best to implement Design By Contract with Test-Driven Development. It seems the two techniques have a lot of overlap, and I was wondering if anyone had some insight on the following (related) questions:
Isn't it against the DRY principle to have TDD and DbC unless you're using some kind of code generator to generate the unit tests based on contracts? Otherwise, you have to maintain the contract in two places (the test and the contract itself), or am I missing something?
To what extent does TDD make DbC redundant? If I write tests well enough, aren't they equivalent to writing a contract? Do I only get added benefit if I enforce the contract at run time as well as through the tests?
Is it significantly easier/more flexible to only use TDD rather than TDD with DbC?
The main point of these questions is this more general question: If we're already doing TDD properly, will we get a significant benefit for the overhead if we also use DbC?
A couple of details, though I think the question is largely language-agnostic:
Our team is very small, <10 programmers.
We mostly use Perl.
Note the differences.
Design driven by contract. Contract Driven Design.
Develop driven by test. Test Driven Development.
They are related in that one precedes the other. They describe software at different levels of abstraction.
Do you discard the design when you go to implementation? Do you consider that a design document is a violation of DRY? Do you maintain the contract and the code separately?
Software is one implementation of the contract. Tests are another. User's manual is a third. Operations guide is a fourth. Database backup/restore procedures are one part of an implementation of the contract.
I can't see any overhead from Design by Contract.
If you're already doing design, then you change the format from too many words to just the right words to outline the contractual relationship.
If you're not doing design, then writing a contract will eliminate problems, reducing cost and complexity.
I can't see any loss of flexibility.
start with a contract,
then
a. write tests and
b. write code.
See how the two development activities are essentially intertwined and both come from the contract.
I think there is overlap between DbC and TDD, however, I don't think there is duplicated work: introducing DbC will probably result in a reduction of test cases.
Let me explain.
In TDD, tests aren't really tests. They are behavioral specifications. However, they are also design tools: by writing the test first, you use the external API of your object under test – that you haven't actually written yet – in the same way that a user would. That way, you design the API in a way that makes sense to a user, and not in the way that makes it easiest for you to implement. Something like queue.full? instead of queue.num_entries == queue.size.
This second part cannot be replaced by Contracts.
The first part can be partially replaced by contracts, at least for unit tests. TDD tests serve as specifications of behavior, both to other developers (unit tests) and domain experts (acceptance tests). Contracts also specify behavior, to other developers, to domain experts, but also to the compiler and the runtime library.
But contracts have fixed granularity: you have method pre- and postconditions, object invariants, module contracts and so on. Maybe loop variants and invariants. Unit tests however, test units of behavior. Those might be smaller than a method or consist of multiple methods. That's not something you can do with contracts. And for the "big picture" you still need integration tests, functional tests and acceptance tests.
And there is another important part of TDD that DbC doesn't cover: the middle D. In TDD, tests drive your development process: you never write a single line of implementation code unless you have a failing test, you never write a single line of test code unless your tests all pass, you only write the minimal amount of implementation code to make the tests pass, you only write the minimal amount of test code to produce a failing test.
In conclusion: use tests to design the "flow", the "feel" of the API. Use contracts to design the, well, contract of the API. Use tests to provide the "rhythm" for the development process.
Something like this:
Write an acceptance test for a feature
Write a unit test for a unit that implements a part of that feature
Using the method signature you designed in step 2, write the method prototype
Add the postcondition
Add the precondition
Implement the method body
If the acceptance test passes, goto 1, otherwise goto 2
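The unit-level steps above (2 through 6) could look roughly like this in Python, using plain assert statements as a stand-in for real contract support. The `Stack` class and the test name are hypothetical, chosen only to keep the sketch short.

```python
class Stack:
    def __init__(self):
        self._items = []

    def __len__(self):
        return len(self._items)

    def push(self, item):
        old_len = len(self)
        self._items.append(item)
        # Postcondition: exactly one more item than before.
        assert len(self) == old_len + 1

    # Step 3: the method prototype comes from how the test below uses it.
    def pop(self):
        # Step 5: precondition.
        assert len(self) > 0, "precondition: stack must not be empty"
        old_len = len(self)
        # Step 6: method body.
        item = self._items.pop()
        # Step 4: postcondition.
        assert len(self) == old_len - 1, "postcondition: one item removed"
        return item

# Step 2: the unit test, written first, fixes the "feel" of the API.
def test_pop_returns_last_pushed_item():
    s = Stack()
    s.push(1)
    s.push(2)
    assert s.pop() == 2
    assert len(s) == 1
```

The test exercises one concrete scenario, while the pre- and postconditions hold for every call, which is the division of labour the answer argues for.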
If you want to know what Bertrand Meyer, the inventor of Design by Contract, thinks about combining TDD and DbC, there is a nice paper by his group, called Contract-Driven Design = Test-Driven Development - Writing Test Cases. The basic premise is that contracts provide an abstract representation of all possible cases, whereas test cases only test specific cases. Therefore, a suitable test harness can be automatically generated from the contracts.
I would add:
the API is the contract for the programmers, the UI definition is the contract with the clients, the protocol is the contract for client-server interactions. Get those first, then you can take advantage of parallel development tracks and not get lost in the weeds. Yes, periodically review to make sure requirements are met, but never start a new track without the contract. And 'contract' is a strong word: once deployed, it must never change. You should include versioning management and introspection from the get-go, changes to the contract are only implemented by extension sets, version numbers change with these, and then you can do things like graceful degradation when dealing with mixed or old installations.
I learned this lesson the hard way, with a large project that wandered off into never-never land, then applied it the right way later when seriously under the gun, company-survival, short fuse timeline. We defined the protocol, defined and wrote a set of protocol emulations for each side of the transactions (basically canned message generators and received message checker, one evenings' worth of two-brained coding), then parted to separately write the server and client ends of the app. We recombined the night of the show, and it just worked. Requirements, design, contract, test, code, integrate. In that order. Repeat until baked.
I am a little leery of design by TLA. As with Patterns, buzz-word compliant recipes are a good guide, but it is my experience that there is no such thing as a one-size-fits-all design or project management procedure. If you are doing things precisely By The Book (tm) then, unless it is a DOD contract with DOD procedural requirements, you will probably get into trouble somewhere along the way. Read the Book(s), yes, but be sure to understand them, and then take into account also the people side of your team. Rules that are only enforced by the Book will not get enforced uniformly - even when tool-enforced there can be drop-outs (e.g. svn comments left empty or cryptically brief). Procedures only tend to get followed when the tool chain not only enforces them but makes following easier than any possible short-cuts. Believe me, when the going gets tough, the short-cuts get found, and you may not know about the ones that got used at 3am until it is too late.
You can also use executable acceptance tests that are written in the domain language of the contract. It might not be the actual "contract", but half way between unit tests and the contract.
I would recommend using Ruby Cucumber
http://github.com/aslakhellesoy/cucumber
But since you are a Perl shop, then maybe you can use my own small attempt at p5-cucumber.
http://github.com/kesor/p5-cucumber
Microsoft has done work on automatic generation of unit tests, based on code contracts and parameterized unit tests. E.g. the contract says the count must be increased by one when an item is added to a collection, and the parameterized unit test says how to add “n” items to a collection. Pex will then try to create a unit test that proves the contract is broken. See this video for an overview.
If this works, you will only have to write a unit test for one example of each thing you are trying to test, and Pex will then be able to work out which data items will break the test.
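The idea can be imitated in Python, with the important caveat that this sketch uses a naive search over randomly generated inputs, not the analysis Pex performs. The `Bag` and `DedupBag` classes and the function names are invented for illustration: the contract lives in `add`, the parameterized test says how to add n items, and a small driver looks for an n that breaks the contract.

```python
import random

class Bag:
    def __init__(self):
        self._items = []

    def count(self):
        return len(self._items)

    def _store(self, item):
        self._items.append(item)

    def add(self, item):
        before = self.count()
        self._store(item)
        # Contract: adding an item increases the count by exactly one.
        assert self.count() == before + 1, "contract: add must grow count by one"

class DedupBag(Bag):
    """Deliberately broken variant: silently drops duplicates."""
    def _store(self, item):
        if item not in self._items:
            self._items.append(item)

def add_n_items(make_bag, n, rng):
    """Parameterized test: add n items drawn from a small value range."""
    bag = make_bag()
    for _ in range(n):
        bag.add(rng.randint(0, 3))
    assert bag.count() == n

def find_counterexample(make_bag, max_n=20, seed=0):
    """Return the first n for which the contract or the test fails, else None."""
    rng = random.Random(seed)
    for n in range(max_n + 1):
        try:
            add_n_items(make_bag, n, rng)
        except AssertionError:
            return n
    return None
```

With only four possible item values, any run that adds five or more items to `DedupBag` must hit a duplicate, so the search is guaranteed to report a failing n for it, while `Bag` passes every case.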
I had some ruminations about that topic some time ago.
You may want to take a look at
http://gleichmann.wordpress.com/2007/12/09/test-driven-development-and-design-by-contract-friend-or-foe/
When you are using TDD to implement a new method, you need some input: you need to know the assertions to check in your tests. Design-by-contract gives you those assertions: they are the post-conditions and invariants of the method.
I have found DbC very handy for jumpstarting the red-green-refactor cycle because it helps in identifying unit tests to start with. With DbC I start thinking about pre-conditions that the object being TDD-ed must handle and each pre-condition might represent a failing unit test to start a red-green-refactor cycle. At some point I switch to start the cycle with a failing unit test for a post-condition, and then just keep the TDD flow going. I have tried this approach with newcomers to TDD and it really works in kickstarting the TDD mindset.
In summary, think of DbC as an effective way to identify key behavioral unit tests. DbC helps in analyzing inputs (pre-conditions) and outputs (post-conditions), which are the two things that we need to control (inputs) and observe (outputs) to write testable software (an aim shared with TDD).
I have been doing a little reading on Flow Based Programming over the last few days. There is a wiki which provides further detail. And wikipedia has a good overview on it too. My first thought was, "Great another proponent of lego-land pretend programming" - a concept harking back to the late 80's. But, as I read more, I must admit I have become intrigued.
Have you used FBP for a real project?
What is your opinion of FBP?
Does FBP have a future?
In some senses, it seems like the holy grail of reuse that our industry has pursued since the advent of procedural languages.
1. Have you used FBP for a real project?
We've designed and implemented a DF server for our automation project (dispatcher, component interface, a bunch of components, DF language, DF compiler, UI). It is written in bare C++ and runs on several Unix-like systems (Linux on x86, MIPS, avr32 etc., and Mac OS X). It lacks several features, e.g. sophisticated flow control and complex thread control (there is only a not too advanced component for it), so it is just a prototype, even though it works. We're now working on a full-featured server. We learnt a lot while implementing and using the prototype.
Also, we'll make a visual editor some day.
2. What is your opinion of FBP?
2.1. First of all, dataflow programming is ultimate fun
When I met dataflow programming, I felt like I did 20 years ago, when I first met programming. Although DF programming differs from procedural/OOP programming, it's still a kind of programming. There are a lot of things to discover, even very simple ones! It's very funny when, as an experienced programmer, you meet a DF problem that is a very basic thing but was completely unknown to you before. So, if you jump into DF programming, you will feel like a rookie programmer who has just met the "loop" or the "condition".
2.2. It can be used only for specific architectures
It's just a hammer, which is for hammering nails. DF is not suitable for UIs, web servers and so on.
2.3. Dataflow architecture is optimal for some problems
A dataflow framework can do magical things. It can parallelize procedures that were not originally designed for parallelization. Components are single-threaded, but when they're organized into a DF graph, they become multi-threaded.
Example: did you know, that make is a DF system? Try make -j (see man, what -j is used for). If you have multi-core machine, compile your project with and without -j, and compare times.
2.4. Optimal split of the problem
If you're writing a program, you often split up the problem for smaller sub-problems. There are usual split points for well-known sub-problems, which you don't need to implement, just use the existing solutions, like SQL for DB, or OpenGL for graphics/animation, etc.
DF architecture splits your problem a very interesting way:
the dataflow framework, which provides the architecture (just use an existing one),
the components: the programmer creates components; the components are simple, well-separated units - it's easy to make components;
the configuration: a.k.a. dataflow programming: the configurator puts the dataflow graph (program) together using components provided by the programmer.
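The three-way split above can be sketched in Python as follows. This is a deliberately tiny, single-threaded toy and not the C++ server described in this answer; the component names, the graph format and the `run` function are all assumptions made for illustration.

```python
from collections import deque

# --- Components (written by the programmer): one packet in, zero or more out ---
def parse(line):
    return [int(token) for token in line.split()]

def keep_even(n):
    return [n] if n % 2 == 0 else []

def square(n):
    return [n * n]

COMPONENTS = {"parse": parse, "keep_even": keep_even, "square": square}

# --- Configuration (written by the configurator): which output feeds which input ---
GRAPH = {"parse": ["keep_even"], "keep_even": ["square"], "square": []}

# --- Framework: moves packets along the configured connections ---
def run(graph, components, entry, packets):
    pending = deque((entry, packet) for packet in packets)
    results = []
    while pending:
        node, packet = pending.popleft()
        for out in components[node](packet):
            targets = graph[node]
            if not targets:
                results.append(out)  # a node with no outgoing connection is a sink
            for target in targets:
                pending.append((target, out))
    return results
```

Rewiring the same components, for example `{"parse": ["square"], "square": []}`, yields a different program without touching any component code, which is the point being made about configurators.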
If your component set is well-designed, the configurator can build systems that the programmer never even dreamed about. The configurator can implement new features without disturbing the programmer. Customers are happy because they get a personalised solution. The software manufacturer is also happy because he/she doesn't need to maintain several customer-specific branches of the software, just customer-specific configurations.
2.5. Speed
If the system is built on native components, the DF program is fast. Compared to a simple OOP program, the only time lost is the message dispatching between components, and that is minimal too.
3. Does FBP have a future?
Yes, sure.
The main reason is that it can solve massive multiprocessing issues without introducing brand new strange software architectures or weird languages. Dataflow programming is easy, and I mean both parts: component programming and dataflow configuration building. (Even writing a dataflow framework is not rocket science.)
Also, it's very economical. If you have a good set of components, you only need to put the lego bricks together. A DF program is easy to maintain. Building the DF config requires no experienced programmer, just a system integrator.
I would be happy if native systems spread, with doors open for custom component creation. There should also be a standard DF language, so that it can be used with platform-independent visual editors and several DF servers.
Interesting discussion! It occurred to me yesterday that part of the confusion may be due to the fact that many different notations use directed arcs, but use them to mean different things. In FBP, the lines represent bounded buffers, across which travel streams of data packets. Since the components are typically long-running processes, streams may comprise huge numbers of packets, and FBP applications can run for very long periods - perhaps even "perpetually" (see a 2007 paper on a project called Eon, mostly by folks at UMass Amherst). Since a send to a bounded buffer suspends when the buffer is (temporarily) full, and a receive suspends when it is (temporarily) empty, indefinite amounts of data can be processed using finite resources.
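The bounded-buffer behaviour can be demonstrated with Python's standard `queue.Queue`, whose `put` blocks while the queue is full and whose `get` blocks while it is empty. This is only an analogy in Python threads, not an FBP implementation; the component functions and the use of None as an end-of-stream marker are conventions of this sketch.

```python
import queue
import threading

def producer(out, n):
    for i in range(n):
        out.put(i)   # suspends while the buffer is full
    out.put(None)    # end-of-stream marker (a convention of this sketch)

def doubler(inp, out):
    while True:
        packet = inp.get()   # suspends while the buffer is empty
        if packet is None:
            out.put(None)
            return
        out.put(packet * 2)

def run(n, capacity=2):
    """Stream n packets through two connections that hold at most `capacity` packets."""
    a = queue.Queue(maxsize=capacity)
    b = queue.Queue(maxsize=capacity)
    threads = [
        threading.Thread(target=producer, args=(a, n)),
        threading.Thread(target=doubler, args=(a, b)),
    ]
    for t in threads:
        t.start()
    total = 0
    while True:
        packet = b.get()
        if packet is None:
            break
        total += packet
    for t in threads:
        t.join()
    return total
```

However large `n` is, no more than `capacity` packets ever sit in each connection, which is how a stream of indefinite length can be processed with finite resources.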
By comparison, the E in Grafcet comes from Etapes, meaning "steps", which is a rather different concept. In this kind of model (and there are a number of these out there), the data flowing between steps is either limited to what can be held in high-speed memory at one time, or has to be held on disk. FBP also supports loops in the network, which is hard to do in step-based systems - see for example http://www.jpaulmorrison.com/cgi-bin/wiki.pl?BrokerageApplication - notice that this application used both MQSeries and CORBA in a natural way. Furthermore, FBP is natively parallel, so it lends itself to programming of grid networks, multicore machines, and a number of the directions of modern computing. One last comment: in the literature I have found many related projects, but few of them have all the characteristics of FBP. A list that I have amassed over the years (a number of them closer than Grafcet) can be found in http://www.jpaulmorrison.com/cgi-bin/wiki.pl?FlowLikeProjects .
I do have to disagree with the comment about FBP being just a means of implementing FSMs: I think FSMs are neat, and I believe they have a definite role in building applications, but the core concept of FBP is of multiple component processes running asynchronously, communicating by means of streams of data chunks which run across what are now called bounded buffers. Yes, definitely FSMs are one way of building component processes, and in fact there is a whole chapter in my book on FBP devoted to this idea, and the related one of PDAs (1) - http://www.jpaulmorrison.com/fbp/compil.htm - but in my opinion an FSM implementing a non-trivial FBP network would be impossibly complex. As an example the diagram shown in
is about 1/3 of a single batch job running on a mainframe. Every one of those blocks is running asynchronously with all the others. By the way, I would be very interested in hearing more answers to the questions in the first post!
1: http://en.wikipedia.org/wiki/Pushdown_automaton Push-down automata
Whenever I hear the term flow based programming I think of LabView, conceptually, i.e. component processes whose scheduling is driven primarily by a change to their input data. This really IS lego programming in the sense that the LabView platform was used for the latest crop of Mindstorms products. However, I disagree that this makes it a less useful programming model.
For industrial systems which typically involve data collection, control, and automation, it fits very well. What is any control system if not data in transformed to data out? I.e. what component in your control scheme would you not prefer to represent as a black box in a bigger picture, if you could do so? To achieve that level of architectural clarity using other methodologies you might have to draw a data domain class diagram, then a problem domain run time class relationship, then on top of that a use case diagram, and flip back and forth between them. With flow driven systems you have the luxury of being able to collapse a lot of this information together accurately enough that you can realistically design a system visually once the components are built and defined.
One question I never had to ask when looking at an application written in LabView is "What piece of code set this value?", as it was inherent and easy to trace backwards from the data, and mistakes like multiple unintended writers were impossible to make.
If only that was true of code written in a more typically procedural fashion!
1) I built a small FBP framework for an anomaly detection project, and it turned out to be a great idea.
You can also have a look at some of the KNIME videos, that give a good idea of what a flow based framework feels like when the framework is put together by a great team. Admittedly, it is batch based and not created for continuous operation.
By far the best example of flow based programming, however, is UNIX pipes, which are one of the oldest and most overlooked FBP frameworks. I don't think I have to elaborate on the power of *nix pipes...
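The shape of a pipeline such as `cat log | grep error | wc -l` can be mimicked with Python generators. This is only an analogy: real pipes connect concurrently running processes, whereas generators run in a single thread and pull one item at a time. The stage names and the sample log text are made up.

```python
def lines_of(text):
    """Source stage: emit one line at a time, like `cat`."""
    for line in text.splitlines():
        yield line

def grep(pattern, lines):
    """Filter stage: pass through only the lines that contain the pattern."""
    for line in lines:
        if pattern in line:
            yield line

def count(lines):
    """Sink stage: count the lines that arrive, like `wc -l`."""
    return sum(1 for _ in lines)

LOG = "ok start\nerror disk\nok run\nerror net\n"
errors = count(grep("error", lines_of(LOG)))
```

Each stage knows nothing about its neighbours beyond "a stream of lines", so stages can be recombined freely, which is the property that makes pipes a good illustration here.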
2) FBP is a very powerful tool for a large set of problems. The intrinsic parallelism is a great advantage, and any FBP framework can be made completely network transparent by using adapter modules. Smart frameworks are also absurdly fault tolerant, and able to dynamically reload crashed modules when necessary. The conceptual simplicity also allows cleaner communication with everybody involved in a project, and much cleaner code.
3) Absolutely! Pipes are here to stay, and are one of the most powerful features of Unix. The advantages of an FBP framework over a static program are many, and they trivialise change, to the point where some frameworks can be reconfigured while running with no special measures.
FBP FTW! ;-)
In automotive development, they have a language agnostic messaging protocol which is part of the MOST specification (Media Oriented Systems Transport). It was designed to communicate between components over a network or within the same device. Systems usually have both a real and visualized message bus - therefore you effectively have a form of flow based programming.
That was what made the light bulb go on for me several years ago and brought me here. It really is a fantastic way to work and so much more fun than conventional programming. The message catalog forms the central specification and point of reference. It works well for both developers and management, i.e. management are able to browse the message catalog instead of looking at source.
With integrated logging also referencing the catalog to produce intelligible analysis things can get really productive. I have real world experience of developing commercial products in this way. I am interested in taking things further, particularly with regards to tools and IDEs. Unfortunately I think many people within the automotive sector have missed the point about how great this is and have failed to build on it. They are now distracted by other fads and failed to realize that there was far more to most development than the physical bus.
I've used Spring Web Flow extensively in Java Web applications to model (typically) application processes, which tend to be complex wizard-like affairs with lots of conditional logic as to which pages to display. It's incredibly powerful. A new product was added and I managed to recut the existing pieces into a completely new application process in an hour or two (with the addition of a couple of new views/states).
I also looked into using OS Workflow to model business processes but that project got canned for various reasons.
In the Microsoft world you have Windows Workflow Foundation ("WWF"), which is becoming more popular, particularly in conjunction with Sharepoint.
FBP is just a means of implementing a finite state machine. It's nothing new.
I realize that it is not exactly the same thing, but this model has been used for years in PLC programming. The IEC 61131-3 standard calls it Sequential Function Chart, but many people call it Grafcet after a popular implementation. It offers parallel processing and defines transitions between states.
It's being used in the Business Intelligence world these days to mash up and process data. Data processing steps like ETL, querying, joining, and producing reports can be done by the end-user. I'm a developer on an open system, ComposableAnalytics.com. In CA, the flow-based apps can be shared and executed via the browser.
This is what MQ Series, MSMQ and JMS are for.
This is cornerstone of Web Services and Enterprise Service Bus implementations.
Products like TIBCO and Sun's JCAPS are basically flow-based without using this particular buzz-word.
Most of the work of the application is done with small modules that pass messages through a processing network.