User Stories and Use Case Scenarios - language-agnostic

What is the difference between User Stories and Use Case Scenarios, purpose-wise?

Use Cases are more like a contract, while User Stories are a planning tool. Consequently, Use Cases usually outlive User Stories, since they (should) serve as documentation that concretely reflects the built system.
User stories are written by the customer/stakeholder/client/user. User stories aren't very detailed and are relatively open to interpretation.
Use cases are more formal in structure and are often written by someone on the team - a requirements engineer or product manager. They are often more detailed, breaking down an interaction into individual steps, and clearly identifying pre-conditions and post-conditions such as failure conditions and success conditions.
While one Use Case can cover many scenarios - success and failure; validation errors; sub use-cases and extensions - a User Story is more limited in scope, usually describing a single scenario.
See also User_story#Comparing_with_use_cases on Wikipedia, as well as the chapter "What Use Cases are Not" in the book User Stories Applied.
Lastly, according to Alistair Cockburn...
A user story is synonymous with “feature” as used in the 1990s, a marker for what is to be built, fine-grained enough to fit into modern iteration/sprint periods.
A use case provides a contextual view of what is to be built, serving to bind the organization together, among other things.

"A user story is to a use case as a gazelle is to a gazebo." -- Cockburn
User Stories (as opposed to requirements) are brief statements of intent that describe something the system needs to do for some user. They are a primary technique used by agile teams to understand and communicate customer requirements. The user story is certainly a handy construct, and small user stories help us drive the extreme incrementalism that characterizes agile development.
Use Cases are a traditional way to express system behavior in complex systems. Use cases are the primary means to represent requirements with the UML. They are well described there as well as in a variety of texts on the subject. Use cases can be used for both specification and analysis. They are especially useful when the system of interest is in turn composed of other subsystems.
Books I recommend:
Agile Software Requirements (Dean Leffingwell)
Writing Effective Use Cases (Alistair Cockburn)

Related

Interesting NLP/machine-learning style project -- analyzing privacy policies

I wanted some input on an interesting problem I've been assigned. The task is to analyze hundreds, and eventually thousands, of privacy policies and identify their core characteristics. For example, do they take the user's location? Do they share or sell data with third parties? And so on.
I've talked to a few people, read a lot about privacy policies, and thought about this myself. Here is my current plan of attack:
First, read a lot of privacy policies and find the major "cues" or indicators that a certain characteristic is met. For example, if hundreds of privacy policies contain the same line, "We will take your location.", that line could be a cue with 100% confidence that the policy involves taking the user's location. Other cues would give much smaller degrees of confidence about a certain characteristic. For example, the presence of the word "location" might increase the likelihood that the user's location is stored by 25%.
The idea would be to keep developing these cues, and their appropriate confidence intervals to the point where I could categorize all privacy policies with a high degree of confidence. An analogy here could be made to email-spam catching systems that use Bayesian filters to identify which mail is likely commercial and unsolicited.
I wanted to ask whether you guys think this is a good approach to this problem. How exactly would you approach a problem like this? Furthermore, are there any specific tools or frameworks you'd recommend using? Any input is welcome. This is my first time doing a project which touches on artificial intelligence, specifically machine learning and NLP.
This is text classification. Given that you have multiple output categories per document, it's actually multilabel classification. The standard approach is to manually label a set of documents with the classes/labels that you want to predict, then train a classifier on features of the documents; typically word or n-gram occurrences or counts, possibly weighted by tf-idf.
The popular learning algorithms for document classification include naive Bayes and linear SVMs, though other classifier learners may work too. Any classifier can be extended to a multilabel one by the one-vs.-rest (OvR) construction.
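If it helps to see the shape of it, here is a minimal sketch of that multilabel setup in Python with scikit-learn (assuming that library is installed; the policy snippets and label names below are made up for illustration):

    # Minimal multilabel classification sketch (assumes scikit-learn is installed).
    # The policy snippets and label names are made-up placeholders.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.multiclass import OneVsRestClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import MultiLabelBinarizer
    from sklearn.svm import LinearSVC

    policies = [
        "We collect your location and share it with advertising partners.",
        "We never sell personal data to third parties.",
    ]
    labels = [
        {"collects_location", "shares_with_third_parties"},
        set(),
    ]

    mlb = MultiLabelBinarizer()
    y = mlb.fit_transform(labels)              # one binary column per label

    clf = make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2)),   # tf-idf weighted word uni/bigrams
        OneVsRestClassifier(LinearSVC()),      # one binary classifier per label
    )
    clf.fit(policies, y)

    predicted = clf.predict(["Your location may be stored on our servers."])
    print(mlb.inverse_transform(predicted))

In practice you would label far more than two documents, but the pipeline stays the same: vectorize, train one binary classifier per characteristic, and read the predicted label sets back out.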
A very interesting problem indeed!
On a higher level, what you want is summarization: a document has to be reduced to a few key phrases. This is far from being a solved problem. A simple approach would be to search for keywords as opposed to key phrases. You can try something like LDA for topic modelling to find what each document is about. You can then search for topics which are present in all documents; I suspect what will come up is stuff to do with licenses, location, copyright, etc. MALLET has an easy-to-use implementation of LDA.
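MALLET is a Java toolkit; if you would rather stay in Python, scikit-learn ships a comparable LDA implementation. A rough sketch (the documents are made up):

    # Rough LDA topic-modelling sketch with scikit-learn (MALLET itself is Java).
    from sklearn.decomposition import LatentDirichletAllocation
    from sklearn.feature_extraction.text import CountVectorizer

    docs = [
        "we collect location data and share it with advertising partners",
        "we store your email address and never sell it to third parties",
        "cookies are used to remember your preferences and login session",
    ]

    vectorizer = CountVectorizer(stop_words="english")
    counts = vectorizer.fit_transform(docs)

    lda = LatentDirichletAllocation(n_components=2, random_state=0)
    lda.fit(counts)

    # Show the top words in each discovered topic.
    terms = vectorizer.get_feature_names_out()
    for topic_idx, weights in enumerate(lda.components_):
        top = [terms[i] for i in weights.argsort()[-5:][::-1]]
        print("topic", topic_idx, ":", ", ".join(top))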
I would approach this as a machine learning problem where you are trying to classify things in multiple ways, i.e. wants location, wants SSN, etc.
You'll need to enumerate the characteristics you want to use (location, SSN), and then for each document say whether that document uses that info or not. Choose your features, train on your data, and then classify and test.
I think simple features like words and n-grams would probably get you pretty far, and a dictionary of words related to stuff like SSN or location would finish it nicely.
Use the machine learning algorithm of your choice; Naive Bayes is very easy to implement and use and would work OK as a first stab at the problem.
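One possible way to combine that keyword-dictionary idea with plain n-gram features, sketched in Python with scikit-learn (the keyword lists and labels are illustrative only, not a real lexicon):

    # Sketch: n-gram features plus a hand-built keyword dictionary, fed to
    # Naive Bayes. The keyword lists and labels are illustrative only.
    import numpy as np
    from sklearn.base import BaseEstimator, TransformerMixin
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import FeatureUnion, make_pipeline

    KEYWORDS = {
        "location": ["location", "gps", "geolocation"],
        "ssn": ["ssn", "social security"],
    }

    class KeywordFlags(BaseEstimator, TransformerMixin):
        """One binary feature per characteristic: does any of its keywords occur?"""
        def fit(self, X, y=None):
            return self
        def transform(self, X):
            return np.array([[int(any(k in doc.lower() for k in kws))
                              for kws in KEYWORDS.values()]
                             for doc in X])

    features = FeatureUnion([
        ("ngrams", CountVectorizer(ngram_range=(1, 2))),
        ("keywords", KeywordFlags()),
    ])

    docs = ["We may record your GPS location.", "We do not collect your SSN."]
    wants_location = [1, 0]                  # made-up labels for one characteristic

    clf = make_pipeline(features, MultinomialNB())
    clf.fit(docs, wants_location)
    print(clf.predict(["Your location is stored for a year."]))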

What factors do I need to consider to determine whether I should "trust the defaults" with respect to encryption

Background
With respect to cryptography in general, the following advice is so common that it may even be platform and language-agnostic.
Cryptography is an incredibly complex subject which developers should leave to security experts.
I understand and agree with the reasoning behind this statement, and therefore follow the advice when using cryptography in an application.
That being said, because cryptography is treated so lightly in all but crypto-specific reference material, I do not know enough about how cryptography works to determine whether the default provided to me is adequate for my situation. There are thousands of crypto frameworks out there in a myriad of different languages, and I refuse to believe that every one of those implementations is secure, because I don't believe every crypto implementation was created by a crypto expert; if popular opinion is to be believed, there just aren't that many of them.
Question:
What information do I need to know about a given encryption algorithm to be able to determine for myself whether an algorithm is a reasonable choice?
You need to know the current estimates of time-to-break for each algorithm variant.
You need to know the certifications for particular libraries.
You need to know the required effective security level for the data you are encrypting. Health information in the USA has particular requirements, for example. So do electric utilities.
The more technical you want to get with crypto algorithm evaluation, the more you are wanting the services of an expert. :-/
Consider http://www.cryptopp.com as an example of information provided. For instance, it is certified by NIST.
What information do I need to know about a given encryption algorithm to be able to determine for myself whether an algorithm is a reasonable choice?
Once you identify what you do need, there are very few peer-reviewed solutions you can trust. For example:
Symmetric Encryption: AES (Rijndael), Triple DES
Asymmetric Encryption: Diffie-Hellman, RSA
Hashing: The SHA family of functions
These are proven, battle-tested solutions. Until someone proves otherwise, they can be used safely. It's been a while since cryptography departed from security through obscurity and "roll your own" implementations.
There's a lot of cryptographic quackery out there, just be careful when choosing your solution. Make sure it's built on proven technologies, and if it sounds too good or has words like "unbreakable," "revolutionary" or the like, you can be 99% sure that it's bogus.
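To make that concrete: rather than assembling primitives yourself, lean on a high-level recipe from a well-reviewed library and accept its defaults. A minimal sketch using the Python cryptography package's Fernet recipe (assuming that third-party package is installed):

    # Sketch: rely on a vetted high-level recipe instead of wiring up primitives
    # yourself. Assumes the third-party "cryptography" package is installed.
    from cryptography.fernet import Fernet

    key = Fernet.generate_key()                 # 32 random bytes, base64-encoded
    f = Fernet(key)

    token = f.encrypt(b"account number 12345")  # AES-CBC plus HMAC; IV and MAC handled for you
    assert f.decrypt(token) == b"account number 12345"

The point is not this particular library, but that the IV generation, padding, and authentication are all decided by people who reviewed the design, not by you at 5pm on a Friday.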
The effective methods are well documented and extensively used. I tend to think of three situations relative to cryptography:
If a government sized entity wants your stuff, they'll get it.
For confidential personal or business stuff, social engineering and non-cryptographic means are almost always more effective than code-breaking for almost any imaginable situation.
For hiding stuff from friends, relations, and mere interlopers, anything off the shelf is sufficient. In these scenarios, the fact that you have hidden stuff is typically more damning than the stuff itself might be.
There was a time when railroad boxcars switched from heavy-duty padlocks to easily defeated but hard to forge loops of wire. Make the lock stronger and they just go in through the walls. Turn the lock into an intrusion detector and you've gained something.
Signing and authentication are turning out to be better uses of cryptography than mere encryption.

Under what circumstances is it reasonable to use a vendor-specific API?

I'm working with WebSphere Portal and Oracle. On two occasions last week, I found that the most direct way to code a particular solution involved using APIs provided by IBM and Oracle. However, I realize every line of code written using these vendor APIs renders us a little bit more locked in to these systems (particularly the portal, which we only implemented recently).
How would you decide whether the benefits of using a vendor API outweigh the costs of being tied (Bound? Shackled? Fettered? Pinioned?) to a particular product?
Obviously a very personal / specific question...
One thing to remember, however, is that you can offset some of the risks of being "held prisoner" by wrapping whatever proprietary API is supplied in your own defined API. In doing so you will not only decouple your own code from the 3rd party API, but you may also make it easier to write your own logic, by streamlining the 3rd party API down to the minimal set of features you desire. A small risk with this minimalist approach, however, is that you may sometimes miss out on some cool features supplied by the 3rd party library/API.
If other vendors are likely to provide similar functionality with a different API, you can define your own interface for the operations, and create a vendor-specific implementation. This helps you avoid lock-in.
Whether this is worthwhile depends on several factors: how much application code will depend on the API, how complex the API is (and the wrapper it will require), and how likely you are to switch vendors.
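As a sketch of what that wrapper might look like (all class and method names here are hypothetical, not a real IBM or Oracle API):

    # Sketch of the wrapping idea in Python: application code depends only on
    # DocumentStore, and the vendor API appears in exactly one adapter class.
    # All names here (read_blob, write_blob, etc.) are hypothetical.
    from abc import ABC, abstractmethod

    class DocumentStore(ABC):
        """The minimal interface the application actually needs."""
        @abstractmethod
        def fetch(self, doc_id: str) -> bytes: ...

        @abstractmethod
        def save(self, doc_id: str, payload: bytes) -> None: ...

    class OracleDocumentStore(DocumentStore):
        """Adapter that translates the interface onto the vendor's own API."""
        def __init__(self, vendor_client):
            self._client = vendor_client               # the vendor SDK object

        def fetch(self, doc_id: str) -> bytes:
            return self._client.read_blob(doc_id)      # hypothetical vendor call

        def save(self, doc_id: str, payload: bytes) -> None:
            self._client.write_blob(doc_id, payload)   # hypothetical vendor call

Switching vendors then means writing one new adapter rather than touching every call site.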
The cost-benefit ratio of being tightly bound to a custom/vendor-specific API varies a lot depending on what you're working with and on who will use your product. Vendor lock-in may be an impediment for some customers and may go unnoticed by others. Perhaps you may find these guidelines useful when deciding what to do:
If your product will target a specific company only, and they're already bound to some technology stack because they have contracts that make them pay big bucks for support, and there's no way that in the near or far future they'll change that stack, you should go with the lock-in.
If your product will target a variety of companies in the same sector and you verify that most of them have the same kind of systems, then you may go with the lock-in for now because it's faster or easier, and later on adapt your code for the minority.
If you will target a broad range of customers, try to avoid lock-ins. However, if you note that they will play a fundamental role in your product's time-to-market, you may again use them for your first customers and then adapt the product to the needs of the others, but only as they come.
This is a pretty subjective question. My take is that if you're using Oracle, and don't have immediate plans to move away from Oracle, I can't think of a good reason to deny yourself the power and flexibility of the APIs provided by the vendor. Abstracting your application to the Nth degree in pursuit of some holy grail of portability often causes more trouble than it's worth, and definitely increases your cost. It can also lead to less-than-optimal performance. Even standard SQL has been enhanced and tweaked by every DB vendor to the point that you're probably going to be using some proprietary extension in a query at some point; you may as well get all of the benefits while you're at it.

How do I explain APIs to a non-technical audience?

A little background: I have the opportunity to present the idea of a public API to the management of a large car sharing company in my country. Currently, the only options to book a car are a very slow web interface and a hard-to-reach call center. So I'm excited about the possibility of writing my own search interface, integrating this functionality into other products and applications, etc.
The problem: Due to the special nature of this company, I'll first have to get my proposal through a commission, which is entirely made up of non-technical and rather conservative people. How do I explain the concept of an API to such an audience?
Don't explain technical details like an API. State the business problem and your solution to the business problem - and how it would impact their bottom line.
For years, sales people have based pitches on two things: features and benefits. Each feature should have an associated benefit (to somebody, and preferably everybody). In this case, you're apparently planning to break what's basically a monolithic application into (at least) two pieces: a front end and a back end. The obvious benefits are that 1) each works independently, so development of each is easier, 2) different people can develop the different pieces, and 3) it's easier to increase capacity by simply buying more hardware.
Though you haven't said it explicitly, I'd guess one intent is to publicly document the API. This allows outside developers to take over (at least some) development of the front-end code (often for free, no less) while you retain control over the parts that are crucial to your business process. You can more easily [allow others to] add new front-end code to address new market segments while retaining security/certainty that the underlying business process won't be disturbed in the process.
HardCode's answer is correct in that you really should concentrate on the business issues and benefits.
However, if you really feel you need to explain something, you could use the medical receptionist analogy.
A medical practice has its own patient database and appointment scheduling system used by its admin and medical staff. This might be pretty complex internally.
However when you want to book an appointment as a patient you talk to the receptionist with a simple set of commands - 'I want an appointment', 'I want to see doctor X', 'I feel sick' and they interface to their systems based on your medical history, the symptoms presented and resource availability to give you an appointment - '4:30pm tomorrow' - in simple language.
So, roughly speaking using the receptionist is analogous to an exterior program using an API. It allows you to interact with a complex system to get the information you need without having to deal with the internal complexities.
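If anyone on the commission does want to see what "talking to the receptionist" looks like from a program's point of view, a single hypothetical request is usually enough (the URL and field names below are invented for illustration):

    # Hypothetical example of a program "talking to the receptionist":
    # the URL and field names are invented for illustration.
    import requests

    response = requests.post(
        "https://api.example-carshare.test/v1/bookings",
        json={"car_class": "compact", "pickup": "2024-06-01T09:00", "hours": 4},
    )
    print(response.json())   # e.g. {"booking_id": 1234, "status": "confirmed"}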
They'll be able to understand the benefit of having a mobile phone app that can interact with the booking system, and an API is a necessary component of that. The second benefit of the API being public is that you won't necessarily have to write that app, someone else will be able to (whether or not they actually do is another question, of course).
You should explain which use cases will be improved by your project proposal, and what benefits they can expect, like customer satisfaction.

Do formal methods of program verification have a place in industry?

I got a glimpse of Hoare logic in college. What we did was really simple. Most of what I did was proving the correctness of simple programs consisting of while loops, if statements, and sequences of instructions, but nothing more. These methods seem very useful!
Are formal methods used in industry widely?
Are these methods used to prove mission-critical software?
Well, Sir Tony Hoare joined Microsoft Research about 10 years ago, and one of the things he started was a formal verification of the Windows NT kernel. Indeed, this was one of the reasons for the long delay of Windows Vista: starting with Vista, large parts of the kernel are actually formally verified with respect to certain properties like absence of deadlocks, absence of information leaks, etc.
This is certainly not typical, but it is probably the single most important application of formal program verification, in terms of its impact (after all, almost every human being is in some way, shape or form affected by a computer running Windows).
This is a question close to my heart (I'm a researcher in Software Verification using formal logics), so you'll probably not be surprised when I say I think these techniques have a useful place, and are not yet used enough in the industry.
There are many levels of "formal methods", so I'll assume you mean those resting on a rigorous mathematical basis (as opposed to, say, following some 6-Sigma style process). Some types of formal methods have had great success - type systems being one example. Static analysis tools based on data flow analysis are also popular, model checking is almost ubiquitous in hardware design, and computational models like Pi-Calculus and CCS seem to be inspiring some real change in practical language design for concurrency. Termination analysis is one that's had a lot of press recently - the SDV project at Microsoft and work by Byron Cook are recent examples of research/practice crossover in formal methods.
Hoare reasoning has not, so far, made great inroads in industry - this is for more reasons than I can list, but I suspect it is mostly down to the complexity of writing and then proving specifications for real programs (they tend to get big, and fail to express properties of many real-world environments). Various sub-fields in this type of reasoning are now making big inroads into these problems - Separation Logic being one.
This is partially the nature of ongoing (hard) research. But I must confess that we, as theorists, have entirely failed to educate the industry on why our techniques are useful, to keep them relevant to industry needs, and to make them approachable to software developers. At some level, that's not our problem - we're researchers, often mathematicians, and practical usage is not foremost in our minds. Also, the techniques being developed are often too embryonic for use in large scale systems - we work on small programs, on simplified systems, get the math working, and move on. I don't much buy these excuses though - we should be more active in pushing our ideas, and getting a feedback loop between the industry and our work (one of the main reasons I went back to research).
It's probably a good idea for me to resurrect my weblog, and make some more posts on this stuff...
I cannot comment much on mission-critical software, although I know that the avionics industry uses a wide variety of techniques to validate software, including Hoare-style methods.
Formal methods have suffered because early advocates like Edsger Dijkstra insisted that they ought to be used everywhere. Neither the formalisms nor the software support were up to the job. More sensible advocates believe that these methods should be used on problems that are hard. They are not widely used in industry, but adoption is increasing. Probably the greatest inroads have been in the use of formal methods to check safety properties of software. Some of my favorite examples are the SPIN model checker and George Necula's proof-carrying code.
Moving away from practice and into research, Microsoft's Singularity operating-system project is about using formal methods to provide safety guarantees that ordinarily require hardware support. This in turn leads to faster performance and stronger guarantees. For example, in Singularity they have proved that if a third-party device driver is allowed into the system (which means basic verification conditions have been proved), then it cannot possibly bring down the whole OS; the worst it can do is hose its own device.
Formal methods are not yet widely used in industry, but they are more widely used than they were 20 years ago, and 20 years from now they will be more widely used still. So you are future-proofed :-)
Yes, they are used, but not widely in all areas. There are more methods than just Hoare logic; some are used more, some less, depending on their suitability for a given task. The common problem is that software is biiiiiiig and verifying that all of it is correct is still too hard a problem.
For example, the theorem prover ACL2 (software that aids humans in proving program correctness) has been used to prove that a certain floating-point processing unit does not have a certain type of bug. It was a big task, so this technique is not too common.
Model checking, another kind of formal verification, is used rather widely nowadays, for example Microsoft provides a type of model checker in the driver development kit and it can be used to verify the driver for a set of common bugs. Model checkers are also often used in verifying hardware circuits.
Rigorous testing can also be thought of as formal verification - there are formal specifications of which program paths should be tested, and so on.
"Are formal methods used in industry?"
Yes.
The assert statement in many programming languages is related to formal methods for verifying a program.
"Are formal methods used in industry widely ?"
No.
"Are these methods used to prove mission-critical software ?"
Sometimes. More often, they're used to prove that the software is secure. More formally, they're used to prove certain security-related assertions about the software.
There are two different approaches to formal methods in the industry.
One approach is to change the development process completely. The Z notation and the B method that were mentioned are in this first category. B was applied to the development of the driverless subway line 14 in Paris (if you get a chance, climb in the front wagon. It's not often that you get a chance to see the rails in front of you).
Another, more incremental, approach is to preserve the existing development and verification processes and to replace only one of the verification tasks at a time with a new method. This is very attractive, but it means developing static analysis tools for existing, widely used languages that are often not easy to analyse (because they were not designed to be).
If you go to (for instance) http://dblp.uni-trier.de/db/indices/a-tree/d/Delmas:David.html you will find instances of practical applications of formal methods to the verification of C programs (with the static analyzers Astrée, Caveat, Fluctuat, Frama-C) and of binary code (with tools from AbsInt GmbH).
By the way, since you mentioned Hoare Logic, in the above list of tools, only Caveat is based on Hoare logic (and Frama-C has a Hoare logic plug-in). The others rely on abstract interpretation, a different technique with a more automatic approach.
My area of expertise is the use of formal methods for static code analysis to show that software is free of run-time errors. This is implemented using a formal methods technique known as "abstract interpretation". The technique essentially enables you to prove certain attributes of a software program, e.g. prove that a+b will not overflow or that x/(x-y) will not result in a divide by zero. An example static analysis tool that uses this technique is Polyspace.
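To give a flavour of the idea (and only the idea; Polyspace and its peers are vastly more sophisticated), here is a toy interval-domain sketch in Python showing how an analyzer can prove the divide-by-zero example above safe, or flag it as a potential error:

    # Toy sketch of the idea behind abstract interpretation: track an interval
    # for each value instead of concrete numbers, and flag any operation that
    # could fail for *some* value in the interval. Real tools are far more
    # sophisticated than this; it only illustrates the principle.
    class Interval:
        def __init__(self, lo, hi):
            self.lo, self.hi = lo, hi

        def __sub__(self, other):
            return Interval(self.lo - other.hi, self.hi - other.lo)

        def may_be_zero(self):
            return self.lo <= 0 <= self.hi

    # Suppose the analysis has established x in [1, 10] and y in [20, 30].
    x, y = Interval(1, 10), Interval(20, 30)
    print((x - y).may_be_zero())                 # False: x / (x - y) is provably safe

    # With a weaker assumption, y in [0, 30], a possible divide-by-zero is reported.
    print((x - Interval(0, 30)).may_be_zero())   # True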
With respect to your question: "Are formal methods used in industry widely?" and "Are these methods used to prove mission-critical software?"
The answer is yes. This opinion is based on my experience supporting the Polyspace tool for industries that rely on embedded software to control safety-critical systems, such as the electronic throttle in an automobile, the braking system for a train, a jet engine controller, a drug delivery infusion pump, etc. These industries do indeed use these types of formal methods tools.
I don't believe 100% of these industry segments are using these tools, but their use is increasing. My opinion is that the Aerospace and Automotive industries lead, with the Medical Device industry quickly ramping up use.
Polyspace is a (hideously expensive, but very good) commercial product based on program verification. It's fairly pragmatic, in that it scales up from 'enhanced unit testing that will probably find some bugs' to 'the next three years of your life will be spent showing these 10 files have zero defects'.
It is based more on negative verification ('this program won't corrupt your stack') than on positive verification ('this program will do precisely what these 50 pages of equations say it will').
To add to Jorg's answer, here's an interview with Tony Hoare. The tools Jorg's referring to, I think, are PREfast and PREfix. See here for more information.
Besides other, more procedural approaches, Hoare logic formed the basis of Design by Contract, introduced as an object-oriented technique by Bertrand Meyer in Eiffel (see Meyer's 1992 article, page 4). While Design by Contract is not the same as formal verification (for one thing, DbC doesn't prove anything until the software is executed), in my opinion it provides a more practical use.
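Eiffel has first-class contract syntax; in most other languages you can approximate the flavour with plain assertions. A rough, hypothetical sketch:

    # Rough flavour of Design by Contract using plain assertions (Eiffel has
    # first-class contract syntax; this Python sketch is only an approximation).
    def withdraw(balance: int, amount: int) -> int:
        # preconditions: the client must ask for a positive amount it can afford
        assert amount > 0, "precondition violated: amount must be positive"
        assert amount <= balance, "precondition violated: insufficient funds"

        new_balance = balance - amount

        # postconditions: the supplier guarantees the result it promised
        assert new_balance == balance - amount
        assert new_balance >= 0
        return new_balance

    print(withdraw(100, 30))   # 70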