Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
When all you have is a pair of bolt cutters and a bottle of vodka, everything looks like the lock on the door of Wolf Blitzer's boathouse. (Replace that with a hammer and a nail if you don't read xkcd)
I currently program in Clojure, Python, Java and PHP, so I am familiar with C and Lisp syntax as well as the significant-whitespace thing. I know imperative, functional, immutable, and OOP styles, plus a couple of type systems and other things. Now I want more!
What are languages that take a different approach and would be useful for either practical tool choosing or theoretical understanding?
I don't feel like learning another functional language (Haskell) or another imperative OOP language (Ruby), nor do I want to practice impractical fun languages like Brainfuck.
One very interesting thing I found myself is the family of homoiconic, stack-based languages like Factor.
Only when I feel I understand most concepts and have answers to all my questions will I start thinking about a toy language of my own that contains all my personal preferences.
Matters of practicality are highly subjective, so I will simply say that learning different language paradigms will only serve to make you a better programmer. What is more practical than that?
Functional, Haskell - I know you said that you didn't want to, but you should really really reconsider. You've gotten some functional exposure with Clojure and even Python, but you've not experienced it to its fullest without Haskell. If you're really against Haskell then good compromises are either ML or OCaml.
Declarative, Datalog - Many people would recommend Prolog in this slot, but I think Datalog is a cleaner example of a declarative language.
Array, J - I've only just discovered J, but I find it to be a stunning language. It will twist your mind into a pretzel. You will thank J for that.
Stack, Factor/Forth - Factor is very powerful and I plan to dig into it ASAP. Forth is the grand-daddy of the Stack languages, and as an added bonus it's simple to implement yourself. There is something to be said about learning through implementation.
Dataflow, Oz - I think the influence of Oz is on the upswing and will only continue to grow in the future.
Prototype-based, JavaScript / Io / Self - Self is the grand-daddy and highly influential on every prototype-based language. This is not the same as class-based OOP and shouldn't be treated as such. Many people come to a prototype language and create an ad-hoc class system, but if your goal is to expand your mind, then I think that is a mistake. Use the language to its full capacity. Read Organizing Programs without Classes for ideas.
Expert System, CLIPS - I always recommend this. If you know Prolog then you will likely have the upper-hand in getting up to speed, but it's a very different language.
Frink - Frink is a general purpose language, but it's famous for its system of unit conversions. I find this language to be very inspiring in its unrelenting drive to be the best at what it does. Plus... it's really fun!
Functional+Optional Types, Qi - You say you have experience with some type systems, but do you have experience with "skinnable" type systems? No one does... but they should. Qi is like Lisp in many ways, but its type system will blow your mind.
Actors+Fault-tolerance, Erlang - Erlang's process model gets a lot of the buzz, but its fault-tolerance and hot-code-swapping mechanisms are game-changing. You will not learn much about FP that you wouldn't learn with Clojure, but its FT features will make you wonder why more languages can't seem to get this right.
Enjoy!
What about Prolog (for unification/backtracking etc), Smalltalk (for "everything's a message"), Forth (reverse polish, threaded interpreters etc), Scheme (continuations)?
Not a language, but the Art of the Metaobject Protocol is mind-bending stuff
I second Haskell. Don't think "I know a Lisp, so I know functional programming". Ever heard of type classes? Algebraic data types? Monads? "Modern" (more or less - at least not 50 years old ;) ) functional languages, especially Haskell, have explored a plethora of very powerful and useful new concepts. Type classes add ad-hoc polymorphism, and type inference (yet another thing the languages you already know don't have) works like a charm. Algebraic data types are simply awesome, especially for modelling tree-like data structures, but work fine for enums or simple records, too. And monads... well, let's just say people use them to implement exceptions, I/O, parsers, list comprehensions and much more - in purely functional ways!
Also, the whole topic is deep enough to keep one busy for years ;)
I currently program Clojure, Python, Java and PHP [...] What are languages that take a different approach and would be useful for either practical tool choosing or theoretical understanding?
C
There's a lot of C code lying around---it's definitely practical. If you learn C++ too, there's a big lot of more code around (and the leap is short once you know C and Java).
It also gives you (or forces you to have) a great understanding of some theoretical issues; for instance, each running program lives in a 4 GB byte array (on a 32-bit system), in some sense. Pointers in C are really just indices into this array---they're just a different kind of integer. It's no different in Java, Python, or PHP, except hidden beneath a surface layer.
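To make the "pointers are just integers" point concrete, here is a minimal C sketch (the buffer and values are mine, purely illustrative):

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    char buffer[8] = "abcdefg";
    char *p = buffer;                    /* a pointer: effectively an index into memory */

    /* pointer arithmetic is just integer arithmetic on that index */
    printf("%c %c\n", *p, *(p + 3));     /* prints: a d */

    /* the index itself can be inspected as an integer */
    uintptr_t addr = (uintptr_t)p;
    printf("buffer lives at address %llu\n", (unsigned long long)addr);
    return 0;
}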
Also, you can write object-oriented code in C, you just have to be a bit manual about vtables and such. Simon Tatham's Portable Puzzle Collection is a great example of fairly accessible object-oriented C code; it's also fairly well designed and well worth a read for a beginner/intermediate C programmer. This is what happens in Haskell too---type classes are in some sense "just another vtable".
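As a sketch of the "manual vtables" idea (this shows the general technique, not Tatham's actual code; all names are invented):

#include <stdio.h>

struct shape;                                    /* forward declaration */

/* a hand-rolled vtable: a struct of function pointers shared by all
   instances of one "class" */
struct shape_vtable {
    double (*area)(const struct shape *self);
};

struct shape {
    const struct shape_vtable *vt;               /* each object points at its class's table */
    double width, height;
};

static double rect_area(const struct shape *s)     { return s->width * s->height; }
static double triangle_area(const struct shape *s) { return 0.5 * s->width * s->height; }

static const struct shape_vtable rect_vt     = { rect_area };
static const struct shape_vtable triangle_vt = { triangle_area };

int main(void)
{
    struct shape shapes[] = { { &rect_vt, 3, 4 }, { &triangle_vt, 3, 4 } };
    /* "virtual dispatch": this call site never knows which area() runs */
    for (int i = 0; i < 2; i++)
        printf("%.1f\n", shapes[i].vt->area(&shapes[i]));
    return 0;
}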
Another great thing about C: engaging in Q&A with skilled C programmers will get you a lot of answers that explain C in terms of lower-level constructs, which builds your closer-to-the-iron knowledge base.
I may be missing OP's point---I think I am, judging by the other answers---but I think it might be a useful answer to other people who have a similar question and read this thread.
From Peter Norvig's site:
"Learn at least a half dozen programming languages. Include one language that supports class abstractions (like Java or C++), one that supports functional abstraction (like Lisp or ML), one that supports syntactic abstraction (like Lisp), one that supports declarative specifications (like Prolog or C++ templates), one that supports coroutines (like Icon or Scheme), and one that supports parallelism (like Sisal). "
http://norvig.com/21-days.html
I'm amazed that after 6 months and hundreds of votes, no one has mentioned SQL...
In the types-as-theorems / advanced type systems category: Coq (I think Agda belongs in this category too).
Coq is a proof assistant embedded into a functional programming language.
You can write mathematical proofs and Coq helps to build a solution.
You can write functions and prove properties about them.
It has dependent types, that alone blew my mind. A simple example:
concatenate: forall (A:Set)(n m:nat), (array A m)->(array A n)->(array A (n+m))
is the signature of a function that concatenates two arrays of size n and m of elements of A and returns an array of size (n+m). It won't compile if the function doesn't return that!
It is based on the calculus of inductive constructions, and it has a solid theory behind it.
I'm not smart enough to understand it all, but I think it is worth a look, especially if you trend towards type theory.
EDIT: I need to mention: you write a function in Coq and then you can PROVE it is correct for any input. That is amazing!
One of the languages I am interested in, for having a very different point of view (including a new vocabulary to define the language elements and a radically different syntax), is J. Haskell would be the obvious choice for me, even though it is a functional language, because its type system and other unique features open your mind and make you rethink your previous knowledge of (functional) programming.
Just like fogus suggested in his list, I advise you to look at the language OzML/Mozart.
It supports many paradigms, and is mainly targeted at concurrency/multi-agent programming.
Concerning concurrency and distributed calculi, the equivalent of the lambda calculus (which is behind functional programming) is called the pi calculus.
I have only just begun to look at some implementations of the pi calculus, but they have already enlarged my conception of computing.
Pict
Nomadic Pict
FunLoft (this one is pretty recent, conceived at INRIA)
Dataflow programming, aka flow-based programming, is a good step ahead on the road. Some buzzwords: parallel processing, rapid prototyping, visual programming (not as bad as it sounds at first).
Wikipedia's articles are good:
In computer science, flow-based programming (FBP) is a programming paradigm that defines applications as networks of "black box" processes, which exchange data across predefined connections by message passing, where the connections are specified externally to the processes. These black box processes can be reconnected endlessly to form different applications without having to be changed internally. FBP is thus naturally component-oriented.
http://en.wikipedia.org/wiki/Flow-based_programming
http://en.wikipedia.org/wiki/Dataflow_programming
http://en.wikipedia.org/wiki/Actor_model
Read JPM's book: http://jpaulmorrison.com/fbp/
(We've written a simple implementation in C++ for home automation purposes, and we're very happy with it. Documentation is under construction.)
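To give a feel for the idea, here is a minimal single-threaded C sketch of FBP-style components (all names are invented for illustration; a real FBP runtime schedules processes concurrently):

#include <stdio.h>

/* a bounded connection between two black-box processes */
#define CAP 16
struct port { int data[CAP]; int head, tail; };

static int port_put(struct port *p, int v)
{
    if ((p->tail + 1) % CAP == p->head) return 0;   /* full */
    p->data[p->tail] = v;
    p->tail = (p->tail + 1) % CAP;
    return 1;
}

static int port_get(struct port *p, int *v)
{
    if (p->head == p->tail) return 0;               /* empty */
    *v = p->data[p->head];
    p->head = (p->head + 1) % CAP;
    return 1;
}

/* two components; each sees only its own input/output ports */
static void generator(struct port *out)
{
    for (int i = 1; i <= 5; i++) port_put(out, i);
}

static void doubler(struct port *in, struct port *out)
{
    int v;
    while (port_get(in, &v)) port_put(out, v * 2);
}

int main(void)
{
    struct port a = {0}, b = {0};
    generator(&a);          /* the wiring lives here, outside the components */
    doubler(&a, &b);
    int v;
    while (port_get(&b, &v)) printf("%d\n", v);     /* prints 2 4 6 8 10 */
    return 0;
}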
You've learned a lot of languages. Now is the time to focus on one language, and master it.
Perhaps you might want to try LabVIEW for its visual programming, although it's aimed at engineering purposes.
Nevertheless, you seem pretty interested in all that's out there, hence the suggestion.
Also, you could try the Android App Inventor for visually building stuff.
Bruce A. Tate, taking a page from The Pragmatic Programmer wrote a book on exactly that:
Seven Languages in Seven Weeks: A Pragmatic Guide to Learning Programming Languages
In the book, he covers Clojure, Haskell, Io, Prolog, Scala, Erlang, and Ruby.
Mercury: http://www.mercury.csse.unimelb.edu.au/
It's a typed Prolog, with uniqueness types and modes (e.g. specifying that the predicate append(X,Y,Z), meaning X appended to Y is Z, yields exactly one Z for a given X and Y, but can yield multiple X/Y pairs for a given Z). Also, there is no cut and there are no other extra-logical predicates.
If you will, it's to Prolog as Haskell is to Lisp.
Programming languages alone do not cover everything a programmer does.
New things are always interesting, but there is some very cool old stuff too.
The first database system I used was dBaseIII; I spent about a month writing small examples (dBase/FoxPro/Clipper is a table-based db with indexes). Then, at my first workplace, I met MUMPS, and I got a headache. I was young and fresh-brained, but it took me two weeks to understand the MUMPS database model. Then came a moment, like in the comics: after two weeks, a switch flipped and the bulb lit up in my mind. MUMPS is natural, low level, and very, very fast. (It's an unbalanced, unformalized btree without types.) Today's trends show the way back to it: NoSQL, key-value dbs, multidimensional dbs - so there are only a few steps left, and we reach MUMPS.
Here's a presentation about MUMPS's advantages: http://www.slideshare.net/george.james/mumps-the-internet-scale-database-presentation
A short doc on hierarchical db: http://www.cs.pitt.edu/~chang/156/14hier.html
An introduction to MUMPS globals (in MUMPS, local variables, or locals for short, are the in-memory variables, and global variables, or globals, are the "db variables"; setting a global variable goes to the disk immediately):
http://gradvs1.mgateway.com/download/extreme1.pdf (PDF)
Say you want to write a love poem...
Instead of using a hammer just because there's one already in your hand, learn the proper tools for the task: learn to speak French.
Once you've reached near-native speaking level, you're ready to start your poem.
While learning new languages at an academic level is an interesting hobby, IMHO you can't really learn to use one until you try to apply it to a real-world problem. So, rather than looking for a new language to learn, I'd first look for new things to build, and only then look for the right language to use for that one specific project. First pick the problem, then the tool, not the other way around.
For anyone who hasn't been around since the mid 80's, I'd suggest learning 8-bit BASIC. It's very low-level, very primitive and it's an interesting exercise to program around its holes.
Along the same lines, I'd pick an HP-41C series calculator (or emulator, although nothing beats real hardware). It's hard to wrap your brain around it, but well worth it. A TI-57 will do, but will be a completely different experience. If you manage to solve second-degree equations on a TI-55, you'll be considered a master (it had no conditionals and no branches except RST, which jumped the program back to step 0).
And last, I'd pick FORTH (it was mentioned before). It has a nice "build your own language" Lisp-ish quality, but is much more bare metal. It will teach you why Rails is interesting and when DSLs make sense, and you'll get a glimpse of what your non-RPN calculator is thinking while you type.
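To show what an RPN machine is doing under the hood, here is a tiny C sketch of an RPN evaluator (illustrative only: single-character operators, no error handling):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>

/* evaluate a space-separated RPN expression, e.g. "3 4 + 2 *" */
double rpn_eval(const char *expr)
{
    double stack[64];
    int top = 0;
    char buf[256];
    strncpy(buf, expr, sizeof buf - 1);
    buf[sizeof buf - 1] = '\0';

    for (char *tok = strtok(buf, " "); tok; tok = strtok(NULL, " ")) {
        if (isdigit((unsigned char)tok[0])) {
            stack[top++] = atof(tok);             /* operand: push it */
        } else {                                  /* operator: pop two, push result */
            double b = stack[--top], a = stack[--top];
            switch (tok[0]) {
            case '+': stack[top++] = a + b; break;
            case '-': stack[top++] = a - b; break;
            case '*': stack[top++] = a * b; break;
            case '/': stack[top++] = a / b; break;
            }
        }
    }
    return stack[0];
}

int main(void)
{
    printf("%f\n", rpn_eval("3 4 + 2 *"));        /* (3 + 4) * 2 = 14 */
    return 0;
}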
PostScript. It is a rather interesting language, as it's stack based, and it's quite practical once you want to put things on paper and either get it done or troubleshoot why it isn't getting done.
Erlang. The intrinsic parallelism gives it a rather unusual feel and you can again learn useful things from that. I'm not so sure about practicality, but it can be useful for some fast prototyping tasks and highly redundant systems.
Try programming GPUs - either CUDA or OpenCL. It's just C/C++ extensions, but the mental model of the architecture is again completely different from the classic approach, and it definitely gets practical once you need to get some real number crunching done.
Erlang, Forth and some embedded work with assembly language. Really; buy an Arduino kit or something similar, and create a polyphonic beep in assembly. You'll really learn something.
There's also anic:
https://code.google.com/p/anic/
From its site:
Faster than C, Safer than Java, Simpler than *sh
anic is the reference implementation compiler for the experimental, high-performance, implicitly parallel, deadlock-free general-purpose dataflow programming language ANI.
It doesn't seem to be under active development anymore, but it seems to have some interesting concepts (and that, after all, is what you seem to be after).
While not meeting your requirement of "different", I'd wager that Fantom is a language a professional programmer should look at. By their own admission, the authors of Fantom call it a boring language. It merely shores up the most common use cases of Java and C#, with some closure syntax borrowed from Ruby and similar newer languages.
And yet it manages to have its own bootstrapped compiler, to provide a platform with a drop-in install and no external dependencies, to get packages right - and to work on Java, C# and now the Web (via JS).
It may not widen your horizons in terms of new ways of programming, but it will certainly show you better ways of programming.
One thing that I see missing from the other answers: languages based on term-rewriting.
You could take a look at Pure - http://code.google.com/p/pure-lang/ .
Mathematica is also rewriting based, although it's not so easy to figure out what's going on, as it's rather closed.
APL, Forth and Assembly.
Have some fun. Pick up a Lego Mindstorm robot kit and CMU's RobotC and write some robotics code. Things happen when you write code that has to "get dirty" and interact with the real world that you cannot possibly learn in any other way. Yes, same language, but a very different perspective.
Closed. This question needs to be more focused. It is not currently accepting answers.
I am currently writing a dissertation about the implications or dangers that today's software development practices or teachings may have on the long-term effects of programming.
Just to make it clear: I am not attacking the use of abstractions in programming. Every programmer knows that abstractions are the basis for modularity.
What I want to investigate with this dissertation are the positive and negative effects abstractions can have in software development. As regards the positive, I am sure that I can find many sources that can confirm this. But what about the negative effects of abstractions? Do you have any stories to share that talk about when certain abstractions failed on you?
The main concern is that many programmers today program against abstractions without having the faintest idea of what the abstraction is doing under the covers. This may very well lead to bugs and bad design. So, in your opinion, how important is it that programmers actually know what is going on below the abstractions?
Taking a simple example from Joel's Back to Basics, C's strcat:
void strcat( char* dest, char* src )
{
    while (*dest) dest++;         /* scan to the end of dest */
    while (*dest++ = *src++);     /* copy src, including its null terminator */
}
The above function has the issue that, if you are doing repeated string concatenation, it always starts from the beginning of the dest pointer to find the null terminator. If you write the function as follows instead, it returns a pointer to the end of the concatenated string, which you can then pass as the dest parameter of the next concatenation call:
char* mystrcat( char* dest, char* src )
{
    while (*dest) dest++;         /* scan to the end of dest */
    while (*dest++ = *src++);     /* copy src, including its null terminator */
    return --dest;                /* point at the new null terminator */
}
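A short usage sketch (mine, not from Joel's article) shows why returning the end pointer helps: each call picks up where the last one stopped, instead of re-scanning the whole string every time:

#include <stdio.h>

char* mystrcat( char* dest, char* src )
{
    while (*dest) dest++;
    while (*dest++ = *src++);
    return --dest;
}

int main(void)
{
    char buf[64] = "";
    char *p = buf;
    /* each call resumes at the current end of the string */
    p = mystrcat(p, "Hello");
    p = mystrcat(p, ", ");
    p = mystrcat(p, "world");
    printf("%s\n", buf);          /* prints: Hello, world */
    return 0;
}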
Now this is obviously a very simple example as regards abstractions, but it is the same concept I shall be investigating.
Finally, what do you think about the issue that schools are preferring to teach Java instead of C and Lisp ?
Can you please give your opinions and views on this subject?
Thank you for your time and I appreciate every comment.
First of all, abstractions are inevitable because they help us to deal with the mind-blowing complexity of things.
Abstractions are also inevitable because individuals are more and more often required to take on more tasks, or even complete projects. To address the problem, one uses libraries which wrap lower-level concepts and expose more complex behavior.
Naturally, a developer has less and less time to know the internals of things. The latest concern I heard about on SO pages is people starting to learn JavaScript with the jQuery library, ignoring raw JavaScript entirely.
The issue is about the balance between:
Knowing the tiniest details of some technology and being a master of it, but at the same time being unable to work with anything else.
Superficial knowledge of a wide variety of technologies and tools, which nevertheless proves sufficient for common everyday tasks and allows an individual to perform in multiple areas, possibly covering all sides of some (moderately big) project.
Take your pick.
Some work requires the one, another position requires the other.
So, in your opinion, how important is it that programmers actually know what is going on below the abstractions?
It would be nice if people knew what is happening behind the scenes. This knowledge comes with time and practice, up to a certain degree, and depends on what kind of tasks you have. You certainly shouldn't blame people for not knowing everything. If you wish a person to be able to perform in a variety of fields, it is inevitable that he won't have time to cover each of them down to the last bit.
What is essential is knowledge of the basic building blocks: data structures, algorithms, complexity. That should provide a basis for everything else.
Knowing the tiniest details of some particular technology is good, but not essential. Anyway, you can't learn them all. There are too many, and they keep coming.
Finally, what do you think about the issue that schools are preferring to teach Java instead of C and Lisp ?
Schools shouldn't be teaching programming languages at all. They're there to teach the basics of theoretical and practical CS, social skills, communication, and team work; to cover a vast variety of topics and problems and provide a wide-angle view for their graduates. This will help them find their way. Whatever they need to know in detail, they'll learn on their own.
An example where abstraction has failed:
In this case, a piece of software was needed to communicate to many different third party data processors. The communication was done through various messaging protocols; the transport method/protocol is not important in this case. Just assume everyone communicated through messaging.
The idea was to abstract the features of each of these third parties into a single, unified message format. It seemed relatively straightforward because each of the third parties performed a similar service. The problem was that some third parties used different terms to explain similar features. It was also found that some third parties had additional features that other third parties did not have.
The designers of the abstraction did not see through the difference of third party terms nor did they think it was reasonable to limit the scope of the unified features to only support the common features of the third parties. Instead, a single, monolithic message schema was developed to support any and all features of the third parties considered at the time. In what was probably considered a future-proofing move, they added a means of also passing an infinite number of name/value pairs along with the monolithic message in case there were future data elements that the monolithic message could not handle.
Early on, it became clear that changing the monolithic message was going to be difficult because so many people were using it in mission-critical systems. The use of the name/value pairs increased. Each name that could be used was documented inside a large spreadsheet, and developers were required to consult the spreadsheet to avoid duplicating name/value functionality. The list got so large, however, that collisions in the purposes of name values became frequent.
The majority of the monolithic message's fields now have no purpose and are kept mainly for backwards compatibility. There are name values that can be used to replace fields in the monolithic message. The majority of the interfacing is now done through the name/value pairs. In cases where the client is intending to communicate with more than one third party, each client needs to reconcile the name values available for each third party. It would be almost simpler to interface directly to the third party themselves.
I believe this illustrates that, from the perspective of a consumer of the monolithic message, it is important that developers of the consuming code not need to know what is happening under the covers. If the designers had considered that consumers of the monolithic message should not have to understand the abstraction in great detail, the monolithic message and its associated name/value pairs might never have happened. Documenting the abstraction with assertions regarding input and expected output would make life so much simpler.
As for colleges not teaching C and Lisp... they are cheating the students. You get a better understanding of what is going on with the machine and the OS with C. You get a somewhat different perspective on processing data and approaching problems with Lisp. I have used some of the ideas I learned using Lisp in programs written in C, C++, .NET, and Java. Learning Java after knowing even just C is not very difficult. The OO part is really not programming-language specific, so perhaps using Java for that is acceptable.
An understanding of fundamentals of algorithms (e.g. time complexity) and some knowledge about the metal is essential to designing/writing smells-good code.
I would suggest, though, that just as important is education in modern abstractions and profiling. I feel that modern abstractions make me so much more productive than I would be without them that they are at least as important as good fundamentals, if not more so.
An important element that lacked in my education was the use of profilers. When used routinely and correctly, profilers can help mitigate problems with poor fundamentals.
Since you quote Joel Spolsky, I take it you're aware of his "Law of Leaky Abstractions"? I'll mention it for future readers. http://www.joelonsoftware.com/articles/LeakyAbstractions.html
Green & Blackwell's Ironies of Abstractions talks a bit about the effort of learning the abstraction. http://homepage.ntlworld.com/greenery/workStuff/Papers/index.html
The term "astronaut architecture" is a reaction to over-abstraction.
I know I certainly curse abstraction when I haven't touched Java or C# in a while and I want to write to a file, but have to instantiate a Stream...Writer...Adaptor....Handler....
Also, Patterns, as in Gang of Four. They seemed great when I first read about them in the mid-90s, but I can never remember factory, facade, interface, helper, worker, flyweight...
Closed. This question is opinion-based. It is not currently accepting answers.
I am trying to get a couple team-members on to the OOP mindset, who currently think in terms of procedural programming.
However, I am having a hard time putting into words "why" all this is good, and "why" they should want to benefit from it.
They use a different language than I do, and I am lacking the communication skills to explain this to them in a way that makes them "want" to learn the OOP way of doing things.
What are some good language independent books, articles, or arguments anyone can give or point to?
OOP is good for a multi-developer team because it easily allows abstraction, encapsulation, inheritance and polymorphism. These are the big buzz words of OOP and they are the big buzz words for good reasons.
Abstraction: Allows other members of your team to use code that you write without having to understand the implementation details. This reduces the amount of necessary communication. Think of The Mythical Man Month wherein it is detailed that communication is one of the highest costs facing a development team.
Encapsulation: Allows you to change your implementation details without impacting users of your code. As such, it reduces code maintenance costs.
Inheritance: Allows your team to reuse and extend your implementations with reduced costs.
Polymorphism: Allows your team to use different implementations of a given abstraction. If your team is writing code to read and parse data from a Stream, because of polymorphism it can now work with FileStreams, MemoryStreams and PigeonStreams seamlessly and with significantly reduced costs.
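As a rough illustration of that Stream example, here is how the idea might be sketched in C with function pointers (all names invented; real OO languages do this dispatch for you):

#include <stdio.h>
#include <string.h>

/* an abstract "Stream": one read() operation, bound per implementation */
struct stream {
    int (*read)(struct stream *self, char *buf, int n);
};

struct memory_stream {
    struct stream base;                  /* "inherits" the interface */
    const char *data;
    int pos, len;
};

static int memory_read(struct stream *self, char *buf, int n)
{
    struct memory_stream *ms = (struct memory_stream *)self;
    int left = ms->len - ms->pos;
    if (n > left) n = left;
    memcpy(buf, ms->data + ms->pos, n);
    ms->pos += n;
    return n;
}

/* code written once, against the abstraction */
static void dump(struct stream *s)
{
    char buf[8];
    int n;
    while ((n = s->read(s, buf, sizeof buf)) > 0)
        fwrite(buf, 1, n, stdout);
}

int main(void)
{
    struct memory_stream ms = { { memory_read }, "hello, streams\n", 0, 15 };
    dump(&ms.base);                      /* works unchanged for any stream type */
    return 0;
}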
OOP is not a holy grail. It is inappropriate for some teams because the costs of using it could be higher than the costs of not using it. For example, if you try to design for polymorphism but never have multiple implementations of a given abstraction then you have probably increased your costs.
Always give examples.
Take a bit of their code you think is bad. Re-write it to be better. Explain why it is better. Your colleagues will either agree or disagree.
Nobody uses (or should use) techniques because they're good techniques; they (should) use them because they produce good results. The advantages of very simple use of classes and objects are usually fairly easy to see, for instance when you have an array of objects with n properties instead of n arrays, one for each field you care about, and so on.
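A tiny C sketch of that "array of objects versus n parallel arrays" contrast (illustrative data only):

#include <stdio.h>

/* procedural style: n parallel arrays, one per field, kept in sync by hand */
const char *names[2] = { "ann", "bob" };
int         ages[2]  = { 34, 27 };

/* object style: one array of records keeps each entity's fields together */
struct person { const char *name; int age; };
struct person people[2] = { { "ann", 34 }, { "bob", 27 } };

int main(void)
{
    /* with parallel arrays, every new field means another array to maintain */
    printf("%s is %d\n", names[0], ages[0]);
    /* with records, the entity travels as one unit */
    printf("%s is %d\n", people[1].name, people[1].age);
    return 0;
}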
Comparing procedural to OOP, the biggest winner by far is encapsulation. OOP doesn't mean that you get encapsulation automatically, but the process of doing it is free compared with procedural code.
Abstraction helps manage the complexity of an application: only the information that's required is exposed.
There are many ways to go about this: OOP is not the only discipline to promote this strategy.
Of course, merely claiming to do OOP does not mean one builds an application without abundant "abstraction leaks", thereby defeating the strategy...
I have a somewhat strange thought. There probably exist some areas where OOP is unnecessary or even bad (very, very IMHO: JavaScript programming).
You and your team probably work in one of these areas. Otherwise you would have failed many years ago, because teams which use OOP and all its benefits (like various frameworks, UML and so on) would simply do their job more efficiently.
I mean that if you still work well without OOP then, maybe, just leave it.
The killer phrase: With OOP you can model the world "as it is" *cough*.
OOP didn't make sense to me until I was working on an application that connected to two different databases. I needed a function called getEmployeeName() for both databases. I decided to create two objects, one for each database, to encapsulate the functions that ran against each one (there were no functions that ran against both simultaneously). Not the epitome of OOP, but a good start for me.
Most of my code is still procedural, but I'm more aware of situations where objects would make sense in my code. I'm not of the mindset that everything needs to be one way or the other.
Re-use of existing code through hierarchies.
The killer argument is IMHO that it takes much less time to re-design your code. Here is a similar question explaining why.
Having the ability to pass an entire object around that has a bunch of methods/functions you can call using that object. For example, let's say you want to pass a message around: you only need to pass one object, and everyone who gets that object will have access to all its functions.
Also, you can declare some of an object's functions as public and some as private. There is also the concept of a friend function, where a class explicitly grants specific external functions or classes access to its private members.
Objects help keep functions near the data they use and encapsulates it all into one entity that can be easily passed around.
Closed. This question is opinion-based. It is not currently accepting answers.
Are there objective metrics for measuring code refactoring?
Would running findbugs, CRAP or checkstyle before and after a refactoring be a useful way of checking whether the code was actually improved rather than just changed?
I'm looking for metrics that can be determined and tested for, to help improve the code review process.
The number of failed unit tests must be less than or equal to zero :)
Depending on your specific goals, metrics like cyclomatic complexity can provide an indicator for success. In the end every metric can be subverted, since they cannot capture intelligence and/or common sense.
A healthy code review process might do wonders though.
Would running findbugs, CRAP or checkstyle before and after a refactoring be a useful way of checking if the code was actually improved rather than just changed?
Actually, as I have detailed in the question "What is the fascination with code metrics?", the trend of any metric over time (findbugs, CRAP, whatever) is where the true added value of metrics lies.
It (the evolution of metrics) allows you to prioritize the main fixing actions you really need to apply to your code (as opposed to blindly trying to respect every metric out there).
A tool like Sonar can be very useful in this domain (monitoring of metrics).
Sal adds in the comments:
The real issue is on checking what code changes add value rather than just adding change
For that, test coverage is very important, because only tests (unit tests, but also larger "functional tests") will give you a valid answer.
But refactoring should not be done without a clear objective anyway. Doing it only because it would be "more elegant" or even "easier to maintain" may not in itself be a good enough reason to change the code.
There should be other measures, like some bugs which will be fixed in the process, or some new functions which will be implemented much faster as a result of the "refactored" code.
In short, the added value of a refactoring is not solely measured with metrics, but should also be evaluated against objectives and/or milestones.
Code size. Anything that reduces it without breaking functionality is an improvement in my book (removing comments and shortening identifiers would not count, of course).
No matter what you do, just make sure this metric thing is not used for evaluating programmer performance, deciding promotions or anything like that.
I would stay away from metrics for measuring refactoring success (aside from #unit test failures == 0). Instead, I'd go with code reviews.
It doesn't take much work to find obvious targets for refactoring: "Haven't I seen that exact same code before?" For the rest, you should create guidelines around what not to do, and make sure your developers know about them. Then they'll be able to find places where another developer didn't follow the standards.
For higher-level refactorings, the more senior developers and architects will need to look at code in terms of where they see the code base moving. For instance, it may be perfectly reasonable for the code to have a static structure today; but if they know or suspect that a more dynamic structure will be required, they may suggest using a factory method instead of using new, or extracting an interface from a class because they know there will be another implementation in the next release.
None of these things would benefit from metrics.
Yes, several measures of code quality can tell you if a refactoring improves the quality of your code.
Duplication. In general, less duplication is better. However, duplication finders that I've used sometimes identify duplicated blocks that are merely structurally similar but have nothing to do with one another semantically and so should not be deduplicated. Be prepared to suppress or ignore those false positives.
Code coverage. This is by far my favorite metric in general, but it's only indirectly related to refactoring. You can and should raise low coverage by writing more tests, but that's not refactoring. However, you should monitor code coverage while refactoring (as with any other change to the code) to be sure it doesn't go down. Refactoring can improve code coverage by removing untested copies of duplicated code.
Size metrics such as lines of code, total and per class, method, function, etc. A Jeff Atwood post lists a few more. If a refactoring reduces lines of code while maintaining clarity, quality has increased. Unusually long classes, methods, etc. are likely to be good targets for refactoring. Be prepared to use judgement in deciding when a class, method, etc. really does need to be longer than usual to get its job done.
Complexity metrics such as cyclomatic complexity. Refactoring should try to decrease complexity and not increase it without a well-thought-out reason. Methods/functions with high complexity are good refactoring targets (see the before/after sketch following this list).
Robert C. Martin's package-design metrics: Abstractness, Instability and Distance from the abstractness-instability main sequence. He described them in his article on Stability in C++ Report and his book Agile Software Development, Principles, Patterns, and Practices. JDepend is one tool that measures them. Refactoring that improves package design should minimize D.
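As a small, invented before/after sketch of what reducing cyclomatic complexity can look like in C (not from any of the cited sources):

#include <stdio.h>

/* before: one function, nested branches (higher cyclomatic complexity) */
int shipping_cost(int weight, int express, int international)
{
    if (international) {
        if (express) return weight * 9;
        else         return weight * 5;
    } else {
        if (express) return weight * 4;
        else         return weight * 2;
    }
}

/* after: the decision is extracted into a small, single-purpose helper */
static int rate(int express, int international)
{
    if (international) return express ? 9 : 5;
    return express ? 4 : 2;
}

int shipping_cost_refactored(int weight, int express, int international)
{
    return weight * rate(express, international);
}

int main(void)
{
    /* both versions agree; the refactored one is easier to test and extend */
    printf("%d %d\n", shipping_cost(10, 1, 0), shipping_cost_refactored(10, 1, 0));
    return 0;
}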
I have used and continue to use all of these to monitor the quality of my software projects.
There are two outcomes you want from refactoring. You want the team to maintain a sustainable pace and you want zero defects in production.
Refactoring takes place on the code and the unit tests built during Test-Driven Development (TDD). Refactoring can be small and completed on a piece of code necessary to finish a story card. Or refactoring can be large and require a technical story card to address technical debt. The story card can be placed on the product backlog and prioritized with the business partner.
Furthermore, as you write unit tests doing TDD, you will continue to refactor the tests as the code is developed.
Remember, in agile, the management practices as defined in Scrum will provide you with collaboration and ensure you understand the needs of the business partner and that the code you have developed meets those needs. However, without proper engineering practices (as defined by Extreme Programming) your project will lose sustainable pace. Many agile projects that did not employ engineering practices were in need of rescue. On the other hand, teams that were disciplined and employed both agile management and engineering practices were able to sustain delivery indefinitely.
So, if your code is released with many defects or your team loses velocity, then refactoring and the other engineering practices (TDD, pairing, automated testing, simple evolutionary design, etc.) are not being properly employed.
I see the question from the smell point of view. Smells can be treated as indicators of quality problems, and hence the volume of identified smell instances can reveal the quality of the software code.
Smells can be classified based on their granularity and their potential impact. For instance, there could be implementation smells, design smells, and architectural smells. You need to identify smells at all granularity levels before and after to show the gain from a refactoring exercise. In fact, refactoring could be guided by identified smells.
Examples:
Implementation smells: Long method, Complex conditional, Missing default case, Complex method, Long statement, and Magic numbers.
Design smells: Multifaceted abstraction, Missing abstraction, Deficient encapsulation, Unexploited encapsulation, Hub-like modularization, Cyclically-dependent modularization, Wide hierarchy, and Broken hierarchy. More information about design smells can be found in this book.
Architecture smells: Missing layer, Cyclical dependency in packages, Violated layer, Ambiguous Interfaces, and Scattered Parasitic Functionality. Find more information about architecture smells here.
Closed. This question is opinion-based. It is not currently accepting answers.
We all know to keep it simple, right?
I've seen complexity being measured as the number of interactions between systems, and I guess that's a very good place to start. Aside from gut feel though, what other (preferably more objective) methods can be used to determine the level of complexity of a particular design or piece of software?
What are YOUR favorite rules or heuristics?
Here are mine:
1) How hard is it to explain to someone who understands the problem but hasn't thought about the solution? If I explain the problem to someone in the hall (who probably already understands the problem if they're in the hall) and can explain the solution, then it's not too complicated. If it takes over an hour, chances are good the solution's overengineered.
2) How deep in the nested functions do you have to go? If I have an object which requires a property held by an object held by another object, then chances are good that what I'm trying to do is too far removed from the object itself. Those situations become problematic when trying to make objects thread-safe, because there'd be many objects of varying depths from your current position to lock.
3) Are you trying to solve problems that have already been solved before? Not every problem is new (and some would argue that none really are). Is there an existing pattern or group of patterns you can use? If you can't, why not? It's all good to make your own new solutions, and I'm all for it, but sometimes people have already answered the problem. I'm not going to rewrite STL (though I tried, at one point), because the solution already exists and it's a good one.
Complexity can be estimated from coupling and from how cohesive all your objects are. If something has too much coupling or is not cohesive enough, then the design will start to become more complex.
When I attended the Complex Systems Modeling workshop at the New England Complex Systems Institute (http://necsi.org/), one of the measures that they used was the number of system states.
For example if you have two nodes, which interact, A and B, and each of these can be 0 or 1, your possible states are:
A B
0 0
1 0
0 1
1 1
Thus a system with only one interaction between binary components can actually result in 4 different states. In general, n independent binary components yield 2^n possible states. The point is that the complexity of the system does not necessarily increase linearly as the number of interactions increases.
Good measures can also be the number of files, the number of places where configuration is stored, or the order of compilation in some languages.
Examples:
- properties files, database configuration, XML files to hold related information
- tens of thousands of classes with interfaces, and database mappings
- an extremely long and complicated build file (build.xml, Makefile, others...)
If your app is built, you can measure it in terms of time (how long a particular task would take to execute) or computations (how much code is executed each time the task is run).
If you just have designs, then you can look at how many components of your design are needed to run a given task, or to run an average task. For example, if you use MVC as your design pattern, then you have at least 3 components touched for the majority of tasks, but depending on your implementation of the design, you may end up with dozens of components (a cache in addition to the 3 layers, for example).
Finally something LOC can actually help measure? :)
I think complexity is best seen as the number of things that need to interact.
A complex design would have n tiers whereas a simple design would have only two.
Complexity is needed to work around issues that simplicity cannot overcome, so it is not always going to be a problem.
There is a problem in defining complexity in general as complexity usually has a task associated with it.
Something may be complex to understand, but simple to look at (very terse code for example)
The number of interactions getting this web page from the server to your computer is very complex, but the abstraction of the http protocol is very simple.
So having a task in mind (e.g. maintenance) before selecting a measure may make it more useful (i.e. adding a config file and logging to an app increases its objective complexity, if only a little, but simplifies maintenance).
There are formal metrics. Read up on Cyclomatic Complexity, for example.
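For reference, cyclomatic complexity has a simple closed form: for a control-flow graph with E edges, N nodes and P connected components (P = 1 for a single function), the complexity is M = E - N + 2P. A straight-line function has M = 1, and each if, while or case branch adds one.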
Edit.
Also, look at Function Points. They give you a non-gut-feel quantitative measurement of system complexity.