Is solving the halting problem easier than people think? [duplicate] - language-agnostic

This question already has answers here:
What exactly is the halting problem?
(24 answers)
Is there a "good enough" solution for the halting problem?
(6 answers)
Closed 15 days ago.
Although the general case is undecidable, many people still solve problems that are equivalent to it well enough for day-to-day use.
In Cohen's PhD thesis on computer viruses, he showed how virus scanning is equivalent to the halting problem, yet we have an entire industry based around this challenge.
I have also seen Microsoft's Terminator project - http://research.microsoft.com/Terminator/
Which leads me to ask - is the halting problem overrated? Do we need to worry about the general case?
Will types become Turing complete over time? Dependent types do seem like a good development.
Or, to look at it the other way, will we begin to use non-Turing-complete languages to gain the benefits of static analysis?

Is solving the halting problem easier than people think?
I think it is exactly as difficult as people think.
Will types become Turing complete over time?
My dear, they already are!
Dependent types do seem like a good development?
Very much so.
I think there could be a growth in non-Turing complete-but-provable languages. For quite some time, SQL was in this category (it isn't any more), but this didn't really diminish its utility. There is certainly a place for such systems, I think.

First: The Halting Problem is not a "problem" in a practical sense, as in "a problem that needs to be solved." It is rather a statement about the nature of mathematics, analogous to Gödel's Incompleteness Theorem.
Second: The fact that building a perfect virus scanner is intractable (due to its being equivalent to the Halting Problem) is precisely the reason that there is "an entire industry built around this challenge." If an algorithm for perfect virus scanning could be designed, it would simply be a matter of someone doing it once, and then there's no need for an industry any more. Story over.
Third: Working in a Turing Complete language does not eliminate "the benefits of static analysis"-- it merely means that there are limits to the static analysis. That's ok-- there are limits to almost everything we do, anyway.
Finally: If the Halting Problem could be "solved" in any way, it would definitely be "easier than people think", as Turing demonstrated that it is unsolvable. The general case is the only relevant case, from a mathematical standpoint. Specific cases are matters of engineering.

There are plenty of programs for which the halting problem can be solved and plenty of those programs are useful.
If you had a compiler that would tell you "Halts", "Doesn't halt", or "Don't know", then it could tell you which part of the program caused the "Doesn't halt" or "Don't know" verdict. If you really wanted a program that definitely halted or didn't halt, then you'd fix those "don't know" units in much the same way we get rid of compiler warnings. I think we would all be surprised at how often trying to solve this generally impossible problem proves useful.
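To make the three-verdict idea concrete, here is a minimal sketch in Python; the toy loop representation, the pattern checks, and the function name are all invented for illustration, not how a real analyzer works (real tools operate on an AST or an intermediate representation):

# A toy "termination analyzer": it recognizes a couple of trivially decidable
# loop shapes and honestly answers "Don't know" about everything else.
HALTS, DOESNT_HALT, DONT_KNOW = "Halts", "Doesn't halt", "Don't know"

def analyze_loop(condition, body):
    cond = condition.strip().lower()
    if cond == "false":
        return HALTS                          # the body never runs at all
    if cond == "true" and not body.strip():
        return DOESNT_HALT                    # an empty infinite loop spins forever
    if cond == "i < n" and body.strip() == "i += 1":
        return HALTS                          # a plain counting loop
    return DONT_KNOW                          # the general case: give up honestly

print(analyze_loop("false", "do_something()"))                     # Halts
print(analyze_loop("true", ""))                                    # Doesn't halt
print(analyze_loop("x != 1", "x = 3*x + 1 if x % 2 else x // 2"))  # Don't know (Collatz)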

As a day-to-day programmer, I'd say it's worthwhile to continue as far down the path to solving halting-style problems, even if you only approach that limit and never reach it. As you pointed out, virus scanning proves valuable. Google search doesn't pretend to be the absolute answer to "find me the best X for Y," but it's also notably useful. If I unleash a novel virus (muwahaha), does that create a bigger solution set, or just cast light on an existing problem area? Regardless of the technical difference, some will pragmatically develop and charge for follow-up "detection and removal" services.
I look forward to real scientific answers for your other questions...

The Halting Problem is really only interesting if you look at it in the general case, since if the Halting Problem were decidable, a great many other undecidable problems would become decidable via reduction to it.
So, my opinion on this question is, no, it is not easy in the cases that matter. That said, in the real world, it may not be such a big deal.
See also: http://en.wikipedia.org/wiki/Halting_problem#Importance_and_consequences

Incidentally, I think that the Turing completeness of templates shows that halting is overrated. Most languages guarantee that their compilers will halt; not so C++. Does this diminish C++ as a language? I don't think so; it has many flaws, but compilers that don't always halt aren't one of them.

I don't know how hard people think it is, so I can't say if it is easier. However, you are right in your observation that undecidability of a problem (in general) does not mean that all instances of that problem are undecidable. For instance, I can easily tell you that a program like while false do something terminates (assuming the obvious semantics of the while and false).
Projects like the Terminator project you mentioned obviously exist (and probably even work in some cases), so it is clear that not all is hopeless. There is also a contest (I believe every year) for tools that try to prove termination for rewrite systems, which are basically a model of computation. But it is the case that termination in many cases is very hard to prove.
The easiest way to look at it is perhaps to see the undecidability as a maximum on the complexity of instantiations of a problem. Each instantiation is somewhere on the scale of trivial to this maximum and with a higher maximum you typically have that the instantiations are harder on average as well.
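Tools like the ones in that termination contest typically look for a ranking function: a measure over the program state that is bounded below and strictly decreases on every loop iteration. Here is a rough sketch of the idea in Python (the loop model, the sample-based check, and the names are all made up for illustration; real provers establish the decrease symbolically rather than by testing sample states):

# Termination via a ranking function: if some integer measure of the state is
# bounded below and strictly decreases on every iteration, the loop terminates.
def consistent_with_termination(step, guard, measure, samples):
    for state in samples:
        if not guard(state):
            continue                           # loop already exited for this state
        nxt = step(state)
        if measure(state) < 0 or measure(nxt) >= measure(state):
            return False                       # measure not bounded/decreasing: no proof
    return True

# Example loop: while x > 0: x = x - 2, with measure(state) = x
guard   = lambda s: s["x"] > 0
step    = lambda s: {"x": s["x"] - 2}
measure = lambda s: s["x"]
print(consistent_with_termination(step, guard, measure, [{"x": n} for n in range(50)]))  # True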

The fact that a problem is undecidable does not mean that it is not interesting: on the contrary! So yes, the fact that we do not have an effective and uniform procedure to address termination for all programs (as well as many other problems about software) does not mean that it is not worth looking for partial solutions. In a sense, this is why we need software engineering: because we cannot just delegate the task to computers.
The title of your question is, however, a bit misleading. I agree with DrPizza: the termination problem is exactly as difficult as people think.
Moreover, the fact that we do not necessarily have to worry about the general case does not imply that the termination problem is overrated: it is worth looking for partial solutions precisely because we know that the general solution is hard.
Finally, the issues about dependent types and subrecursive languages, although partially related, are really different questions, and I am not sure I see the point of mixing them all together.

001 int D(int (*x)())
002 {
003   int Halt_Status = H(x, x);
004   if (Halt_Status)
005     HERE: goto HERE;
006   return Halt_Status;
007 }
008
009 int main()
010 {
011   Output("Input_Halts = ", H(D,D));
012 }
H correctly predicts that D(D) will never stop running unless H aborts its simulation of its input.
(a) If simulating halt decider H correctly simulates its input D until H correctly determines that its simulated D could not possibly reach its own "return" statement in a finite number of simulated steps then:
(b) H can abort its simulation of D and correctly report that D specifies a non-halting sequence of configurations.
When it is understood that (b) is a necessary consequence of (a), and we can see that (a) has been met, then we understand that H(D,D) could correctly determine the halt status of its otherwise "impossible" input.
Simulating halt deciders applied to the halting theorem
The above is fully operational code in the x86utm operating system.
Because H correctly detects that D, correctly simulated by H, would continue to call H(D,D) without ever reaching its own "return" statement, H aborts its simulation of D and returns 0 to main() on line 011.
I finally have agreement on this key point:
H(D,D) does correctly compute the mapping from its input to its reject state on the basis that H correctly predicts that D correctly simulated by H would never halt.
I am the original author of this work and anything that you find on the internet about this was written by me.


Bridging the gap between hardware and software

So, I've done a bit of programming in my day. Java, C#, C++, and I've always had a fascination with computers in general. One thing that I would really like to learn, and, what I think would really help my programming skills, is how software tells the hardware what to do.
I'm aware that's quite the tall order: I know that's different per language; per OS. I'm not asking for an actual answer, as much as I'm asking for a starting point. Also, if this is actually a waste of time, like, if it wouldn't really help my programming and/or wouldn't be worth it because it's a massive amount of stuff to learn and it would take years for it to actually pay off, saying that would be helpful too.
I can't escape the feeling that I'm asking a stupid question.
What we commonly call hardware can be thought of as a (big) number of electrical devices that function according to some specific rules. By putting some electrons into the input(s), the output(s) will vary according to a fixed rule (similar devices behave the same). The best known device is the transistor. Transistors can be connected in such a way that they perform logical functions, the most used being NAND (not-and). Using NAND gates any kind of logic can be (and is) implemented. To sum it up, hardware does logic functions by moving electrons around.
Now comes the interesting question: what is software? People tend to think that because there is thought involved in writing software, it doesn't exist in the real world. That isn't true. The program is stored in RAM when you write it, effectively being a pattern of electrons. This pattern then undergoes some transformations (compiler, assembler); during those steps the pattern changes from something that is meaningful to humans into something that can be used as input to the logic functions described above.
On a tangent: an RS flip-flop is an interesting device. It uses two NAND gates to create a memory cell.
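If you want to play with the "any logic from NAND" idea in software, here is a tiny illustrative sketch in Python (purely my own toy, not from the answer above): it derives NOT, AND and OR from NAND and then crudely simulates the cross-coupled NAND latch just mentioned.

# All logic from NAND, plus a crude simulation of an SR latch (a memory cell).
def nand(a, b):
    return not (a and b)

def not_(a):    return nand(a, a)
def and_(a, b): return not_(nand(a, b))
def or_(a, b):  return nand(not_(a), not_(b))

def sr_latch(s_bar, r_bar, q_prev):
    # Two cross-coupled NAND gates with active-low set/reset inputs.
    q = q_prev
    for _ in range(4):                 # iterate until the feedback loop settles
        q_bar = nand(r_bar, q)
        q = nand(s_bar, q_bar)
    return q

print(and_(True, False), or_(True, False))   # False True
q = sr_latch(False, True, False)       # pull S low: the latch sets, q becomes True
q = sr_latch(True, True, q)            # both inputs high: the latch holds its state
print(q)                               # True - the two NAND gates are "remembering"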
Have you thought of hardware design? Either studying it by reading up, or by actually designing your own hardware. You could buy yourself a Raspberry Pi, or an Arduino, or something else if you don't want to get your hands too dirty. Use any of these options to get your hands on hardware, or even use something like VirtualBox and write your own operating system.
Some random thoughts to consider. And, no your question isn't a stupid one at all.

How to convince your fellow developer to write short methods?

Long methods are evil on several grounds:
They're hard to understand
They're hard to change
They're hard to reuse
They're hard to test
They have low cohesion
They may have high coupling
They tend to be overly complex
How to convince your fellow developer to write short methods? (weapons are forbidden =)
question from agiledeveloper
Ask them to write unit tests for the methods.
That depends on your definitions of "short" and "long".
When I hear someone say "write short methods", I immediately react badly because I've encountered too much spaghetti written by people who think the ideal method is two lines long: One line to do the tiniest possible unit of work followed by one line to call another method. (You say long methods are evil because "they're hard to understand"? Try walking into a project where every trivial action generates a call stack 50 methods deep and trying to figure out which of those 50 layers is the one you need to change...)
On the other hand, if, by "short", you mean "self-contained and limited to a single conceptual function", then I'm all for it. But remember that this can't be measured simply by lines of code.
And, as tydok pointed out, you catch more flies with honey than vinegar. Try telling them why your way is good instead of why their way is bad. If you can do this without making any overt comparisons or references to them or their practices (unless they specifically ask how your ideas would relate to something they're doing), it'll work even better.
You made a list of drawbacks. Try to make a list of what you'll gain by using short methods. Concrete examples. Then try to convince him again.
I read this quote from somewhere:
Write your code as if the person who has to maintain it is a violent psycho, who knows where you live.
In my experience the best way to convince a peer in these cases is by example. Just find opportunities to show them your code and discuss with them the benefits of short functions vs. long functions. Eventually they'll realize what's better spontaneously, without the need to make them feel "bad" programmers.
Code Reviews!
I suggest you try and get some code reviews going. This way you could have a little workshop on best practices and whatever formatting your company adheres to. This adds the context that short methods are a way to make code more readable and easier to understand, and also compliant with the SRP.
If you've tried to explain good design and people just aren't getting it, or are just refusing to get it, then stop trying. It's not worth the effort. All you'll get is a bad rep for yourself. Some people are just hopeless.
Basically what it comes down to is that some programmers just aren't cut out for development. They can understand code that's already written, but they can't create it on their own.
These folks should be steered toward a support role, but they shouldn't be allowed to work on anything new. Support is a good place to see lots of different code, so maybe after a few years they'll come to see the benefits of good design.
I do like the idea of Code Reviews that someone else suggested. These sloppy programmers should not only have their own code reviewed, they should sit in on reviews of good code as well. That will give them a chance to see what good code is. Possibly they've just never seen good code.
To expand upon rvanider's answer, performing the cyclomatic complexity analysis on the code did wonders to get attention to the large method issue; getting people to change was still in the works when I left (too much momentum towards big methods).
The tipping point was when we started linking the cyclomatic complexity to the bug database. A CC of over 20 that wasn't a factory was guaranteed to have several entries in the bug database and oftentimes those bugs had a "bloodline" (fix to Bug A caused Bug B; fix to Bug B caused Bug C; etc). We actually had three CC's over 100 (max of 275) and those methods accounted for 40% of the cases in our bug database -- "you know, maybe that 5000 line function isn't such a good idea..."
It was more evident in the project I led when I started there. The goal was to keep CC as low as possible (97% were under 10) and the end result was a product that I basically stopped supporting because the 20 bugs I had weren't worth fixing.
Bug-free software isn't going to happen because of short methods (and this may be an argument you'll have to address) but the bug fixes are very quick and are often free of side-effects when you are working with short, concise methods.
Though writing unit tests would probably cure them of long methods, your company probably doesn't use unit tests. Rhetoric only goes so far and rarely works on developers who are stuck in their ways; show them numbers about how those methods are creating more work and buggy software.
Finding the right blend between function length and simplicity can be complex. Try to apply a metric such as Cyclomatic Complexity to demonstrate the difficulty in maintaining the code in its present form. Nothing beats a non-personal measurement that is based on testing factors such as branch and decision counts.
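If you want to try such a measurement on your own codebase, a crude cyclomatic-complexity counter is only a few lines. The sketch below is in Python and is only an approximation (a real tool such as checkstyle or a commercial analyzer counts more constructs), but it is enough to flag the monsters:

import ast

# Rough cyclomatic complexity: 1 + the number of decision points in a function.
DECISION_NODES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)

def cyclomatic_complexity(func_node):
    score = 1
    for node in ast.walk(func_node):
        if isinstance(node, DECISION_NODES):
            score += 1
        elif isinstance(node, ast.BoolOp):       # each extra and/or adds a branch
            score += len(node.values) - 1
    return score

def report(source):
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            print(node.name, cyclomatic_complexity(node))

report(open(__file__).read())                    # e.g. run it on this very file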
Not sure where this great quote comes from, but:
"Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it"
Force him to read Code Complete by Steve McConnell. Say that every good developer has to read this.
Get him drunk? :-)
The serious point to this answer is the question, "why do I consistently write short functions, and hate myself when I don't?"
The reason is that I have difficulty understanding complex code, be that long functions, things that maintain and manipulate a lot of state, or that sort of thing. I noticed many years ago that there are a fair number of people out there that are significantly better at dealing with this sort of complexity than I am. Ironically enough, it's probably because of that that I tend to be a better programmer than many of them: my own limitations force me to confront and clean up that sort of code.
I'm sorry I can't really provide a real answer here, but perhaps this can provide some insight to help lead us to an answer.
Force them to read the book "Clean Code"; there are many others, but this one is new, good, and an easy read.
Asking them to write unit tests for the complex code is a good avenue to take. This person needs to see for himself the debt that complexity brings when performing maintenance or analysis.
The question I always ask my team is: "It's 11 pm and you have to read this code - can you? Do you understand under pressure? Can you, over the phone, no remote login, lead them to the section where they can fix an error?" If the answer is no, the follow up is "Can you isolate some of the complexity?"
If you get an argument in return, it's a lost cause. Throw something then.
I would give them 100 lines of code all under 1 method and then another 100 lines of code divided up between several methods and ask them to write down an explanation of what each does.
Time how long it takes to write both paragraphs and then show them the result.
...Make sure to pick code that would take two or three times as long to understand if it were all in one method, such as Main().
Nothing is better than learning by example.
Short and long are terms that can be interpreted differently. For one person, short is a two-line method, while someone else will think that methods with no more than 100 lines of code are pretty short.
I think it would be better to state that a single method should not do more than one thing at the same time, meaning it should only have one responsibility.
Maybe you could let your fellow developers read something about how to practice the SOLID principles.
I'd normally show them older projects which have well written methods. I would then step through these methods while explaining the reasons behind why we developed them that way.
Hopefully when looking at the bigger picture, they would understand the reasons behind this.
PS: this exercise could also be used as a mini knowledge transfer on older projects.
Show him how much easier it is to test short methods. Prove that writing short methods will make it easier and faster for him to write the tests for his methods (he is testing these methods, right?)
Bring it up when you are reviewing his code. "This method is rather long, complicated, and seems to be doing four distinct things. Extract method here, here, and here."
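If a concrete illustration helps during such a review, a contrast along these lines (invented example, sketched in Python) usually makes the point faster than any argument:

# Before: one method parsing, validating, pricing and formatting all at once.
def process_order(raw):
    fields = raw.split(";")
    if len(fields) != 3 or not fields[1].isdigit():
        raise ValueError("bad order line: " + raw)
    name, qty, unit_price = fields[0], int(fields[1]), float(fields[2])
    total = qty * unit_price
    if qty >= 10:
        total *= 0.9                       # bulk discount buried in the middle
    return "%s: %d x %.2f = %.2f" % (name, qty, unit_price, total)

# After: each step has a name, can be read in isolation, and can be tested alone.
def parse_order(raw):
    fields = raw.split(";")
    if len(fields) != 3 or not fields[1].isdigit():
        raise ValueError("bad order line: " + raw)
    return fields[0], int(fields[1]), float(fields[2])

def price_order(qty, unit_price, bulk_threshold=10, discount=0.9):
    total = qty * unit_price
    return total * discount if qty >= bulk_threshold else total

def format_order(name, qty, unit_price, total):
    return "%s: %d x %.2f = %.2f" % (name, qty, unit_price, total)

def process_order_short(raw):
    name, qty, unit_price = parse_order(raw)
    return format_order(name, qty, unit_price, price_order(qty, unit_price))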
Long methods usually mean that the object model is flawed, i.e. one class has too many responsibilities. Chances are that you don't want just more functions, each one shorter, in the same class, but those responsibilities properly assigned to different classes.
No use teaching a pig to sing. It wastes your time and annoys the pig.
Just outshine someone.
When it comes time to fix a bug in the 5000 line routine, then you'll have a ten-line routine and a 4990-line routine. Do this slowly, and nobody notices a sudden change except that things start working better and slowly the big ball of mud evaporates.
You might want to tell him that he might have a really good memory, but you don't. Some people are able to handle much longer methods than others. If you both have to be able to maintain the code, it can only be done if the methods are smaller.
Only do this if he doesn't have a superiority complex
[edit]
why is this collecting negative scores?
You could start refactoring every single method they wrote into multiple methods, even when they're currently working on them. Assign extra time to your schedule for "refactoring others' methods to make the code maintainable". Do it the way you think it should be done, and - here comes the educational part - when they complain, tell them you wouldn't have to refactor the methods if they had done it right the first time. This way, your boss learns that you have to correct others' laziness, and your co-workers learn that they should do it differently.
That's at least some theory.

Should code be short/concise? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 7 years ago.
When writing a mathematical proof, one goal is to continue compressing the proof. The proof gets more elegant but not necessarily more readable. Compression translates to better understanding, as you weed out unnecessary characters and verbosity.
I often hear developers say you should make your code foot print as small as possible. This can very quickly yield unreadable code. In mathematics, it isn't such an issue since the exercise is purely academic. However, in production code where time is money, having people try to figure out what some very concise code is doing doesn't seem to make much sense. For a little more verbose code, you get readability and savings.
At what point do you stop compressing software code?
I try to reach a level of verbosity where my program statements read like a sentence any programmer could understand. This does mean heavily refactoring my code such that it's all short pieces of a story, so each action would be described in a separate method (an even further level might be to another class).
Meaning I would not reduce my number of characters just because it can be expressed in fewer. That's what code-golf competitions are for.
My rule is say what you mean. One common way I see people go wrong is "strength reduction." Basically, they replace the concept they are thinking with something that seems to skip steps. Unfortunately, they are leaving concepts out of their code, making it harder to read.
For example, changing
for (int i = 0; i < n; i++)
    foo[i] = ...
to
int *p = foo, *q = foo + n;
while (p < q) *p++ = ...;
is an example of a strength reduction that seems to save steps, but it leaves out the fact that foo is an array, making it harder to read.
Another common one is using bool instead of an enum.
enum {
    MouseDown,
    MouseUp
};
Having this be
bool IsMouseDown;
leaves out the fact that this is a state machine, making the code harder to maintain.
So my rule of thumb would be, in your implementation, don't dig down to a lower level than the concepts you are trying to express.
You can make code smaller by seeing redundancy and eliminating it, or by being clever. Do the former and not the latter.
Here's a good article by Steve McConnell - Best Practices http://www.stevemcconnell.com/ieeesoftware/bp06.htm
I think short and concise are two results of well-written code. There are many aspects to making code good and many results from well-written code; realize the two are different. You don't plan for a small footprint, you plan for a function that is concise and does a single thing extremely well - this SHOULD lead to a small footprint (but may not). Here's a short list of what I would focus on when writing code:
single focused functions - a function should do only one thing, a simple delivery, multi featured functions are buggy and not easily reusable
loosely coupled - don't reach out from inside one function to global data and don't rely heavily on other functions
precise naming - use meaningful precise variable names, cryptic names are just that
keep the code simple and not complex - don't over use language specific technical wow's, good for impressing others, difficult to easily understand and maintain - if you do add something 'special' comment it so at least people can appreciate it prior to cursing you out
comment evenly - too many comments will be ignored and become outdated, too few have no meaning
formatting - take pride in how the code looks, properly indented code helps
work with the mind of a code-maintenance person - think what it would be like to maintain the code you're writing
don't be afraid or too lazy to refactor - nothing is perfect the first time, clean up your own mess
One way to find a balance is to aim for readability rather than conciseness. Programmers are constantly scanning code visually to see what is being done, so the code should flow as nicely as possible.
If the programmer is scanning code and hits a section that is hard to understand, or takes some effort to visually parse and understand, it is a bad thing. Using common well understood constructs is important, stay away from the vague and infrequently used unless necessary.
Humans are not compilers. Compilers can eat the stuff and keep moving on. Obscure code is not mentally consumed by humans as quickly as clearly understood code.
At times it is very hard to produce readable code in a complicated algorithm, but for the most part, human readability is what we should look for, and not cleverness. I don't think length of code is really a measure of clearness either, because sometimes a more verbose method is more readable than a concise method, and sometimes a concise method is more readable than a long one.
Also, comments should only supplement, and should not describe your code, your code should describe itself. If you have to comment a line because it isn't obvious what is done, that is bad. It takes longer for most experienced programmers to read an English explanation than it does to read the code itself. I think the book Code Complete hammers this one home.
As far as object names go, the thinking on this has gone through an evolution with the introduction of new programming languages.
If you take the "curly brace" languages, starting with C, brevity was considered the soul of wit. So, you would have a variable to hold a loan value named "lv", for instance. The idea was that you were typing a lot of code, so keep the keystrokes to a minimum.
Then along came the Microsoft-sanctioned "Hungarian notation", where the first letters of a variable name were meant to indicate its underlying type. One might use "fLV", or some such, to indicate that the loan value was represented by a float variable.
With Java, and then C#, the paradigm has become one of clarity. A good name for a loan value variable would be "loanValue". I believe part of the reason for this is the command-completion feature in most modern editors. Since it's not necessary to type an entire name anymore, you might as well use as many characters as are needed to be descriptive.
This is a good trend. Code needs to be intelligible. Comments are often added as an afterthought, if at all. They are also not updated as code is updated, so they become out of date. Descriptive, well-chosen, variable names are the first, best and easiest way to let others know what you were coding about.
I had a computer science professor who said "As engineers, we are constantly creating types of things that never existed before. The names that we give them will stick, so we should be careful to name things meaningfully."
There needs to be a balance between short, sweet source code and performance. If it is nice source and runs the fastest, good; but if, for the sake of nice source, it runs like a dog, bad.
Strive to refactor until the code itself reads well. You'll discover your own mistakes in the process, the code will be easier to grok for the "next guy", and you won't be burdened by maintaining (and later forgetting to change) in comments what you've already expressed in code.
When that fails... sure, leave me a comment.
And don't tell me "what" in the comment (that's what the code is for), tell me "why".
As opposed to long/rambling? Sure!
But if it gets to the point where it's so short and so concise that it's hard to understand, then you've gone too far.
Yes. Always.
DRY: Don't Repeat Yourself. That will give you code that is both concise and secure. Writing the same code several times is a good way to make it hard to maintain.
Now, that does not mean you should make a function out of any blocks of code that look remotely alike.
A very common error (horror?), for instance, is factorizing code that does nearly the same thing and handling the differences between occurrences by adding a flag to the function's API. This may look innocuous at first, but it generates code flow that is hard to understand and bug-prone, and even harder to refactor - see the sketch below.
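A tiny made-up illustration of that flag pitfall, sketched in Python (the function names and formats are invented):

import csv, io, json

# The tempting "factorization": one function, one boolean flag, two tangled paths.
def export(data, as_json):
    if as_json:
        return json.dumps(data)
    return "\n".join("%s,%s" % (k, v) for k, v in data.items())

# Usually clearer: two small functions that share only what is genuinely common.
def export_json(data):
    return json.dumps(data)

def export_csv(data):
    buf = io.StringIO()
    csv.writer(buf).writerows(data.items())
    return buf.getvalue()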
If you follow common refactoring rules (looking out for code smells), your code will become more and more concise as a side effect, as many code smells are about detecting redundancy.
On the other hand, if you try to make the code as short as possible without following any meaningful guidelines, at some point you will have to stop because you just won't see how to reduce the code any further.
Just imagine if the first step is removing all useless whitespace... after that step, code in most programming languages will become so hard to read you won't have much chance of finding any other possible enhancement.
The example above is quite a caricature, but not so far from what you get when trying to optimise for size without following any sensible guideline.
There's no exact line that can be drawn to distinguish between code that is glib and code that is flowery. Use your best judgment. Have others look at your code and see how easily they can understand it. But remember, correctness is the number 1 goal.
The need for small code footprints is a throwback to the days of assembly language and the first slightly higher-level languages... there, small code footprints were a real and pressing need. These days, though, it's not so much of a necessity.
That said, I hate verbose code. Where I work, we write code that reads as much as possible like a natural language, without any extra grammar or words. And we don't abbreviate anything unless it's a very common abbreviation.
Company.get_by_name("ABC")
makeHeaderTable()
is about as terse as we go.
In general, I make things obvious and easy to work with. If concision/shortness serves me in that end, all the better. Often short answers are the clearest, so shortness is a byproduct of obvious.
There are a couple of points, to my mind, that determine when to stop optimizing:
Is it worth spending the time on optimization? If you have people spending weeks and not finding anything, are there better uses of those resources?
What is the order of optimization priority? There are a few different factors one could care about when it comes to code: execution time, execution space (both at runtime and for the compiled code itself), scalability, stability, how many features are implemented, etc. Part of this is the trade-off of time and space, but it can also be about where some code goes, e.g. can middleware execute ad hoc SQL commands or should those be routed through stored procedures to improve performance?
I think the main point is that most good solutions involve this kind of moderation.
Code optimization has little to do with coding style. The fact that the file contains x fewer spaces or newlines than it did at the beginning does not make it better or faster, at least at the execution stage - you format the code with whitespace characters that are usually ignored by the compiler. It can even make the code worse, because it becomes unreadable for other programmers and for yourself.
It is much more important for the code to be short and clean in its logical structure, such as testing conditions, control flow, assumptions, error handling or the overall programming interface. Of course, I would also include here smart and useful comments + the documentation.
There is not necessarily a correlation between concise code and performance. This is a myth. In mature languages like C/C++ the compilers are capable of optimizing the code very effectively, so there is no cause in such languages to assume that more concise code is better-performing code. Newer, less performance-optimized languages like Ruby lack the compiler optimization features of C/C++ compilers, but there is still little reason to believe that concise code performs better. The reality is that we never know how well code will perform in production until it gets into production and is profiled. Simple, innocuous functions can be huge performance bottlenecks if called from enough locations within the code. In highly concurrent systems the biggest bottlenecks are generally caused by poor concurrency algorithms or excessive locking. These issues are rarely solved by writing "concise" code.
The bottom line is this: Code that performs poorly can always be refactored once profiling determines it is the bottleneck. Code can only be effectively refactored if it is easy to understand. Code that is written to be "concise" or "clever" is often more difficult to refactor and maintain.
Write your code for human readability then refactor for performance when necessary.
My two cents...
Code should be short, concrete, and concentrated. You can always explain your ideas with many words in the comments.
You can make your code as short or compact as you like as long as you comment it. This way your code can be optimized but still make sense. I tend to stay somewhere in the middle, with descriptive variables and methods, and sparse comments where things are still unclear.

What's your most controversial programming opinion?

Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
This is definitely subjective, but I'd like to try to avoid it becoming argumentative. I think it could be an interesting question if people treat it appropriately.
The idea for this question came from the comment thread from my answer to the "What are five things you hate about your favorite language?" question. I contended that classes in C# should be sealed by default - I won't put my reasoning in the question, but I might write a fuller explanation as an answer to this question. I was surprised at the heat of the discussion in the comments (25 comments currently).
So, what contentious opinions do you hold? I'd rather avoid the kind of thing which ends up being pretty religious with relatively little basis (e.g. brace placing) but examples might include things like "unit testing isn't actually terribly helpful" or "public fields are okay really". The important thing (to me, anyway) is that you've got reasons behind your opinions.
Please present your opinion and reasoning - I would encourage people to vote for opinions which are well-argued and interesting, whether or not you happen to agree with them.
Programmers who don't code in their spare time for fun will never become as good as those that do.
I think even the smartest and most talented people will never become truly good programmers unless they treat it as more than a job. Meaning that they do little projects on the side, or just mess with lots of different languages and ideas in their spare time.
(Note: I'm not saying good programmers do nothing else than programming, but they do more than program from 9 to 5)
The only "best practice" you should be using all the time is "Use Your Brain".
Too many people jumping on too many bandwagons and trying to force methods, patterns, frameworks etc onto things that don't warrant them. Just because something is new, or because someone respected has an opinion, doesn't mean it fits all :)
EDIT:
Just to clarify - I don't think people should ignore best practices, valued opinions etc. Just that people shouldn't just blindly jump on something without thinking about WHY this "thing" is so great, IS it applicable to what I'm doing, and WHAT benefits/drawbacks does it bring?
"Googling it" is okay!
Yes, I know it offends some people out there that their years of intense memorization and/or glorious stacks of programming books are starting to fall by the wayside to a resource that anyone can access within seconds, but you shouldn't hold that against people that use it.
Too often I hear googling answers to problems being the target of criticism, and it really makes no sense. First of all, it must be conceded that everyone needs materials to reference. You don't know everything and you will need to look things up. Conceding that, does it really matter where you got the information? Does it matter if you looked it up in a book, looked it up on Google, or heard it from a talking frog that you hallucinated? No. A right answer is a right answer.
What is important is that you understand the material, use it as the means to an end of a successful programming solution, and the client/your employer is happy with the results.
(although if you are getting answers from hallucinatory talking frogs, you should probably get some help all the same)
Most comments in code are in fact a pernicious form of code duplication.
We spend most of our time maintaining code written by others (or ourselves) and poor, incorrect, outdated, misleading comments must be near the top of the list of most annoying artifacts in code.
I think eventually many people just blank them out, especially those flowerbox monstrosities.
Much better to concentrate on making the code readable, refactoring as necessary, and minimising idioms and quirkiness.
On the other hand, many courses teach that comments are very nearly more important than the code itself, leading to the "this next line adds one to invoiceTotal" style of commenting.
XML is highly overrated
I think too many jump onto the XML bandwagon before using their brains...
XML for web stuff is great, as it's designed for it. Otherwise I think some problem definition and design thought should precede any decision to use it.
My 5 cents
Not all programmers are created equal
Quite often managers think that DeveloperA == DeveloperB simply because they have same level of experience and so on. In actual fact, the performance of one developer can be 10x or even 100x that of another.
It's politically risky to talk about it, but sometimes I feel like pointing out that, even though several team members may appear to be of equal skill, it's not always the case. I have even seen cases where lead developers were 'beyond hope' and junior devs did all the actual work - I made sure they got the credit, though. :)
I fail to understand why people think that Java is absolutely the best "first" programming language to be taught in universities.
For one, I believe that a first programming language should be one that highlights the need to learn control flow and variables, not objects and syntax.
For another, I believe that people who have not had experience in debugging memory leaks in C / C++ cannot fully appreciate what Java brings to the table.
Also the natural progression should be from "how can I do this" to "how can I find the library which does that" and not the other way round.
If you only know one language, no matter how well you know it, you're not a great programmer.
There seems to be an attitude that says once you're really good at C# or Java or whatever other language you started out learning then that's all you need. I don't believe it- every language I have ever learned has taught me something new about programming that I have been able to bring back into my work with all the others. I think that anyone who restricts themselves to one language will never be as good as they could be.
It also indicates to me a certain lack of inquisitiveness and willingness to experiment that doesn't necessarily tally with the qualities I would expect to find in a really good programmer.
Performance does matter.
Print statements are a valid way to debug code
I believe it is perfectly fine to debug your code by littering it with System.out.println (or whatever print statement works for your language). Often, this can be quicker than stepping through a debugger, and you can compare printed outputs against other runs of the app.
Just make sure to remove the print statements when you go to production (or better, turn them into logging statements)
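In Python, for example, turning a throwaway print into a logging call is a one-line change per statement and lets you switch the noise off without deleting anything. A rough sketch (the function and names are invented):

import logging

logging.basicConfig(level=logging.DEBUG)     # flip to logging.WARNING for production
log = logging.getLogger(__name__)

def compute_total(items):
    total = sum(items)
    # print("total =", total)                # the quick-and-dirty version
    log.debug("total = %s for %d items", total, len(items))   # the keepable version
    return total

compute_total([1, 2, 3])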
Your job is to put yourself out of work.
When you're writing software for your employer, any software that you create is to be written in such a way that it can be picked up by any developer and understood with a minimal amount of effort. It is well designed, clearly and consistently written, formatted cleanly, documented where it needs to be, builds daily as expected, checked into the repository, and appropriately versioned.
If you get hit by a bus, laid off, fired, or walk off the job, your employer should be able to replace you on a moment's notice, and the next guy could step into your role, pick up your code and be up and running within a week tops. If he or she can't do that, then you've failed miserably.
Interestingly, I've found that having that goal has made me more valuable to my employers. The more I strive to be disposable, the more valuable I become to them.
1) The Business Apps farce:
I think that the whole "Enterprise" frameworks thing is smoke and mirrors. J2EE, .NET, the majority of the Apache frameworks and most abstractions to manage such things create far more complexity than they solve.
Take any regular Java or .NET ORM, or any supposedly modern MVC framework for either which does "magic" to solve tedious, simple tasks. You end up writing huge amounts of ugly XML boilerplate that is difficult to validate and write quickly. You have massive APIs where half of those are just to integrate the work of the other APIs, interfaces that are impossible to recycle, and abstract classes that are needed only to overcome the inflexibility of Java and C#. We simply don't need most of that.
How about all the different application servers with their own darned descriptor syntax, the overly complex database and groupware products?
The point of this is not that complexity==bad, it's that unnecessary complexity==bad. I've worked in massive enterprise installations where some of it was necessary, but even in most cases a few home-grown scripts and a simple web frontend is all that's needed to solve most use cases.
I'd try to replace all of these enterprisey apps with simple web frameworks, open source DBs, and trivial programming constructs.
2) The n-years-of-experience-required:
Unless you need a consultant or a technician to handle a specific issue related to an application, API or framework, then you don't really need someone with 5 years of experience in that application. What you need is a developer/admin who can read documentation, who has domain knowledge in whatever it is you're doing, and who can learn quickly. If you need to develop in some kind of language, a decent developer will pick it up in less than 2 months. If you need an administrator for X web server, in two days he should have read the man pages and newsgroups and be up to speed. Anything less and that person is not worth what he is paid.
3) The common "computer science" degree curriculum:
The majority of computer science and software engineering degrees are bull. If your first programming language is Java or C#, then you're doing something wrong. If you don't get several courses full of algebra and math, it's wrong. If you don't delve into functional programming, it's incomplete. If you can't apply loop invariants to a trivial for loop, you're not worth your salt as a supposed computer scientist. If you come out with experience in x and y languages and object orientation, it's full of s***. A real computer scientist sees a language in terms of the concepts and syntaxes it uses, and sees programming methodologies as one among many, and has such a good understanding of the underlying philosophies of both that picking new languages, design methods, or specification languages should be trivial.
Getters and Setters are Highly Overused
I've seen millions of people claiming that public fields are evil, so they make them private and provide getters and setters for all of them. I believe this is almost identical to making the fields public, maybe a bit different if you're using threads (but that generally is not the case) or if your accessors have business/presentation logic (something 'strange', at least).
I'm not in favor of public fields, but I am against making a getter/setter (or Property) for every one of them, and then claiming that doing so is encapsulation or information hiding... ha!
UPDATE:
This answer has raised some controversy in its comments, so I'll try to clarify it a bit (I'll leave the original untouched since that is what many people upvoted).
First of all: anyone who uses public fields deserves jail time
Now, creating private fields and then using the IDE to automatically generate getters and setters for every one of them is nearly as bad as using public fields.
Many people think:
private fields + public accessors == encapsulation
I say (automatic or not) generation of getter/setter pair for your fields effectively goes against the so called encapsulation you are trying to achieve.
Lastly, let me quote Uncle Bob in this topic (taken from chapter 6 of "Clean Code"):
There is a reason that we keep our variables private. We don't want anyone else to depend on them. We want the freedom to change their type or implementation on a whim or an impulse. Why, then, do so many programmers automatically add getters and setters to their objects, exposing their private fields as if they were public?
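A small made-up contrast of what this answer is getting at, sketched in Python (where a property plays the accessor role; the class and method names are invented):

# Accessor pairs that merely mirror the field: the field is effectively public.
class Account:
    def __init__(self):
        self._balance = 0
    def get_balance(self):        return self._balance
    def set_balance(self, value): self._balance = value    # anyone can do anything

# Behaviour instead of accessors: the invariant lives inside the object.
class BetterAccount:
    def __init__(self):
        self._balance = 0
    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self._balance += amount
    def withdraw(self, amount):
        if amount > self._balance:
            raise ValueError("insufficient funds")
        self._balance -= amount
    @property
    def balance(self):                                      # read-only view, no setter
        return self._balance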
UML diagrams are highly overrated
Of course there are useful diagrams e.g. class diagram for the Composite Pattern, but many UML diagrams have absolutely no value.
Opinion: SQL is code. Treat it as such
That is, just like your C#, Java, or other favorite object/procedure language, develop a formatting style that is readable and maintainable.
I hate it when I see sloppy, free-formatted SQL code. If you scream when you see both styles of curly braces on a page, why oh why don't you scream when you see free-formatted SQL, or SQL that obscures or obfuscates the JOIN condition?
Readability is the most important aspect of your code.
Even more so than correctness. If it's readable, it's easy to fix. It's also easy to optimize, easy to change, easy to understand. And hopefully other developers can learn something from it too.
If you're a developer, you should be able to write code
I did quite a bit of interviewing last year, and for my part of the interview I was supposed to test the way people thought, and how they implemented simple-to-moderate algorithms on a white board. I'd initially started out with questions like:
Given that Pi can be estimated using the function 4 * (1 - 1/3 + 1/5 - 1/7 + ...) with more terms giving greater accuracy, write a function that calculates Pi to an accuracy of 5 decimal places.
It's a problem that should make you think, but shouldn't be out of reach to a seasoned developer (it can be answered in about 10 lines of C#). However, many of our (supposedly pre-screened by the agency) candidates couldn't even begin to answer it, or even explain how they might go about answering it. So after a while I started asking simpler questions like:
Given the area of a circle is given by Pi times the radius squared, write a function to calculate the area of a circle.
Amazingly, more than half the candidates couldn't write this function in any language (I can read most popular languages so I let them use any language of their choice, including pseudo-code). We had "C# developers" who could not write this function in C#.
I was surprised by this. I had always thought that developers should be able to write code. It seems that, nowadays, this is a controversial opinion. Certainly it is amongst interview candidates!
Edit:
There's a lot of discussion in the comments about whether the first question is a good or bad one, and whether you should ask questions as complex as this in an interview. I'm not going to delve into this here (that's a whole new question) apart from to say you're largely missing the point of the post.
Yes, I said people couldn't make any headway with this, but the second question is trivial and many people couldn't make any headway with that one either! Anybody who calls themselves a developer should be able to write the answer to the second one in a few seconds without even thinking. And many can't.
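For reference, one possible shape of an answer to the first question, sketched here in Python rather than the roughly ten lines of C# the answer mentions (the function name and stopping rule are my own):

def estimate_pi(places=5):
    # Leibniz series: pi = 4 * (1 - 1/3 + 1/5 - 1/7 + ...).
    # For an alternating series the error is bounded by the first omitted term,
    # so keep adding terms until the next one can no longer disturb the result.
    tolerance = 10.0 ** -(places + 1)
    total, k, sign = 0.0, 0, 1.0
    while 4.0 / (2 * k + 1) > tolerance:
        total += sign * 4.0 / (2 * k + 1)
        sign, k = -sign, k + 1
    return round(total, places)

print(estimate_pi())   # 3.14159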
The use of hungarian notation should be punished with death.
That should be controversial enough ;)
Design patterns are hurting good design more than they're helping it.
IMO software design, especially good software design, is far too varied to be meaningfully captured in patterns, especially in the small number of patterns people can actually remember - and they're far too abstract for people to really remember more than a handful. So they're not helping much.
And on the other hand, far too many people become enamoured with the concept and try to apply patterns everywhere - usually, in the resulting code you can't find the actual design between all the (completely meaningless) Singletons and Abstract Factories.
Less code is better than more!
If the users say "that's it?", and your work remains invisible, it's done right. Glory can be found elsewhere.
PHP sucks ;-)
The proof is in the pudding.
Unit Testing won't help you write good code
The only reason to have Unit tests is to make sure that code that already works doesn't break. Writing tests first, or writing code to the tests is ridiculous. If you write to the tests before the code, you won't even know what the edge cases are. You could have code that passes the tests but still fails in unforeseen circumstances.
And furthermore, good developers will keep coupling low, which will make the addition of new code unlikely to cause problems with existing stuff.
In fact, I'll generalize that even further,
Most "Best Practices" in Software Engineering are there to keep bad programmers from doing too much damage.
They're there to hand-hold bad developers and keep them from making dumbass mistakes. Of course, since most developers are bad, this is a good thing, but good developers should get a pass.
Write small methods. It seems that programmers love to write loooong methods where they do multiple different things.
I think that a method should be created wherever you can name one.
It's ok to write garbage code once in a while
Sometimes a quick and dirty piece of garbage code is all that is needed to fulfill a particular task. Patterns, ORMs, SRP, whatever... Throw up a console or web app, write some inline SQL (feels good), and blast out the requirement.
Code == Design
I'm no fan of sophisticated UML diagrams and endless code documentation. In a high level language, your code should be readable and understandable as is. Complex documentation and diagrams aren't really any more user friendly.
Here's an article on the topic of Code as Design.
Software development is just a job
Don't get me wrong, I enjoy software development a lot. I've written a blog for the last few years on the subject. I've spent enough time on here to have >5000 reputation points. And I work in a start-up doing typically 60 hour weeks for much less money than I could get as a contractor because the team is fantastic and the work is interesting.
But in the grand scheme of things, it is just a job.
It ranks in importance below many things such as family, my girlfriend, friends, happiness etc., and below other things I'd rather be doing if I had an unlimited supply of cash such as riding motorbikes, sailing yachts, or snowboarding.
I think sometimes a lot of developers forget that developing is just something that allows us to have the more important things in life (and to have them by doing something we enjoy) rather than being the end goal in itself.
I also think there's nothing wrong with having binaries in source control... if there is a good reason for it. If I have an assembly I don't have the source for, and it might not necessarily be in the same place on each dev's machine, then I will usually stick it in a "binaries" directory and reference it in a project using a relative path.
Quite a lot of people seem to think I should be burned at the stake for even mentioning "source control" and "binary" in the same sentence. I even know of places that have strict rules saying you can't add them.
Every developer should be familiar with the basic architecture of modern computers. This also applies to developers who target a virtual machine (maybe even more so, because they have been told time and time again that they don't need to worry themselves with memory management etc.)
Software Architects/Designers are Overrated
As a developer, I hate the idea of Software Architects. They are basically people that no longer code full time, read magazines and articles, and then tell you how to design software. Only people that actually write software full time for a living should be doing that. I don't care if you were the world's best coder 5 years ago before you became an Architect; your opinion is useless to me.
How's that for controversial?
Edit (to clarify): I think most Software Architects make great Business Analysts (talking with customers, writing requirements, tests, etc), I simply think they have no place in designing software, high level or otherwise.
There is no "one size fits all" approach to development
I'm surprised that this is a controversial opinion, because it seems to me like common sense. However, there are many entries on popular blogs promoting the "one size fits all" approach to development so I think I may actually be in the minority.
Things I've seen being touted as the correct approach for any project - before any information is known about it - are things like the use of Test Driven Development (TDD), Domain Driven Design (DDD), Object-Relational Mapping (ORM), Agile (capital A), Object Orientation (OO), etc. etc. encompassing everything from methodologies to architectures to components. All with nice marketable acronyms, of course.
People even seem to go as far as putting badges on their blogs such as "I'm Test Driven" or similar, as if their strict adherence to a single approach, whatever the details of the project, is actually a good thing.
It isn't.
Choosing the correct methodologies and architectures and components, etc., is something that should be done on a per-project basis, and depends not only on the type of project you're working on and its unique requirements, but also the size and ability of the team you're working with.

What code metric(s) convince you that provided code is "crappy"? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 5 years ago.
Code lines per file, methods per class, cyclomatic complexity and so on. Developers resist and work around most if not all of them! There is a good Joel article on it (no time to find it now).
What code metric(s) do you recommend for automatically identifying "crappy code"?
What can convince most developers (you can't convince all of us with some crappy metric! :O) ) that this code is "crap"?
Only metrics that can be automatically measured count!
Not an automated solution, but I find WTF's per minute useful.
(the "WTFs per minute" cartoon - source: osnews.com)
No metrics regarding coding style should be part of such a warning.
For me it is about static analysis of the code, which can truly be 'on' all the time:
cyclomatic complexity (detected by checkstyle)
dependency cycle detection (through findbugs for instance)
critical errors detected by, for instance findbugs.
I would put test coverage in a second step, as such tests can take time.
Do not forget that "crappy" code is not detected by metrics alone, but by the combination and evolution (as in "trend") of metrics: see the "What is the fascination with code metrics?" question.
That means you do not just have to recommend code metrics to "automatically identify crappy code"; you also have to recommend the right combination and trend analysis to go along with those metrics.
On a side note, I do share your frustration ;), and I do not share the point of view of tloach (in the comments of another answer): "Ask a vague question, get a vague answer", he says... your question deserves a specific answer.
Number of warnings the compiler spits out when I do a build.
Number of commented out lines per line of production code. Generally it indicates a sloppy programmer that doesn't understand version control.
Developers are always concerned about metrics being used against them, and calling code "crappy" is not a good start. This matters because if you are worried about your developers gaming the metrics, then don't use the metrics for anything that is to their advantage or disadvantage.
The way this works best is: don't let the metric tell you where the code is crappy, but use the metric to determine where you need to look. You look by having a code review, and the decision of how to fix the issue is between the developer and the reviewer. I would also err on the side of the developer against the metric. If the code is still popping up on the metric but the reviewers think it is good, leave it alone.
But it is important to keep in mind this gaming effect when your metrics start to improve. Great, I now have 100% coverage but are the unit tests any good? The metric tells me I am ok, but I still need to check it out and look at what got us there.
Bottom line, the human trumps the machine.
number of global variables.
Non-existent tests (revealed by code coverage). It's not necessarily an indicator that the code is bad, but it's a big warning sign.
Profanity in comments.
Metrics alone do not identify crappy code. However they can identify suspicious code.
There are a lot of metrics for OO software. Some of them can be very useful:
Average method size (both in LOC/Statements or complexity). Large methods can be a sign of bad design.
Number of methods overridden by a subclass. A large number indicates bad class design.
Specialization index (number of overridden methods * nesting level / total number of methods). High numbers indicate possible problems in the class diagram.
There are a lot more viable metrics, and they can be calculated using tools. This can be a nice help in identifying crappy code.
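For the last of these, a rough way to compute it in Python, treating "overridden" as "redefined by the class although a base class already provides it" (only a sketch under that assumption; real metric tools define the terms more carefully):

import inspect

def specialization_index(cls):
    # number of overridden methods * nesting level / total number of methods
    bases = inspect.getmro(cls)[1:-1]             # base classes, excluding cls and object
    own = {n for n, v in vars(cls).items() if callable(v)}
    inherited = {n for b in bases for n, v in vars(b).items() if callable(v)}
    overridden = own & inherited
    total = len(own | inherited)
    return len(overridden) * len(bases) / total if total else 0.0

class Base:
    def save(self): pass
    def load(self): pass

class Special(Base):
    def save(self): pass                          # overrides Base.save

print(specialization_index(Special))              # 1 * 1 / 2 = 0.5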
global variables
magic numbers
code/comment ratio
heavy coupling (for example, in C++ you can measure this by looking at class relations or the number of cpp/header files that cross-include each other)
const_cast or other types of casting within the same code-base (not w/ external libs)
large portions of code commented-out and left in there
My personal favourite warning flag: comment-free code. Usually it means the coder hasn't stopped to think about it; plus it automatically makes the code hard to understand, so it ups the crappy ratio.
At first sight: cargo cult application of code idioms.
As soon as I have a closer look: obvious bugs and misconceptions by the programmer.
My bet: combination of cyclomatic complexity(CC) and code coverage from automated tests(TC).
CC | TC
2 | 0% - good anyway, cyclomatic complexity too small
10 | 70% - good
10 | 50% - could be better
10 | 20% - bad
20 | 85% - good
20 | 70% - could be better
20 | 50% - bad
...
crap4j - a possible tool (for Java) and an explanation of the concept... still in search of a C#-friendly tool :(
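The table above is easy to turn into a crude triage function if you already collect both numbers; this sketch (Python, with the low-complexity cutoff chosen by me) just encodes those rows:

def triage(cc, coverage):
    # verdicts follow the CC / test-coverage table above
    if cc <= 5:
        return "good anyway, cyclomatic complexity too small"
    if cc <= 10:
        return "good" if coverage >= 70 else "could be better" if coverage >= 50 else "bad"
    return "good" if coverage >= 85 else "could be better" if coverage >= 70 else "bad"

print(triage(2, 0))     # good anyway, cyclomatic complexity too small
print(triage(10, 60))   # could be better
print(triage(20, 50))   # bad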
Number of worthless comments to meaningful comments:
'Set i to 1'
Dim i as Integer = 1
I don't believe there is any such metric. With the exception of code that actually doesn't do what it's supposed to (which is a whole extra level of crappiness) 'crappy' code means code that is hard to maintain. That usually means it's hard for the maintainer to understand, which is always to some extent a subjective thing, just like bad writing. Of course there are cases where everyone agrees the writing (or the code) is crappy, but it's very hard to write a metric for it.
Plus everything is relative. Code doing a highly complex function, in minimal memory, optimized for every last cycle of speed, will look very bad compared with a simple function under no restrictions. But it's not crappy - it's just doing what it has to.
Unfortunately there is not a metric that I know of. Something to keep in mind is no matter what you choose the programmers will game the system to make their code look good. I have seen that everywhere any kind of "automatic" metric is put into place.
A lot of conversions to and from strings. Generally it's a sign that the developer isn't clear about what's going on and is merely trying random things until something works. For example, I've often seen code like this:
object num = GetABoxedInt();
// long myLong = (long) num; // throws exception
long myLong = Int64.Parse(num.ToString());
when what they really wanted was:
long myLong = (long)(int)num;
I am surprised no one has mentioned crap4j.
Watch out for ratio of Pattern classes vs. standard classes. A high ratio would indicate Patternitis
Check for magic numbers not defined as constants
Use a pattern matching utility to detect potentially duplicated code
Sometimes, you just know it when you see it. For example, this morning I saw:
void mdLicense::SetWindows(bool Option) {
  _windows = (Option ? true : false);
}
I just had to ask myself 'why would anyone ever do this?'.
Code coverage has some value, but otherwise I tend to rely more on code profiling to tell if the code is crappy.
Ratio of comments that include profanity to comments that don't.
Higher = better code.
Lines of comments / Lines of code
value > 1 -> bad (too many comments)
value < 0.1 -> bad (not enough comments)
Adjust numeric values according to your own experience ;-)
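Counting that ratio for a Python file takes only a few lines; a naive sketch (it ignores docstrings and treats a code line with a trailing comment as code):

def comment_code_ratio(path):
    comments = code = 0
    with open(path) as f:
        for line in f:
            stripped = line.strip()
            if not stripped:
                continue
            if stripped.startswith("#"):
                comments += 1
            else:
                code += 1
    return comments / code if code else float("inf")

ratio = comment_code_ratio(__file__)
print("suspicious" if ratio > 1 or ratio < 0.1 else "ok", round(ratio, 2))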
I take a multi-tiered approach with the first tier being reasonable readability offset only by the complexity of the problem being solved. If it can't pass the readability test I usually consider the code less than good.
TODO: comments in production code. They simply show that the developer does not see tasks through to completion.
Methods with 30 arguments. On a web service. That is all.
Well, there are various ways you could use to judge whether or not code is good code. Following are some of them:
Cohesiveness: if a block of code, whether a class or a method, is found to be serving multiple functions, then it is lower in cohesiveness. Code that is lower in cohesiveness tends to be lower in reusability, and in turn lower in maintainability.
Code complexity: one can use McCabe cyclomatic complexity (no. of decision points) to determine the code complexity. High code complexity indicates code that is harder to use (difficult to read and understand).
Documentation: code without enough documentation also contributes to low software quality from the perspective of the code's usability.
Check out the following page to read about a checklist for code review.
This hilarious blog post on The Code C.R.A.P Metric could be useful.