What terms do you use to describe the nature of a refactorization? - terminology

Currently I'm trying to write a good commit comment for a code refactoring I've made.
And I feel like I'm missing a word to sum up what I did instead of describing it.
Right now my description is:
"Code refactor to improve decoupling inside the generator class."
But IMO, it's not really decoupling as it's only inside a class itself. It doesn't have any strong link to the code responsibility. It's more to improve testability of the class by having more smaller methods instead of few big one.
So that led me to a quite simple question:
What are the most common terms you use in your commit messages to describe a code refactoring ?

http://www.amazon.com/Refactoring-Improving-Design-Existing-Code/dp/0201485672
Is the defacto standard for basic refactoring vocabulary everywhere I've worked.

Cleaned up implementation of function x within class y, by breaking it into functions a, b & c to make unit testing easier.

Related

Avoiding spaghetti code while writing small functions

My understanding of "Spaghetti Code" is a code base that jumps from one block of code to another without an logical and legible purpose. The most common offender seems to be the GOTO statement.
I'm currently reading/referencing the function chapter of Clean Code: A Handbook of Agile Software Craftsmanship. The author, while self admittedly, is extremely strict on the size of functions. I understand the idea of keeping functions small, however, he suggests they should be around 5 lines. While Classes certainly become more legible, I'm afraid of creating spaghetti code by writing smaller functions. Smaller functions also seem to inadvertently create much higher abstractions as well.
At what point does code become spaghetti code? How abstract is too abstract? Any answers would be greatly helpful.
As an aside, I'm a long time follower of Stack Overflow although this is my first time posting a question, so any suggestions regarding my post are welcome as well.
Thanks a lot!
As already said in the comments, there is no absolute rule. At the end, you should aim for a good readability of your code. But that is not all about the length of your methods. Robert Martin suggests ordering the methods according to the degree of abstraction. Abststract methods should be at the top of your class, and the more a method is, the deeper it should be located.
Another importand aspect is the method name. It should be chosen well in order to make clear what the method does! If you choose your method names wisely, then comments should be hardly necessary. For example, consider an if-statement:
if(isValidAge(value)) {
...
}
is much more readable than
if(value > 10 && value < 99) {
...
}
because the intention of the statement becomes much clearer. Of cause you could add a comment in the second example. But comments often become outdated (there is an extra chapter in Robert Martin's book about that). I think, this style of programming leads to many short methods.
It is hard to choose the right level of abstraction. According to my expecience, it is easier to start with a low level of abstraction. So I can first concentrate on solving the problem well. When I need more abstraction later, I still can refactor the code. TDD helps a lot!
Hope, this helps ...
I agree with comments and answers here. From practical point of view the thinks which Robert Martin writes in his books are every time very good orientations and I try to get as much close as possible to this "rules" and indeed 5-lines-methodes are mostly not to bad.
In my eyes best way to avoid spaghetti code is to write (small) classes with a high Cohesion. The disadvantage is that you become a whole bunch of classes, which makes it sometimes a little bit more hard for new employees to come in the project.
I understand the idea of keeping functions small, however, he suggests they should be around 5 lines.
That sounds ideal :)
While Classes certainly become more legible, I'm afraid of creating spaghetti code by writing smaller functions.
Spaghetti code is caused by code jumping from place to place with (having different levels of abstraction in the same function - low-level IO code and high level application logic). If you extract small functions, your result is getting further away from spaghetti code, not closer).
At what point does code become spaghetti code?
When the code forces you (the programmer) to make mental jumps (switch contexts) from line to line, the code is spaghetti code. This is true whether you use GOTOs or not (but GOTOs can make the problem much worse).

When to subclass and when to make do with existing class?

I want to dispatch events to announce the progress of an asynchronous process. The event should contain 2 properties: work done and work total.
As the name suggests ;) i could use ProgressEvent; it has bytesLoaded and bytesTotal properties that i can use. However, my async process isn't loading bytes, its processing pixels, so the property names are a bit misleading for my use case - although the class name is perfect.
The alternative is to create a custom event with two properties that i can name how i like. But this means another class added to the code base.
So my question is; Is it better to reuse an existing class where the properties are suitable but maybe the naming isn't ideal; Or to create a custom class that perfectly fits the requirement? Obviously, one extra class is no big deal, but OOP is all about reusing stuff so adding an unnecessary class does make me uneasy.
I await your thoughts...
PS: This is my first question on stack so be gentle
For clarity, I'd create a new class. Adding a new class is not much overhead at all, especially for something simple like an event. I find that code is more readable when I don't have to make mental translations (like bytesLoaded really means pixelsLoaded). To me this is akin to choosing poor names for variables.
In addition, by going the other route and re-using the ProgressEvent class, I would feel compelled to document the code to indicate that we're dealing with pixels rather than bytes. That starts to get messy if you have a lot of classes that uses the event.
Re-use is great, but I'd opt for clarity as long as it doesn't impact your productivity or the app's performance.
writing in doc clearly, using custom event or ProgressEvent is well.

What is the golden rule for when to split code up into functions?

It's good to split code up into functions and classes for modularity / decoupling, but if you do it too much, you get really fragmented code which is also not good.
What is the golden rule for when to split code up into functions?
It really does depend on the size and scope of your project.
I tend to split things into functions every time there is something repeated, and that repetition generalized/abstracted, following the golden rule of DRY (Do not repeat yourself).
As far as splitting up classes, I tend to follow the Object Oriented Programming mantra of isolating what is the same from what is different, and split up classes if one class is implementing more an one large theoretical "idea" or entity.
But honestly if your code is too fragmented and opaque, you should try considering refactoring into a different approach/paradigm.
If I can give it a good name (better than the code it replaces), it becomes a function
I think you generally have to think of a chunk of codes have a chance of being reuse. It comes with experience to figure that out and plan ahead.
I agree with the answers that concentrate on reusability and eliminating repetition, but I also believe that readability is at least as important. So in addition to repetition, I would look for pieces of code (classes, functions, blocks, etc.) that do more than one thing.
If the name associated with the code does not describe what it does, then that is a good time to refactor that code into units which each have a single responsibility and a descriptive name. This separation of concerns will help both reusability, and readability.
Useful code can stick around for a long time, so it is important that you (or ideally someone else) can go back and easily understand code that was written months or years before.
Probably my own personal rule is if it is more than 2 lines long, and is referenced more than once on the same page (ASP.net), or a few times spread over a couple of pages, than I will write a function to do it.
I was taught that anything you do more than once should be a function. Anything similar should have a parent class, and above all else consult your source code "standards" within your organization. The latter mostly deals with formatting.
First, write the feature you're adding. (Notice the word "First", we tend to write a function/class before writing the feature, which might lead to having too many fragmentation).
Then, review the code you just wrote/changed, find what blocks of the code that is:
3-lines or more, and..
repeated, and..
Can be grouped under a named function. Because code is written by us, people (not programs), to be read/changed later by us, people (not programs).

Is it code-smelly to have empty classes in the middle of a class hierarchy?

I sometimes end up with a class hierarchy where I have an abstract base class with some common functionality and a couple of implementing classes that fall into two (rarely more) groups which I want to treat differently in some cases. An example would be an abstract tree node class and different branch and leaf implementations where I want to distinguish branches and leaves at some point.
These intermediate classes are then only used for "is-a" statements in flow control and they don't contain any code, although I have had cases where they "grew" some code later.
Does that seem smelly to you? In my tree example, one alternative would be to add isLeaf() / isBranch() abstract methods to the base class and implement those on the intermediate classes, but that didn't seem to be any better to me, really, unless I'd mean to have classes that could be multiple things at once.
To me, using "is-a" tests in flow control is just as smelly as using switch/case. In a good OO design, neither is needed.
Yes, deep inheritance hierarchies are a code smell anyway.
Yup, definitely a code smell -- don't code these empty classes unless you're ready to write that code into it. Think YAGNI (you aint gonna need it) -- don't do it unless you need it already.
Also, have you considered cases wherein these classes are only there to provide abstract methods, or to group them based on capabilities in terms of methods or properties?
If that's the case, maybe what you really need are interfaces, not additional abstract classes?
In general, empty classes are a code smell.
I agree your isLeaf or isBranch methods are a correct alternative.
They add information about the objects , which is helpful.
(This is because, on the super class, you can't express that subclasses are "either leaf or branch").
The two methods with opposite results might also be considered as code duplication.
You could use only one... But I would recommend return an enumerated value LEAF or BRANCH.
A class that doesn't contain any code is definitely a code-smell....
Seems alright to me, if you're going to put new functionality in later.
But if not, an enum is generally used here.
-- Edit
Though I must agree with Ber, that you shouldn't generally be using 'is-a' anyway.

Is HTML considered a programming language? [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 11 years ago.
I guess the question is self-explanatory, but I'm wondering whether HTML qualifies as a programming language (obviously the "L" stands for language).
The reason for asking is more pragmatic—I'm putting together a resume and don't want to look like a fool for listing things like HTML and XML under languages, but can't figure out how to classify them.
No, HTML is not a programming language. The "M" stands for "Markup". Generally, a programming language allows you to describe some sort of process of doing something, whereas HTML is a way of adding context and structure to text.
If you're looking to add more alphabet soup to your CV, don't classify them at all. Just put them in a big pile called "Technologies" or whatever you like. Remember, however, that anything you list is fair game for a question.
HTML is so common that I'd expect almost any technology person to already know it (although not stuff like CSS and so on), so you might consider not listing every initialism you've ever come across. I tend to regard CVs listing too many things as suspicious, so I ask more questions to weed out the stuff that shouldn't be listed. :)
However, if your HTML experience includes serious web design stuff including Ajax, JavaScript, and so on, you might talk about those in your "Experience" section.
YES, a declarative programming language.
You really want to list the most important things you know that are relative to the job you're applying for on your resume. If you list ASP.NET but don't list HTML, even though it's somewhat obvious, there are a lot of managers and/or HR types that will assume you don't know HTML since it's not listed. I've had it happen to me before.
Update - Some say no it isn't a programming language, and you may not agree with me on this, but regardless on a resume it IS a programming language. You get HR types looking at your resume before the hiring manager even sees it. If the manager says you need to know HTML, and it's not listed in the 'programming languages' section then the HR person may disregard you resume thinking you don't know it because it's not listed.
Update 6-8-2012: Any instruction that tells the computer to do something is a programming language. So even after all these years, I still stand by my answer. HTML is a programming language. Something that isn't a programming language would be XML.
No, the clue is in the M - it's a Markup Language.
On some level Chris Pietschmann is correct. SQL isn't Turing complete (at least without stored procedures) yet people will list that as a language, TeX is Turing complete but most people regard it as a markup language.
Having said that: if you are just applying for jobs, not arguing formal logic, I would just list them all as technologies. Things like .NET aren't languages but would probably be listed as well.
The 'M' stands for a 'Markup'. It's a 'Markup Language' not a programming language. Some people will disagree with this, but my opinion is that if it lacks logical constructs (conditional branching, iteration, etc) its not really a programming language.
As for the resume, I would suggest putting HTML and XML under a section like 'Technologies'. I usually have a section like this where I list things like version control software, OS's I've developed for, build systems, etc.
No, HTML is a not a programming language. It is called "markup" for that reason.
If you're going to say that HTML is a programming language, then you might as well include things such as word documents, as they too are based on ML, or 'Markup Language'.
Simply put--HTML defines content!
I think not exactly a programming language, but exactly what its name says: a markup language.
We cannot program using just pure, HTML. But just annotate how to present content.
But if you consider programming the act of tell the computer how to present contents, it is a programming language.
In the advanced programming languages class I took in college, we had what I think is a pretty good definition of "programming language": a programming language is any (formal) language capable of expressing all computable functions, which the Church-Turing thesis implies is the set of all Turing-computable functions.
By that definition, no, HTML is not a programming language, even a declarative one. It is, as others have explained, a markup language.
But the people reviewing your resume may very well not care about such a formal distinction. I'd follow the good advice given by others and list it under a "Technologies" type of section.
I think that it definitely has its place on a resume. Knowledge of HTML is valuable, and there really is a lot to know, what with cross-browser compatibility issues and standards which should be followed.
I wouldn't list HTML under "programming languages" alongside C# or something, but it's worth noting your experience.
No - there's a big prejudice in IT against web design; but in this case the "real" programmers are on pretty firm ground.
If you've done a lot of web design work you've probably done some JavaScript, so you can put that down under 'programming languages'; if you want to list HTML as well, then I agree with the answer that suggests "Technologies".
But unless you're targeting agents who're trying to tick boxes rather than find you a good job, a bare list of things you've used doesn't really look all that good. You're better off listing the projects you've worked on and detailing the technologies you used on each; that demonstrates that you've got real experience of using them rather than just that you know some buzzwords.
I get around this problem by not having a "programming languages" section on my resume. Instead I label it simply as "languages", and I stick HTML and CSS at the end. I'd rather make life easier for the reviewer so that they can see whether mine checks-off all their requirements.
Only fools would disregard an applicant because he or she listed HTML under "languages" instead of some other label, especially since there is no industry standard. And who wants to work for fools?
Well, L is for language, but it doesn't imply programming language. After all, English or French are (natural) languages too! ;-)
As said above, put them under a subsidiary section, Technology seems to be a good term.
(Looking at my own resume, not updated in a while) I have made a section just called "Languages", so I can't get wrong... :-D
I have put "(X)HTML and CSS, XML/DTD/Schema and SVG" at the end of the section, clearly separated.
In French, I have a section "Langages" (programming and markup) and another "Langues" (French/English). In the English version, I titled both at "Languages", which is clumsy now that I think of it, although context clarify this. I should find a better formulation.
HTML is in no way a programming language.
Programming languages deals with ''proccessing functions'', etc. HTML just deals with the visual interface of a web page, where the actual programming handles the proccessing. PHP for example.
If anyone really knows programming, I really can't see how people can mistake HTML for an actual programming language.
In recruitment terms, having been on both sides of the fence, definitely put HTML under 'programming languages', or perhaps more safely under 'technologies'
Yes, we all know that it is a Markup Language and not a Programming Language. but a) Recruitment Agencies don't know and don't care, and b) employers don't know and don't care. Really.
And pointing out their ignorance will only serve you ill. And the techies who eventually see your CV will be grateful for a candidate who has heard of HTML, and won't worry about the taxonomy.
Honestly, it isn't an issue.
List it under technologies or something. I'd just leave it off if I were you as it's pretty much expected that you know HTML and XML at this point.