Advantages/Disadvantages of Refactoring Tools - language-agnostic

What are the advantages and disadvantages of refactoring tools, in general?

Advantages
You are more likely to do the refactoring if a tool helps you.
A tool is more likely to get a “rename” type refactoring right the first time than you are.
A tool lets you do refactorings on a codebase without unit tests that you could not risk doing by hand.
A tool can save you lots of time.
Both the leading tools (RefactorPro/CodeRush and ReSharper) will also highlight most coding errors without you having to do a compile.
Both the leading tools will highlight where you don’t keep to their concept of best practices.
Disadvantages
Sometimes the tool will change the meaning of your code without you expecting it, due to bugs in the tool or the use of reflection etc. in your codebase (see the sketch after this list).
A tool may make you feel safe with fewer unit tests…
A tool can be very slow…, so for renaming local vars etc. it can be quicker to do it by hand.
A tool will slow down the development system a lot, as the tool has to keep its database updated while you are editing code.
A tool takes time to learn.
A tool pushes you towards the refactorings it includes, and you may ignore the ones it doesn't, to your disadvantage.
A tool will have a large memory footprint for a large codebase; however, memory is cheap these days.
No tool will cope well with very large solution files.
You will have to get your boss to agree to paying for the tool; this may take longer than the time the tool saves.
You may have to get your IT department to agree to you installing the tool.
You will be lost in your next job if they won't let you use the same tool :-)
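To make the first disadvantage concrete, here is a minimal Java sketch (class and method names are invented for illustration) of how a string-based reflection call can silently defeat a rename refactoring:

```java
import java.lang.reflect.Method;

public class ReflectionPitfall {
    // Suppose a refactoring tool renames this method to exportReport()...
    public void export() {
        System.out.println("exporting...");
    }

    public static void main(String[] args) throws Exception {
        ReflectionPitfall target = new ReflectionPitfall();
        // ...the string literal below is invisible to a naive rename, so this
        // lookup would then throw NoSuchMethodException at runtime, with no
        // compile-time warning that anything changed.
        Method m = ReflectionPitfall.class.getMethod("export");
        m.invoke(target);
    }
}
```

Some tools do offer to search string literals as well, but they can only guess, and dynamically constructed names are beyond any of them.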

Advantage: the obvious one: speed.
Disadvantages:
they push you towards the refactorings they include and you may ignore the ones they don't, to your disadvantage;
I've only tried one, with VS, and it slowed down the app noticeably. I couldn't decide if it was worth it, but I had to rebuild the machine and haven't re-installed it since, so I guess that tells you something.

Code improvement suggestions (can be both an advantage and a disadvantage)
Removes code noise (advantage)
Renaming variables, methods (advantage)

I'd say that the speed of making code changes or writing code is the biggest advantage. I have CodeRush and I am lost without it.
I'd say the biggest disadvantage is the memory footprint; if you are tight on memory then it's probably going to hurt more than help. But I've got 4GB and 8GB on my dev boxes, so I don't really notice. (Not that they take huge amounts of memory, but if you have 2GB or less then it is going to be noticeable.)
Also, I've noticed that the two big refactoring tools for .NET (RefactorPro/CodeRush and ReSharper) both have problems with web site projects (a legacy inheritance, so out of my control) in their code analysis/suggestion engines. They seem to think everything is bad (actually, that's probably a fairly accurate assessment for a web site project, but I don't want to be reminded of it constantly).

Related

Developing using pre-release dev tools

We're developing a web site. One of the development tools we're using has an alpha release available of its next version, which includes a number of features we really want to use (i.e. they'd save us from having to implement thousands of lines to do pretty much exactly the same thing anyway).
I've done some initial evaluations of it and I like what I see. The question is, should we start actually using it for real? I.e. beyond just evaluating it, actually using it for our development and relying on it?
As alpha software, it obviously isn't ready for release yet... but then nor is our own code. It is open source, and we have the skills needed to debug it, so we could in theory actually contribute bug fixes back.
But on the other hand, we don't know what the release schedule for it is (they haven't published one yet), and while I feel okay developing with it, I wouldn't be so sure about using it in production so if it isn't ready before we are then it may delay our own launch.
What do you think? Is it worth taking the risk? Do you have any experiences (good or bad) of similar situations?
[EDIT]
I've deliberately not specified the language we're using or the dev-tool in question in order to keep the scope of the question broad, as I feel it's a question that can apply to pretty much any dev environment.
[EDIT2]
Thank you to Marjan for the very helpful reply. I was hoping for more responses though, so I'm putting a bounty on this.
I've had experience contributing to an open source project once, like you said you hope to contribute. They ignored the patch for one year (they have customers to attend to, of course, although they don't sell the software but the support). After one year, they rejected the patch with no alternative solution to the problem, and without a sound rationale for doing so. It was just out of their scope at that time, I guess.
In your situation I would try to solve one or two of their not-so-high-priority, already-reported bugs and see how responsive they are, and then decide. Because your success in meeting deadlines will be tied to theirs. If you have to maintain a copy of their artifacts, that's guaranteed pain.
In short: not only evaluate the product, evaluate the producers.
Regards.
My personal take on this: don't. If they don't come through for you in your time scale, you're stuck and will still have to put in the thousands of lines yourself and probably under a heavy time restriction.
Having said that, there is one way I see you could try and have your cake and eat it too.
If you see a way to abstract it out, that is, to insulate your own code from the library's, for example using the adapter or facade patterns, then go ahead and use the alpha for development. But determine beforehand the latest date, according to your release schedule, by which you would have to start developing your own thousands-of-lines version behind the adapter/facade. If the alpha hasn't turned into an RC by then: grin and bear it, and develop your own.
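As a rough illustration of that insulation layer, here is a minimal Java sketch. All names are invented, and the alpha library is stood in for by a stub class:

```java
// Stand-in for the alpha library's API (hypothetical; the real library is unnamed).
class AlphaRenderer {
    String renderById(String id) { return "<report " + id + ">"; }
}

// The seam: application code depends only on this interface, never on the alpha directly.
interface ReportRenderer {
    String render(String reportId);
}

// Thin adapter over the alpha library. If the alpha never reaches RC,
// this one class is replaced by the in-house implementation.
class AlphaReportRenderer implements ReportRenderer {
    private final AlphaRenderer delegate = new AlphaRenderer();

    @Override
    public String render(String reportId) {
        return delegate.renderById(reportId);
    }
}

public class FacadeSketch {
    public static void main(String[] args) {
        ReportRenderer renderer = new AlphaReportRenderer(); // the only line to change later
        System.out.println(renderer.render("42"));
    }
}
```

The day the alpha proves unusable, only the adapter class is rewritten; every caller keeps compiling.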
It depends.
For opensource environments it depends more on the quality of the release than the label (alpha/beta/stable) it has. I've worked with alpha code that is rock solid compared to alleged production code from another producer.
If you've got the source then you can fix any bugs, whereas with closed source (usually commercially supported) you could never release production code built with a beta product, because it's unsupported by the vendor who has the code, and so you can't fix it.
So in your position I'd be assessing the quality of the alpha version and then deciding if that could go into production.
Of course all of the above doesn't apply to anything even remotely safety critical.
It is just a question of managing risks. In open source, an alpha release can mean a lot of different things. You need to be prepared to:
handle API changes;
provide bug fixes and workarounds;
test stability, performance and scalability yourself;
track changes much more closely, and decide whether to adopt them yet;
track the progress they are making and their responsiveness to patches/issues.
You do use continuous integration, don't you?

Legacy code - when to move on

My team and I support a large number of legacy applications, all of which are currently functional but problematic to support and maintain. They all depend on code that the compiler manufacturer officially no longer supports.
So the question is should we leave the code as is, and risk a new compiler breaking our code, or should we bite the bullet and update all the code?
The answer depends entirely on the resources your employer (or you yourself) can devote to the refactoring (or even to totally rewriting big parts).
So you should first estimate how much time and how many developers you can afford for refactoring the application, then see whether you think it'll be enough.
If you can afford the time and people, then do it, don't hesitate! You're investing in the future by reducing the time it takes to debug the application, so it will be helpful and less expensive once the refactoring is done.
It depends on the nature of the applications, just how big and important they are, as well as the programming culture at your workplace, and the resources available to you.
If the applications are valuable enough to you that they are worth the trouble, and you have the necessary resources, then do the update. Don't let the problem persist.
If they are not valuable enough to be worth a full-scale update effort, or appropriate resources are not at hand, perhaps work on updating one at a time if possible.
Just some suggestions, but again this greatly depends on you and your organization.
It sounds like you have a large technical debt. This debt is only going to increase unless you do something. Both things you mentioned are options, and risky, but long term it's a risk you need to take.
Using an updated compiler just means you need to update the code to work with the new compiler. Something is bound to break, but you can then refactor the parts that break. This allows you to migrate gradually.
The other option is to update your entire code base. This takes time, during which you need to maintain 2 copies of the code, or freeze the old version. Freezing the old version is probably not an option.
I would recommend using an updated compiler and fixing what breaks. This allows you to add features, while refactoring and fixing the current codebase.
Rewriting the code can be a useful step for your company for many reasons:
you can use a new compiler and a more recent platform
you can refactor the code, removing its weaknesses
you can motivate your people, because developing new code is better than correcting bugs in old code.
Why don't you start that activity with a small number of people, beginning with the most commonly used parts of the code? You can group them into a DLL and use it for future projects as well.

What are the disadvantages of code reuse?

A few years ago, we needed a C++ IPC library for making function calls over TCP. We chose one and used it in our application. After a while, it became clear it didn't provide all the functionality we needed. In the next version of our software, we threw the third-party IPC library out and replaced it with one we wrote ourselves. Since then, I have sometimes doubted whether this was a good decision, because it has proven to be quite a lot of work and it obviously feels like reinventing the wheel. So my question is: are there disadvantages to code reuse that justify this reinvention?
I can suggest a few:
The bugs get replicated - if you reuse buggy code :)
Sometimes it may add overhead. For example, if you just need to do one simple thing, it is not advisable to use a complex, BIG library that implements the required feature.
You might face licensing concerns.
You may need to spend some time learning/configuring the external library. This may not be effective if redeveloping the feature yourself takes much less time.
Reusing a poorly documented library may take more time than expected/estimated.
P.S. The reasons for writing our own library were:
Evaluating external libraries is often very difficult and it takes a lot of time. Also, some problems only become visible after a thorough evaluation.
It made it possible to introduce some features that are specific for our project.
It is easier to do maintenance and to write extensions, as you know the library through and through.
It's pretty much always case by case. You have to look at the suitability and quality of what you're trying to reuse.
The number one issue is: you can only successfully reuse code if that code is GOOD code. If it was designed poorly, has bugs, or is very fragile, then you'll run into the same issues you already ran into -- you have to go and do it yourself anyway, because it's so hard to modify the existing code.
However, if it's a third-party library that you are considering using that you don't have the source code for, it's a little different. You can try and get the source if it's that kind of library. Some commercial library vendors are open to suggestions and feature requests.
The Golden Wisdom :: It Has To Be Usable Before It Can Be Reusable.
The biggest disadvantage (you mention it yourself) of reusing third-party libraries is that you become strongly coupled to and dependent on how that library works and how it's supposed to be used, unless you manage to create a middle interface layer that can take care of it.
But it's hard to create a generic interface, since replacing an existing library with another one more or less requires that the new one works in similar ways. However, you can always rewrite the code using it, but that might be very hard and take a long time.
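For what such a middle layer might look like: the question's library was C++, but the shape is the same in any language, so here is a minimal Java sketch with invented names:

```java
// The insulation layer: callers program against IpcChannel,
// never against the vendor library directly.
interface IpcChannel {
    byte[] call(String function, byte[] payload) throws IpcException;
}

class IpcException extends Exception {
    IpcException(String message, Throwable cause) { super(message, cause); }
}

// One adapter per backend: first a thin wrapper over the third-party
// library, later (as in the question) an in-house replacement. Swapping
// them means writing a new adapter, not touching every call site.
class InHouseTcpChannel implements IpcChannel {
    @Override
    public byte[] call(String function, byte[] payload) {
        // ... in-house TCP framing and dispatch would live here ...
        return new byte[0]; // placeholder result
    }
}
```

The catch described above is real, though: if the replacement library has a different call model (say, asynchronous instead of blocking), no interface this thin will hide it.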
Another aspect is that if you reinvent the wheel, you have complete control over what's happening and you can make modifications as you see fit. This can be completely impossible if you are depending on a third-party library staying alive and constantly providing you with updates and bug fixes. On the other hand, reusing code this way enables you to focus on other things in your software, which sometimes might be the right thing to do.
There's always a trade off.
If your code relies on external resources and those go away, you may be crippling portions of many applications.
Since most reused code comes from the internet, you run into all the issues with the "Bathroom Wall of Code" that Atwood talks about. You can run into issues with insecure or unreliable borrowed code, and the more black-boxed it is, the worse.
Disadvantages of code reuse:
Debugging takes a whole lot longer, since it's not your code and it's likely that it's somewhat bloated code.
Any specific requirements will also take more work, since you are constrained by the code you're reusing and have to work around its limitations.
Constant code reuse will, in the long run, result in bloated and disorganized applications with hard-to-chase bugs - programming hell.
Reusing code can (depending on the case) reduce the challenge and satisfaction factor for the programmer, and also waste an opportunity to develop new skills.
It depends on the case, the language, and the code you want to reuse or rewrite. In general I believe that the higher-level the language is, the more I tend towards code reuse. Bugs in higher-level code can have a bigger impact, and such code is easier to rewrite. High-level code must stay readable, neat and flexible. Of course that could be said of all code, but, somehow, rewriting a C library sounds like less of a good idea than rewriting (or rather refactoring) PHP model code.
So anyway, these are some of the arguments I'd use to promote "reinventing the wheel".
Sometimes it's just faster, more fun, and better in the long run to rewrite from scratch than to keep working around the bugs and limitations of the current codebase.
I'm wondering what you are using to maintain this library you reinvented?
Creating reusable code costs more time up front.
When the master branch has an update, you need to sync it and deploy again.
The bugs get replicated - if you reuse buggy code.
Reusing poorly documented code may take more time than expected/estimated.

Benefits of cross-platform development?

Are there benefits to developing an application on two or more different platforms? Does using a different compiler on even the same platform have benefits?
Yes, especially if you plan to distribute your code for multiple platforms.
But even if you don't, cross-platform development is a form of future-proofing; if it runs on multiple (diverse) platforms today, it's more likely to run on future platforms than something that was tuned, tweaked, and specialized to work on a version 7.8.3 clean install of vendor X's Q-series boxes (patch level 1452) and nothing else.
There seems to be a benefit in finding, and simply preventing, bugs by using a different compiler and a different OS. Different CPUs can pin down endian issues early. There is pain at the GUI level if you want to stay native at that level.
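The endian trap mentioned above is easy to illustrate. Here is a minimal Java example (the wire bytes are invented) of the same four bytes meaning two different numbers:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class EndianDemo {
    public static void main(String[] args) {
        byte[] wire = {0x01, 0x00, 0x00, 0x00}; // the value 1, written little-endian

        // Java's default (network) order is big-endian: the same bytes read as 16777216.
        int asBigEndian = ByteBuffer.wrap(wire).order(ByteOrder.BIG_ENDIAN).getInt();
        // Declaring the order explicitly recovers the intended value, 1.
        int asLittleEndian = ByteBuffer.wrap(wire).order(ByteOrder.LITTLE_ENDIAN).getInt();

        System.out.println(asBigEndian + " vs " + asLittleEndian); // 16777216 vs 1
    }
}
```

Code that has only ever run on one CPU family never has to make that byte order explicit, which is exactly the kind of hidden assumption a second platform flushes out.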
Short answer: Yes.
Short of cloning a disk, it is almost impossible to make two systems exactly alike, so you are going to end up running on "different platforms" whether you meant to or not. By specifically confronting and solving the "what if system A doesn't do things like B?" problem head on you are much more likely to find those key assumptions your code makes.
That said, I would say you should get a good chunk of your base code working on system A, and then take a day (or a week or ...) and get it running on system B. It can be very educational.
My education came back in the 80's when I ported a source level C debugger to over 100 flavors of U*NX. Gack!
Are there benefits to developing an application on two or more different platforms?
If this is production software, the obvious reason is the lure of a larger client base. Your product's appeal is magnified the moment the client hears that you support multiple platforms. Remember, most enterprises do not use a single OS, or even a single version of an OS. It is fairly typical to find one section using Windows, another using Mac, and a smaller one using some flavor of Linux.
It is also often the case that customizing a product for a single platform is far more tedious than making it run on multiple platforms. The law of diminishing returns kicks in even before you know it.
Of course, all of this makes little sense, if you are doing customization work for an existing product for the client's proprietary hardware. But even then, keep an eye out for the entire range of hardware your client has in his repertoire -- you never know when he might ask for it.
Does using a different compiler on even the same platform have benefits?
Yes, again. Different compilers implement different extensions. See to it that you are not dependent on a particular version of a particular compiler.
Further, there may be a bug or two in the compiler itself. Using multiple compilers helps sort these out.
I have further seen bits of a (cross-platform) product using two different compilers -- one was used in those modules where floating-point manipulation required a very high level of accuracy. (It's been a while since I've heard of anyone else doing that, but ...)
I've ported a large C++ program, originally Win32, to Linux. It wasn't very difficult. It was mostly a matter of dealing with compiler incompatibilities, because the MS C++ compiler at the time was non-compliant in various ways (I expect that problem has mostly gone now, until C++0x features start gradually appearing), and of writing a simple platform abstraction library to centralize the platform-specific code in one place. It depends to what extent you are dependent on services from the OS that would be hard to mimic on a new platform.
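The abstraction library in that port was C++; purely as a shape, a Java-flavored sketch of the same idea (the configDir policy is invented) might look like this:

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Tiny platform abstraction layer: every OS-specific decision funnels
// through this one class instead of being scattered across the codebase.
public final class Platform {
    private static final String OS = System.getProperty("os.name").toLowerCase();

    public static boolean isWindows() { return OS.contains("win"); }

    // One example of a per-platform service: where user configuration lives.
    public static Path configDir(String appName) {
        if (isWindows()) {
            return Paths.get(System.getenv("APPDATA"), appName);
        }
        return Paths.get(System.getProperty("user.home"), "." + appName);
    }
}
```

The point is the funnel: when the next platform arrives, the #ifdef-equivalent decisions live in one file instead of hundreds.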
You don't have to build portability in from the ground up. That's why "porting" is often described as an activity you can perform in one shot after an initial release on your most important platform. You don't have to do it continuously from the very start. Purely for economic reasons, if you can avoid doing work that may never pay off, obviously you should. The cost of porting later on, when really necessary, turns out to be not that bad.
Mostly there is an existing platform that the application is written for (custom software). But you address more developers (on both platforms) if you decide to use a platform-independent language.
Also, products (standard software) for SMEs can be sold better if they run on different platforms! You can gain access to both markets, Windows and Linux (and Mac OS X, and so on...).
Big companies mostly buy hardware that is supported/certified by the product vendor, solely to deploy the specified product.
If you develop on multiple platforms at the same time, you get the advantage of being able to use different tools. For example, I once had a memory overwrite (I still swear I didn't need the +1 for the null byte!) that caused free to crash. I brought the code up to speed on Windows and found the overwrite in about 1 minute with Rational Purify... I had spent a week chasing it under Linux (valgrind might have found it... but I didn't know about it at the time).
Different compilers on the same or different platforms are, to me, a must, as each compiler will report different things, and sometimes the report from one compiler about an error will be gibberish while the other compiler makes it very clear.
Using multiple databases while developing means you are much less likely to tie yourself to a particular database, which means you can swap the database out if there is a reason to do so. If you want to integrate something that uses Oracle into an existing infrastructure that uses SQL Server, for example, it can really suck - much better if the Oracle or SQL Server pieces can be moved to the other system (I know of some places that have 3 different databases for their financial systems... ick).
In general, always developing for two or three targets means that the odds of finding mistakes are better, and the odds of the system being more flexible are better.
On the other hand all of that can take time and effort that, at the immediate time, is seen as an unneeded expense.
Some platforms have really dreadful development tools. I once worked in an IB where, rather than use Sun's ghastly toolset, people developed code in VC++ and then ported it to Solaris.

Switching to ORMs

I'm toying with the idea of phasing an ORM into an application I support. The app is not very well structured and has no unit tests, so any change will be risky. I'm obviously concerned about whether I've got a good enough reason to change. The idea is that there will be less boilerplate code for data access and therefore greater productivity.
Does this ring true with your experience?
Is it possible or even a good idea to phase it in?
What are the downsides of an ORM?
I would strongly recommend getting a copy of Michael Feathers' book Working Effectively with Legacy Code (by "legacy code" Feathers means any system that isn't adequately covered by unit tests). It is full of good ideas that should help you with your refactoring and phasing in of best practices.
Sure, you could phase in the introduction of an ORM, initially using it for accessing some subset of your domain model. And yes, I have found that use of an ORM speeds up development time - this is one of the key benefits and I certainly don't miss the days when I used to laboriously hand-craft data access layers.
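To make "laboriously hand-craft data access layers" concrete: the thread's tool is NHibernate, but in its Java cousin (JPA/Hibernate) the trade looks roughly like this sketch (the entity and its fields are invented):

```java
import javax.persistence.Entity;
import javax.persistence.EntityManager;
import javax.persistence.Id;

// One mapped class replaces a hand-written INSERT/SELECT/UPDATE/DELETE
// quartet of SQL strings plus the ResultSet-to-object copying code.
@Entity
public class Customer {
    @Id
    private Long id;
    private String name;

    // With the mapping in place, persistence is one call; the ORM
    // generates the SQL and the parameter binding.
    static void save(EntityManager em, Customer c) {
        em.persist(c);
    }
}
```

That generated SQL is also the first downside people hit: when it is wrong or slow, you find yourself debugging the mapping rather than the query.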
Downsides of ORM - from experience, there is inevitably a bit of a learning curve in getting to grips with the concepts, configuration and idiosyncrasies of the chosen ORM solution.
Edit: corrected author's name
The "Robert C Martin" book, which was actually written by Michael Feathers ("Uncle Bob" is, it seems, a brand name these days!) is a must.
It's near-impossible - not to mention insanely time-consuming - to put unit tests into an application not developed with them. The code just won't be amenable.
But that's not a problem. Refactoring is about changing design without changing function (I hope I haven't corrupted the meaning too badly there) so you can work in a much broader fashion.
Start out with big chunks. Set up a repeatable execution, and capture what happens as the expected result for subsequent executions. Now you have your app, or part of it at least, under test. Not a very good or comprehensive test, sure, but it's a start and things can only get better from there.
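A minimal sketch of that capture-and-pin approach, JUnit-style (the entry point LegacyReportGenerator is hypothetical):

```java
import static org.junit.Assert.assertEquals;
import org.junit.Test;

public class CharacterizationTest {
    // Pasted verbatim from the first run of the legacy code (captured, not specified).
    private static final String CAPTURED_OUTPUT = "...";

    @Test
    public void reportMatchesCapturedOutput() {
        // Re-run the same repeatable execution...
        String actual = LegacyReportGenerator.generate("2008-10"); // hypothetical entry point
        // ...and pin today's behaviour as the expected result.
        assertEquals(CAPTURED_OUTPUT, actual);
    }
}
```

The pinned value isn't correct by any specification; it is simply what the app does today, which is exactly the net you need before swapping the data access code out from under it.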
Now you can start to refactor. You want to start extracting your data access code so that it can be replaced with ORM functionality without disturbing too much. Test often: with legacy apps you'll be surprised what breaks; cohesion and coupling are seldom what they might be.
I'd also consider looking at Martin Fowler's Refactoring, which is, obviously enough, the definitive work on the process.
I work on a large ASP.NET application where we recently started to use NHibernate. We moved a large number of domain objects that we had been persisting manually to SQL Server over to NHibernate instead. It simplified things quite a bit and made it much easier to change things over time. We're glad we made the change and are using NHibernate where appropriate for a lot of our new work.
I've heard that TypeMock is often used to refactor legacy code.
I seriously think introducing an ORM into a legacy application is asking for trouble (and might be the same amount of trouble as a complete rewrite).
Other than that, an ORM is a great way to go, and should definitely be considered.
The rule for refactoring is: have unit tests.
So maybe you should first put some unit tests in place, at least for the core/major things.
The ORM should be designed to decrease boilerplate code. The time/trouble vs. ROI of being enterprisey is up to you to estimate :)
Unless your code is already architected to allow for "hot swapping" of your model layer backend, changing it in any way will always be extremely risky.
Trying to build a safety net of unit tests on poorly architected code isn't going to guarantee success, only make you feel safer about changing it.
So, unless you have a strong business case for taking on the risks involved it's probably best to leave well enough alone.