As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 9 years ago.
I have recently started on a very large legacy project and am in the situation I always feared: reading and understanding other people's code. I have known that this is an essential skill, but I never developed it because until now it was not required. Now developing it is a necessity rather than a hobby, so I would like to know from SO readers:
How have you overcome the hurdle of reading other people's code?
What techniques or skills have you developed to polish your art of reading and understanding other people's code?
Are there any books or articles which you have referred to, or in general, how did you develop the skill of reading and understanding other people's code?
I would highly appreciate useful answers to these questions, as I can now understand how one would feel while trying to understand my code.
Practice. Practice. Practice.
I overcame the hurdle by interacting with people on open-source projects. Discussing my contributions with others, and seeing their suggestions and ways of looking at things really opened my eyes.
I suggest you find a project that fits you, check out the source and contribute what you can (no matter how small to begin with). Over time the skill of reading code should just come naturally. Some projects even offer mentors specifically for helping out new contributors.
Michael Feathers' Working Effectively with Legacy Code is a great resource that contains a large number of techniques for working with older code.
Practice, Practice, Practice.
If you can, talk to whoever wrote the code or has an idea about it. Draw lots of pictures and have them explain big things to you while YOU write comments.
The quickest way to find your way around is to get lost. Dive into the code and break stuff. See if you can change an int into a string or something.
Patience: Understand that reading code is more difficult than writing new code. Then you need to respect the code, even if it is not very readable, for it does its job and in many cases pretty efficiently. You need to give the code time and effort to understand it.
Understand the Architecture: It is best if there is any documentation on this. Try talking to people who know more about it if they are available.
Test it: You need to spend some time testing and debugging the code so you know what it does.
For those parts you understand, write some unit tests if possible so you can use them later.
Be Unassuming: Many times the names of the patterns are misused. The classes have names which do not indicate their purpose. So don't assume anything about them.
Learn Refactoring: The best book I found on this topic is Refactoring: Improving the Design of Existing Code - By Martin Fowler. Working Effectively with Legacy Code is another awesome one.
I don't know of any single way to do it. I have participated in several projects where I had to work out on my own how the authors were thinking and how they arrived at their solution, so that I could understand it too.
That's why, whenever I can, I recommend that if you are a developer you comment your code as much as you can, because you never know who will find it useful (I think I've made my point).
If you're not a developer, you should find someone to support you.
How do you learn to read what other people have written? You get really good at writing yourself and try to make sure the other person is as good a writer as possible (suggest things they could do better, like adding some darn comments). Sadly, code is almost always read by someone other than the author. Practice, and familiarity with some of the person's other code, can help. But whenever you spend over 5 minutes figuring out what a particular line means, note how they could have done it better, and make sure you never make the same mistake.
good luck. :D
When I'm approaching an unfamiliar code base, I like to start at the beginning. Find main(), and write out a summary of what main() does. Create a list of the functions/methods called in main(). If you're visual, create a flowchart of main().
Once you have a list of the methods called directly from main(), look up those methods and repeat the process. As you figure out what each of those methods is doing, write it up in JavaDoc format and paste it into its corresponding box in the flowchart. If it's an API call, document which API it is using and put a link to the relevant API documentation.
Working recursively, you will create a map of the application and figure out what the program actually does. Once you know what it actually does, you will be able to find inconsistencies between what it does and what it's supposed to do.
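The recursive walk from main() described above can be partly automated. Below is a minimal sketch in Python, assuming the code base is Python and that a rough, name-based call map is good enough for a first pass. It uses the standard `ast` module; the sample program in `SOURCE` and the helper names `build_call_map` and `outline` are made up for illustration, and method calls such as `path.upper()` are deliberately ignored.

```python
import ast

# Hypothetical stand-in for the unfamiliar program being studied.
SOURCE = '''
def load(path):
    return parse(path)

def parse(path):
    return path.upper()

def report(data):
    print(data)

def main():
    data = load("input.txt")
    report(data)
'''


def build_call_map(source):
    """Map each top-level function to the sorted names it calls directly."""
    tree = ast.parse(source)
    call_map = {}
    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            called = {
                child.func.id
                for child in ast.walk(node)
                # Only plain-name calls like load(...); attribute calls are skipped.
                if isinstance(child, ast.Call) and isinstance(child.func, ast.Name)
            }
            call_map[node.name] = sorted(called)
    return call_map


def outline(call_map, name, depth=0, seen=None):
    """Return an indented outline of everything reachable from `name`."""
    seen = set() if seen is None else seen
    lines = ["  " * depth + name]
    # Stop at functions already expanded (recursion) or defined elsewhere.
    if name in seen or name not in call_map:
        return lines
    seen.add(name)
    for callee in call_map[name]:
        lines.extend(outline(call_map, callee, depth + 1, seen))
    return lines


print("\n".join(outline(build_call_map(SOURCE), "main")))
```

The printed outline is only a starting skeleton for the hand-written summaries; it says which functions to read next, not what they do.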
The answer varies depending on the tools and documentation available.
If any documentation is available, I try to understand the high level overview of the system - the different modules, interfaces and subsystem interaction. This helps to divide and conquer the code into sizable pieces as you go along reading the code. If any design patterns are commonly used across the code try to build up your knowledge on that.
Also, depending on the tools available, I may use one of the popular source code browsers (Source Navigator/Source Insight) to quickly view the dependencies (class hierarchy etc.). This speeds up code understanding. If I have a unit testing framework handy, I try playing with the code: different inputs and expected outputs. I also recommend using a debugger to step through selected complex functions to get the hang of the code flow.
Closed 10 years ago.
Has using an acknowledged anti-pattern ever been proven to actually work in a certain specific case? Did you ever solve a problem or gain any kind of benefit in one of your projects by using an anti-pattern?
My understanding of the "anti-pattern" concept is that it encompasses solutions that have drawbacks that only reveal themselves over the long term. Indeed, the primary danger associated with a lot of them---like writing spaghetti code with loads of global variables and gotos every which way, or tossing exceptions into the black hole of an empty catch block---is that they're seductive because they provide an expedient solution to an immediate problem.
EDIT to add: Because of that, sometimes you do derive benefit from these anti-patterns. Sometimes your calculation that you're writing throwaway code that no one will touch again is dead wrong and you wind up with maintenance programmers slandering your heritage and sexual hygiene, but other times you're right and that crummy shell script that's held together with baling wire and spit does the job you intended it to do and is then blessedly forgotten, saving you the considerable time and effort of putting together something decent.
Anti-patterns are still so widely around simply because they solve a particular problem (while creating 10 new ones). They are also known as workarounds. But how does the saying go? Nothing lasts longer than a makeshift.
In fact I believe we'd all be jobless if things had been done right from the beginning.
The biggest problem that it has solved in my experience is launching a new application.
When the dev team has scoped the new application thoroughly, the timeline to implement the correct solution is usually too much for management to bear. Therefore, you often code to meet the launch date rather than for the "correctness" of the solution (while others code the "correct" solution for the next rev), making it essentially "throw-away" code.
One software anti-pattern is Softcoding, also defined at the daily WTF. Softcoding happens when programmers put material that "should be" inside code into external resources.
I'm working with software that some might say is suffering from softcoding. External files drive the software. Those external files are a micro-language: they must be compiled to XML before the software can use them. This micro-language has its own tools.
But softcoding is always in the mind of the beholder.
Having the material in a micro-language with its own parser has made my life easier. One data source can generate many different outputs: In addition to the version that the main program uses, I am able to extract information into HTML, .csv, and other formats that our customers want. Other programs can generate code in the micro-language, making automation easier.
In our case, softcoding has been a useful pattern, not an anti-pattern.
There is a reason for calling it a pattern rather than a law.
I would surmise that almost everyone has at least one example of a place in code where exactly the wrong thing was done, and it turned out better in the long term than the "right" thing would have.
And a far longer list of examples of anti-patterns causing trouble.
I have used magic pushbuttons a number of times, out of ignorance or laziness, and sometimes it actually worked out just fine, and it turned out that I did not need the extra abstraction of proper MVC.
Duff's Device utilizes the Loop-Switch Sequence (AKA For-Case Paradigm) anti-pattern.
Closed. This question is opinion-based. It is not currently accepting answers.
Closed 3 years ago.
I write hobby code from time to time. The thing is these tools, classes or tiny libraries of code end up in a flash stick with hopeless future! I would love to develop my projects further, and let other programmers trust them. If you were going to use something you found on the Internet, what is the most important thing you look for in that programming tool or small library? e.g. would you consider separate documentation a must?
Thanks to all contributors. I'll try my best to summarize what has been said. Feel free to modify the list. Corrections and additions are more than welcome :)
Start a blog and let others know you are here.
Choose the most suitable license. Open Source licenses are possibly the best for hobby projects.
Put your project where people can reach it. Consider google-code, github, sourceforge or other sites.
Use public version control and a bug tracker, so others can acquire the latest source code of your project to compile and use.
Write decent documentation, besides commenting your code clearly of course. The documentation should explain the purpose of the library and provide at least simple examples.
Write tests if you are willing to provide real-world code.
If you are building a library, put a lot of effort into designing a stable interface.
Get a blog, release code through it. Explain why you wrote it, what problem it solves. And encourage others to improve upon it, keep the code posted as current as possible. If your tools are useful you will very quickly develop a following that 'trusts' your code.
Separate documentation isn't a must for small tools, but anything creeping into the framework world should probably have ample documentation and examples if you want any serious adoption from the community at large.
The most important thing is that the library be open source, so I can read the code myself. If that is not possible then I insist on documentation.
Also consider using a project-hosting site (like google code or github).
Have a clear license with your code if you don't have one already (preferably one which encourages modifying / improving / sharing your code ...)
Have public version control and/or a public bug/issue tracker and/or a mailing list. There are a lot of good sites which offer these services for free.
Separate documentation is not a deciding factor to me (if the code is well documented and the code quality is high).
Documentation explaining why you wrote it, when you started it, and its intended function. Understanding where you're coming from will allow me to see future ideas as well as shortcomings you may not have seen.
Technical documentation explaining the API and some examples on how to implement it. Ideally, keep your documentation in the format that follows the language. For example C# tends to use the XML syntax for defining items. This allows me to feel at home when I'm reading it.
Clean code -- I can't stress this enough because far too many people write exceptionally ugly code. If your code is ugly and/or unreadable, it may be easier for me to write it from scratch on my own. At the very least, make your code consistent. If I can't understand the code, I won't feel comfortable with it.
Historical records explaining your changes. Seeing how the project has grown allows me to plan better. It also allows people to see how you learn from your mistakes and get a sense of your skill level. Compared to a forum, you can get a feel for how fast things get fixed and then placed in to a new release.
Think long and hard on what kind of license you want there. Public domain? BSD? GPL? More restrictive?
A note on whether or not you mind being contacted and if there are any restrictions in this. For example, would you mind updates? Me explaining security holes? Or perhaps you might use a forum or wiki?
The ability for me to get your latest work and/or nightly builds. SVN or something. This is useful so I know if a bug I found is already fixed.
I think that documentation is a key point for your project.
The document must indicate:
what is the purpose of your library
what are the main features
a really short tutorial, to make it run in 5 minutes.
Many examples
I let people trust my code in a number of projects, but I urge people to make and maintain their own tests, and I make sure that I'm content with the unit tests.
Documentation is always good, but I'm very guilty of not finding time to do as much as I would like. But having the author fairly contactable is helpful.
Posting it in an open source repository such as code.google.com or sourceforge.net is probably where to start...
Next, to attract attention, document clearly and succinctly the purpose of the library or application, as outlined in one of the answers above.
Finally, blogging and direct mail exchanges happen...
One reason documentation helps people trust your code, is that they know whether a given feature is something which you intended the code to do (and which you will, all else being equal, preserve in future versions of the code), or something that the current code just so happens to do, but which might change at any time as a side-effect of a bugfix or just a refactor.
Some people prefer to find out what code really does by looking at it, and that's fine, but documentation tells you (a) what the code is supposed to do, and with any luck (b) what the next version of the code will do. If I want to use your code long-term, and take bugfix updates as you provide them, then I need to know that you've designed an interface that I can rely on and that you're willing to stick to. Documenting it is a strong hint that you're at least trying to do that.
Closed 13 years ago.
Duplicate:
Learning implementing design patterns for newbies
I have been a developer for years, have my own way of developing, and have always kept up with the latest technologies. I want to start using a design pattern in the hope that it will improve my development speed, but I need to find one to apply and a full open source sample that demonstrates it.
I have an application that uses LINQ to SQL and .NET 3.5. I tried to apply the repository pattern, but I found the structure complex and ended up having to hack my way through it.
Any advice for someone who wants to better their programming style?
Read blogs (RSS Feeds are prime). Read magazines. Read random MSDN entries. Write little trial applications. The only way to keep up is to discover it and practice it.
Patterns aren't really "tech" in the traditional sense. Using patterns means applying your specific knowledge of a domain to a problem keeping in mind the patterns which apply to that domain. They are useful to exactly the extent that you have a base of experience to put them in context.
The repository pattern, for example, is maybe not the best starting place for constructing a database architecture based on a pattern. Have you got a simpler pattern implemented such as Table Module or (in the specific case of data access) Active Record? If not then perhaps you should start there. These patterns focus on a fairly limited, basic way of organizing data and operations. Repository is more like a meta-pattern that then builds on top of these patterns, organizing a complex domain-data boundary into a simpler collection-like interface.
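To make the "collection-like interface" idea concrete, here is a minimal sketch in Python rather than C#, assuming an in-memory dictionary in place of a real data-access layer such as LINQ to SQL. The names `Customer` and `InMemoryCustomerRepository` are hypothetical and not taken from any library; a production repository would delegate to the underlying data-access code instead of a dict.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, Optional


@dataclass
class Customer:
    """A plain domain object with no knowledge of how it is stored."""
    id: int
    name: str


class InMemoryCustomerRepository:
    """Collection-like facade over a store (here just a dict, as an assumption)."""

    def __init__(self) -> None:
        self._rows: Dict[int, Customer] = {}

    def add(self, customer: Customer) -> None:
        # Adding an existing id overwrites it, like an upsert.
        self._rows[customer.id] = customer

    def get(self, customer_id: int) -> Optional[Customer]:
        return self._rows.get(customer_id)

    def remove(self, customer_id: int) -> None:
        # Removing a missing id is silently ignored.
        self._rows.pop(customer_id, None)

    def __iter__(self) -> Iterator[Customer]:
        return iter(self._rows.values())

    def __len__(self) -> int:
        return len(self._rows)


repo = InMemoryCustomerRepository()
repo.add(Customer(1, "Ada"))
repo.add(Customer(2, "Grace"))
names = [customer.name for customer in repo]
```

Calling code only sees `add`, `get`, `remove` and iteration, so the storage behind the repository can change without touching it. That extra layer is also why it may feel heavy as a first pattern to adopt.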
Two books that I would suggest reading are:
Refactoring: Improving the Design of Existing Code (ISBN: 0-201-48567-2)
and
Refactoring To Patterns (ISBN: 0-321-21335-1)
Both are great books that will help you, at a high level, understand the whens and whys of applying patterns to your code. In addition, they are great reference material for some of the most commonly used patterns out there.
To be clear, these books are by no means the "complete library" of design patterns.
My simple advice for bettering your programming style:
Pick a technology that you find productive and "fun" and keep with it to learn how to fully explore its potential.
Don't try to learn all the new technologies all the time - just keep yourself oriented.
Seek advice and solutions where and when you actually need them - don't waste time learning solutions to problems you don't (yet) have.
Regarding design patterns... Well... I'll probably get shot for this, but I don't really like the idea of cramming them all into my head "just in case". They are really a cooking book of "good solutions" for common problems. My advice here is: Whenever you run into problems that you can't come up with an obvious/immediate solution for - use them as reference.
Learn from your mistakes (you'll make them).
Don't marry your code. Throw away and rewrite is an excellent way of bettering the style.
I would sincerely recommend dofactory.com, which also offers code examples in VB.NET and C# for all the design patterns.
Closed 10 years ago.
Joining an existing team with a large codebase already in place can be daunting. What's the best approach?
Broad; try to get a general overview of how everything links together, from the code
Narrow; focus on small sections of code at a time, understanding how they work fully
Pick a feature to develop and learn as you go along
Try to gain insight from class diagrams and uml, if available (and up to date)
Something else entirely?
I'm working on what is currently an approx 20k line C++ app & library (Edit: small in the grand scheme of things!). In industry I imagine you'd get an introduction by an experienced programmer. However if this is not the case, what can you do to start adding value as quickly as possible?
--
Summary of answers:
Step through code in debug mode to see how it works
Pair up with someone more familiar with the code base than you, taking turns to be the person coding and the person watching/discussing. Rotate partners amongst team members so knowledge gets spread around.
Write unit tests. Start with an assertion of how you think code will work. If it turns out as you expected, you've probably understood the code. If not, you've got a puzzle to solve and or an enquiry to make. (Thanks Donal, this is a great answer)
Go through existing unit tests for functional code, in a similar fashion to above
Read UML, Doxygen generated class diagrams and other documentation to get a broad feel of the code.
Make small edits or bug fixes, then gradually build up
Keep notes, and don't jump in and start developing; it's more valuable to spend time understanding than to generate messy or inappropriate code.
this post is a partial duplicate of the-best-way-to-familiarize-yourself-with-an-inherited-codebase
Start with some small task if possible, debug the code around your problem.
Stepping through code in debug mode is the easiest way to learn how something works.
Another option is to write tests for the features you're interested in. Setting up the test harness is a good way of establishing what dependencies the system has and where its state resides. Each test starts with an assertion about the way you think the system should work. If it turns out to work that way, you've achieved something and you've got some working sample code to reproduce it. If it doesn't work that way, you've got a puzzle to solve and a line of enquiry to follow.
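A minimal sketch of this kind of test, in Python's standard `unittest`. The function `legacy_discount` is a made-up stand-in for an inherited routine (assumed here to be undocumented), and each test records one belief about how it behaves.

```python
import unittest


def legacy_discount(total, customer_type):
    """Hypothetical inherited function whose rules are not written down."""
    if customer_type == "gold":
        return total * 0.8
    if total > 100:
        return total * 0.95
    return total


class DiscountCharacterizationTest(unittest.TestCase):
    """Each test states an assumption about how the existing code behaves."""

    def test_gold_customers_get_twenty_percent_off(self):
        self.assertAlmostEqual(legacy_discount(200, "gold"), 160.0)

    def test_large_orders_get_five_percent_off(self):
        self.assertAlmostEqual(legacy_discount(200, "regular"), 190.0)

    def test_order_of_exactly_100_gets_no_discount(self):
        # An initial guess that 100 already qualifies would fail here:
        # the check is `total > 100`, so the threshold is exclusive.
        self.assertEqual(legacy_discount(100, "regular"), 100)
```

Saved in a file, this can be run with `python -m unittest`. A failing test is the "puzzle to solve": either the assumption was wrong or the code has a bug worth asking about.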
One thing that I usually suggest to people that has not yet been mentioned is that it is important to become a competent user of the existing code base before you can be a developer. When new developers come into our large software project, I suggest that they spend time becoming expert users before diving in trying to work on the code.
Maybe that's obvious, but I have seen a lot of people try to jump into the code too quickly because they are eager to start making progress.
This is quite dependent on what sort of learner and what sort of programmer you are, but:
Broad first - you need an idea of scope and size. This might include skimming docs/uml if they're good. If it's a long term project and you're going to need a full understanding of everything, I might actually read the docs properly. Again, if they're good.
Narrow - pick something manageable and try to understand it. Get a "taste" for the code.
Pick a feature - possibly a different one to the one you just looked at if you're feeling confident, and start making some small changes.
Iterate - assess how well things have gone and see if you could benefit from repeating an early step in more depth.
Pairing with strict rotation.
If possible, while going through the documentation/codebase, try to employ pairing with strict rotation. Meaning, two of you sit together for a fixed period of time (say, a 2 hour session), then you switch pairs, one person will continue working on that task while the other moves to another task with another partner.
In pairs you'll both pick up a piece of knowledge, which can then be fed to other members of the team when the rotation occurs. What's good about this also, is that when a new pair is brought together, the one who worked on the task (in this case, investigating the code) can then summarise and explain the concepts in a more easily understood way. As time progresses everyone should be at a similar level of understanding, and hopefully avoid the "Oh, only John knows that bit of the code" syndrome.
From what I can tell about your scenario, you have a good number for this (3 pairs), however, if you're distributed, or not working to the same timescale, it's unlikely to be possible.
I would suggest running Doxygen on it to get an up-to-date class diagram, then going broad-in for a while. This gives you a quickie big picture that you can use as you get up close and dirty with the code.
I agree that it depends entirely on what type of learner you are. Having said that, I've been at two companies which had very large code-bases to begin with. Typically, I work like this:
If possible, before looking at any of the functional code, I go through unit tests that are already written. These can generally help out quite a lot. If they aren't available, then I do the following.
First, I largely ignore implementation and look only at header files, or just the class interfaces. I try to get an idea of what the purpose of each class is. Second, I go one level deep into the implementation starting with what seems to be the area of most importance. This is hard to gauge, so occasionally I just start at the top and work my way down in the file list. I call this breadth-first learning. After this initial step, I generally go depth-wise through the rest of the code. The initial breadth-first look helps to solidify/fix any ideas I got from the interface level, and then the depth-wise look shows me the patterns that have been used to implement the system, as well as the different design ideas. By depth-first, I mean you basically step through the program using the debugger, stepping into each function to see how it works, and so on. This obviously isn't possible with really large systems, but 20k LOC is not that many. :)
Work with another programmer who is more familiar with the system to develop a new feature or to fix a bug. This is the method that I've seen work out the best.
I think you need to tie this to a particular task. When you have time on your hands, go for whichever approach you are in the mood for.
When you have something that needs to get done, give yourself a narrow focus and get it done.
Get the team to put you on bug fixing for two weeks (if you have two weeks). They'll be happy to get someone to take responsibility for that, and by the end of the period you will have spent so much time problem-solving with the library that you'll probably know it pretty well.
If it has unit tests (I'm betting it doesn't), start small and make sure the unit tests don't fail. If you stare at the entire codebase at once your eyes will glaze over and you will feel overwhelmed.
If there are no unit tests, you need to focus on the feature that you want. Run the app and look at the results of things that your feature should affect. Then start looking through the code trying to figure out how the app creates the things you want to change. Finally change it and check that the results come out the way you want.
You mentioned it is an app and a library. First change the app and stick to using the library as a user. Then after you learn the library it will be easier to change.
From a top-down approach, the app probably has a main loop or a main GUI that controls all the action. It is worth reading the code to get a broad overview of the main control flow of the application. If it is a GUI app, create a diagram on paper that shows which screens there are and how to get from one screen to another. If it is a command line app, work out how the processing is done.
Even in companies it is not unusual to have this approach. Often no one fully understands how an application works. And people don't have time to show you around. They prefer specific questions about specific things so you have to dig in and experiment on your own. Then once you get your specific question you can try to isolate the source of knowledge for that piece of the application and ask it.
Start by understanding the 'problem domain' (is it a payroll system? inventory? real time control or whatever). If you don't understand the jargon the users use, you'll never understand the code.
Then look at the object model; there might already be a diagram or you might have to reverse engineer one (either manually or using a tool as suggested by Doug). At this stage you could also investigate the database (if any); it should follow the object model but it may not, and that's important to know.
Have a look at the change history or bug database, if there's an area that comes up a lot, look into that bit first. This doesn't mean that it's badly written, but that it's the bit everyone uses.
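One way to find the areas that "come up a lot" is to count how often each file appears in the change history. The sketch below assumes the project is in Git (the original does not say which version control system is used) and that the text was produced by `git log --name-only --pretty=format:`, which prints only the changed file paths per commit. `SAMPLE_LOG` and its paths are invented for illustration.

```python
from collections import Counter

# Invented sample of `git log --name-only --pretty=format:` output:
# one path per line, with blank lines between commits.
SAMPLE_LOG = """\
src/billing.cpp
src/ui/main_window.cpp

src/billing.cpp

src/billing.cpp
src/parser.cpp
"""


def hot_spots(log_text, top=3):
    """Return the `top` most frequently changed paths with their change counts."""
    counts = Counter(
        line.strip() for line in log_text.splitlines() if line.strip()
    )
    return counts.most_common(top)


for path, changes in hot_spots(SAMPLE_LOG):
    print(f"{changes:3d}  {path}")
```

In the sample, `src/billing.cpp` comes out on top with three changes, which would make it the first file to read.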
Lastly, keep some notes (I prefer a wiki).
The existing guys can use it to sanity check your assumptions and help you out.
You will need to refer back to it later.
The next new guy on the team will really thank you.
I had a similar situation. I'd say you go like this:
If it's a database-driven application, start from the database and try to make sense of each table, its fields and its relations to the other tables.
Once you are fine with the underlying store, move up to the ORM layer. Those tables must have some kind of representation in code.
Once done with that, move on to how and where these objects come from. An interface? Which interface? Are there any validations? What preprocessing takes place on them before they go to the datastore?
This would familiarize you better with the system. Remember that trying to write or understand unit tests is only possible when you know very well what is being tested and why it needs to be tested in only that way.
And in the case of a large application that is not database-driven, I'd recommend another approach:
What is the main goal of the system?
What are the major components the system uses to solve this problem?
What interactions does each component have with the others? Make a graph that depicts the component dependencies. Ask someone already working on it. These components must be exchanging something with each other, so try to figure that out as well (for example, IO might be returning a File object back to the GUI, and the like).
Once comfortable with this, dive into the component that depends least on the others. Now study how that component is further divided into classes and how they interact with each other. This way you get the hang of a single component as a whole.
Move to the next least dependent component.
At the very end, move to the core component, which typically has dependencies on many of the other components you've already tackled.
While looking at the core component, you might find yourself referring back to the components you examined earlier, so don't worry, and keep working hard!
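The "least dependent first, core last" reading order above is essentially a topological sort of the component dependency graph. Here is a minimal sketch using Python's standard `graphlib` module (Python 3.9 or later). The component names and their dependencies in `DEPENDS_ON` are hypothetical; in practice they would come from the graph you drew or from asking the team.

```python
from graphlib import TopologicalSorter

# Hypothetical map: component -> the components it depends on.
DEPENDS_ON = {
    "core": {"io", "ui", "page"},
    "ui": {"page"},
    "page": {"io"},
    "io": set(),
}


def study_order(depends_on):
    """Order components so each one comes after everything it depends on."""
    return list(TopologicalSorter(depends_on).static_order())


print(study_order(DEPENDS_ON))
```

With this particular graph the order is `io`, `page`, `ui`, `core`. If the real graph contains a dependency cycle, `static_order()` raises `graphlib.CycleError`, which is itself worth knowing about the code base.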
For the first strategy:
Take this Stack Overflow site as an example. Examine the datastore: what is being stored, how it is stored, what representations those items have in the code, and how and where they are presented in the UI. Where do they come from, and what processing takes place on them before they go back to the datastore?
For the second one:
Take a word processor as an example. What components are there? IO, UI, Page and the like. How do these interact with each other? Move along as you learn further.
Be relaxed. Written code is someone's mindset, frozen logic and thinking style, and it takes time to read that mind.
First, if you have team members available who have experience with the code you should arrange for them to do an overview of the code with you. Each team member should provide you with information on their area of expertise. It is usually valuable to get multiple people explaining things, because some will be better at explaining than others and some will have a better understanding than others.
Then, you need to start reading the code for a while without any pressure (a couple of days or a week if your boss will provide that). It often helps to compile/build the project yourself and be able to run the project in debug mode so you can step through the code. Then, start getting your feet wet, fixing small bugs and making small enhancements. You will hopefully soon be ready for a medium-sized project, and later, a big project. Continue to lean on your team-mates as you go - often you can find one in particular who is willing to mentor you.
Don't be too hard on yourself if you struggle - that's normal. It can take a long time, maybe years, to understand a large code base. Actually, it's often the case that even after years there are still some parts of the code that are still a bit scary and opaque. When you get downtime between projects you can dig in to those areas and you'll often find that after a few tries you can figure even those parts out.
Good luck!
You may want to consider looking at source code reverse engineering tools. There are two tools that I know of:
SWAG Kit (Linux only)
Bauhaus (academic and commercial versions)
Both tools offer similar feature sets that include static analysis that produces graphs of the relations between modules in the software.
This mostly consists of call graphs and type/class dependencies. Viewing this information should give you a good picture of how the parts of the code relate to one another. Using this information, you can dig into the actual source for the parts that you are most interested in and that you need to understand/modify first.
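If you have never seen a call graph, the following toy sketch may help. It is not how SWAG Kit or Bauhaus work; it only uses Python's standard ast module to list which plain function names each function in a piece of source calls. The SAMPLE source and the call_graph helper are made up for illustration, and method calls such as obj.method() are deliberately ignored.

```python
import ast


def call_graph(source):
    """Map each function defined in `source` to the plain names it calls."""
    graph = {}
    for func in ast.walk(ast.parse(source)):
        if isinstance(func, ast.FunctionDef):
            callees = graph.setdefault(func.name, set())
            for node in ast.walk(func):
                # Only direct calls like foo(...); attribute calls are skipped.
                if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                    callees.add(node.func.id)
    return {name: sorted(callees) for name, callees in graph.items()}


# Hypothetical source code to analyse.
SAMPLE = '''
def load(path):
    return parse(read(path))

def read(path):
    return open(path).read()

def parse(text):
    return text.split()
'''

for caller, callees in call_graph(SAMPLE).items():
    print(caller, "->", callees)
```

Even a crude graph like this shows where to start reading: load is the entry point, and read and parse are the leaves.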
I find that just jumping into the code can be a bit overwhelming. Try to read as much documentation on the design as possible. This will hopefully explain the purpose and structure of each component. It's best if an existing developer can take you through it, but that isn't always possible.
Once you are comfortable with the high-level structure of the code, try to fix a bug or two. This will help you get to grips with the actual code.
I like all the answers that say you should use a tool like Doxygen to get a class diagram, and first try to understand the big picture. I totally agree with this.
That said, this largely depends on how well factored the code is to begin with. If it's a gigantic mess, it's going to be hard to learn. If it's clean and organized properly, it shouldn't be that bad.
See this answer on how to use test coverage tools to locate the code for a feature of interest, without knowing anything about where that feature is, or how it is spread across many modules.
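One way to read this idea, sketched in Python: run the program once without the feature and once with it, record which functions were entered each time, and look at the difference. A real coverage tool does this far better; the sketch below only uses sys.settrace from the standard library, and the little "application" (open_document, count_words, spell_check) is entirely made up.

```python
import sys


def functions_called(action):
    """Run action() and return the names of the Python functions it entered."""
    called = set()

    def tracer(frame, event, arg):
        if event == "call":
            called.add(frame.f_code.co_name)
        return None  # no per-line tracing needed

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        action()
    finally:
        sys.settrace(previous)
    called.discard(action.__name__)  # ignore the driver itself
    return called


# Hypothetical application code.
def open_document(text):
    return text.split()


def count_words(words):
    return len(words)


def spell_check(words):
    suspicious = []
    for word in words:
        if not word.isalpha():
            suspicious.append(word)
    return suspicious


def baseline():
    count_words(open_document("hello world"))


def with_feature():
    words = open_document("hello w0rld")
    count_words(words)
    spell_check(words)


feature_only = functions_called(with_feature) - functions_called(baseline)
print(feature_only)  # {'spell_check'}
```

The functions left over after the subtraction are the ones worth reading first when you need to understand that feature.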
(shameless marketing ahead)
You should check out nWire. It is an Eclipse plugin for navigating and visualizing large codebases. Many of our customers use it to break in new developers by printing out visualizations of the major flows.
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 9 years ago.
Question
My question is how can you teach the methods and importance of tidying-up and refactoring code?
Background
I was recently working on a code review for a colleague. They had made some modifications to a long-gone colleague's work. During the new changes, my colleague had tried to refactor items but gave up as soon as they hit a crash or some other problem (rather than chasing the rabbit down the hole to find the root of the issue) and so reimplemented the problem code and built more on top of that. This left the code in a tangle of workarounds and magic numbers, so I sat down with them to go through refactoring it.
I tried to explain how I was identifying the places we could refactor and how each refactoring can often highlight new areas. For example, there were two variables that stored the same information - why? I guessed it was a workaround for a bigger issue so I took out one variable and chased the rabbit down the hole, discovering other problems as we went. This eventually led to finding a problem where we were looping over the same things several times. This was due in no small part to the use of arrays of magic number sizes that obfuscated what was being done - fixing the initial "double-variable" problem led to this discovery (and others).
As I went on this refactoring journey with my colleague, it was evident that she wasn't always able to grasp why we made certain changes and how we could be sure the new functionality matched the original, so I took the time to explain and prove each change by comparing with earlier versions and stepping through the changes on paper. I also explained, through examples, how to tell if a refactoring choice was a bad idea, when to choose comments instead of code changes, and how to select good variable names.
I felt that the process of sitting together to do this was worthwhile for both myself (I got to learn a bit more about how best to explain things to others) and my colleague (they got to understand more of our code and our coding practices), but the experience led me to wonder if there was a better way to teach the refactoring process.
...and finally...
I understand that what does or does not need refactoring, and how to refactor it, are very subjective, so I want to steer clear of that discussion. But I am interested to learn how others would tackle the challenge of teaching this important skill, and whether others here have had similar experiences and what they learned from them (either as the teacher or the student).
Like most programming, refactoring skill comes with practice and experience. It would be nice to think it can be taught, but it has to be learned - and there is a significant difference in the amount of learning that can be accomplished in different environments.
To answer your question, you can teach refactoring methods and good design in a pedagogical fashion, and that's fine. But, ultimately, you and I both know attaining a certain level is only through long hard experience.
I'm not 100% sure I understand your question, but I think you can refer to the code smells that call for refactoring. They come with a lot of examples that you could show to others.
Here is a list of when refactoring should be used (a list of code smells)
If you haven't read it, Martin Fowler has an excellent book on the subject called Refactoring: Improving the Design of Existing Code. He goes into substantial detail about how and why a specific piece of code should be refactored.
I hesitated to even mention it for fear that knowledge of this book is assumed by someone asking about refactoring, and that you would think, "Duh, I meant besides the Fowler book." But what the hey, there you go. :-)
You don't mention tests. To 'prove' that a refactoring does not break the existing functionality you need to either have existing tests or write tests before doing the refactoring.
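As a rough sketch of what that can look like (Python here, and every name is made up): keep the old function around while you refactor, and have a test assert that the new version returns exactly the same results on a set of inputs, including the edge cases around any magic numbers.

```python
# Hypothetical legacy code, left untouched while refactoring.
def order_total_legacy(prices):
    total = 0
    for i in range(len(prices)):
        total = total + prices[i]
    if total > 100:
        total = total - total // 10
    return total


# Refactored version: magic numbers named, loop replaced by sum().
DISCOUNT_THRESHOLD = 100
DISCOUNT_DIVISOR = 10  # 10% off, in whole units


def order_total(prices):
    total = sum(prices)
    if total > DISCOUNT_THRESHOLD:
        total -= total // DISCOUNT_DIVISOR
    return total


# Inputs chosen to hit both sides of the threshold and the empty case.
CASES = [[], [100], [101], [50, 60], [1, 2, 3], [1000, 1]]


def test_refactoring_preserves_behaviour():
    for prices in CASES:
        assert order_total(prices) == order_total_legacy(prices), prices


test_refactoring_preserves_behaviour()
```

Once the tests have passed for a while and you trust them, the legacy function can be deleted and the expected values written into the test directly.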
Pair Programming seems to be the best way for me to get this across. This way, as we're working on real, production code, and we both encounter some code that doesn't smell right, we tackle a code refactoring together. The pair acts as the driver's conscience saying to do the right thing instead of the quick fix, and in turn, they both learn what good code looks like in the process.
Refactoring can be an art, and it just takes practice. The more you do it, the better you get at it. Keep studying the methods described in Martin Fowler's Refactoring book, and use your tools (ReSharper for Visual Studio folk).
One simple way to conceive of refactoring is right there in the name -- it's just like when you factor a common variable out of an equation:
xy + xz
becomes
x(y + z)
The x has been factored out. Refactoring code is the same thing, in that you're finding duplicate code or logic and factoring it out.
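In code, the same move might look like the sketch below (Python, with made-up function names). The repeated "strip and lowercase" step plays the role of x: it appears in both functions before, and once afterwards.

```python
# Before: the same normalization step is repeated in both functions.
def find_user_before(users, name):
    return name.strip().lower() in users


def greet_before(name):
    return "hello, " + name.strip().lower()


# After: the common step is factored out into one place.
def normalize(name):
    return name.strip().lower()


def find_user(users, name):
    return normalize(name) in users


def greet(name):
    return "hello, " + normalize(name)


print(greet("  Ada "))  # hello, ada
```

If the normalization rule ever changes, there is now one place to change it instead of two.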
It sounds like your approach is a very good one. At the end of the process, you showed how you were able to uncover and fix a lot of problems. For educational purposes, it could then be interesting to invent a new change/enhancement/fix. You could then ask your mentee how they would enact that change with the old and the new codebase. Hopefully they'll see that it's much easier to make the new change with the refactored code (or how doing more refactoring would be the easiest way to prepare for the hypothetical change).
I see a couple of different ways you could try to teach refactoring:
Giving textbook-like examples. A downside here is that the examples may be contrived or simplistic, so the reason refactoring is useful doesn't necessarily shine through as well as it would in other cases.
Refactoring existing code. In this case you could take existing legacy code that you'd clean up, or your own code in development, and show the before and after of various changes to see how much better the after is in terms of readability and ease of maintenance. This may be a better exercise, as it is live code being improved and enhanced to some extent.
It isn't something that someone can pick up instantly. It takes time, practice, effort and patience, as some refactorings may be done for personal preference rather than because the code runs optimally one way or another.
Teaching someone to refactor when they aren't a natural is a tough job. In my experience your best bet is to sit down with them in an office and refactor some code. While you are doing this keep up a "stream of consciousness" dialog. Talk about what you see, why the code doesn't smell right, options to refactor to, etc. Also you should make sure they're doing the same thing. The most important thing is to impart why, not how, to change the code. Any decent programmer can make a change and have it work, but it takes skill and experience to be able to state why the new solution is better than the previous.