When I first started teaching myself programming, I would finish a tutorial and still feel like I couldn't do anything in the language. So, I looked around to find something to work on. Since I had just learned a few of the basics, the amount of work involved in finding, reading and adding to an open source project seemed insurmountable. Instead I started on a couple of toy projects, which ended up being incredibly beneficial.
Having seen a lot of questions from beginners similar to "what should I do now?" and a lot of answers similar to "start working for an open source project" has made me think there has to be better advice for a new programmer. While working on an open source project surely gives great experience, there is a perceptible barrier to entry.
Instead, I think it would be great if new programmers were prodded towards working on a toy program related to some interest they have. Since there are so many directions that programming can take you, I think it would be interesting to list some simple (but fun/rewarding) projects grouped by the direction the new programmer is looking to pursue. Such as:
Game Design:
Write a text adventure (like Zork)
Natural Language Processing:
Create a program that writes meaningless, but grammatically valid essays.
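For the essay idea, one possible starting point is a tiny context-free grammar that is expanded at random. This is only a sketch: the grammar, the word lists and the function names below are all made up for illustration, and a real attempt would want a much richer grammar.

```python
import random

# Hypothetical toy grammar: each symbol maps to a list of possible expansions.
# Anything that is not a key in GRAMMAR is treated as a literal word.
GRAMMAR = {
    "SENTENCE": [["NP", "VP"]],
    "NP": [["the", "ADJ", "NOUN"], ["the", "NOUN"]],
    "VP": [["TVERB", "NP"], ["IVERB", "ADV"]],
    "ADJ": [["green"], ["silent"], ["recursive"]],
    "NOUN": [["idea"], ["compiler"], ["teapot"]],
    "TVERB": [["devours"], ["refactors"], ["admires"]],
    "IVERB": [["sleeps"], ["waits"]],
    "ADV": [["furiously"], ["quietly"]],
}

def expand(symbol, rng):
    """Recursively expand a grammar symbol into a list of words."""
    if symbol not in GRAMMAR:
        return [symbol]
    words = []
    for part in rng.choice(GRAMMAR[symbol]):
        words.extend(expand(part, rng))
    return words

def sentence(rng):
    """Build one capitalised sentence ending in a full stop."""
    return " ".join(expand("SENTENCE", rng)).capitalize() + "."

def essay(n_sentences, seed=None):
    """Join several random sentences; pass a seed for repeatable output."""
    rng = random.Random(seed)
    return " ".join(sentence(rng) for _ in range(n_sentences))

print(essay(3, seed=1))
```

Splitting verbs into transitive and intransitive lists is what keeps the output grammatical; the meaninglessness comes for free.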
I recently asked a similar question (Diverse resource of problems to show merits of different languages) and got links to sites that provide problem sets, as well as validation. Check out:
http://www.codechef.com/
https://www.spoj.pl/problems/classical/
http://wiki.python.org/moin/ProblemSets
http://projecteuler.net/
Although these problems don't often amount to projects, they are still interesting. I'm interested in seeing what people come up with here.
I actually think that a TopCoder approach might be better... programmers can still pick topics of interest, but they're actually working for a prize on a REAL project and they get feedback. Frankly speaking, TopCoder is a bit bloated and, as far as I can tell, they don't allow people to make free competitions. It would be great if there were a TopCoder/StackOverflow type of site: people can submit code, get voted on their implementation and just have a good time!
I'll even pitch my idea: I'm starting to work on my own version of a TopCoder/StackOverflow hybrid monstrosity called MyDevArmy (although I have not done anything so far except buy the domain).
Write a program which renders Wolfram automata (esp. Rule 110).
See YelloSoft for example code.
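If you want something to compare against, here is a minimal text renderer for elementary (Wolfram) automata. It is a sketch, not the YelloSoft code: the wrap-around edges, the single starting cell at the right and the function names are my own assumptions.

```python
def step(cells, rule=110):
    """Compute the next generation of an elementary cellular automaton."""
    n = len(cells)
    new = []
    for i in range(n):
        left = cells[(i - 1) % n]      # edges wrap around (an assumption)
        center = cells[i]
        right = cells[(i + 1) % n]
        # The three neighbours form a number 0..7 that selects one bit of the rule.
        index = (left << 2) | (center << 1) | right
        new.append((rule >> index) & 1)
    return new

def render(width=31, generations=15, rule=110):
    """Return the generations as text lines, starting from one live cell."""
    cells = [0] * width
    cells[-1] = 1  # Rule 110 grows to the left, so start at the right edge
    lines = []
    for _ in range(generations):
        lines.append("".join("#" if c else "." for c in cells))
        cells = step(cells, rule)
    return lines

print("\n".join(render()))
```

Passing a different `rule` number (30, 90, ...) renders the other elementary automata with the same code.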
Start by writing a Blackjack simulation. Choose whichever strategy you want for the first run.
Next, start adding additional runs for different strategies like hitting/standing when your hand's value is 15 vs. 16 vs. 17 vs. 18, and whether the hand is soft or hard (an ace's value being counted as 1 or 11). The dealer's strategy will be constant, as it really is in casinos.
By the end, your program will run, say, 1000 instances of each strategy combination. It will print out a summary of the rate of hand wins (percentage of times you beat the dealer) for each stand value and hard/soft combination.
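To give an idea of the shape of such a program, here is a heavily simplified sketch. The assumptions are mine, not casino rules: cards are drawn with replacement (an "infinite deck"), there are no splits, doubles or blackjack bonuses, the dealer stands on every 17, and a push counts as a non-win. All names are made up for the example.

```python
import random

# Card values for one suit; 11 stands for an ace.
CARD_VALUES = [2, 3, 4, 5, 6, 7, 8, 9, 10, 10, 10, 10, 11]

def hand_value(cards):
    """Return (total, is_soft), counting aces as 11 unless that would bust."""
    total = sum(cards)
    aces = cards.count(11)
    while total > 21 and aces:
        total -= 10  # demote one ace from 11 to 1
        aces -= 1
    return total, aces > 0

def player_wins(stand_hard, stand_soft, rng):
    """Play one hand and report whether the player beat the dealer."""
    player = [rng.choice(CARD_VALUES), rng.choice(CARD_VALUES)]
    dealer = [rng.choice(CARD_VALUES), rng.choice(CARD_VALUES)]
    total, soft = hand_value(player)
    # Hit until the threshold for the current kind of hand is reached.
    while total < (stand_soft if soft else stand_hard):
        player.append(rng.choice(CARD_VALUES))
        total, soft = hand_value(player)
    if total > 21:
        return False  # player busts
    while hand_value(dealer)[0] < 17:  # fixed dealer strategy
        dealer.append(rng.choice(CARD_VALUES))
    dealer_total = hand_value(dealer)[0]
    return dealer_total > 21 or total > dealer_total

def win_rate(stand_hard, stand_soft, hands=1000, seed=0):
    """Fraction of hands won over a number of simulated hands."""
    rng = random.Random(seed)
    wins = sum(player_wins(stand_hard, stand_soft, rng) for _ in range(hands))
    return wins / hands

# Summary: one line per stand-value combination.
for stand_hard in (15, 16, 17, 18):
    for stand_soft in (17, 18, 19):
        rate = win_rate(stand_hard, stand_soft)
        print(f"hard {stand_hard} / soft {stand_soft}: {rate:.1%} of hands won")
```

With only 1000 hands per combination the differences between strategies are mostly noise, so raising `hands` is the first thing to experiment with.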
This is easily one of my favorite projects I've done and it can really cement some techniques in the language of your choosing. Plus, if you have the initiative to start learning some of the (fairly simple) discrete math that's involved in coming up with the odds of these situations as a side project, you can come away with an even better experience. Who knows, maybe you could ditch this computer stuff and take up card counting?
I've decided to get some experience working on some project this summer.
Due to local market demand, I would prefer to learn Java (Standard and Enterprise Editions).
But I can't even guess what kind of project to do. Recently I had some ideas about C. With C I could contribute to huge Linux projects. I don't mean that my work would surely be committed; I could get the code and practice with it. But C is not the right thing for getting a good job in my area. In the case of JavaSE there is a chance to develop some desktop applications. But when I think about JavaEE, I get stuck. I'll be very thankful for answers.
CodingBat.com will give you good core Java practice.
Project Euler is still the best for all around practice. You can use whatever language you'd like to solve the problems there.
For actual projects, I almost always start on something easy like a Twitter client. It gets you exposure to all the basics along with UI and network communication. You can work up from there. Just don't start with something so overwhelming that you can't figure it out and want to give up. That's not going to get you anywhere.
The best advice is: work on a project that you have personal interest in. Something based on your hobbies, maybe.
If that doesn't work, make a blogging / CMS engine. Or an online photo album. Or an eStore. The world doesn't really need another of any of these things, but it will give you some good practical experience with JavaEE.
Another benefit of "re-inventing the wheel" (for learning) is that you have probably already used systems like these described above, and you have a good idea of how it can work, and maybe you have your own ideas of how it could work better. That can make requirements much simpler, and also will give you a sort of benchmark so you can see how close you can come to building a tool like the "real" ones out there. And if yours is really great, well, maybe release it and see what happens. ;)
There are many Java-based projects on SourceForge. Tinker with one you find interesting.
I've implemented either a betting pool or a Baccarat game in almost every language I've learned.
This type of software covers:
Dates and times, with calculations
Currency types and things that can be converted to and from currency.
A discrete set of rules that is easy to test
States, transition between states and multiple entities responsible for state transition
Multiple users with different views of the same model
End conditions
Multiple player blackjack and poker would work also.
One caveat is that in my day job I work on financial systems and there is a huge overlap between things to consider when writing a multiplayer game of chance and a trading system.
Build an address book. The concept is simple, so you're not stuck on "what" to write. You can focus on learning your chosen language. You get experience in working with a database, Java (insert any language here), and UI design.
When you decide to learn another language you can create the same thing. Since the database has been created already, you can focus on the language itself.
The concept of inputting data, storing data, and retrieving data is central to a lot of applications.
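As a rough illustration of that input/store/retrieve loop, here is a sketch using SQLite from Python's standard library. The table layout and function names are hypothetical; swap in whatever language and database you are learning.

```python
import sqlite3

def open_book(path=":memory:"):
    """Open (or create) the address book; ':memory:' keeps it throwaway."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS contacts ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, email TEXT)"
    )
    return conn

def add_contact(conn, name, email):
    """Store one contact and return its row id."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO contacts (name, email) VALUES (?, ?)", (name, email)
        )
    return cur.lastrowid

def find_contacts(conn, fragment):
    """Retrieve contacts whose name contains the given fragment."""
    rows = conn.execute(
        "SELECT name, email FROM contacts WHERE name LIKE ? ORDER BY name",
        (f"%{fragment}%",),
    )
    return rows.fetchall()

book = open_book()
add_contact(book, "Ada Lovelace", "ada@example.com")
add_contact(book, "Alan Turing", "alan@example.com")
print(find_contacts(book, "Ada"))
```

Passing a file path instead of `":memory:"` keeps the data around, which is what lets you reuse the same database from the next language you learn.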
Have a look around http://openhatch.org/ for a project that sounds interesting.
Here's what I'm wondering. Every night that our 3 months old baby lets us sleep, I jump to my computer and start coding my hobby projects. I have about 20 different projects that I'm working on: different types of projects, from C++ games to web apps along with some contribution to open source projects. It's truly a passion and has been for a lot of years.
Yet, when I look back, I see that I haven't been able to fully complete one of my hobby projects. I've always done the prototypes and set up the most important features, but with time, instead of finishing my project, I end up switching to another project that seems "so much cooler" at the moment. Hence I usually end up with buggy and incomplete games that have no end nor story, 3D engines that have the fastest PolygonDraw routine ever, yet fail to implement anything else, etc... The list is long. I think I must have written an unfinished Pong over a hundred different times!
I've been told that the remedy is to write specs for my hobby projects.
On one hand, I write a lot of specs at work. I know how crucial they are for defining a product's roadmap and staying within schedule. On the other hand, specs and hobby projects just don't quite seem to go together! It seems to me that the learning curve to building a game is actually what makes it fun; not the game itself. Hence the fun of losing time restructuring an entire engine, the fun of creating the most useless features, and so on...
So here comes the question: Do you ever write specifications for your hobby projects? How are they different from the ones at work? How do you manage to complete your hobby projects?
I'd be glad to know while I work on my new project: a piano sonata generator :)
I don't think writing specs is the solution to your problem. Clearly, your "hobby projects" are things that you find fun. You write the fun parts but then avoid the not fun parts that would be necessary to complete something.
If you're just "programming for fun" then good, you're succeeding. I don't think writing specs is fun.
If you really want to "finish" something, the best way isn't to write a spec, it's to not jump to another project when the fun factor dips.
It is all about 'Self project management' ... even for fun.
I feel for you ... I used to have many repos that tended to all get stuck at around revision 200 or so.
Here is what used to happen: because I didn't do enough planning, after around 200 commits things got messy and needed a rewrite ... then interest disappeared because it seemed like too much hassle.
I learned to write my own specs for personal use, to:
Give me focus to get the job done, and not go off into feature creep lane
Remind me what I am working towards
Have great ideas before I start coding
Keep things fun for a longer time
For me, writing my own specs is vital to getting anything done!
You wouldn't start a business without a plan would you?
For personal projects I have tons of Moleskine books filled with rough specs and ideas. When they mature, they migrate from the notebooks into real documents and the coding begins.
BIG EDIT: On a drive for personal efficiency, and to get projects finished, I read "Getting Things Done" ... Despite all the hippy crap about 'psyche' and various levels of mind (which I'm sure is not based in any science), the tips are very good.
I don't get too complicated, but listing out all of the features and requirements that you want included in your application really does help. As with most hobby projects you often don't just sit down and code them straight through for 2 months and finish them. It's an hour here, two hours there, etc. Basically it's very common to forget what you were working on last and what the original purpose of this super great idea for an application was.
If you spend a few hours writing down specs and requirements it will be very valuable to you 6 months down the road when you get some free time or your ADD switches to that project and you try to remember what it was supposed to do.
I just found out recently that writing specs is really the thing I need to get my projects done.
I've been a bit like you, tons of projects, jumping from one to the other and never getting things finished. Until about 6 months ago, when I started to actually write specs and have a kind of roadmap for my projects.
All that I can say is that, it actually works, because you break your projects into smaller steps, just like a race with checkpoints, and when you start to mark the checkpoints as done, it feels good, addictive and your focus will be on the finish line.
This way, you get to keep only 1 or 2 projects at the same time, but actually finish them. And of course, you have the extra and pretty valuable bonus of keeping up with the project even if you don't touch it for about a month or more. The specs will always be there to remind you of the goals and purposes of your project.
This is just my personal experience, and I believe that you should give it a try. Hopefully it will work out for you too.
I've been able to do some hobby projects and finish some of them. I try to finish them all, but for some I just can't muster the energy.
The reason, I think, is that so many details are needed to finish a project that it goes from a passion project to a chore of a project.
What helped me finish most of mine is that they stayed a passion until only the finishing touches were left. So I just plowed through them.
Will a spec help? To some degree, yes. It gets you further into the project, but almost always there's a point where the passion fades and you look for the next shiny object.
It doesn't work for me! In fact, whenever I'm writing up specs I'm generally making the projects even bigger, and less likely to be finished.
Sometimes the best way to do it is to just do it.
Ze Frank explains this much better than me:
http://www.zefrank.com/theshow/archives/2006/07/071106.html (video link with swearing)
EDIT: Just to add. If you are finding you want to leave your half-finished project for a new, grand idea... do it! Don't look back!
Completion is not a requirement for your own pet projects. Nobody will blame you for not finishing stuff that barely anyone else would even bother starting.
The reason you started was because of passion. That is very important. You should not force yourself to 'slog through' in your free time. You will drain your passion which is your most vital resource.
I usually write a first set of specs when I get started.
I'm also a big fan of paper thinking, so I'll draw screens, UML, diagrams, flow charts, design elements... It's just a matter of defining the scope of your project and being able to see what you had in mind. It really helps me think.
These documents will be my specs for the whole project. I will add others as I go, but I don't try to maintain the old ones as much as I would if it were a work project: I know where I'm going and I can keep track of the changes by looking at my code.
Of course, some of my hobby projects are done collaboratively. In these cases, I write down more specs in order to have a better communication with my team and I try to keep documents such as DB Diagrams up to date.
I also have several hobby projects that I have not finished. I have about 10 and have written a specification for exactly one of them, the largest in scope (also a game).
I have finished neither the ones without specifications nor the one with. I think this is because I never publish the work or show it to anyone, so it remains full of bugs and never 'finished'.
I suppose that this means that regardless of whether or not you have a spec, it will not affect the success of the project as much as other factors, like having the time, motivation, help, and having confidence.
The single best thing I've ever found to help move towards completion is to have someone else working on the project with you. Find a friend (or two) who is interested in the same thing and design/code it with them. Not only do you have someone to bounce ideas off of, but you've also got someone to motivate you, not to mention progress is twice as fast so you'll hopefully finish before you give up :)
Of course, it requires source control, but you were already using that for your projects, right? :)
Do you want to finish them?
I think it's reasonable to never finish a hobby project. You can just keep working on it as long as you live. Aciddose has been working on his virtual instrument xhip for years, stubbornly never getting to 1.0, making the instrument patches people program worthless from one release to the next. Yet he and the users of his softsynth seem to be having a grand time.
Maybe if you just aim for a "release" and not being "finished" you'll be more satisfied. Betas let you keep dreaming.
Yes and no. I write notes in a notebook as I'm thinking about it, and add to it as I implement it. It is a somewhat different process from work projects where someone else may have to see the spec.
I finish about half of what I start.
I've helped with development on a range of systems from safety critical avionics to throwaway personal projects like a Sudoku solver. Obviously with the avionics systems, specifications were critical to the safe operation of the system and to prevent killing somebody, but I've never bothered with my personal projects.
I think this is because specs are generally boring to read and write. Joel wrote an interesting article about this, and how to make them better if you do write them:
Painless Functional Specifications
Unfortunately I haven't had the guts to try making my specs more fun to read at work yet.
Maybe instead of writing specs you should try working on some projects for or with other people? That could provide some external motivation. I do some web development for my cousin's drive-in theater, and if they need a feature they won't stop asking me about it until I finish it.
The single biggest piece of advice I could give you would be to get something out there - make the spec for your first version small enough that you actually feel you can complete it, even though it won't have nearly all the features you want.
Once you get something out there, the pressure from users of your software will be enough to hopefully keep you going on it. It also ensures that the direction you take in development is the same direction your users want you to go.
If you don't actually get any users, then don't feel so bad about dropping the project - if nobody is interested, it probably isn't worth pursuing.
If pressure from your users isn't enough to keep you focused, then open source it. If there's enough interest in it, somebody else will pick it up where you left off, and you are free to move on to bigger and better things.
Unfortunately, after writing specs for the core of the DIFL engine (don't bother looking it up, as there's no trace of it outside my home systems), I still didn't finish it up.
Short answer: developing specifications for a hobby project is neither necessary nor sufficient to guarantee completion.
That being said...
I keep an engineering notebook for all of my personal projects. I use the notebook to capture all sorts of things about the projects on which I work. This includes project motivation, valuable resources leveraged during the project, things developed over the course of the project that might potentially be reused later, key insights gained, etc. etc. It also includes, more to your question, specifications for most of the projects. I employ an agile/lean approach to creating these specifications which, for me, is compelling from a cost/benefit perspective.
btw...I have many, many personal projects that did not culminate in a complete working system. Some of these I might get around to completing 'someday maybe'. I consciously chose to stop working on some of the others because they had served their purpose (e.g. introduced me to a new technology, helped me better understand a language feature, etc.) Continuing to crank away at projects like these would have led to diminishing returns so I chose to reallocate my time to projects I felt were higher leverage.
The real question is: what is your hobby? Is it finishing a project, or tinkering? If getting the last ten yards is a chore, you have to decide if it's worth it to you. Writing detailed specs will work; so will self-flagellation if you're into that sort of self-discipline. Nothing will make it easy if it's against your make-up, so you have to decide whether the end-goal is worth anything to you.
And, just to demonstrate that there is nothing programming-specific about this point, you might really like this guy. One of the main points in his work is that conceptual artists, such as Picasso and Da Vinci never really cared about the final execution--the idea was everything, and, having asserted it, they were strangely content with someone else finishing the actual work or leaving the sketch unfinished and unpublished.
I'm not sure that writing specs is the solution to your problems (or mine, which seem similar). However, when I want to make something more than a throwaway experiment, there are a few things that help me slightly without taking the fun out of it.
Specs really are quite tight and should be technical, but for a hobby approach you could write up something similar but much looser: a sort of design draft that outlines some of the things you would like to feature and shows how they fit together. Though not as detailed or restrictive as a proper spec, it might help to keep the tinkering heading in the right direction.
Secondly, you could break it down and, depending on your time allowances, maybe add a few goals. If you focus on building one part of the project at a time, breaking it into subprojects that can be linked together at the end, it gives a feeling of progress as you move from part to part, rather than feeling like you have been working on the same thing for ages and can't be bothered any more. It works if you tick it off on a list, since usually that has to happen at least mentally anyway.
In saying this if your goal is to play with certain concepts and not actually create a final product then you probably won't because you aren't working towards it. One way might be to take the above mentioned idea of breaking it up and then find a way of adding something personally interesting into each part that bores you, maybe trying to add a challenge into it or something.
I'm not particularly experienced and am still learning, but this is how I keep my tinkering together (sometimes, unless I hit a total block caused by inexperience) and how I've approached many multimedia and web projects on a hobby basis in past years. As for the guy who said to open-source it when you get bored and let someone else pick it up: that is a good idea if you want to see your code used but have satisfied your personal goals.
I have much the same problem. One thing I've noticed that HAS helped, though, is lowering my ambitions. Like WAY WAY low. Writing a spec is one way to rein in the ambitions, if you have some kind of limiting rule for the spec, like "The spec can only be one page", or "the spec can be no longer than 300 words", or "Spec only something that I can get done in one day of coding". Getting the balance right can take some practice. If you go with the last limit, you can impose the rule of MANDATORY dismissal of the project if you can't finish it in one day.
The nice thing about this, is it limits you to achievable goals. This might sound really stupid or wrong at first. Or maybe it sounds reasonable, but you just can't help it, you wanna do amazing things, not ordinary things! Not small things that you can only get done in a few hours!
But keep this in mind:
“A complex system that works is invariably found to have evolved from a simple system that worked. The inverse proposition also appears to be true: A complex system designed from scratch never works and cannot be made to work. You have to start over, beginning with a working simple system.”
—John Gall
It is SO MUCH easier to make that ambitious project, if you already have a FINISHED and WORKING project to base it on. Then the "more complex thing" CAN be a project that fits in a day. This is the ideal and philosophy I'm working towards, because I think it has the best chance of succeeding. Looking at past successful projects, the vast majority of them evolved in this way, whether it was intentional or not.
What helps me a lot is to split a new feature into small tasks that could each be done in an evening hacksession. So if I have time, I simply pick one task from the list and just finish it. This is often enough to get "in the flow" and do "just one more".
I do this only for one feature at a time so I don't get distracted by all the other cool things I could add to my application.
I constantly write specs for my projects, in work, at university and outside in my free time. The biggest weakness of a programmer is his/her memory, so I find it good to keep myself busy during my thinking time by writing down my every thought into some sort of structured document. Before you know it you've written a full database schema or have a Requirements Specification.
At the moment I'm working on improving my SQL skills, and I've been spending a lot of the free time between writing queries writing down my experiences. After a couple of tweaks I had a decent document outlining what needed to be done.
I think the core problem is not the lack of specs, but rather that finishing something (anything) is hard.
It is hard work. It may seem as if your program is 90 % done. But doing those last 10 % (rooting out all bugs, getting the application to release quality, writing documentation, etc) requires as much work as the first 90 %. And if you want to be serious about marketing your program, answering support emails, fixing other people's bugs, that's more work still. And perhaps not the kind of work you are most interested in.
It is also hard mentally. An unfinished project has unlimited potential. It is an empty canvas where you can project your unbridled ambitions, lofty ideals and revolutionary thoughts. Once it is finished and made real you have to see it for what it is. Limited. Flawed. Never as pretty as the idea that spawned it.
That said, finishing something can also be very rewarding. You learn a lot, get a reality check on your ideas, the satisfaction of having completed something and you get to see what other people think of your work.
Some advice:
Make sure that you really want to finish the project. I.e., that the rewards are worth all the hard work. (If not, then accept that fact and remain a happy tinkerer.)
Find ways of motivating yourself through the "boring" parts. Specs, maybe, if they keep you focused. But find whatever works for you, whether it is ticking off todo-items, rewarding yourself with a cookie or dreaming of fame and fortune.
Release early, release often. The more you save for a "big release" the bigger is the chance that that release never happens.
First release, then rewrite. When you feel the urge to do a major rewrite, do a release first, then do the rewrite (if you are still up for it). Software is never perfect. If you strive for perfection without any pressure to release your half-baked (but existing) code, then you will never be done.
Most hobby projects of mine don't really get finished either. As long as I'm working on something and learning, though, I don't think that's a problem. Currently I'm not writing specs, but I am practicing/training TDD. I bring it up because I write tests that are the specs. Some days I'll sit down and just create a bunch of tests outlining what the software should do. Some days I make those tests pass. It's enjoyable in that I don't have to keep all the code in my head, and at any point I can sit down and make further progress by fixing the broken tests. Things just work; it's kind of surreal.
Joel's article about Evidence Based Scheduling works for me, though I implemented it differently.
The idea is to break the project into small tasks and give each an estimate, then forecast when your project will finish based on how long the finished tasks actually took compared with their estimates.
You may think your project will take years to finish, but actually from the estimate it's just two months or less. If you work more and finish tasks quickly, you will see the finish date coming earlier.
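The forecast described above can be sketched in a few lines. Note that this is a deliberately simplified version that scales the remaining estimates by one average actual-to-estimate ratio; it is not the full method from Joel's article, and the function and parameter names are made up for the example.

```python
def forecast_remaining_hours(finished, remaining_estimates):
    """Forecast the hours left in a project.

    finished: list of (estimated_hours, actual_hours) for completed tasks.
    remaining_estimates: estimated hours for tasks not yet done.
    """
    estimated = sum(e for e, _ in finished)
    actual = sum(a for _, a in finished)
    if estimated == 0:
        raise ValueError("need at least one finished task with a non-zero estimate")
    # How much longer (or shorter) finished tasks took than estimated.
    slowdown = actual / estimated
    return sum(remaining_estimates) * slowdown

# Two finished tasks both took 1.5x their estimate, so the 8 hours of
# remaining estimates are forecast as 12 hours of real work.
print(forecast_remaining_hours([(2, 3), (4, 6)], [2, 2, 4]))
```

Re-running it after every finished task is what makes the finish date visibly move as you work.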
I think the most motivating thing for moving forward is seeing the goal you are running towards come closer.
Plus: create something you will use later. Using stuff gives you incentive to improve it later.
Say there are two possible solutions to a problem: the first is quick but hacky; the second is preferable but would take longer to implement. You need to solve the problem fast, so you decide to get the hack in place as quickly as you can, planning to start work on the better solution afterwards. The trouble is, as soon as the problem is alleviated, it plummets down the to-do list. You're still planning to put in the better solution at some point, but it's hard to justify implementing it right now. Suddenly you find you've spent five years using the less-than-perfect solution, cursing it the while.
Does this sound familiar? I know it's happened more than once where I work. One colleague describes deliberately making a bad GUI so that it wouldn't be accidentally adopted long-term. Do you have a better strategy?
Write a test case which the hack fails.
If you can't write a test which the hack fails, then either there's nothing wrong with the hack after all, or else your test framework is inadequate. If the former, run away quick before you waste your life on needless optimisation. If the latter, seek another approach (either to flagging hacks, or to testing...)
Strategy 1 (almost never selected): Don't implement the kluge. Don't even let people know it's a possibility. Just do it the right way the first time. Like I said, this one is almost never selected, due to time constraints.
Strategy 2 (dishonest): Lie and Cheat. Tell management that there are bugs in the hack, and they could cause major problems later on. Unfortunately, most of the time, the managers just say to wait until the bugs become a problem, then fix the bugs.
Strategy 2a: Same as strategy 2, except there really are bugs. Same problem, though.
Strategy 3 (and my personal favorite): Design the solution whenever you can, and do it well enough that an intern or code-monkey could do it. It's easier to justify spending the small amount of code-monkey money than to justify your own salary, so it might just get done.
Strategy 4: Wait for a rewrite. Keep waiting. Sooner or later (probably later), someone is going to have to rewrite the thing. Might as well do it right then.
Here is a great related article on technical debt.
Basically, it is an analogy of debt with all the technical decisions you make. There is good debt and bad debt... and you have to pick the debt that is going to achieve the goals you want with the least long term cost.
The worst kind of debt is small little accumulating shortcuts that are analogous to credit card debt... each one doesn't hurt, but pretty soon you are in the poor house.
This is a major issue when doing deadline driven work. I find that adding very detailed comments about why this way was chosen and some hints at how it should be coded help. This way people looking at the code see it and keep it fresh.
Another option that will work is to add a bug/feature in your tracking framework (you do have one, right?) detailing the rework. That way it is visible and may force the issue at some point.
The only time you can ever justify fixing these things (because they're not really broken, just ugly) is when you have another feature or bug fix that touches the same section of code, and you might as well re-write it.
You have to do the math on what a developer's time costs. If software requirements are being met, and the only thing wrong is that the code is embarrassing under the hood, it's not really worth fixing.
Whole companies can go out of business because over-zealous engineers insist on a re-architecture every year or so when they get antsy.
If it's bug-free and meets requirements, it's done. Ship it. Move on.
[Edit]
Of course I'm not advocating that everything be hacked in all the time. You have to design and write code carefully in the normal course of the development process. But when you do end up with hacks that just had to be done quickly, you have to do a cost-benefit analysis on whether or not it's worth it to clean up the code. If over the lifetime of the application you will spend more time coding around a messy hack than you would have fixing it, then of course fix it. But if not, it's way too expensive and risky to re-code a working, bug-free application just because looking at the source makes you ill.
YOU DON'T DO INTERIM SOLUTIONS.
Sometimes I think programmers just need to be told this.
Sorry about that, but seriously--a hacky solution is worthless and even on the first iteration can take longer than doing a portion of the solution correctly.
Please stop leaving me your crap code to maintain. Just ALWAYS CODE IT RIGHT. No matter how long it takes and who yells at you.
When you are sitting there twiddling your thumbs after delivering early while everyone else is debugging their stupid hacks, you'll thank me.
Even if you don't think you are a great programmer, always strive to do the best you can, never take shortcuts--it doesn't cost you ANY time to do it right. I can justify this statement if you don't believe me.
Suddenly you find you've spent five years using the less-than-perfect solution, cursing it the while.
If you're cursing it, why is it at the bottom of the TODO list?
If it's not affecting you, why are you cursing it?
If it is affecting you, then it's a problem that needs to be fixed NOW.
I make sure that I'm vocal about the priority of the long-term fix, ESPECIALLY after the short-term fix has gone in.
I detail the reasons why it's a hack and not a good long-term solution and use those to get the stakeholders (managers, clients, etc.) to understand why it needs to be fixed.
Depending on the case, I may even inject a bit of worst-case-scenario fear in there: "If this safety line snaps, the whole bridge could collapse!"
I take responsibility for coming up with a long-term solution and make sure that it gets deployed.
It is a hard call. I have done hacks personally because sometimes you HAVE to get that product out the door and into the customer's hands. However, the way that I take care of it is to just do it.
Tell the project lead, your boss, or the customer: there are some spots that need to be cleaned up and coded better. I need a week to do it, and it is going to cost less to do it now than it will 6 months from now, when we need to implement an extension onto the subsystem.
Usually problems like this arise from bad communication with management or the customer. If the solution works for the customer then they see no reason to ask for it to be changed. So they need to be told about the tradeoffs you made beforehand so they can plan extra time to fix the problems after you've implemented the quick solution.
How to solve it depends a bit on why it's a bad solution. If your solution is bad because it's hard to change or maintain, then the first time you have to do maintenance and have a bit more time is the right time to upgrade to a better solution. In this case it helps if you tell the customer or your boss that you took a shortcut in the first place. That way they know that they can't expect a fast solution next time around. Crippling the UI can be a good way to make sure the customer comes back to get stuff fixed.
If the solution is bad because it's risky or unstable then you really need to talk to the person doing the planning and have some time planned in to fix the problem asap.
Good luck. In my experience this is almost impossible to achieve.
Once you go down the slippery slope of implementing a hack because you are under pressure then you might as well get used to living with it for all time. There is almost NEVER enough time to re-work something that already works, no matter how badly it is implemented internally. What makes you think you will magically have more time "at some later date" to fix the hack?
The only exception I can think of to this rule is if the hack completely prevents you from implementing another piece of functionality that is needed by a customer. Then you have no choice but to do the re-work.
I try to build the hacky solution so that it can be migrated to the longterm way as painlessly as possible. Say you got a guy who is building a database in SQL Server cuz that's his strongest DB, but your corporate standard is Oracle. Build the db with as few non-transferable features (like Bit datatypes) as possible. In this example, it's not hard to avoid bit types, but it makes transitioning later an easier process.
Educate whoever is in charge of making the final decision why the hacky way of doing things is bad in the long-run.
Describe the problem in terms they can relate to.
Include a graph of cost, productivity, and revenue curves.
Teach them about technical debt.
Regularly refactor if you're pushed forward.
Never call it "refactoring" or "going back and cleaning up" in front of non-technical people. Instead, call it "adapting" the system to handle "new features".
Basically, people who don't understand software don't get the concept of revisiting things that already work. The way they look at it, developers are like mechanics who want to keep taking apart and reassembling the entire car every time someone wants to add a feature, which sounds insane to them.
It helps to make analogies to everyday things. Explain to them how when you made the interim solution, you made choices that suited building it quickly, as opposed to being stable, maintainable, etc. It's like choosing to build with wood instead of steel because wood is easier to cut, and thus, you could build the interim solution quicker. The wood, however, simply can not support the foundation of a 20-story building.
We use Java and Hudson for continuous integration. 'Interim solutions' must be commented with:
// TODO: Better solution required.
Every time Hudson runs a build it provides a report of each TODO item, so that we have an up-to-date, highly visible record of any outstanding items that need to be improved.
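If your build server doesn't produce such a report for you, a small script can approximate one. The following is only a sketch in Python, not how Hudson does it: the function names, the marker list (TODO, HACK, XXX) and the default `.java` suffix are my own assumptions.

```python
import re
from pathlib import Path

# Matches comments such as "// TODO: ..." or "# HACK ..." (the marker names are an assumption).
MARKER = re.compile(r"(?://|#)\s*(TODO|HACK|XXX)\b[:\s]*(.*)")


def find_markers(root, suffixes=(".java",)):
    """Return (path, line number, tag, note) for each marker comment under root."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            match = MARKER.search(line)
            if match:
                hits.append((str(path), lineno, match.group(1), match.group(2).strip()))
    return hits


def format_report(hits):
    """Render one line per marker plus a total, suitable for a build log or an email."""
    lines = [f"{path}:{lineno}: {tag}: {note}" for path, lineno, tag, note in hits]
    lines.append(f"{len(hits)} outstanding item(s)")
    return "\n".join(lines)
```

Calling `print(format_report(find_markers("src")))` from a build step (with `"src"` replaced by your own source directory) would give a running count that people can watch go up or down.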
Great question. This bothers me a lot, too - and most of the time I'm the sole person responsible for prioritizing issues in my own projects (yep, small business).
I found out that the problem that needs to be fixed is usually just a subset of the problem. IOW, the customer that needs an urgent fix does not need the whole problem to be solved, just a part of it - smaller or larger. That sometimes enables me to create a workaround that is not a solution to the complete problem but just to the customer's subset, and that allows me to leave the bigger issue open in the issue tracker.
That may of course not apply at all to your work environment :(
This reminds me of the story of "CTool". In the beginning CTool was put forward by one of our devs, I'll call him Don, as one possible way to solve the problem we were having. Being an earnest hard-working type, Don plugged away and delivered a working prototype. You know where I am going with this. Overnight, CTool became a part of the company work flow with an entire department depending on it. By the second or third day, bitter complaints started streaming in about CTool's shortcomings. Users questioned Don's competence, commitment and IQ. Don's protests that this was never supposed to be a production app fell on deaf ears. This went on for years. Finally, someone got around to re-writing the app, well after Don had departed. By this time, so much loathing had become attached to the name CTool that naming it CTool version 2 was out of the question. There was even a formal funeral for CTool, somewhat reminiscent of the copier (or was it a printer?) execution scene in Office Space.
Some might say Don deserved the slings and arrows for not making it go right to fix CTool. My only point is that saying you should never hack out a solution is probably unjustifiable in the Real World. But if you are the one to do it, tread cautiously.
Get it in writing (an email). So when it becomes a problem later management doesn't "forget" that it was supposed to be temporary.
Make it visible to the users. The more visible it is the less likely people are going to forget to go back and do it the right way when the crisis is over.
Negotiate before the temp solution is in place for a project, resources, and time lines to get the real fix in. Work for the real solution should probably begin as soon as the temp solution is finished.
You file a second very descriptive bug against your own "fix" and put a to-do comment right in the affected areas that says, "This area needs a lot of work. See defect #555" (use the right number of course). People who say "don't put in a hack" don't seem to understand the question. Assume you have a system that needs to be up and running now, your non-hack solution is 8 days of work, your hack is 38 minutes of work, the hack is there to buy you time to do the work and not lose money while you're doing it.
Now you still have to get your customer or management to agree to schedule the N*100 minutes of time required to do the real fix in addition to the N minutes needed now to fix it. If you must refuse to implement the hack until you get such agreement, then maybe that's what you have to do, but I've worked with some understanding people in that regard.
The real price of introducing a quick-fix is that when someone else needs to introduce a 2nd quick fix, they will introduce it based on your own quick-fix. So, the longer a quick-fix is in place, the more entrenched it will become. Quite often, a hack takes only a little bit longer than doing things right, until you encounter a 2nd hack which builds on the first.
So, obviously it is (or seems to be) sometimes necessary to introduce a quick fix.
One possible solution, assuming your version control supports it, is to introduce a fork from the source whenever you make such a hack. If people are encouraged to avoid coding new features within these special "get it done" forks, then it will eventually be more work to integrate the new features with the fork than it will be to get rid of the hack. More likely, though, the "good" fork will get discarded. And if you are far enough away from release that making such a fork will not be practical (because it is not worth doing the dual integration mentioned above), then you probably shouldn't even be using a hack anyways.
A very idealistic approach.
A more realistic solution is to keep your program segmented into as many orthogonal components as possible and to occasionally do a complete rewrite of some of the components.
A better question is why the hacky solution is bad. If it is bad because it reduces flexibility, ignore it until you need flexibility. If it is bad because it impacts the program's behavior, ignore it and eventually it will become a bug fix and WILL be addressed. If it is bad because it looks ugly, ignore it, as long as the hack is localized.
Some solutions I've seen in the past:
Mark it with a comment HACK in the code (or similar scheme such as XXX)
Have an automatic report run and emailed weekly to those that care which counts how many times the HACK comments appear
Add a new entry in your bug tracking system with the line number and description of the right solution (so the knowledge gained from the research before writing the hack isn't lost)
Write a test case that demonstrates how the hack fails (if possible) and check it into the appropriate test suite (i.e. so that it throws errors that someone will eventually want to clean up)
Once the hack is installed and the pressure is off, immediately start on the right solution
This is an excellent question. One thing I've noticed as I get more experience: hacks buy you a very short amount of time, and often cost you a huge amount more. Closely related is the 'quick fix' that solves what you think is the problem -- only to find when it blows up that it wasn't the problem at all.
Setting aside the debate about whether you should do it, let's assume that you have to do it. The trick now is to do it in a way that minimizes long-range effects, is easily ripped out later, and makes itself a nuisance so you remember to fix it.
The nuisance part is easy: make it issue a warning every time you execute the kludge.
The ripped out part can be easy: I like to do this by putting the kludge behind a subroutine name. That makes it easier to update since you compartmentalize the code. When you get your permanent solution, your subroutine can either implement it or be a no-op. Sometimes a subclass can work nicely for this too. Don't let other people depend on whatever your quick fix is, though. It's difficult to recommend any particular technique without seeing the situation.
Minimizing long range effects should be easy if the rest of the code is nice. Always go through the published interface, and so on.
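Here is one way the nuisance and ripped-out parts could look together, sketched in Python with the standard `warnings` module. The function `normalize_customer_id` and what it does are hypothetical; only the shape matters.

```python
import warnings


def normalize_customer_id(raw):
    """Hypothetical kludge kept behind one name so callers never see its internals."""
    warnings.warn(
        "normalize_customer_id is an interim fix; replace it with the permanent solution",
        stacklevel=2,  # attribute the warning to the caller, not to this function
    )
    # Quick fix: strip whitespace and leading zeros instead of validating properly.
    return raw.strip().lstrip("0")
```

Note that Python's default warning filters show a given warning only once per call site. If you really want it on every execution, the program has to opt in with `warnings.simplefilter("always")`. When the permanent solution arrives, only the body of this one function changes and the callers stay as they are.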
Try to make the cost of the hack clear to the business folks. Then they can make an informed decision either way.
You could intentionally write it in a way that is overly restrictive and single-purposed and would require a re-write to be modified.
We had to do this once - make a short term demo version that we knew we did not want to keep. The customer wanted it on a winTel box, so we developed the prototype in SGI/XWindows. (We were fluent in both, so it wasn't a problem).
Confession:
I have used '#define private public' in C++ in order to read data from some other code layer. It went in as a hack but works well and fixing it has never become a priority. It is now 3 years later...
One of the main reasons hacks do not get removed is the risk that one introduces new bugs while fixing the hack. (Especially when dealing with pre-TDD code bases.)
My answer is a bit different from the others. My experience is that the following practices help you stay agile and move from hacky first iteration/alpha solutions to beta/production ready:
Test Driven Development
Small units of refactoring
Continuous Integration
Good Configuration management
Agile database techniques/database refactoring
And it should go without saying you have to have stakeholder support to do any of these correctly. But with these practices in place you have the right tools and processes to quickly change a product in major ways with confidence. Sometimes your ability to change is your ability to manage the risk of the changes, and from the development perspective these tools/techniques give you surer footing.
Joining an existing team with a large codebase already in place can be daunting. What's the best approach?
Broad; try to get a general overview of how everything links together, from the code
Narrow; focus on small sections of code at a time, understanding how they work fully
Pick a feature to develop and learn as you go along
Try to gain insight from class diagrams and uml, if available (and up to date)
Something else entirely?
I'm working on what is currently an approx 20k line C++ app & library (Edit: small in the grand scheme of things!). In industry I imagine you'd get an introduction by an experienced programmer. However if this is not the case, what can you do to start adding value as quickly as possible?
--
Summary of answers:
Step through code in debug mode to see how it works
Pair up with someone more familiar with the code base than you, taking turns to be the person coding and the person watching/discussing. Rotate partners amongst team members so knowledge gets spread around.
Write unit tests. Start with an assertion of how you think code will work. If it turns out as you expected, you've probably understood the code. If not, you've got a puzzle to solve and or an enquiry to make. (Thanks Donal, this is a great answer)
Go through existing unit tests for functional code, in a similar fashion to above
Read UML, Doxygen generated class diagrams and other documentation to get a broad feel of the code.
Make small edits or bug fixes, then gradually build up
Keep notes, and don't jump in and start developing; it's more valuable to spend time understanding than to generate messy or inappropriate code.
this post is a partial duplicate of the-best-way-to-familiarize-yourself-with-an-inherited-codebase
Start with some small task if possible, debug the code around your problem.
Stepping through code in debug mode is the easiest way to learn how something works.
Another option is to write tests for the features you're interested in. Setting up the test harness is a good way of establishing what dependencies the system has and where its state resides. Each test starts with an assertion about the way you think the system should work. If it turns out to work that way, you've achieved something and you've got some working sample code to reproduce it. If it doesn't work that way, you've got a puzzle to solve and a line of enquiry to follow.
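A tiny sketch of that assertion-first habit, in Python. Both `legacy_round_price` (a stand-in for whatever code you inherited) and the helper `check_assumption` are invented for illustration.

```python
def legacy_round_price(value):
    # Stand-in for inherited code; it happens to rely on Python's built-in round().
    return round(value)


def check_assumption(description, actual, expected):
    """Report whether the code behaves the way I assumed it would."""
    if actual == expected:
        return f"confirmed: {description}"
    return f"puzzle: {description} (expected {expected!r}, got {actual!r})"


# My assumption going in: halves round up.
print(check_assumption("2.6 rounds up", legacy_round_price(2.6), 3))
print(check_assumption("2.5 rounds up", legacy_round_price(2.5), 3))
```

The first check is confirmed. The second turns out to be a puzzle, because Python's `round()` rounds halves to the nearest even number, and that is precisely the kind of thing you want to find out before you change the code.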
One thing that I usually suggest to people that has not yet been mentioned is that it is important to become a competent user of the existing code base before you can be a developer. When new developers come into our large software project, I suggest that they spend time becoming expert users before diving in trying to work on the code.
Maybe that's obvious, but I have seen a lot of people try to jump into the code too quickly because they are eager to start making progress.
This is quite dependent on what sort of learner and what sort of programmer you are, but:
Broad first - you need an idea of scope and size. This might include skimming docs/uml if they're good. If it's a long term project and you're going to need a full understanding of everything, I might actually read the docs properly. Again, if they're good.
Narrow - pick something manageable and try to understand it. Get a "taste" for the code.
Pick a feature - possibly a different one to the one you just looked at if you're feeling confident, and start making some small changes.
Iterate - assess how well things have gone and see if you could benefit from repeating an early step in more depth.
Pairing with strict rotation.
If possible, while going through the documentation/codebase, try to employ pairing with strict rotation. Meaning, two of you sit together for a fixed period of time (say, a 2 hour session), then you switch pairs, one person will continue working on that task while the other moves to another task with another partner.
In pairs you'll both pick up a piece of knowledge, which can then be fed to other members of the team when the rotation occurs. What's good about this also, is that when a new pair is brought together, the one who worked on the task (in this case, investigating the code) can then summarise and explain the concepts in a more easily understood way. As time progresses everyone should be at a similar level of understanding, and hopefully avoid the "Oh, only John knows that bit of the code" syndrome.
From what I can tell about your scenario, you have a good number for this (3 pairs), however, if you're distributed, or not working to the same timescale, it's unlikely to be possible.
I would suggest running Doxygen on it to get an up-to-date class diagram, then going broad-in for a while. This gives you a quickie big picture that you can use as you get up close and dirty with the code.
I agree that it depends entirely on what type of learner you are. Having said that, I've been at two companies which had very large code-bases to begin with. Typically, I work like this:
If possible, before looking at any of the functional code, I go through unit tests that are already written. These can generally help out quite a lot. If they aren't available, then I do the following.
First, I largely ignore implementation and look only at header files, or just the class interfaces. I try to get an idea of what the purpose of each class is. Second, I go one level deep into the implementation starting with what seems to be the area of most importance. This is hard to gauge, so occasionally I just start at the top and work my way down in the file list. I call this breadth-first learning. After this initial step, I generally go depth-wise through the rest of the code. The initial breadth-first look helps to solidify/fix any ideas I got from the interface level, and then the depth-wise look shows me the patterns that have been used to implement the system, as well as the different design ideas. By depth-first, I mean you basically step through the program using the debugger, stepping into each function to see how it works, and so on. This obviously isn't possible with really large systems, but 20k LOC is not that many. :)
Work with another programmer who is more familiar with the system to develop a new feature or to fix a bug. This is the method that I've seen work out the best.
I think you need to tie this to a particular task. When you have time on your hands, go for whichever approach you are in the mood for.
When you have something that needs to get done, give yourself a narrow focus and get it done.
Get the team to put you on bug fixing for two weeks (if you have two weeks). They'll be happy to get someone to take responsibility for that, and by the end of the period you will have spent so much time problem-solving with the library that you'll probably know it pretty well.
If it has unit tests (I'm betting it doesn't), start small and make sure the unit tests don't fail. If you stare at the entire codebase at once your eyes will glaze over and you will feel overwhelmed.
If there are no unit tests, you need to focus on the feature that you want. Run the app and look at the results of things that your feature should affect. Then start looking through the code trying to figure out how the app creates the things you want to change. Finally change it and check that the results come out the way you want.
You mentioned it is an app and a library. First change the app and stick to using the library as a user. Then after you learn the library it will be easier to change.
From a top-down approach, the app probably has a main loop or a main GUI that controls all the action. It is worth reading the code to get a broad overview of the main control flow of the application. If it is a GUI app, sketch on paper which screens there are and how to get from one screen to another. If it is a command-line app, sketch how the processing is done.
Even in companies it is not unusual to have this approach. Often no one fully understands how an application works. And people don't have time to show you around. They prefer specific questions about specific things so you have to dig in and experiment on your own. Then once you get your specific question you can try to isolate the source of knowledge for that piece of the application and ask it.
Start by understanding the 'problem domain' (is it a payroll system? inventory? real time control or whatever). If you don't understand the jargon the users use, you'll never understand the code.
Then look at the object model; there might already be a diagram or you might have to reverse engineer one (either manually or using a tool as suggested by Doug). At this stage you could also investigate the database (if any); it should follow the object model but it may not, and that's important to know.
Have a look at the change history or bug database, if there's an area that comes up a lot, look into that bit first. This doesn't mean that it's badly written, but that it's the bit everyone uses.
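If the history lives in git (an assumption on my part; the same idea works with other tools), the output of `git log --name-only --pretty=format:` is just a list of changed paths, one per line, with blank lines between commits. A rough Python sketch for counting them, with a hypothetical function name and made-up sample paths:

```python
from collections import Counter


def hot_spots(log_text, top=3):
    """Count how often each path appears in name-only log output."""
    counts = Counter(line.strip() for line in log_text.splitlines() if line.strip())
    return counts.most_common(top)


# Made-up sample standing in for real log output.
sample = """src/parser.cpp
src/lexer.cpp

src/parser.cpp

src/parser.cpp
src/ast.h
src/lexer.cpp
"""
for path, changes in hot_spots(sample):
    print(changes, path)
```

The files at the top of the list are the ones that keep getting touched, so they are a reasonable place to start reading.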
Lastly, keep some notes (I prefer a wiki).
The existing guys can use it to sanity check your assumptions and help you out.
You will need to refer back to it later.
The next new guy on the team will really thank you.
I had a similar situation. I'd say you go like this:
If it's a database-driven application, start from the database and try to make sense of each table, its fields, and then its relation to the other tables.
Once you're fine with the underlying store, move up to the ORM layer. Those tables must have some kind of representation in code.
Once done with that, move on to how these objects arrive and where they come from. An interface? Which interface? Any validations? What preprocessing takes place on them before they go to the datastore?
This would familiarize you better with the system. Remember that trying to write or understand unit tests is only possible when you know very well what is being tested and why it needs to be tested in only that way.
And in the case of a large application that is not database-driven, I'd recommend another approach:
What is the main goal of the system?
What are the major components the system uses to solve this problem?
What interactions do the components have with each other? Make a graph that depicts component dependencies. Ask someone already working on it. These components must be exchanging something with each other, so try to figure that out as well (for example, IO might be returning a File object back to the GUI, and the like).
Once you're comfortable with this, dive into the component that depends least on the others. Now study how that component is further divided into classes and how they interact with each other. This way you get the hang of a single component in total.
Move to the next least dependent component
To the very end, move to the core component that typically would have dependencies on many of the other components which you've already tackled
While looking at the core component, you might be referring back to the components you examined earlier, so don't worry, keep working hard!
For the first strategy:
Take this Stack Overflow site as an example. Examine the datastore: what is being stored, how it is stored, what representations those items have in the code, and how and where they are presented in the UI. Where do they come from, and what processing takes place on them before they go back to the datastore?
For the second one:
Take a word processor as an example. What components are there? IO, UI, Page and the like. How do they interact with each other? Move along as you learn further.
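The reading order for the second strategy (least dependent component first, core last) is a topological sort of the dependency graph. A small Python sketch using the standard `graphlib` module (Python 3.9+); the dependencies between the components are invented, and "Core" is a hypothetical name for the central component:

```python
from graphlib import TopologicalSorter

# Each component maps to the set of components it depends on (made-up dependencies).
depends_on = {
    "IO": set(),
    "Page": {"IO"},
    "UI": {"Page", "IO"},
    "Core": {"IO", "Page", "UI"},
}

# static_order() yields every component after all of its dependencies.
reading_order = list(TopologicalSorter(depends_on).static_order())
print(reading_order)
```

With these dependencies the order starts at IO and ends at Core, so by the time you reach the core component you have already studied everything it relies on.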
Be relaxed. Written code is someone's mindset, frozen logic and thinking style, and it takes time to read that mind.
First, if you have team members available who have experience with the code you should arrange for them to do an overview of the code with you. Each team member should provide you with information on their area of expertise. It is usually valuable to get multiple people explaining things, because some will be better at explaining than others and some will have a better understanding than others.
Then, you need to start reading the code for a while without any pressure (a couple of days or a week if your boss will provide that). It often helps to compile/build the project yourself and be able to run the project in debug mode so you can step through the code. Then, start getting your feet wet, fixing small bugs and making small enhancements. You will hopefully soon be ready for a medium-sized project, and later, a big project. Continue to lean on your team-mates as you go - often you can find one in particular who is willing to mentor you.
Don't be too hard on yourself if you struggle - that's normal. It can take a long time, maybe years, to understand a large code base. Actually, it's often the case that even after years there are still some parts of the code that are still a bit scary and opaque. When you get downtime between projects you can dig in to those areas and you'll often find that after a few tries you can figure even those parts out.
Good luck!
You may want to consider looking at source code reverse engineering tools. There are two tools that I know of:
SWAG Kit (Linux only)
Bauhaus (academic and commercial versions)
Both tools offer similar feature sets that include static analysis that produces graphs of the relations between modules in the software.
This mostly consists of call graphs and type/class dependencies. Viewing this information should give you a good picture of how the parts of the code relate to one another. Using this information, you can dig into the actual source for the parts that you are most interested in and that you need to understand/modify first.
I find that just jumping into code can be a bit overwhelming. Try to read as much documentation on the design as possible. This will hopefully explain the purpose and structure of each component. It's best if an existing developer can take you through it, but that isn't always possible.
Once you are comfortable with the high-level structure of the code, try to fix a bug or two. This will help you get to grips with the actual code.
I like all the answers that say you should use a tool like Doxygen to get a class diagram, and first try to understand the big picture. I totally agree with this.
That said, this largely depends on how well factored the code is to begin with. If it's a gigantic mess, it's going to be hard to learn. If it's clean and organized properly, it shouldn't be that bad.
See this answer on how to use test coverage tools to locate the code for a feature of interest, without knowing anything about where that feature is, or how it is spread across many modules.
(shameless marketing ahead)
You should check out nWire. It is an Eclipse plugin for navigating and visualizing large codebases. Many of our customers use it to break-in new developers by printing out visualizations of the major flows.