What should I put in header comments at the top of source files? - language-agnostic

I've got lots of source code files written in various languages, but none of them have a standard comment at the top (sometimes even across the same project). Some of them don't have any header comment at all :-)
I've been thinking about creating a standard template that I can use at the top of my source files, and was wondering what fields I should include.
I know I want to include my name and a short description of what the file contains/does. Should I also include the date created? The date last modified? The programmer who last modified the file? What other fields have you found to be useful?
Any tips and comments welcome.
Thanks,
Cameron

This seems to be a dying practice.
Some people here on StackOverflow are against code comments altogether (reasoning that code should be written to be self explanatory) While I wouldn't go that far, some of the points of the anti-comment crowd make sense, such as the fact that comments tend to be out of date.
Header blocks of comments suffer from these symptoms even more so. Every organization I've been with that has had these header blocks, they are out of date. They have a author name of some guy who doesnt even work there any more, a description that does not match the code at all (assuming it ever did) and a last modified date, that once compared with version control history, seems to have missed its last dozen updates.
In my personal opinion, keep comments close to the code. If you want to know purpose of, and/or history of, a code file, use your version control system.

Date created, date modified and author who last changed the file should be stored in your source control software.
I usually put:
The main purpose of the file and things within the file.
The project/module the file belongs to.
The license associated with the file (and a LICENSE file in the project root).
Who is responsible for the file (either the team, person, or both)

Back in 2002, when I was straight out of college and jobs were few and far between after the dot-com bust, I joined a service company which used to create software customized for their clients in Java. I had to sit in the office of a client (which was a ramshackle room in an electric sub-station rigged with an AC to keep the servers running), sharing chairs/PCs with other guys in the team. The other engineers (if I can call them engineers ;) in the group used to make changes ad-hoc to the source code, compile the files and put them into production.
No way to figure out who made what change.
No way to figure out why any change was made.
No way to go to previous version of code, unless the engineer "remembered" what he modified.
Backup: Copy over files from the production server, which were replaced with new files.
Location of backup: Home directory of engineer copying over files to production server.
Reports of production servers going down due to botched attempts of copying over files to the server (missed a file to be copied over, backups getting lost or wrong files being copied over or not all files being copied over) were met with shrugs (oh no, is it down? let's see what happened; hey who changed what recently...? ummm...).
During those days, after spending several frustrating days trying to figure out the whos and whys behind the code, I had devised a system for comments in a list in the header of the source file which detailed the following:
Date of change made
Who made the change
Why was the change made
Two months later when the list threatened to challenge the size of the source code in the file, the manager had the bright idea of getting a source version control system.
I have never needed to put any comments in headers of source files (except for copyright notices) in any company I worked since. In my current company, everything else is mostly self-evident by looking at the code, or going to the bug reporting system which is integrated with the source version control system.

What fields do you need? If you have to ask whether to put some info there, you don't really need that info. Unless you are forced, by some bureaucratic incompetence of your employer, I don't see why you should go looking for more info than you already feel should be there.\

In most organizations, all source files have to begin with a legal blurb. If you're really lucky, it's just a one-liner, but in most cases it's a really long block of legalese. As a result, few people ever read these. Our eye just travels to the first program element and then goes up to its documentation.
So if you want to write anything, write it in association with the topmost program element, not the file.
Any other bookkeeping information should generally be part of your version control, not maintained (poorly) in the file itself.

In addition to the comment above stating license, the project that it belongs to, etc I also tend to put the "weird" requirements at the top as well (such as "built with version X of library Y") so you, or the person who picks it up after you won't change something that the program relies on without realizing it (or, if they do, they will at least know what to change back)

A lot depends on whether you're using an auto-documentation generation tool or not.
While I agree with many of the comments, if you're using JavaDoc or some other documentation generating tool that depends on comments, you'll obviously need to include the things it wants to see.

You did not mention that you are using a version control system and your comment in Neil N's answer confirms this for your older code. While using version control is the best way to go I also have experienced many situations where the cost of doing so for older code would not be paid for by the project's sponsor. If you do not have a centralized change history for the project then the change history can be put in the modules. It is good that you are using a version control system for your new code.
Your company name
All rights reserved (c) year - or reference to appropriate license
Project or library this file is for
Module it belongs to
Description of what it contains
History
-------
01/08/2010 - Programmer - version
Initial creation.
01/09/2010 - Programmer - version
Change description.
01/10/2010 - Programmer - version
Change description.

Those useful fields that you mentioned are good ones. Who modified the file and when.
Your version control software should allow for the embedding of keywords within comments. For example, in CVS, the $Id$ will resolve to the file, date/time modified, and user that modified the file. It will automatically be kept up to date with each check-in.

Include the following information:
What this file is for. That's a very useful piece of knowledge and it's more important than anything else. You should tell the reader, why there is such a file, why did you group functions in a separate file/package/module and why they are used. Maybe briefly, one or two lines, but that should be there.
Legal stuff, if appplicant.
Leave the place for special commands of console editors, such as of Emacs.
Add special commands that your auto-documenting system requires.
Things things you shouldn not include are
Who created the file
When it was created
Who modified it the last time
When it was last modified
What was added by the latest modification
You can--and should--retrieve it via the version control system, where it's constantly and automatically kept up-to-date. Let alone that most of these points are just useless.

Who created the file
When it was created
Who modified it the last time
When it was last modified
What was added by the latest modification

Related

Fix or Start Over (MS Access Database)?

I have sort of agreed to help out a small local community non-profit org with a database that was created for them many years ago. The original developer, along with his original uncompiled files are long gone, the current file has many problems, but since it is a .mde file I can't pull it apart to see how it all works. I do have access to the Tables and I can see what the Reports look like (just not how and where they get the data).
My feeling is that it might just be easier (and quicker!) to create a new version using what I can see in the original database as the basis for the new one. This way whatever problems they are experiencing will be resolved and all of the legacy stuff they no longer need/use will be removed. Plus, they will have a version that can be modified in the future even if I'm not around.
If you've had to deal with a similar situation how did you go about deciding which direction to head in? In other words, what are the typical considerations/traps I need to know about before taking on this adventure (other than it taking way more time and effort than I think it will)?
Thanks!
I'd first determine if big or minor changes are needed now or to be expected in the near future. If not, I'd try to find a way to keep the system alive.
On the other hand, you already provided some good arguments for setting up a new system that I agree with. I would add to that that is much more satisfying to do this. Almost always you wish you started from greenfield if you try to alter an old legacy system.
Point to considder, is that you might need to train the existing users to work with the new system.
Good luck!

Licensing an open source project after reading code of a similar one

I recently contributed some code to an open source library/tool just before I realized that I'd like to re-write this project myself (different programming language and design choices). However, some aspects of the project are just like I would have done them myself or are simply worth "copying". Even if I tried really hard to forget about the original code -- most class names, constants and other stuff just are naturally named the same. The original project's license is AGPL.
Can I use a different license (e.g. MIT)? Which ones?
Will I have to mention the original project somewhere?
If so, where? And will I ever be allowed to remove the notice (maybe after the two projects have truly diverged after a few years of development)?
I am not a lawyer and this is probably not the correct venue for soliciting legal advice.
However, this is my take, as one FOSS contributor to another.
Generally speaking when you contribute to open source projects, you retain the copyright over the code that you wrote, and you release it under a license that permits everyone to use modify and distribute it.
Thus, you retain the right to relicense code that you so contributed under MIT or another license. That doesn't count as "stealing" the code and erasing the GPL license -- you were the original source and you retain the right to release what was originally yours again under a different license.
In some projects, the leaders may request that people DO transfer copyright, although its pretty rare I think. You should check the licensing statement to be sure. Unless there is something in writing somewhere saying that you explicitly agree to transfer copyright to them, then most likely you retained it.
You do not have the right to relicense other peoples work though. In cases where you modified someone elses code, contributing some changes to their class or something, you probably become joint owners, and at least I would not feel comfortable copying the part that they made and relicensing it without permission.
That's just the text of the code though. If you want to rewrite another program from scratch, using a similar high level plan but different execution, I don't think copyright will encumber you. Intellectual property law can still encumber you if some technique or method in the code is covered by a software patent. But it doesn't sound like that's the case here.
To avoid legal issues, sometimes companies / groups of people will use "clean room design" (https://en.wikipedia.org/wiki/Clean_room_design). But iiuc this is just done as a precaution to unambiguously head off any possible lawsuit -- the law does not require that you use such techniques just because you once looked at GPL code, iiuc.
For an example of this playing out, you can look at the history of the MinGW cross compiler project, and the mingw-w64 spin-off of it, which originally began because a private company wanted a version of mingw which supported 64-bit processors and other things, and so used clean room design to reverse engineer the project. The result of this was eventually made fully open source, but was not accepted back into the original mingw project and so there are now two projects. (Hope that this is a fair and impartial summary of the history.)
https://en.wikipedia.org/wiki/MinGW#History

How to make a MediaWiki User required to select license on file upload

It's the first time that I have had to set up a MediaWiki site and I am relatively new to it as a whole.
My problem is that users can upload files (only images in my case) without selecting one of the licenses in the drop down box on the Special:upload page, I have defined licenses though under MediaWiki:Licenses so there are licenses select-able. I would like the upload warning function to tell users that a license is required to upload a file (the same way you are warned the name is to short or whatever). The user must first pick a license before being able to continue.
I have searched around quite a bit and it doesn't seem as easy a changing a variable somewhere. If the solution is something that I should have known I apologize for posting a stupid question...
Well, I've not checked again but this is not really possible with MediaWiki:Licenses alone IIRC. You can however easily set MediaWiki:Licenses to that the default/fallback "license"/tag is something clearly bad like {{DELETEME}} and then set up an AbuseFilter rule to prevent the upload in question. AbuseFilter is very useful, so I recommend trying it anyway.
Alternatively, you can install the UploadWizard, which is very cumbersome to configure but is quite good at enforcing Wikimedia-specific habits like template/copyright paranoia. ;-) Then you can set $wgUploadNavigationUrl to point to UploadWizard, or even restrict Special:Upload visits to privileged users.

Best Practices for Setup and Management of an Open Source Project

Later this year I want to release a PHP framework that I've been working on as open source. I do use source control (SVN), but it's on an extremely limited basis. I'm self-taught, I develop by myself and don't have the experience of working with large teams. I have some ideas about what can help make a project successful, but I'm fuzzy on some of the details. Since it's not yet released, I want to do everything I can to set up the right infrastructure from the beginning. What do I need to know in order to setup and manage a successful project?
Some ideas that I have to make it successful (beyond marketing it):
Good documentation and tutorials
Automated unit tests and builds to
push update to the website
A clear roadmap
Bug Tracking integrated with the
source control
A style guide to keep the code
consistent
A forum for the community to get
support, share ideas, etc.
A good example application built with
the framework
A blog to keep the community informed
Maintaining backwards compatibility
wherever possible
Some of my questions:
How do I setup and automate a one
step submit-test-commit-generate API
docs-push update to website process? Edit: Would Ant or Maven be good candidates for this? If so, do you know of any resources for setting up a PHP project using them?
How do I handle (technically)
submissions from other users? How can
I ensure that those submissions must
be approved before being integrated?
What are some of the pitfalls that
can be avoided in terms of the
project community? I'd prefer to have
it be as friendly and helpful as
possible without a lot of drama.
I'd love to learn from your experience on any of these points. If you think I'm missing anything big, please share that as well. Any resources (preferably geared toward a beginner) that you could point me towards would also be greatly appreciated.
I'm just getting started in community projects, but I'll give you some advice on what I know.
How do I setup and automate a one step submit-test-commit-generate API docs-push update to website process?
I've never implemented it as one process. You could just have a checklist, and possibly even create some scripts to do certain tasks. I've never worked with any source control that automates the uploading and such to be done by a script. Most of the time, there is some web interaction to be involved.
You don't want to push API changes until it's an official release.
EDIT: Working Environment
For PHP, most of the time, I either edit directly on the server and test it there, using a beta.example.com, or similar, before pushing to example.com. You could also set up an web environment on your home PC (using XAMPP for Windows, or the standard LAMP installation on Linux). You would probably just use a mirror of your repository here, so you'd do svn commit, or whichever is appropriate for the VCS or DVCS you choose.
The fun part is testing this with different PHP versions. I've not done this myself, but you could probably use a .htaccess file to run a different PHP binary in order to test it out. I'm not really sure what the best option is for this is.
I've not done much with API, as I've never created a library, but just doing a quick search I found http://www.phpdoc.org/. It looks like a mature project, so that might be a starting point.
As far as creating releases go, I generally create a script that only includes the files that are part of the distribution (it will filter out any VCS files, and anything that you don't want in the distributed file). You could write a script around find on linux (which is what I do most of the time), or there may be other better options.
How do I handle (technically) submissions from other users? How can I ensure that those submissions must be approved before being integrated?
This is mostly handled by the bug tracker, and limited access in the Version Control System. Usually, you, and the people you allow, can commit to the VCS. Other users can submit patches, but then you might have someone review the patch, test the patch, and commit. You could split these tasks up as a team, or assign a patch to one person and have them do it all.
What are some of the pitfalls that can be avoided in terms of the project community? I'd prefer to have it be as friendly and helpful as possible without a lot of drama.
I would just make sure to keep it as positive as possible with the project members and community. There's going to be some disagreements, and it will drive a few people away, but as long as you have a stable product that meets the needs of most people, I think that's all that anyone can expect.
One minor suggestion that's worked well for me: start using first-person plural pronouns, rather than singular ones. That is, talk about "we" and "us" rather than "I" and "me." It encourages other people to participate when they feel like part of team, rather than when they feel like they're contributing your own self-aggrandizement.
The most important thing you have to do is to attract users. Without users, you won't get any contributions and developers helping you out. Because developers are users first, and then they decide to extend/fix something they use and might become contributors.
So to get users, you should consider
describe what your framework does in one or two sentences at the top of your project page
mention how your framework can be used and for what, what situations it is most useful for
add a lot of examples on how to use it
mention whether your framework is stable, beta or alpha. That's important because user need to know that before they start using it
also mention whether you want to keep improving it and keep working on it - most users don't want to use a framework that's abandoned (also keep in mind that a lot of users check your commits to see whether you really are working on it - if your last commit to the repository was months ago then you're not really working on it, so cheating isn't possible)
If you got all this, and people start submitting patches, you can use a patch tool to apply those to your source. Depending on your version control system, you can either use the GNU patch, a diff/patch tool that comes with your version control or maybe even a GUI tool that helps you with this. SVN doesn't have a patch tool (yet), but 'svn diff' will create a patchfile which you can then apply with the GNU patch tool, or in case you're using TortoiseSVN, right-drag the patchfile to your working copy and have TortoiseMerge apply it for you.
And on how to best deal with the community:
answer questions in time, don't wait more than two or three days to answer questions
try to be nice, even with upset and angry people. Only if they keep bothering tell them to (still in a nice way if possible) go elsewhere
always keep discussions about the project on a mailing list. You don't want to repeat the same discussions over and over again - if you have a mailing list, just point users to the archives before the discussion starts all over again
And you should watch the talk "How Open Source Projects Survive Poisonous People (And You Can Too)" - it's really good and tells you a lot on how to deal not just with 'poisonous people' but also how to deal with all people involved in your project.
I'd like to add that you should make it as easy as possible for your users to get the whole thing running and modify the code - these 'power users' can be 'converted' into developers or at least people who send smaller patches.
Don't try to do it all yourself - for open source projects there are several hosting providers that solve most of the problems. I recommend codeplex or google code.
Setting up build scripts will depend a certain amount on what platform you set up, but in general it's easy to add any tool you want into the script once you start using any sort of build script.
If you really need the one step process you describe, you need a build server. I use TeamCity, which I have set up to watch for any changes in svn and trigger build/test whenever something is checked in. The build server will generally be able to perform any steps that you put into the build script.
Read up on Git as an alternative to SVN
free public repository/bug tracker/wiki/fork-enabled community in Github (which hosts symfony and PHPUnit amongst others)
"How do I handle (technically) submissions from other users? How can I ensure that those submissions must be approved before being integrated?" - with Git, pull what you/your closest team finds most interesting to the master branch
Consistent API
be inspired of other public API:s
only change in major versions
guessable
Interesting for both users & developers
clear goal (your roadmap - excellent)
useful, contra everything else available
easy to use, but still not easy-enough-to-write/maintain-yourself
You could check out either Ant or Phing to build your project. Include CodeSniffer in the build and you'll save time checking for basic formatting errors/differences.
These are all technical tips, about the soft part... treat humans with respect, a lot of interest and be overly excited about their contributions and make them feel that they're not wasting their time. That would appeal to me.
Take a look at Karl Fogel's book on Producing Open Source Software. It probably has everything that you asked.
You should also plan for engaging the community. I'd recommend reading Jono Bacon's The Art of Community [http://www.artofcommunityonline.org/].
You have a great set of ideas to start. You might have to start by trimming them down! Ask yourself what's necessary for a first release.
For automating the builds and tests, the scripting can be done with ant, maven or phing for PHP projects.
You'll probably need a host so you can demo the product. For PHP that is pretty easy to find.
You need an open source hosting provider-- especially github (but also google code, source forge, etc). Github provides bug tracking, default licenses, blog and great mechanisms for accepting changes from the community. Built on git, it facilitates distributed projects quite well.
Although it's good to have a one-step build and install in place, automating integration of others changes probably isn't important (or desirable) off the bat.
Good luck!

How to define the version number of a software?

What is the best method to determine the version number I should use for a software or component? Is there a general rule to set version numbers?
I'm pretty sure it is a basic question but I didn't find anything useful after searching a while.
Microsoft have a convention of:
[major].[minor].[revision].[build]
Or follow Jeff's versioning system.
I've been doing this as an interim until I find a better solution. I don't build many large applications, mostly reports and smaller macros, but it's still important for me to keep track of changes and versions.
[Current year].[Current month].[Current day]
FileName 9.7.17.rpt for example.
It works for me and my boss, and it gives a value which you can compare to today's date to see how old the file is. I also keep a changelog.txt file in the same folder as the most current version and it keeps track of all the changes from the previous versions. I also keep track of all versions in a version control page on each projects tab in OneNote.
Thanks for the answer. I'll also throw in how I store the projects for giggles.
Every project gets its own folder. Inside that folder I'll have 4 main items that help me keep track of what's going on in the project.
An old versions folder
A folder for any reference material I might need for the project
The actual project file
And the changelog
That tree will look something like this.
Project X
Old versions
X Report 9.4.12.rpt
X Report 9.5.3.rpt
X Report 9.7.20.rpt
Reference
SQL calls.txt
Client list.txt
Procedures.doc
X Report 9.7.29.rpt
X Report changelog.txt
This way of keeping track of my work really cuts down on the amount of time that I need to spend documenting anything and organizes it in a standard way so if my boss needs to grab something I've worked on, even he knows exactly what everything means and where it is.
For storing multiple projects in my network folder I have these folders.
Inbox
Projects
#Archived Projects
Current Project 1
Current Project 2
Current Project 3
Reference
Inbox is where I toss random things to process later, or a folder where my boss can throw something I'm going to need for a later project. The Projects folder contains all the projects I'm currently working on, and then when I'm done or they no longer become a current priority, they get tossed in #Archived Projects. Reference is a folder for general job reference material, like policies and procedures, phone lists, org charts, fire escape plans. I may never use them, but it's comforting to have a place to put that kind of stuff as opposed to digging through old email.
This is a very common question. Are you sure you searched around? Wikipedia has a good article on software versioning.
Or, you can follow Ubuntu's convention of using year and month.
For example, release on April 2009 would be:
v9.04
Do it like Donald Knuth does with TeX---its version converges to π with each release and will in fact become π when he dies.
Since version 3, TeX has used an
idiosyncratic version numbering
system, where updates have been
indicated by adding an extra digit at
the end of the decimal, so that the
version number asymptotically
approaches π. This is a reflection of
the fact that TeX is now very stable,
and only minor updates are
anticipated. The current version of
TeX is 3.1415926; it was last updated
in March 2008.
from Wikipedia
A common scheme seems to be to use [major].[minor].[revision]. Where the major version number increments on large/major feature changes or rewrites (or stays 0 as long as you didn't reach a stable version, although many open source projects never get past 0 here), minor version number increases on minor changes, such as a collection of bugfixes, an added small feature and the like. revision increments with each build and reflects the smallest granularity of tracking your exact version. Things like small fixes, etc. get rolled into this, usually.
Usually the first number are major changes/major releases, the second number are used when minor features and bug fixes are added, and the third number is used for minor bug fixes and revision numbers.
Ex. 1.0.0
Depends on a lot of things.
If you are doing .Net work, you can have the system keep track of version numbers for your .dlls and .exe files automatically.
We frequently use the subversion revision as part of our version number. We use a system like:
major.minor.svn-version
We increment the major/minor manually based on internal decisions, and have the svn-version propagate to distinguish builds.
The most important thing is that version numbers make sense to your users.