Mercurial setup: One central repo or several?

My company is switching from Subversion to Mercurial. We're using .NET for our product. We have a solution with about a dozen projects that are separate modules with no dependencies on each other. We're using a central repo on a server with push/pull for our integration build.
I'm trying to figure out if I should create one central repo with all the projects in it, or if I should create a separate repo for each project. One argument for separate repos is that branching the individual modules would be easier, but an argument for a single repo is easier management and workflow.
I'm very new to hg and DVCS, so some guidance is greatly appreciated.
ETA: At hginit.com, Joel says:
"[I]f you’re used to having one big gigantic repository for the whole company, where some people only check out and work on subdirectories that they care about, this isn’t a very good way to work with Mercurial—you’re better off having lots of smaller repositories for each project."
It'd be great if someone could expand on this or point me to more documentation.

One thing you should take into consideration here is that Mercurial does not support checking out individual directories the way Subversion does. A typical Subversion setup is one giant repo with multiple separate projects in it; when somebody needs code, they just check out the subdirectory containing that project. You can't do this in Mercurial: you either clone the whole repo or nothing. If the people working on these projects do not all need all the code all the time, you might want to split it up into separate repositories.
EDIT: This link might be helpful in setting things up, in particular the "Publishing Multiple Repositories" section.
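As a rough illustration of serving one repository per project from a single server, the sketch below generates the `[paths]` section of an hgweb configuration file, which maps a URL path to a repository directory on disk. The root directory `/srv/hg` and the project names are hypothetical, and the function name `hgweb_paths_config` is made up for this example; adapt it to your own layout.

```python
def hgweb_paths_config(repo_root, project_names):
    """Return hgweb config text that publishes one repository per project.

    repo_root and project_names are placeholders for illustration; each
    entry maps the URL path <name> to the directory <repo_root>/<name>.
    """
    root = repo_root.rstrip("/")
    lines = ["[paths]"]
    for name in sorted(project_names):
        lines.append(f"{name} = {root}/{name}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical server directory and module names.
    print(hgweb_paths_config("/srv/hg", ["reports", "billing"]), end="")
```

Each module then has its own clone URL under the same server, so a developer only pulls the modules they actually work on.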

If completely separate repos don't work for you, you could make each project a subrepo of an umbrella repo. That said, separate repos sound like what you need, given that each project sounds totally independent.
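For the umbrella-repo variant, Mercurial reads subrepository definitions from a `.hgsub` file in the root of the outer repository, one `path = source` line per subrepo. A minimal sketch that produces such a file's contents follows; the module paths and the `hg.example.com` URLs are invented for illustration, and `hgsub_text` is a hypothetical helper, not part of Mercurial.

```python
def hgsub_text(subrepos):
    """Build .hgsub contents from a mapping of working-directory path -> source.

    The paths and sources passed in are assumptions made by the caller;
    nothing here checks that the repositories exist.
    """
    return "".join(
        f"{path} = {source}\n" for path, source in sorted(subrepos.items())
    )


if __name__ == "__main__":
    # Hypothetical modules living under an umbrella repository.
    print(
        hgsub_text(
            {
                "modules/reports": "https://hg.example.com/reports",
                "modules/billing": "https://hg.example.com/billing",
            }
        ),
        end="",
    )
```

After writing the file you would add and commit `.hgsub` in the umbrella repo, which then records which changeset of each subrepo belongs to each umbrella commit.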

I'm fairly new to Mercurial myself (my company is making the leap from SourceSafe), so I don't know what someone with more experience would say.
For me it makes sense to have one repository per Visual Studio Solution. If your modules are truly not dependent on each other, why are they all in the same solution? If you have a good reason for them all being in one solution, then that's probably the reason to keep them in one repository. If there's not a good reason for them to be in one solution, then a repository and a solution for each makes more sense to me.
Edit: So, since all the modules are built together and need to integrate, that would push me towards a single solution and a single repository.
Mercurial does a great job of merging, but the one thing I've had issues with is the solution file when a merge adds more than one project at a time: it gets confused by the multiple EndProject lines. So, as long as you aren't adding new projects very often, your merges should be smooth.

From my experience, and not based on any studies, I would say that each logical unit should be its own repository. If you share code between subprojects, they need to be in the same repo. Full subrepo functionality is planned, but currently (April 2010) it is not fully implemented.

Related

Mercurial managing subtle variations/configurations of a same project

I am currently using Mercurial, along with the Guestrepo extension, to manage and version the different components of a project. I have arrived at a fairly stable workflow for managing the different versions of the components.
However, I can't come up with an effective solution when it comes to versioning subtle variations of a component. Examples are a slightly different embedded device driver (with a different serial port speed, say) or a GUI written in English instead of German.
I don't think stacking them in the Release/Stable branch is a good workflow, as the proliferation of different configurations (English, Spanish, Chinese, ...) could seriously and pointlessly bloat the Release branch.
On the other hand, creating a separate Release branch for each would lead to many, many branches, which is not the best solution IMHO.
Creating separate repositories for each of the configurations would make any structural change quite tedious, as every one of the repos would have to be updated with that change.
Any idea on this?
Thank you.
As @EldadAK suggests, creating a repo for each configuration and importing core functionality from other repos seems a nice idea.
However, I still can't figure out how to arrange "same but slightly different" components, which differ in some subtle features but share their core.
Is it a code architecture issue? Should the components be refactored so that the main core and the differing features live in different components, which are then combined by custom builds for each configuration?
IMHO, you should keep all changes in your main branch. Managing all branches or even multiple repositories is not scalable and will eventually get out of hand.
I don't think you should care about the size of your release branch. By keeping it all together, you will always know where you are, what is included in a release and when you really need to branch, have all the changes accumulated to that point.
It's my personal opinion that you should try to keep things simple to manage, looking many revisions and years ahead...
Note - Project managers tend to think about next week. You need to think about next year...
I hope this helps.
"effective solution when it comes to versioning subtle variations of a component"
While I can't see any serious drawbacks to using named branches in one repo, you can use another (de facto "default") solution for configuration management inside Mercurial: MQ.
You only have to adapt your workflow slightly (same number of branches, same number of repos) to use MQ.

Setting up versioning repositories - one big or many small?

I have worked with mercurial for some time now and it feels like a big asset to what I do. It's great to never mess up code again by accident.
I love the workflow and want to set up versioning on all my projects. Which of the two alternatives illustrated below is the better option?
Alternative A:
/repo/
/repo/ownWork/
    /project1/
/repo/clients/
    /client1/
Alternative B:
/repo/project1/
/repo/client1/
I don't think there is a "right" answer. As with many things, it depends.
Personally, I have a separate repository for each project and, possibly, one or more repositories for shared code. With distributed source control you have to clone the whole repository, not just sub-folders as you can with, say, SVN. Therefore I like to keep each project/client as self-contained as possible but, if necessary, clone shared repos too.
However, I still maintain a single 'central' web server to host them all. I like 'distributed' and I like 'centralised' too :-)
The good thing about hg is that it seems (to my newbie eyes, anyway) to be very easy to chop and change your layout/structure as time progresses.
In Mercurial it's very easy to combine repos later, but not possible to separate them without invalidating existing clones. Start with separate repos and merge them later if that becomes a hassle. Consider subrepos for code shared among projects.

Mercurial, Branch each project in a solution?

Currently, we're using Mercurial as our VCS on BitBucket.
Right now the project is a single solution with all the code checked in and all the developers working on the "default" branch. Every morning, we create a release build and hand it to QA.
I'm wondering whether it makes more sense to give each developer his own branch, since each developer is working on one project in the solution.
The other main point is how this would affect QA. Would they need to merge all the branches prior to building?
I'm really confused about this.
As mentioned in "When should you make a branch", you use branching to isolate a development effort.
In your case, you would isolate each project in the solution on which you are working.
That would allow for:
- intermediate commits, project by project
- QA testing for each project
But that would also require a merge into a common branch so that all the projects can be tested together as a solution.
See HgInit (from Joel Spolsky) for more on that kind of collaboration workflow.
In "Repository Architecture", Joel illustrates two development efforts isolated in two different teams, but still including a synchronization (merge) effort at the end.

To keep my own versioned app or not

I need some opinions here.
I'm working on a Django project using buildout to get the dependencies, etc...
I use mercurial as DVCS.
Now... I need to customize one of the dependencies, so I can do one of the following:
(* The changes may not be useful for everyone else.)
1- Do a fork of the project in (github, bitbucket, etc...) maintain my version, and get the dependency with (mercurial or git) recipe.
2- Clone the project, put it in the PYTHONPATH, erase DVCS dirs and add it to my projects version. So every change will be private. Here I need to erase all the info from their DVCS or something.
Any other you can think of.
Am I missing something? Am I way off?
Thanks!
Esteban, take these steps (I'll talk in Mercurial-speak, but this is all doable in git too):
1- Clone their project.
2- Make your clone of their project a subrepo in your project.
That gives you the best of all worlds. You can edit code in your project and their project without paying attention to which is which, and when you commit the changes to your code go into your repo along with a pointer to a new changeset in your clone of their project. Then when you want to update your clone of their project you can do so in place and merge simply.
So this is pretty much what you said in '1', but there's no need to fork or to host that repo publicly. Just edit their clone as a subrepo of your project and never push (which wouldn't work anyway, since you don't have write access to their repo).
The primary drawback of your option two is that, as they modify and improve the project you depend on, you'll have a hard time pulling their improvements in and merging them with yours.
Well, if you're using a DVCS then all your commits are kept as changesets, and people can choose whether or not to apply any given changeset. So as long as you comment that change, people can choose to apply it or not as they see fit. What's more, if they don't want that change but want your other changes, they can pick and choose. So the truth is that the DVCS takes care of the problem for you (provided the people pulling from you are using the DVCS properly).
Personally, I recommend forking, but like I said, it doesn't really matter.
You ask this question in a rather confusing way, and I don't know if you really understand the point of a DVCS.
The whole point of a DVCS is to allow you to have your own private repository. You do not need to publish your repository on github or bitbucket or any of those places unless you want to, but I certainly would not erase the DVCS information.
If the upstream project makes changes you do want in addition to your own private changes, you will have a devil of a time merging them unless you keep the DVCS information around.
Using Mercurial, you can include a project in yours by using the Mercurial subrepo feature.

Mercurial Novice - Very Confused

I'm in the process of trying to get my head around a DVCS such as Mercurial, but I'm getting quite confused by certain points. First, a bit of context:
At the moment I mostly use Subversion, and it works fine for my workflow:
Mostly the repository is for my own use. I'm the only web developer, and I only ever submit raw code to my manager; he never has to see the repository.
I use the repo to create major versions, and as a backup so I can revert when something doesn't work out.
The repo also acts as a file share, enabling me to work from the same codebase at work and at home.
My main reasons for wanting to switch to Mercurial are the offline commits and the easier branching/merging.
First, can anyone tell me how I would get Mercurial to fit this workflow?
How do I go about sharing multiple repositories (i.e. one for each project) between computers?
Any help would be hugely appreciated.
Thanks
http://hginit.com/
There is a fantastic pre-chapter there specifically for SVN users. The rest of the tutorial will get you on your feet fairly quickly.
I'll answer just one part of your question, that of how to manage access to your repository from both home and work, because this is one of the situations where distributed version control is really useful.
The answer is that your two repositories are clones of one another (more precisely, one is a clone of the other). You do some work during the day, commit it, then pull that work into your home repository (or push, but that requires more setup). The next morning, you do the same thing in reverse. Mercurial comes with a built-in read-only HTTP server that makes this really easy, provided that you can expose a port.
The end result is that you have two repositories (i.e. an automatic backup of the entire history). At any given point in time, one is more up to date than the other, but since you're the sole committer to both, they won't diverge.
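To make the home/work arrangement concrete: a clone remembers where to pull from through the `default` entry in the `[paths]` section of its `.hg/hgrc` file, so a plain `hg pull` fetches from the other machine. The sketch below just produces that snippet. The host name `work-pc.example` is invented, port 8000 is what `hg serve` listens on unless told otherwise, and `hgrc_paths` is a hypothetical helper.

```python
def hgrc_paths(default_url):
    """Return the [paths] section for a repository's .hg/hgrc file.

    default_url is wherever the other machine's repository is served;
    the value used below is a made-up example, not a real host.
    """
    return f"[paths]\ndefault = {default_url}\n"


if __name__ == "__main__":
    # Hypothetical: the work machine runs `hg serve` inside the project repo.
    print(hgrc_paths("http://work-pc.example:8000/"), end="")
```

With one such entry per project repository at home (and the mirror image at work), moving a day's commits across is a pull followed by an update.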