Mercurial: repository layout for multi-customer project


We have a webapp product that is deployed for multiple customers, and we are thinking about switching from svn to mercurial. In svn, we treat the trunk as our project's 'core', while a branch is set up for each customer.
Now I wonder what the best repo layout might be in mercurial.
Let's say the project basically consists of three main folders: html, css, js.
While the contents of /html stay the same across customers, we have customizations in /css & /js.
Right now these customizations live in separate files, such as skin.css, so we can clearly tell the real on-purpose-customizations from fixes/changes to the core/common files.
SVN then lets us partially commit changes from customer branches back to the trunk, so we can fix global stuff while working on customer projects. As I understand, partial commits are not supported in mercurial right now.
So, how do we best approach this situation in mercurial?
Should we have one central core repo (possibly with release branches) and clone customer projects as separate (remote) repos?
Is it better to have all - core & customer branches - inside one repo?
Thanks a lot for any pointer!

I would split the project into two repos: core (html) and customer (css, js). "core" is easy to understand. For "customer", I would start with a generic or fallback set of styles, then make a clone of the generic styles for each customer. To make a build for one customer, you need to pull from both core and that customer's clone. This way the customers are isolated from each other but still related through the generic styles. So if there's a style you want to change across all customers, you just commit to the generic styles and let each customer clone pull. I wouldn't make the customers totally unrelated repos.
The problem with having one single repo is that whenever you make a clone for one customer, you carry around an irrelevant part (html), and unexpected modifications can easily leak into html. For the same reason, your project should probably have been split into two projects even in SVN.
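To make the "pull from both core and that customer clone" step concrete, here is a minimal sketch in Python. It assumes the two clones already exist as working directories on disk; the function name assemble_build and the directory names (core, customer-a, build) are made up for illustration and are not part of Mercurial.

```python
import shutil
import tempfile
from pathlib import Path


def assemble_build(core_dir: Path, customer_dir: Path, build_dir: Path) -> None:
    """Copy the core tree, then overlay the customer tree on top of it."""
    # Start from a clean output directory so stale files never leak in.
    if build_dir.exists():
        shutil.rmtree(build_dir)
    skip_hg = shutil.ignore_patterns(".hg")  # leave repository metadata out
    shutil.copytree(core_dir, build_dir, ignore=skip_hg)
    # dirs_exist_ok (Python 3.8+) lets customer files land in the same tree.
    shutil.copytree(customer_dir, build_dir, ignore=skip_hg, dirs_exist_ok=True)


def demo() -> dict:
    """Build a throwaway layout and return {relative path: file contents}."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        core, customer, build = root / "core", root / "customer-a", root / "build"
        (core / "html").mkdir(parents=True)
        (core / "html" / "index.html").write_text("<html></html>")
        (customer / "css").mkdir(parents=True)
        (customer / "css" / "skin.css").write_text("body { color: red; }")
        assemble_build(core, customer, build)
        return {p.relative_to(build).as_posix(): p.read_text()
                for p in build.rglob("*") if p.is_file()}


if __name__ == "__main__":
    print(demo())
```

Because the customer tree is copied last, a customer file with the same relative path as a core file would win. Whether that is desirable depends on how strictly you want to keep html out of the customer repos.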

Related

mercurial repo with multiple customer branches

My question is similar to: Mercurial branches with different codebase
But the solution given there was to do all the work on the one customer branch and merge it to default. I don't see how that is workable in my case.
I have a project that gets distributed to 4 customers. I've set up a named branch for each customer. What is an effective way of merging changes to common code, while leaving alone some customer-specific data and/or requirements?
Edit:
I have customers a,b,c. Machines M,N. and parts 1,2,3,4,5.
Right now I have subrepos a,b,c,M,N,1,2,3,4,5 and repos aM1,aM2,bM1,bN1,... . I am considering having subrepo customer (branches a,b,c). machine (branches M,N). parts (1,2,3,4,5)
Are there techniques for making change propagation easy while also keeping some differences permanent? Maybe something like this:
TipsAndTricks.

Based on your question and clarifying comment, what you want is both impossible and not really what hg is supposed to do.
Everything I am about to say, I say as someone who has made exactly the kind of mistake you are about to make and who has lived to regret it (and slowly, painfully, try to undo it).
As I said in the comments, this seems like a faulty design. If you want something to be part of a code base only for some purposes, then that's a good indicator that what you really need is one of the following:
more configurability to turn features on or off,
a custom build script to assemble various versions of the project, or
a core code base and several customer- or purpose-specific code bases for plugins and add-ons.
You probably don't want customer-specific functionality---and definitely don't want customer-specific data---in your main project's source tree. Anytime you are about to commit a change that includes either customer configurations or code that only one customer will ever use, you need to step back and ask (1) why it should be in your core project instead of a plugin and (2) what the implications would be if another customer or third party ever got access to it by mistake.
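As a rough illustration of the first option (more configurability), the sketch below keeps a single code base and layers per-customer overrides on top of shared defaults. Everything in it is hypothetical: the setting names, DEFAULTS, and load_settings are invented for the example, and the overrides would live outside the core source tree.

```python
from typing import Any, Dict

# Shared defaults that ship with the core project (hypothetical settings).
DEFAULTS: Dict[str, Any] = {
    "theme": "default",
    "export_pdf": False,
    "max_users": 50,
}


def load_settings(overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Return the defaults with one customer's overrides applied on top."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        # Fail loudly so a typo never silently creates a new setting.
        raise KeyError(f"unknown settings: {sorted(unknown)}")
    # Build a new dict; the shared defaults are never mutated.
    return {**DEFAULTS, **overrides}


if __name__ == "__main__":
    print(load_settings({"theme": "customer-a", "export_pdf": True}))
```

The core code only ever reads the merged settings, so nothing customer-specific has to be committed to the main project.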

I see at least two possible ways:
Keep customer-specific changes in separate changesets, and move common changes from a customer-specific branch into mainline not with merges but with hg graft, cherry-picking only the changesets you need.
Use the MQ extension: keep customer-specific changes in MQ patches (one patch per customer), apply or unapply them individually on demand, and just merge branches.
In the latter case, separate named branches may not be needed at all (you have default plus a different patch applied for each customer). At least, I moved to this workflow when I had to maintain a set of clones with only slightly different configurations and merging from default into each branch became tedious.
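For the graft approach, the command sequence can be sketched as below. This is only an illustration: the helper graft_commands and the revision numbers are made up, and the script merely builds and prints the hg invocations (hg update, then hg graft -r REV) rather than running them.

```python
from typing import List, Sequence


def graft_commands(target_branch: str, revisions: Sequence[str]) -> List[List[str]]:
    """Build the hg commands that cherry-pick revisions onto target_branch."""
    if not revisions:
        raise ValueError("at least one revision is required")
    graft = ["hg", "graft"]
    for rev in revisions:
        # hg graft accepts -r REV, repeated once per changeset to copy.
        graft += ["-r", rev]
    # First switch the working directory to the branch receiving the changes.
    return [["hg", "update", target_branch], graft]


if __name__ == "__main__":
    # Hypothetical revision numbers of the common fixes on a customer branch.
    for argv in graft_commands("default", ["42", "45"]):
        print(" ".join(argv))
```

To actually execute the commands you could pass each list to subprocess.run(argv, check=True) inside the repository. This only works cleanly if the common fixes really are separate changesets, as described above.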

Mercurial Repository Architecture for Code Reviews

We are in the process of moving to Mercurial from Clearcase (for version control) and to Jira/Crucible from ClearQuest (for issue tracking and code reviews). We perform mandatory pre-push reviews.
We have encountered a problem with Crucible and pre-push support, and we are looking at several solutions. The main way to resolve the problem is to make the Atlassian products "watch" as few repositories as possible (the issue we encountered is slowness that is directly linked to the number of repositories watched).
What we do now is watch every single development repository to allow us to perform code reviews on them. We also have one central repository that holds a stable version. My question is how to plan our repository architecture so we can perform code reviews and still keep a clean central repository (I guess some sort of review repository is needed, but I can't figure out how to get it to work for several reviews at once).

We do pre-push reviews the easy way: we use patches instead of having development repositories on a central server.
Only if we need to build something big, we create a development/feature repository on the server, but even then, we still review patches before pushing to those repos.
To enforce this, you need to assign roles for pushing to the repos, instead of allowing the whole development team to push.

Multiple Mercurial repositories or a single one with clones

We are working on a big project organized in 4 teams (website, server, applet, deployment). Sometimes members of one team have to make minor changes in the code of another team (for instance: a member of the applet team has to add a field in the website, or a member of the server team has to change a deployment script).
For the moment every team has its own mercurial repository, and we are using version number to organize dependencies between teams. (The applet version 3.4 needs the website version 1.7)
I think that our mercurial organisation is not optimal, my idea is to have one big mercurial repository MASTER, which will be cloned in WEBSITE, SERVER, APPLET and DEPLOYMENT. Each member of each team can have access to the code of others and they are not impacted by the commit of other teams because of the different clones.
What does the SO community think about that?

You should maintain separate repositories for separate components, and other separate repos for any pieces shared between components (libraries if you will). Then use the subrepository feature to have the component repos include the shared stuff.
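Mercurial reads subrepository mappings from a .hgsub file in the parent repository, one "path = source" entry per line. The sketch below just renders such a file from a dictionary; the paths and example.com URLs are placeholders, and render_hgsub is a hypothetical helper, not a Mercurial API.

```python
from typing import Dict


def render_hgsub(subrepos: Dict[str, str]) -> str:
    """Render .hgsub content: one 'path = source' line per subrepository."""
    # Sort by path so the generated file is stable across runs.
    lines = [f"{path} = {source}" for path, source in sorted(subrepos.items())]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder layout: a component repo that includes two shared libraries.
    print(render_hgsub({
        "libs/shared": "https://example.com/hg/shared",
        "libs/protocol": "https://example.com/hg/protocol",
    }), end="")
```

Each component repo would carry its own .hgsub listing only the shared pieces it actually uses.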
Merging everything into one large repo works well in svn, but in Mercurial, or any DVCS, you're better off with a composition of smaller repos.

Creating multiple heads in remote repository

We are looking to move our team (~10 developers) from SVN to mercurial. We are trying to figure out how to manage our workflow. In particular, we are trying to see if creating remote heads is the right solution.
We currently have a very large repository with multiple, related projects. They share a lot of code, but pieces of the project are deployed by different teams (3 teams) independent of other portions of the code-base. So each team is working on concurrent large features.
We currently handle this in SVN with branches. Team1 has a branch for Feature1, and the same goes for the other teams. When Team1 finishes their change, it gets merged into the trunk and deployed. The other teams follow suit when their projects are complete, merging of course.
So my initial thought is to use named branches for these situations. Team1 makes a Feature1 branch off the default branch in Hg. Now, here is the question: should the team PUSH that branch, in its current, half-finished state, to the repository? This will create a second head in the core repo.
My initial reaction was "NO!" as it seems like a bad idea. Handling multiple heads on our repository just sounds awful, but there are some advantages...
First, the teams want to set up Continuous Integration to build this branch during their development cycle (months long). This will only work if the CI can pull this branch from the repo. This is something we do now with SVN: copy a CI build and change the branch. Easy.
Second, it makes it easier for any team member to jump onto the branch and start working. Without pushing to the core repo, they would have to receive a push from a developer on that team with the changeset information. It is also possible to lose local commits to hardware failure. The chances increase a lot if it's a branch by a single developer who has followed the "don't push until finished" approach.
And lastly is just for ease of use. The developers can easily just commit and push on their branch at any time without consequence (as they do today, in their SVN branches).
Is there a better way to handle this scenario that I may be missing? I just want a veteran's opinion before moving forward with the strategy.
For bug fixes we like the general workflow of Mercurial, anonymous branches that only consist of 1-2 commits. The simplicity is great for those cases.
By the way, I've read this great article, which seems to favor named branches.

You're definitely thinking about this right, and it sounds like you're going down a good path. I'm a branches as clones person, but named branches have come a long way.
Having a central-ish repo to which all named branches are pushed is convenient for control and backups. Teams working on only branch X can easily create their own branch X only repo by doing hg clone -r X central-ish repo.
The best thing you can do to help the teams out is to let them make clones themselves somewhere that's sitting behind an hgwebdir.cgi instance (as, presumably, your central-ish repo will be). You'll find not just teams, but sub-teams and pairs of teams will set up their own repos for mini-efforts you never knew about. They'll put them on the named branches that make sense to them and merge back into central as appropriate.

I would decide whether these three projects should go into one repository based on the coupling between them (and how many patches are exchanged between them). The more independent they are, the smaller the advantages of having them in one repo (backup and management aside). There are a few different kinds of setup:
As you showed: one repository, with one branch for the shared code and one branch for each project. When the projects themselves are created by forking the shared code base, care must be taken when merging back to common (cherry-picking). When updates to common made while working on a project are committed directly on top of the common branch and then merged into the project branch, chances are good they can also be merged back into common. But if changes to common are developed on top of the project branch, merging back will require cherry-picking. I don't have experience with such a setup, but I fear that the merges can get problematic.
One repo for the shared code and one for each project, connected by symlinks or as a subrepo. Here care must be taken not to step on each other's feet. In my experience this kind of usage has the potential to grow into a very big PITA. OTOH you seem to have this setup already and your fellow developers can work with it.
one repo for shared and one for each project, with the code from the shared one used as internal releases. I would go for this setup when there are not big regular changes on the shared code base.
All these situations can also be combined with customization-branches for each project within the common part. But I would try to minimize the number of currently active branches, since every new branch requires additional care of merges.
I'm sorry not to give a concrete answer, but "The right thing" (TM) depends too much on the local details.

Mercurial setup: One central repo or several?

My company is switching from Subversion to Mercurial. We're using .NET for our product. We have a solution with about a dozen projects that are separate modules with no dependencies on each other. We're using a central repo on a server with push/pull for our integration build.
I'm trying to figure out if I should create one central repo with all the projects in it, or if I should create a separate repo for each project. One argument for separate repos is that branching the individual modules would be easier, but an argument for a single repo is easier management and workflow.
I'm very new to hg and DVCS, so some guidance is greatly appreciated.
ETA: At hginit.com, Joel says:
[I]f you’re used to having one big gigantic repository for the whole company, where some people only check out and work on subdirectories that they care about, this isn’t a very good way to work with Mercurial—you’re better off having lots of smaller repositories for each project.
It'd be great if someone could expand on this or point me to more documentation.

One thing you should take into consideration here is the fact that Mercurial does not support checking out directories like subversion does. One typical subversion setup is to have one giant repo with multiple separate projects in it, and when somebody needs code they will just checkout a subdirectory containing that project. You can't do this in mercurial. You either take the whole repo, or nothing. If everybody working on these projects does not need all the code, all the time, you might want to split it up into separate repositories.
EDIT: This link might be helpful in setting things up, in particular the "Publishing Multiple Repositories" section.

If completely separate repos don't work for you, maybe have each project as a subrepo of some umbrella repo. I have to say that separate repos sound like what you need, though, given that each project sounds totally independent.

I'm fairly new to Mercurial myself (my company is making the leap from SourceSafe), so I don't know what someone with more experience would say.
For me it makes sense to have one repository per Visual Studio Solution. If your modules are truly not dependent on each other, why are they all in the same solution? If you have a good reason for them all being in one solution, then that's probably the reason to keep them in one repository. If there's not a good reason for them to be in one solution, then a repository and a solution for each makes more sense to me.
Edit: So, since all the modules are built together and need to integrate, that would push me towards a single solution and a single repository.
Mercurial does a great job of merging, but the one thing I've had issues with is the solution file when merging the addition of more than one project at a time. It gets confused with multiple End Project lines. So, as long as you aren't adding new projects very often, your merges should be smooth.

From my experience, and not based upon studies etc., I would say that each logical blob is a repository. If you share code between subprojects, they need to be in the same repo. There will be full subrepo functionality, but currently (April 2010) it's not fully implemented.
