Multiple Mercurial repositories or a single one with clones - mercurial

We are working on a big project organized in 4 teams (website, server, applet, deployment). Sometimes members of each teams have to make minor changes in the code of the other team (for instance: a member from the applet team has to add a field in the website, or a member of the server team has to change a deployment script).
For the moment every team has its own mercurial repository, and we are using version number to organize dependencies between teams. (The applet version 3.4 needs the website version 1.7)
I think that our mercurial organisation is not optimal, my idea is to have one big mercurial repository MASTER, which will be cloned in WEBSITE, SERVER, APPLET and DEPLOYMENT. Each member of each team can have access to the code of others and they are not impacted by the commit of other teams because of the different clones.
What the SO community think about that ?

You should maintain separate repositories for separate components, and other separate repos for any pieces shared between components (libraries if you will). Then use the subrepository feature to have the component repos include the shared stuff.
Merging everything into one large repo works well in svn, but in Mercurial, or any DVCS, you're better off with a composition of smaller repos.

Related

Storing PCB files *and* software files in the same Mercurial repo

I have my Banana Pi set up as my Mercurial server. It works well for me for my software as generally speaking I have firmware and that's about it in my repositories. I can access it via open VPN from anywhere in the world. However, I have started to use version control for my PCB files as well now, due to a new CAD system which complicates my old, crude but effective way of doing my PCB archiving and backup. (Also, everything in my new CAD system, all the PCBs and schamtics, are text files which makes version control work nicely.)
So, with Mercurial I started doing as I did with software and creating a new repo for my PCB for one of the boards I'm updating for a customer, and immediately came across an issue that svn seems to cope with easily and I was wondering whether Mercurial can do the same.
I have my BH0001 project repository which has all the embedded C in it and I have started creating a new issue of the PCB for which the C code is used. I had to create a new Mercurial repo called BH0001_pcb to differentiate between code and PCB. With svn you can have a project repo and then Hardware and Software directories within the project number, but still be able to check out the two different types of files to different places independently.
I could, of course, clone the BH0001 software repository to a local machine, add the PCB info in a new folder in the local Mercurial repo send it all back to the server and it would be perfectly happy. The problem then comes when checking out because I would be cloning both firmware and PCB on to a machine when I might only want one or the other.
Also, this goes against how I store stuff locally. In my /username/home directory I have a Software directory and a CAD directory and within those I have projects. So I would have:
home/CAD/CustomerName/BH0001
and
home/Software/CustomerName/BH0001.
If I'm to carry on using my current method do I have to:
Change my local directory structures to be something like:
home/Projects/CustomerName/BH0001/CAD
and
home/Projects/CustomerName/BH0001/Software
Suck it up and use things like ProjectName_pcb for separate repos.
Some other way I can't think of/can't find/am unaware of? e.g. There's a way of checking out part of a Mercurial repository to one directory and a different part of the repo to a different directory.
Or should I just use svn if I really want to carry on as I have?
With default mercurial you currently cannot do partial repository clones as you can do with SVN. So your approach to use separate repositories is a good choice.
However there ways to achieve a similar result: sub-repositories. In your case I'd create a parent repository which contains your two current repositories as sub-repositories. Mind though, sub-repositories have some rough edges, so read the linked page carefully - I'd like to especially stress that it's good practise to have a parent repo which basically only contains the 'real' repos but not much on its own.
There exist ideas like a thin or narrow clone (which is somewhat identical to what SVN does), but I haven't seen them in production.

VCS: managing multiple feature pull requests

Assuming I am expanding some project hosted on bitbucket with multiple features (managed by mercurial).
If I build up the features one on top of the other (linear local history) I have a local code base that has all the features I need, but the maintainer of the package cannot pick and choose the features he likes. (Because they build on each other.)
If I build up each feature in a separate branch based on the origin master, all the feature PRs are independent of each other (allowing the maintainer to pick and choose), but I no longer have a unified local code base with all my required features.
How does one solve this problem? With patch queues? If so, how?
I'd go for both actually: create a separate feature branch (anonymous heads, maybe named by means of a bookmark) and pull request for each feature.
Additionally, for your own benefit, and maybe also for others to check it all quickly, merge those into your main development branch, your main line.
In principle mercurial has the system of phases and allows for non-publishing repositories which allows to keep a draft history and allows you to easier make updates - but that's afaik something which bitbucket does not yet (fully?) support.

Mercurial Repository Architecture for Code Reviews

We are in the process of moving to Mercurial from Clearcase (for version control) and to Jira/Crucible from ClearQuest (for issue tracking and code reviews). We perform mandatory pre-push reviews.
We have encountered a problem with Crucible and pre-push support, and we are looking for several solutions. The main way to resolve the problem is to make Atlassian products "watch" as least amount of repositories as possible (the issue we encountered is slowness that is directly linked to the amount of repositories watched).
What we do now is watch every single development repository to allow us to perform code reviews on them. We also have one central repository that holds a stable version. My question is how to plan our repository architecture so we can perform code reviews and still keep a clean central repository (I guess some sort of review repository is needed, but I can't figure out how to get it to work for several reviews at once).
We do pre-push reviews the easy way: we use patches instead of having development repositories on a central server.
Only if we need to build something big, we create a development/feature repository on the server, but even then, we still review patches before pushing to those repos.
To enforce this, you need assign roles for pushing to the repos, instead of allowing all development team to push.

Mercurial: Granular Repositories Vs large Repositories and shared third party tools in version control

Scenario:
Various products made up combinations of the smaller projects. A few different versions of each product in dev, release and maintennace (bugs/patches/minor releases).
Most the the team use various third party tools and libraries for dev and for release (common two are XUnit for dev, and AutoMapper in product). They're fans of version controlling these tools/libraries where it makes sense.
I cannot seem understand the best way to organise the structure in mercurial. In the the central SVN style, I'd organise by having the the third party tools as their own projects and then having small builds for the projects that would grab the output of the other projects, and then a release project that would be built from the built projects. All would be in a hierarchy,
(dev branch)
Root/dev/ProjectX/
Root/dev/ProjectY/
Root/dev/ThirdParty/XXX -- could be a 3rd party lib
Root/dev/ThirdParty/YYY -- could be a 3rd party lib
(branch 1)
Root/release1/ProjectX/
Root/release1/ProjectY/
Root/release1/ThirdParty/XXX
Root/release1/ThirdParty/YYY
(branch 2)
Root/release2/ProjectX/
Root/release3/ProjectY/
Root/release2/ThirdParty/XXX
Root/release2/ThirdParty/YYY
etc.
Here comes the rub, due to the way that developers keep their machines upto date (using NUGET package manager) the third party items all have to be in the ThirdParty folder to to ensure that the devs don't have to have multiple copies of these libraries for each project.
My question is this:
If they implement mercurial should implement a simmilar strategy (big repo) and clone/branch here or should they break up the repository say at the project level and clone/branch these. In the latter case would they have a product/version branch/repo? I know they'd prefer a distributed model if it works better in the long term, even if the pain of learning a new workflow is hard initially.
I've read http://nvie.com/posts/a-successful-git-branching-model/ and a number of articles but I'm still unsure as to how to organise.
Basically, you make a separate repo for each product, each project, and each third-party library, and then you combine them as appropriate in subrepositories. On your central server (or servers), I would probably set up a structure of bare repos like this:
products/ProductA/
ProductB/
ProductC/
projects/ProjectA/
ProjectB/
ProjectC/
thirdparty/ThirdPartyA/
ThirdPartyB/
ThirdPartyC/
This way you have exactly one copy of each repo on your central server. You don't need to make separate repos for each branch, because mercurial can hold multiple branches in one repository.
Then, when someone checks out a Product in their local repository, you use the subrepo mechanism to also check out the working trees for the appropriate projects and third party libraries. On your local disk, the structure will look like this:
ProductA/
ProjectA/
ProjectB/
ThirdParty/
ThirdPartyC/
This looks different than on the central server because there you don't have working trees, only pointers into the subrepos.

Creating multiple heads in remote repository

We are looking to move our team (~10 developers) from SVN to mercurial. We are trying to figure out how to manage our workflow. In particular, we are trying to see if creating remote heads is the right solution.
We currently have a very large repository with multiple, related projects. They share a lot of code, but pieces of the project are deployed by different teams (3 teams) independent of other portions of the code-base. So each team is working on concurrent large features.
The way we currently handles this in SVN are branches. Team1 has a branch for Feature1, same deal for the other teams. When Team1 finishes their change, it gets merged into the trunk and deployed out. The other teams follow suite when their project is complete, merging of course.
So my initial thought are using Named Branches for these situations. Team1 makes a Feature1 branch off of the default branch in Hg. Now, here is the question. Should the team PUSH that branch, in it's current/half-state to the repository. This will create a second head in the core repo.
My initial reaction was "NO!" as it seems like a bad idea. Handling multiple heads on our repository just sounds awful, but there are some advantages...
First, the teams want to setup Continuous Integration to build this branch during their development cycle (months long). This will only work if the CI can pull this branch from the repo. This is something we do now with SVN, copy a CI build and change the branch. Easy.
Second, it makes it easier for any team member to jump onto the branch and start working. Without pushing to the core repo, they would have to receive a push from a developer on that team with the changeset information. It is also possible to lose local commits to hardware failure. The chances increase a lot if it's a branch by a single developer who has followed the "don't push until finished" approach.
And lastly is just for ease of use. The developers can easily just commit and push on their branch at any time without consequence (as they do today, in their SVN branches).
Is there a better way to handle this scenario that I may be missing? I just want a veteran's opinion before moving forward with the strategy.
For bug fixes we like the general workflow of Mercurial, anonymous branches that only consist of 1-2 commits. The simplicity is great for those cases.
By the way, I've read this, great article which seems to favor Named branches.
You're definitely thinking about this right, and it sounds like you're going down a good path. I'm a branches as clones person, but named branches have come a long way.
Having a central-ish repo to which all named branches are pushed is convenient for control and backups. Teams working on only branch X can easily create their own branch X only repo by doing hg clone -r X central-ish repo.
The best thing you can do to help the teams out is to let them do clones themselves somewhere that's sitting behind a hgwebdir.cgi instance (as, presumably your central-ish repo will be). You'll find not just teams, but sub-teams and pairs of teams will set up their own repos for mini-efforts you never new about. They'll put them on the named branches that make sense to them and merge back into central as appropriate.
I would make the decision if these three projects should go into one repository by the coupling between these projects (and how many patches are interchanged within them). The more independent they are the less are the advantages of having them in one repo (backup and management aside). There are some different kind of setups:
As you showed, one repository, with one branch for the shared code, and one branch for each project. When the projects itself are generated by forking the shared code base care must be taken when merging back to common (cherry-picking). When inside of each project-branch updates to the common-branch are generated as direct ancestors of the common-branch, and get merges into the project-branch, chances are good they can also be merged back into common. But if changes to common are developed on top of the project branch, merging back will require cherry-picking. I don't have experiences with such a setup, but I fear that the merges can get problematic.
one repo for the shared code and one for each project, connected by symlinks or as subrepo. Here care must be taken to not step on each others feed. In my experience this kind of usage has the potential to grow into a very big PITA. OTOH you seem to have this setup already and your fellow developers can work with it.
one repo for shared and one for each project, with the code from the shared one used as internal releases. I would go for this setup when there are not big regular changes on the shared code base.
All these situations can also be combined with customization-branches for each project within the common part. But I would try to minimize the number of currently active branches, since every new branch requires additional care of merges.
I'm sorry to not give a concrete answer, but "The right thing" (TM) depends to much on the local details.