My question is similar to: Mercurial branches with different codebase
But the solution given there was to do all the work on the one customer branch and merge it to default. I don't see how that is workable in my case.
I have a project that gets distributed to 4 customers. I've set up a named branch for each customer. What is an effective way of merging changes to common code, while leaving alone some customer-specific data and/or requirements?
Edit:
I have customers a, b, c; machines M, N; and parts 1, 2, 3, 4, 5.
Right now I have subrepos a, b, c, M, N, 1, 2, 3, 4, 5 and repos aM1, aM2, bM1, bN1, ... . I am considering having three subrepos instead: customer (branches a, b, c), machine (branches M, N), and parts (1, 2, 3, 4, 5).
Are there techniques for making change propagation easy while also keeping some differences permanent? Maybe something like this:
TipsAndTricks.
Based on your question and clarifying comment, what you want is both impossible and not really what hg is supposed to do.
Everything I am about to say, I say as someone who has made exactly the kind of mistake you are about to make and who has lived to regret it (and slowly, painfully, try to undo it).
As I said in the comments, this seems like a faulty design. If you want something to be part of a code base only for some purposes, then that's a good indicator that what you really need is one of the following:
more configurability to turn features on or off,
a custom build script to assemble various versions of the project, or
a core code base and several customer- or purpose-specific code bases for plugins and add-ons.
You probably don't want customer-specific functionality (and definitely don't want customer-specific data) in your main project's source tree. Any time you are about to commit a change that includes either customer configurations or code that only one customer will ever use, you need to step back and ask (1) why it should be in your core project instead of a plugin and (2) what the implications would be if another customer or third party ever got access to it by mistake.
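As a rough illustration of the "core code base plus plugins, switched on by configuration" idea, here is a minimal Python sketch. Every name in it (PLUGINS, CustomerConfig, build_report, the two example plugins) is hypothetical and invented for this example; it only shows the shape of the approach, not how any particular project does it.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical registry of optional features, keyed by name.
PLUGINS: Dict[str, Callable[[str], str]] = {}


def plugin(name: str):
    """Decorator that registers an optional feature under a name."""
    def register(func: Callable[[str], str]) -> Callable[[str], str]:
        PLUGINS[name] = func
        return func
    return register


@plugin("uppercase_report")
def uppercase_report(text: str) -> str:
    return text.upper()


@plugin("signed_report")
def signed_report(text: str) -> str:
    return text + "\n-- generated by core"


@dataclass
class CustomerConfig:
    """Per-customer settings, kept outside the core source tree."""
    name: str
    enabled_plugins: List[str] = field(default_factory=list)


def build_report(config: CustomerConfig, body: str) -> str:
    """Shared core logic; customer differences come only from the config."""
    text = f"Report for {config.name}: {body}"
    for plugin_name in config.enabled_plugins:
        text = PLUGINS[plugin_name](text)
    return text


customer_a = CustomerConfig("a", ["uppercase_report"])
customer_b = CustomerConfig("b", ["signed_report"])
report_a = build_report(customer_a, "ok")
report_b = build_report(customer_b, "ok")
```

With this layout there is a single line of history for the core, and each customer's differences live in a small configuration object (or a separate plugin repository) rather than in a permanently divergent branch.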
I see at least two possible ways:
Keep customer-specific changes in separate changesets, and move common changes from the customer-specific branch into mainline not with merges but with hg graft, cherry-picking only the changesets you need.
Use the MQ extension: keep customer-specific changes in MQ patches (one patch per customer) that are applied or unapplied individually on demand, and simply merge branches.
In the latter case, separate named branches may not be needed at all (you have default plus a different patch applied for each customer). At least, I moved to this workflow when I had to maintain a set of clones with only slightly different configurations and merging from default into each branch became tedious.
Related
I've thought this through and I think I understand the implications, but I wanted to get a sanity check because the caveats on https://www.mercurial-scm.org/wiki/ShareExtension are pretty general.
Specifically, the warning is "It's probably not a good idea to mix MQ and shared clones; if you do so, you should definitely avoid pushing/popping patches in one clone while another clone has patches applied."
However, based on my understanding of how MQ works, it's only unsafe to push/pop patches (create/destroy history) if you have two shares whose working directory parent would be affected by such changes. That is, if you have two shares that are updated to separate named branches, pushing/popping patches from one should only have the effect on the other of creating/destroying history that is unrelated to the working directory and thus should NOT have any undesirable side effects.
There will be small side effects, such as revision sequence number changes in some situations, but nothing that should jeopardize correctness or cause problems with the working directory.
Is this correct or am I missing something?
I'm not sure about this, but AFAIK if you end up having the exact same file content in both branches (repos), this might still wind up as shared storage and wreak havoc.
This is obviously not a definitive answer, but I just wanted to report back in case anyone else is interested in this situation. I've been running for several months now with multiple shares on a "central copy" of a large repo, with each share being dedicated to its own branch, and using MQ freely within each share. I have not hit any problems. History changes on other branches just look the same as pulls/strips would -- unrelated changesets being added, modified, and removed.
We are using Mercurial to manage a project, but we now want to create a "Lite" version of the same project (i.e. a version with some of the functionality removed or simplified).
Since the Lite version will share most of its code with the Full version, we are considering whether it is better to either:
Create a clone of the original project and keep each project separate.
Use named branches to maintain both versions of the project within the same repository.
We are fairly new to version control software and this will be the first time that we have used named branches. Can someone please help outline the pros and cons of each approach? Which approach would make it easiest to maintain bug fixes between the two projects?
Thanks,
Andrew
You can use any of 4 methods. They offer more or less the same functionality and differ only in the commands used to sync the codebase.
With the "separate clones" solution, "...keep each project separate" is a Bad Thing (tm) and a violation of the DRY principle: the re-used code (the Lite version) must be maintained in a single place, with changes to the core pulled from the Lite repo into the Full repo.
Note:
When I wrote about 4 solutions, I had in mind, besides clones and named branches, also:
MQ (the Lite version is changesets in the repository; the Full version is MQ patch(es) on top of Lite in the same repo)
Subrepositories|Guestrepo (Lite version is superrepo, Full version functionality is one or more subrepositories in superrepository)
Having the same problem, I stumbled across this question. Now I found a solution which I'd like to share with you.
Suppose we have two devices, A and B, both having essentially the same firmware, but with certain differences which are supposed to be retained. Thus the two versions are retained in two branches - A and B.
If I now make a change to one of the versions, I can merge it over to the other one under a certain condition.
The condition is that the common predecessor (common base) must belong to the "giving" branch.
In order to ensure this, you can do
hg debugsetparents . <other branch before the modifications>
After doing this, the current working set looks like a merge of the two branches, except that the data remains unchanged, i.e. "we" keep "our" stuff.
After this, you can do a "real" merge with the other branch after the modifications.
The result of doing so is that you get exactly the difference between the manually-created common base in the giving branch and the final state of the giving branch, resulting in exactly the modifications you want to get.
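To see why the choice of common base matters, here is a toy Python model of a three-way merge over key/value settings. It is only a sketch of the general three-way rule ("take whichever side changed relative to the base"), not Mercurial's actual merge code; the function name three_way_merge and the firmware settings are made up for illustration.

```python
def three_way_merge(base, ours, theirs):
    """Toy three-way merge of dicts mapping setting -> value.

    If only one side changed a key relative to base, that change wins.
    If both sides changed it differently, the key is reported as a
    conflict and our value is kept.
    """
    merged, conflicts = {}, []
    for key in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(key), ours.get(key), theirs.get(key)
        if o == t or t == b:
            value = o          # they did not change it (or both agree)
        elif o == b:
            value = t          # only they changed it
        else:
            conflicts.append(key)
            value = o
        if value is not None:
            merged[key] = value
    return merged, conflicts


# Hypothetical firmware settings for devices A and B.
ancestor = {"device": "generic", "baud": 9600, "timeout": 5}
a_before = {"device": "A", "baud": 9600, "timeout": 5}
a_after = {"device": "A", "baud": 9600, "timeout": 10}   # the new change
b_ours = {"device": "B", "baud": 115200, "timeout": 5}

# Merging with the original ancestor drags in A's permanent differences.
plain_merge, plain_conflicts = three_way_merge(ancestor, b_ours, a_after)

# Merging with "A before the modifications" as base carries over only the
# timeout change, which is the effect the debugsetparents trick aims for.
rebased_merge, rebased_conflicts = three_way_merge(a_before, b_ours, a_after)
```

In this model, the merge against the old ancestor conflicts on the device setting, while the merge against the manually chosen base leaves B's own settings alone and picks up just the timeout change.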
What is a good workflow for using Mercurial with two long-running branches that are slightly divergent (i.e. I never intend to entirely merge them back together)?
In my case, this is CMS software that has been customized differently for two different web sites. I started with projectA, and once that was working, cloned it to projectB and made further tweaks to both A and B to customize them. Now I want to develop some features that show up in both A and B, without merging the site-specific customizations. How?
hg push will push everything, so that won't work
Transplant appears to give me different changeset hashes, which worries me
I feel like maybe the repositories should be set up differently, but I'm not sure how.
As Thilo comments, the common part would be best developed (and published in A and B) as a third repo declared as a SubRepo.
That way, you respect the first two repos which are independent (one evolution on A doesn't always mean an evolution on B), and you can develop the common part in subrepo C.
A solution for Mercurial might be to put the differing areas in files that are listed in .hgignore, but then they won't be versioned, so that may not be so good.
Another way is to use just one repo with a global flag, and use template A or B depending on the flag, and/or include a different source file depending on the flag. If the difference is small, you can use if-then-else inside the same file.
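A minimal Python sketch of the single-repo, global-flag idea might look like the following. The environment variable SITE_FLAG, the TEMPLATES table, and render_greeting are all assumptions made up for this example; a real CMS would have its own configuration and templating mechanism.

```python
import os
from string import Template

# Hypothetical per-site templates that live side by side in one repo.
TEMPLATES = {
    "A": Template("Welcome to Site A, $user!"),
    "B": Template("Hello $user, this is Site B."),
}


def render_greeting(user, site=None):
    """Pick the template by flag; fall back to an assumed SITE_FLAG env var."""
    site = site or os.environ.get("SITE_FLAG", "A")
    text = TEMPLATES[site].substitute(user=user)
    # Small differences can be a plain if/else inside the shared code.
    if site == "B":
        text += " (beta features enabled)"
    return text


greeting_a = render_greeting("kim", site="A")
greeting_b = render_greeting("kim", site="B")
```

Both sites then share every commit, and the only thing that differs per deployment is the value of the flag.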
You can use hg push to push the changes back together, but you don't necessarily have to merge all the changesets into the trunk. Just take the ones you want.
As stated above, a subrepo is probably the best option. Another alternative would be to have a third branch with the common work, and merge from that branch to projectA and projectB (but never back to the common branch).
This alternative is more likely to have accidents (merging the wrong way) but you might find that it is easier to set up and get working quickly.
We are looking to move our team (~10 developers) from SVN to Mercurial. We are trying to figure out how to manage our workflow. In particular, we are trying to see if creating remote heads is the right solution.
We currently have a very large repository with multiple, related projects. They share a lot of code, but pieces of the project are deployed by different teams (3 teams) independent of other portions of the code-base. So each team is working on concurrent large features.
The way we currently handle this in SVN is with branches. Team1 has a branch for Feature1, and the same goes for the other teams. When Team1 finishes their change, it gets merged into the trunk and deployed. The other teams follow suit when their projects are complete, merging of course.
So my initial thought is to use named branches for these situations. Team1 makes a Feature1 branch off of the default branch in Hg. Now, here is the question: should the team PUSH that branch, in its current, half-finished state, to the repository? This will create a second head in the core repo.
My initial reaction was "NO!" as it seems like a bad idea. Handling multiple heads on our repository just sounds awful, but there are some advantages...
First, the teams want to set up Continuous Integration to build this branch during their development cycle (months long). This will only work if the CI can pull this branch from the repo. This is something we do now with SVN: copy a CI build and change the branch. Easy.
Second, it makes it easier for any team member to jump onto the branch and start working. Without pushing to the core repo, they would have to receive a push from a developer on that team with the changeset information. It is also possible to lose local commits to hardware failure. The chances increase a lot if it's a branch by a single developer who has followed the "don't push until finished" approach.
And lastly is just for ease of use. The developers can easily just commit and push on their branch at any time without consequence (as they do today, in their SVN branches).
Is there a better way to handle this scenario that I may be missing? I just want a veteran's opinion before moving forward with the strategy.
For bug fixes we like the general workflow of Mercurial, anonymous branches that only consist of 1-2 commits. The simplicity is great for those cases.
By the way, I've read this great article, which seems to favor named branches.
You're definitely thinking about this right, and it sounds like you're going down a good path. I'm a branches-as-clones person, but named branches have come a long way.
Having a central-ish repo to which all named branches are pushed is convenient for control and backups. Teams working on only branch X can easily create their own branch X only repo by doing hg clone -r X central-ish repo.
The best thing you can do to help the teams out is to let them make clones themselves somewhere that's sitting behind a hgwebdir.cgi instance (as, presumably, your central-ish repo will be). You'll find that not just teams, but sub-teams and pairs of teams will set up their own repos for mini-efforts you never knew about. They'll put them on the named branches that make sense to them and merge back into central as appropriate.
I would decide whether these three projects should go into one repository based on the coupling between them (and how many patches are exchanged between them). The more independent they are, the smaller the advantages of having them in one repo (backup and management aside). There are a few different kinds of setup:
As you showed, one repository, with one branch for the shared code and one branch for each project. When the projects themselves are created by forking the shared code base, care must be taken when merging back to common (cherry-picking). When updates to the common branch are made inside a project branch as direct descendants of the common branch and then get merged into the project branch, chances are good they can also be merged back into common. But if changes to common are developed on top of the project branch, merging back will require cherry-picking. I don't have experience with such a setup, but I fear that the merges can get problematic.
One repo for the shared code and one for each project, connected by symlinks or as a subrepo. Here care must be taken not to step on each other's feet. In my experience this kind of usage has the potential to grow into a very big PITA. OTOH you seem to have this setup already and your fellow developers can work with it.
One repo for shared and one for each project, with the code from the shared one used as internal releases. I would go for this setup when there are no big, regular changes to the shared code base.
All these situations can also be combined with customization-branches for each project within the common part. But I would try to minimize the number of currently active branches, since every new branch requires additional care of merges.
I'm sorry not to give a concrete answer, but "The right thing" (TM) depends too much on the local details.
I have several projects with a very large overlapping code base. We've just recently started using SVN so I'm trying to figure out how I should be using it.
The problem is that as I'm finishing a task on one project, I'm starting a task on another, with some overlap. Often there's a lot of interrupt driven development as well. So, my code is never really in a completely stable state that I feel comfortable checking in.
The result is that we're not really using the VC system, which is a VERY bad thing, we all know... so, suggestions?
Check out a personal branch of the code and merge in changes. At least you will have some version control for your own changes, in case you need to roll back. Once you are comfortable with the state that your branch is in, merge that branch back into the trunk.
You can also check out a branch for each task, instead of one for each individual. You can also merge changes to your branch from the trunk if someone changes the trunk, and you want your branch to reflect the changes.
This is a common way to use SVN, although there are other workflows. I have worked on projects where I was afraid to commit (I might break the build) because we did not use branching effectively.
Branching is really powerful in helping your workflow. Use it until you're comfortable with the idea of merging.
Edit: 'Checking out a branch' refers to creating branch in your branches folder, and then checking out that branch. The standard svn repository structure consists of the folders trunk, tags, and branches at the root.
So, my code is never really in a completely stable state that I feel comfortable checking in.
Why is that?
If your branch is appropriate for your work (with a good naming convention for instance), everyone will know its HEAD is not always stable.
In this kind of "working" branch, just put some tag along the way to indicate some "stable code points" (which can then be queried by any tester to be deployed).
Any other version on that working branch is just made to record changes, even though the current state is not stable.
Then later you merge everything into a branch that is supposed to represent a stable state.
In TFS, you are able to create 'Shelf Sets' (I'm not sure what they'd be called in other source control providers). When you shelve some code, you are saving it to your repository, but not checking it in.
The reason this is important is that if you are working on Bug XXXX and have fixed half of the code, so it's not stable and not 'check-in-able', and you then get assigned to NewFeature YYYY, you SHOULD NOT continue working with the same code base. You should 'Shelve' your Bug XXXX code, then return your local codebase to the latest checked-in code, and implement NewFeature YYYY.
This way you are keeping your check-ins atomic. You don't have to worry about losing your work, because it is still held by the repository (so if your computer bursts into flames, you don't have to burst into tears), and you aren't mixing your fixes for XXXX with your new code for YYYY.
Then, once you are asked to go back to XXXX (assuming you've checked in YYYY) you can just unshelve your 'shelf set' and jump right back into it where you left off.
Either accept that the code in SVN is not in a completely stable state and check it in anyway (and reserve time for stabilization and refactoring every X days/weeks so the code doesn't degrade too much).
Or force your team to work in a more structured way with minimal interruption based development so you can check in good code.
The first option is not ideal (but better than no source control), the second is probably impossible - there is no third option.
If you don't have time to get the code to a stable state, you definitely don't have the time to branch and merge all the time.
In distributed source control systems like Git, you commit to your local repository. Your code is only 'committed' to the remote repository when you push it.
This way, it's much easier to save your work in between.