MQ vs. branches in Mercurial - mercurial

I've been working with Mercurial now for some time. When making (private) changes to some third party software, in the past I always created a separate named branch for these changes. When the upstream code updates, I simply merge it into my named branch.
Today I read about MQ (Mercurial Queues - chapters 12 and 13). I think I understood the concept behind MQ, so my question is:
Is there any advantage of MQ over (named) branches in Mercurial (for my scenario)?

The main advantage of MQ over named branches are:
You can revise your patches. This lets you edit history and so you can maintain a clean and logical series of patches on top of the upstream code: if you notice a mistake in a patch you refresh the patch instead of making a new commit.
The changes in your patches will be cleanly separated from the changes made upstream. When you merge two branches, you mixing the two streams of development. This makes it difficult to see the changes you've made without also seeing the changes coming in from the upstream branch.
The patch names are transient. When you hg qfinish an applied patch, there's no trace of the patch name left in the commit. So you can use MQ without coordinating first with the upstream repository since they'll never notice MQ.
You avoid merges. Instead of merging with the latest code from upstream, you rebase your applied patches. This gives you a simpler history. The history is obviously fake since you pretend that you made all your patches after seeing the code from upstream — when infact you made it in parallel with upstream and later moved your patches to the tip of upstream.
You have no permanent branch name in the changesets. People sometimes treat named branches as disposable and become upset when they realize that a named branch is fixed in history. (You can actually set the branch name with hg branch before pushing patches so this point is not so bad.)
Disadvantages of MQ are:
It's an extra tool to learn. It's powerful, but it also gives you more opportunity to shoot yourself in the foot. Running hg qdelete will really delete the patch and so you can throw away data. (I think this is fine, but we've had a Git user coming to our mailinglist complaining about this.)
You make it much harder to collaborate with others. You can turn .hg/patches into a repository and push/pull patches around between repositories but it's difficult to do that if you're more than a single developer. The problem is that you end up merging patches if more than one persons refreshes the same patch.
You have no permanent branch name in the changesets. If you're using named branches right and use stable, long-term branch names, then you will miss that when using MQ.

Good question. It depends. Personally I dislikes mercurial branching system, and I try to avoid it when I can (using pushed bookmarks instead of branch).
MQ is a great tool, with great power and great pitfalls. You can also consider using pbranch.
MQ is a great tool if you need to produce and maintain a patch-set for a project, something like adding feature-x to a project and keeping patches updated with the upstream code.
Bookmarks (or branches if you like) are good for short-development task that require to be merged into the upstream code.

Related

Mercurial: devs work on separate folders, why do they have to merge all the time

I have four devs working in four separate source folders in a mercurial repo. Why do they have to merge all the time and pollute the repo with merge changesets? It annoys them and it annoys me.
Is there a better way to do this?
Assuming the changes really don't conflict, you can use the rebase extension in lieu of merging.
First, put this in your .hgrc file:
[extensions]
rebase =
Now, instead of merging, just do hg rebase. It will "detach" your local changesets and move them to be descendants of the public tip. You can also pass various arguments to modify what gets rebased.
Again, this is not a good idea if your developers are going to encounter physical merge conflicts, or logical conflicts (e.g. Alice changed a feature in file A at the same time as Bob altered related functionality in file B). In those cases, you should probably use a real merge in order to properly represent the relevant history. hg rebase can be easily aborted if physical conflicts are encountered, but it's a good idea to check for logical conflicts by hand, since the extension cannot detect those automatically.
Your development team are committing little and often; this is just what you want so you don't want to change that habit for the sake of a clean line of commits.
#Kevin has described using the rebase extension and I agree that can work fine. However, you'll also see all the work sequence of each developer squished together in a single line of commits. If you're working on a stable code base and just submitting quick single-commit fixes then that may be fine - if you have ongoing lines of development then you might not won't want to lose the continuity of a developer's commits.
Another option is to split your repository into smaller self-contained repositories.
If your developers are always working in 4 separate folders, perhaps the contents of these folders can be modularised and stored as separate Mercurial repositories. You could then have a separate master repository that brought all these smaller repositories together within the sub-repository framework.
Mercurial is distributed, it means that if you have a central repository, every developer also has a private repository on his/her workstation, and also a working copy of course.
So now let's suppose that they make a change and commit it, i.e., to their private repository. When they want to hg push two things can happen:
either they are the first one to push a new changeset on the central server, then no merge will be required, or
either somebody else, starting from the same version, has committed and pushed before them. We can see that there is a fork here: from the same starting point Mercurial has two different directions, thus a merge is required, even if there is no conflict, because we do not want four different divergent contexts on the central server (which by the way is possible with Mercurial, they are called heads and you can force the push without merge, but you still have the divergence, no magic, and this is probably not what you want because you want to be able to checkout the sum of all the contributions..).
Now how to avoid performing merges is quite simple: you need to tell your developers to integrate others changes before committing their own changes:
$ hg pull
$ hg update
$ hg commit -m"..."
$ hg push
When the commit is made against the latest central version, no merge should be required.
If they where working on the same code, after pull and update some running of tests would be required as well to ensure that what was working in isolation still works when other developers work have been integrated. Taking others contributions frequently and pushing our own changes also frequently is called continuous integration and ensures that integration issues are discovered quickly.
Hope it'll help.

HgFlow and multiple developers

I really like the Hg Flow for Mercurial repositories. we are currently using Bitbucket, and in each product multiple developers are working. basically they can work as below:
a team might work on a single feature.
another team might work on a release/hot fix.
So do i keep the "develop" branch in BitBucket or local repositories. and how about feature branches, should i push them to the central repository and remove when required. i assume we should do so right?
Thanks
I personally neither use git flow or hg flow as tools, but I do use some of the methods for my own projects (manually).
Before going into detail, you always need to provide branches in the main/bitbucket repository when multiple people need to merge or branch from them.
This definately includes "develop" and probably also features/fixes multiple people need to work on (unless you have another repository or method to exchange branches/commits between them)
The difference between using git and mercurial/hg is relevant here, since the branching models are quite different.
See A Guide to Branching in Mercurial for details. Using hg bookmarks would be quite similar to what git does with branches, but there is no full support for the bookmark branching model on BitBucket (see this ticket).
hg flow (the tool) uses named branches. In contrast to git branches, these are not at all light-weight, but permanent and global (they can at least be closed now).
This means whenever any commit created on any (named) branch other than "default" is pushed to bitbucket (even after merging) this will create the branch in the bitbucket repository.
So you don't have any other choice than keeping all branches in the main repository.
However, You can decide when to push and when to close these.
I would advise using hg push -r to push only the branches/heads you want to push and only pushing these when they are either needed by somebody else or finished and merged.
Branches should be closed as soon they are not needed anymore. (This is probably done by hg flow automatically)
You should close branches locally whenever possible. This way they might not even appear in the bitbucket interface. Some might reach the bitbucket repository only in closed state (which hides them from the interface).
Obviously you should often push any branches multiple people need to merge from.
In my understanding of the workflow the "develop" branch is always exactly one branch per project that should be pushed frequently (after local testing).
In case you are either not using hg-flow or named branches things are a bit different.
Both, using forks/clones or bookmarks as a branching method doesn't generate permanent or necessarily global branches.
Like mentioned above, you can't use bookmarks (reliably) when you also want to use bitbucket pull requests. You have to push bookmarks separately. A normal push will only update (a head of) the branch so you might miss commits from other team members when marging later. Hg will tell you when a new head is created. In that case you might want to merge the branch with the remote bookmark into your branch before pushing.
When using forks as branches it works a bit like with bookmarks, but bitbucket has full support for that. You need to have a new fork on bitbucket for every branch.
You naturally only want to create extra forks if you need different people to work on it and you don't have other means of commit exchange for them. You will need at least a separate "develop" repository then.
I personally wouldn't use the full "flow" with hg on bitbucket.
For my projects the "develop" branch is the same as master/default, since I don't roll out releases with git (other than development builds, that wouldn't use the release branch anyways). I don't need a separate "production" branch, since tags can mostly be used for production usage.
I also don't create a separate "release-preparation" branch. There is only a point in time when I only apply bugfixes on develop and stop merging features. That obviously won't work when you need to work at the same time on features that are dependendant on features not to be released in the next release.
Always using the full "git flow" is easy because git branching is easy and light-weight.
Depending on the branching model you use and how supportive the other tools are,
using the full "hg flow" might not be "worth it".
The hg guide actually discourages use of named branches for short-lived branches.
See Feature separation through named branches.
The "easy" branching concept promoted in the guide is forking/cloning. Bookmarks would be the natural way to translate git flow if the tool/bitbucket support would be better (and bookmarks longer a core hg feature).
Disclaimer:
I prefer git when I can choose. I do use hg, but not as my personal choice.
You also might have considered most of this, but since you didn't state any of these details and accept an answer (in the comments) that is quite different to what you are asking, I wanted to elaborate a bit.
Edit:
To follow-up on the comments:
I think hg bookmarks are comparable to git branches because both are just movable pointers to commits.
The main difference is, that when you delete a branch in git, the commits are possibly lost (when not part of other branches or pointed to in a another branch before they are garbage collected). When you delete a bookmark in hg, then the commits are still part of the repository (part of the (named or default) branch) unless manually stripped.
Anonymous heads are related, but only as something the bookmarks point to. Without bookmarks pointing to them the anonymous heads are not usable as a branch to work with (for more than just a local merge) and share. When you have anonymous heads in a repository you don't know what they are supposed to be or where they came from, unless you remember or have other clues. In my eyes anonymous heads are only a workaround for late implementation of bookmarks and no good implementation of remotes/remote heads.
Named branches are rather unrelated, as the only thing they have in common with git branches is having a name. They are light-weight in comparision to cloning the whole repository (forking as branch model), but not in terms of "you can't get rid of them". They are permanent.
Most places tell you not to use named branches unless you have a very good reason or it is a long-running branch.

Pull commits on repo post-rebase

I'm looking for a simple way to pull in additional commits after rebasing or a good reason to tell someone not to rebase.
Essentially we have a project, crons. I make changes to this frequently, and the maintainer of the project pulls in changes when I request it and rebases every time.
This is usually okay, but it can lead to problems in two scenarios:
Releasing from two branches simultaneously
Having to release an additional commit afterwards.
For example, I commit revision 1000. Maintainer pulls and rebases to create revision 1000', but at around the same time I realize a horrible mistake and create revision 1001 (child of 1000). Since 1000 doesn't exist in the target branch, this creates an unusable merge, and the maintainer usually laughs at me and tells me to try again (which requires me getting a fresh checkout of the main branch at 1000' and creating and importing a patch manually from the other checkout). I'm sure you can see how the same problem could occur with me trying to release from two separate branches simultaneously as well.
Anyway, once the main branch has 1000', is there anything that can be done to pull in 1001 without having to merge the same changes again? Or does rebasing ruin this? Regardless is there anything I can say to get Maintainer to stop rebasing? Is he using it incorrectly?
Tell your maintainer to stop being a jacka**.
Rebasing is something that should only be done by you, the one that created the changesets you want to rebase, and not done to changesets that are:
already shared with someone else
gotten from someone else
Your maintainer probably wants a non-distributed version control system, like Subversion, where changesets follows a straight line, instead of the branchy nature of a DVCS. In that respect, the choice of Mercurial is wrong, or the usage of Mercurial is wrong.
Also note that rebasing is one way of changing history, and since Mercurial discourages that (changing history), rebasing is only available as an extension, not available "out of the box" of a vanilla Mercurial configuration.
So to answer your question: No, since your maintainer insists on breaking the nature of a DVCS, the tools will fight against you (and him), and you're going to have a hard time getting the tools to cooperate with you.
Tell your maintainer to embrace how a DVCS really works. Now, he may still insist on not accepting new branches or heads in his repository, and insist on you pulling and merging before pushing back a single head to his repository, but that's OK.
Rebasing shared changesets, however, is not.
If you really want to use rebasing, the correct way to do it is like this:
You pull the latest changes from some source repository
You commit a lot of changesets locally, fixing bugs, adding new features, whatnot
You then try to push, gets told that this will create new heads in the target repository. This tells you that there are new changesets in the target repository that you did not get when you last pulled, because they have been added after that
Instead, you pull, this will add a new head in your local repository. Now you have the head that was created from your new changesets, and the head that was retrieved from the source repository created by others.
You then rebase your changesets on top of the ones you got from the source repository, in essence moving your changesets in the history to appear that you started your work from the latest changeset in the current source repository
You then attempt a new push, succeeding
The end result is that the target repository, and your own repository, will have a more linear changeset history, instead of a branch and then a merge.
However, since multiple branches is perfectly fine in a DVCS, you don't have to go through all of this. You can just merge, and continue working. This is how a DVCS is supposed to work. Rebasing is just an extra tool you can use if you really want to.

Mercurial Queues: Merging Patches from Multiple Repositories

I am using Mercurial Queues on a repository, and have placed those patches in a patch repository. Another contributor has cloned my patch queue and made changes of their own. I would now like to merge their changes in my local patch repository.
I am trying to find a good workflow for performing this merge that
reflects the contributor's changesets in the history of the patch repository
invokes the user's merge tool in case of conflicts
Initially, I just tried to merge the patches directly. This is okay in very simple cases, but does not work well when many things have changed, since the patches depend on line number context which doesn't seem like something I should have to worry about adjusting myself. Overall, I find examining a 3 way diff of patches to be too complex.
Is there a better way?
There's no great way to handle this. What I'd probably end up doing is creating two clones, and qfinishing your patch in one and the contributers patch in the other. That that point you'll have repos with each separate patch's net effect applied. Then you hg pull one of those clones into the other, and hg merge will let you use your graphical tools to merge the results of the patches -- and the only differences should be the differences in your patches. At this point, ideally, you'll be able to qimport the merge changeset, but you can't do that, so you have to 'hg diff -r tip-1 -r tip' to get a new diff that is the difference between before-everything-started and after-merging-the-two-results. You then 'qimport` that diff and commit it to your patch queue repo with a note saying where it came from.
Decidedly sub-optimal, but the best I can come up with. I'd love to hear a better solution.
I'm afraid that there is no automated way to merge patches.
However, one "trick" you can use is to create new patches instead of editing/refreshing existing patches when you need to amend them. When you all agree on the right way to do things, then hg qfold the patches.
That way you wont be stepping on each others toes as much since you create new patches.

best practices in mercurial: branch vs. clone, and partial merges?

...so I've gotten used to the simple stuff with Mercurial (add, commit, diff) and found out about the .hgignore file (yay!) and have gotten the hang of creating and switching between branches (branch, update -C).
I have two major questions though:
If I'm in branch "Branch1" and I want to pull in some but not all of the changes from branch "Branch2", how would I do that? Particularly if all the changes are in one subdirectory. (I guess I could just clone the whole repository, then use a directory-merge tool like Beyond Compare to pick&choose my edits. Seems like there ought to be a way to just isolate the changes in one file or one directory, though.)
Switching between branches with update -C seems so easy, I'm wondering why I would bother using clone. I can only think of a few reasons (see below) -- are there some other reasons I'm missing?
a. if I need to act on two versions/branches at once (e.g. do a performance-metric diff)
b. for a backup (clone the repository to a network drive in a physically different location)
c. to do the pick&choose merge like I've mentioned above.
I use clone for:
Short-lived local branches
Cloning to different development machines and servers
The former use is pretty rare for me - mainly when I'm trying an idea I might want to totally abandon. If I want to merge, I'll want to merge ALL the changes. This sort of branching is mainly for tracking different developers' branches so they don't disturb each other. Just to clarify this last point:
I keep working on my changes and pull my fellow devs changes and they pull mine.
When it's convenient for me I'll merge ALL of the changes from one (or all) of these branches into mine.
For feature branches, or longer lived branches, I use named branches which are more comfortably shared between repositories without merging. It also "feels" better when you want to selectively merge.
Basically I look at it this way:
Named branches are for developing different branches or versions of the app
Clones are for managing different contributions to the same version of the app.
That's my take, though really it's a matter of policy.
For question 1, you need to be a little clearer about what you mean by "changes". Which of these do you mean:
"I want to pull some, but not all, of the changesets in a different branch into this one."
"I want to pull the latest version of some, but not all, of the files in a different branch into this one."
If you mean item 1, you should look into the Transplant extension, specifically the idea of cherrypicking a couple of changesets.
If you mean item 2, you would do the following:
Update to the branch you want to pull the changes into.
Use hg revert -r <branch you want to merge> --include <files to update> to change the contents of those files to the way they are on the other branch.
Use hg commit to commit those changes to the branch as a new changeset.
As for question 2, I never use repository clones for branching myself, so I don't know. I use named branches or anonymous branches (sometimes with bookmarks).
I have another option for you to look into: mercurial queues.
The idea is, to have a stack of patches (no commits, "real" patches) ontop of your current working directory. Then, you can add or remove the applied patches, add one, remove it, add another other one, etc. One single patch or a subset of them ends up to be a new "feature" as you probably want to do with branches. After that, you can apply the patch as usual (since it is a change). Branches are probably more useful if you work with somebody else... ?