How do closed branches affect Mercurial performance? - mercurial

I've noticed that some answers to questions about branch names quote the Mercurial wiki to indicate that the branch-per-feature or branch-per-bug naming conventions may cause performance problems.
Does the ability to mark branches as closed with the --close-branch flag on commits have any effect on this performance claim?

Marking a branch closed with hg commit --close-branch merely creates a new changeset with a close=1 marker in the changeset metadata. Commands like hg branches and hg heads will then know not to show this branch/head. These commands use a branch cache to speed things up and we expect that cache to scale well with the number of branches.
However, there are some operations that have a complexity that is linear in the number of topological heads. This includes the discovery protocol used before version 1.9. The new discovery protocol in version 1.9 will still exchange topological heads in its "samples", but the sample size is capped at 200 changesets.
There might be other code paths that still scale linearly in the number of heads and this is why we suggest close-before-merge:
$ hg update bug-123
$ hg commit --close-branch -m "All fixed"
$ hg update default
$ hg merge bug-123
instead of merge-before-close:
$ hg update default
$ hg merge bug-123
$ hg update bug-123
$ hg commit --close-branch -m "All fixed"
The latter approach leaves a dangling head in the graph (a topological head).

Closed branches probably won't make any difference in performance, but that's not the point. The performance implications are small, and certainly not the reason I suggested you avoid permanent branch names for short-lived lines of development. Here's the relevant quote from the wiki:
Mercurial is designed to work well with hundreds of branches. It still works quite well with ten thousand branches, but some commands might show noticeable overhead which you will only see after your workflow already stabilized.
The reason both MG and I (we're the primary answerers in both of your linked questions) steer people away from them is that time and time again we watch people get really annoyed when they learn that branch names are permanent in Mercurial. Here's the usual exchange, which plays itself out in IRC a few times a week:
Person A: "I've got 100 branches and I want to get rid of them!"
Person B: "You can't. You can hide them, but Mercurial branches are forever."
A: "But in git I have 1000s of branches and get rid of them whenever I want!"
B: "Yes, in Mercurial those are called bookmarks."
or similarly:
Person C: "I named a branch 'stupid feature marketing made me add' and I want to push that change w/o pushing the branch name."
Person B: "You can't. You can merge it into default, but that name is permanent on the changeset. You'd have to re-create the changeset to get rid of it!"
C: "But in git my branch names are local only!"
B: "Yes, in Mercurial those are called bookmarks."
If you want permanent, forever branch names on your changes (and MG, my co-answerer on both of those questions does like exactly that) then by all means use them, and don't worry a bit about performance. But do worry about how your tools represent the branches: like Mercurial itself, tools are typically built to scale in the number of changesets, not the number of branches. So they often do naive things like putting all branch names into a single drop-down menu. This GUI problem will eventually be fixed when named branches become more popular.
Steve Losh's excellent Guide to Branching in Mercurial does a great job spelling out your (four!) options. Pick what you like and be confident there are plenty of folks who like whichever one you selected, and at least a few of them have more branches than you ever will.

Related

Is there a way to set a custom base version when merging with Mercurial?

I'm confronted with a Mercurial repository that has originally been a CVS repository. It has recently been migrated to Mercurial with cvs2hg.
The CVS repository had a couple of branches, let's call two of them "main" and "feature". "feature" was branched off "main" a very long time ago. Between the branches, changes have frequently been "merged" by checking in changes committed in one branch into the other. There are frequently tagged "merge" revisions where "main" and "feature" have been equal.
When I try to do the first (real) merge in Mercurial, the three-way merge assumes that the "base" version is the revision where "feature" was originally branched off. This means that there is a lot of visual clutter, with conflicts where "main" and "feature" are almost equivalent to each other but very different from the "base" version. This is so bad that the merge would take a very long time and would be error-prone.
I'm wondering if there is a way to tell Mercurial that the base version is one of the tagged "merge" revisions, e.g.
hg up feature;
hg merge main --base "tag-xyz"
In this case, the merge would be easy.
I would recommend fixing your recent history. This is somewhat messy, but once finished, Mercurial's merges should work correctly out of the box.
First, find the most recent point at which the feature and main branches were exactly the same. Merge the equivalent commits (merge main into feature). This should create a new head with two parents (one from feature, one from main). Furthermore, that head should be on the feature branch, not the main branch. Since the commits are exactly the same, you should not experience conflicts. Next, you will need to rebase the rest of the feature branch onto this head.
Suppose the new head is commit abc. Again, this head has two parents, one of which is on the feature branch. That feature parent has a second child, which is also on the feature branch. Suppose that second child is commit def. Then you can perform the rebase as follows:
hg phase -f -d 'def::' # Unnecessary if you've never pushed
hg rebase -s def -d abc
This will change the changeset IDs (and possibly the revision numbers) of def and all its descendants. If you have multiple instances of the repository, they will need to undergo this same fixup, or you will need to re-clone them. Otherwise, your repositories will get very messy when people push and pull.
Once you've done this, merging should "just work."

Mercurial - How to pull all revisions on all branches up to a specific point?

I guess I must have misunderstood the author of this post when he said:
hg pull -r X -f repo # ... will give me all changesets up to X
Because when I did that (in a freshly-created repository), I only got those changesets that were ancestors of the branch that revision X is on.
What I wanted was all the changesets that had been committed to the remote repo up to and including X, chronologically. In other words, in addition to the branch that X is on (and its ancestors), I also wanted all other branches (including closed branches) committed before X that hadn't been merged to X's branch.
How would I express the command to do that?
BTW, in this particular repo, there are closed branches that have names that are identical to currently open/active branches, so if the solution involves enumerating all the branch names (which would be tedious, but do-able), it would still need to get the closed occurrences of such branches as well as the open ones.
(For completeness I suppose I should also say that I ran the command from the command-line of TortoiseHG 2.7 on Windows, in case the behavior of hg pull that I've described above isn't what I should have expected.)
You can't do that on pull in a single command. "Chronologically" means a lot less than you think it might. Anyone can do a commit with any timestamp they want, so the dates aren't good selectors. If you mean "with an earlier revision number" those too can change from repo to repo, so pulling all revisions with a revision number lower than N could give different results for different invocations.
If you want to try the revision-number-based version anyway, you'd probably have your best luck pulling everything to a trash repo locally and then pushing only what you want to a new local repository:
hg clone http://remotehost/path local-clone # clones everything
hg init another-local-clone
hg push --repository local-clone --rev '0:X' another-local-clone
After that, another-local-clone will have all the changesets whose revision number is X or lower in local-clone, which probably matches (but isn't guaranteed to match) the numbering in the remote repository.
If that seems awkward it's because "committed before" isn't a terribly useful concept in DVCS land -- it assumes a linearity that neither git nor Mercurial considers important.

Close an unmerged wasteful branch in mercurial

I decide to start an experiment in a branch
[default] $ hg branch experiment
[experiment] $ [... some commits ...]
Aargh! does not work! I want to throw it away.
[experiment] $ hg commit -m "did not work; closing ..." --close-branch
[experiment] $ hg update default
To get the real tip back -
[default] $ [... some commits ...]
[default] $ hg push
Is this a correct workflow to destroy an experimental branch?
You've got two fine answers on how to undo your branch, but the bigger point is: don't use named branches for temporary concepts. Named branches are for long-lived entities like 'development' and 'stable'. For features, experiments, etc. you want either clones, bookmarks, or anonymous branches. All three are contrasted with named branches in this excellent article by Steve Losh:
http://stevelosh.com/blog/2009/08/a-guide-to-branching-in-mercurial/
You can see similar advice from the Mercurial project here:
https://www.mercurial-scm.org/wiki/StandardBranching
The Mercurial wiki covers all the options for Pruning Dead Branches. Briefly, these options include:
Closing the branch (as done in your original post)
Create a new clone that does not include the dead branch
Use a no-op merge
Use the strip command that is bundled with the mq extension
Closing a branch will leave it in the repository, and the closed branch will be pushed with other changesets next time you do a push.
If you don't want this to happen, and your branch is local, just strip it.
On the other hand, if you have already pushed the experimental branch, stripping it won't help, so you can either close it or do a dummy merge (or both).
In my opinion, you should just close the branch and forget about it.
In the long run, there's no harm in a "dead" branch being present in the repository. Any given branch is almost certainly tiny in comparison to the contents of your repository and any additional "noise" created by the additional changesets is going to fade into the past relatively quickly.
However, by not worrying about cleaning up the branch, you achieve two things:
You don't have to deal with any of the potential issues associated with altering history in a DVCS.
(More importantly) You have a permanent record of your attempt.
That second point is key -- you can actually make use of what you learned if the branch is still around: any fellow developers can learn from it; you can go back and try again if you learn something else; you can prevent trying the same thing again by seeing this branch in history.
A lot of developers have a hard time with keeping history that isn't "pristine" in their DVCS, especially when they recently came from a centralized VCS.* Over time, I've come to realize that there's nothing bad or wrong about that "other" history and in fact it can turn out to be remarkably useful if kept around.
*I'm not necessarily implying that you fall into either of these camps, just making an observation.

Mercurial push problem

I've just run into a problem with the hg push command. Here is what I did: first I created 2 branches, hot-fix-1 and hot-fix-2, made some changes in each branch, merged them back to default and closed those branches with the command:
hg commit --close-branch
If I start hg branches I have the following output:
default 29:e62a2c57b17c
hg branches -c gives me:
default 29:e62a2c57b17c
hot-fix-2 27:42f7bf715392 (closed)
hot-fix-1 26:dd98f50934b0 (closed)
Thus the hot-fix-* branches seem to be closed. However, if I try to push the changes I get the following error message:
pushing to /Users/user1/projects/mercurial/mytag
searching for changes
abort: push creates new remote branches: hot-fix-1, hot-fix-2!
(use 'hg push --new-branch' to create new remote branches)
and it does not matter which command I use hg push -b . or hg push -b default
So the question is how I can push those changes to repository without creating new branches.
P.S I used to work with git and was hoping that similar branching model can be used in Mercurial. Thanks
First, as many others have pointed out, using a named branch for short lived work is not a recommended practice. Named branches are predominantly for long lived features, or for release management.
Given that you are in this situation, there are a few options available. Apart from the first, all of them involve modifying history (as you're obviously trying to change something you've done).
One is to just push the branches as is, learn from the experience, and move on. If the rest of the team is fine with this, then it's a case of adding --new-branch to your push command.
If the rest of the team, or you, really want the history to be clean, then you'll need to dig deeper.
If you aren't pushing as is, then definitely make a clone of your current repo first. This way you have a copy of the original work to fall back on.
I see 2 main approaches here: strip off the merges and rebase your branches onto default (this will get rid of the named branches), or graft/transplant your changes. Both give the same end result, but the implementation is slightly different.
If you merely want to use graft, that is now a built-in function starting with HG 2.0. It replaces the transplant plugin, and is much nicer to work with as it uses your usual merge tool if there are conflicts.
To use it, update to the default branch. Then, use the command:
hg graft -D "2085::2093 and not 2091"
The -D flag (--currentdate) stamps the grafted changesets with the current date; the quoted string that follows is an hg revision selection query (a revset). In your case, you'd likely only need '{start}::{end}' where start is the changeset at the start of the branch, and end is the end changeset of the branch (ignoring the merge).
If you did several merges, you'd have to pick and choose the changesets more precisely.
The other option is to strip the final merges (strip ships with the mq extension) and use the rebase command from the rebase extension.
You'll have to strip your merge changesets to get rid of them, and then update to the tip of the branch you want to keep. Select the start of the first named branch, and do a rebase. This will change the parentage of the branch (if you're familiar with Git, then this is very much like its rebase).
Then repeat for the second branch. You should now have one long branch with the name default.
Just do the:
hg push --new-branch
It will send over those branches, but they'll be closed on the receiving end too, so no one should be bothered.
See my comment on the question for why Named Branches are best saved for long-lived entities like 'stable' and anonymous branches, bookmarks, or clones are more suitable for short lived things like hot-fixes and new features.
Your hot-fix changes were made on branches. Regardless of whether the branch is active or closed, it does exist.
To push the changes to the server (without rewriting history), you must use the --new-branch option (e.g. hg push --new-branch).
Since you merged the branches into default, there will still only be one head (as you have already seen in your local repo).
If you really can't live with pushing the branches to the server, then you must rewrite your local history as suggested in Mikezx6r's answer.
In addition to the methods he mentioned, you can also import the changesets into a patch queue and apply them to the tip of your default.

Is there a downside to this Mercurial workflow: named branch "dead" head?

I love the flexibility of named branches but I have some concerns about the proliferation of heads.
Even when the branch is closed, it still shows up in the heads. I have an idea for how to clean up the output from "hg heads"
My question to the gurus: "What am I missing?"
First off you may ask, Why might I want to totally hide the head of a named branch? For various reasons:
the feature is a bad idea
the feature is a good idea that is not ready for merging to tip, but maybe in a few months
the branch is a patch release to an older tagged version
edit: It turns out the proliferation of heads is a symptom of the older version of Mercurial I was using. Closing the branch hides its head on newer Mercurial versions.
My idea is to have a "dead" head branch onto which all these closed branch heads will be merged.
The dead head would be parented by changeset 0 and serve the sole purpose of bundling up the stray heads that are not needed right now.
The deadhead has only other deadhead children, which never get merged back into the default branch.
You can use hg commit --close-branch to mark a branch as closed:
http://www.selenic.com/mercurial/hg.1.html#commit
Closed branches will not show up in hg branches or hg heads by default (only if the -c/--closed option is specified), so I'm not sure how you're seeing "clutter"?
What exactly would you gain by merging things?
There seems to be a downside to leaving dead heads which is not solved by later versions of Mercurial.
Suppose you have a lot of closed branch heads and only a single non-closed active branch. Suppose further that at some later point you make a bad commit (rev bad) on top of the non-closed head (rev good). Before you push you'd like to clone your repository dropping that bad commit. That is usually a simple thing to do -
hg clone --rev good BadRepo FixedRepo
This unfortunately does not pull the closed branch heads since they are not ancestors of rev good. All those branches which were closed will not be closed in the cloned repository. I tested this with Mercurial 2.3.1.
Thoughts?
p.s. The hgflow extension does close feature and release branches before merging. This avoids the closed heads problem.
Regarding the clone being an ugly approach, it has worked quite well and easily for me. The clone replaces the repository with the bad commit. The clone is a local effort. That bad repository is just discarded. I usually realize I've made a bad commit very soon after.
The -b option is just a way to rephrase the --rev by using a branch name instead of a change set identifier. Using the --rev option does pull the entire topological tree under the revision. If the revision is the head of the branch then the --rev clone is the same as the -b clone. -b leaves the same problem that I described with the --rev option. Branches which were closed in the original repository get reopened if they were left as heads.
If the pattern is to leave closed heads then they will soon greatly outnumber relevant heads. Getting those closures into a clone is quite an effort unless you do a full clone.
I feel I've muddied the waters with why I might do a partial clone. I'll restate my concern about closure heads more carefully.
For any partial clone from repository X to repository Y, if there exists a branch B in repository X with a closure head and that branch is included in the clone for purely topological reasons, then branch B will not be closed in repository Y. Further if the merging pattern is to generally leave closure heads then the number of closure heads is of order development time.
This is a concern to me so I close my branches before I merge. I use hgflow (http://nvie.com/posts/a-successful-git-branching-model). A possible partial clone would be to clone the development branch and follow that with a pull of the master branch (e.g. if you wish to eliminate dead ends). If feature and release branches had been closed after their final merges then those branches would be reopened in the clone.