Does merge direction matter in Mercurial? - mercurial

Take a simple example: I'm working on the default branch, have some changesets committed locally, and I pulled a few more from the master repository. I've been working for a few days in my isolated local repository, so there's quite a few changes to merge before I can push my results back into master.
default ---o-o-o-o-o-o-o-o-o-o-o (pulled stuff)
\
o----o------------o (my stuff)
I can do two things now.
Option #1:
hg pull
hg merge
Result #1:
default ---o-o-o-o-o-o-o-o-o-o-o
\ \
o----o------------o-O
Option #2:
hg pull
hg update
hg merge
Result #2:
default ---o-o-o-o-o-o-o-o-o-o-o-O
\ /
o----o------------o
These two results look isomorphic to me, but in practice it seems that option #2 results in way smaller changesets (because it only applies my few changes to the mainline instead of applying all the mainline changes to my few).
My question is: does this matter? Should I care about the direction of my merges? Am I saving space if I do this? (Doing hg log --patch --rev tip after the merge suggests so.)

They're (effectively) identical. You see a difference in the hg log --patch --rev X output size because log shows the diff of the result and (arbitrarily) its 'left' parent (officially p1), but that's not how it's stored (Mercurial has a binary diff storage format that isn't patch/diff based) and it's now how it's computed (p1, p2, and most-recent-common-ancestor are all used).
The only real difference is, if you're using named branches, the branch name will be that of the left parent.

There is also a difference if you are using Bookmarks. When doing a merge, the branch that you are is the branch that is receiving the changes, so the new changeset will be part of that branch. Supose you have this situation:
default ---o-o-o-o-o-o-o-o-o-o-o -- Head: Rev 200
\
o----o------------o -- Head: Rev 195, Bookmark: my-stuff
If you merge Rev 200 into Rev 195, the bookmark my-stuff will move on to Rev 201, as you are generating a new changeset in the same branch that has the bookmark.
On the other hand, if you merge 195 into 200, you are generating a changeset in the branch that don't have the bookmark. The my-stuff bookmark will remain in Rev 195.

Related

Mercurial - files modified in current branch

Can you help me to create a proper revset for mercurial hg status ?
I would like to list all the files that were changed in the current branch since it was created.
I tried
hg status --rev "branch(foo)"
where foo is the name of my branch, but this also lists files that were changed in the branch from which my branch was created (?).
I don't get how to create a proper revset for this.
I created my branch and made several changes in multiple files. Now I want to reload these files in my application, but only them.
This seems pretty straightforward (see hg help revsets and hg help revisions for where this comes from).
We might start with the set of all commits in a branch, e.g., for branch foo:
-r 'branch(foo)'
Obviously this can produce a dozen, or even a million, revisions; but you want to see what happened between "branch creation"—which needs to examine the parent of the first such revision—and "current status of branch", which needs to examine the last such revision.
The first1 revision of a large set is obtained by first() and the last by last(). However, when various commands are given a revision specifier, they look it up as a single revision, and here a branch name suffices to name the last commit on the branch anyway.
To get the (first) parent of a revision, we use the p1() function (the suffix ^ is only allowed on a literal revision, not a function that returns a revision). Hence the parent of the first revision on branch foo is:
-r 'p1(first(branch(foo)))'
To get a full diff of this against the last commit in the branch:
hg diff -r 'p1(first(branch(foo)))' -r 'foo'
But you don't want a full diff, you want the file names. The command that produces this is hg status, and it has a slightly different syntax:
hg status --rev 'p1(first(branch(foo)))' --rev 'foo'
The status includes the letters at the front, as well as the names: A for newly added files, M for modified files, and so on. Note the use of --rev rather than just -r (in fact, you can use --rev with hg diff as well).
Note that there's a much shorter syntax, ::foo, that gets all the ancestors of the given branch up to and including the last revision in the named branch. However, this gets too many ancestors. You can, however, use p1(first(branch(foo)))::foo as the (entire) argument to --rev, and this also works. so:
hg status --rev 'p1(first(branch(foo)))::foo`
is a slightly shorter way to express this.
Last, note that comparing the first commit in the branch to the last (as would happen with hg status --rev 'first(branch(foo))' --rev foo, for instance) can miss changes made in that first commit on the branch. That's why we use p1 here.
1The first function selects the first in the set, which may not be the same as the numerically-first revision. For instance, suppose the set is made from x | y and a revision in y is numerically lower than a revision in x, or you use reverse(branch(foo)) to put the commits in high-to-low order. In this case, min instead of first would be the function to use.

List all merged changesets since last merge?

Using Mercurial, how can I list all the changesets applied by merging a branch, since the last merge from that branch?
Revsets are your friend. Or your nemesis, depending on how complex they get :)
The following command will show all associated changesets between the last two merges:
$ hg log -r "first(last(merge(),2)):last(merge()) & ancestors(last(merge()))"
That complex little expression (which I'll look at making simpler later) does the following:
x:y gives you all changesets between x and y inclusive
merge() is a revset that contains all merges.
last(...,n) gives the last n changesets of a set, with n defaulting to 1
first(...) gives you the first changeset of a set
ancestors(last(merge())) is a set containing all ancestors of the last merge
Combining all of those, the expression above becomes (ready?): Give me all changesets between the first of the last two merges, and the last merge, inclusive, which happen to be contributing ancestors of the last merge.
The ancestors(...) bit filters out any changesets that are not related.
You can limit this to be the changes on a specific branch by adding & branch(branchname). For example, if you are merging onto a release branch from default, you could do:
$ hg log -r "first(last(merge(),2)):last(merge()) & ancestors(last(merge())) & branch(default)"
This wouldn't include the actual merges themselves, as they would appear on the release branch.
Hopefully this makes sense - I'll have a look this afternoon to see if I can get a simpler way, but that's the first that springs to mind. In the meantime, if you use this, you can make it easier by creating a revset alias in your user hgrc file:
[revsetalias]
contrib = first(last(merge(),2)):last(merge()) & ancestors(last(merge()))
So you can then use:
$ hg log -r "contrib"
$ hg log -r "contrib & branch(default)"
For more information have a look at hg help revsets.
I'm not sure icabod's solution is quite right. Let me see if I can explain.
Let's take this change graph.
o----A1----A2----M1--------A3---M2
\ / /
---B1----B2--- /
\ /
----C1--------C2----C3
B is a branch taken from o, and C is a branch taken from B1. If we're at M2 and run icabod's command then:
last(merge()) is M2
first(last(merge(),2)) is M1
So the expression becomes:
hg log -r "M1:M2 & ancestors(M2)"
M1:M2 is the changes made than have revision numbers between M1 and M2, which in this case is A3, C2 & C3, which completely ignores C1.
What I think is you're looking for the set of ancestors of M2, that weren't ancestors of M1. i.e.
hg log -r "ancestors(M2) & not ancestors(M1)"
or
hg log -r "ancestors(last(merge())) and not ancestors(first(last(merge(), 2)))"
I think this should be C1, C2, C3 & A3. It also has the advantage that it doesn't care how, or when, change-sets were added to the repo.
The only problem with this is if the second to last merge isn't an ancestor of the latest merge. I'll leave that as an exercise for the reader ;-)
Of course, all of this can be avoided by doing hg merge --preview (or -P) prior to doing a merge, and it lists all the change-sets that will be included when you do a merge.

Mercurial commit disappeared

We have switched to Mercurial recently. All had been going well until we had two incidents of committed changes going missing. Examining the logs has not made us any wiser.
Below is an example. The files committed at (1) revert to a previous state at (2) even though those files are not mentioned in the merge.
What can I check to understand why the files reverted?
There are three interesting changesets in this graph that can influence the (2) merge:
Teal changeset: not shown, but looks like it's just below the graph. This is the first parent of (2)
Blue changeset: number five from the bottom, labelled "Fix test". This is the second parent of (2).
Common ancestor of the parents: also not shown, will be further below. Strangely, it looks like the teal changeset could be the common ancestor, but Mercurial will now allow you to make such a degenerate merge under normal circumstances.
When Mercurial does a merge, these are the only three changesets that matter: the two heads you merge and their common ancestor. In a three-way merge the logic is now:
ancestor parent1 parent2 => merge
X X Y Y (clean)
X Y X Y (clean)
X Y Y Y (clean)
X Y Z W (conflict)
Read the table like this: "if the ancestor was X, and the first parent was also X and the second parent was Y, then the merge will contain Y". In other words: a three-way merge favors change and will let a modification win.
You can find the ancestor with
$ hg log -r "ancestor(p1(changeset-2), p2(changeset-2))"
where changeset-2 is the one marked with (2) above. When you say
The files committed at (1) revert to a previous state at (2) even though those files are not mentioned in the merge.
then it's important to understand that "a merge" is just a snapshot that shows how to mix two other changesets. The change made "in" a merge is the difference between this snapshot and its two parent changesets:
$ hg status --rev "p1(changeset-2):changeset-2"
$ hg status --rev "p2(changeset-2):changeset-2"
This shows how the merge changeset is different from its first and second parent, respectively. I'm sure the files are mentioned in one of those lists — unless the merge isn't the culprit after all.
When you examine the three changesets and the differences between them, then you will probably see that someone has to resolve a conflict (the fourth line in the merge table above) and picked the wrong file at some step along the way.
The merge at 2 is between a very old branch (dark blue, forked from the mainline/green branch just after commit 1) and an even older branch (light blue, hasn't been in sync with mainline since before commit 1)
It seems likely that the merge at 2 picked the wrong version of the file - can't tell from here if that was the tool picking the wrong version of the file, or the user manually selecting the wrong version.
Edited to add:
To help track down exactly what changed at 2, you can use hg diff -r REV1 -r REV2, which will show you the line-by-line differences between any two revisions.
When you know that the badness was introduced sometime between point 1 and point 2, hg bisect may help you track down the exact source of the badness:
hg bisect [-gbsr] [-U] [-c CMD] [REV]
subdivision search of changesets
This command helps to find changesets which introduce problems. To use,
mark the earliest changeset you know exhibits the problem as bad, then mark
the latest changeset which is free from the problem as good.
Bisect will update your working directory to a revision for testing
(unless the -U/--noupdate option is specified). Once you have
performed tests, mark the working directory as good or bad, and bisect
will either update to another candidate changeset or announce that it
has found the bad revision.

Mercurial: Picking commits to be included in a release

I have a local mercurial repository (for now) within which I have already made several commits, each commit is a self contained bug fix. Is it possible to pick which of the bug fixes (commits) I want to be included when it is time to build a release version of my application.
To elaborate, assuming A, B, C, D, and E are commits I have already done to my repository and each of them relates to a bug fix like so:
A <- B <- C <- D <- E <- working dir
I need to be able to for example pick which of the bug fixes will go into the release version (this depends on the time allocated for deployment as well as testing outcomes). So for example I might get a report saying the release should only contain bug fixes A, C and D.
Is it possible to construct a release version containing only the A, C and D commits (Keeping in mind that each commit is self contained and does not depend on the other commits to actually be there)?
Probably having a branch for each bug fix and then merging into a release branch is the easiest way to accomplish this (or is it not?), but the current situation at hand is as described above with no branches.
This isn't the normal work mode of Mercurial (or git). A repository can only contain a changeset if it also contains all of that changeset's ancestors. So you can't get D into a repo without also having A, B, and C in there.
So here's:
What you Should have Done
Control the parentage of your changesets. Don't make C the parent of D just because you happen to have fixed D after C. Before you fix a bug hg update to the previous release.
Imagine A was a release and B, C, and D, were all bug fixes. If you do a loop like this:
foreach bug you have:
hg update A
... fix bug ...
hg commit
hg merge # merges with the "other" head
then you'll end up with a graph like this:
---[A]----[B2]--[C2]--[D2]----
| / / /
+-[B] / /
| / /
+-----[C] /
| /
+---------[D]
and now if you want to create a release with only, say, B and D in it you can do:
hg update B
hg merge D
and that creates a new head that has A + B + D but no C.
Tl;Dr: make a change's parent be as early in history as you can, not whatever happens to be tip at the time.
What you can do Now
That's the ideal, but fortunately it's no big thing. You can never bring exactly D across without bringing C (because C's hash is part of the calculation of D's hash), but you can bring the work that's in D into a new head easily enough. Here are some ways, any of which will work:
hg export / hg import
hg transplant
hg graft (new in 2.0)
hg rebase (only possible if you haven't yet pushed)
Any of those will let you bring that patch/delta that's in D over -- it will have a different hash ID and when some day you merge D in for real (using merge) you'll have duplicate work in two different changesets, but merge will figure it all out.
If this was my tree and it hasn't been pushed anywhere, I'd (assuming an empty patch queue and MQ enabled):
hg qimport -g -r B: # import revisions B and later into mq as "git" style patches
hg qpop -a # unapply them all
hg qpush --move C # Apply changes in C (--move rearranges the order)
hg qpush --move D # Apply changes in D
hg qfin -a # Convert C & D back to changesets
hg push <release server> # Push them out to the release branch
Then you can hg qpush -a; hg qfin -a to get B & E back into changesets.
Final Result:
---A---C---D---B---E
Advantages:
Nobody needs know you didn't do things in this order to start with (evil grin)
You could modify any of the change-sets whilst doing this
Alternatively, with graft in 2.0:
hg update -r A # Goto rev A (no need to do anything special for A)
hg graft C # Graft C on to a new anonymous branch
hg graft D # Graft D
This will give you
---A---B---C---D---E
\
--C'--D' <-You are here
An hg push -r D' should just push the new, cherry-picked, head.
You can then hg merge to get one head again with B and E included.
Advantages:
Non destructive, so true history is kept, and no chance of loss if you muck up
hg tags the new changesets with the hash of the original version, so totally trackable
Probably a little simpler.
While it's somehow strange way and release-policy, you can do it in different form. You have to manipulate with two main objects: changesets and branches
Version 1
You use two branches (default + f.e "release 1.0"). Default branch is mainline of your work - all changesets commited to this branch. At release-time, you branch first needed-for-release changeset into (new) branch, transplant or graft rest of needed in release changesets from default to this branch, head of release 1.0 will be prepared for release this way.
Next release will differ only in new branch name
Version 2
One branch used, MQ extension added. Variations:
all changesets are MQ-pathes and only needed for release are applied to repo
changesets are changesets, only unwanted for release converted to mq-pathes, later qfinish'ed and returned to repo

How to show list of unapplied changesets in Mercurial

After pushing changesets to a repository called 'A' how can I see the list of changesets waiting to be applied when I am in 'A'?
Expanding on that,
In repo B I push changesets to repo B
I change to repo B
How can I list the changesets pushed in step 1?
Not sure what you mean by "unapplied" changesets, however here's a couple thoughts.
You can easily see what changesets will be pushed to a repository by doing hg outgoing prior to doing the hg push. This will list all of the changesets that will be pushed using default options.
Similarly you can use hg incoming in the destination repository to show what changesets would be pulled from another repo.
As for "unapplied" changesets, if I assume you mean changesets that are newer than the working directory, you could use hg log -r .:tip, which should (I've not had a chance to test it) show all newer revisions, but not actually all recently-pushed ones.
Edit: I've updated the revision set in the -r option to something that should work. Have a look at revsets on the Mercurial manpage for more possibilities.
$ hg summary
parent: 0:9f47fcf4811f
.
branch: default
commit: (clean)
update: 2 new changesets (update) <<<<<
The update bit tells you what (I think) you want.
I had written a different answer, but I ended up with a better way of doing what is needed here (an even better and definitive –for me– solution is at the end of this post, in the [EDIT] section).
Use hg log.
Specifically, issue an hg sum command first. This will give me:
parent: 189:77e9fd7e4554
<some commit message>
branch: default
commit: (clean)
update: 2 new changesets (update)
To see what those 2 new changesets are made of, I use
hg log -r tip -r 2 -v
Obviously, 2 is to be replaced with the number of changesets that hg sum reports.
This works because tip will refer to the most recent (or "unapplied") changeset. By limiting the output to the 2 latest changes (-l 2), the information is shown only for those changesets that I'm interested in. With -v, the list of files affected by the changeset is also shown.
To make things simpler, I have defined a user command in my .bashrc file:
alias hglog="hg log -r tip -l $1"
This allows me to type hg sum (to get the number of pending/unapplied changesets) and then to type hglog x where x is the number of changesets revealed by hg sum.
There is probably a more complete way of doing this, for instance using custom templates, but I guess it's pushing things too far in terms of sophistication.
[EDIT] (Third iteration)
I have reached the most satisfying answer to this question by expanding on the alias idea so that I no longer have to type hg sum. My .bashrc file now contains this:
show_pending_changesets() {
nb=$(hg sum | grep "update:" | sed 's/update: \([0-9]*\) .*/\1/');
if [ `expr $nb + 1 2> /dev/null` ] ; then
hg log -r tip -v -l $nb
else
echo "Nothing new to report"
fi ;
}
...
alias hgwhatsnew=show_pending_changesets
Explanation: I'm using sed to extract the number of changesets from the last line (which is the one that starts with update:) of the output of hg sum. That number is then fed to hg log. All I have to do then is to type hgw and tab-complete it. HTH