Understanding Mercurial merging logic


Let's say myfile was modified in two repositories (A and B). In A we do an hg pull ../B which gets us these changesets:
A1 - A2 - A3 - A4 - A5
        \
         B3 - B4
As we have two heads (A5, B4), we do an hg merge.
Now if there are conflicts, Mercurial fires up our merge tool (Beyond Compare) and we get three views: left is local, center is parent, and right is 'other'. Given our structure, would left (local) be A5, center (parent) be A2, and right (other) be B4?
Secondly, what exactly is the logic that Mercurial uses to determine that a merge is required? Does it see that there are two versions of myfile without any children? And how exactly does it determine that A2 is the parent?

As mentioned in Merge Tool Configuration
The merge tool is run with an argument list of args with the following variables expanded:
$output expands to the existing file which already contains the version from the first parent - and this is also where the result of the merge ends up / must end up
$local expands to file.orig which is created as a copy of file in the working directory version - it thus contains the unmerged version from the first parent
$base expands to /tmp/file~base.* which is created with the version from the common ancestor revision (see hg debugancestor)
$other expands to /tmp/file~other.* which is created with the version from the new second parent revision which the first parent is merged with
(so yes, "left (local) be A5, center (parent) be A2, and right (other) be B4")
(See "performing the merge")
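As for how the base is chosen: it is the common ancestor of the two parents in the changeset graph. The following is a minimal Python sketch of that idea, not Mercurial's actual implementation. The graph is the one from the question, assuming B3 was committed on top of A2; the names PARENTS, ancestors and merge_base are made up for illustration.

```python
# Toy changeset graph from the question: each changeset maps to its parents.
# Assumption: B3 branched off A2, as the question suggests.
PARENTS = {
    "A1": [],
    "A2": ["A1"],
    "A3": ["A2"],
    "A4": ["A3"],
    "A5": ["A4"],
    "B3": ["A2"],
    "B4": ["B3"],
}


def ancestors(node):
    """Return the set containing node and everything reachable via parents."""
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def merge_base(local, other):
    """Pick the common ancestor that has the most ancestors of its own."""
    common = ancestors(local) & ancestors(other)
    return max(common, key=lambda n: len(ancestors(n)))


print(merge_base("A5", "B4"))  # A2
```

With local = A5 and other = B4, the only common ancestors are A1 and A2, and A2 is the more recent of the two, so it becomes the base shown in the center pane.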

Related

How do I copy commits from one branch to another?

I have a branch with a few revisions in it. I want to try making some code changes that require reordering and patching those commits with histedit, but I want to keep the original branch around in case it doesn't go well. How can I do that?
Example
Before:
master -> change 1 -> change 2 (branch A)
After:
master -> change 1 -> change 2 (branch A)
       -> change 1 -> change 2 (branch B)

The integrated and recommended way to copy or cherry-pick commits from one branch to another is using
hg graft -r XX YY ZZ.
where XX YY ZZ etc. are the revisions to copy to your currently checked-out branch. By default they are committed, but you can also use the --no-commit flag so that you can edit changes. Or do it one-by-one and make additions using hg commit --amend.
Compared to exporting and importing it has the added benefit of using the configured merge programme, should a merge be required - but it will NOT be a merge, so no line of ancestors is established from the current commit to the one you copy from.

How to merge a branch with other branch of the same parent but exclude a changeset alone in mercurial

I have a branch a1 whose parent branch is A. Now I have created another branch a2 from same parent A. I want to merge a1 with a2 excluding a changeset from a1. Is there a way to partially merge a branch excluding a changeset?

The short answer is no: merges work according to the graph topology, not the branch names.
The longer answer is still no, but it's worth drawing the topology. Your description talks about branch parentage, but branches themselves do not have parent/child relationships. Branches are just collections of commits, with each commit containing its branch name. (Commits have parent/child, or more generally, ancestor/descendant, relationships.) So:
I have a branch a1 whose parent branch is A.
This really means that you have some commit(s) <rev-number(s)> whose branch name is a1, and one or more of those commits have parent commit(s) <different-rev-number(s)> whose branch name is A. If we draw them we get, e.g.:
A:   ...--o--o--o
           \
a1:         o--o
Now I have created another branch a2 from same parent A.
Again, this just means that you have yet more commits (with their rev-numbers) whose branch name is a2; at least one such commit has a parent whose branch-name is A; we might draw this any number of ways, but let's try this one:
a2:         o--o
           /
A:   ...--o--o--o
           \
a1:         o--o
I want to merge a1 with a2 excluding a changeset from a1.
You don't really merge branches. Instead, you merge commits. You do this while having some commit checked-out into the working tree—this working-tree copy is the proposed next commit—and you are, by definition, on some branch. You then run hg merge and specify some other revision: a commit that is on this or any other branch. Assuming the merge makes sense, Mercurial begins the merging process. When the merge is done, the new commit will have your current branch as its branch, and will have the working-tree's parent commit as its first parent, and the supplied revision as its second-parent.
So here, you might hg update a2 to select the tip of a2 as the commit copied into the working tree. The working-tree is a proposed, but not yet actual, new commit for a2, and its parent is the commit now marked #:
a2:         o--#
           /
A:   ...--o--o--o
           \
a1:         o--o
Is there a way to partially merge a branch excluding a changeset?
You may select either of the two (we've only drawn two here) commits that are on a1 as your target for the hg merge operation. Let's say you select the second one, using hg merge a1:
a2:         o--#
           /
A:   ...--*--o--o
           \
a1:         o--●
The filled-in circle is the commit to merge; # is the parent of the working tree; so * is their merge base—the best common-ancestor commit. Mercurial will compare the contents of the snapshot in * to the contents of the snapshot in # (or the working tree—these contents should be the same when you start the process), and also compare the contents of the snapshot in * to those in ●. These two comparisons each produce an en-masse changeset: the contents in ● are modified from those in the o between * and ●. Merge will simply combine these two en-masse changesets (which are then applied to the base's content).
You can, of course, select the other a1 commit for merging, but then you'll be using its snapshot, not the one for ●. This will effectively exclude the changes in ●—probably not what you want.
Let's draw yet a third graph, since this particular one is a bit shallow for our purposes. Suppose instead of just the two commits exclusive to a1, we have many. But one of them, which we'll draw as x here, we'd like to exclude when doing the merge into #:
a2:         o--#
           /
A:   ...--*--o--o
           \
a1:         o--o--o--x--o--o--o   [want to merge this commit]
To achieve what you do want, you must first make new commits, perhaps on a1, perhaps on a new branch entirely. These new commits will look like this:
a2:         o--#
           /
A:   ...--*--o--o
           \
            o--o--o--x--o--o--o
a1:                \
                    ----o--o--o
In this case, I've drawn the new commits as a new mini-branch within a1, but perhaps it would be clearer to make them on a new branch a3:
a2:         o--#
           /
A:   ...--*--o--o
           \
a1:         o--o--o--x--o--o--o
                   \
a3:                 ----o--o--o
The three new commits, in both cases, are simply copies of the changesets that occur after commit x. When using a3, the easy way to make these copies is to use hg graft. Now we can pick the tip commit of a3 as the one to merge into a2 (the branch of the working tree's parent, #):
a2:         o--#
           /
A:   ...--*--o--o
           \
a1:         o--o--o--x--o--o--o
                   \
a3:                 ----o--o--●
and the result of the merge will be a merge commit, which will tie back to both the old #—the new merge is the current commit so it is now #—and ●:
a2:         o--o---------------------#
           /                        /
A:   ...--*--o--o                  /
           \                      /
a1:         o--o--o--x--o--o--o  /
                   \            /
a3:                 ----o--o---●
The merge still used the topology, not the branch names.
(This is a key realization for Mercurial: branch names merely group commits together; but it's the graph topology that really controls things.)
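To see why copying the commits after x gives a snapshot without x's change, here is a toy Python model. It is only a sketch: each changeset is reduced to a dict of file contents it sets, the file names are invented, and it assumes the commits after x do not depend on x (otherwise the real hg graft would stop and ask you to resolve conflicts).

```python
# Each changeset is modelled as a dict of file -> new content (a toy "diff").
# All file names and contents here are hypothetical.
BASE = {"app.py": "v0"}  # snapshot at the merge base *
A1_CHANGES = [
    {"app.py": "v1"},
    {"util.py": "helpers"},
    {"docs.txt": "draft"},
    {"debug.py": "unwanted"},  # x: the changeset to exclude
    {"tests.py": "t1"},
    {"tests.py": "t2"},
    {"docs.txt": "final"},
]
X_INDEX = 3


def snapshot(base, changes):
    """Apply the changes in order on top of base and return the result."""
    result = dict(base)
    for change in changes:
        result.update(change)
    return result


# a3 keeps the commits before x and adds copies of the commits after x.
a3_changes = A1_CHANGES[:X_INDEX] + A1_CHANGES[X_INDEX + 1:]

a1_tip = snapshot(BASE, A1_CHANGES)
a3_tip = snapshot(BASE, a3_changes)

print(sorted(a1_tip))  # includes debug.py
print(sorted(a3_tip))  # same files, minus debug.py
```

Merging the tip of a3 then compares this x-free snapshot against the base, which is why x's change never reaches the merge result.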

Mercurial diff including first changeset

I have recently encountered the need to generate a Mercurial diff of all changes up to a particular changeset which includes the first changeset of the repo. I realize this kind of stretches the definition of a diff, but this is for uploading a new project to a code review tool.
Let's assume the following changesets:
p83jdps99shjhwop8 - second feature 12:00 PM
hs7783909dnns9097 - first feature - 11:00 AM
a299sdnnas78s9923 - original app setup - 10:00 AM
If I need a "diff" of all changes that have been committed, the only way that I can seem to achieve this is with the following diff command...
hg diff -r 00:p83jdps99shjhwop8
In this case the first changeset in the argument param (here - 00) takes the regexp form of 0[0]+
This seems to be exactly what we need based on a few tests, but I have had trouble tracking down documentation on this scenario (maybe I just can't devise the right Google query). As a result, I am unsure if this will work universally, or if it happens to be specific to my setup or the repos I have tested by chance.
Is there a suggested way to achieve what I am trying to accomplish? If not, is what I described above documented anywhere?

It appears this actually is documented, but you need to do some digging...
https://www.mercurial-scm.org/wiki/ChangeSetID
https://www.mercurial-scm.org/wiki/Nodeid
So the special nodeid you're referring to is the 'nullid'.
2 digits may not be adequate to identify the nullid as such (as it may be ambiguous if other hashes start with 2 zeros), so you may be better off specifying 4 0's or more.
Eg: hg diff -r 00:<hash of initial add changeset> has resulted in the abort: 00changelog.i#00: ambiguous identifier! error.

I'm a little confused about what you need. The diff between an empty repository and the revision tip is just the content of every file at tip-- in other words, it's the state of your project at tip. In diff format, that'll consist exclusively of + lines.
Anyway, if you want a way to refer to the initial state of the repository, the documented notation for it is null (see hg help revisions). So, to get a diff between the initial (empty) state and the state of your repository at tip, you'd just say
hg diff -r null -r tip
But hg diff gives you a diff between two points in your revision graph. So this will only give you the ancestors of tip: If there are branches (named or unnamed) that have not been merged to an ancestor of tip, you will not see them.
        3--6
       /
0--1--2--5--7 (tip)
    \   /
      4
In the above example, the range from null to 7 does not include revisions 3 and 6.
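The same point can be checked with a small Python sketch that walks the parent links of the example graph. The parent table is read off the diagram and is partly an assumption (3 is taken to branch from 2, and 4 to branch from 1 and merge into 5); the function name ancestors is illustrative, not a Mercurial API.

```python
# Parent links for the example graph (an assumed reading of the diagram).
PARENTS = {
    0: [],
    1: [0],
    2: [1],
    3: [2],
    4: [1],
    5: [2, 4],
    6: [3],
    7: [5],
}


def ancestors(rev):
    """Return rev together with every revision reachable through parents."""
    seen = set()
    stack = [rev]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


covered = ancestors(7)
missing = sorted(set(PARENTS) - covered)
print(sorted(covered))  # [0, 1, 2, 4, 5, 7]
print(missing)          # [3, 6]
```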

Mercurial commit disappeared

We have switched to Mercurial recently. All had been going well until we had two incidents of committed changes going missing. Examining the logs has not made us any wiser.
Below is an example. The files committed at (1) revert to a previous state at (2) even though those files are not mentioned in the merge.
What can I check to understand why the files reverted?

There are three interesting changesets in this graph that can influence the (2) merge:
Teal changeset: not shown, but looks like it's just below the graph. This is the first parent of (2)
Blue changeset: number five from the bottom, labelled "Fix test". This is the second parent of (2).
Common ancestor of the parents: also not shown, will be further below. Strangely, it looks like the teal changeset could be the common ancestor, but Mercurial will not allow you to make such a degenerate merge under normal circumstances.
When Mercurial does a merge, these are the only three changesets that matter: the two heads you merge and their common ancestor. In a three-way merge the logic is now:
ancestor  parent1  parent2  =>  merge
X         X        Y            Y (clean)
X         Y        X            Y (clean)
X         Y        Y            Y (clean)
X         Y        Z            W (conflict)
Read the table like this: "if the ancestor was X, and the first parent was also X and the second parent was Y, then the merge will contain Y". In other words: a three-way merge favors change and will let a modification win.
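The table can be written down as a small Python function. This is only a sketch of the per-file decision rule described above, with a made-up function name; in the conflict case the actual result (W in the table) comes from a merge tool or a human, so the sketch returns None there.

```python
def merge3(ancestor, parent1, parent2):
    """Return (result, status) for one file, following the table above."""
    if parent1 == parent2:
        return parent1, "clean"   # both sides agree
    if parent1 == ancestor:
        return parent2, "clean"   # only parent2 changed
    if parent2 == ancestor:
        return parent1, "clean"   # only parent1 changed
    return None, "conflict"       # both changed, in different ways


rows = [("X", "X", "Y"), ("X", "Y", "X"), ("X", "Y", "Y"), ("X", "Y", "Z")]
for row in rows:
    print(row, "=>", merge3(*row))
```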
You can find the ancestor with
$ hg log -r "ancestor(p1(changeset-2), p2(changeset-2))"
where changeset-2 is the one marked with (2) above. When you say
The files committed at (1) revert to a previous state at (2) even though those files are not mentioned in the merge.
then it's important to understand that "a merge" is just a snapshot that shows how to mix two other changesets. The change made "in" a merge is the difference between this snapshot and its two parent changesets:
$ hg status --rev "p1(changeset-2):changeset-2"
$ hg status --rev "p2(changeset-2):changeset-2"
This shows how the merge changeset is different from its first and second parent, respectively. I'm sure the files are mentioned in one of those lists — unless the merge isn't the culprit after all.
When you examine the three changesets and the differences between them, then you will probably see that someone had to resolve a conflict (the fourth line in the merge table above) and picked the wrong file at some step along the way.

The merge at 2 is between a very old branch (dark blue, forked from the mainline/green branch just after commit 1) and an even older branch (light blue, hasn't been in sync with mainline since before commit 1)
It seems likely that the merge at 2 picked the wrong version of the file - can't tell from here if that was the tool picking the wrong version of the file, or the user manually selecting the wrong version.
Edited to add:
To help track down exactly what changed at 2, you can use hg diff -r REV1 -r REV2, which will show you the line-by-line differences between any two revisions.
When you know that the badness was introduced sometime between point 1 and point 2, hg bisect may help you track down the exact source of the badness:
hg bisect [-gbsr] [-U] [-c CMD] [REV]
subdivision search of changesets
This command helps to find changesets which introduce problems. To use,
mark the earliest changeset you know exhibits the problem as bad, then mark
the latest changeset which is free from the problem as good.
Bisect will update your working directory to a revision for testing
(unless the -U/--noupdate option is specified). Once you have
performed tests, mark the working directory as good or bad, and bisect
will either update to another candidate changeset or announce that it
has found the bad revision.

How do I cherry-pick a single revision in Mercurial?

In Mercurial/TortoiseHg, given the following example, what is the easiest way to merge revision "G" into repo A without taking D,E and F (Assume that G has no dependency on D,E or F).
Repo A: A - B - C
Repo B (Clone of A) A - B - C - D - E - F - G
Is a patch the best bet?

Tonfa is right. What you're describing isn't 'merging' (or 'pushing' or 'pulling'); it's 'cherry-picking'. A push or a pull moves all the changesets from one repo to another that aren't already in that repo. A 'merge' takes two 'heads' and merges them down to a new changeset that's the combination of both.
If you really need to move G over but can't possibly abide having D,E,F there you should 'hg export' G from repo B, and then 'hg import' it in repo A. The Transplant extension is a wrapper around export/import with some niceties to help avoid moving the same changeset over multiple times.
However, the drawback to using import/export, transplant, and cherry-picking in general is that you can't really move over G without its ancestors, because in Mercurial a changeset's name is its 'hashid' which includes the hashids of its parents. Different parents (G's new parent would be C and not F) means a different hashid, so it's not G anymore -- it's the work of G but a new changeset by name.
Moving over G as something new, let's call it G' (Gee prime), isn't a big deal for some uses, but for others it's a big pita. When repo B soon gets a new changeset, H, and you want to move it over, its parent will be changing from G to G', which have different hashes. That means H will move over as H' -- 100 changesets down the line and you'll have different hashids for everything all because you couldn't stand having D,E,F in repo A.
Things will get even more out of whack if/when you want to move stuff from Repo A into Repo B (the opposite direction of your earlier move). If you try to do a simple 'hg push' from A to B you'll get G' (and H' and all subsequent descendants) which will be duplicates of the changesets you already have in Repo B.
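The way a changed parent ripples through every later hashid can be illustrated with a simplified Python model. It hashes the two sorted parent ids followed by the content with SHA-1, which follows the general scheme Mercurial uses for node ids; the content strings are placeholders, and real changesets also carry metadata (user, date, manifest) that is left out here.

```python
import hashlib

NULLID = b"\0" * 20  # stand-in parent id meaning "no parent"


def node_id(text, p1=NULLID, p2=NULLID):
    """Hash the sorted parent ids followed by the content (simplified)."""
    first, second = sorted([p1, p2])
    return hashlib.sha1(first + second + text).digest()


# Repo B history: A - B - C - D - E - F - G - H
a = node_id(b"work of A")
b = node_id(b"work of B", a)
c = node_id(b"work of C", b)
d = node_id(b"work of D", c)
e = node_id(b"work of E", d)
f = node_id(b"work of F", e)
g = node_id(b"work of G", f)
h = node_id(b"work of H", g)

# Cherry-picked into repo A: same work, but G' sits on C and H' on G'.
g_prime = node_id(b"work of G", c)
h_prime = node_id(b"work of H", g_prime)

print(g.hex() == g_prime.hex())  # False: different parent, different id
print(h.hex() == h_prime.hex())  # False: the difference propagates
```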
What then, are your options?
Don't care. Your data is still there you just end up with the same changesets with different names and more work on future exchanges between the two repos. It's not wrong, it's just a little clumsy maybe, and some folks don't care.
Move all of D,E, and F over to Repo A. You can move all the changesets over if they're harmless and avoid all the hassle. If they're not so harmless you can move them over and then do a 'hg backout' to undo the effects of D,E and F in a new changeset H.
Give G better parentage to begin with. It's mean for me to mention this because it's too late to go this route (without editing history). What you should have done before working on changeset G was to hg update C. If G doesn't rely on or require changesets D,E, and F then it shouldn't be their kid.
If instead you update to C first you'll have a graph like this:
A - B - C - D - E - F
         \
          G
then, the whole answer to this question would just be hg push -r G ../repoA and G would move over cleanly, keeping its same hashid, and D, E, and F wouldn't go with it.
UPDATE:
As pointed out in the comments, with modern versions of Mercurial the hg graft command is the perfect way to do this.

Referring to the title, which addresses cherry picking in general, I give the example of working in one repo, as internet search engines might bring people here for cherry picking in general. Working in one repository, it would be done with hg graft:
hg update C
hg graft G
The result is:
          G'
         /
A - B - C - D - E - F - G
Extra warning: The two changesets will be treated as independent, parallel commits on the same files and might make you run into merge conflicts, which is why cherry picking should be avoided in general for branch management. For example, if G is a bug fix applied to a stable version branch bookmarked as 1.0.1, you should rather merge the freeze branch with it, and from time to time merge the master branch with the freeze branch's bugfixes.

Here's another approach:
hg import =(hg diff -c 7b44cc577701f956f12b029ad54d32fdce0a002d services/webpack/package.json)
This creates a diff for the changeset you want to patch in, then saves it to a temporary file and imports it. The filename(s) are optional.
<(...) also seems to work if you're not using zsh (creates a named pipe instead). Or you can manually save the patch to a file:
hg diff -c xxx > mypatchfile
