Flatten Mercurial Revision Chain


I have a long sequence of Mercurial changesets (each with only a single parent), some of which conflict with each other and others which do not. I'm trying to "flatten" that sequence of changes so that the resulting tree has the minimum depth, without triggering manual merges.
I have a simple bash script which seems to do this. It works by continually trying to rebase revisions on top of their grandparent revision. However, this approach is incredibly slow for long revision chains.
#!/bin/bash
for rev in $(hg l --template "{node}" | egrep --only-matching "[0-9a-f]+")
do
    while :
    do
        # Attempt to rebase the revision on top of its grandparent.
        hg rebase -s $rev -d "first(parents(parents($rev)))"
        if [ $? -eq 0 ]
        then
            # Find the updated revision after the rebase.
            rev=$(hg log --hidden --rev "$rev" | egrep rewritten | egrep --only-matching "[0-9]{4}[:][0-9a-z]+")
        else
            hg rebase --abort
            break
        fi
    done
done
Is there a faster way to do something similar?

A linear history can't contain "conflicting" changes: where changes overlap, the latest one simply wins.
Leaving MQ aside, you have at least two ways to squash commits in a history of any length:
histedit
The fold command from the Evolve extension

Related

Some hg changesets not merging after graft

I have two hg branches (dev and stable) that aren't merging like I'd expect.
On stable: I grafted in a one-line commit from dev.
On dev: Changed that one line that was grafted, committed change.
On stable: merged dev into stable (no conflicts).
However after this merge stable still has the grafted version of the line (step 1). Not the latest changes to that same line from dev (step 2). Why is this?
The file looks like:
This
file
to
be
merged
Changesets:
Changes "to" to "might" on dev
Grafts changeset 1 to stable
Changes "might" back to "to" on dev
Merges dev into stable. Result is "might" (not "to" like I'd expect to see from changeset 3).

Sorry about the delay here: As soon as you shrank the reproducer to the five commits, I knew what was going on, but I wanted to write my own reproducer before answering, and the priority of this dropped a lot. 😀 The script I used, mktest.hg, to create the commits, the graft, and the merge, appears at the end of this answer.
The key issue here is the way merge actually works in Mercurial. It uses the same algorithm as Git does: that is, it completely ignores any of the branch information, and completely ignores any timing information. It looks only at three specific commits, as found by examining the commit graph, as shown in your image. Here's a text variant via my own reproducer:
$ cd test-hg-graft/
$ cat file.txt
This
file
might
be
merged
$ hg lga
@ 4:b027441200d2:draft stable tip Chris Torek
|\ merge dev into stable (9 minutes ago)
| |
| o 3:01c6cc386a08:draft stable Chris Torek
| | back to "to" on stable (9 minutes ago)
| |
| o 2:ad954507e465:draft stable Chris Torek
| | s/to/might/ (9 minutes ago)
| |
o | 1:f7521e4f0941:draft dev Chris Torek
|/ s/to/might/ (9 minutes ago)
|
o 0:a163d2c4874b:draft stable Chris Torek
initial (9 minutes ago)
The lga alias is one I copied from someone else:
lga = log -G --style ~/.hgstuff/map-cmdline.lg
where map-cmdline.lg is in the link above. It's just log -G (aka glog) with a more-compact format.
What's going on
When we run:
hg merge dev
Mercurial locates three specific commits:
The current commit on stable, -r3 in this case (the SHA ID will vary), is one of the two endpoint commits.
The target commit on dev is the result of resolving dev to a revision. We can do this ourselves with hg id -r dev for instance:
$ hg id -r dev
f7521e4f0941 (dev)
$ hg id -n -r dev
1
Note that we can do the same thing with . (as in hg id -r .) to identify our current revision, although hg summary spills everything out more conveniently.
Last (or in some sense first, though we need the other two to get here), Mercurial locates a merge base commit from these two commits. The merge base is the first commit in the graph that is reachable from both of the other inputs to the merge. In our particular case, that's rev zero, since we split the branches apart right after -r0.
Technically, the merge base is the output of a Lowest Common Ancestor algorithm as run on the Directed Acyclic Graph. See Wikipedia for some examples. There can be more than one LCA; Mercurial picks one at (apparent) random for this case. In our case there is only one LCA though.
Having found the merge base, Mercurial now runs the equivalent of two diff operations:
hg diff -r 0 -r 3
to see what we changed, and:
hg diff -r 0 -r 1
to see what they changed, since the merge base snapshot.[1] If we do this ourselves, we see what Mercurial sees:
$ hg diff -r 0 -r 3
$ hg diff -r 0 -r 1
diff --git a/file.txt b/file.txt
--- a/file.txt
+++ b/file.txt
@@ -1,5 +1,5 @@
 This
 file
-to
+might
 be
 merged
(I have my hg diff configured with git = true so that I get diffs that I can feed to Git—long ago I was doing a lot of conversion work here.)
As far as Mercurial is concerned, then, we did nothing on our branch. So it combines do nothing with make this change to file.txt and comes up with this one change to file.txt. That one change is applied to the files from the merge base commit. The resulting files—well, file, singular, in this case—are the ones that are ready to go into the final merge commit, even though they're not the ones you wanted.
Because Mercurial has more information than Git—in particular, which branch something happened on—it would be possible for Mercurial to behave differently from Git here. But in fact, both do the same thing with this kind of operation. They both find a merge base snapshot, compare the snapshot to the two input commit snapshots, and apply the resulting combined changeset to the files from the merge base. Mercurial can do a better job of catching file renames (since it knows them, vs Git, which just has to guess) and could do a different job of merging here, but doesn't.
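The rule described above can be sketched for a single line. This is only an illustration of the three-way logic, not Mercurial's actual merge code; the function name merge3 is made up for this example.

```shell
#!/bin/sh
# merge3 BASE OURS THEIRS
# Illustrative three-way merge of one line (hypothetical helper, not part of hg).
merge3() {
    base=$1
    ours=$2
    theirs=$3
    if [ "$ours" = "$theirs" ]; then
        printf '%s\n' "$ours"      # both sides ended up identical
    elif [ "$ours" = "$base" ]; then
        printf '%s\n' "$theirs"    # we did nothing, so take their change
    elif [ "$theirs" = "$base" ]; then
        printf '%s\n' "$ours"      # they did nothing, so keep our change
    else
        printf 'CONFLICT\n'        # both sides changed it differently
        return 1
    fi
}

# Line 3 of file.txt in the reproducer:
# base is rev 0 ("to"), ours is rev 3 ("to"), theirs is rev 1 ("might").
merge3 to to might
```

Because rev 3 matches the merge base, the "we did nothing" branch is taken and the result is "might", regardless of the intermediate graft and revert on stable.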
[1] Some might object that Mercurial stores changesets, not snapshots. This is true, or rather sort of true: every once in a while, Mercurial stores a new copy of a file, instead of a change for it. But as long as we have all the commits needed, storing changes vs storing snapshots is pretty much irrelevant. Given two adjacent snapshots, we can find a changeset, and given one snapshot and a changeset to move forward or backward, we can compute a new snapshot. That's how we can extract a snapshot in Mercurial (which stores changesets), or show a changeset in Git (which stores snapshots).
Script: mktest.hg
#! /bin/sh
d=test-hg-graft
test "$1" = replay && rm -rf $d
if test -e $d; then
echo "fatal: $d already exists" 1>&2
exit 1
fi
set -e
mkdir $d
cd $d
hg init
hg branch stable
cat << END > file.txt
This
file
to
be
merged
END
hg add file.txt
hg commit -m initial
hg branch dev
ed file.txt << END
3s/to/might/
w
q
END
hg commit -m 's/to/might/'
hg checkout stable
hg graft -r 1 # pick up s/to/might/; graft makes its own commit
ed file.txt << END
3s/might/to/
w
q
END
hg commit -m 'back to "to" on stable'
hg merge dev
hg commit -m "merge dev into stable"

Disallow hg push -f - but allow hg pull creating new head

As a followup for Mercurial: enforce "hg pull -u" before "hg commit"
I have started to use a hook
[hooks]
pretxnchangegroup.forbid_2heads = /usr/local/bin/forbid_2head.sh
where forbid_2head.sh looks like this
#!/bin/bash
BRANCH=`hg branch`
COUNT=`hg heads --template '{branch}|{rev}\n' | grep ^${BRANCH} | wc -l`
if [ "$COUNT" -ne "1" ] ; then
    echo "=========================================================="
    echo "Trying to push more than one head, which is not allowed"
    echo "You seem to try to add changes to an old changeset!"
    echo "=========================================================="
    exit 1
fi
exit 0
It is derivative of the script found at http://tutorials.davidherron.com/2008/10/forbidding-multiple-heads-in-shared.html
where I do allow multiple named branches.
The problem I have now is that
it stops hg push -f, which is what I wanted
it also stops hg pull when there are incoming changesets and I have outgoing commits, which is bad
Can I in any way reuse the same script but change the hook setup and stop "hg push -f"?
Or can I in forbid_2head.sh know whether this is a push or pull command running?

First, the script isn't completely correct: it just counts the number of heads in the branch currently checked out on the server (the one hg branch reports). You could improve it by using
hg heads tip
to get the heads of the branch of tip. But someone might push changesets on more than one branch at a time, so what you really want is
hg heads --template '{branch}\n' $HG_NODE:tip
to find branch heads for the branches touched by $HG_NODE:tip (the changesets pushed in the not-yet-committed transaction). You can then compare that with
hg log --template '{branch}\n' -r $HG_NODE:tip | sort -u
which are the branches touched by the changegroup.
If you don't want to allow existing multiple heads, then you can simplify the above to just
$(hg heads --template 'x' | wc -c) -eq $(hg branches | wc -l)
which just tests that the number of branch heads is equal to the number of branches — i.e., that there is exactly one branch head per named branch.
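If it is useful to report which branch is at fault rather than only comparing counts, a per-branch check could look roughly like this. It is a sketch: the function name branches_with_extra_heads is invented here, and it assumes the input has one branch name per head, as produced by the hg heads template used above.

```shell
#!/bin/sh
# Reads branch names on stdin, one line per head, and prints every branch
# name that occurs more than once (i.e. every branch with several heads).
# Hypothetical helper; not part of Mercurial.
branches_with_extra_heads() {
    sort | uniq -d
}

# Possible use inside the hook (not executed here):
#   extra=$(hg heads --template '{branch}\n' | branches_with_extra_heads)
#   test -z "$extra" || { echo "multiple heads on: $extra" >&2; exit 1; }
```

An empty result means every named branch has exactly one open head.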
With that out of the way, let me mention $HG_SOURCE. That environment variable is set by Mercurial when it runs the hook: it has the value push if the changegroup is being pushed into the repository using direct filesystem access, and the value serve if the changegroup is coming in over SSH or HTTP. See the Mercurial book.
So, to conclude, I believe this is a good "forbid multiple heads" script:
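#!/bin/sh
HEADS=$(hg heads --template 'x' | wc -c)
BRANCHES=$(hg branches | wc -l)
# A hook must exit non-zero to reject the transaction, so fail only when
# changesets arrive over SSH/HTTP and leave more heads than branches.
if test "$HG_SOURCE" = 'serve' -a "$HEADS" -ne "$BRANCHES"; then
    exit 1
fi
exit 0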

Mark changes as already merged or deliberately ignored with hg pull/push/merge/graft?

I'm transitioning to Mercurial from Subversion, where I'm used to using svnmerge.py to track changes that have already been merged, or which have been blocked from being merged:
# Mark change 123 as having already been merged; it will not be merged again, even if a range
# that contains it is subsequently specified.
svnmerge.py merge -M -r123
#
# Block change 326 from being considered for merges.
svnmerge.py merge -X -r326
#
# Show changes that are available for merging from the source branch.
svnmerge.py avail
#
# Do a catchall merge of the remaining changes. Neither change 123 nor change 326 will be
# considered for merging.
svnmerge.py merge
I want to be able to do something similar for hg pull/push/merge/graft, so that if I know that I never want to merge a given change, I can just block it from consideration, making subsequent cherry-picking, merging, etc., into a more fire-and-forget affair. I have done a lot of googling, but have not found a way to do this.
There also appears to be no way to view a list of as-yet-ungrafted changes.
As I'm often tidying up after other developers and helping them with their merges, it's immensely helpful to be able to do these kinds of things, which one might well consider "inverse cherry-picking;" i.e., marking changes that you do NOT want to merge, and then doing a bulk merge of the remainder.

DAG-based systems like Mercurial and Git are all or nothing: when you merge two branches, you do a three-way merge of the common ancestor and the two branches.
The three-way merge is only concerned with the final state of each branch. For instance, it doesn't matter whether you make your changes in 10 or 1000 steps: the merge result will be the same.
This implies that the only way to ignore a changeset is to back it out before the merge:
$ hg backout BAD
That will cancel the changeset on the branch, making it appear that it was never made from the perspective of the three-way merge.
If you have a whole branch that you want to merge, but ignore, then you can do a dummy merge:
$ hg merge --tool internal:local --non-interactive
$ hg revert --all --rev .
That goes through the merge, but reverts back to the old state before committing.
The best advice I can give you is to structure your workflow so that the above backouts aren't necessary. This means committing a bugfix on the oldest applicable branch. If a bug is found while creating feature X, then use hg bisect to figure out when the bug was introduced. Now update back to the oldest branch where you still want to fix the bug:
$ hg update 2.0
# fix bug
$ hg commit -m "Fixed issue-123"
then merge the bugfix into all later branches:
$ hg update 2.1
$ hg merge 2.0
$ hg commit -m "Merge with 2.0 to get bugfix for issue-123"
$ hg update 2.2
$ hg merge 2.1
$ hg commit -m "Merge with 2.1 to get bugfix for issue-123"
If the bugfix no longer applies, then you should still merge, but throw away the unrelated changes:
$ hg update 3.0
$ hg merge 2.2 --tool internal:local --non-interactive
$ hg revert --all --rev .
$ hg commit -m "Dummy merge with 2.2"
That ensures that you can always use
$ hg log -r "::2.2 - ::3.0"
to see changesets on the 2.2 branch that haven't been merged into 3.0 yet.

In Mercurial (hg), how do you see a list of files that will be pushed if an "hg push" is issued?

We can see all the changesets and the files involved using
hg outgoing -v
but the filenames are all scattered in the list of changesets.
Is there a way to just see a list of all the files that will go out if hg push is issued?

First, create a file with this content:
changeset = "{files}"
file = "{file}\n"
Let's say you call it out-style.txt and put it in your home directory. Then you can give this command:
hg -q outgoing --style ~/out-style.txt | sort -u

A somewhat under-appreciated feature: hg status can show information about changes in file status between arbitrary changesets. This can be used to get a list of files changed between revisions X and Y:
hg status --rev X:Y
In this case, we can use hg outgoing, to find the first outgoing changeset X and then do
hg status --rev X:
to see the files changed since revision X. You can combine this into a single line in your shell:
hg status --rev $(hg outgoing -q --template '{node}' -l 1):

I usually use
hg outgoing -v | grep files
It makes the listing shorter, but doesn't sort. So far I haven't been in a situation where I wanted to push so much (and check the files at the same time) that it's been a problem.
[Edit]
To do what you want:
Use cut to remove the files: part
For changesets with more than one touched file, use tr to put them on separate lines
Finally sort the resulting output with sort
Like so:
hg outgoing -v |grep files: |cut -c 14- |tr ' ' '\n' |sort -u
You can put this in ~/outgoingfiles.sh or something to have it nice and ready.
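As a variation on the pipeline above, the fixed column offset in cut -c 14- can be avoided by letting awk split on whitespace. This is a sketch only: the function name outgoing_files is made up, and it assumes the verbose output contains lines of the form "files:" followed by space-separated names, so file names containing spaces would be split incorrectly (as they are by the tr version).

```shell
#!/bin/sh
# Reads `hg outgoing -v` style output on stdin and prints each touched file
# once, sorted. Hypothetical helper; assumes "files:   a b c" lines.
outgoing_files() {
    awk '$1 == "files:" { for (i = 2; i <= NF; i++) print $i }' | sort -u
}

# Possible use (not executed here):
#   hg outgoing -v | outgoing_files
```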

I use TortoiseHg, which is a shell extension that has a "synchronize" view allowing you to see outgoing files before you push them. It's convenient for commits as well, and other things.

A simple hg out will also solve this.
It lists all changesets that have been committed but not yet pushed.

How do I get the current mercurial revision without calling hg?

In Git the current revision hash is stored in
.git/refs/heads/master
Is there an equivalent in Mercurial that doesn't require me making a call to hg log -l1? I know I can get the current branch in .hg/branch.
This is to "display" the current hg hash on screen when browsing a web page.

$ hg parents --template="{node}\n"
52b8cee1e59c91b9147635b7f44a3a8896ee0b00
$ hexdump -n 20 -e '1/1 "%02x"' .hg/dirstate
52b8cee1e59c91b9147635b7f44a3a8896ee0b00
But why can't you just call hg parents --template="{node}\n"?
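If the hash really must be read without invoking hg, the hexdump approach above can be wrapped in a small function using od, which is specified by POSIX. This is a sketch under one loud assumption: the dirstate file uses the original format, in which the first 20 bytes are the first parent's node ID. Repositories using the newer dirstate-v2 format lay the file out differently, so the result would not be a changeset hash there. The function name first_parent_hex is invented for this example.

```shell
#!/bin/sh
# Prints the first 20 bytes of the given file as 40 lowercase hex digits.
# Hypothetical helper; assumes the original (v1) dirstate layout.
first_parent_hex() {
    od -An -N20 -tx1 "$1" | tr -d ' \n'
    echo
}

# Possible use from the repository root (not executed here):
#   first_parent_hex .hg/dirstate
```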

hg id --debug -i -r .

I'm not a Mercurial expert, but taking the sledgehammer approach and grepping for the current revision hash in .hg yields only one candidate: .hg/branchheads.cache.
I believe this caches all the heads of the repository, so it may have multiple entries. By default, I think it will always have two entries, one for the default branch and one for the tip revision number.
I think that branchheads.cache is rebuilt whenever new changesets arrive, so it should always have the correct current revision hash in it.
