Disallow hg push -f - but allow hg pull creating new head - mercurial

As a followup for Mercurial: enforce "hg pull -u" before "hg commit"
I have started to use a hook
[hooks]
pretxnchangegroup.forbid_2heads = /usr/local/bin/forbid_2head.sh
where forbid_2head.sh looks like this
#!/bin/bash
BRANCH=`hg branch`
COUNT=`hg heads --template '{branch}|{rev}\n' | grep ^${BRANCH} | wc -l`
if [ "$COUNT" -ne "1" ] ; then
echo "=========================================================="
echo "Trying to push more than one head, which is not allowed"
echo "You seem to try to add changes to an old changeset!"
echo "=========================================================="
exit 1
fi
exit 0
It is derivative of the script found at http://tutorials.davidherron.com/2008/10/forbidding-multiple-heads-in-shared.html
where I do allow multiple named branches.
The problem I have now is that
it stops hg push -f which is what I wanted
it also stops hg pull in case there are incoming changeset and I have commits outgoing. This is indeed bad
Can I in any way reuse the same script but change the hook setup and stop "hg push -f"?
Or can I in forbid_2head.sh know whether this is a push or pull command running?

First, the script isn't completely correct: it just counts the number of heads in the branch currently checked out on the server (the one hg branch) reports. You could improve it by using
hg heads tip
to get the heads of the branch of tip. But someone might push changesets on more than one branch at a time, so what you really want is
hg heads --template '{branch}\n' $HG_NODE:tip
to find branch heads for the branches touched by $HG_NODE:tip (the changesets pushed in the not-yet-committed transaction). You can then compare that with
hg log --template '{branch}\n' -r $HG_NODE:tip | sort -u
which are the branches touched by the changegroup.
If you don't want to allow existing multiple heads, then you can simplify the above to just
$(hg heads --template 'x' | wc -c) -eq $(hg branches | wc -l)
which just tests that the number of branch heads is equal to the number of branches — i.e., that there is exactly one branch head per named branch.
With that out of the way, let me mention $HG_SOURCE. That environment variable is set by Mercurial when it runs the hook: it has the value push if the changegroup is being pushed into the repository using direct filesystem access, and the value serve if the changegroup is coming in over SSH or HTTP. See the Mercurial book.
So, to conclude, I believe this is a good "forbid multiple heads" script:
#!/bin/sh
HEADS=$(hg heads --template 'x' | wc -c)
BRANCHES=$(hg branches | wc -l)
test $HG_SOURCE = 'serve' -a $HEADS -ne $BRANCHES

Related

Some hg changesets not merging after graft

I have two hg branches (dev and stable) that aren't merging like I'd expect.
On stable: I grafted in a one-line commit from dev.
On dev: Changed that one line that was grafted, committed change.
On stable: merged dev into stable (no conflicts).
However after this merge stable still has the grafted version of the line (step 1). Not the latest changes to that same line from dev (step 2). Why is this?
The file looks like:
This
file
to
be
merged
Changesets:
Changes "to" to "might" on dev
Grafts changeset 1 to stable
Changes "might" back to "to" on dev
Merges dev into stable. Result is "might" (not "to" like I'd expect to see from changeset 3).
Sorry about the delay here: As soon as you shrank the reproducer to the five commits, I knew what was going on, but I wanted to write my own reproducer before answering, and the priority of this dropped a lot. 😀 The script I used, mktest.hg, to create the commits, the graft, and the merge, appears at the end of this answer.
The key issue here is the way merge actually works in Mercurial. It uses the same algorithm as Git does: that is, it completely ignores any of the branch information, and completely ignores any timing information. It looks only at three specific commits, as found by examining the commit graph, as shown in your image. Here's a text variant via my own reproducer:
$ cd test-hg-graft/
$ cat file.txt
This
file
might
be
merged
$ hg lga
# 4:b027441200d2:draft stable tip Chris Torek
|\ merge dev into stable (9 minutes ago)
| |
| o 3:01c6cc386a08:draft stable Chris Torek
| | back to "to" on stable (9 minutes ago)
| |
| o 2:ad954507e465:draft stable Chris Torek
| | s/to/might/ (9 minutes ago)
| |
o | 1:f7521e4f0941:draft dev Chris Torek
|/ s/to/might/ (9 minutes ago)
|
o 0:a163d2c4874b:draft stable Chris Torek
initial (9 minutes ago)
The lga alias is one I stole borrowed copied from someone else:
lga = log -G --style ~/.hgstuff/map-cmdline.lg
where map-cmdline.lg is in the link above. It's just log -G (aka glog) with a more-compact format.
What's going on
When we run:
hg merge dev
Mercurial locates three specific commits:
The current commit on stable, -r3 in this case (the SHA ID will vary), is one of the two endpoint commits.
The target commit on dev is the result of resolving dev to a revision. We can do this ourselves with hg id -r dev for instance:
$ hg id -r dev
f7521e4f0941 (dev)
$ hg id -n -r dev
1
Note that we can do the same thing with # to identify our current revision, although hg summary spills everything out more conveniently.
Last (or in some sense first, though we need the other two to get here), Mercurial locates a merge base commit from these two commits. The merge base is the first commit in the graph that is reachable from both of the other inputs to the merge. In our particular case, that's rev zero, since we split the branches apart right after -r0.
Technically, the merge base is the output of a Lowest Common Ancestor algorithm as run on the Directed Acyclic Graph. See Wikipedia for some examples. There can be more than one LCA; Mercurial picks one at (apparent) random for this case. In our case there is only one LCA though.
Having found the merge base, Mercurial now runs the equivalent of two diff operations:
hg diff -r 0 -r 3
to see what we changed, and:
hg diff -r 0 -r 1
to see what they changed, since the merge base snapshot.1 If we do this ourselves, we see what Mercurial sees:
$ hg diff -r 0 -r 3
$ hg diff -r 0 -r 1
diff --git a/file.txt b/file.txt
--- a/file.txt
+++ b/file.txt
## -1,5 +1,5 ##
This
file
-to
+might
be
merged
(I have my hg diff configured with git = true so that I get diffs that I can feed to Git—long ago I was doing a lot of conversion work here.)
As far as Mercurial is concerned, then, we did nothing on our branch. So it combines do nothing with make this change to file.txt and comes up with this one change to file.txt. That one change is applied to the files from the merge base commit. The resulting files—well, file, singular, in this case—are the ones that are ready to go into the final merge commit, even though they're not the ones you wanted.
Because Mercurial has more information than Git—in particular, which branch something happened on—it would be possible for Mercurial to behave differently from Git here. But in fact, both do the same thing with this kind of operation. They both find a merge base snapshot, compare the snapshot to the two input commit snapshots, and apply the resulting combined changeset to the files from the merge base. Mercurial can do a better job of catching file renames (since it knows them, vs Git, which just has to guess) and could do a different job of merging here, but doesn't.
1Some might object that Mercurial stores changesets, not snapshots. This is true—or rather, sort of true: every once in a while, Mercurial stores a new copy of a file, instead of a change for it. But as long as we have all the commits needed, storing changes vs storing snapshots is pretty much irrelevant. Given two adjacent snapshots, we can find a changeset, and given one snapshot and a changeset to move forward or backward, we can compute a new snapshot. That's how we can extract a snapshot in Mercurial (which stores changesets), or show a changeset in Git (which stores snapshots).
Script: mktest.hg
#! /bin/sh
d=test-hg-graft
test "$1" = replay && rm -rf $d
if test -e $d; then
echo "fatal: $d already exists" 1>&2
exit 1
fi
set -e
mkdir $d
cd $d
hg init
hg branch stable
cat << END > file.txt
This
file
to
be
merged
END
hg add file.txt
hg commit -m initial
hg branch dev
ed file.txt << END
3s/to/might/
w
q
END
hg commit -m 's/to/might/'
hg checkout stable
hg graft -r 1 # pick up s/to/might/; graft makes its own commit
ed file.txt << END
3s/might/to/
w
q
END
hg commit -m 'back to "to" on stable'
hg merge dev
hg commit -m "merge dev into stable"

Mercurial automatic buld workflow?

I'm trying to replicate a workflow using Mercurial. It seems like this should be common, but I'm not quite sure how to do the magic bits.
User A creates changesets A1 and A2. User A does not have changesets B or C.
A1 and A2 are submitted to the server, which puts them in a queue. Changesets from other users B and C are ahead in the queue.
Each changeset (B then C then A1/A2) is rebased into the main branch automatically, built, and then accepted into the trunk (or rejected).
The codebase is very large so builds take a long time. In the meantime A3 and A4 are generated by User A.
User A does a pull and gets B', C', and the new A1'/A2' without getting duplicates. A3 and A4 move to the tip of the trunk as development continues.
Step 5 is the one that has me the most stymied. A "git pull --rebase" seems to recognize that the changesets are the same and so A1/A2 disappear whereas with Hg it is a conflict. I don't expect Hg to be exactly the same workflow, I just need some way for a developer to be able to pull the trunk and not have to do manual fix-up of their tree to get their changesets in order. I also need some explainable workflow as to how to recover if your changeset is rejected. Does anyone have experience with this type of workflow that can recommend a tactic?
Thanks
Edit: Here's a simulator for the workflow. I'm certainly willing to try any other workflow that will solve the problem of being able to continue to build while changesets are progressing through acceptance and coming back smoothly.
rm -rf master
rm -rf build
rm -rf c1
rm -rf c2
rm -rf c3
rm -rf bundles
# Master repository
mkdir master
hg init master
echo x >> master/m1.txt
hg -R master add master/m1.txt
hg -R master commit master/m1.txt -m"m-1"
echo x >> master/m1.txt
hg -R master commit master/m1.txt -m"m-2"
echo x >> master/m1.txt
hg -R master commit master/m1.txt -m"m-3"
# Build repository
hg clone master build
# Setup first client
hg clone master c1
echo x >> c1/client1.txt
hg -R c1 add c1/client1.txt
hg -R c1 commit c1/client1.txt -m"c1-1"
echo x >> c1/client1.txt
hg -R c1 commit c1/client1.txt -m"c1-2"
# Setup second client
hg clone master c2
echo x >> c2/client2.txt
hg -R c2 add c2/client2.txt
hg -R c2 commit c2/client2.txt -m"c2-1"
echo x >> c2/client2.txt
hg -R c2 commit c2/client2.txt -m"c2-2"
# Setup third client
hg clone master c3
echo x >> c3/client3.txt
hg -R c3 add c3/client3.txt
hg -R c3 commit c3/client3.txt -m"c3-1"
echo x >> c3/client3.txt
hg -R c3 commit c3/client3.txt -m"c3-2"
# Create the 3 bundles simulating the queue; all clients have pushed
# Hopefully this is done with a push hook
# All changesets are still draft phase
mkdir bundles
hg -R c2 bundle bundles/c2.bundle
hg -R c3 bundle bundles/c3.bundle
hg -R c1 bundle bundles/c1.bundle
# Process first bundle
hg -R build pull bundles/c2.bundle --rebase
hg -R build update
hg -R build push master
# Client 1 pulls at this point
hg -R c1 pull master -u --rebase
# Process second and third bundle
hg -R build pull bundles/c3.bundle
hg -R build rebase -b 5 -d 4
hg -R build pull bundles/c1.bundle
hg -R build rebase -b 7 -d 6
hg -R build push master
# Client 1 pulls again, getting the changesets that were pushed
hg -R c1 pull master -u --rebase
There's one difference between git and mercurial usually-used setups: git often allows rebasing, pruning and other rewrite / destructive operations on other people or remote repositories - mercurial by default does not and only allows amend operations.
However, there's a mercurial way: mercurial introduced some time ago the concept of phases, notably the unmutable phase public and the mutable phase draft. You can now declare repositories as non-publishing (it's a but unfortunate term in my eyes - it basically means that commits pushed there do NOT become phase public but remain in draft phase, thus mutable.
Thus setup your "central" repository as one of those non-publishing repositories, tell all your contributors to make use of a reasonably modern mercurial and to push changes as phase draft (or make sure so in a server-side hook and reject pushes of new changesets with public phase - but that might be problematic). Mercurials exchange of obsolescence markers makes sure that users get the information of which changesets become deprecated and by which new changesets they are being replaced.
The server-side setup then will need to take care of changing the phase of the accepted changesets from draft to public.
Mind: once a changeset is converted to phase public, that changeset becomes unmutable. And the phase will propagate to all who pull - and thus even reverting the phase to draft, and changing or pruning that changeset WILL inevitably permanently leave the repos of all who pulled with extra changesets which everyone will have to prune manually themselves!
You (and all your contributors) also probably should have a look at the evolve extension which makes handling non-publishing repositories and working with and exchanging mutable changesets and their modifications much more comfortable.

Show if the revision of subrepos changed, or if they are dirty

I have a mercurial repository (main repo) with several sub repositories.
I need a mercurial command to show if the revision of a sub repo changed (including information on old and new revision) in the working copy or if the sub repo state is dirty.
How can I do this?
Most mercurial commands accept the -S or --subrepos flag. Thus by calling hg st -S you get a list of all changed files which include those in the sub-repos, if their state differs from the state recorded in the .hgsubstate file:
$ cd opengfx/
$ hg st
$ hg id
10065545725a tip
$ cd ..
$ hg st -S
M opengfx/.hgtags
M opengfx/Makefile
A opengfx/lang/japanese.lng
$ cat .hgsubstate
785bc42adf236f077333c55c58490cce16367e92 opengfx
As to your wish to obtain the actual revisions, that's AFAIK not possible with one command. However you can always check the status of the individual sub-repos like above or can check them from the main repo by giving mercurial another path to operate on:
$ hg id -R opengfx
10065545725a tip
In order to get the status of each repo compared to what is required by the parent repo, I'd resort to some simple bash:
for i in $(cat .hgsubstate | cut -d\ -f2); do echo $i is at $(hg id -R $i) but parent requires $(cat .hgsubstate | grep $i | cut -d\ -f1); done
which gives output like
opengfx is at 10065545725a tip but parent requires 785bc42adf236f077333c55c58490cce16367e92
In a similar fashion you can also check whether the state is modified by using hg st instead of hg id.

Mercurial list files from changegroup

I want a simple command to get the list of files which differ after pushing to a mercurial repository on a server,
that is the differences between the previous push and the current push.
On the server, I have a hook on the changegroup event which calls a bash script.
$HG_NODE is the revision for the first commit in the changegroup.
My attempts:
hg status --rev $HG_NODE::
Exclude changes made in the first commit
hg status --rev $HG_NODE^1::
Includes changes that affected the parent revision through others pushes.
hg log -v --rev $HG_NODE:: | grep ^files
Include committed then reverted changes, still has 'files:' and files are not one per line
hg status --rev $HG_NODE:: && hg status change --rev $HG_NODE
Does not give exactly what I want, rather the change between the parent to the first commit in the changegroup + the change between this one (instead of the two changesets merged)
hg status --rev some_tag ; hg tag --remove some_tag ; hg tag --local some_tag
(Have not tried, is it a good idea?) Uses a local tag to keep track of the head for the last push, updates it each time.
Also I want to only monitor the default branch, so I assume I will use something like --rev "revision and branch(default)" or the branch option for hg log.
As to why, I need to go through the changes to determine parameters for an automated build.
hg stat -r is to list the removed files. To get the list of files which differ, you should use hg st --rev as follows.
hg st -m --rev tip^:tip

Close multiple branches in TortoiseHg

I use TortoiseHg 1.1 and in our team we tend to create named branches for each case we work on then merge with default when we're done. Unfortunately most of the named branches don't get closed. This isn't a huge problem but it's annoying when you try and filter by the branch you are currently working on and there is a massive list of old branches to search through.
So is there an easy way in TortoiseHg or from the command line to close multiple branches quickly without having to manually commit and close on each branch?
Unfortunately no.
You need to close each branch separately with its own commit on that branch.
The best way to do that is to use the command line, you could even create a simple batch file to close a bunch of them:
for %%f in (branch1 branch2 branch4) do (
hg up "%%f"
if exitcode 1 goto quit
hg commit --close-branch -m "closing branch %%f"
if exitcode 1 goto quit
)
:quit
The workflow I use is to use the bookmarks extension to keep only local bookmarks for my lines of development, this way I use normal unnamed branches in Mercurial, but can still easily tell them apart locally. The commit messages are used to separate them later.
This way I don't have a bunch of named branches cluttering up my history, I don't have to remember to close them, and I only have to keep track of the branches I use locally.
See the following for all the possible options:
https://www.mercurial-scm.org/wiki/PruningDeadBranches
of which the closing branch option can be used.
hg up -C badbranch
hg commit --close-branch -m 'close badbranch, this approach never worked'
hg up -C default # switch back to "good" branch
Also, my understanding has been that it is preferable to clone for most work and use named branches only for few possible long lived trains of development.
Here is a powershell script that will close all active branches except the default.
Add your own filtering logic to exclude branches you don't want to close.
$branches = hg branches -a | sort
$exclude = "default"
foreach ($item in $branches)
{
$branch = $item.Split([string[]] " ", [StringSplitOptions]::RemoveEmptyEntries)[0]
if(!$exclude.Contains($branch))
{
hg update $branch
hg commit --close-branch -m "closing old branch"
hg status
Write-Host $branch
}
}
hg push
Okay so this is way late, but I set out to do this via bash and after a lot of pain I got a working solution. Take note that this does not do an hg push at the end, because I wanted to look over the results in tortoise before pushing.
#Protected branches to not close
develop='develop'
default='default'
IFS=$'\n'
#Get all branches
allBranches=$( hg branches | sort )
#Loop over and close old branches
for item in $allBranches
do
#Trim off everything but the branch name
branch=$( echo $item | sed -r 's/\s*[0-9]+:[0-9a-f]+(\s\(inactive\))?$//' )
if [ $branch != $develop ] && [ $branch != $default ]
then
hg update $branch
hg commit --close-branch -m "Closing old branch"
fi
done
After running the script I asked the team which branches they were actively working on and stripped those closure commits.