Hg clone only public commits - mercurial

We have a big Hg repo, hosted in a remote location. Performing an hg clone from this master repo takes about an hour. What we generally do to speed things up is to hg serve a local repo of a colleague, hg clone http://colleague-machine, and then change the default path in .hg/hgrc to the address of the master repo.
This is all well and good, but this workaround has one drawback: because we are cloning the repo of a developer, some draft commits can be cloned along with the public ones. Moreover, these commits become public in the cloned repo, making them indistinguishable from the others.
One possibility I found is to make the developer's repo non publishing, in order to preserve the phases of the commits and to remove them later on. Another possibility is to create a bundle containing only the public commits, instead of cloning directly.
These methods are more complex to explain and to document. Is there an option for hg clone to clone only the public commits? I tried with hg clone -r "public()", but clone does not take a revset, just a regular commit identifier. Alternatively, is there an option for hg serve to serve only the public commits?

Throw disk space at the problem: just keep a local mirror clone that you update regularly.
Cloning the "true master" is slow because it's far away over a slow link. But updating the mirror is fast because, while the true master is far away over a slow link, little data needs to traverse it; and cloning the mirror is fast, and gets you the state of the true master as of the last time the mirror was updated.
As you mention, you can then just replace the default path (and maybe run a subsequent hg pull to pick up anything not-yet-mirrored, if needed). Your new clone is then the same as it would have been, had you cloned from the far-away slow true master, except that it went fast.
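The mirror workflow above could be scripted roughly like this. It is only a sketch under assumptions: the helper names `set_default_path` and `clone_via_mirror` are made up for illustration, and it assumes the fresh clone's .hg/hgrc is plain INI that Python's `configparser` can read (any comments in the file are dropped when it is rewritten).

```python
import configparser
import subprocess
from pathlib import Path


def set_default_path(repo_dir, master_url):
    """Point [paths] default in <repo_dir>/.hg/hgrc at master_url."""
    hgrc = Path(repo_dir) / ".hg" / "hgrc"
    # interpolation=None: URLs may contain '%'. optionxform=str keeps key case.
    cfg = configparser.ConfigParser(interpolation=None)
    cfg.optionxform = str
    cfg.read(hgrc)
    if not cfg.has_section("paths"):
        cfg.add_section("paths")
    cfg.set("paths", "default", master_url)
    with open(hgrc, "w") as fh:
        cfg.write(fh)


def clone_via_mirror(mirror_dir, new_clone, master_url, run=subprocess.check_call):
    """Refresh the mirror, clone it locally, then repoint the clone at the master."""
    # Update the mirror: only new changesets cross the slow link.
    run(["hg", "pull", "-R", str(mirror_dir)])
    # Fast local clone of the mirror.
    run(["hg", "clone", str(mirror_dir), str(new_clone)])
    # Make the far-away master the default path of the new clone.
    set_default_path(new_clone, master_url)
    # Pick up anything that was not mirrored yet.
    run(["hg", "pull", "-R", str(new_clone)])
```

Calling `clone_via_mirror("/srv/hg/mirror", "work", "https://hg.example.invalid/master")` (all three values are placeholders) would run the four steps in order. The `run` parameter is there only so the command sequence can be checked without touching a real repository.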
Git has this kind of cloning built in, as what's called a reference clone. You point your git clone process at two repositories: the true source, and the "close and fast" reference. It gets hash IDs from the true source but then uses the close-and-fast reference's storage for its data. You can then choose to continue to rely on the reference (default) or "dissociate" from the reference so that your clone is independent. It needs this dissociate operation because it can do a somewhat dangerous path-name-based "link" (not really a link in the sense of hard links; more an in-Git analogue to symbolic links) to the original, and does so by default here.
I don't think Mercurial has anything equivalent "out of the box". I imagine it should be relatively easy to write as an extension, though, if you are up for that sort of thing. You wouldn't need --dissociate at all; it would be the default wherever hard links are not feasible.

One way to do this is to use hg clone -r <rev> where <rev> is public. That will ensure that you won't get any draft commits, although you will miss any branches that aren't ancestors of <rev>.
I don't think there's a generic way to clone only public changes. It might be possible via a server-side extension or in-process hook though.
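If someone can run a command inside the colleague's repository, the "missed branches" limitation can be reduced by passing one -r per public head, since every ancestor of a public changeset is itself public. A rough sketch of that idea follows. The helper names are hypothetical, and having the colleague hand over the hashes is my assumption, not something hg clone does for you.

```python
import subprocess


def public_heads(repo_dir, run=subprocess.check_output):
    """Return the hashes of the heads of the public part of repo_dir."""
    out = run(["hg", "log", "-R", str(repo_dir),
               "-r", "heads(public())", "-T", "{node}\n"])
    if isinstance(out, bytes):
        out = out.decode("ascii")
    return out.split()


def clone_command(source_url, dest, heads):
    """Build an `hg clone` command that fetches only the given heads and their ancestors."""
    if not heads:
        # Without any -r, hg clone would fetch everything, drafts included.
        raise ValueError("no public heads given")
    cmd = ["hg", "clone"]
    for node in heads:
        cmd += ["-r", node]
    return cmd + [source_url, dest]
```

`public_heads` has to run on the machine that holds the colleague's repository, because a remote clone cannot evaluate the revset. The resulting command can then be run on the target machine.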

I ended up using a combination of hg serve option and hg strip.
On the existing repository:
hg serve --config phases.publish=False --port 0 --prefix repo-name
On the target machine:
hg clone <address printed by `hg serve`>
cd repo-name
hg strip -r "draft()"
The phases.publish=False config makes the repo non-publishing, and thus preserves the phase of the commits that are cloned. Now that the phases are kept on the target machine, it is easy to strip them off after the clone.
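Two practical details when scripting the strip step: hg strip lives in an extension (the bundled strip extension, or mq on older versions), and as far as I know it aborts when the revset is empty, i.e. when the colleague had no draft commits. A small sketch that handles both; the function name and the `run` parameter are illustrative only:

```python
import subprocess


def strip_drafts(repo_dir, run=subprocess.check_output):
    """Strip draft changesets from repo_dir and return how many were found."""
    drafts = run(["hg", "log", "-R", str(repo_dir),
                  "-r", "draft()", "-T", "{node}\n"])
    if isinstance(drafts, bytes):
        drafts = drafts.decode("ascii")
    nodes = drafts.split()
    if nodes:
        # Enable the bundled strip extension for this one invocation only.
        run(["hg", "--config", "extensions.strip=", "strip",
             "-R", str(repo_dir), "-r", "draft()"])
    return len(nodes)
```

After the clone from the non-publishing server, `strip_drafts("repo-name")` would remove the drafts, or do nothing if there were none.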

Related

Mercurial: why do pulled changesets not become public?

Consider the following situation:
$ md repo1; cd repo1
$ echo some data > myfile
$ hg init; hg addremove; hg commit -m "First commit."
adding myfile
myfile
committed changeset 0:32c7aa047f3b
$ hg serve
listening at http://vostro.rath.org:8000/ (bound to *:8000)
And then in another terminal:
$ hg clone http://vostro.rath.org:8000/ repo2
requesting all changes
adding changesets
adding manifests
adding file changes
added 1 changesets with 1 changes to 1 files
updating to branch default
resolving manifests
getting myfile
1 files updated, 0 files merged, 0 files removed, 0 files unresolved
$ cd repo2; hg phase tip
0: public
..and in the first terminal again:
127.0.0.1 - - [25/May/2013 16:38:40] "GET /?cmd=listkeys HTTP/1.1" 200 - x-hgarg-1:namespace=bookmarks
^Cinterrupted!
$ hg phase tip
0: draft
To me this looks very wrong. Someone just pulled the changeset from the first repository, so it is obviously public. However, it still appears as "draft" in the repository.
Can someone explain the rationale for this behavior? As the owner of the first repository, I would very much like to know when someone has pulled a revision (so that e.g. I don't rebase it anymore), so I think it would be sensible if the hg server process would update the phase accordingly.
You will probably get a better answer on the mailing list for this, but my understanding is this:
hg pull has always been a read-only command and can be run without write access to the remote repository. Changing the phase in the remote repository would (obviously) require a write. On the other hand, hg push has always written to the remote repository, and so phases introduced no change.
Changing hg pull from read-only to read-write could cause some people's work flows to break, and that's a mortal sin in mercurial development. (E.g. An anonymous user pulling from a public server, sending back changes via e-mail bundles)
Basically it's a historical quirk because phases are a retro-fit.
The hole this leaves open is that the original owner of the change-set could amend it, without realising that the change has already gone into the wild. I expect that this hole hasn't worried too many people because the "change-set evolution" features that are being developed solve the problem in a better way.
I tend to think of the phases as:
Public - Publicly visible and immutable
Draft - Publicly visible and mutable
Secret - Not publicly visible and mutable
I think draft is only there because that's basically where we were before phases were added, and is a bit of a weak concept. Really, if you're working in an environment where people may pull directly from you, then I suggest working more with public and secret phases, and avoid draft.
As @zerkms said, pull isn't intended to change the remote repository.
If your working repository is being used as a server, you have a few options:
Set the default of commits to "public" instead of "draft". Others can pull at any time so just assume they are public.
Set the default of commits to "secret". Others won't be able to pull them. Set them to "public" when you are ready to share.
Set your repository as "non-publishing". Others can pull your draft changesets, but they will still be marked as "draft".
Here's how to specify these behaviors in mercurial.ini/hgrc.
[phases]
publish = False
new-commit = public
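Note that each of the three options corresponds to its own setting, so you would normally pick one rather than combine them. The mapping can be spelled out as below; this is just an illustration that renders the [phases] section for each option, and the option labels are my own:

```python
# One entry per option listed above; keys are Mercurial's [phases] settings.
PHASE_OPTIONS = {
    "public-by-default": {"new-commit": "public"},
    "secret-by-default": {"new-commit": "secret"},
    "non-publishing": {"publish": "False"},
}


def phases_section(option):
    """Render the [phases] hgrc section for one of the three options."""
    lines = ["[phases]"]
    lines += [f"{key} = {value}" for key, value in PHASE_OPTIONS[option].items()]
    return "\n".join(lines) + "\n"
```

For example, `print(phases_section("secret-by-default"))` prints a section with the single line `new-commit = secret`.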
pull isn't intended to change the remote repository phase but the phase of your local repository.
And to be clear - you shouldn't care what phase is in the remote repository.
And even more - remote repository may be hosted using old mercurial version which doesn't support phases.
Why this behavior?
Because phases only make sense for the local repository and exist to help prevent history-modification mistakes.

In Mercurial, is there a way to save/commit an unfinished change but not available for others via clone/pull/push?

I often find myself in the situation that I need to switch to work on a different change before the one that I am currently working on is done. I want to find out if there's a way, in Mercurial, that I can save/commit my unfinished change, which is not available for others (ie. not clonable/pushable/pullable).
Mercurial phases may be the answer to this.
Starting with Mercurial v2.1, you can configure mq changesets to automatically be marked secret. secret changesets are ignored by incoming/pull and outgoing/push commands.
To enable this behavior, you need to add the following to your config:
[mq]
secret = True
Once enabled, it behaves as follows:
$ hg qpush --all
applying my-patch
now at: my-patch
$ hg phase -r .
16873: secret
$ hg outgoing
comparing with https://www.mercurial-scm.org/repo/hg
searching for changes
no changes found (ignored 1 secret changesets)
Take a look at the Shelve Extension. This gives you the basics and might be more than enough for what you need.
There is also the Mercurial Queues Extension, but I find this can be a little odd to work with.
As a final alternative, you could always commit your changes onto another branch so that they don't affect mainline development, but I think these may still be visible.
You can clone your repo to a new place to work on new changes. That way your pending changes are kept on your local machine and never pushed. Of course, this depends on the size of your repo. If it's too big, cloning becomes a little prohibitive.
As others have suggested, you can mark your unavailable changes to be on a private branch. When you push, you can push an explicit branch using the -b argument. So, if your private branch is TimPrivateBranch, and other changes are on default:
hg push -b default
TimPrivateBranch stays on your local computer. Of course, this requires you to remember the -b argument every time you push.
When you're done with your private branch, just merge back into default:
hg up default
hg merge TimPrivateBranch

Mercurial: enforce "hg pull -u" before "hg commit"

I have in some cases a need to enforce that Mercurial-users have run hg pull -u before any hg commit can be allowed, i.e., hg pull will mean that the incoming queue is empty — and furthermore I also want that the person is using the head version of the branch.
How can I set up such a restriction?
(I am fully aware that this goes against parts of the DVCS design core)
You could ask your developers to install
[hooks]
pre-commit = hg pull -u
in their config files (it should probably be installed in the per-repository .hg/hgrc file since this workflow is repository specific).
This makes Mercurial a little Subversion-like: your developers will only have one outstanding changeset. But note as soon as someone pushes to the server, hg pull -u cannot update to the new branch tip since it will cross branches (topological branches) to do so. So a proper merge will be needed at that point (or a rebase, see hg pull --rebase).
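A variation on this hook, if you would rather refuse the commit than pull automatically: check hg incoming and fail when something is waiting. hg incoming exits with 0 when there are incoming changesets and 1 when there are none. This is only a sketch; the function name is made up, and it contacts the remote on every commit, which may be slow.

```python
import subprocess
import sys


def check(run=subprocess.call):
    """Return 0 if nothing is incoming, 1 if the commit should be refused."""
    rc = run(["hg", "incoming", "--quiet"])
    if rc == 1:
        return 0  # nothing to pull, allow the commit
    if rc == 0:
        msg = "incoming changesets found; run 'hg pull -u' first"
    else:
        msg = "could not query the remote repository"
    sys.stderr.write(msg + "\n")
    return 1  # a non-zero exit status makes the pre-commit hook fail
```

Saved as a script that ends with `sys.exit(check())`, it could be configured as `pre-commit = python /path/to/script.py` (the path is a placeholder). Like the hook above, it is installed per developer and cannot be enforced from the server.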
Normally mercurial will NOT let you push an open head to the server without using the -f flag (force). You can write a hook to pull automatically but that can not be enforced server side due to the server not knowing what you have. There is an article on mercurial's website about this scenario:
https://www.mercurial-scm.org/wiki/TipsAndTricks?highlight=%28heads%29#Prevent_a_push_that_would_create_multiple_heads
As Adam says, perhaps what you really need to do is prevent multiple heads (per branch). This is what we do, using the 'forbid_2head' hook from Netbeans (linked from here https://www.mercurial-scm.org/wiki/TipsAndTricks#Prevent_a_push_that_would_create_multiple_heads)
The result is that the hook prevents any push that creates multiple heads on a branch (so one on the anonymous/default branch plus one each on named branches). This effectively forces a pull before commit because you have to pull, get the two heads locally, then merge or rebase to remove it.
Note: the hook is installed on the server/master repo.
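To give an idea of what such a check does, here is a minimal sketch. It is not the actual forbid_2head hook. It assumes it is run as an external hook in the server repository (for instance from pretxnchangegroup, where a non-zero exit rolls the push back) and simply counts open heads per branch from the output of hg heads.

```python
import subprocess
from collections import Counter


def branches_with_multiple_heads(heads_output):
    """Given one branch name per line (one line per head), return the offending branches."""
    counts = Counter(line for line in heads_output.splitlines() if line)
    return sorted(branch for branch, n in counts.items() if n > 1)


def check_heads(run=subprocess.check_output):
    """Return 1 (reject) if any branch has more than one open head, else 0."""
    out = run(["hg", "heads", "--template", "{branch}\n"])
    if isinstance(out, bytes):
        out = out.decode("utf-8")
    bad = branches_with_multiple_heads(out)
    if bad:
        print("multiple heads on: " + ", ".join(bad))
        return 1
    return 0
```

A script ending in `sys.exit(check_heads())` (with `import sys` added) would then reject any push that leaves two heads on the same branch.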

Can I mark a branch as 'not going to push'?

I use named branches in Mercurial.
In doing so I have created one branch called playground where I can try out various wacky experiments. I never intend to merge this branch into any others and I never want to push it to our main repository.
Since creating it, every time I do a push I am told I have added a new branch and I have to use the --new-branch flag. At this point hg push -b default (or whatever branch I'm pushing) works fine but it's annoying. Is there any way to suppress that message by letting Hg know that I am not interested in pushing that branch ever?
Starting with Mercurial 2.1 (released in February 2012), you can mark your changesets secret to keep them from being pushed to another repository. You use the new hg phase command to do this:
$ hg phase --force --secret .
This marks the current working directory parent revision (.) as being in the secret phase. Secret changesets are local to your repository: they won't be pushed or pulled. Pushing now looks like this:
$ hg push
pushing to /home/mg/tmp/repo
searching for changes
no changes to push but 2 secret changesets
There is no equivalent mechanism in older versions of Mercurial. There your best bet is to create a local clone for the changesets you don't want to push.
Update:
Mercurial 2.1 introduced the hg phase command, which allows users to control which changesets are exchanged with remote repositories. @MartinGeisler's answer to this question details this method.
Original Answer:
If you want to create a local branch of your code you have a couple options. You can hg clone the repository which will locally create a branch of the entire repository in your filesystem. The other alternative is you can try to use a Mercurial extension like LocalbranchExtension.
There are many ways to branch in Mercurial without using a named branch. Just find a method that suits your needs.
Further reading: http://stevelosh.com/blog/2009/08/a-guide-to-branching-in-mercurial/
In addition to the excellent answer above concerning phases, you can also set 'default-push' (in the [paths] section of your .hgrc) to refer to the local repository:
[paths]
default = ...
default-push = .
This will cause all outgoing changesets to be compared to the specified repository. In this case, comparing outgoing changesets in your local repository TO your local repository results in nothing to push.
You can still pull/update/merge from the main repository, but no push will ever send anything back to that main repository.
If you work on multiple machines/repositories, you can set one up as described above, and configure the others to specify the 'default' path to point to the server that pushes to itself. In this way, the other machines can push/pull to your local central repository, and these changesets will never escape your carefully configured collection of repositories.

How do I move a private Mercurial repository to a central server?

I’m just getting started with Mercurial, and I’ve read Joel Spolsky’s Hg Init tutorial, which I liked.
I’m wondering: let’s say I have a private repository and I work on it for about a month. Then I decide I want to centralize it or make it public, like on bitbucket.org. I want to retain all the history.
The intuitive thing would be to use hg clone, but according to the docs:
The location of the source is added to
the new repository's .hg/hgrc file, as
the default to be used for future
pulls.
I don’t think this is what I’d want, since the source is my local, private repository, and the destination is the public server. I don’t want the public server trying to pull from my private repository in the future thinking it’s the central one. I hope this makes sense.
Do I have to tweak the .hg/hgrc file on the server manually? Am I approaching this correctly?
BitBucket's help says it's as easy as making an empty repo on BitBucket, then pushing to it:
... create a new empty repository via the "Create repository" page. We will assume that this repository is named blonk and is to be found on http://bitbucket.org/jespern/blonk.
Now, just push to it:
$ cd ~/Work/blonk # our existing hg repository
$ hg push http://bitbucket.org/jespern/blonk
...
Done!
You can edit .hg/hgrc in your repository to include the default path to Bitbucket:
$ cat .hg/hgrc
[paths]
default = http://bitbucket.org/jespern/blonk
Now you can simply enter hg push and hg pull without having to specify the full URL.
Doing this operation using 'hg push', as described, is probably the best way to do this, overall.
However in other circumstances it might be convenient, or reassuring, to note that all of the Hg state is contained within the .hg directory, and so simply moving this directory is enough to move the repository.
For example, if you have ssh access to a machine at example.com, you can tar (or zip) up your .hg directory in the 'private' repository, unpack it in, say, ~/repo/foo on the remote machine (thus creating a directory ~/repo/foo/.hg there), and then simply clone this:
$ hg clone ssh://example.com/repo/foo
This does have a slight back-door feel to it, I agree. However, there's nothing really under-the-hood happening here, and no editing of configuration files is necessary. When I do this, I find it less confusing than the 'proper' way.