Why does Mercurial not authenticate before pushing data?

I have a large clone to push to Google Code, and it takes a long time for the authentication dialog to show up after the push starts. Does TortoiseHg push the data first and authenticate second?

Update: The bug is now fixed.
Edit by durin42: It's not entirely fixed. We're close, but there's still some work for me to do before the rewrite is on by default. We're trying to be really conservative with the switchover. (Accurate status as of January 2012; watch the Mercurial release notes for further updates.)
TL;DR: httplib is essentially broken here, and that is what causes this problem in hg. People are working on fixing it.
This is an unfortunate side effect of the way urllib and httplib work: they won't send authorization preemptively.
The good news is that there's ongoing work to fix this; the bad news is that it looks like it'll take essentially a complete rewrite of httplib to get reasonable behavior out of it. In particular, httplib is half-duplex and has no way to peek at incoming packets (to detect an early response), so it has to send a request before it can get a digest auth prompt (assuming use of digest auth, which is the best option). Some server implementations even close the socket once they send out a 401 Authorization Required, which actually breaks httplib completely by raising a broken-pipe error. I submitted a workaround for that problem which is in hg 1.4, but it's only a user-annoyance fix, not an actual performance solution.
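To make the mechanics concrete, here is a minimal Python 3 sketch using urllib.request, which behaves the same way as the urllib2/httplib stack hg used at the time (the URL and credentials are placeholders): the Authorization header is only attached after the server's 401 challenge has been seen, so the request body goes out once before authentication even starts.

import urllib.request

# Placeholder repository URL and credentials; not a real endpoint.
password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
password_mgr.add_password(None, "https://example.com/hgrepo", "user", "secret")
opener = urllib.request.build_opener(
    urllib.request.HTTPDigestAuthHandler(password_mgr))

# The first request carries no Authorization header. Only after the 401
# digest challenge comes back is it retried with credentials, so a push
# payload would cross the wire twice.
# response = opener.open("https://example.com/hgrepo?cmd=unbundle", data=b"...")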

I expect it will be gathering a list of changesets from the server so it knows which local changesets do not appear on the server and thus need to be transferred; basically the equivalent of hg outgoing. Only once it has determined the changesets to push does it need to write anything (and possibly nothing at all, if there are no differences), so it won't authenticate until it actually needs to.
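That discovery step is the same comparison hg outgoing performs, so you can watch it yourself without transferring anything (the URL is a placeholder):

hg outgoing https://example.com/repo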

Related

Mercurial workflows when collaborating with people who make errors

I need to collaborate on a Mercurial repository (let's call it "foo") with some people who are novices at version control in general and at Mercurial in particular.
I am trying to come up with a workflow that will enable us to use Mercurial without a lot of extra effort on either their end (confusion) or my end (cleanup).
My concern is that, as novices, they are bound to make errors, and I need to allow them to do so in a controlled way; otherwise they won't use the tool at all because they're too scared. But I don't want a bad change to pollute the repository unnecessarily.
I do not expect them to be able to merge properly or to use the mq extension. This is not a matter of underestimating them; it is a realistic assessment given past experience with SVN and my own experience with Hg.
Which of the following approaches would make the most sense? Or if there's a better approach, what is it?
1. We have a repository foo-submit, read/writable by all, and a repository foo-trunk, readable by all but writable only by admins. Users pull from foo-trunk and push changes to foo-submit. Cleanup: if I find a good change, I let it through as is; if I find a bad change, I "bypass" it by merging with the previous version.
2. We have a repository foo-trunk, readable by all, writable by admins. Each user is responsible for maintaining their own clone, which is read-accessible to the rest of the team. When someone wants to push a change, they let me know and I pull it from their repository, with proper cleanup as necessary (same as in #1).
3. We have a repository foo-dev, read/writable by all, and a repository foo-trunk, readable by all but writable by admins. Users pull from and push to foo-dev, and work in named branches if they need to do extensive development. I am responsible for performing merges and cleanup. The foo-trunk repository exists merely to provide a "clean" copy whose branch tips are always in a good state.
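For concreteness: all three options rely on per-repository read/write permissions. If the repositories are published with hgweb, a minimal sketch of the foo-trunk side could look like this (user names are placeholders):

# foo-trunk/.hg/hgrc
[web]
allow_read = *
allow_push = alice, bob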
Good question, and one that I've never seen a great answer to.
That said, I like option 2. This is the "Pull Request" model used by the Linux Kernel and made popular by GitHub. It allows the admins to act as gatekeepers / reviewers, only allowing good change-sets to get past them when they're happy. If they decide a developer hasn't delivered something worthy, then the pull request is rejected (with reasons). Then the developer can go away, fix up their code / repo, and submit another pull request.
Running a server with something like RhodeCode on it can help keep on top of pull requests. As things grow you can have lower level gatekeepers that deal with subsystems, and higher level gatekeepers that deal with the whole project.
The bit I've never quite got my head around is what should happen to change-sets that are rejected, and that the developer decides to abandon rather than fix up and try again. They could be closed, but then could possibly appear by mistake as part of a future pull request. They'd be harmless, but possibly confusing. The alternative is stripping them, but that sounds like giving people tools they'll cut themselves on.
The other two options you give deserve a little comment.
Option 1 is similar to option 2. You're still doing a "Pull Request" type flow, but now you have server-side branches which mirror the developers' clones. There's little difference, and this is how a RhodeCode, GitHub, or Bitbucket server would let you work, except that you don't have to go searching for changes: the server tells you which ones are waiting for you to look at.
Option 3 has the problem that everyone's changes are all merged together on foo-dev before you get to them. They would start becoming inter-dependent, and cherry-picking is going to be messy. You'd probably end up grafting change-sets onto foo-trunk, which means you're creating new change-sets with new hashes. When the developers pull those, they'll have the change in two places: their original foo-dev version and your grafted foo-trunk version. That doesn't sound sustainable to me.
The best way I can think of, if you don't want to use mq (that is, the approach with the least hassle for you), is to have your developers:
create their own branch for the current feature being developed,
merge it back into the main dev branch (or graft/transplant) once it's completed and validated,
and then close the branch.
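A minimal command sketch of that cycle (the branch name is a placeholder):

hg branch feature-x             # start a named branch for the feature
# ... edit and commit as usual; the commits land on feature-x ...
hg update default               # go back to the main dev branch
hg merge feature-x              # bring the completed, validated work in
hg commit -m "merge feature-x"
hg update feature-x
hg commit --close-branch -m "close feature-x"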
In the long term, encourage them to learn mq; it's not too hard to grasp.
3a - foo-dev has a protected default branch (only some admins can push to or merge into it), and users work in named branches.
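That protection can be expressed server-side with the bundled acl extension, assuming a Mercurial version new enough to understand the "!" negation prefix; a sketch for foo-dev's .hg/hgrc, with placeholder user names:

[extensions]
acl =

[hooks]
pretxnchangegroup.acl = python:hgext.acl.hook

[acl]
sources = serve push

[acl.groups]
admins = alice, bob

[acl.deny.branches]
# anyone who is not in the admins group is denied on default;
# all other (named) branches remain open to everyone
default = !@admins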

Reading command line arguments from Mercurial prechangegroup hook

I'm attempting to disallow pushes to a Mercurial repository if a certain condition holds true. However, it is essential that if the user uses push --force, the push goes through regardless.
I know that it's easy enough to do this on the machine that's doing the push by using the pre-push hook, which receives the command-line arguments. However, since hooks aren't propagated, I'd have to somehow distribute the hook to every single user of the repository and rely on them not messing with it.
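As a sketch only: an in-process pre-push hook on the pushing machine could look like this (the module path, hook name, and the actual condition are placeholders). It works precisely because pre-<command> hooks receive the parsed options, which is what the server never sees:

# checkpush.py, referenced from the developer's hgrc as:
#   [hooks]
#   pre-push.check = python:/path/to/checkpush.py:prepush_check
def prepush_check(ui, repo, hooktype, pats=None, opts=None, **kwargs):
    # opts holds the parsed command line for 'hg push', so --force is visible
    if opts and opts.get('force'):
        return False                  # falsy return = success, push proceeds
    if condition_holds(repo):         # placeholder for the real check
        ui.warn('push rejected: condition holds; use --force to override\n')
        return True                   # truthy return = failure, push aborts
    return False

def condition_holds(repo):
    # placeholder: whatever repository state the policy actually inspects
    return False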
Therefore, I thought the way to go would be to have a prechangegroup hook on the repository server which checked the condition and aborted the push if necessary, but I can't figure out a way to obtain the command line arguments the user used while pushing from this hook. Is there a way to accomplish this just by using a hook on the repository server?
I know that a possible workaround would be using the pretxnchangegroup hook instead and allowing the push if the commit message of the latest changeset follows a certain pattern. However, the --force option seems much easier from a repository user's perspective, since it wouldn't force them to potentially do a dummy commit to get the message right.
Sorry, the --force command line option isn't ever sent on the wire, so it won't be available on the server side at all. You'll need to figure out some way to signal "I really mean it!" out of band, be it special usernames, special commit messages, or the like.
Consider just having a second server-side repo that doesn't have the banning hook, and have pushers use it only when they really mean it. Something like:
hg push http://your-server/repo
.. rejected due to hook failure
hg push http://your-server/repo-and-I-really-mean-it
where on the server side the repo-and-I-really-mean-it repo doesn't have the hook and automatically pushes to the plain repo.
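A sketch of the server-side wiring, with placeholder paths. Note that the banning hook on the plain repo has to let this local forward through, for example by only rejecting when $HG_SOURCE is 'serve' (a remote push):

# /srv/hg/repo/.hg/hgrc  (the guarded repository)
[hooks]
# check-push.sh is a placeholder; it should skip its check when HG_SOURCE is not 'serve'
pretxnchangegroup.ban = /usr/local/bin/check-push.sh

# /srv/hg/repo-and-I-really-mean-it/.hg/hgrc  (no banning hook here)
[hooks]
# forward whatever arrives to the guarded repo as a plain local push
changegroup.forward = hg push /srv/hg/repo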

How do I reject pushes to a Mercurial server based on a script, without risking a bad pull during that time?

I'd like to write a script that examines the incoming changesets on a push to a mercurial server and rejects the push if the changesets do not conform to a particular standard. It seems like my options are the prechangegroup, pre-changegroup, and pretxnchangegroup hooks. Unfortunately, the prechangegroup and pre-changegroup hooks do not appear to have access to the incoming changesets, so I would need pretxnchangegroup. But according to the documentation at http://hgbook.red-bean.com/read/handling-repository-events-with-hooks.html#sec:hook:pretxnchangegroup, this can lead to inconsistent state for people using the repository while the hook is executing:
"While this hook is running, if other Mercurial processes access this repository, they will be able to see the almost-added changesets as if they are permanent. This may lead to race conditions if you do not take steps to avoid them."
I'm really not crazy about random weirdness happening if someone does a pull while my script is in the process of rejecting a changeset. Is there another hook that I can use? If not, what are the "steps to avoid them" that I need to take? Is there a way I can lock the repository during my hook?
If you expand the comments on the quoted paragraph, Meister Geisler confirms other users' observation that the issue has been resolved since hg 1.2: the not-yet-permanent incoming changesets are no longer visible to other readers, and thus will not be pulled.
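For reference, here is a minimal sketch of such a pretxnchangegroup hook (the policy check is a placeholder). While the transaction is open, the incoming changesets are everything from $HG_NODE up to tip:

# checkgroup.py, enabled in the served repo's .hg/hgrc with:
#   [hooks]
#   pretxnchangegroup.check = python:/path/to/checkgroup.py:checkgroup
def checkgroup(ui, repo, hooktype, node=None, **kwargs):
    # node is the first changeset added by this push; everything from
    # there to tip belongs to the still-uncommitted transaction
    for rev in range(repo[node].rev(), len(repo)):
        ctx = repo[rev]
        if not conforms(ctx):            # placeholder policy check
            ui.warn('rejecting %s: does not conform\n' % ctx)
            return True                  # truthy return rolls the push back
    return False

def conforms(ctx):
    # placeholder: e.g. inspect ctx.description(), ctx.user(), ctx.files()
    return True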

Enforce mercurial commit message policies via pretxnchangegroup?

As described in http://hgbook.red-bean.com/read/handling-repository-events-with-hooks.html, I thought I could write a small hook which rejects checkins with malformed commit messages. That's no problem; the issue I encounter is with the following workflow:
If a developer makes, let's say, 10 local commits, some of them malformed, and then pushes them to the central repository, all of them will be rejected, but the developer is unable to edit the old commit messages, since rollback only works once.
How do you solve this?
Using the HistEdit extension, you can change the commit messages locally, then push the whole set of changes to the main repository again.
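A sketch of that fix-up, assuming the offending changesets are still local, unpushed drafts. First enable the extension in ~/.hgrc or the repo's .hg/hgrc:

[extensions]
histedit =

Then:

hg histedit --outgoing
# in the plan that opens, change 'pick' to 'mess' for every changeset whose
# message needs rewriting, save, and enter the corrected messages
hg push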
I suppose you can't mandate developers to use the same precommit hook to check commit messages, because it's not a centrally-managed project?
An alternative to #gizmo's answer is to let developers use MQ and mandate code review before pushing (or better, someone pulling from them). Then if reviewers (or some review scripts) spot the malformed messages, the developer can use qrefresh to change the message.
You need to be careful about a couple of things in that workflow, though:
NEVER EVER push/pull an unfinished patch, even though qfinish does not change the hash. It's just too easy to screw up.
Make sure the developer runs qcommit every time before sending things out for review; otherwise you won't know if s/he slips in other changes in the next iteration (not that s/he would, but s/he could).

Mercurial Pull Error

I am new to the DVCS world. My company uses Perforce and I'm not a fan, so I thought I'd try to use Mercurial as a front end. I set it up on a Windows machine with TortoiseHg, enabled the Perfarce extension, did a small checkout (limiting the target revision) and pulled the rest. This seemed to be more robust than a clone alone.
This seems to be working fairly well as I've been able to get up to change 8700 or so.
My problem is with an error in the Perforce repo. During the hg pull command it hits an error, abort: file path/to/file.pl missing in p4 workspace, and rolls back the transaction.
Is there any way to bypass or skip that file and force it to continue, since this is not a file I care about?
Update:
According to the admin, the file in question was a symlink. Would that cause this kind of problem? If so, how do I (or the admin) fix or bypass it?
Is it possible to check out just a part of a Perforce repo rather than the whole thing?
The issue is that symlinks are not supported on Windows.
This is fixed in the current version of Perfarce, which should appear in TortoiseHg soon.
I suggest that you have someone check that the Perforce repository is actually in a sane state. There might be something broken that you triggered, and your company's data might be at stake, so someone should definitely look into what is causing the problem.