Mercurial pre-push hook scanning the working copy - mercurial

I need to set up a hook on a repository where people can push, that would run some validation (the goal is to reject the push if validation fails). I already have some hooks set up to auto-update after a successful push and to prevent multiple heads.
I have no problem writing a validation script (for example a shell script that runs unit tests), but it needs to run on the full working copy.
My problem is that, if I just put it in a pretxnchangegroup hook, it does not operate on updated files. If I try to hg update inside the hook, this leads to repository corruption whenever validation fails and the push is rolled back.
My current solution is to clone the repository to some temporary folder and run the actual validation there. But I find this a bit ugly, and it is less efficient than an in-place validation doing just updates. Is it possible to set up this kind of hook?

There are ways to do this, but before I start, are you sure you want to do this? Doing blocking, maybe-slow, push-rejecting changes in a hook annoys the hell out of people, and it holds the repository write lock the whole time, so no one else can push while the validation script is running. One can pretty easily toss half the productivity advantages of a DVCS out the window in an attempt to gain a little control.
You could avoid most of those disadvantages with a two-tier repository setup. Something like project-push where people can push w/o passing validation and project-pull that only has changesets which passed some validation. Then you have an out-of-band (cron or hook triggered) script that moves changesets from project-push to project-pull only after validation is confirmed. Because that test is done out of band, pushes aren't blocked, but non-validating changesets never make it into project-pull. You can have it email the author when their push would have thrown project-pull into a non-validating state. Users with their .hg/hgrc configured like this wouldn't have to think about there being two repositories at all:
[paths]
default=http://host//path/to/project-pull
default-push=http://host//path/to/project-push
They could just use hg push and hg pull and things would "just work".
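The out-of-band step could look roughly like the sketch below. It assumes a long-lived scratch clone used only for testing and a validation.sh that exits non-zero on failure. All paths and the function name promote_validated are hypothetical, so treat this as a starting point rather than a finished script.

```shell
#!/bin/sh
# Sketch of the out-of-band promotion step, meant to be run from cron
# or from a changegroup hook on project-push. All paths are hypothetical.
PUSH_REPO=${PUSH_REPO:-/srv/hg/project-push}
PULL_REPO=${PULL_REPO:-/srv/hg/project-pull}
SCRATCH=${SCRATCH:-/scratch/project-validate}   # long-lived clone used for testing
VALIDATE=${VALIDATE:-./validation.sh}           # exits non-zero on failure

promote_validated() {
    # Bring the scratch clone up to date with everything that was pushed.
    hg -R "$SCRATCH" pull "$PUSH_REPO" || return 1
    hg -R "$SCRATCH" update --clean tip || return 1

    # Run the validation inside the scratch working copy.
    if ! (cd "$SCRATCH" && "$VALIDATE"); then
        echo "validation failed; nothing promoted" >&2
        return 1
    fi

    # Publish tip and its ancestors to project-pull.
    # Note: hg push exits with status 1 when there is nothing to push.
    hg -R "$SCRATCH" push -r tip "$PULL_REPO"
}
```

A real script would call promote_validated as its last command and probably send the email mentioned above when it fails. Note that this version promotes tip together with all of its ancestors, so an earlier failing changeset travels along once a later fix validates.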
That said if your validation doesn't need access to all the updated files at the same time you can do your test with something like this in an external pretxnchangegroup hook:
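for thefile in $(hg manifest -r tip) ; do
    # Feed each file's content at tip to the validator on stdin
    if ! hg cat -r tip "$thefile" | ./validate_on_file.sh ; then
        exit 1   # a non-zero exit status makes Mercurial reject the push
    fi
done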
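If the repository is large, a variant that only looks at files touched by the incoming changesets may be enough. The sketch below relies on Mercurial setting HG_NODE to the first new changeset in a pretxnchangegroup hook and reuses validate_on_file.sh from above. The function name is hypothetical, and it assumes a single line of development pushed into a non-empty repository.

```shell
#!/bin/sh
# Sketch: validate only files added or modified by the incoming changesets.
# Mercurial sets HG_NODE to the first new changeset in a pretxnchangegroup hook.
VALIDATOR=${VALIDATOR:-./validate_on_file.sh}   # reads file content on stdin

validate_changed_files() {
    # Added/modified files between the parent of the first new changeset and tip.
    hg status --no-status --added --modified \
        --rev "p1($HG_NODE)" --rev tip |
    (
        status=0
        while IFS= read -r thefile; do
            if ! hg cat -r tip "$thefile" | "$VALIDATOR"; then
                echo "validation failed: $thefile" >&2
                status=1
            fi
        done
        # The subshell's exit status becomes the function's return value.
        exit "$status"
    )
}
```

Unlike the loop above, this reports every failing file before rejecting, and it copes with file names that contain spaces.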
If you do need to operate on all the files at once (compiling, etc.) and you're not willing to do a two-repo structure that frees people to push quickly, then you do need a separate repo for testing, but you don't have to clone/create it each time. Something like this in a pretxnchangegroup hook should do:
hg push /scratch/long-lived-testrepo ## Warning: may not work,
## if pretxnchangegroup locks the repo and
## blocks this push to testrepo
hg -R /scratch/long-lived-testrepo update
cd /scratch/long-lived-testrepo
./validation.sh

Related

Mercurial Workflow: Need to track file in my local working copy, but not push to "trunk"

This is my setup:
I have a main Mercurial repository (call it trunk). When I want to work on a feature, I do a clone and start working on it (usually add a bookmark as well).
I use various tools to do my work, which tend to generate convenient text files in the directory. It would be very helpful for me to track those files as well. However, I need to ensure those files do not get pushed to trunk.
In a sense, I'd like a "parallel" Mercurial repository in that directory where I can track these files.
How do people manage this? I'm open to using (stable) Mercurial extensions. Ideally, I do not want to "remember" to remove stuff before pushing to trunk.
There are two possibilities: patch queues and phases. For what you want to do, phases are probably the lower-friction approach. But to my knowledge there is no truly "parallel" solution.
Have a look at hg help phases for an overview and hg phase for the command to manipulate the phase of a changeset.
To be sure you never push to trunk inadvertently, you have to make your commits "secret" by default (in your $HOME/.hgrc):
[phases]
new-commit = secret
Then hg push will not push anything by default; you have to selectively change the phase of the changesets you want to push.
You could also skip the above configuration and use the --secret option of hg commit when committing something you want to keep to yourself, but it is too easy to forget.
Note that, with both patch queues and phases, you have to be proficient at rewriting history with hg histedit to reshuffle the commits.
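With new-commit = secret in place, releasing a piece of work comes down to moving it to the draft phase and pushing just that revision. A minimal sketch, with hypothetical function names:

```shell
#!/bin/sh
# Sketch: release selected work when new commits default to the secret phase.

# List changesets that are still secret (these will not be pushed).
list_secret() {
    hg log -r "secret()" --template '{rev}:{node|short} {desc|firstline}\n'
}

# Make REV (and any secret ancestors) draft, then push only up to REV.
publish_and_push() {
    rev=$1
    hg phase --draft -r "$rev" || return 1
    hg push -r "$rev"
}
```

Because a draft changeset cannot have secret ancestors, the commits that track your tool output need to sit on top of the work you publish, or on a separate head. That is where the history reshuffling comes in.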

Is there a simple method to keep Mercurial from accepting section of code for commit?

I have a change to a file which I explicitly never want to commit to the repo. (In this case, it's a hack around a bug that needs to be resolved by fixing an unrelated codebase.)
Is there a way to mark the change so that Hg will error out if I try to commit it? Ideally, it would be something inline in a comment so that I could choose to not commit that section of code (with TortoiseHg) and still be able to commit other portions.
Currently, I just have the change labelled with a nasty comment block, but it would be a great security blanket if I could tell the repo that this is dangerous code.
An alternative to writing a hook is to commit the change, and move it to the secret phase. This will prevent ever pushing it to another repository. It also allows you to easily apply the change on top of any changeset by rebasing the secret commit.
Use a bookmark or a named branch to make the changeset easy to select.
hg branch externalbugworkaround
hg commit -m "HACK - workaround external bug. DO NOT PUSH"
hg phase -s -f externalbugworkaround
then to move it around
hg rebase --keepbranches -d rev -r externalbugworkaround
As you suggest yourself: the solution is to write a client-side commit hook which parses the file with a regex and errors out when the code you want to skip is part of the commit.
A simple hook which checks for bad file extensions and commit messages is found for instance here - it should be easy to extend to checking a certain file for a specific pattern.
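As a rough sketch of such a hook, the script below scans only the lines added by the pending commit, so a partial commit from TortoiseHg that leaves the hack out still goes through. The marker text DO-NOT-COMMIT and both function names are hypothetical. It assumes the script is run as a pretxncommit hook, where Mercurial sets HG_NODE to the pending changeset.

```shell
#!/bin/sh
# Sketch of a client-side pretxncommit hook that rejects marked code.
MARKER=${MARKER:-DO-NOT-COMMIT}   # hypothetical marker placed in a comment

# Reads a unified diff on stdin; fails if an added line contains the marker.
diff_is_clean() {
    # Added lines start with "+"; lines starting with "+++ " are file headers.
    if grep -v '^+++ ' | grep -q "^+.*$MARKER"; then
        return 1
    fi
    return 0
}

# Hook entry point: HG_NODE is set by Mercurial to the pending changeset.
check_pending_commit() {
    if ! hg diff -c "$HG_NODE" | diff_is_clean; then
        echo "commit rejected: found $MARKER in added lines" >&2
        return 1
    fi
}
```

The script would call check_pending_commit as its last command and be registered in the [hooks] section of .hg/hgrc under a key such as pretxncommit.marker (again a made-up name).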

How to track a file in develop branch but not in master(or default) branch

I have projectA execute projectB. ProjectB writes ProjectA's build info to ProjectA.buildInfo.
I have a Mercurial repo for projectB, and use HgFlow (git-flow workflow) using sourceTree.
I would like ProjectA.buildInfo to be part of the develop branch of my mercurial repo, but not part of the default branch.
When I have removed ProjectA.buildInfo from default, develop eventually merges back into default and brings the unwanted file with it.
You simply have to pay attention to the merges you do. hg remove the unwanted file every time you merge your development branch back into default.
To guard against this (easy) mistake, I suggest a commit hook. The hook should check whether the commit is being made on the default branch and fail when it detects the unwanted file(s) being added. Additionally, run the same or a similar check as a pretxnchangegroup hook on the central repository (if any).
Check hg help config and search for the help on hooks therein. We use a hook to keep people from committing build artefacts to any branch; you could possibly extend that: https://hg.openttdcoop.org/misc/files/tip/mercurial/hooks/check_commit.py
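A shell version of such a check might look like the sketch below. It is slightly stricter than described above: it fails whenever the file is present in the pending changeset on the protected branch, not only when it is newly added. It assumes a pretxncommit hook (HG_NODE is the pending changeset) and that ProjectA.buildInfo sits at the repository root. The function name is hypothetical.

```shell
#!/bin/sh
# Sketch of a pretxncommit hook: refuse commits on a protected branch
# that contain a file which must only live on develop.
PROTECTED_BRANCH=${PROTECTED_BRANCH:-default}
FORBIDDEN_FILE=${FORBIDDEN_FILE:-ProjectA.buildInfo}

check_forbidden_file() {
    branch=$(hg log -r "$HG_NODE" --template '{branch}') || return 1
    [ "$branch" = "$PROTECTED_BRANCH" ] || return 0   # other branches are fine

    # Fail if the file is present in the manifest of the pending changeset.
    if hg manifest -r "$HG_NODE" | grep -Fqx "$FORBIDDEN_FILE"; then
        echo "$FORBIDDEN_FILE must not be committed on $PROTECTED_BRANCH" >&2
        return 1
    fi
}
```

With this in place, a merge of develop into default is rejected until the file has been removed with hg remove before committing the merge.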

What is the standard commit process for Hg?

Is it
pull
update
merge
commit
push
? Or can you do the commit first?
I don't like the idea of pulling and merging without having a version of my local code backed up somewhere in case the merge explodes, but presumably you have to do the merge before you can do a push, because you can't have conflicts in the central repo. Not quite understanding this whole process yet; used to my nice simple SVN.
I recommend always committing before pulling changes into your working directory, unless you are 100% sure that your changes and the changes to be merged into your working directory will not conflict.
If you do an updating pull (hg pull; hg update, or shorter hg pull -u) and have any outstanding uncommitted changes, any changes coming from outside will be combined with your changes. When conflicts happen, it might be difficult to decide what the merge result should look like, because you can't easily distinguish between your changes and the changes merged in.
When you commit first, it is much easier to decide what the merge result should look like, because you can always look at both parents of the merge.
So, in effect it is:
1) hg commit
2) hg pull -u (if no merge is necessary, go to 5)
3) hg merge
4) hg commit
5) hg push
Update: As Martin Geisler has pointed out, it is possible to get at the "original" changed version of a file using:
hg resolve --unmark the-file
hg resolve --tool internal:local the-file
or for all files at the same time:
hg resolve --unmark --all
hg resolve --tool internal:local --all
Still, I find the "commit first" system nicer. In the end, it is personal preference...
I don't know as there's a standard per se, but one of the ideas behind Mercurial is that you can commit as often as you like since it goes to your local repository. So you can commit to your heart's content as much as you like before you pull updates.
I tend not to commit very often, saving up for when I'm preparing to push, but that's me. I can see the utility of committing early and often. I do pull updates frequently as I work to cut down on merge fun.
One other thing I do is to keep a parallel clone of my working repo (cloned from the same repository as my working repo, not cloned from my working repo) so that I can check the original state of a file easily, and if need-be check in an out-of-band emergency fix or what-have-you without complicating my current change set.
1) Do edits
2) Commit
3) Go to 1 until satisfied
4) Pull
5) Merge & commit
6) Push if you want to.
Definitely commit before trying to do something complex like a merge. I don't think Mercurial will allow you to merge before committing, but even if it did, what if the merge goes wrong? You would have no pre-merge revision to go back to.
Commit early, commit often.
If you don't, you are missing out on a huge benefit of a DVCS.
"but presumably you have to do the merge before you can do a push, because you can't have conflicts in the central repo"
That statement is wrong and shows a poor understanding of distributed workflow and parallel development.
You can merge heads before pushing, but you do not have to. A push can put any data into the repo, if that is needed and intended. From hg help push:
By default, push will not allow creation of new heads at the destination,
since multiple heads would make it unclear which head to use. In this
situation, it is recommended to pull and merge before pushing.
(NB: the wording is "recommended to pull and merge before", not "required".)
You can use commit-pull-merge, stash-pull-unstash-merge, fetch with a modified working copy and merge on the fly, skip merging heads entirely or merge only sporadically, and push --force with additional heads: there is no common rule for everybody. And none of these workflows produces "conflicts in the central repo", only a different DAG.
Each point of divergence, which appears when your changeset and someone else's share a common parent in your (or even the central) repo, is the start of an anonymous branch in Hg. Such branches are (technically) absolutely legal, applicable and usual. How they are handled is defined by policy and agreement between developers, PM, QA team and others.
Personally, I prefer to finish my task (in one or more commits) and only then pull and maybe merge, if the development policy calls for it.

Merging changes to a workspace with uncommitted changes

We've just recently switched over from SVN to Mercurial, but now we are running into problems with our workflow. Example:
I have my local clone of the repository which I work on. I'm making some highly experimental changes to our code base, something that I don't want to commit before I'm sure it works the way it is supposed to, I don't want to commit it even locally. Now, simultaneously, my co-worker has made some significant improvements/bug fixes which I need. He pushes his commits to our main repository. The question is, how can I merge his changes to my workspace without the requirement that I have to commit all my changes, since I need his changes to test my own code?
A more day-to-day problem we have with the exact same workflow is where we have a couple of configuration files which are in the repository. Each developer makes a couple of small environment-specific changes to the configuration files, but does not commit the changes. These few uncommitted files keep us from making any merges into our workspace, just like in the example above. Ideally, the configuration files probably shouldn't be in the repository; unfortunately, that's just how it has to be, for reasons I won't go into here.
If you don't want to clone, you can do it the following way.
hg diff > mylocalchanges.txt
hg revert -a
# Do your merge here, once you are done, import back your local mods
hg import --no-commit mylocalchanges.txt
There are two operations, as you've discovered, that make changes from one person available to someone else (or many, on either side).
There's pulling, which takes changes from some other clone of the repository and puts them into your clone.
There's pushing, which takes changes from your repository and puts them into another clone.
In your case, your coworker has pushed his changes into what I assume is your central master of the repository.
After he has done this, you can pull the latest changes down into your repository, and merge them into your branch. This will incorporate any bugfixes or changes your coworker did into your experimental code.
This gives you the freedom of staying current on other coworkers development in your project, and not having to release your experimental code until it is ready (or even at all.)
So, as long as you stay away from the Push command, you're safe.
Of course, this also assumes nobody is pulling directly from your clone of the repository, if they do that, then of course they will get your experimental changes, but it doesn't sound like you've set it up this way (and it is highly unlikely as well.)
As for the configuration files, the typical way to do this is to commit only a master template file into the repository, under a different name (i.e. with an extra extension such as .template), and then add the name of the real configuration file to the ignore filter.
Each developer then has to make his or her own copy of the template, rename it, and change it in any way they want, without the risk of committing database connection strings, passwords, or local paths, to the repository.
If necessary, provide a script that will help the developer make the real configuration file if it is long and complex.
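Such a helper script can be very small. The sketch below fills one placeholder in the template and refuses to overwrite an existing private file. The function name and the @DB_HOST@ placeholder are hypothetical, and it assumes the substituted value contains no "|" or "&" characters, which sed would misinterpret.

```shell
#!/bin/sh
# Sketch: create a private config file from the committed template.
# Usage: make_config TEMPLATE TARGET DB_HOST
# "@DB_HOST@" is a hypothetical placeholder used inside the template.
make_config() {
    template=$1
    target=$2
    db_host=$3

    # Never overwrite a developer's existing private configuration.
    if [ -e "$target" ]; then
        echo "$target already exists; leaving it untouched" >&2
        return 0
    fi

    # Replace every occurrence of the placeholder with the given value.
    sed "s|@DB_HOST@|$db_host|g" "$template" > "$target"
}
```

A developer would run something like make_config app.conf.template app.conf localhost once after cloning, with app.conf (an example name) listed in .hgignore.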
Regarding your experimental changes, you should commit them. Often.
Simply commit them in a clone you never push from. You only pull to merge whatever updates you need from other repos.
As for config files, don't commit them.
Commit template files, and a script able to generate complete config files from the templates.
That way, developers will only modify "private" (i.e. not committed) config files with their own private values.
If you know your uncommitted changes will not collide with the merge commit that you are creating - then you can do the following...
1) Shelve the uncommitted changes
2) Do the pull and merge
3) Unshelve the uncommitted changes
Shelve effectively stores your uncommitted changes away as a diff (relative to your last commit), then rolls those files back in your local workspace. Unshelving then applies that diff, bringing back your uncommitted changes.
Tools such as TortoiseHg have shelve built in.
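On the command line, the three steps could be sketched as follows. This assumes the shelve command is available (it ships with Mercurial as an extension and may need to be enabled, depending on the version) and that you have local commits, so a merge is really needed; otherwise hg update takes the place of the merge and commit. The function name is hypothetical.

```shell
#!/bin/sh
# Sketch: pull and merge around uncommitted changes using shelve.
merge_with_shelved_changes() {
    hg shelve || return 1                 # 1) put uncommitted changes aside
    hg pull || return 1                   # 2) fetch the co-worker's changesets
    hg merge || return 1                  #    merge them with local commits
    hg commit -m "Merge" || return 1      #    record the merge
    hg unshelve                           # 3) restore the uncommitted changes
}
```

If one of the middle steps fails, the changes simply stay on the shelf; once the merge has been sorted out by hand, hg unshelve brings them back.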