Using hooks vs. wrapping commands in mercurial

What are the relative pros and cons for using hooks vs. using an extension that wraps a command for a particular task?
In other words, what are the criteria for deciding whether to use hooks or wrap a command?
Please also list the cases where one approach is the only option. One case I can think of is to add new arguments for existing commands. You could also change/remove arguments, for example I default log to log -g but graphlog aborts in the presence of some "incompatible" arguments (see graphlog.check_unsupported_flags), so I added a log wrapper to remove -g in those cases, because forced abortion is a crime against humanity.
It feels like hooks are more clean-cut. Python hooks run in the hg process so there's no performance issue. And while it's easy to use extensions.wrapcommand to create command wrappers, it's trivial to create/disable hooks, and to adjust the order in which they are applied (they should be self-contained in the first place).
And here's a quote from the hgrc doc that recommends standard hooks over pre/post command hooks; it applies equally to hooks over wrappers:
... hooks like "commit" will be called in all contexts that generate a commit (e.g. tag) and not just the commit command.
Also I guess that hooks are not subject to the GPL (or are they?), whereas command wrappers in extensions are.
(I hope a 1.5k+ user can create a mercurialhooks tag. Git fan boys have beaten us with githooks.)

I can't speak to licensing issues, but the biggest difference between a hook and an extension is that a hook can be written in any language whereas extensions are always python.
If one's writing in python then there's little difference between a hook and an extension:
either can delve deeply into mercurial internals
both require users to modify their .hgrc to enable them
both can wrap/intercept commands
I think your log command argument modification could be done with a pre-log hook in addition to being done as an extension.
TL;DR: If you're writing in python there's little difference, and if you're not, hooks are your only option.
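To make the comparison concrete, here is a minimal sketch of both approaches in one Python file. It assumes Mercurial running on Python 3 (byte-string command names and messages). The function names are made up for the example, and the option names in INCOMPATIBLE_WITH_GRAPH are illustrative placeholders, not the actual list checked by graphlog.check_unsupported_flags.

```python
# Illustrative option names only (an assumption for this sketch); the real
# list of options that graph output rejects is defined inside Mercurial.
INCOMPATIBLE_WITH_GRAPH = ("copies", "no_merges")


def announce_hook(ui, repo, hooktype, **kwargs):
    """In-process hook: Mercurial calls it with ui, repo, hooktype and
    hook-specific keyword arguments. A true return value means failure."""
    ui.status(b"hook %s ran\n" % hooktype)
    return False  # success


def log_wrapper(orig, ui, repo, *pats, **opts):
    """Command wrapper: drop the graph option instead of letting log abort
    when one of the (assumed) incompatible options is also set."""
    if opts.get("graph") and any(opts.get(name) for name in INCOMPATIBLE_WITH_GRAPH):
        opts["graph"] = False
    return orig(ui, repo, *pats, **opts)


def uisetup(ui):
    """Extension entry point: install the wrapper around the log command."""
    # Imported here so the rest of the file can be read and tested
    # without Mercurial being importable.
    from mercurial import commands, extensions

    extensions.wrapcommand(commands.table, b"log", log_wrapper)
```

The hook half would be enabled with an entry such as commit.announce = python:/path/to/file.py:announce_hook under [hooks] (the path is hypothetical), while the wrapper half only takes effect once the file is listed under [extensions].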


In Unison's UCM, is there a way to display a diff before running update?

This would basically be analogous to git diff. I see there are namespace diffing commands for things already committed to a Unison code base, but I may have missed how to do a diff prior to updating, if it is possible.
At present, there isn't a way to get a line-by-line diff of your work-in-progress in a scratch file with already added content of a namespace. You can see which terms and functions would be changed with the update.preview command and then view 1-n of the terms listed there in the console to see the last saved state of your code, but that command alone won't give a content breakdown.
The fork, update, then merge workflow for adding a feature might provide a lightweight way of viewing incremental changes without polluting your original namespace.
To do this, fork your original namespace to a myWIP namespace, then update your changes in this namespace. You can use the namespace diffing and view source tools between your original namespace and this myWIP one to see changes. Once you're done you can merge your namespace back into the original one and delete ones you don't need.

Does Mercurial have something like git's "filter" feature? [duplicate]

Inspired by this answer, I wonder if there is a way to obtain the same behavior in Mercurial as the one obtained with the smudge/clean filters specified in the .gitattributes file for git. That is, applying some preprocessing to some files before committing, without affecting the working copy.
You can find a proper description of what I mean in the git documentation on gitattributes in the filters subsection. Also, from the Pro Git book:
It turns out that you can write your own filters for doing substitutions in files on commit/checkout. These are called “clean” and “smudge” filters. In the .gitattributes file, you can set a filter for particular paths and then set up scripts that will process files just before they’re checked out (“smudge”, see Figure 8-2) and just before they’re staged (“clean”, see Figure 8-3). These filters can be set to do all sorts of fun things.
My use case is similar to the one stated in this other question: to clean up part of some files before committing them to the repository but without affecting the working copy.
The most similar thing I was able to find is the encode/decode functionality of Mercurial. The problem is that the documentation on this feature is quite succinct (I couldn't find much information anywhere else).
But then, the encode/decode functionality is marked as an unloved feature. Why is that? Does it mean there is a better way to do what it does? Or is there, for some reason, no proper way to do it, so that I should just use this one like everyone else?
Looking at your use case, the intended way to overlay local modifications over a repository is generally using the MQ extension, which allows you to apply patches locally that don't get pushed to a remote repository and can be applied and unapplied as needed (and can themselves be put under version control).
In general, automated modification of files upon checkin or checkout is problematic:
It may not interact well with the rest of your VCS-related tooling, especially those parts that expose patches as diffs or that rename files.
It is generally error-prone; you're checking in a version that you never tested and have to be sure that the encode/decode steps properly roundtrip.
The encoding and decoding setup is not actually part of the repository, but of your VCS configuration. This may lead, for example, to you accidentally pushing passwords because you forgot to set up the configuration correctly in a new checkout. In particular, a fresh hg clone does not copy .hg/hgrc over and may thus check out undecoded files.
The larger problem that you have when you are using a VCS to handle both permanent and temporary artifacts is that you are trying to make it do something that it isn't designed for. What you are missing is a build or deployment step that creates the temporary artifacts from permanent ones, possibly in conjunction with a local configuration (say, via a template system). This can also be combined with a hook that prevents the accidental checkin of temporary artifacts.
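A hook of the kind mentioned in the last sentence could look roughly like the following in-process Python hook. This is only a sketch: the patterns in FORBIDDEN_PATTERNS and the function names are hypothetical, and it assumes the hook is registered as a pretxncommit hook so that Mercurial passes the new changeset's id as node and a true return value rolls the commit back.

```python
import fnmatch

# Hypothetical patterns for generated files that should never be committed.
FORBIDDEN_PATTERNS = (b"*.generated", b"config/local.ini")


def forbidden_files(paths, patterns=FORBIDDEN_PATTERNS):
    """Return the paths (bytes) that match any forbidden pattern."""
    return [
        path
        for path in paths
        if any(fnmatch.fnmatchcase(path, pattern) for pattern in patterns)
    ]


def reject_artifacts(ui, repo, hooktype, node=None, **kwargs):
    """pretxncommit hook: fail the commit if it touches a forbidden file."""
    bad = forbidden_files(repo[node].files())
    for path in bad:
        ui.warn(b"refusing to commit generated file: %s\n" % path)
    return bool(bad)  # a true value tells Mercurial the hook failed
```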
That said, if you absolutely want to use filters, it works as follows: you need matching [encode] and [decode] sections. Each section has a series of pattern = shell-command entries, where pattern describes a filename or set of filenames and shell-command is a shell command that transforms an input file into an output file. This command can be prefixed either by pipe: (which is the default), in which case it has to convert standard input into standard output, or by tempfile:, in which case the command transforms the files given on the command line (specified by the placeholders INFILE and OUTFILE).
Examples:
[encode]
secretfile = pipe: sed -e 's/FOO/BAR/g'
[decode]
secretfile = pipe: sed -e 's/BAR/FOO/g'
With tempfile:
[encode]
secretfile = tempfile: sed -e 's/FOO/BAR/g' <INFILE >OUTFILE
[decode]
secretfile = tempfile: sed -e 's/BAR/FOO/g' <INFILE >OUTFILE
Both examples convert occurrences of FOO into BAR upon checkin and BAR into FOO upon checkout. Note how this does not actually roundtrip properly: If a file contains the string BAR upon checkin, it will become FOO upon checkout. It can be fairly tricky to write filters that do this correctly in all cases. This is one of the reasons why a separate build step is almost always better than squeezing extra magic into checkins and checkouts.
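The roundtrip problem is easy to reproduce outside Mercurial. The sketch below mimics the two sed commands with plain string replacement; encode and decode are stand-in names for the filters, not Mercurial functions.

```python
def encode(text):
    """Mimics the [encode] filter: sed -e 's/FOO/BAR/g' (applied on checkin)."""
    return text.replace("FOO", "BAR")


def decode(text):
    """Mimics the [decode] filter: sed -e 's/BAR/FOO/g' (applied on checkout)."""
    return text.replace("BAR", "FOO")


# A file that only contains FOO survives the round trip...
safe = "password = FOO"
assert decode(encode(safe)) == safe

# ...but a file that already contains BAR does not: the original BAR is
# indistinguishable from an encoded FOO once it is in the repository.
unsafe = "FOO and BAR"
assert encode(unsafe) == "BAR and BAR"
assert decode(encode(unsafe)) == "FOO and FOO"
```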

Have Mercurial use a custom merge tool for its own merges

The Mercurial docs say this about what Mercurial does when it has to do a 3-way merge:
By default, Mercurial will attempt to do a classic 3-way merge on text
files internally before trying to use an external tool.
When it invokes the external tool, that is always a "manual merge".
Not all merge tools are created equal, and as it turns out my merge tool of choice (Araxis Merge) is often able to do an automatic merge of 3 files where mercurial's internal merge tool was not able to do so.
This leads to the scenario of big merges where maybe a bunch of files merge cleanly, done by hg's internal mergetool, and then some other files do not merge cleanly but could have if hg would let me specify its mergetool. I find this makes big merges very inefficient, as you need to context switch a lot: hg pops up my merge tool, I think "oh darn, conflict", only to then realize "oh wait, there's no conflict at all".
I wonder if I'm missing something here, or if there is really no way to make hg able to use a custom merge tool for its automatic attempts at doing merges.
I think you're looking for a switch to make Araxis Merge close itself automatically if it can auto-merge. I looked at the command line reference and their SCM integration document, but I'm actually not sure what switch it would be. You'll have to experiment yourself.
From Mercurial's point of view there is no such thing as a "manual merge". Mercurial tries to merge internally first (the so-called "premerge" step) and if that fails it looks for an external tool. The merge can still be fully automatic if that tool exits with an exit code of zero (successful exit). Mercurial will then consider the merge successful and go on to the next file. Depending on the tool, you won't notice this at all: Mercurial just runs the tool in the background and you're only prompted for action when there is a serious merge conflict.
You can use a custom merge tool with Mercurial. hg help merge-tools shows the order in which tools are chosen to run:
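As a rough illustration of that flow (a simplified model, not Mercurial's actual implementation), the per-file decision can be sketched like this. merge_one_file and its arguments are invented for the example; the external tool is just any command whose exit code is inspected.

```python
import subprocess


def merge_one_file(premerge_ok, tool_cmd):
    """Simplified model of one file merge.

    premerge_ok: whether the internal premerge step succeeded.
    tool_cmd:    argument list for the external merge tool.
    """
    if premerge_ok:
        return "merged internally"
    # The external tool runs only when premerge fails. An exit code of
    # zero is taken to mean the tool resolved the file on its own.
    result = subprocess.run(tool_cmd)
    if result.returncode == 0:
        return "merged by tool"
    return "unresolved"
```

So a tool that can merge automatically and then exits with zero without waiting for input would never interrupt the merge; only a non-zero exit leaves the file unresolved.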
1. If a tool has been specified with the --tool option to merge or
resolve, it is used. If it is the name of a tool in the merge-tools
configuration, its configuration is used. Otherwise the specified tool
must be executable by the shell.
2. If the "HGMERGE" environment variable is present, its value is used and
must be executable by the shell.
3. If the filename of the file to be merged matches any of the patterns in
the merge-patterns configuration section, the first usable merge tool
corresponding to a matching pattern is used. Here, binary capabilities
of the merge tool are not considered.
4. If ui.merge is set it will be considered next. If the value is not the
name of a configured tool, the specified value is used and must be
executable by the shell. Otherwise the named tool is used if it is
usable.
5. If any usable merge tools are present in the merge-tools configuration
section, the one with the highest priority is used.
6. If a program named "hgmerge" can be found on the system, it is used -
but it will by default not be used for symlinks and binary files.
7. If the file to be merged is not binary and is not a symlink, then
"internal:merge" is used.
8. The merge of the file fails and must be resolved before commit.
More information can be found in hg help config; look for the merge-tools and merge-patterns sections.

How good is my method of embedding version numbers into my application using Mercurial hooks?

This is not quite a specific question, and more me asking for criticism of my current approach.
I would like to include the program version number in the program I am developing. This is not a commercial product, but a research application so it is important to know which version generated the results.
My method works as follows:
There is a "pre-commit" hook in my .hg/hgrc file linked to version_gen.sh
version_gen.sh consists solely of:
hg parent --template "r{rev}_{date|shortdate}" > version.num
In the makefile, the line version="%__VERSION__%" in the main script is replaced with the content of the version.num file.
Are there better ways of doing this? The only real shortcoming I can see is that if you only commit a specific file, version.num will be updated but won't be committed, and if I tried to always commit that file as well, that would result in an infinite loop (unless I created some temp file to indicate I was already in a commit, but that seems ugly...).
The problem
As you've identified, you've really created a Catch-22 situation here.
You can't really put meaningful information in the version.num file until the changes are committed and because you are storing version.num in the repository, you can't commit changes to the repository until you have populated the version.num file.
My solution
What I would suggest is:
Get rid of the "pre-commit" hook and hg forget the version.num file.
Add version.num to your .hgignore file.
Adjust version_gen.sh to consist of:
hg parent --template "r{node|short}_{date|shortdate}" > version.num
In the makefile, make sure version_gen.sh is run before version.num is used to set the version parameter.
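For anyone who prefers to drive this from a script rather than from the makefile, here is a minimal Python sketch of the same steps. The hg command and the %__VERSION__% placeholder are taken from the question and the steps above; the function names are invented for the example, and the sketch deliberately raises an error when no revision information can be extracted.

```python
import subprocess

# Same command as in version_gen.sh above.
HG_CMD = ["hg", "parent", "--template", "r{node|short}_{date|shortdate}"]
PLACEHOLDER = "%__VERSION__%"


def read_version(cmd=HG_CMD):
    """Run the version command and return its output.

    check=True raises CalledProcessError if the command exits non-zero,
    for example when run outside a repository.
    """
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    version = result.stdout.strip()
    if not version:
        raise RuntimeError("no revision information available")
    return version


def stamp(source_text, version):
    """Replace the placeholder in the main script's text with the version."""
    return source_text.replace(PLACEHOLDER, version)
```

A build step would call read_version() once and pass the result to stamp() for the generated copy of the main script, leaving the tracked source untouched.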
My reasons
As @Ry4an suggests, getting the build system to insert revision information into the software at build time, using information from the Version Control System, is a much better option. The only problem with this is if you try to compile the code from an hg archive of the repository, where the build system cannot extract the relevant information.
I would be inclined to discourage this however - in my own build system, the build failed if revision information couldn't be extracted.
Also, as @Kai Inkinen suggests, using the revision number is not portable. Rev 21 on one machine might be rev 22 on another. While this may not be a problem right now, it could be in the future, if you start collaborating with other people.
Finally, I explain my reasons for not liking the Keyword extension in a question of mine, which touches on similar issues to your own question:
I looked at Mercurial's Keyword extension, since it seemed like the obvious solution. However, the more I looked at it and read people's opinions, the more I came to the conclusion that it wasn't the right thing to do.
I also remember the problems that keyword substitution has caused me in projects at previous companies. ...
Also, I don't particularly want to have to enable Mercurial extensions to get the build to complete. I want the solution to be self contained, so that it isn't easy for the application to be accidentally compiled without the embedded version information just because an extension isn't enabled or the right helper software hasn't been installed.
Then in comments to an answer which suggested using the keyword extension anyway:
... I rejected using the keyword extension as it would be too easy to end up with the string "$Id$" being compiled into the executable. If keyword expansion was built into mercurial rather than an extension, and on by default, I might consider it, but as it stands it just wouldn't be reliable. – Mark Booth
I don't think that there can be a more reliable solution. What if someone accidentally damages .hg or builds not from a clone but from an archive? – Mr.Cat
@Mr.Cat - I don't think there can be a less reliable solution than the keywords extension. Anywhere you haven't explicitly enabled the extension (or someone has disabled it) then you get the literal string "$ID$" compiled into the object file without complaint. If mercurial or the repo is damaged (not sure which you meant) you need to fix that first anyway. As for hg archive, my original solution fails to compile if you try to build it from an archive! That is precisely what I want. I don't want any source to be compiled into our apps without its source being under revision control! – Mark Booth
What you are trying to do is called Keyword Expansion, which is not supported in Mercurial core.
You can integrate that expansion into the makefile, or (simpler) use the Keyword extension.
This extension allows the expansion of RCS/CVS-like and user defined keys in text files tracked by Mercurial.
Expansion takes place in the working directory or/and when creating a distribution using "hg archive"
That you use a pre-commit hook is what's concerning. You shouldn't be putting the result of version_gen.sh into the source files themselves, just into the build/release artifacts, which you can do more accurately with an 'update' hook.
You don't want the Makefile to actually change in the repo with each commit; that just makes merges hell. You want to insert the version after checking out the files in advance of a build, which is what an update hook does.
In distributed systems like Mercurial, the actual "version number" does not necessarily mean the same thing in every environment. Even if this is a single person project, and you are really careful with having only your central repo, you would still probably want to use the sha1-sum instead, since that is truly unique for the given repository state. The sha1 can be fetched through the template {node}
As a suggestion, I think that a better workflow would be to use tags instead, which btw are also local to your repository until you push them upstream. Don't write your number into a file, but instead tag your release code with a meaningful tag like
RELEASE_2
or
RELEASE_2010-04-01
or maybe script this and use the template to create the tag?
You can then add the tag to your non-versioned (in .hgignore) version.num file to be added into the build. This way you can give meaningful names to the releases and you tie the release to the unique identifier.

How do I suppress keyword expansion in Starteam at the client level to enable local mirroring to DVCS?

I am trying to mirror my corporate Starteam CM server with a local distributed version control system (Mercurial). I am running into problems with seeing many changes due to Starteam's keyword expansion on checkout feature. For example, the server is set up to expand $History to a log of each check-in's comments and other metadata. These often cause annoying conflicts when I try to merge.
I can manually "un-expand" the keywords, but the codebase is extremely large and this would take prohibitively long.
If the keywords look like CVS/RCS keywords ($Id$ and so on), then the keyword extension bundled with Mercurial might be able to help with unexpanding those. But unfortunately it only supports simple keywords, and it sounds like $History will expand incrementally like the $Log$ CVS keyword.
But maybe you can use the keyword extension as a starting point?
Another option on the mercurial side would be to use a precommit hook to automatically un-expand the keywords in your starteam checkout files.
Something like this in your ~/.hgrc might do the trick:
[hooks]
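# -0777 reads each file whole so the pattern can span lines, /s lets . match newlines, and \$ stops Perl interpolating $History
precommit.unexpand_starteam = find . -name '*.cpp' -print0 | xargs -0 perl -0777 -pi -e 's/\$History.*?\n\n//s' ; exit 0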
That would remove everything from $History through the first blank line in every file right before committing. I've not used starteam, but there must be some way to identify the end of a history block (blank line was a guess), and with the perl line altered to reflect that you should be good to go.
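If you would rather do the same thing from a Python script or in-process hook, the substitution can be sketched as below. It makes the same guess as the perl line, namely that a history block runs from $History to the first blank line; unexpand_history is an invented name and the pattern has not been checked against real Starteam output.

```python
import re

# Same guess as the perl one-liner: a history block runs from "$History"
# through the first blank line. DOTALL lets "." match newlines.
HISTORY_BLOCK = re.compile(r"\$History.*?\n\n", re.DOTALL)


def unexpand_history(text):
    """Remove every expanded $History block from the given file contents."""
    return HISTORY_BLOCK.sub("", text)
```

Note that, like the perl version, this starts deleting at the $ sign, so any comment marker in front of $History on the same line is left behind.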