Mercurial: user-friendly way to display the exact revision number of files?

When I was using Subversion as part of the build process I'd run an 'svn info' and capture the unique ID number and echo it to a header file for inclusion by other programs. This made it easy for users to say for example, 'I'm running build 456' and given the number 456 I could always cross reference exactly what they were running.
I'm trying to figure out how to achieve the same thing with Mercurial. 'hg summary' displays an integer id as well as the hex hash code. From what I was reading the integer id could be different for different people. I'm supposing the hash code is unique, but it's not very user friendly.
Is the hg hash code the only unique way of identifying a particular version of files in Mercurial?
Thank you,
Fred

Yes, the hash is the only way to uniquely identify a changeset.
More details are in the documentation: ChangeSet and ChangeSetID.
If you want to use an integer, I see two possible solutions, depending on your build process.
If the build always happens on the same machine (i.e. in the same repository), you can use the integer id, because it never changes within a particular repo (unless you rewrite history).
If the build of a particular version only happens once, you can use a variable that your build script increments each time.

The hg id command will give you the changeset you need. You can also add some options to the command, but the most useful and permanent part is the changeset id (the hash).
For the same repo:
>hg id -nibt
6c4d15d8cfbd 841 default tip
>hg id
6c4d15d8cfbd tip
You can also look at the commands that support output templating and combine template keywords into a friendlier string: see hg help templating.
Example for the repo shown above:
>hg log --template "{rev}:{node|short}-{latesttag}+{latesttagdistance}" -r tip
841:6c4d15d8cfbd-1.3+3
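To tie this back to the original question (echoing the id into a header file at build time), here is a minimal Python sketch. The function names and the BUILD_HG_* macro names are made up for illustration; the only hg calls used are hg id -i and hg id -n. It assumes the trailing "+" that hg id appends when the working directory has uncommitted changes.

```python
import subprocess


def hg_identify(repo_path="."):
    """Return (short_hash, local_rev) as printed by `hg id -i` and `hg id -n`."""
    def run(flag):
        result = subprocess.run(
            ["hg", "id", flag, "-R", repo_path],
            capture_output=True, text=True, check=True,
        )
        return result.stdout.strip()
    return run("-i"), run("-n")


def render_header(short_hash, local_rev):
    """Build the text of a C header from the two identifiers.

    A trailing '+' (uncommitted changes) is stripped from the values and
    recorded separately in BUILD_HG_DIRTY.
    """
    dirty = short_hash.endswith("+")
    lines = [
        "/* Generated at build time; do not edit. */",
        '#define BUILD_HG_NODE "%s"' % short_hash.rstrip("+"),
        # The local revision number is only meaningful inside this one repo.
        '#define BUILD_HG_LOCAL_REV "%s"' % local_rev.rstrip("+"),
        "#define BUILD_HG_DIRTY %d" % (1 if dirty else 0),
    ]
    return "\n".join(lines) + "\n"
```

A build script could then write render_header(*hg_identify()) to a file such as version.h (a hypothetical name). Users would quote the short hash, which stays valid in every clone.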

Related

Mercurial: detect which changesets were on the main branch at a particular revision

With repos lying around waiting to be checked in, people merging branches at random times, and general mayhem, I need to figure out exactly what (which changesets) was at the tip of the main branch at the time of a release (label/revision). And I have to do it retroactively and automatically.
It seems like a basic thing that should be achievable with hg log, but I can't figure out how. Please help!
Following @planetmaker's advice, I tried the following, and it seems to be working:
hg log -r "branch(default) and ::HashNumber"
The answer to your problem lies in the use of revsets. Check out hg help revset for a complete list of what you can do with them.
If you are interested in the last changeset in BRANCHNAME prior to a certain time: hg log -r "last(date('<2012-01-01') and branch(BRANCHNAME))" (see hg help dates for the ways to specify the date, including an exact time). Note that last() wraps the whole expression, so it picks the last changeset among those that are both before the date and on the branch.
Now, if you want to use this information in a script, you do not want the full log output, just the revision or hash itself. Use the template capability to format the output, amending the log call accordingly:
hg log -r "last(date('<2012-01-01') and branch(BRANCHNAME))" --template "{node|short}\n"
in order to get only the hash. Or use {rev} for the numerical changeset version (which is local to that very repo only, though); see hg help templates for a full list of what you can output.
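For scripted use, the revset and the hg log call can be assembled programmatically. This is a small sketch with hypothetical helper names. It assumes a form of the query in which last() wraps the whole conjunction and both the date and the branch name are quoted.

```python
def last_on_branch_before(branch, date):
    """Revset for the last changeset on `branch` dated before `date`."""
    # last() is applied after both filters, so the result is the newest
    # changeset that is on the branch AND older than the date.
    return "last(date('<%s') and branch('%s'))" % (date, branch)


def log_command(revset, template="{node|short}\\n"):
    """Argument list for `hg log` that prints only the templated value."""
    # The template keeps a literal backslash-n; hg expands it to a newline.
    return ["hg", "log", "-r", revset, "--template", template]
```

The resulting list can be passed to subprocess.run(); passing arguments as a list avoids shell quoting problems with the "<" and quote characters.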

Mercurial : "diff" of a changeset in template

I am trying to display the diff of each changed file in a changeset, using a template.
What I need is something very similar to the "hg diff" command. I cannot find anything in the templates help that might serve my purpose.
To add context, I am trying to use this template in a Bugzilla extension: I need to add the diff of the changes that went in to the Bugzilla ticket.
You can use the diff() template function
(extract from hg help templates, which is a better reference than the URL you linked):
- diff([includepattern [, excludepattern]])
If you don't specify any patterns, it will simply give you the equivalent of hg log -p. If you want to print the diff per file, you will need to pass explicit filenames as the includepattern parameter, like
hg log -r tip --template "{diff('mercurial/bundlerepo.py')}"
Looping through the list of files (like "{files % '{file}'}" in the templates help) seems broken in this case (well, I didn't manage to make it work). It is probably a bug, so you can write to the Mercurial discussion list to get confirmation.
Anyway, for more thorough support it is better to write to the Mercurial discussion list, or join the #mercurial IRC channel and ask :)
They can also guide you toward a better way of achieving what you want; it seems you are trying to reinvent something.
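One possible workaround while the in-template loop does not cooperate is to do the loop outside the template: list the files first, then issue one diff() call per file. The sketch below only builds the command lines, and the helper names are hypothetical. It assumes file names contain no single quotes and that the "\n" escape is honoured inside the quoted template string.

```python
def list_files_command(rev):
    """`hg log` call that prints one changed file per line for `rev`."""
    return ["hg", "log", "-r", rev, "--template", "{files % '{file}\\n'}"]


def file_diff_command(rev, path):
    """`hg log` call that prints the diff of a single file via diff()."""
    # Assumption: `path` contains no single quote, which would end the
    # template string early.
    return ["hg", "log", "-r", rev, "--template", "{diff('%s')}" % path]


def per_file_diff_commands(rev, files_output):
    """Pair every file listed in `files_output` with its diff command."""
    paths = [line for line in files_output.splitlines() if line.strip()]
    return [(path, file_diff_command(rev, path)) for path in paths]
```

Run list_files_command() once, feed its output to per_file_diff_commands(), and run each returned command to collect the per-file diffs.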

Given a file, how to find out which revision in a mercurial repository this is?

Assume that there is a file under hg version control. I have a particular version of that file, and I would like to find out in which revision this file was in this version.
I suspect that there are two possible ways to do this.
Do hg update in a loop and diff the file against subsequent versions (sloooow, but should work).
Make Mercurial put the rev number in a, say, comment in the second line of the file right before committing. From what I have read, a precommit hook might be of use. Then I don't have to compare anything, just look at the file itself (I'm assuming no-one will change this, of course, but this is rather safe assumption in my case).
My use case is a joint paper, written in LaTeX, with two coauthors who have no idea about version control at all, but I prefer to use it (for obvious reasons). We communicate by email, and there's effectively a human-based lock system ("I will not work on this file until you send me the next version, ok?"). The only problem that arises is that I'm sending version X to author B to proofread, then author C sends me a corrected version Y and I commit it into my repo, then author B sends his corrections Z (to version X) and I'm starting to get lost. But I can check the attachment in the email sent to B, and I only need to find out which revision it is.
So, my question is: which of the two ideas above would be better, or maybe there's yet another one to help me deal with this mess?
hg archive is a good method for future work, but I can suggest at least 3 alternative work-styles, plus 1 fix for finding the correct version among the existing changesets.
Future work
You can use separate named branches for the co-authors and default for the merged results: always send each co-author the head of his branch, update his branch after getting his corrections (you'll always know what you sent), and merge the branches into default.
One branch, with the revision sent to each co-worker marked by a bookmark, which you later move to the next point.
Mercurial keywords are considered something of a "feature of last resort", but in your case they are an obvious and usable solution: just add a keyword with the hash id to the file (use the bundled keyword extension instead of a hook; it is easier and more reliable).
Current state
To find the changeset the file came from, you can try bisect (example) and, in the test script, check e.g. the CRC of the file (you have the CRC of the unversioned file; check the versioned file against it across history).
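The comparison step can be sketched as follows. It uses a SHA-1 digest rather than a CRC (a substitution on my part; either works here) and hypothetical function names. It assumes the content of each historical version has already been fetched, for example with hg cat -r REV FILE.

```python
import hashlib


def digest(data):
    """Hex digest used to compare file contents."""
    return hashlib.sha1(data).hexdigest()


def find_matching_revisions(target, versions):
    """Return the revisions whose content is identical to `target`.

    `target` is the content (bytes) of the unversioned copy, e.g. the
    email attachment; `versions` is an iterable of (rev, content) pairs.
    """
    wanted = digest(target)
    return [rev for rev, content in versions if digest(content) == wanted]
```

More than one revision can match if the file did not change between commits; any of them identifies the version that was sent.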
If you're happy to rely on finding the emails you send the reviewers, why not just include the revision hashes in them along with the files?
You can get this for almost zero extra effort by generating your attachment using hg archive, which will create a file containing 1) your files for review, and 2) .hg_archival.txt, complete with revision hash.
Though I'd be surprised if there isn't a more elegant way, even if your collaborators are dead-set against using version control.
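When a reviewer sends the archive back, the revision can be read from .hg_archival.txt. The file consists of simple "key: value" lines (such as repo, node and branch), so a minimal parser, with a hypothetical function name, might look like this:

```python
def parse_archival(text):
    """Parse the 'key: value' lines of a .hg_archival.txt file into a dict.

    If a key occurs more than once (e.g. several tag lines), the last
    value wins in this simplified sketch.
    """
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon
            info[key.strip()] = value.strip()
    return info
```

parse_archival(text)["node"] then gives the full hash of the revision the archive was made from.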

How do I anonymise a mercurial repository?

I have a programming assignment I'm to hand in at my University for the end of this week and they have strict rules about anonymity of the assignments to maintain impartiality, so if my name (or any other obvious identifying info) appears anywhere in the work it may be automatically disqualified.
While preparing to burn everything to disc, I've just noticed/remembered that my HG repo is full of copies of my name. The code is all clean, but the author of every changeset is either my full name and email or my university login ID and the hostname of a lab computer (depends where I was working).
I need to create an anonymised version of the repo (or swap out all names for my student ID number) without losing any of the other information it holds.
So, as the headline says, how do I anonymise a mercurial repository?
You can use Mercurial's Convert extension with the --authors option to "convert" your repository into a new Mercurial repository, changing the authors' names during the conversion.
Quote from the second link:
Convert can also remap author names during conversion, if the --authors option is provided. The argument should be a simple text file that maps each source commit author to a destination commit author. It is handy for source SCMs that use UNIX logins to identify authors (e.g. CVS). Example:
john=John Smith <John.Smith@someplace.net>
tom=Tom Johnson <Tom.Johnson@bigcity.com>
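For the anonymisation case, every author string in the repository has to be mapped to the same replacement. The sketch below generates such a map; the function name and the sample identities are made up. It assumes the author strings were collected beforehand (for example with hg log --template "{author}\n") and that none of them contains "=".

```python
def build_author_map(authors, replacement):
    """Return the text of an --authors file mapping each author to `replacement`."""
    seen = []
    for author in authors:
        author = author.strip()
        # Keep one line per distinct author, in order of first appearance.
        if author and author not in seen:
            seen.append(author)
    return "".join("%s=%s\n" % (author, replacement) for author in seen)
```

Write the result to a file and pass it to hg convert with --authors. Afterwards, check the new repository's log to confirm that no original name is left.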
If you don't have any merge changesets, then you could try using the graft command in Mercurial 2.0 to graft your repository to a new repository while changing the recorded user name.
If you do have merge changesets, then it might be possible to use the transplant extension in Mercurial 2.2, although changing the recorded user name appears to be harder.

Mercurial repository identification

I need to be able to uniquely identify a Mercurial repository and have that identifier placed in a file that is included when cloned. If I can put the identifier in a file in the .hg folder that is preferable to simply adding a normal file to the repo.
I understand that I can get a near certain identifier from the first changes that are committed. I know that the hgrc file cannot be used to store the identifier, because it is not cloned.
So, my question is: Is there another file in the .hg folder that is cloned that I can use to put the identifier? Thanks.
From first read, it sounds like you want to be able to make sure that a clone of the repository is a clone of the correct repository and not some stand-in impostor. However, if the identification information you're thinking of using is cloned with everything else, then an impostor would still pass this test. You'd need to keep that identifier separate so that it can be compared against information in the clone.
Whether that is your purpose or not, any file in .hg that is cloned you may not want to edit. You'd have to add a file to be tracked in the other areas of the repo, outside of .hg. However, you don't really need an extra file at all, as the changeset hash is not just near certain, but very certain, so the information for handily identifying a repository is built-in to the repository itself.
On the commandline, you can get either the short or full versions of the very first changeset's hash identifier:
> hg id -i -r0
89abf5502e3c
> hg log -r0 --template "{node}"
89abf5502e3c5c65e532db04d8d87141f0ac8b73
If I am correct about your desire to compare 2 identifiers so that you or someone else knows a clone of the repository is a true clone and not a false clone, you would have the same changeset id available separately so that someone can use one of the above commands to see the id of their clone and compare it to what you say it should be. This is much like how many websites with downloadable executable files show a hash identifier next to the download link so that you can hash the file yourself and compare the result to the hash on the website.
Edit regarding your comment that sheds light on the purpose of this:
Since you need to be able to read it from a file, there are a couple options:
Tracked file in repository root
There is one file you might consider, other than creating your own: .hgtags.
hg tag -r0 ident
...would tag the very first revision, allowing you to use ident as a reference to that changeset rather than -r0. Mercurial always uses tag information from the latest version of .hgtags, no matter what changeset the working directory is updated to, but that may not matter to your app. hg tag appends a line such as this to the .hgtags file, creating the file if it doesn't exist:
a247494248c4b96a571bbd12e90eade3bf559281 ident
This is most handy if you don't have a tags file yet in your repos, because it will be the first line in the file for easy finding. You might think you could simply write this file yourself, but then you'd still have to call hg to get the changeset id and again at some point for adding it to tracking and then committing: hg tag does all that for you.
If there is already the possibility of a tags file to consider, that's ok, too, because they tend to be relatively short and you just need to look for the 1 line that ends with your chosen tag name. Mercurial is designed for append-only operations to .hgtags, but everything would still work fine if you inserted the line for this tag as the very first line if .hgtags already exists because: 1. The tag will never be moved or removed. 2. You'll be using a tag name not already used in the file.
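Reading the identifier back from .hgtags only requires finding the line that ends with the chosen tag name. This is a minimal sketch with a hypothetical function name. It lets a later line override an earlier one, as Mercurial's append-only handling of the file does.

```python
def lookup_tag(hgtags_text, name):
    """Return the changeset hash recorded for tag `name`, or None."""
    node = None
    for line in hgtags_text.splitlines():
        # Each line is '<40-hex-digit node> <tag name>'; tag names may
        # contain spaces, so split only once.
        parts = line.strip().split(" ", 1)
        if len(parts) == 2 and parts[1] == name:
            node = parts[0]  # later entries take precedence
    return node
```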
Reading hg's guts
There are files that normally only Mercurial itself touches deeper in .hg that can be read to get the first changeset's hash. I looked into Mercurial's File Formats, Revlog, and RevlogNG, and at least for 2 of my own repos, .hg\store\00changelog.i contains the first changeset's hash at offset 0x20 (20 byte length). Probably, at least since Mercurial 0.9, it will be the same in all repos. RevlogNG also notes the first 4 bytes of that file will indicate Revlog version number and flags. While the changeset id is only 20 bytes long currently, the actual field for it is 32 bytes long, probably for future expansion to a longer hash.
Since this option requires no alteration of existing repositories and only involves reading the first 52-64 bytes of the main index, it's the one I'd probably go with. If I was catching this requirement in the early stages of the product before any repos it manages were out in the wild, I would lean toward the custom file approach because I would probably have my own metadata file created and added from the beginning of the repo.
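Reading the hash straight from the index could look like the sketch below. It relies entirely on the layout observed above (first changeset's 20-byte hash at offset 0x20 of 00changelog.i), which is an assumption that may not hold for every repository format. The function names are hypothetical.

```python
import os

NODE_OFFSET = 0x20   # offset of the node id within the first index entry
NODE_LENGTH = 20     # bytes actually used by the hash


def first_node_hex(index_bytes):
    """Extract the first changeset's hash from the start of the index data."""
    if len(index_bytes) < NODE_OFFSET + NODE_LENGTH:
        raise ValueError("index data too short to contain a node id")
    return index_bytes[NODE_OFFSET:NODE_OFFSET + NODE_LENGTH].hex()


def read_first_node(repo_path):
    """Read the first 64 bytes of .hg/store/00changelog.i and decode the hash."""
    path = os.path.join(repo_path, ".hg", "store", "00changelog.i")
    with open(path, "rb") as index_file:
        return first_node_hex(index_file.read(64))
```

It is worth cross-checking the result once against hg log -r0 --template "{node}" before depending on it.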
The "repository is unrelated" message comes from mercurial/treediscovery.py:
base = list(base)
if base == [nullid]:
    if force:
        repo.ui.warn(_("warning: repository is unrelated\n"))
    else:
        raise util.Abort(_("repository is unrelated"))
The base variable stores the last common parts of the two repositories. Following the idea behind this push/pull check, we may assume that repositories are related if they have common roots, so compare the hashes printed by:
$ hg log -r "roots(all())"
For a reason unknown to me, hg log -r 0 always showed the same root, but you may have a situation where FIRST_REPO holds SECOND_REPO's history: revision 0 of SECOND_REPO then obviously differs from revision 0 of FIRST_REPO, yet Mercurial's check passes.
You cannot trick the roots check by carefully crafting repositories, because building two repositories that look like this (with common parts but different roots):
0 <--- SHA-1-XXX <--- SHA-1-YYY <--- SHA-1-ZZZ
0 <--- SHA-1-YYY <--- SHA-1-ZZZ
is impossible: each subsequent hash depends on the previous values, so it would mean reversing SHA-1 (the hash Mercurial uses for changeset ids).
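The dependency on previous values can be illustrated with a toy model. Mercurial node ids are 20-byte SHA-1 hashes; the formula below (SHA-1 over the two sorted parent ids followed by the revision text) is a simplified model of how they are derived, and the helper names are hypothetical.

```python
import hashlib

NULL_ID = b"\x00" * 20  # parent id used for a root changeset


def node_id(text, p1=NULL_ID, p2=NULL_ID):
    """Toy node id: SHA-1 of the sorted parent ids followed by the content."""
    first, second = sorted([p1, p2])
    return hashlib.sha1(first + second + text).digest()


def chain(texts):
    """Node ids of a linear history whose revisions hold `texts` in order."""
    parent = NULL_ID
    nodes = []
    for text in texts:
        parent = node_id(text, parent)
        nodes.append(parent)
    return nodes
```

Here chain([b"x", b"y", b"z"]) and chain([b"y", b"z"]) share the contents y and z but have no node id in common, because the parent id is mixed into every hash. This is why the second picture above cannot be produced.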