Mercurial -- Ignore certain files based on the existence of other files

Mercurial -- Ignore certain files based on the existence of other files - mercurial

I use mercurial to keep track of a repository which contains both PDF files (generated by others, which I need to keep track of), and latex files, written by me.
For instance, assume a directory structure like this:
root
- Requirements.pdf
- MyReport.tex
- MyReport.pdf
In this case, MyReport.pdf changes every time MyReport.tex does, and can be wholly determined by the contents of the tex file, so it should not be under version control.
I am looking for a way to tell mercurial to ignore such files. Obviously I can add a rule to .hgignore like this (http://www.selenic.com/mercurial/hgignore.5.html)
syntax: glob
*.pdf
But that will ignore the PDFs that I do need to keep version controlled.
There's also this link: https://www.mercurial-scm.org/wiki/TipsAndTricks#Avoid_merging_autogenerated_.28binary.29_files_.28PDF.29 but that doesn't really solve my problem either, because while it handles building the PDFs, it does not handle telling hg which files are important.
Or I could just do this manually, but I would like a way to script it, to make it more general, since these repositories can have several dozen tex and pdf files and manually managing this has become cumbersome.
It seems like quite a simple rule: If there is a file by the name of "blah.pdf", check to see if there is also a file name "blah.tex" and if so, ignore it, otherwise, pay attention to it. But I can't find anything about that.

There is no such feature in Mercurial, nor in Git, nor will there likely ever be such a feature because it's extremely niche. However, you might consider simply putting your "generated" files into a separate output subdirectory, and then ignoring all such directories. For example, if you have an input like foo/bar.tex, the output could be foo/gen/bar.tex, and you could ignore gen/.

Obviously I can add a rule to .hgignore like this
(http://www.selenic.com/mercurial/hgignore.5.html) ... But that will
ignore the PDFs that I do need to keep version controlled.
.hgignore ignore all newly added or existing not versioned files, matching pattern, but bolded texts give you at least two usable solutions:
Write regexp, which means "all pdf, except some filename(s)" (with manually added filenames, most probably)
use wide pattern, but add needed files into repository explicitly (hg add FILENAME)

Related

Remove not-matching lines from hgignore

Is there an Mercurial extension that removes lines from .hgignore that aren't matching any files in the local repository.

There exists no extension or built in function that does this. You could jerry rig a script to do to find lines that are ignoring nothing without too much work, but consider that this is probably a bad idea.
Just because the .hgignore line isn't matching an files on your local repository doesn't mean it's not matching them on anyone else's repository. Within .hgignore files you'll often find patterns like .swp and .bak. You might not use vi (which creates .swp files) and you might not use an editor that creates '.bakfiles, but other do. Or perhaps your editor creates .swp files but you don't currently have any because you're not actively editing a file. Removing that line means you'd not be ignoring a .swp file next time you had one andhg addremove` would cause it to become tracked.

Sharing files between Mercurial repositories

There are one or two files, like .hgignore, which I generally want to be the same in each of a bunch of projects.
However, the nature of these files means that I can't simply move them to a common shared project and just make the other projects depend on that project. They have to be inside each project. Symbolic links are not an option either because some of our developers use Windows.
How can I share these files between repositories and have changes propagated across (on my local machine, at least)? I'm using Eclipse.

For your specific case of hgignore you can put an entry like this in each project's .hg/hgrc file:
[ui]
ignore.common = ~/hgignore-common
If you you know your common library will always the in the parent directory, as is often the case with a subrepo setup you could do:
[ui]
ignore.common = ../hgignore-common
or if you know it will always be in a sibling directory of project checkouts you could do:
[ui]
ignore.common = ../company-wide-defaults/hgignore-common
Unforunately there's no absolute way to reference a file that's valid everywhere, but you can at least to to a point where on your machine all your checkouts are referencing a common ignore.

Hardlinking instead of copying the relevant files sort of works with Eclipse - although you have to refresh each of the other projects to get it to pick up the change. However, you can configure Eclipse to watch the filesystem and automatically refresh whenever it needs to - I recommend this.
It does not work by default with Emacs, because Emacs breaks hard links (which is normally the right thing to do, but is not what you want in this case). However, you can disable this behaviour for multiply-linked files with (setq backup-by-copying-when-linked t).

How can I ignore all directories except one using .hgignore?

I'm managing $HOME using Mercurial, to keep my dotfiles nice and tracked, or at least the ones that matter to me.
However, there's a profusion of files and directories in ~ that do not need to be tracked, and that set is ever-changing and ever-growing.
Historically, I've dealt with this by having this .hgignore:
syntax: glob
*
This keeps my status clean, as far as it goes, making only previously tracked files visible. However, I have some directories (in my case, scripts, .emacs.d) that I would like to see untracked files in; I almost always want to track new additions to those directories.
I know that I can run hg st -u scripts to identify untracked files, but I want a means whereby I can achieve the same function using plain ole hg status.
Is there a way to do this?

Try this in .hgignore instead:
syntax: regexp
^(?!(scripts|foo|bar)/)[^/]+/
^ matches start of path
(?!(scripts|foo|bar) uses negative lookahead to ignore all files except those in directories scripts, foo or bar
/) ensures that directories which have a tracked directory as a prefix are ignored
[^/]+/ then actually matches any directory (excluding those ruled out by the lookahead), so that files in ~ aren't ignored
Credit for the central idea in this solution (the negative lookahead) goes to Michael La Voie's answer to this question

This question has been asked here on SO quite a few times, and you'll get a lot of convoluted answers using zero-width negative look ahead assertions, an oft abused regex trick, but the better solutions are to either (a) just make the repo in that directory alone or (b) just add the files in that directory. For option (b) you'd just put .* in your .hgignore file to ignore everything, and then manually hg add the files you want tracked. In mercurial, unlike svn and cvs, you can override an ignore with an add.

Mercurial (Hg) and Binary Files

I am writing a set of django apps and would like to use Hg for version control. I would like each app to be independent of the others so in each app there may be a directory for static media that contains images that I would not want under version control. In other words, the binary files would not all be in one central location
I would like to find a way to clone the repository that would include copies of the image files. It also would be great if when I did a merge, if there were an image file in one repo and not another, that there would be some sort of warning.
Currently I use a python script to find images and other binary files that are in one repo, but not the other. But a lot of people must face this problem, so there must be a more robust and elegant solution.
One one other thing...for reasons I do not want to go into, usually one of my repos is on a windows machine, and the other is on Linux. So a crossplatform solution would be nice.

Since Mercurial 2.0 the extension largefiles is now included in the main distribution. That extension keeps and manages large files outside of the "normal" repository in a way that you get the benefit of DCVS but without the benefit of exponential size and processing time growth.
Other extension that work along similar lines are SnapExtension and BigFilesExtension. However, those two are not distributed with Mercurial (you have to get them manually).

Mercurial can track any kind of file, for binary files if something changes then the whole file gets replaced not just the changes.
On the getting a warning if one repo doesn't contain a file, that's kind of the point of a DVCS is that the repos are related but are autonomous. You could always check and see what files were added during a synch or merge operation.

The current Mercurial book (by Bryan O'Sullivan) says, that Mercurial stores diffs also for binary files. How efficient this is, obviously depends on the nature of changes to binary files.

How good is my method of embedding version numbers into my application using Mercurial hooks?

This is not quite a specifc question, and more me like for a criticism of my current approach.
I would like to include the program version number in the program I am developing. This is not a commercial product, but a research application so it is important to know which version generated the results.
My method works as follows:
There is a "pre-commit" hook in my .hg/hgrc file link to version_gen.sh
version_gen.sh consists solely of:
hg parent --template "r{rev}_{date|shortdate}" > version.num
In the makefile, the line version="%__VERSION__% in the main script is replaced with the content of the version.num file.
Are there better ways of doing this? The only real short coming I can see is that if you only commit a specfic file, version.num will be updated, but it won't be commited, and if I tried to add always committing that file, that would result in an infite loop (unless I created some temp file to indicate I was already in a commit, but that seems ugly...).

The problem
As you've identified, you've really created a Catch-22 situation here.
You can't really put meaningful information in the version.num file until the changes are committed and because you are storing version.num in the repository, you can't commit changes to the repository until you have populated the version.num file.
My solution
What I would suggest is:
Get rid of the "pre-commit" hook and hg forget the version.num file.
Add version.num to your .hgignore file.
Adjust version_gen.sh to consist of:
hg parent --template "r{node|short}_{date|shortdate}" > version.num
In the makefile, make sure version_gen.sh is run before version.num is used to set the version parameter.
My reasons
As #Ry4an suggests, getting the build system to insert revision information into the software at build time, using information from the Version Control System is a much better option. The only problem with this is if you try to compile the code from an hg archive of the repository, where the build system cannot extract the relevant information.
I would be inclined to discourage this however - in my own build system, the build failed if revision information couldn't be extracted.
Also, as #Kai Inkinen suggests, using the revision number is not portable. Rev 21 on one machine might be rev 22 on another. While this may not be a problem right now, it could be in the future, if you start colaborating with other people.
Finally, I explain my reasons for not liking the Keyword extension in a question of mine, which touches on similar issues to your own question:
I looked at Mercurials Keyword extension, since it seemed like the obvious solution. However the more I looked at it and read peoples opinions, the more that I came to the conclusion that it wasn't the right thing to do.
I also remember the problems that keyword substitution has caused me in projects at previous companies. ...
Also, I don't particularly want to have to enable Mercurial extensions to get the build to complete. I want the solution to be self contained, so that it isn't easy for the application to be accidentally compiled without the embedded version information just because an extension isn't enabled or the right helper software hasn't been installed.
Then in comments to an answer which suggested using the keyword extension anyway:
... I rejected using the keyword extension as it would be too easy to end up with the string "$Id$" being compiled into the executable. If keyword expansion was built into mercurial rather than an extension, and on by default, I might consider it, but as it stands it just wouldn't be reliable. – Mark Booth
A don't think that there can be a more reliable solution. What if someone accidentally damages .hg or builds not from a clone but from an archive? – Mr.Cat
#Mr.Cat - I don't think there can be a less reliable solution than the keywords extension. Anywhere you haven't explicitly enabled the extension (or someone has disabled it) then you get the literal string "$ID$" compiled into the object file without complaint. If mercurial or the repo is damaged (not sure which you meant) you need to fix that first anyway. As for hg archive, my original solution fails to compile if you try to build it from an archive! That is precisely what I want. I don't want any source to be compiled into our apps without it source being under revision control! – Mark Booth

What you are trying to do is called Keyword Expansion, which is not supported in Mercurial core.
You can integrate that expansion in make file, or (simpler) with the Keyword extension.
This extension allows the expansion of RCS/CVS-like and user defined keys in text files tracked by Mercurial.
Expansion takes place in the working directory or/and when creating a distribution using "hg archive"

That you use a pre-commit hook is what's concerning. You shouldn't be putting the rest of version_gen.sh into the source files thesemves, just into the build/release artifacts which you can do more accurately with an 'update' hook.
You don't want the Makefile to actually change in the repo with each commit, that just makes merges hell. You want to insert the version after checking out the files in advance of a build, which is is what an update hook does.

In distributed systems like Mercurial, the actual "version number" does not necessarily mean the same thing in every environment. Even if this is a single person project, and you are really careful with having only your central repo, you would still probably want to use the sha1-sum instead, since that is truly unique for the given repository state. The sha1 can be fetched through the template {node}
As a suggestion, I think that a better workflow would be to use tags instead, which btw are also local to your repository until you push them upstream. Don't write your number into a file, but instead tag your release code with a meaningful tag like
RELEASE_2
or
RELEASE_2010-04-01
or maybe script this and use the template to create the tag?
You can then add the tag to your non-versioned (in .hgignore) version.num file to be added into the build. This way you can give meaningful names to the releases and you tie the release to the unique identifier.

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008