Mercurial repository narrow clone? - mercurial

Update because of new insights: Upon seeing this question five years later, I realise that this stems from trying to use a version control system as a package manager. This of course leads to all sorts of unexpected issues, and we shouldn't be using it that way. If you're reading this question, I suggest searching for a package manager for your preferred language.
My original question: I'm currently in the process of moving from Subversion to Mercurial, and I have to say I don't regret that decision. However, when trying to convert my project, I ran into a problem with Mercurial that I can't seem to get fixed. I have two distinct projects: one is a framework, and the other is an application that relies on that framework. Here's what the repositories look like:
The Framework repository:
docs/
deploy/
lib/
tests/
The Application repository:
application/
config/
lib/
tests/
www/
What I'd like is for the application's lib/ directory to contain a copy of the framework's lib/ directory. I used to do this using svn:externals. Now, I am aware that Mercurial supports the concept of subrepositories, but that doesn't seem like the "correct" solution, as it doesn't actually pull in the lib/ directory like I wanted, and you'll still have to pull and push changes manually. On top of that, once you clone the framework repository, you get all of it, not just the lib/ directory. I only need the lib/ directory, not the tests or the docs.
Now, I thought up two different solutions to this problem, but I wonder which is the best. The first solution would be to clone the framework in a different directory altogether and create a symlink in the application's lib/ directory which points to the framework's lib/ directory. Putting the symlink in .hgignore should make sure all is well, I think? That means that you could edit the framework's code and commit that, and you could edit the application's code and commit that, too.
The other option is to have multiple repositories. The framework gets pulled as a whole, which means you'll get the docs/, deploy/, tests/ etc. directories, which are not needed for usage of the framework. I thought maybe creating a repository purely for the library might be a solution, although I sincerely doubt it, as the unit tests are very dependent upon the library itself.
Does anyone know a decent solution for this problem?

You should put separate components in their own repositories. Then, when you create an application, you can use the convert extension to create a pullable framework repository out of the normal one:
$ hg convert --filemap map.txt framework new-framework
with map.txt containing the renames/includes/excludes (the following one should only include the lib directory and move everything in there to the repository root):
include lib
rename lib .
From the application repo, you can now just pull the framework repo (use -f the first time, since the repositories will likely have nothing to do with each other).
$ cd project
$ hg pull -f ../new-framework
$ hg merge
Now, when the development goes on, you just have to re-create the converted repo every time before you pull and you're good to go. We actually have a hook on our framework repository that re-creates the converted repo on every changegroup (= every push).
This way you have both work areas (app and framework) in their own repositories, while the app repo contains the complete framework history and is able to be updated by simply pulling from the converted repo.
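The changegroup hook mentioned above could be set up roughly like this. It is only a sketch, not the exact hook we run: the function name install_convert_hook and all paths are hypothetical, the paths are assumed to be absolute, and the convert extension is enabled on the command line so the hook does not depend on the user's own configuration.

```shell
#!/bin/sh
# Sketch: append a changegroup hook to the framework repository's .hg/hgrc so
# that the filtered repository is refreshed whenever changesets arrive.
# install_convert_hook and every path passed to it are hypothetical.
install_convert_hook() {
    repo=$1        # absolute path of the framework repository
    filemap=$2     # absolute path of map.txt
    target=$3      # absolute path of the converted repository
    # hgrc lines must start in column 0, so the here-document is not indented.
    cat >> "$repo/.hg/hgrc" <<EOF
[hooks]
# Runs after a group of changesets has been added (push, pull or unbundle).
changegroup.convert = hg --config extensions.convert= convert --filemap "$filemap" "$repo" "$target"
EOF
}
```

It would be called once per framework repository, for example `install_convert_hook /srv/hg/framework /srv/hg/map.txt /srv/hg/new-framework`.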

You should separate out the libraries into a separate repository if you need to refer to only that.
Provided you actually do need to only refer to that. What is the problem with referring to the repository containing the lib directory + other stuff, other than "it doesn't feel right"?
As for "still have to pull and push manually": when you pull your main clone, it'll pull the subrepos as well, but it won't update them to a newer revision. That is a good thing. You need to do that manually, just as you should with Subversion.

Related

Using Mercurial via a USB flash drive

In short:
How can I use Hg to synchronize repositories between two computers using a flash drive as intermediary?
With more detail:
I often develop code on computers that aren't networked in any way, and I transfer files between these machines using a USB flash drive. Now I would like to develop some software across these machines using Hg repositories on each machine that I can frequently sync-up using the flash drive transfer mechanism.
I'm slightly familiar with Hg, as I use it in the most simple way possible for versioning only my own work on independent machines, but am uncertain as to exactly what I should do to use it to synchronize repositories between two computers using a flash drive as intermediary. Maybe, for example, I need to create a temporary repository on the flash drive (using “clone”) from which I then sync to (using “push” and “pull”), and do this by A→flash, flash→B, B→flash, flash→A? The more specificity in your answer regarding the sequence of actions and commands, the more useful to me.
Finally, how do I get this process started? Do I need to do something so Hg knows these are all part of one code base? For example, each of my current repositories on the different computers was created independently from a time before I started using Hg, and although all the code is similar, independent changes have been made to each, and the repositories know nothing about each other. If what I need to do with this is different than what I need to do for the ongoing case once I have everything unified, spelling this process out for me as well would also help.
In case it's important, these machines can be running any of Windows, Mac, or Linux, and my versions of Mercurial are slightly different on each machine (though the Mercurial versions could be unified if needed).
What you have described above in terms of using the flash drive as an intermediate storage location should work. My process would be:
initial setup
create repo on computer A (using hg init)
clone the repo from computer A to flash drive
hg clone C:/path/to/repo/A X:/path/to/flash/drive/repo
clone the repo from flash drive to computer B
hg clone X:/path/to/flash/drive/repo C:/path/to/repo/B
working process
edit/commit to repo on computer A
push from computer A to flash drive
hg push X:/path/to/flash/drive/repo
pull from flash drive to computer B
hg pull X:/path/to/flash/drive/repo
edit/commit repo on computer B
push from computer B to flash drive (same commands as above)
pull from flash drive to computer A (same commands as above)
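The working process above can be wrapped in two small shell functions. This is only a sketch: the names to_flash and from_flash and the paths in the usage note are made up, and `hg pull -u` is used here so that the working directory is updated right after pulling.

```shell
#!/bin/sh
# Sketch of the A -> flash -> B round trip. Function names are hypothetical;
# pass absolute paths for the working repository and the clone on the drive.

# Push all local commits from a working repository to the clone on the drive.
to_flash() {
    repo=$1; flash=$2
    hg -R "$repo" push "$flash"
    status=$?
    # hg push exits with status 1 when there is nothing to push,
    # so only a status above 1 is treated as a failure.
    [ "$status" -le 1 ]
}

# Pull whatever is on the drive into a working repository and update to it.
from_flash() {
    repo=$1; flash=$2
    hg -R "$repo" pull -u "$flash"
}
```

On computer A this might be `to_flash /home/me/project /media/usb/project`, followed on computer B by `from_flash /home/me/project /media/usb/project`. If both machines committed since the last sync, the pull creates a second head and an `hg merge` plus commit is still needed.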
"Finally, how do I get this process started? Do I need to do something so Hg knows these are all part of one code base?"
Mercurial knows if two arbitrary repositories have a common ancestor by looking at the SHA1 hash keys of the commits in each repo. In other words, assuming both repos have at least one common hash key in their histories, Mercurial will attempt to merge them. In your specific case, where both repos are initially un-versioned, Mercurial will need some help. The best thing to do would be to get to a place where both repos are identical and then perform your hg init. Mercurial should handle sharing from this point on.
When working offline on different machines, it is better to use the bundle command that comes with Mercurial. So I am echoing what dls wrote, but with a slightly changed process.
Initial setup as mentioned by dls.
or
Go to your Mercurial repository top directory
Create bundle: hg bundle --base null ../project.hg
Copy the project.hg file to your other computer
Create a directory there
Make it a Mercurial repository: hg init
Incorporate the bundle: hg pull <path/project.hg>
hg update
Check hg log: both repositories should show the same base revisions and tip
Workflow using bundle
I use a slightly different workflow. I keep these repositories as distinct repositories.
I mention them as repo1 and repo2.
Suppose that the current tip of repo1 is 4f45839f613c.
You make changes and commit them in repo1
Create a bundle of the changes. This bundle contains all changes since the specified base revision:
hg bundle --base 4f45839f613c changes.bundle
Take it to repo2 by copying the bundle.
You can simply pull the bundle into repo2:
hg pull changes.bundle
If the bundle contains changes that are already present in repo2, then these will be ignored when pulling. As long as the bundle doesn't grow too large, this lets you reuse the bundle command with the same --base revision again and again to create bundles that include further changes.
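The exchange described above can be sketched as two helper functions. The names make_bundle and apply_bundle are hypothetical, and the base revision is whatever changeset both repositories are known to share.

```shell
#!/bin/sh
# Sketch of the incremental bundle exchange between repo1 and repo2.
# Function names are hypothetical.

# In repo1: write a bundle of everything the other side is missing, assuming
# it already has the base revision and all of that revision's ancestors.
make_bundle() {
    repo=$1; base=$2; out=$3
    hg -R "$repo" bundle --base "$base" "$out"
}

# In repo2: import the bundle (changesets already present are skipped),
# then update the working directory.
apply_bundle() {
    repo=$1; bundle=$2
    hg -R "$repo" pull "$bundle" && hg -R "$repo" update
}
```

For example, `make_bundle /path/to/repo1 4f45839f613c /media/usb/changes.bundle` on the first machine and `apply_bundle /path/to/repo2 /media/usb/changes.bundle` on the second.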
About bundles: these are (very well) compressed. The following creates a (compressed) backup of the whole repository:
hg bundle --base null backup.bundle
[Edit : Adding some links on this topic]
http://blog.experimentalworks.net/2010/09/review-remote-changes-offline-in-mercurial/
https://www.mercurial-scm.org/wiki/Bundle
[Edit: What I think is advantage of using bundle]
Bundles can be created offline, copied, or sent by mail. Pushing to a repo on a flash drive requires the drive to be connected, whereas a bundle does not require the two repositories you push and pull between to be available at the same time.
Apart from that, bundles come in two types: changeset bundles and incremental bundles. Changeset bundles are complete, standalone bundles. You can also use a bundle as a single-file backup.

Mercurial - How to stop tracking modified file but keep the first version in repository

I created an hg repository from my source tree. I want to keep the first version of some files, such as Makefile, in the repository, and have hg not treat them as modified even after I change them.
The original problem is that ./configure usually modifies the Makefile, but I don't want the build files to be committed to the repository. So I want to keep only the first version of configure and Makefile in the repository, so that everybody who clones my repository can run ./configure themselves without touching the repository.
I tried hg remove and hg forget, but those stop tracking the files and also delete them in the next revision of the repository.
.hgignore doesn't do it either.
I thought of running hg revert every time I run ./configure or make, but that's not an efficient way.
Are there any better ways?
It's usually good form to not track the configure script at all. There are some reasons for this:
It's huge. I've seen code bases where the configure script and helper macro libraries were more than ten times the size of the actual code being compiled.
When other developers make changes to configure.in(.ac), they are going to need to commit a new configure script. If three people do that, there's a good chance that Mercurial will require at least one of them to manually resolve a merge conflict in configure itself. Keep in mind, configure is machine generated, attempting to read it (much less resolve merge conflicts) may make your eyes bleed.
Generally, you'll offer a program in source form via two methods:
Download of a release archive (e.g. foo-1.2.3-rc2.zip), this can contain the configure script.
Downloading the repository directly using Mercurial. If they want to work with that, they'll need to have autoconf installed.
In the root of my repositories, I usually include a file called autogen.sh that runs all of the steps needed (aclocal, autoconf, ...), and which also alerts the user if they need something installed, for example: "Could not find tool aclocal, please install the autoconf package."
It's really best to just go with the autogen.sh method. This means only tracking configure.in (or configure.ac) and the associated Makefile templates (Makefile.in). Let each build configure its own, and provide a distclean target to remove all files configure generates. Finally, provide a maintainer-clean target to remove anything that the configuration suite itself generated, e.g. configure.
That should help make nightly builds easy.
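A minimal autogen.sh along those lines might look like the sketch below. The list of tools and the package names used in the hints are assumptions, so extend them with whatever your project actually needs (autoheader, automake, libtoolize and so on).

```shell
#!/bin/sh
# autogen.sh sketch: regenerate configure from configure.ac (or configure.in).
# The tool list and the package names in the hints are assumptions.

# Check that a tool is on PATH; print an install hint and fail otherwise.
need_tool() {
    tool=$1; package=$2
    if ! command -v "$tool" >/dev/null 2>&1; then
        echo "Could not find tool $tool, please install the $package package." >&2
        return 1
    fi
}

# Verify the tools first, then run them in order, stopping at the first failure.
autogen() {
    need_tool aclocal automake || return 1
    need_tool autoconf autoconf || return 1
    aclocal && autoconf
}
```

A real script would end with a call such as `autogen || exit 1`, after which the usual ./configure and make steps apply.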
You could try to set up a pre-commit hook which would always restore the original Makefile content if it is found in the changeset.
The SO question illustrates reading the content of the changeset to be committed.
Make sure to use the pre-commit hook, and not precommit.

How to best set up Mercurial on a Clearcase static view? (Set up "checkout" hooks?)

I'd like to set up a mercurial repository in a clearcase static view directory. My plan is to clone from that directory, do all my real work in a mercurial repo and then push my changes back to the shared Hg/Clearcase dir.
I'd like to hear general suggestions on how this might work best, but I foresee one specific problem: Clearcase locks files as read-only until they are checked out. The way I'd like it to work is to set up a mercurial hook to check out the file before the push is completed and roll back the push if the checkout doesn't work.
Should I be looking at the pretxncommit hook? Or the pull hook? Also, I'm not quite clear on how to write the actual hooks either. I know the clearcase command, but I'm not sure how to construct the hook to pass in the filename for each file in the changeset.
Suggestions?
The question I answered 2 days ago, "How to bridge git to ClearCase?", can give you an illustration of the process.
I like to keep the ClearCase checkout/checkin step separate from the DVCS work:
I unlock files as I need them within the DVCS repo (made directly within the snapshot view), and then update the snapshot view, which tells me which files are "hijacked" (which I can then easily check out and check in through the cleartool update GUI).
But if you have cloned your DVCS repo somewhere else, and push it back to a local repo which is not the ClearCase snapshot view, what you could do is simply copy the view.dat hidden file of your snapshot view back to the root directory of the DVCS repo.
That simple file is enough to turn the local repo back into a ClearCase snapshot view!
Then you make all the files read-only (except those modified after a certain date, i.e. the time when you started working), to avoid ClearCase considering all the files as hijacked.
The rest is similar to the first approach: update, checkout/checkin.

Is it possible to checkout a single directory from a Mercurial (HG) repository?

So, I'm trying to checkout just the TestNG plugin from the Netbeans contrib repository. (Or is it module? I'm new to Mercurial, so I don't really know the lingo yet.)
When I run the following command...
hg clone http://hg.netbeans.org/main/contrib/
...I get the entire repository, which contains all of the contrib plug-ins. Is it possible to just pull this location?
http://hg.netbeans.org/main/contrib/file/tip/testng/
Thanks!
This concept is called "narrow cloning" and no, it's not possible at the moment in Mercurial.
It's on the radar of some of us that contribute to Mercurial but it's a hard problem to solve. For example:
How do you calculate the hash of any new commits you make if you don't have all of the files in the repo?
What happens if you try to view the history of a file in contrib/testng if that file was moved from another folder?
I'm not sure, but I think the answer in the general case is "probably not".
If the repository is local (it doesn't sound like it is in your case), you can do something like:
hg archive -R /path/to/my/repo -I /path/to/my/repo/folder/i/want export-folder-name
(The command would need to be something that exports non-VC'd files, rather than creating a partial repo, since the .hg stuff is stored once at the toplevel, rather than in pieces in each folder as SVN does.)
It doesn't work on remote repositories, though. Neither does "hg log", and the hg folks explained why:
Imagine I send a log -p command to http://www.kernel.org/hg/linux-2.6, which is
approaching 100k changesets. At one diff per second (lots of seeking), this will
take about 3 hours of CPU/disk time on the server, nevermind metric tons of
bandwidth. It would be faster and simpler for everyone just to clone the repo
and do the log locally.
I suspect hg archive can't work remotely for the same reason.

How do I clone a sub-folder of a repository in Mercurial?

I have a Mercurial repository containing a handful of related projects. I want to branch just one of these projects to work on it elsewhere.
Is cloning just part of a repository possible, and is that the right way to achieve this?
What you want is a narrow or partial clone, but this is unfortunately not yet supported.
If you already have a big repository and you realize that it would make sense to split it into several smaller repositories, then you can use the convert extension to do a Mercurial to Mercurial conversion. Note that this creates a new repository foo and you cannot push/pull between your-big-repo and foo.
The convert extension is not enabled by default so add the following to your repo's hgrc file or your mercurial.ini file:
[extensions]
hgext.convert=
Then create a map.txt file with
include "libs/foo"
rename "libs/foo" .
(note you can use forward slashes even on Windows) and run
$ hg convert --filemap map.txt your-big-repo foo
That will make foo a repository with the full history of the libs/foo folder from your-big-repo.
If you want to delete all evidence of foo from your-big-repo you can make another conversion where you use exclude libs/foo to get rid of the directory.
When you have several repositories like that and you want to use them as a whole, then you should look at subrepositories. This feature lets you include other repositories in a checkout — similarly to how svn:externals work. Please follow the recommendations on that wiki page.
Instead of doing a partial clone, you can use the Convert Extension to split your repo into more than one repo by sub repository.
Specifically, see the section, Converting from Mercurial:
It's also useful to filter Mercurial repositories to get subsets of an existing one. For example to transform a subdirectory subfoo of a repository foo into a repository with its own life (while keeping its full history), do the following:
$ echo include subfoo > /tmp/myfilemap
$ echo rename subfoo . >> /tmp/myfilemap
$ hg convert --filemap /tmp/myfilemap /path/to/repo/foo /tmp/mysubfoo-repo
I've stumbled across this issue and found one way to do it: using symlinks (Linux only, unfortunately).
For example, if you only need /project in the repository, clone the repo into another folder on your computer, then use ln -s /repo/location/ project. Mercurial will handle it.
(Late 2016) Mainline Mercurial still doesn't package support for "narrow clones" but there are third party extensions that tackle the problem in different ways.
If you can cope with just a narrow checkout (aka "sparse checkout" or "partial checkout by file path") then Facebook's sparse.py extension from the hg-experimental repository (look inside the hgext3rd/ directory) may be workable. In this scenario, you still clone the full history (thus the .hg directory is no smaller) but your working directory only shows/acts on a subset of the full repository.
Alternatively Google have created a NarrowHG extension that does narrow cloning (aka "partial cloning by file path"). You will need to be in control of the server, the client and be willing to use experimental features but it really does restrict the clone's copied history in .hg to a subset of what was in the original repository.
(2019) The sparse extension was merged into Mercurial 4.3 as the experimental sparse extension. The NarrowHG extension was merged into Mercurial 4.6 as the hgext.narrow extension.
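With the bundled narrow extension, a narrow clone might be requested as in the sketch below. Treat it as an assumption-laden example rather than a recipe: the feature is experimental, the server must have the narrow extension enabled as well, and the function name narrow_clone and the arguments in the usage note are hypothetical.

```shell
#!/bin/sh
# Sketch: clone only the history of one directory using the experimental
# narrow extension (Mercurial 4.6 or later). The server side must also have
# the extension enabled. narrow_clone is a hypothetical helper name.
narrow_clone() {
    url=$1; dir=$2; dest=$3
    # Enable the extension for this one command instead of editing hgrc.
    hg --config extensions.narrow= clone --narrow --include "$dir" "$url" "$dest"
}
```

A call could look like `narrow_clone https://hg.example.org/big-repo libs/foo foo-narrow`, where the URL and paths are placeholders.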
It is not possible; hg clone will clone the whole repository.
You can take a look at the sub-repository extension that allows you to have repositories inside a repository, which might match your needs.
This is straightforward with the Convert extension.