Ignore a directory when pull/push in mercurial - mercurial

I've got a situation where I have a dev and QA version of a repo.
I have a directory foo, which contains code that I always want to be pushed/pulled.
I have a directory bar, which typically contains items specific to that region (what they are is irrelevant), but sometimes will contain items that I do want to push/pull.
push to qa - in such a way that it will only take foo, but ignore the changes in bar.
push to qa - in such a way that it will take both foo and bar.
push to qa - in such a way that it will take foo and some of bar.
From what I can tell, I would need to merge, then revert certain files back and then commit these. This is OK, but seems a bit backwards. Is there a better way to get Mercurial to support this workflow (Or a similar one)?

As Chris Morgan says, that's not really the way that Mercurial works. When you push to another repository, you push a number of changesets - anything that's in those changesets gets pushed.
One possible solution, although it's a very horrid workaround (in my opinion) would be to use the Convert extension to create a new repository, filtering out what you don't want by using the --filemap option.
However, if this is going to QA, would they need a Mercurial repository anyway? You could simply archive a (tagged) version. By default the archive will contain information on the changeset that you're archiving in the .hgarchival file. For example
hg archive -X "bar/**" archive.tgz
... will create a file archive.tgz containing an archive of your repo at the current point, without the "bar" directory. The -X option can be used multiple times if you wish to exclude specific files.

Related

Mercurial: how do I create a new repository containing a subrange of revisions from an existing repo?

I have a repo with subrepos, with a long history. At some point the main subrepo became fully self-contained (doesn't depend on other sister subrepos).
I don't care anymore about the history of the whole thing before the main subrepo became self-contained. So I want to start a new repo that contains just what the subrepo has in it from that moment on. If possible, please describe in terms of TortoiseHg commands.
You probably want to make use of mercurial's convert extension. You can specify revisions to be converted, paths, branches and files to include or exclude in the newly-created repository.
hg convert from-path new-repo
Convert is a default extension which just needs activation.
You might be able to weed out any other changesets you don't need by using the hg strip command.
It will entirely remove changesets (and their ancestors) from a repository. Most likely you would want to make a fresh clone and then work on stripping it down.
One potential pitfall is that the final stripped repo would still have shared parentage with the original; therefore the potential exists for someone to accidentally pull down again changesets which were stripped.
The hg convert command (noted in another answer) does not have this downside.

Mercurial Repo Living Archive

We have an Hg repo that is over 6GB and 150,000 changesets. It has 8 years of history on a large application. We have used a branching strategy over the last 8 years. In this approach, we create a new branch for a feature and when finished, close the branch and merge it to default/trunk. We don't prune branches after changes are pushed into default.
As our repo grows, it is getting more painful to work with. We love having the full history on each file and don't want to lose that, but we want to make our repo size much smaller.
One approach I've been looking into would be to have two separate repos, a 'Working' repo and an 'Archive' repo. The Working repo would contain the last 1 to 2 years of history and would be the repo developers cloned and pushed/pulled from on a daily basis. The Archive repo would contain the full history, including the new changesets pushed into the working repo.
I cannot find the right Hg commands to enable this. I was able to create a Working repo using hg convert <src> <dest> --config convert.hg.startref=<rev>. However, Mecurial sees this as a completely different repo, breaking any association between our Working and Archive repos. I'm unable to find a way to merge/splice changesets pushed to the Working repo into the Archive repo and maintain a unified file history. I tried hg transplant -s <src>, but that resulted in several 'skipping emptied changeset' messages. It's not clear to my why the hg transplant command felt those changeset were empty. Also, if I were to get this working, does anyone know if it maintains a file's history, or is my repo going to see the transplanted portion as separate, maybe showing up as a delete/create or something?
Anyone have a solution to either enable this Working/Archive approach or have a different approach that may work for us? It is critical that we maintain full file history, to make historical research simple.
Thanks
You might be hitting a known bug with the underlying storage compression. 6GB for 150,000 revision is a lot.
This storage issue is usually encountered on very branchy repositories, on an internal data structure storing the content of each revision. The current fix for this bug can reduce repository size up to ten folds.
Possible Quick Fix
You can blindly try to apply the current fix for the issue and see if it shrinks your repository.
upgrade to Mercurial 4.7,
add the following to your repository configuration:
[format]
sparse-revlog = yes
run hg debugupgraderepo --optimize redeltaall --run (this will take a while)
Some other improvements are also turned on by default in 4.7. So upgrade to 4.7 and running the debugupgraderepo should help in all cases.
Finer Diagnostic
Can you tell us what is the size of the .hg/store/00manifest.d file compared to the full size of .hg/store ?
In addition, can you provide use with the output of hg debugrevlog -m
Other reason ?
Another reason for repository size to grow is for large (usually binary file) to be committed in it. Do you have any them ?
The problem is that the hash id for each revision is calculated based on a number of items including the parent id. So when you change the parent you change the id.
As far as I'm aware there is no nice way to do this, but I have done something similar with several of my repos. The bad news is that it required a chain of repos, batch files and splice maps to get it done.
The bulk of the work I'm describing is ideally done one time only and then you just run the same scripts against the same existing repos every time you want to update it to pull in the latest commits.
The way I would do it is to have three repos:
Working
Merge
Archive
The first commit of Working is a squash of all the original commits in Archive, so you'll be throwing that commit away when you pull your Working code into the Archive, and reparenting the second Working commit onto the old tip of Archive.
STOP: If you're going to do this, back up your existing repos, especially the Archive repo before trying it, it might get trashed if you run this over the top of it. It might also be fine, but I'm not having any problems on my conscience!
Pull both Working and Archive into the Merge repo.
You now have a Merge repo with two completely independent trees in it.
Create a splicemap. This is just a text file giving the hash of a child node and the hash of its proposed parent node, separated by a space.
So your splicemap would just be something like:
hash-of-working-commit-2 hash-of-archive-old-tip
Then run hg convert with the splicemap option to do the reparenting of the second commit of Working onto the old tip of the Archive. E.g.
hg convert --splicemap splicemapPath.txt --config convert.hg.saverev=true Merge Archive
You might want to try writing it to a different named repo rather than Archive the first time, or you could try writing it over a copy of the existing Archive, I'm not sure if it'll work but if it does it would probably be quicker.
Once you've run this setup once, you can just run the same scripts over the existing repos again and again to update with the latest Working revisions. Just pull from Working to Merge and then run the hg convert to put it into Archive.

Remove branches in Mercurial but keep changes?

I just found out that we're not supposed to create named branches in our local Mercurial repos, because they get carried along and pushed to the upstream repo, where they live forever.
Unfortunately, I already have multiple feature branches in my local repo, which I was all set to merge to default and push up to the main repo. So, what I need to do is keep the work in these branches (somehow) but remove the named branches.
Is there a way to do this, short of a lot of horrible manual revert/cut/paste?
Using TortoiseHg:
(Ensure rebase extension is enabled in File -> Settings -> Extensions)
Update to parent of the first change set of the named branch.
Right click on the first change set of the nambed branch and click rebase.
Ensure "Keep orignal branch names (--keepbranches)" is NOT checked.
Ensure "Rebase entire source branch (-b/--base)" is checked.
Click rebase.
Another approach would be to use the hg convert command (a standard extension) and the branchmap option.
For a private repository or at least one with changesets which have not been published this would work fine, even though like a rebase it would modify history.
An advantage as opposed to using rebase is that it could be used to deal with multiple branches (named or unnamed) all at once. That might be quicker / easier if there was a lot of things to move around.
A secondary plus is that since convert will create a NEW repository there is no risk to the original if something doesn't work as expected. (Of course when using rebase you could make a backup first which I think is a reasonable precaution.)

Paralell branches/clones in Mercurial?

This question is not only technical. I want to get into the concept itself, too.
There is a foreign project on BitBucket (e.g. ObjectListView). And I need to work on two problems at the same time. In git I would just create to branches in my local repository after clonening.
But how would I does this with Mercurial?
When I create branches there I am not able to push my local commits back to the remote repository because of the missing '--allow-new-branch' option.
So it doesn't want me to branch.
In my current understanding I would just create a clone for each new feauter (branch). So this means I have to create two forks on BitBucket for one project when I want to work on two different problems.
How would you solve this?
There is some missing information here, so I'm not sure I know what your problem is. I'll try to take some educated guesses and cover a number of possibilities. Feel free to ask back if I failed to address your problem adequately.
First, you can definitely just create new branches if you wish to do that. You'll need to use hg push -f or hg push --new-branch (if you used a named rather than an anonymous branch) to make the remote server accept them (that's to prevent you from pushing new branches by accident). You will, of course, need write access to the repository (or fork the repository and work on that fork).
Second, if you just want to push the current revision/branch (and any associated revisions of a feature branch) and not sync the entire repository, then hg push -r . will do that (here, . denotes the current revision, you can also specify others). If you use that frequently, you may want to create an alias, e.g. submit = push -r ..
Third, if you actually need separate workspaces, it's probably more convenient to use hg share instead of cloning the repository. (Note that you'll first have to enable the share "extension" for that; the word "extension" is in quotes because it's actually a part of core Mercurial and not really an extension in the traditional sense.) hg share creates a separate directory that is still linked to the same repository, but has a separate checkout with separate files.
It is also important to understand that branches in Git and Mercurial are completely different things. Branches in Git, aside from naming specific commits, exist to keep revisions alive (so they don't get garbage-collected). In Mercurial, nothing ever gets garbage-collected, so you don't need Git-like branches for that purpose; instead, Mercurial has anonymous branches (which are sort of like detached HEADs in Git -- sort of) and named branches (which are used to label sets of revisions with a permanent name). Bookmarks can be used to put temporary labels on either type of branch (or specific revisions). Bookmarks + anonymous branches can be made to feel fairly Git-like if you desire that.
So, if you want a Git-like approach, you'd just create anonymous branches (and optionally put a bookmark on them for ease of reference, though having them as branch heads in two directories created with hg share can be the better choice if you wish to work on both concurrently without having to switch. You'd then use hg push -r . to push that specific branch to the remote repository (you may also need -f if you're creating a new head).
However, if it is not your repository, you may want to check with the owner what structure they prefer; for example, plenty of Mercurial users prefer named branches in order to be able to tell later which revisions belong together (sort of like a tag that spans multiple revisions).

Mercurial Remove History

Is there a way in mercurial to remove old changesets from a database? I have a repository that is 60GB and that makes it pretty painful to do a clone. I would like to trim off everything before a certain date and put the huge database away to collect dust.
There is no simple / recommended way of doing this directly to an existing repository.
You can however "convert" your mercurial repo to a new mercurial repo and choose a revision from where to include the history onwards via the convert.hg.startrev option
hg convert --config convert.hg.startrev=1234 <source-repository> <new-repository-name>
The new repo will contain everything from the original repo minus the history previous to the starting revision.
Caveat: The new repo will have completely new changeset IDs, i.e. it is in no way related to the original repo. After creating the new repo every developer has to clone the new repo and delete their clones from the original repo.
I used this to cleanup old repos used internally within our company - combined with the --filemap option to remove unwanted files too.
You can do it, but in doing so you invalidate all the clones out there, so it's generally not wise to do unless you're working entirely alone.
Every changeset in mercurial is uniquely identified by a hashcode, which is a combination of (among other things) the source code changes, metadata, and the hashes of its one or two parents. Those parents need to exist in the repo all the way back to the start of the project. (Not having that restriction would be having shallow-clones, which aren't available (yet)).
If you're okay with changing the hashes of the newer changesets (which again breaks all the clones out there in the wild) you can do so with the commands;
hg export -o 'changeset-%r.patch' 400:tip # changesets 400 through the end for example
cd /elsewhere
hg init newrepo
cd newrepo
hg import /path/to/the/patches/*.patch
You'll probably have to do a little work to handle merge changesets, but that's the general idea.
One could also do it using hg convert with type hg as both the source and the destination types, and using a splicemap, but that's probably more involved yet.
The larger question is, how do you type up 60GB of source code, or were you adding generated files against all advice. :)