On converting from cvs to mercurial - mercurial

I've got a gigantic CVS repository that I'd like to convert to a few hg repos. This question is similar, but I'm using hg convert and I've got more than one directory in cvs that should go in a given hg repo. Here's an example:
/sw
.../dir1
.../dir2
.../dir3
.../dir4
.../dir5
I want dir1, dir2, and dir4 in my hg repo, but dir3 can rot with dir5 in a separate one that nobody will ever use.
I've been converting the whole thing with hg convert --branchsort sw where sw is a sandbox checkout containing only the directories I care about. That nets me a 1.7GB hg repo with all 4 directories. The CVS repo is 2.3GB, but a sandbox is only 159MB. The hg repo has history going back to 1997, which is awesome, but some of the stuff in there is from products that have been discontinued. They don't need to be in a regular developer sandbox.
So, is there a way to cherry pick CVS directories to go into my new hg repository?

You should follow my advice on that question to which you linked. Convert everything together and then use hg -> hg converts to pick the parts you want in each repo using a filemap. Any revisions that would be empty (have only files that you're excluding in that conversation) will be omitted.
Also hg to hg converts are much faster, so you can do the slow CVS convert only once and then do the hg to hg parts again and again until you're happy with the results.

Do you have access to the CVS repository storage?
If yes, then you should be able make a complete copy of the repository, remove any files/directories that you don't want to convert, and then run hg convert with a working directory checked out from this local repository.
I say should because I haven't tried it, but since CVS stores revision history on a per-file basis, there's no reason I can think of that it shouldn't. It would not work with a SCM like Subversion, which stores multiple changes in a single revision.

Related

Accidentally committed a large amount of raw data in Mercurial, how do I keep it from overloading my bitbucket repository?

I copied a large amount of data from my labs file server to my laptop, then accidentally committed it to the Mercurial repository I'm using to back up my thesis.
Now I have 200+ MB of data I don't need taking up space on my hard disk, even after I've deleted the files. This isn't really a problem, but the bitbucket repository I sync to only gives me 1 GB of space, which I will need for my data.
Stupidly I then deleted the files and committed again, so I can't just use rollback as described in https://stackoverflow.com/a/3290523/961959 without creating a new repository and having to reset up everything with bitbucket.
Is there a way I can fix this, so I don't waste a quarter of my repository space? Can I rollback the last 2 commits I did?
You can:
Use hg strip to remove the changeset where you've added the files and all of its
descendants (note that this will obliterate them completely, so only do it if these files are the only thing committed during this time).
Import those changesets to MQ and edit them.
Use hg convert from Mercurial to Mercurial with --filemap option.
Clone your repository up to faulty changeset and push from this one instead, as described in FAQ.
I'm going to describe what I would do if I wanted to roll back two most recent commits.
I assume that you haven't pushed to Bitbucked yet. I also assume you have these changesets on top of your DAG:
o C: Deleted files FILE1 and FILE2
|
o B: Some nice changes; also added large files FILE1 and FILE2
|
o A: Some good changeset
Here, C should be removed altogether, B should be edited (FILE1 and FILE2 additions should be rolled back), and A and below should be left as they are.
Caveat: this only works if B and C were not pushed onto Bitbucked (or any other public repo) yet.
You'll need to enable the MQ extension to do this. To do this, add these lines to the .hg/hgrc file in your repo:
[extensions]
mq=
Steps
First, I strip C:
$ hg strip C
Now, C is obliterated, and the parent is B. I import B as a Mercurial Queues patch to edit it:
$ hg qimport -r B -n B.patch
Now I have one patch on top our queue, which was created from B. I remove the big files, and refresh the patch:
$ hg forget FILE1 FILE2
$ hg qrefresh
Now B.patch no longer includes the big files. However, they are still on the disk, albeit not controlled bu Mercurial. I now finish my work with MQ:
$ hg qfinish -a
Here's what I have at the moment:
o B1: Some nice changes, no big files here
|
o A: Some good changeset
In case the changes are already pushed to BitBucket, it does offer an option to strip changesets from the server. After performing the strip locally, you should access the url:
https://bitbucket.org/<user>/<repo>/admin/strip
It'll offer an option to strip all changes following a specific commit.
NOTE: There used to be a link to reach that URL in the repo config menu. Now the only way to access it seems to be typing it directly.

It it possible to update specific directory to old revision in Mercurial?

I tried to update some files to old revision in many ways, but I haven't found yet.
(not permanently, just temporarily updating for testing)
For example, the following is OK in SVN.
svn up -r 100 foo.cpp
U foo.cpp
But in Mercurial, 'up' command doesn't permit file name argument.
Only is it possible to update entire source tree in Mercurial?
You'd have to use hg revert:
hg revert -r 100 foo.cpp
Note that this gives you local changes, as can be seen by running hg diff.
See hg help revert for more info.
This is fundamentally disallowed by Mercurial and other DVCSs. Both CVS and Subversion track which revision you have checked out on a per-file basis. You could have r1 of file x and r2 of file y. In a DVCs the entire repository is at a single version, which in Mercurial you can see with hg id.
As #Tom points out you can have modified files from different revisions, but if you want to see another revision without changes showing up you need to do the update in another clone (which given that local clones use hard links to be (a) instant and (b) space efficient) that's not much of a hassle.

cloning a part of the repository

Is it possible to clone a part(a single folder or even a single file) of a repository?
Basically it's not possible, there is nothing like Subversion's svn checkout http://example.com/project/dir1.
But you can get a partial clone by rewriting the changeset history with hg convert. On the upside, it will be a partial clone. On the downside, the resulting repository will be not related any more to the initial one. The changeset IDs will be different and it will be very hard to continue interacting with the source repo.
An example of creating a partial clone. Suppose you want to clone only the doc directory from the repo:
$ hg clone http://example.com/project local-project-repo
$ cat > filemap.txt << END
include doc
exclude .
END
$ hg convert --filemap filemap.txt local-project-repo docs-only-repo
Nope. That's called partial cloning (some file paths but not all) or shallow cloning (some revisions but not all), and not provided because the point of a DVCS is that everyone has a full copy of the full repository.
Some online repositories will let you download .tar.gz files of all the files in a specific revision or a specific file from a specific revision, but that's not done using the Mercurial tool.

Transplanting mercurial changeset onto hgsubversion converted repository

I have been migrating from subversion to mercurial piecemeal and this has created a bit of a tangle. I had an old SVN server (pre 1.4) to start with so here is what happened.
HgSubversion did not want to pull in the full history from trunk, so I did a shallow convert.
My colleagues did their last commits to SVN and I pulled those into Hg.
We moved over to Hg and started pushing to it
Just to be safe one of my colleagues made some more commits to SVN.
I managed upgrade the SVN server and get the full repo to the new SVN.
HgSubversion pulled in the full history successfully - including the few extra/duplicate commits.
Now I would like to "transplant" the commits in the shallow mercurial repository into the full history, the repositories are related in content but unrelated in mercurial hashes.Short of just copy pasting content what would be the best way to migrate the changes ? Eventually everybody should be able:
To switch the Hg repo with full history and keep working
Have automated push from the Hg repos of suitable squashed/rebased changesets to SVN as a service user.
I would like a concrete example with the following scenario.
Last mercurial hash at step 2 - A
Current mercurial hash after pushes step 3 - B
Hash of last commit after pull from upgraded SVN - C
Hash of step 2 commit after full SVN history pull - D
I am not very good at ASCII art, feel free to add one for bonus points.
We did a CVS -> Mercurial migration last week at work. Like in your case, some people continued to use the CVS for a time.
In order to sync the the two repositories when the CVS server was finally shutdown, I did the following :
Convert the CVS to mercurial via cvs2hg
Export each new revision with hg export
Import the patch files with hg patch
There was only a dozen revisions, so this was no big deal... If you have many more revisions, you could maybe take a look at the transplant extension.
Hope this helps.

Using mercurial to merge a cvs branch

Currently my company is using cvs for version control. We have an old branch of code which has been used specifically for one client (don't ask) that we'd like to merge to the head.
Due to the delta between this branch and the head I think the merging capabilities of mercurial should make my job a bit easier. My line of reasoning is:
Create mercurial repositories of the branch and the current head.
Do a merge of the branch repo to the trunk repo.
At this stage I'm expecting mercurial to provide better merge support than cvs.
Then I would commit my changes to the trunk repository back into cvs.
Is this approach sound? Will this strategy result in a less painful merge as I think it will, or is there something I'm missing?
The reason Mercurial merges better than CVS or Subversion is because it tracks the most recent common ancestor of the two heads/branches, so you'll to make sure you're providing that information accurately or you'll probably end up with a worse merge.
If you do something like this:
hg init newrepo
cd newrepo
cvs checkout POINT_OF_DIVERGENCE
hg commit --addremove -m 'commiting point of divergence'
cvs checkout BRANCH
hg addremove --similarity 90 # let Mercurial discover any renames
hg commit -m 'committing branch head'
hg update -r 0 # jump back to point of divergence
cvs checkout HEAD
hg addremove --similarity 90 # let Mercurial discover any renames
hg commit -m 'commiting HEAD head'
hg merge
then the common ancestor for the two heads will be the point of divergence (sorry if the CVS commands are wrong, it's been a mercifully long time).
The trouble is knowing where the POINT_OF_DIVERGENCE is in the CVS repo -- CVS doesn't keep track of that at all, which is why its own merge doesn't use it.
CVS best practices always suggest that whenever you branch you create a tag that marks the point of divergence, but if you didn't do that then there's some tricky hunting ahead of you.
Without a correct most recent common ancestor the Mercurial merge won't be any better than CVS's.
Yes, that should work but there might be tiny little problems along the way (like how to get the source from CVS into Mercurial: Using hg convert or doing a manual import of a certain revision, etc).
Please create a new question if you run into a problem.
The built-in convert extension won't do well with CVS except for very simple histories. Use the fork of cvs2svn called cvs2hg
http://hg.gerg.ca/cvs2svn