Cloning a mercurial repository, .hgsub refers to a dead external subrepo

We're trying to clone a Mercurial repository A that references a subrepository B which has moved hosts. We'd like to update .hgsub in A to point to the new location of B, but it's a chicken-and-egg problem if we can't hg clone A in the first place.
Does anyone know how to work around this?

$ hg help subrepos
...
Remapping Subrepositories Sources
---------------------------------
A subrepository source location may change during a project life,
invalidating references stored in the parent repository history. To fix
this, rewriting rules can be defined in parent repository "hgrc" file or
in Mercurial configuration. See the "[subpaths]" section in hgrc(5) for
more details.
$ man hgrc
...
subpaths
Defines subrepositories source locations rewriting rules of the form:
<pattern> = <replacement>
Where pattern is a regular expression matching the source and replacement is the replacement string used to
rewrite it. Groups can be matched in pattern and referenced in replacements. For instance:
http://server/(.*)-hg/ = http://hg.server/\1/
rewrites http://server/foo-hg/ into http://hg.server/foo/.
All patterns are applied in definition order.
...
So you can fix this by adding a rewrite rule to a [subpaths] section in your .hgrc.
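If you want to sanity-check a rule before putting it in your configuration, the rewrite can be approximated with Python's re module. This is only a sketch: it assumes each rule acts as a regular-expression substitution applied in definition order, as the hgrc excerpt above describes. remap_source and the rules list are hypothetical names for illustration, not part of Mercurial.

```python
import re


def remap_source(source, rules):
    """Apply (pattern, replacement) pairs in order, mimicking [subpaths] rules."""
    for pattern, replacement in rules:
        # Assumption: one substitution per rule, first match only.
        source = re.sub(pattern, replacement, source, count=1)
    return source


# The example rule quoted from hgrc(5) above.
rules = [(r"http://server/(.*)-hg/", r"http://hg.server/\1/")]
print(remap_source("http://server/foo-hg/", rules))
```

With the rule from the man page this prints http://hg.server/foo/, matching the documented rewrite. A source that matches no pattern is returned unchanged.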

First note that clone is init + pull + update and that subrepo cloning is part of the update step, not the pull step. This means that you can avoid clone failing simply by skipping the update step:
$ hg clone -U <url>
Now the problem is reduced to "how do I update to a revision with a problematic .hgsub/.hgsubstate file?" There are two possibilities here: remap the subrepos using the [subpaths] feature (see hg help subrepos and hg help config), or do a manual update and repair.
A "manual update" can be done like this:
$ hg revert -a -r default -X problematic-file
[adding a bunch of files]
$ hg debugrebuildstate -r default
Now you can manually fix-up your subrepos and .hgsub and commit. Be sure to test your fix with a clone before pushing it.
Also, see this mailing list thread on the topic: http://markmail.org/thread/ktxd2rsm7avkexzr
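For the "fix up .hgsub" part of the manual repair, here is a rough sketch of repointing one entry. It assumes the simple "path = source" line format used in .hgsub; repoint_subrepo and the example host names are made up for illustration, and you would still review the result and commit it yourself.

```python
def repoint_subrepo(hgsub_text, subrepo_path, new_source):
    """Return .hgsub content with the source of one subrepo replaced."""
    lines = []
    for line in hgsub_text.splitlines():
        path, sep, _old_source = line.partition("=")
        # Only touch the "path = source" line for the requested subrepo.
        if sep and path.strip() == subrepo_path:
            line = f"{subrepo_path} = {new_source}"
        lines.append(line)
    return "".join(line + "\n" for line in lines)


# Hypothetical content and hosts, for illustration only.
old = "libs/B = http://old.example/B\nother = http://old.example/other\n"
print(repoint_subrepo(old, "libs/B", "http://new.example/B"), end="")
```

Entries for other subrepos are left exactly as they were, so the diff you commit only touches the dead reference.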

It could be easier to tamper with name resolution as a quick workaround (e.g. point the old host name at the new server in the hosts file on Windows), clone, and then fix .hgsub.

Related

Add a parent to the original changeset in Mercurial

I have a project with 24 months of source control history in a Mercurial repository.
I've recently found some old tarballs of the project that predate source control, and I think they would be useful to import into the repository as "pre-historic" changesets.
Can I somehow add a parent to my initial commit?
Alternatively, is it possible to re-play my entire repository history on top of the tarballs, preserving all metadata (timestamps etc)?
Is it possible to have the new parent commits use the timestamps of these old tarballs?

You can use the convert extension to build a new repository where the tarballs are imported as revisions before your current root revision.
First, you import the tarballs based on the null revision:
$ hg update null
$ tar -xvzf backup-2010.tar.gz
$ hg addremove
$ hg commit -m 'Version from 2010'
$ rm -r *
$ tar -xvzf backup-2011.tar.gz
$ hg addremove
$ hg commit -m 'Version from 2011'
I'm using addremove above to give Mercurial a chance to detect renames between each tarball (look at the --similarity flag to fine-tune this and use hg rename --after by hand to help Mercurial further). Also, I remove all the files in the working copy before importing a new tarball: that way the next commit will contain exactly the snapshot present in the tarball you unpack.
After you've imported all the tarballs like above, you have a parallel history in your repository:
[c1] --- [c2] --- [c3] ... [cN]
[t1] --- [t2] --- [tM]
Your old commits are c1 to cN and the commits from the tarballs are t1 to tM. At the moment they share no history — it's as if you used hg pull -f to pull an unrelated repository into the current one.
The convert extension can now be used to do a Mercurial to Mercurial conversion where you rewrite the parent revision of c1 to be tM. Use the --splicemap flag for this. It needs a file with
<full changeset hash for c1> <full changeset hash for tM>
Use hg log --template '{node} ' -r c1 -r tM > splicemap to generate such a file. Then run
$ hg convert --splicemap splicemap . spliced
to generate a new repository spliced with the combined history. The repository is new, so you need to get everybody to re-clone it.
This technique is similar to using hg rebase as suggested by Kindread. The difference is that convert won't try to merge anything: it simply rewrites the parent pointer in c1 to be tM. Since there is no merging involved, this cannot fail with weird merge conflicts.
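If you would rather build the splicemap file from a script, the entry described above is just the child hash followed by its new parent. A minimal sketch, assuming full 40-character hex hashes; splicemap_line is a hypothetical helper and the hashes in the usage line are placeholders, not real changesets.

```python
import re

FULL_HASH = re.compile(r"[0-9a-f]{40}")


def splicemap_line(child, parent):
    """Format one splicemap entry: '<child hash> <new parent hash>'."""
    for node in (child, parent):
        if not FULL_HASH.fullmatch(node):
            raise ValueError(f"not a full 40-character changeset hash: {node!r}")
    return f"{child} {parent}\n"


# Placeholder hashes standing in for c1 and tM.
print(splicemap_line("a" * 40, "b" * 40), end="")
```

Rejecting short hashes up front avoids writing a file that convert cannot match against the changesets it is rewriting.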

You should look at using rebase. It lets you move your existing changesets on top of the newly imported ones, so that they come after them in history (you have to rebase from your 1st changeset).
https://www.mercurial-scm.org/wiki/RebaseExtension
However, note that if other clones of this repo exist (such as those of fellow developers, or on a repo server), you will have issues with them pulling the revised repo. You will probably have to coordinate with the owners of those clones to get all work into a single clone, rebase that clone, and then have everyone re-clone from the revised clone. You will also have to change the phase of the changesets.
https://www.mercurial-scm.org/wiki/Phases
Honestly though, I would just add them to your 'modern-day' repo. I don't think making them pre-historic would give you any notable advantage over adding them to the top.

mercurial: any command or python api to get repository name

Is there any Mercurial command or Python API that could yield the repo name? This will help developing cross-repo scripts.
The only related solution that I found is to parse the .hg/hgrc [paths] section 'default' config option.
[paths]
default = ssh://server//path/tools
There must be a more elegant solution, I think.

There is no real concept of a "repository name" in Mercurial (a repository doesn't "know" or care about its own name). I think you mean "last path component of the default pull path"?
If so, then parsing the output of hg paths default would be the most direct way to get that information.
However, you should note that the default path can be (and often is) changed: think of making local clones of a clone for testing:
$ hg clone http://server/lib-foo
$ hg clone lib-foo lib-foo-test
$ hg clone lib-foo-test lib-foo-more-testing
The lib-foo-more-testing clone has a default push path back to lib-foo-test.
This means that parsing hg paths default won't be much more reliable than using basename $(hg root): both can be completely different from the (base)name of the repository that was originally cloned.
If what you really want is to get an "identity" for a repository, then you should instead use
$ hg log -r 0 --template "{node}"
The first changeset hash in a repository will normally uniquely identify the repository and it will be stable even when clones change names. (If a repository has two or more roots, then the zeroth changeset can in principle differ between clones. People will have to actively try to make it differ, though.)
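If the "last path component" really is what you are after, here is a small sketch of extracting it from whatever hg paths default prints. It assumes URL-style or POSIX-style paths (Windows drive paths are not handled); last_path_component is a hypothetical helper, not a Mercurial API.

```python
import posixpath
from urllib.parse import urlsplit


def last_path_component(location):
    """Return the final path segment of a URL or POSIX path, ignoring trailing slashes."""
    path = urlsplit(location).path
    return posixpath.basename(path.rstrip("/"))


print(last_path_component("ssh://server//path/tools"))
print(last_path_component("http://server/lib-foo"))
```

For the two locations from this thread it prints tools and lib-foo. As noted above, that name only reflects where the clone was pulled from, so it is a label rather than an identity.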

If you want the last path segment of the remote default alias, processing the output of hg paths default is the better choice.
If you want the local directory name of your Mercurial repository, I don't have a good solution, except checking the code of the Notify extension (in which, after some tricks, you can get the project name).

Get tip changeset of remote Mercurial repository

My .hg/hgrc file has the line:
default = http://some/remote/repository
Is there a quick command to print the tip revision of that repository (which may or may not be inside my local repository)?

You can use the identify command like this:
$ hg identify $(hg paths default)
This is one of the few commands that can operate on a remote repository. If you need more information about the remote repository, then I suggest you take a look at hg incoming.

The following returns the ID of the latest changeset (tip) of a remote repository:
hg identify --id http://www.myrepo.com

hg id default
This is a shorter form of "hg identify $(hg paths default)".
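For scripts, the same command can be wrapped with subprocess. This is a sketch only: it assumes hg is on your PATH and that the command is run inside a repository that defines the path alias; identify_argv and remote_tip_id are hypothetical helper names.

```python
import subprocess


def identify_argv(source="default"):
    """Build the command line for asking a repository for its current ID."""
    return ["hg", "identify", "--id", source]


def remote_tip_id(source="default", cwd=None):
    """Run hg identify and return its output with surrounding whitespace removed."""
    result = subprocess.run(
        identify_argv(source),
        cwd=cwd,              # directory of the local repository, if not the current one
        check=True,           # raise CalledProcessError if hg exits with an error
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()
```

Passing a full URL instead of the alias, as in the earlier answer, works the same way and does not require a local repository.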

Mercurial subrepositories

I have a question regarding subrepositories. Our project is set up like this:
+ projectA
  + some files
  + dependencyA
    + some files
dependencyA is a subrepository. It was created this way:
cd projectA
mkdir dependencyA
cd dependencyA
hg init
hg pull ssh://hg#somerandomiphere/dependencyA
cd ..
echo dependencyA = ssh://hg#somerandomiphere/dependencyA > .hgsub
hg add
hg commit
hg push
If I make changes in the subrepository, then commit and push them from the main project, both repositories are pushed to the server, since push is recursive. Now my colleague wants to pull the changes from the server. But since nothing was changed in the main project, that won't work. If I change something in the main project and push it to the server, then on hg pull he gets the newest changeset, and a subsequent hg update updates the subrepository as well. This is expected behaviour.
Now my question is whether there is a way to pull changes for the subrepository only, without making a new clone of it, or what the best way to do this would be.

Subrepository in Mercurial wiki, p. 2.5 "Pull"
The 'pull' command is by default not recursive. This is because
Mercurial won't know which subrepos are required until an update to a
specific changeset is requested. The update will pull the requested
subrepositories and changesets on demand. To get pull and update in
one step, use 'pull --update'.
Note that this matches exactly how 'pull' works without
subrepositories, considering that subrepositories lives in the working
directory:
'hg pull' gives you the upstream changesets but doesn't affect your working directory.
'hg update' updates the contents of your working directory (both in the top repo and in all subrepos)
It might be a good idea to always pull with --update if you have any
subrepositories. That will generally ensure that updates not will miss
any changesets and that update thus not will cause any pulls. If the
pull with update fails due to crossing branches then 'hg update' must
be used to get all the subrepository updates.

What was suggested above works like I thought it would. The real problem was my way of creating a subrepository.
Instead of:
cd projectA
mkdir dependencyA
cd dependencyA
hg init
hg pull ssh://hg#somerandomiphere/dependencyA
It should have been a simple:
hg clone ssh://hg#somerandomiphere/dependencyA dependencyA
As we know, .hgsubstate locks the subrepo to a specific revision after a commit. This is what happened, but (!) doing hg pull in the subrepository ended with an error:
paths cannot contain dot file components
So my subrepo was locked to the revision it was at when I committed, and it could not pull changes from its own repository because of the error shown above. Why this happened is explained pretty well in this accepted answer.
Solution:
cloning is the way to go

Mercurial `hg clone` but ignoring all subrepos?

Is there a way to clone a repo that comes with subrepos, but without having Mercurial pull all the subrepos?
It appears that while hg clone -U can be used to obtain an empty clone of a repo, there's nothing that would convince hg update to avoid starting off by pulling all of the subrepos.
I should point out that it is crucial to retain the ability to easily sync to the head revision after creating such a clone.

This should do what you want:
REM Take a new clone, but do not update working directory
hg clone --noupdate %REPO_PATH% %DESTINATION%
REM Update working directory but exclude the certain subprojects
hg revert --all --rev %BRANCH% --exclude %SUBREPO_PATH_1% --exclude %SUBREPO_PATH_2%

This answer may add more than the question required, but it provides some valuable notes on working with Mercurial when you can't update due to a bad subrepository path or revision.
Step 1: Clone the repository without any updates
hg clone --noupdate source_repository destination_repository
Step 2: Use revert to get the right files
hg revert --all --rev revision_number --exclude subrepo_1 --exclude subrepo_2 ...
At this point, you have a new changeset; you may need to make sure the parent revision is correct. When I did this, my new changeset's parent was changeset 0. To fix this I had to set the parent changeset AND switch branches (since my changeset was on a different branch).
Step 3: Change the parent of the current changes
hg debugsetparents revision_number
hg branch branch_name
That should do it.
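The three steps can be collected into one script. A rough sketch under stated assumptions: partial_checkout_commands and run_all are hypothetical helpers, the subrepo paths are whatever your .hgsub lists, and only the hg commands already shown in the steps above are used.

```python
import subprocess


def partial_checkout_commands(source, dest, rev, subrepos, branch=None):
    """Return the hg command lines for a clone that skips the given subrepos."""
    revert = ["hg", "revert", "--all", "--rev", rev]
    for path in subrepos:
        revert += ["--exclude", path]
    commands = [
        ["hg", "clone", "--noupdate", source, dest],  # step 1: clone, no working copy
        revert,                                       # step 2: fetch files, minus subrepos
        ["hg", "debugsetparents", rev],               # step 3: fix the working copy parent
    ]
    if branch is not None:
        commands.append(["hg", "branch", branch])
    return commands


def run_all(commands, dest):
    """Run the clone from the current directory and the remaining commands inside dest."""
    for index, argv in enumerate(commands):
        subprocess.run(argv, cwd=None if index == 0 else dest, check=True)
```

Building the command lists separately from running them makes it easy to print and review them first, which seems prudent given that debugsetparents is a debug command.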

Found a hacky way. It still requires all subrepos to be checked out once, but afterwards they can be deleted.
Clone the whole lot, including subrepos. No way around this.
Delete subrepos
hg remove .hgsub
I tried to convince Mercurial to hg remove .hgsub before the subrepos are cloned, but the best I got is not removing .hgsub: file is untracked.

If you have a subrepo, a working directory must include some version of that subrepo. That version may be a fixed older revision if specified, or the tip if not.
You cannot update your repo without getting the subrepos; if you had a complete working dir without them, you shouldn't be using subrepos - use truly external repos instead.
If your subrepos are pegged against a certain remote version, then updates after the first will not trigger a subrepo update - they're already up-to-date. But for the initial creation of the working directory, you will have to do a remote pull.
You can trick Mercurial by munging the .hgsubstate file. But really, your model and the conceptual model differ, so you're probably not a good match for subrepos if this is a concern.
edit: If you find yourself cloning and then updating to the tip many times, try using local branches or mq instead. That way you only have to do the initial clone once.
