Is there any way to clone a repository from the web incrementally? - mercurial

I'm on dialup in lousy place (yes, it still happens in 2011), and trying to clone a huge repository. It starts without problem, but every time the dialup disconnects (which is unavoidable, it seems), the !#%$* hg rolls everything back and I'm left again with an empty directory.
Is there a solution other than doing it on a remote PC and then downloading the whole thing by FTP or something?

In a bash-like shell you could do something like this:
$ hg init myclone
$ cd myclone
$ for REV in `seq 10 10 100` ; do hg pull -r $REV <REMOTEREPO>; done
Starting at 10, each pull downloads the next 10 revisions, up to 100. In case of a lost connection, adjust the first argument to seq to match what you've already pulled.

Depending on how flaky your connection is, there are two options for performing initial clones.
First, you can try so-called “streaming clones”. These minimize Time To First Byte, but do generally require a bit more data to be transferred.
Here’s how to do a streaming clone:
$ hg clone --uncompressed https://~~~~
Your second option will be a hg clone –-rev operation, followed by a number of incremental pulls. This behaves similarly to cloning a repository in some distant past and doing occasional updates.
$ hg clone --rev 5 https://~~~~

Based on the suggestions here,
I created a repo that did this.
https://github.com/nootanghimire/hg-clone-bash
It's optimized for a single repo, but i guess you can fork and work on it! :)

Related

Mercurial view number of commits ahead of upstream

This is an extension of the following question:
Mercurial show number of commits ahead of "origin"
I want to find out the number of commits yet to be pushed to the remote repository without contacting the remote (So that I can add it to my prompt).
In git I can do that with:
git rev-list branchname#{upstream}..HEAD | wc -l //I am counting the number of lines to get the number of commits by which i am ahead.
The original answer advices to use:
hg summary --remote
But this contacts the remote repository and takes quite sometime, so putting it in prompt seems a bad idea.
Does anyone know if mercurial allows to do this, since the original question is quite outdated, i thought some new method or extension might have come up.
hg outgoing : log of everything that has yet to be pushed (but it does contact the remote repository)
hg log -r "draft()" : log all of the commits that are in draft phase in your repository (without contacting remote). This doesn't definitively mean that they are not in the remote repo, but it's very close.
You can use --template templates to customize the output.
Hope this helps.

How do I diff incoming changesets with Beyond Compare 4 and hg?

I have been using the mercurial and Beyond Compare 4 tools together for about 2 weeks now and feel fairly confident in my usage, however I still seem to have a problem when comparing incoming changesets against my current local codebase. The problem is emphasized when I attempting a complicated merge.
Just to clarify, I am avoiding the use of tools such as TortoiseHg,
although I do have it installed. I am searching for feedback via cmd line operations only.
My current templated method to pull down the incoming changesets via the following ( as an [alias] )
hg in --verbose -T "\nchangeset: \t{rev}\nbranch: \t{branch}\nuser: \t\t{author}\ndate: \t\t{date(date,'%m-%d-%Y %I:%M%p')}\ndescription: \n\t{desc|fill76|tabindent}\n\n{files % ' \t{file}\n'}\n----------\n"
As an example, here is a simplified (and cleverly abstracted) block returned ::
changeset: 4685
branch: Feature-WI209825
user: Jack Handy <jhandy#anon.com>
date: 01-19-2015 10:19AM
description:
Display monkey swinging from vines while whistling dixie
Zoo/MonkeyCage/Resources/Localization.Designer.cs
Zoo/MonkeyCage/Resources/Localization.resx
Zoo/MonkeyCage/Utility/Extensions.cs
If I were to be comparing changes locally, I would simply use the following command ::
hg bcomp -r 4685 -r default <optional file name>
and then I would get an instance of Beyond Compare with a folder structure and files and I could just navigate accordingly to view the changes...however, when I attempt to do this with a changeset that has yet to be pulled into my local repository, I can't.
How do I diff incoming changesets with my local repository?
---- UPDATE --------------------------------
I pursued the idea of bundling the incoming changes and then trying to use BC4 to diff the bundle to any given branch/revision on my local repo.
hg in --bundle "C:\Sandboxes\Temp\temp.hg"
This creates a compressed file archive containing all the new changes.
Now I simply need to diff this bundle with my local, however am having difficulty optimizing this. Currently, I am using variations on the following command:
hg -R "C:\Sandboxes\Temp\temp.hg" bcomp -r default
Alas, I am still having difficulty perfecting this...any insight is appreciated.
I don't see how you can, since your local repository doesn't yet have that changeset, so mercurial can't create a local copy of the revision, as it doesn't have visibility of what the change actually is.
The -p flag to hg incoming will show you the patch for each revision, but that isn't what you want.
Why not just pull the remote changes anyway? It wont hurt unless you actually update. You can then do your diff in the normal way.
hg diff is a local operation.
But you can simply call hg incoming -p in order to obtain a diff view of what you're going to pull. See hg help incoming for more options and refinement (e.g. if you need to diff against a specific rev etc)

In Mercurial, is there a way to save/commit an unfinished change but not available for others via clone/pull/push?

I often find myself in the situation that I need to switch to work on a different change before the one that I am currently working on is done. I want to find out if there's a way, in Mercurial, that I can save/commit my unfinished change, which is not available for others (ie. not clonable/pushable/pullable).
Mercurial phases may be the answer to this.
Starting with Mercurial v2.1, you can configure mq changesets to automatically be marked secret. secret changesets are ignored by incoming/pull and outgoing/push commands.
To enable this behavior, you need to add the following to your config:
[mq]
secret = True
Once enabled, it behaves as follows:
$ hg qpush --all
applying my-patch
now at: my-patch
$ hg phase -r .
16873: secret
$ hg outgoing
comparing with https://www.mercurial-scm.org/repo/hg
searching for changes
no changes found (ignored 1 secret changesets)
Take a look at the Shelve Extension. This gives you the basics and might be more than enough for what you need.
There is also the Mercurial Queues Extension, but I find this can be a little odd to work with.
As a final alternative, you could always commit your changes onto another branch so that they don't affect mainline development, but I think these may still be visible.
You can clone your repo to a new place to work on new changes. That way your pending changes are kept on your local machine and never pushed. Of course, this depends on the size of your repo. If it's too big, cloning becomes a little prohibitive.
As others have suggested, you can mark your unavailable changes to be on a private branch. When you push, you can push an explicit branch using the -b argument. So, if your private branch is TimPrivateBranch, and other changes are on default:
hg push -b default
TimPrivateBranch stays on your local computer. Of course, this requires you to remember the -b argument every time you push.
When you're done with your private branch, just merge back into default:
hg up default
hg merge TimPrivateBranch

Is it possible to clone with hgsubversion in steps?

I'm trying to clone a rather large subversion repository with hgsubversion.
hg clone --startrev 8890 svn+https://my.reposit.ory/trunk trunk_hg
After about an hour, the clone operation aborts with an out of memory message:
[r20097] user: description
abort: out of memory
Is it possible to specify an end revision for the clone operation and get the remaining revisions with a pull? Or somehow break up the clone in smaller steps?
You can specify a stop revision with -r for clone, as others have suggested. Another option (if you kept the clone where things crashed) would be to just run hg pull in the trunk_hg copy. You might have to edit/create .hg/hgrc yourself to add the [paths]\n default = svn+https://my.reposit.ory/trunk, since I think we add that at the end of the cloning process. Maybe run hg svn rebuildmeta before your pull just for good measure in case the tracking metadata for hgsubversion got hosed when the OOM happened.
I hope this helps!
http://www.selenic.com/mercurial/hg.1.html#clone
Could try using the -r <revid> flag to clone only a particular changeset. Though that may or may not work with hgsvn.
Cloning with a limited range of revisions and then pulling is the recommended method and I can confirm that it works flawlessly for svn repositories in the several GB size range.
Here is a workaround to clone the whole svn repo:
start cloning
abort it immediately (Ctrl+C in windows)
than hg pull
you've got out of memory
repeat step 3 until you check out all commits

What is the best Mercurial clone / repository strategy?

There can be:
1) just clone from remote repo as needed (each new one can take 20 minutes and 500MB)
2) clone 2 local ones from remote repo, both 500MB, total 1GB, so always have 2 local repo to work with
3) clone 1 local one from remote repo, called it 'master', and then don't touch this master, but clone other local ones from this master as needed
I started off using (1), but when there is a quick bug fix, I need to do a clone and it is 20 minutes, so then method (2) is better, because there are 2 independent local repos all the time.
But then sometimes a repo becomes "weird" because there are merges that do damages and when it is fixed on the remote repo, any local repo's merge that shows up in hg outgoing will cause damage later when we push again, so we just remove that local repo and clone from remote again to start "fresh", taking 20 minutes again. (Actually, we can use local repo 2 first, rename local repo 1 as repo_old, and then before sleep or before going home, do a clone again)
Is (3) the best option? Because on a Mac, the master takes 500MB and 20 minutes, but the other local clones are super fast and takes much less than 500MB because it uses hard link on a Mac (how to find out how much disk space without the hard linked content?). And if using (3), how do we do commits and push? Suppose we clone from remote repo to local as "master", and then clone local ones as "clone01", "clone02", 03, etc, then do we work inside of clone01, and then when an urgent fix is needed, we go to master, do an hg pull, and hg update, and go to clone02 and also do hg pull and hg update, and fix it on clone02, test it, and hg commit, hg push to the master, and then go to master, and do an hg push there? And then when clone01's project is done, again go to master, pull, update, go to clone01, pull, update, merge, test, commit, push, go to master, push to remote repo? That's a lot of steps!
Maybe a fourth option might work better in your case: Mercurial Queues that are kept in a local Mercurial repository.
Using MQ you can:
Clone the master repository locally.
Work on your code and keep your changes isolated in patches.
When new updates from upstream are available, remove your batches, apply the updates, and then re-apply your patches on top of the new update.
Once you're happy with your work, fold it into your local repository and push it upstream.
You don't have to keep the patches in a local repository, but it's a nice bonus option that is worth considering.
Chapter 12 from Mercurial: The Definitive guide explains the process in fairly good detail.
I don't know that your understanding of the space considerations are correct. When cloning a local repository Mercurial will use hardlinks for the .hg directory, the actual repository, which takes up no additional space. The working directory takes up space (though hopefully not 500GB!) but the .hg directory only looks like it does depending on the tools you use to check.
If you do a clone -U you create a clone without a working directory and it should take up almost no additional space and be created almost instantly.
I always keep a clone -U of the central repo in an unmodified state and then create clones off of that as needed. I push directly from those clones back to the remote repository.
Mercurial Queues look really powerful, but I've never given myself the time
to read all that documentation, just to be able to put my current work aside to
work an a small bug.
I use the attic extension.
It'll be like this:
...working happily, but then there is a quick bug fix...
$hg shelve work
...quickly fix the bug...
$hg ci
$hg unshelve
...continue with work
Sometimes I get an idea, but no time to really play with it. To prevent me from forgetting it.
...working happily, idea drops in...
$hg shelve work
...start a unittest for the idea or some other unfinished piece of code, enough to sketch the idea
$hg shelve idea
$hg unshelve work
...continue with work
$hg ls
idea
*C work