How long should it take to clone the cpython hg repository?

Last night, I tried to clone the cpython hg repo, but after ~30 minutes of waiting I cancelled, because it didn't seem to be working. Based on process time, it seemed to be doing hardly anything. Was I simply too impatient, or should hg clone be pretty fast?
I'd just downloaded the latest hg:
$ hg --version
Mercurial Distributed SCM (version 2.5.4+20130405)
I ran this on Mac OS X 10.8.3.
I was using a good Internet connection: Comcast Business Class over WiFi, with the wireless router under my desk.

Looks like it should take < 5 min.
$ time hg clone http://hg.python.org/cpython python-repo-2
requesting all changes
adding changesets
adding manifests
adding file changes
added 83508 changesets with 184511 changes to 9865 files (+1 heads)
updating to branch default
3677 files updated, 0 files merged, 0 files removed, 0 files unresolved
real 3m11.586s
user 1m44.192s
sys 0m6.959s
I'm pretty sure I waited longer than that last night. Maybe the repo was experiencing high traffic last night, but everything is OK today? I am using a different Internet connection today, so it could be that.
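For next time: to check whether a long-running clone is actually making progress, you can turn on Mercurial's progress bars. In 2.5.x they ship as a bundled extension (later releases enable them by default), so no extra install is needed; the target directory name here is just an example:
$ hg clone --config extensions.progress= http://hg.python.org/cpython python-repo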
Hopefully, someone finds this one data point to be useful.


mercurial: No usable temporary file name found

I have a repo on a network drive (served by Windows server), with local repos pushing/pulling to it on the various machines I work on.
I just dealt with this problem and solved it by cloning the repo from the network drive to a local disk, pushing, then cloning it back again. The machine from which I did this had no problem pushing further changes after this.
Now I just tried pushing from my laptop, and this happens:
% hg --debug push "Z:\[main repo]"
pushing to Z:\[main repo]
query 1; heads
searching for changes
all remote heads known locally
listing keys for "bookmarks"
2 changesets found
list of changesets:
2ed25c8975482734e3b9eed828573fd711d26fd8
19a424c011ffd0c887cf1d54ed0b537a6c1af714
adding changesets
add changeset 2ed25c897548
add changeset 19a424c011ff
adding manifests
adding file changes
adding GEM.py revisions
transaction abort!
rollback completed
abort: No usable temporary file name found
[command returned code 255 Thu Mar 09 18:51:11 2017]
The only info pertaining to this error message I have found so far is this, and I definitely have no files named con.* in my project. There are several named con*.py, but they have never been a problem; both the laptop and my workstation are running Windows 7, and I've been working on this project for a few years now.
I have happily pushed from this laptop for over a year, and it was never a problem. I don't really have any good idea where to even start looking. Could it be connected to the fact that my workstation had the main repo opened at the same time? It was definitely not doing anything to it at the time.
Update:
I ran hg verify, and this is what it returns -- no problem as far as I can tell:
% hg --debug verify
repository uses revlog format 1
checking changesets
checking manifests
crosschecking files in changesets and manifests
checking files
73 files, 74 changesets, 226 total revisions
[command completed successfully Fri Mar 10 08:58:02 2017]
I had faced the same error as well.
I just ran TortoiseHg as administrator, and that fixed it for me.
I don't have an answer yet, but I would try the following:
Update to the latest Mercurial version (4.1) and try again
Verify the repo integrity with hg verify
Although I understand it has always worked as is, try renaming all the con*.py files. The thing with CON is that it is a reserved device name; I think it comes from DOS times :-) (see the sketch after this list)
If I understand correctly, you push to Z:[main repo], where Z: is a Windows share. Try pushing to the same repo in another way, e.g. over ssh (requires some setup, yes)
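For the third point, here is a quick way to spot tracked files whose base name is a reserved DOS device name (a sketch, assuming a Unix-style shell such as Git Bash or Cygwin; the regex covers the classic device names):
% hg manifest | grep -Ei '(^|/)(con|prn|aux|nul|com[1-9]|lpt[1-9])(\.[^/]*)?$'
Any line it prints is a tracked file that Windows treats as a device rather than a regular file, regardless of extension.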
Good luck, very bizarre problem :-/

"hg log" command is taking hours instead of seconds on new machine

The Jenkins mercurial plugin runs an hg log command at the beginning to determine which commits are new for that build. Here's an example:
hg log --template "<changeset node='{node}' author='{author|xmlescape}' rev='{rev}' date='{date}'><msg>{desc|xmlescape}</msg><added>{file_adds|stringify|xmlescape}</added><deleted>{file_dels|stringify|xmlescape}</deleted><files>{files|stringify|xmlescape}</files><parents>{parents}</parents></changeset>\n" --rev pcdmis2015:0 --follow --prune 4e2c98f139772300206e87349c4d7b63e1a17d05 --encoding UTF-8 --encodingmode replace
On my old, out of warranty win7 machines, this command takes between 20 and 90 seconds to complete, depending on the machine.
But on my new win10 virtual machines, which have shown to be faster in every other regard so far, this same command in the same repository takes about 4.5 hours.
Why might this be? What could be happening that takes so long?
Is there any way overcome or ameliorate this problem?
It can be different Mercurials (standalone vs. bundled with Python)
It can be different versions or configurations of Python (if used)
It can be a damaged repo (check with hg verify)
hg log --debug --time --profile will show you the main time-eaters (as a last resort)
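For example, a stripped-down version of the Jenkins query above with timing and profiling turned on (just a sketch to isolate where the time goes):
hg log --rev pcdmis2015:0 --follow --template "{rev}\n" --time --profile
If this stripped-down version is quick but the full command is slow, the template is the likely culprit: {file_adds} and {file_dels} have to read and compare manifests for every revision, which is far more file I/O than plain changelog traversal.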
Does your new machine have a virus scanner running? They intercede in all file access, and hg log accesses a lot of files.

mercurial: recover from deleted repository after commit, before push

I have a mercurial repository my_project, hosted at bitbucket. Today I made a number of changes and commited them to my local repository, but didn't push them out yet.
I then majorly stuffed up and fatfingered rm -rf my_project (!!!!!).
Is there some way I can retrieve the changes that I committed today, given that I hadn't pushed them out yet? I know a day's worth of commits doesn't sound like much, but it was!
All the other clones I have of this project are only up-to-date to the most recent push (which didn't include today's changes).
cheers.
Mercurial cannot save you. The data from Mercurial is stored in a hidden directory in the base of your project folder, in your case probably at my_project/.hg. Your recursive delete would have trashed this folder as well.
So maybe a file recovery tool?
No. The changes are only stored in the local repository directory (the .hg directory therein) until you've pushed. They're never put anywhere else (not even /tmp).
There is a possibility that you'll be able to recover the deleted files from the disk, though; search around for instructions and tools for doing that.
I'm afraid the commit was deleted together with the working copy, and file recovery tools are your only option to recover the missing .hg folder. I see you could recover the code from the install -- great!
If you're afraid of this happening again, then you could install a crude hook like
[hooks]
post-commit = R=~/backup-repos/$(basename "$PWD"); (hg init "$R"; hg push -f "$R") > /dev/null 2>&1 || true
That will forcibly push a copy of all your commits to a suitable repo under ~/backup-repos. The -f flag ensures that you will push a backup even if you play with extensions like rebase or mq that modify history. It also allows pushing changesets from unrelated repos into the same backup repo (imagine two different repos named foo), so the backup repositories will end up with a gigantic pile of changesets after a while, and you might want to clean them out once in a while.
I tested this briefly and for everyday work I don't think you'll notice the overhead of the extra copy and you might thank yourself later :-)
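If disaster strikes again, recovery from such a backup is just a plain clone (a sketch; my_project stands in for your repository's directory name):
$ hg clone ~/backup-repos/my_project my_project
Since the hook pushes with -f, the backup may have accumulated multiple heads; hg heads lists them so you can hg update to the one you want.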

Is there any way to clone a repository from the web incrementally?

I'm on dialup in a lousy place (yes, it still happens in 2011), and I'm trying to clone a huge repository. It starts without problem, but every time the dialup disconnects (which seems unavoidable), the !#%$* hg rolls everything back and I'm left again with an empty directory.
Is there a solution other than doing it on a remote PC and then downloading the whole thing by FTP or something?
In a bash-like shell you could do something like this:
$ hg init myclone
$ cd myclone
$ for REV in `seq 10 10 100` ; do hg pull -r $REV <REMOTEREPO>; done
Starting at 10, each pull downloads the next 10 revisions, up to revision 100; adjust the last argument of seq to the size of the repository, or finish off with a plain hg pull once you're close. In case of a lost connection, adjust the first argument to seq to match what you've already pulled.
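If you'd rather not babysit the loop, a small script along these lines should survive disconnects (a rough sketch; a failed batch simply ends the script, and because pulling already-known revisions is a cheap no-op, you can just rerun it from the top):
hg init myclone
cd myclone
REV=10
while hg pull -r $REV <REMOTEREPO>; do
    REV=$((REV + 10))
done
hg pull <REMOTEREPO>
The while loop stops either on a dropped connection (rerun the script) or once $REV passes the remote tip, in which case the final full pull fetches the few remaining revisions.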
Depending on how flaky your connection is, there are two options for performing initial clones.
First, you can try so-called “streaming clones”. These minimize Time To First Byte, but do generally require a bit more data to be transferred.
Here’s how to do a streaming clone:
$ hg clone --uncompressed https://~~~~
Your second option is an hg clone --rev operation, followed by a number of incremental pulls. This behaves similarly to having cloned the repository at some point in the distant past and doing occasional updates.
$ hg clone --rev 5 https://~~~~
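The follow-up pulls then look something like this (the revision numbers are arbitrary; each pull is a separately restartable chunk, and the final plain pull fetches whatever remains):
$ cd <REPO>
$ hg pull -r 500
$ hg pull -r 1000
$ hg pull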
Based on the suggestions here, I created a repo that does this:
https://github.com/nootanghimire/hg-clone-bash
It's optimized for a single repo, but I guess you can fork and work on it! :)

What is the best Mercurial clone / repository strategy?

There can be:
1) just clone from remote repo as needed (each new one can take 20 minutes and 500MB)
2) clone 2 local ones from remote repo, both 500MB, total 1GB, so always have 2 local repo to work with
3) clone 1 local one from remote repo, called it 'master', and then don't touch this master, but clone other local ones from this master as needed
I started off using (1), but when a quick bug fix is needed, I first have to do a clone and that takes 20 minutes, so method (2) is better, because there are always 2 independent local repos to work with.
But then sometimes a repo becomes "weird" because there are merges that do damage, and once that is fixed on the remote repo, any local repo whose merge still shows up in hg outgoing will cause damage again when we push, so we just remove that local repo and clone from remote again to start "fresh", taking 20 minutes again. (Actually, we can use local repo 2 first, rename local repo 1 to repo_old, and then before sleep or before going home, do a clone again.)
Is (3) the best option? Because on a Mac, the master takes 500MB and 20 minutes, but the other local clones are super fast and take much less than 500MB, because Mercurial uses hard links on a Mac (how do I find out how much disk space a clone uses without the hard-linked content?).
And if using (3), how do we commit and push? Suppose we clone from the remote repo to a local "master", and then clone local ones as "clone01", "clone02", 03, etc. Do we work inside clone01, and then, when an urgent fix is needed, go to master, do an hg pull and hg update, then go to clone02, also do hg pull and hg update, fix it in clone02, test it, hg commit, hg push to the master, and then go to master and do an hg push there? And then, when clone01's project is done, again go to master, pull, update, go to clone01, pull, update, merge, test, commit, push, go to master, push to the remote repo? That's a lot of steps!
Maybe a fourth option might work better in your case: Mercurial Queues that are kept in a local Mercurial repository.
Using MQ you can:
Clone the master repository locally.
Work on your code and keep your changes isolated in patches.
When new updates from upstream are available, remove your patches, apply the updates, and then re-apply your patches on top of the new update.
Once you're happy with your work, fold it into your local repository and push it upstream.
You don't have to keep the patches in a local repository, but it's a nice bonus option that is worth considering.
Chapter 12 of Mercurial: The Definitive Guide explains the process in fairly good detail.
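In command form, the cycle looks roughly like this (a sketch; MQ is bundled with Mercurial but must be enabled by adding "mq =" to the [extensions] section of your hgrc):
$ hg qnew my-feature      # start a new patch
$ hg qrefresh             # record your current changes into the patch
$ hg qpop -a              # set all patches aside
$ hg pull -u              # bring in the upstream updates
$ hg qpush -a             # re-apply your patches on top
$ hg qfinish -a           # fold the applied patches into regular changesets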
I don't know that your understanding of the space considerations is correct. When cloning a local repository, Mercurial uses hardlinks for the .hg directory (the actual repository), which therefore takes up no additional space. The working directory does take up space (though hopefully not the full 500MB!), but the .hg directory only looks like it takes up space, depending on the tools you use to check.
If you do a clone -U you create a clone without a working directory and it should take up almost no additional space and be created almost instantly.
I always keep a clone -U of the central repo in an unmodified state and then create clones off of that as needed. I push directly from those clones back to the remote repository.
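Sketched out, with a hypothetical URL standing in for the central repository:
$ hg clone -U https://central.example/project master   # local cache, no working directory
$ hg clone master clone01                              # hardlinked, near-instant
$ cd clone01
...work, hg commit...
$ hg push https://central.example/project              # push straight to the central repo
$ hg pull -R ../master                                 # refresh the cache from its default (central) path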
Mercurial Queues look really powerful, but I've never given myself the time to read all that documentation just to be able to put my current work aside to work on a small bug.
I use the attic extension.
It'll be like this:
...working happily, but then there is a quick bug fix...
$ hg shelve work
...quickly fix the bug...
$ hg ci
$ hg unshelve
...continue with work
Sometimes I get an idea but no time to really play with it; to keep from forgetting it:
...working happily, idea drops in...
$ hg shelve work
...start a unit test for the idea or some other unfinished piece of code, enough to sketch the idea
$ hg shelve idea
$ hg unshelve work
...continue with work
$ hg ls
idea
*C work