We discovered that our Hg repository got corrupted several weeks ago. This corruption seems to have propagated: all clones (the central repo and the user repos) are corrupted, pretty badly, in the same way. I think we could have prevented it if we did verification at that time.
Is there some Hg setting that would cause verification on each push, and prevent push in case of verification failure? I know I could implement it as a hook in Python, but is there maybe a simpler solution?
Is it also possible to do the opposite: make sure the remote repository is verified before pulling?
FWIW, I am on Windows 10 and we are using TortoiseHg.
Update: I've tried creating hooks as suggested by Jordi. Hg now hangs waiting for locks. Here is what I see:
c:\Users\username\test-hook>hg init
c:\Users\username\test-hook>cd ..
c:\Users\username>hg clone test-hook test-hook-clone
updating to branch default
0 files updated, 0 files merged, 0 files removed, 0 files unresolved
# At this point I edited clone repository settings to include
# [hooks]
# preoutgoing = hg verify
#
# Then I created a test.txt file and "added" it via TortoiseHg context menu.
c:\Users\username\test-hook-clone>hg commit
c:\Users\username\test-hook-clone>hg status
c:\Users\username\test-hook-clone>hg outgoing
comparing with c:\Users\username\test-hook
searching for changes
changeset: 0:a61d33af6cdb
tag: tip
user: username
date: Mon May 06 20:32:54 2019 +0200
summary: test file added
c:\Users\username\test-hook-clone>hg push -verbose
pushing to c:\Users\username\test-hook
searching for changes
running hook preoutgoing: hg verify
waiting for lock on repository c:\Users\username\test-hook-clone held by process '16840' on host 'LT407233'
To answer your question, the hook does not have to be written in Python. In the appropriate server's hgrc (either at the repository level or the system level), simply set
[hooks]
preoutgoing = hg verify
preincoming = hg verify
This may significantly slow down all pull and push operations, but perhaps you are willing to sacrifice speed for correctness.
This will result in output like this when a client tries to pull from a corrupt repo:
$ hg clone http://localhost:9000 sample-repo
requesting all changes
remote: abort: preoutgoing hook exited with status 1
and in your server logs, you should see output similar to
127.0.0.1 - - [18/Apr/2019 12:41:09] "GET /?cmd=batch HTTP/1.1" 200 - x-hgarg-1:cmds=lheads+%3Bknown+nodes%3D x-hgproto-1:0.1 0.2 comp=zstd,zlib,none,bzip2 partial-pull
checking changesets
checking manifests
crosschecking files in changesets and manifests
checking files
a#0: broken revlog! (index data/a.i is corrupted)
warning: orphan data file 'data/a.i'
checked 2 changesets with 1 changes to 2 files
1 warnings encountered!
1 integrity errors encountered!
(first damaged changeset appears to be 0)
You can enable a server side option to perform more validation all incoming content, just set the server.validate=yes option on your server.
The simplest way is to enable it in your server global hgrc or in the repository .hg/hgrc file by adding the following two lines:
[server]
validate = yes
This is a server option but you can also use it on the client. It should validate pull too.
(By the way, what kind of corruption are you seeing?)
Related
I try to clone but I get a rollback. I could clone at another computer previously but now I get a rollback and I don't know why:
C:\Users\Niklas\montao>hg clone https://niklasr#bitbucket.org/niklasr/montao
http authorization required
realm: Bitbucket.org HTTP
user: niklasr
password:
destination directory: montao
requesting all changes
adding changesets
adding manifests
adding file changes
transaction abort!
rollback completed
abort: connection ended unexpectedly
C:\Users\Niklas\montao>
Currently I'm just trying to do it over again but I suspect that it wil ltime out, can you tell me how to debug more what's happening and possibly resolve the issue? I ran it in debug mode and this is what happens.
adding google_appengine/lib/django_1_3/django/contrib/localflavor/locale/mn/LC_M
ESSAGES/django.mo revisions
files: 10223/50722 chunks (20.15%)
transaction abort!
Your TCP connection to bitbucket is dying before the whole repo is downloaded -- probably a flaky net connection or a full disk. If it's the former you can do it in small chunks using -r like this:
hg init montao
cd montao
hg pull -r 50 https://niklasr#bitbucket.org/niklasr/montao # get the first 50 changesets
hg pull -r 100 https://niklasr#bitbucket.org/niklasr/montao # get the next 50 changesets
...
That should only be necessary if something's wrong with your network route to bitbucket or the repository is incredibly huge.
An easier syntax compared to Ry4an Brase's answer:
hg clone -r 1 https://niklasr#bitbucket.org/niklasr/montao # get the first 1 changeset
cd montao
hg pull -r 50 # first 50 changesets
hg pull -r 100 # first 100 changesets
...
hg pull # all remaining changesets
hg update # create working copy
If you're using TortoiseHg Workbench, I found checking "Use compressed transfer" under Options in the Clone dialog worked for me.
Hosting Mercurial on a windows box thru IIS.
I have a root directory where I put all of my repos
d:\repos
- ProjectA
- .hg
- hgrc
- ProjectB
- .hg
- hgrc
- ProjectC
- .hg
- hgrc
All of the repos' hgrc files setup the notify extension with:
config =d:\hg\Repositories\NotificationList.txt
That way I have a single file to manage all of the notification recipients, like the wiki describes:
https://www.mercurial-scm.org/wiki/NotifyExtension
But the wiki makes mention of controlling that NotificationList.txt file thru it's own repository? How can I do that? If I create a separate repo at d:\repos\HgNotify and have the NotificationList.txt file in there, users can change, commit and push back, but when the push occurs, NotificationList.txt does not get updated on the hg server.
Is there a way to update that file somehow? Am I missing a key setup on my Hg server? Or do I need to use a post-push hook to deploy that file?
Update 1
I added the details from Tim's answer and I kept getting HTTP 500: Server Error on the push. I finally figure out how to trace the python calls (python -m win32traceutil), and here is what seems to be the problem:
File "C:\Python27\lib\site-packages\mercurial\util.py", line 402, in hgexecutable
exe = findexe('hg') or os.path.basename(sys.argv[0]) AttributeError: 'module' object has no attribute 'argv'
It doesn't seem to be able to find hg.exe.
Update 2
I installed TortoiseHg and rebooted the system. Now I get:
emote: added 1 changesets with 1 changes to 1 files
remote: notify: sending 1 subscribers 1 changes
remote: warning: changegroup.update hook exited with status 1
So that makes be think that it has found the hg.exe, but it is not doing its job, because the file does not get updated
Update 3
Found my solution here: https://stackoverflow.com/a/8023594/698
The command line I ended up using was:
changegroup = cmd /c hg update
I also added:
[ui]
debug=true
To my hgrc. Those two combined gave me a lot more meaningful messages. In the end I saw "Access denied". I gave Users full permission, but I am not sure why giving IUSR full permission didn't work. Something I'll have to dig into at a later time.
On the server repo containing your notification list you need to add a changegroup hook:
[hooks]
changegroup.update = hg update -C
or if you want to ensure the repository is always clean:
[extensions]
purge =
[hooks]
changegroup.update = hg update -C && hg purge --all
Is it possible to backup a filesystem with many Mercurial repositories (e.g., with rsync on the filesystem) and have the backup in an inconsistent state?
The repositories are served by ssh and serves this set of requests: {push, pull, in, out, clone}. It does not have 'hg commit' applied to it directly (which has a known race condition).
Mark Drago is correct that Mercurial writes its own files in a careful order to maintain integrity. However, this is only integrity with regard to other Mercurial clients. The locking design in Mercurial allows one Mercurial process to create a new commit by writing files in this order:
filelogs (holds compressed deltas for all revisions of a given file)
manifest (has pointers back to the filelogs associated with a given changeset)
changelog (has metadata and a pointer back to the manifest for the changeset)
while other Mercurial processes will read the files in this order
changelog
manifest
filelogs
The reader will thus not see a reference to the new filelog data since the changelog is updated last in an atomic operation (a rename, which POSIX requires to be atomic).
A backup program will not know the correct order to read the Mercurial files and so it might read a filelog before it was updated by Mercurial and then read a manifest after it was updated:
rsync reads .hg/store/data/foo.i
hg writes .hg/store/data/foo.i
hg writes .hg/store/00manifest.i
hg writes .hg/store/00changelog.i
rsync reads .hg/store/00manifest.i
rsync reads .hg/store/00changelog.i
The result is a backup with a changelog that points to a manifest that points to a filelog revision that does not exist --- a corrupt repository. Running hg verify on such a repository will detect this situation:
checking changesets
checking manifests
crosschecking files in changesets and manifests
checking files
foo#1: f57bae649f6e in manifests not found
1 files, 2 changesets, 1 total revisions
1 integrity errors encountered!
(first damaged changeset appears to be 1)
This tells you that the manifest of revision 1 refers to revision f57bae649f6e of the file foo, which cannot be found. It is possible to repair this situation by making a clone that excludes the bad revision 1:
$ hg clone -r 0 . ../repo-fixed
adding changesets
adding manifests
adding file changes
added 1 changesets with 1 changes to 1 files
updating to branch default
1 files updated, 0 files merged, 0 files removed, 0 files unresolved
$ cd ../repo-fixed
$ hg verify
checking changesets
checking manifests
crosschecking files in changesets and manifests
checking files
1 files, 1 changesets, 1 total revisions
So, all in all, it is not that bad if you use a general backup program to backup your Mercurial repositories. Just be aware that you might have to repair a broken repository after you restore it from backup. The changeset you lose will most likely still be on the developer's machine and he can push it again after you repair the restored repository. The Mercurial wiki has more information on repairing repository corruption.
The completely safe way to backup a repository is of course to use hg clone, but it might not be practical to integrate this with a general backup strategy.
Why don't "backup" it with just hg clone? ;-)
The short answer is: You can copy (cp, rsync, etc.) a mercurial repository without problems.
The longer answer is: https://www.mercurial-scm.org/wiki/Presentations?action=AttachFile&do=get&target=ols-mercurial-paper.pdf (in particular section 5, sub-heading "Committing Changes").
Mercurial writes out changes in an order that makes it safe for any other process to read a mercurial repository at any time. If you copy a repository to some other location while a change is being made to the repository, you'll get some of the new data, but mercurial is smart enough to ignore partially written commits. When you use the copy you made as a mercurial repository you will either see the new commit or not, there will not be any corruption.
I just did hg pull on a repository and brought in some changesets. It said to run hg update, so I did. Unfortunately, when I did that, it failed with the following error message:
abort: integrity check failed on 00manifest.i:173!
When I run hg verify, it tells me there are a number of issues with things not in the manifest (with some slight path obscuring):
>hg verify
checking changesets
checking manifests
crosschecking files in changesets and manifests
somewhere1/file1.aspx#172: in changeset but not in manifest
somewhere2/file1.pdf#170: in changeset but not in manifest checking files
file3.csproj#172: ee005cae8058 not in manifests
somewhere2/file1.pdf#171: 00371c8b9d95 not in manifests
somewhere3/file1.ascx#170: 5c921d9bf620 not in manifests
somewhere4/file1.ascx#172: 23acbd0efd3a not in manifests
somewhere5/file1.aspx#170: ce48ed795067 not in manifests
somewhere5/file2.aspx#171: 15d13df4206f not in manifests
1328 files, 174 changesets, 3182 total revisions
8 integrity errors encountered!
(first damaged changeset appears to be 170)
The source repository passes hg verify just fine.
Is there any way to recover from an integrity check failure or do I need to re-clone the repository completely from the source (not a huge issue in this case)? What could I have done to cause this, so I don't do it again?
Well, since the first damaged changeset is 170, you could clone your local repository to 169 and then pull from the source. That means only pulling 5 changesets.
hg clone -r 169 damagedrepo fixedrepo
cd fixedreop
hg verify
And then:
hg pull originalsource
As for manual recovery of repository corruption, this page expounds on that better than I can. See section 4:
I have found corruption once in a while before, and although the above
documentation says it is usually from user error, my instances were on
removable USB drives with empty working directories. Sometimes things
just don't get written correctly or are interfered with somehow: it's
not always user error. But I always have multiple copies I can reclone
from so I've been able to get away with basic fixing.
If the simple fix of a partial local clone and pulling from the server doesn't fix it, you're down to 2 options after backing up your changes (if any) to a bundle or patches:
Manually hacking at Mercurial's files.
Doing a new full clone from the server. Usually the easier and faster of the two.
Beware: This method will change all hashes.
Actually there is another way to recover the repository when it is corrupted like this -
You can do a complete rebuild of the repository by using the convert extension. See Section 4.5 on https://www.mercurial-scm.org/wiki/RepositoryCorruption#Recovery_using_convert_extension
First enable the convert extension by adding the following to your ~/.hgrc file
[extensions]
convert=
Then convert the bad repo to create a fixed repo:
$ hg convert --config convert.hg.ignoreerrors=True REPO REPOFIX
This worked for me when I had the experience of suddenly finding that there were missing files in the manifests - "error 255".
Try remove your file 00manifest.i from repo and next use hg remove 00manifest.i and hg commit commands. Worked for me.
What we ended up doing was making a new copy of our 'central' repository, deleting the .hg folder in this copy, creating a new repository there (hg init), and then working with this as the central repository.
Be aware however this is only an appropriate solution if you don't need your changeset history other than as a reference (which we don't). You can still use your old central repository for this purpose.
When remotely updating a Mercurial Repository, I am getting the following error from the hg update command that is being run on the remote server after the push. I looked around online for some help for this however was unsuccessful in finding anything useful. At this point, I am hoping for some ideas and / or insight as to what would be causing this problem.
The error is just below. It occurred when pushing two changesets. One changeset included an unrelated index.html file change. The other changeset was a merge, which included the index.html change as well as the renaming of the two image files.
levinaris#server01:/home/web/repository$ hg push
pushing to ssh://10.10.1.12//home/web/repository
searching for changes`remote: adding changesets
remote: adding manifests
remote: adding file changes
remote: added 2 changesets with 1 changes to 1 files
remote: abort: Operation not permitted: /home/web/repository/html/images/image.gif
remote: warning: changegroup hook exited with status 255
Additional details:
Both images are 10385 bytes in size. (yes, this error occurs on two images I have)
The two images had their names changed in changesets that were already pushed and hg updated due to case-folding collisions when attempting to pull the repositories down to Windows PCs.
The target server has the following hook in /etc/mercurial/hgrc:
[hooks]
changegroup = hg update
As a work-around, I did the following:
Deleted image.gif.
Deleted another image file that produced the error.
Ran hg update - success!
Ran hg revert html/image/image.gif
Ran hg revert html/image/otherimage.gif
At this point, I am trying to better understand the cause of this problem, so that I can ensure a solid, easy-to-use implementation in my environment. I really appreciate your help!!
After using hg --debug update in the hook, I received this output:
levinaris#server01:/home/web/repository$ hg push
pushing to /home/web/staging/repository
searching for changes
adding changesets
adding manifests
adding file changes
added 2 changesets with 1 changes to 1 files
resolving manifests
overwrite False partial False
ancestor 58a5edb95c9a local 58a5edb95c9a+ remote 3aafb97b148c
searching for copies back to rev 6
html/index.php: remote is newer -> g
html/images/otherimage.gif.casefolding: update permissions -> e
html/images/image.gif: update permissions -> e
abort: Operation not permitted: /home/web/staging/repository/html/images/image.gif
warning: changegroup hook exited with status 255
Additional Permission Information:
All 3 files in the 2 changesets have 775 permission with the webuser:dev user:group.
My Global hgrc file has the webuser trusted
[trusted]
users = webuser
Is it possible that the permissions that file on the server were such that it couldn't be overwitten by the person doing the push?
If, for example, two different people have done that push (and thus update) the second person isn't going to be able to overwrite the files created by the first person's push triggered update.
Maybe try changing the hook to this for a test (you don't actually have those single quotes on your hook, right?):
[hooks]
changegroup = hg --debug update
If it is a permissions issue the usual fix is to put everyone who will be pushing and updating into the same group (I call mine 'hg') and then using the sticky group bit on all the directories in the repo to make sure new files have that group.