Why does hg purge abort complaining about *.* - only the first time? - mercurial

I'm trying to track down why Mercurial is crashing during a purge operation on a Win7 machine. It fails with this message:
abort: The system cannot find the path specified: 'C:\working\app/Build/deploy/x86/*.*'
after which the hg process terminates.
The path C:\working\app/Build/deploy/x86/ is a valid, existing folder (despite the strange variation in slashes).
HG is run as follows from c:\working as the current directory:
hg purge --config extensions.purge= --all --dirs --files
At first I thought that there could actually be a file named "*.*" but that is not the case.
If I run it a second time, immediately afterwards, it works. (This is unsatisfactory, however, because the first failure causes an automated build script to fail.)
I also tried to look for NTFS hard links which could be malformed, maybe having *.* actually in the link definition, but that doesn't seem to be the problem either (I used https://www.nirsoft.net/utils/ntfs_links_view.html to exhaustively enumerate all links in the folder tree).
Using Mercurial 4.6.1.
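Since a second run immediately afterwards succeeds, one stopgap for the build script is to retry the purge once before giving up. Below is a minimal sketch in Python, assuming hg is on the PATH. The function name run_with_retry, the retry count and the delay are hypothetical choices, and this only masks the abort rather than explaining it.

```python
import subprocess
import time

# The exact command line from the question, split into arguments.
PURGE_CMD = ["hg", "purge", "--config", "extensions.purge=",
             "--all", "--dirs", "--files"]


def run_with_retry(cmd, attempts=2, delay=1.0, runner=subprocess.call, cwd=None):
    """Run cmd up to `attempts` times and return the last exit code.

    `runner` is injectable so the logic can be exercised without hg installed.
    """
    code = 1
    for attempt in range(attempts):
        code = runner(cmd, cwd=cwd)
        if code == 0:
            break  # success, no need to retry
        if attempt + 1 < attempts:
            time.sleep(delay)  # brief pause before the next try
    return code
```

In the build script this would be called as run_with_retry(PURGE_CMD, cwd=r"c:\working"), failing the build only if the returned code is non-zero.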

Related

How to recover repository when hg recover aborts?

Due to unknown reasons an HG command failed with the message:
abort: abandoned transaction found - run hg recover!
But when I tried to use recover to get rid of the abandoned transaction, I got a different error:
$ hg recover
rolling back interrupted transaction
attempted to truncate path/to/file to 12345 bytes, but it was already 456 bytes
and it aborted. The actual file was named like:
_some_filename.cs.i
which is an internal HG data file. So it seems like the HG data records for some_filename.cs are badly clobbered. And indeed running hg verify shows errors like this:
# hg verify
checking changesets
checking manifests
crosschecking files in changesets and manifests
checking files
warning: revlog 'data/project/folder/some_filename.cs.i' not in fncache!
13691: empty or missing project/folder/some_filename.cs
project/folder/some_filename.cs#13691: manifest refers to unknown revision b269f6036278
project/folder/some_filename.cs#13741: manifest refers to unknown revision 651b96abf6da
...
(goes on for a long time)
which corroborates that the file is damaged but doesn't do anything useful to fix it.
hg recover --help doesn't show anything else that I can do...
And aside from the apparent damage to this one file, no ordinary HG commands work at this point. All of them report the repository is damaged. How can I recover from this?
I concluded that this was just a situation that HG was not designed to deal with. For whatever reason HG's internal data file (the .i file) was destroyed.
It was helpful to review the HG source code which produced the key error:
    if fp.tell() < o:
        raise error.Abort(
            _(
                b"attempted to truncate %s to %d bytes, but it was "
                b"already %d bytes\n"
            )
            % (f, o, fp.tell())
        )
(https://fossies.org/linux/mercurial/mercurial/transaction.py)
I'm not especially familiar with HG internals but this seemed to make it clear enough that HG was not able to handle the situation (understandably - arbitrary destruction of one of its data files leaves few options!).
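To make the failure mode concrete, here is a stand-alone imitation of that check in plain Python. It is not Mercurial's code: rollback_truncate is a hypothetical name, and RuntimeError stands in for error.Abort. It only illustrates why a rollback cannot proceed when the file is already shorter than the offset recorded for it.

```python
import os


def rollback_truncate(path, offset):
    """Truncate `path` back to `offset` bytes, refusing if it is already shorter.

    Returns the size the file had before truncation.
    """
    with open(path, "r+b") as fp:
        fp.seek(0, os.SEEK_END)
        size = fp.tell()
        if size < offset:
            # Same situation as in the question: the recorded offset lies
            # beyond the end of the file, so data has gone missing.
            raise RuntimeError(
                "attempted to truncate %s to %d bytes, but it was already %d bytes"
                % (path, offset, size)
            )
        fp.truncate(offset)
    return size
```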
The best workaround I could come up with was to manually copy the damaged .i file from another copy of the repository (on another PC). I didn't actually have local changes to that source file, so this seemed reasonable.
I made a backup of the damaged file first, then copied the good file over it.
Then I ran hg recover, which was now able to resolve the original "abandoned transaction" issue. Other HG commands work as well.
Worth noting that running hg verify still reports some errors (though many fewer). Perhaps the history of this single file is still not right; ultimately I think I will need to re-clone this repository but at least I can complete my immediate tasks without losing any work.
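The copy step can be scripted roughly as follows. This is a sketch under assumptions: replace_revlog is a hypothetical helper, both repositories are assumed to keep the file under .hg/store at the same relative path, and the caller passes the on-disk name (which, as noted above, may differ from the source file's name).

```python
import shutil
from pathlib import Path


def replace_revlog(damaged_repo, healthy_repo, store_relpath):
    """Back up a damaged store file, then copy the same file from a healthy clone.

    Returns the path of the backup copy.
    """
    damaged = Path(damaged_repo) / ".hg" / "store" / store_relpath
    healthy = Path(healthy_repo) / ".hg" / "store" / store_relpath
    if not healthy.is_file():
        raise FileNotFoundError(healthy)
    backup = damaged.with_name(damaged.name + ".damaged-backup")
    if damaged.exists():
        shutil.copy2(damaged, backup)  # keep the damaged original, just in case
    damaged.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(healthy, damaged)  # overwrite with the good copy
    return backup
```

After the copy, hg recover and hg verify still need to be run by hand, as described above.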

Build fails because repo pull fails with repository exists or timeout waiting for lock

Summary
I've got a build in TeamCity using Mercurial as the VCS and it's repeatedly failing for one of these two reasons:
hg init - repository already exists, except I deleted the whole directory before this so it definitely didn't exist.
hg pull - timed out waiting for lock, but the lock it's waiting for seems to be its own lock.
I'm really hoping that someone has come across this before, or might be able to give me some ideas for how to troubleshoot it anyway.
Setup
I'm using TortoiseHg as the mercurial client, and I've updated it
(and hence Mercurial) to version 4.6.1 on both the build server and
agent.
The agent is running on a Windows 7 VM.
I have a Windows 10 VM with the same TeamCity/Mercurial setup that's
working fine.
The repo being pulled from is located on a network share.
The folder being pulled to is on a secondary drive on the VM.
The two problems I'm seeing are as follows:
1. Hg init failure
Steps:
Manually delete the whole working directory from the build agent, i.e. the .hg folder and its parent folder.
The working folder doesn't even exist now, so TeamCity will have to completely recreate the folder.
Run build on TeamCity, with clean all files selected.
Build starts, creates directory and calls hg init.
Error message that hg init failed because the "repository already exists".
When I look at the directory I can see a .hg folder, and some files inside it including a wlock file.
2. Pull failure
Steps:
Leave the working directory from problem 1 in place, including the .hg directory.
Ensure any lock files are deleted and hg recover has been run just in case.
Run build on TeamCity, without cleaning the directory.
The logs show hg pull starting and bundling files, but they also say "waiting for lock on working directory of E:\blah held by process '3408' on host 'BUILDAGENT'".
3408 here is an example, the number changes every time and corresponds to the hg.exe process that seems to be doing the pull.
Eventually after a lot of bundling and files messages I'll get a message saying it timed out waiting for the lock.
But of course the lock it's waiting for seems to be the lock it's holding itself!
If I delete the wlock file during this time, I'll see messages saying "got lock after X seconds" and immediately after it "waiting for lock on repository E:\blah held by process '3408' on host 'BUILDAGENT'". Then eventually it'll fail with a message about an abandoned transaction.
Does anyone have any ideas?
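For troubleshooting, it can help to pull the lock holder out of the build log and compare it against the PID of the hg.exe process doing the pull. A minimal sketch, assuming the message format is exactly as quoted above; the function name lock_holders is hypothetical.

```python
import re

# Matches the two "waiting for lock" messages quoted in the question.
LOCK_MSG = re.compile(
    r"waiting for lock on (?P<what>.+?) "
    r"held by process '(?P<pid>\d+)' on host '(?P<host>[^']+)'"
)


def lock_holders(log_text):
    """Return (what, pid, host) for every lock-wait message found in log_text."""
    return [
        (m["what"], int(m["pid"]), m["host"])
        for m in LOCK_MSG.finditer(log_text)
    ]
```

If the PID reported here always equals the PID of the pulling process, that supports the suspicion that the process is waiting on its own lock.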

mercurial update leads to abort (filename too long)

After creating a symlink to a file I checked the file into my repo and it worked fine up to the point when I shared the repo with my teammate who is using Windows (his code goes into branch 'devui', mine is on the default branch).
If I switch from his latest changes (being on branch 'devui') to my default branch using hg upd default I get this message:
abort: could not symlink to '...<complete contents of symlinked file here>...':
File name too long: <symlink-filename>
This occurs about halfway through the checkout, so only part of the files are updated and the rest (those after the abort) are missing.
I also tried a fresh clone and hg upd -C default, leading to the same result. At the moment my 'default' branch is in an unusable state and I cannot get back to my branch. I can get back to the revision before the 'devui' branch was created, though.
So my question is: Is it possible to skip the bad symlink, ignoring the abort and continue with the rest of the files? (I could recover that file easily).
I'm using mercurial 2.3 on MacOSX (via brew).
Thanks for your help.
This thread from 2010 (for a much older version of Mercurial) suggests cloning the repo on a Windows box, which may be unaffected by the problem, and reverting the symlink there.

hg clone aborts reporting that it can't find .hg/store/lock in the repository

I created a repository on a remote machine using:
hg init
hg add
hg commit
The repository was created.
I cloned the repository on a local machine with no errors reported; the files seem to be there.
Now I'm trying to make a clone of the clone (as a working copy) using:
hg clone "path to original clone"
It returns:
destination directory: "name of repository"
abort: No such file or directory: "path to original clone"/.hg/store/lock
What am I doing wrong?
Thanks
What filesystem is used on the partition where the main repository is?
Actually, when Mercurial performs certain operations, it needs to lock the repository. To do this it creates a symbolic link to a nonexistent file inside the .hg directory, when the filesystem supports it, telling all other processes that the repository can't be modified at this time. When symbolic links aren't supported by the filesystem, a normal file is created instead.
However, there are problems with some FUSE filesystems, typically SSHFS with the follow_symlinks option activated. FUSE reports that it supports symbolic links, but since SSHFS follows the symbolic link and the target doesn't exist, the "state" of the link is marked as unknown, so Mercurial thinks the repository isn't correctly locked and aborts the operation.
I see you're using Cygwin, so maybe it's the same kind of problem with tools designed for UNIX running on a Windows filesystem. That would be strange, though, since coworkers of mine are using Mercurial via Cygwin just fine.
I don't know if that's the case for you, but I lost nearly half a day on this problem. Maybe this answer can help some people in the future.
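The locking scheme described in this answer can be sketched in a few lines of Python. This is an illustration only, not Mercurial's actual implementation: make_lock and read_lock are hypothetical names, and the "host:pid" string used as lock content is just an example value.

```python
import os


def make_lock(path, info):
    """Create a lock at `path` carrying the holder description `info`.

    Preferred form: a dangling symlink whose target text is `info`.
    Fallback: a regular file created exclusively and containing `info`.
    Raises FileExistsError if the lock already exists.
    """
    try:
        os.symlink(info, path)  # the target does not need to exist
        return "symlink"
    except FileExistsError:
        raise  # somebody else holds the lock
    except OSError:
        pass  # symlinks unsupported or not permitted here
    fd = os.open(path, os.O_CREAT | os.O_WRONLY | os.O_EXCL)
    with os.fdopen(fd, "w") as f:
        f.write(info)
    return "file"


def read_lock(path):
    """Return the holder description without following the link."""
    try:
        return os.readlink(path)
    except OSError:
        # Not a symlink: fall back to reading the regular file.
        with open(path) as f:
            return f.read()
```

A filesystem layer that follows the dangling link instead of reporting the link itself would make a reader like read_lock fail, which matches the SSHFS behaviour described above.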
Please paste in the actual command that's failing and the output including the actual path to the clone that you're cloning. When you do the clone use --debug and --traceback too.
As a workaround you can always try hg init newclone followed by hg pull -R newclone pathtooriginalclone, which is effectively equivalent except that it doesn't use local hardlinks when possible.
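If the clone is made from a script, the init-then-pull workaround could be wrapped like this. A sketch only: the function names are hypothetical and hg is assumed to be on the PATH.

```python
import subprocess


def clone_via_pull_commands(source, dest):
    """Return the two hg commands from the workaround, as argument lists."""
    return [
        ["hg", "init", dest],
        ["hg", "pull", "-R", dest, source],
    ]


def clone_via_pull(source, dest, run=subprocess.check_call):
    """Run the workaround; `run` is injectable so it can be tested without hg."""
    for cmd in clone_via_pull_commands(source, dest):
        run(cmd)
```

Note that a plain pull does not update the working directory, so an hg update -R newclone is still needed afterwards to get the files checked out.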

Mercurial suddenly thinks all files have changed - waiting for lock on working directory

I've been using Mercurial v 1.1 for several months to version documents and other files. Yesterday it suddenly failed with the message:
waiting for lock on working directory
This happens in all projects I have under .hg control. Mercurial also thinks that all files in all projects have changed.
There is no .hg/store/lock file in the project it says it is waiting on the lock for.
The only thing that could have caused this is that Windows installed a security patch on my computer overnight.
Has anyone else seen this with Mercurial?
I've had success by deleting the file .hg/wlock entirely if it exists; after that everything is back to normal. If you are worried about losing something, just make a copy of it first.
For the working directory, the lock is .hg/wlock. Does the file exist?
For rebuilding the dirstate (beware, it won't restore changes like adds/removes/renames/copies), you can use hg debugrebuildstate.
I upgraded to hg version 1.3.1 and everything works now.
I must have had corruption in the 1.1.1 binaries (from Cygwin).
Cygwin is still on 1.1.
To find out which file is locking the directory, in your working directory:
hg debuglocks
This should give a result indicating which file is locking the directory e.g.
lock: free
wlock: (461232s)
To unlock use force:
hg debuglocks --force-wlock
or:
hg debuglocks --force-lock
for more information:
hg debuglocks -h
Note this paragraph:
Locks protect the integrity of Mercurial's data, so should be treated
with care. System crashes or other interruptions may cause locks to
not be properly released, though Mercurial will usually detect and
remove such stale locks automatically.
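In an automated setup, the output of hg debuglocks can be checked before deciding whether to force anything. A minimal sketch, assuming the output has one "name: status" line per lock with "free" meaning unlocked, as in the example above; parse_debuglocks is a hypothetical helper.

```python
def parse_debuglocks(output):
    """Map each lock name to True if it is held, False if reported as free."""
    held = {}
    for line in output.splitlines():
        name, sep, status = line.partition(":")
        if sep and name.strip():
            held[name.strip()] = status.strip() != "free"
    return held
```

With the sample output shown above, the result marks lock as free and wlock as held, in which case hg debuglocks --force-wlock would be the matching command.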